[Project D] Devlog #5: Setting up Multiple Databases

Before we begin, you can check out the previous entry in this series over here.


Today was an interesting day. In the previous entry, I mentioned that I would attempt to implement the new design for the reporting APIs based on the discussion I had with a colleague. I actually implemented that a couple of days ago but forgot to mention it in the devlog. The only thing left there was to work out some date-related logic. That's on the back burner for now, as I have a new task (which, once complete, should make working with the dates easier anyway).

That brings me to today's event: setting up multiple databases (data sources) in Spring Boot. I will try to walk through the steps I took as clearly as I can so that it's easier for future readers to follow.

First off, my project already had Spring Batch set up because I needed to do some batch processing of data, and there was an issue surrounding that (but I'm getting ahead of myself).

I started by looking up a few tutorials online, which made it clear what I needed to do; the only thing left was execution.

Basically, if we have one data source, Spring Boot auto-configures the data source, the entity manager, and the transaction manager for us. However, when we have more than one, Spring Boot doesn't know which one to configure, so we have to do it manually. Of course, Spring Boot provides all the tools we need to do this ourselves.
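For contrast, in the single-data-source case the auto-configuration only needs the standard spring.datasource.* properties. A minimal sketch (the connection values here are placeholders, not from my actual project):

```properties
# With one data source, Spring Boot wires up the DataSource,
# EntityManagerFactory and transaction manager from this alone.
spring.datasource.url=jdbc:postgresql://localhost:5432/mydb
spring.datasource.username=myuser
spring.datasource.password=secret
```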

So the first step is to create a configuration for your DataSources. You can either do this in one class or create multiple classes, one per DataSource. I opted to use a single class for the configuration, defined like so:

import javax.sql.DataSource;

import org.springframework.boot.autoconfigure.jdbc.DataSourceProperties;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

@Configuration
public class DataSourceConfiguration {

    // Binds the properties found under this prefix in application-*.properties
    @Bean
    @ConfigurationProperties(prefix = "...")
    public DataSourceProperties dbOneProperties() {
        return new DataSourceProperties();
    }

    // ... do the same for DB2

    @Primary
    @Bean(name = "dbOneDataSource")
    public DataSource dbOneDataSource() {
        return dbOneProperties()
                .initializeDataSourceBuilder()
                .build();
    }

    // ... do the same for DB2
}

That was basically it for my data source configuration. In this class, I defined the properties that should be loaded from my application-*.properties file. I have a prefix defined for each data source in my properties file; the dbOneProperties bean picks up the config under db1's prefix, a corresponding bean picks up the config under db2's prefix, and so on.
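As a sketch of what such a properties file might look like (the prefix names and values below are hypothetical; match them to whatever you put in @ConfigurationProperties):

```properties
# Hypothetical prefixes -- each one maps to a DataSourceProperties bean.
app.datasource.db-one.url=jdbc:postgresql://localhost:5432/db_one
app.datasource.db-one.username=user_one
app.datasource.db-one.password=secret

app.datasource.db-two.url=jdbc:postgresql://localhost:5432/db_two
app.datasource.db-two.username=user_two
app.datasource.db-two.password=secret
```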

The next thing was to set up JPA so that it could create my entities and link correctly to my repositories. This was the approach I went with:

import java.util.HashMap;
import java.util.Map;

import javax.persistence.EntityManagerFactory;
import javax.sql.DataSource;

import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.orm.jpa.EntityManagerFactoryBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
import org.springframework.orm.jpa.JpaTransactionManager;
import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@Configuration
@EnableTransactionManagement
@EnableJpaRepositories(
    basePackages = "...packageToScanPath",
    entityManagerFactoryRef = "eMFBeanName",
    transactionManagerRef = "tmrBeanName"
)
public class DbOneJpaConfig {

    @Primary
    @Bean(name = "eMFBeanName")
    public LocalContainerEntityManagerFactoryBean entityManagerFactoryGeneral(
            EntityManagerFactoryBuilder builder,
            @Qualifier("dbOneDataSource") DataSource dataSource
    ) {
        final var em = builder
                .dataSource(dataSource)
                .packages("...entitiesPath")
                .persistenceUnit("persistentUnitName")
                .properties(hibernatePropertiesForGeneral())
                .build();

        // Set the naming strategies so generated table/column names are consistent
        em.getJpaPropertyMap().put("hibernate.physical_naming_strategy", "org.springframework.boot.orm.jpa.hibernate.SpringPhysicalNamingStrategy");
        em.getJpaPropertyMap().put("hibernate.implicit_naming_strategy", "org.springframework.boot.orm.jpa.hibernate.SpringImplicitNamingStrategy");

        return em;
    }

    private Map<String, Object> hibernatePropertiesForGeneral() {
        Map<String, Object> properties = new HashMap<>();
        properties.put("hibernate.hbm2ddl.auto", "update");
        properties.put("hibernate.dialect", "org.hibernate.dialect.PostgreSQLDialect");

        return properties;
    }

    @Primary
    @Bean(name = "tmrBeanName")
    public PlatformTransactionManager transactionManagerGeneral(
            @Qualifier("eMFBeanName") EntityManagerFactory entityManagerFactory) {
        return new JpaTransactionManager(entityManagerFactory);
    }
}

What I did here is set up JPA for data source one. You can pretty much do the same for data sources 2..N; just make sure you change the bean names accordingly.

What is happening here is that I enabled transaction management and JPA repositories for DB one. basePackages tells Spring Data JPA which packages to scan for repositories, while entityManagerFactoryRef and transactionManagerRef point to the entity manager factory and transaction manager beans respectively.

You may have noticed that I am also setting hibernate.implicit_naming_strategy and hibernate.physical_naming_strategy. I set these because, after the initial setup, when I tried to generate my tables I ended up in a state where tables that already existed were being created again, which was not ideal. This happened because Spring didn't know which naming strategy to use, so it fell back to Hibernate's default, which in my case wasn't ideal either. I was already using the PostgreSQL dialect, which tells Hibernate how to interact with my database; but because PostgreSQL favors snake case while Hibernate's default favors camel case, I ended up with tables like account_contact and accountcontact. These two are supposed to be the same table, but now Hibernate was confused (and so was I), so setting the naming strategy and dialect resolved the issue. Hibernate was even trying to overwrite existing fields in tables whose names were a single word, because in that case both conventions produce the same table name.
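To make the clash concrete: Spring's physical naming strategy converts camel-case identifiers to snake case, while Hibernate's default leaves them as-is (and PostgreSQL then folds unquoted identifiers to lowercase). Here's a rough sketch of the camel-to-snake conversion — my own simplification for illustration, not the actual Spring implementation:

```java
public class NamingSketch {
    // Simplified version of the camelCase -> snake_case conversion that
    // SpringPhysicalNamingStrategy applies to entity and column names.
    static String toSnakeCase(String name) {
        StringBuilder sb = new StringBuilder(name.length() + 4);
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (Character.isUpperCase(c) && i > 0) {
                sb.append('_');
            }
            sb.append(Character.toLowerCase(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Spring's strategy: AccountContact -> account_contact
        System.out.println(toSnakeCase("AccountContact"));
        // Hibernate's default keeps "AccountContact"; PostgreSQL folds the
        // unquoted identifier to "accountcontact" -- hence the duplicate tables.
        System.out.println("AccountContact".toLowerCase());
    }
}
```

Run against the same schema, those two conventions generate two different table names for the same entity, which is exactly the duplication I saw.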

Before I set this up, I had to contend with the fact that Spring Batch could not determine my database type, so my app could not start. As I mentioned earlier, with only one data source everything is configured automatically for you, but with more than one there's confusion. Luckily, the fix (after correctly setting up the data source configuration) was to create my own BatchConfigurer. It looks like this:

@Bean
public BatchConfigurer batchConfigurer(@Qualifier("yourDesiredDataSource") DataSource dataSource) {
    return new DefaultBatchConfigurer(dataSource);
}

Because I already had existing data and wanted to make the update as harmless as possible, I pointed it at the data source I had configured as my primary. Speaking of primary, you may have noticed the @Primary annotations. They tell Spring Boot to default to those beans for any auto-wiring that needs to be done (an over-simplification). So in this case, it will default to the dbOne beans as the primary candidates for the DataSource, entity manager, and transaction manager.

After I tested and everything was working, I decided to call it a day. I mentioned earlier that I ran into a situation where existing tables were being recreated; that was all part of my testing. It led me down a few rabbit holes, but I managed to resolve all the issues and now everything works (I think). The next thing is to set up the repositories so that I can see how best to insert the data.

That's all from me for now. Until next time, I bid you adieu.