Why won't `spring-data-jpa` with `spring-data-cassandra` create Cassandra tables automatically?

I'm using spring-data-cassandra:3.1.9 and my properties look like this:

spring:
  data:
    cassandra:
      keyspace-name: general_log
      session-name: general_log
      local-datacenter: datacenter1
      schema-action: CREATE

Versions in use:

  • Cassandra version: apache-cassandra-4.0.1
  • spring-boot: 2.4.7
  • spring-data-jpa: 2.4.9
  • spring-jdbc: 5.3.8
  • spring-orm: 5.3.8

My entity looks like:

@ApiModel(description = "Audit log")
@Entity
@Table(name = "audit_log")
@org.springframework.data.cassandra.core.mapping.Table("audit_log")
public class AuditLogPO implements Serializable {

    @PrimaryKeyClass
    public static class Id implements Serializable {
        private static final long serialVersionUID = 1L;

        @ApiModelProperty(value = "business identifier")
        @Column(name = "business_key")
        @PrimaryKeyColumn(ordinal = 1, ordering = Ordering.ASCENDING)
        private String businessKey;
        // setters & getters ...
    }

    @javax.persistence.Id
    @PrimaryKey
    @org.springframework.data.annotation.Id
    @Transient
    private Id id;

    @ApiModelProperty(value = "business partition")
    @Column(name = "business_partition")
    @org.springframework.data.cassandra.core.mapping.Column(value = "business_partition")
    private String businessPartition;
    // getters & setters ...
}

After running this application, the table audit_log is not created automatically.



Solution 1:[1]

After digging into the source code of spring-data-cassandra:3.1.9, you can check the implementation of:

org.springframework.data.cassandra.config.SessionFactoryFactoryBean#performSchemaAction

which is implemented as follows:

protected void performSchemaAction() throws Exception {

    boolean create = false;
    boolean drop = DEFAULT_DROP_TABLES;
    boolean dropUnused = DEFAULT_DROP_UNUSED_TABLES;
    boolean ifNotExists = DEFAULT_CREATE_IF_NOT_EXISTS;

    switch (this.schemaAction) {
        case RECREATE_DROP_UNUSED:
            dropUnused = true;
        case RECREATE:
            drop = true;
        case CREATE_IF_NOT_EXISTS:
            ifNotExists = SchemaAction.CREATE_IF_NOT_EXISTS.equals(this.schemaAction);
        case CREATE:
            create = true;
        case NONE:
        default:
            // do nothing
    }

    if (create) {
        createTables(drop, dropUnused, ifNotExists);
    }
}

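This means schemaAction must be set to something other than NONE (for example CREATE) if the table has never been created. Note that the cases fall through, so according to this switch CREATE_IF_NOT_EXISTS also sets create to true; the only difference from CREATE is that ifNotExists is true as well. The flag alone therefore does not explain the missing tables.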
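
To see which flags each schema action ends up with, the switch can be replayed in isolation. This is only a sketch: DemoSchemaAction and SchemaFlags are hypothetical local stand-ins, not the library types, and the three DEFAULT_* constants are assumed to be false.

```java
// Hypothetical local copy of the SchemaAction values, for illustration only.
enum DemoSchemaAction { NONE, CREATE, CREATE_IF_NOT_EXISTS, RECREATE, RECREATE_DROP_UNUSED }

// Holds the four flags that performSchemaAction computes.
final class SchemaFlags {
    final boolean create;
    final boolean drop;
    final boolean dropUnused;
    final boolean ifNotExists;

    SchemaFlags(boolean create, boolean drop, boolean dropUnused, boolean ifNotExists) {
        this.create = create;
        this.drop = drop;
        this.dropUnused = dropUnused;
        this.ifNotExists = ifNotExists;
    }

    @Override
    public String toString() {
        return "create=" + create + ", drop=" + drop
                + ", dropUnused=" + dropUnused + ", ifNotExists=" + ifNotExists;
    }
}

final class SchemaActionFallThroughDemo {

    // Same switch as quoted above; the DEFAULT_* constants are assumed to be false.
    static SchemaFlags resolve(DemoSchemaAction action) {
        boolean create = false;
        boolean drop = false;
        boolean dropUnused = false;
        boolean ifNotExists = false;

        switch (action) {
            case RECREATE_DROP_UNUSED:
                dropUnused = true;
            case RECREATE:
                drop = true;
            case CREATE_IF_NOT_EXISTS:
                ifNotExists = DemoSchemaAction.CREATE_IF_NOT_EXISTS.equals(action);
            case CREATE:
                create = true;
            case NONE:
            default:
                // do nothing
        }
        return new SchemaFlags(create, drop, dropUnused, ifNotExists);
    }

    public static void main(String[] args) {
        // Print the resolved flags for every action.
        for (DemoSchemaAction action : DemoSchemaAction.values()) {
            System.out.println(action + " -> " + resolve(action));
        }
    }
}
```

In this replay every action except NONE ends with create set to true, so whether tables actually appear depends on what createTables finds at the moment it runs.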


Unfortunately, we're not done yet.

SessionFactoryFactoryBean#performSchemaAction is invoked as expected, yet the tables are still not created. Why?

The reason is that Spring Data registers the entities in org.springframework.data.cassandra.repository.support.CassandraRepositoryFactoryBean#afterPropertiesSet (through org.springframework.data.mapping.context.AbstractMappingContext#addPersistentEntity(org.springframework.data.util.TypeInformation<?>)), while performSchemaAction is invoked from SessionFactoryFactoryBean. Neither of these two FactoryBeans declares an order, so we cannot know which one is initialized first.

So if SessionFactoryFactoryBean#afterPropertiesSet is invoked first, the mapping context probably does not contain any entity yet, and in that case no tables are created.
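
The effect of the initialization order can be illustrated without Spring at all. The sketch below is a simplified model, not the library code: DemoMappingContext, DemoSchemaCreator and InitOrderDemo are hypothetical names, and the only behavior modeled is that the schema step creates tables for whatever entities are registered at the moment it runs.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the mapping context: it only knows entities registered so far.
final class DemoMappingContext {
    private final Set<String> entities = new LinkedHashSet<>();

    void addPersistentEntity(String tableName) {
        entities.add(tableName);
    }

    Set<String> getEntities() {
        return entities;
    }
}

// Hypothetical stand-in for the schema step: one table per entity known at call time.
final class DemoSchemaCreator {
    private final DemoMappingContext context;
    final List<String> createdTables = new ArrayList<>();

    DemoSchemaCreator(DemoMappingContext context) {
        this.context = context;
    }

    void createTables() {
        createdTables.addAll(context.getEntities());
    }
}

final class InitOrderDemo {

    // Runs the two initialization steps in the given order and returns the created tables.
    static List<String> run(boolean schemaFirst) {
        DemoMappingContext context = new DemoMappingContext();
        DemoSchemaCreator creator = new DemoSchemaCreator(context);

        Runnable registerEntities = () -> context.addPersistentEntity("audit_log");
        Runnable schemaAction = creator::createTables;

        if (schemaFirst) {
            schemaAction.run();      // no entities registered yet, nothing to create
            registerEntities.run();
        } else {
            registerEntities.run();
            schemaAction.run();      // audit_log is known, so it gets created
        }
        return creator.createdTables;
    }

    public static void main(String[] args) {
        System.out.println("schema first:   " + run(true));   // prints "schema first:   []"
        System.out.println("entities first: " + run(false));  // prints "entities first: [audit_log]"
    }
}
```

Deferring the schema step until all entities are registered corresponds to the second ordering.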


So how can these tables be created automatically?

One solution is to trigger the table creation that SessionFactoryFactoryBean#performSchemaAction performs manually from an ApplicationRunner bean, which runs after the application context has started.

First, create a class that extends SessionFactoryFactoryBean:

public class ExecutableSessionFactoryFactoryBean extends SessionFactoryFactoryBean {
    // Widen the visibility of createTables to public so it can be called from outside the factory bean.
    @Override
    public void createTables(boolean drop, boolean dropUnused, boolean ifNotExists) throws Exception {
        super.createTables(drop, dropUnused, ifNotExists);
    }
}

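Next, override org.springframework.data.cassandra.config.AbstractCassandraConfiguration#cassandraSessionFactory in your configuration class as follows:

// Keeps a reference to the factory bean so that the ApplicationRunner below can call createTables(...).
private ExecutableSessionFactoryFactoryBean sessionFactoryFactoryBean;

@Override
@Bean
@Primary
public SessionFactoryFactoryBean cassandraSessionFactory(CqlSession cqlSession) {
    sessionFactoryFactoryBean = new ExecutableSessionFactoryFactoryBean();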

    // Initialize the CqlSession reference first since it is required, or must not be null!
    sessionFactoryFactoryBean.setSession(cqlSession);

    sessionFactoryFactoryBean.setConverter(requireBeanOfType(CassandraConverter.class));
    sessionFactoryFactoryBean.setKeyspaceCleaner(keyspaceCleaner());
    sessionFactoryFactoryBean.setKeyspacePopulator(keyspacePopulator());
    sessionFactoryFactoryBean.setSchemaAction(getSchemaAction());

    return sessionFactoryFactoryBean;
}

Now we can create an ApplicationRunner to perform the schema action:

@Bean
public ApplicationRunner autoCreateCassandraTablesRunner() {
    return args -> {
        if (SchemaAction.CREATE.name().equalsIgnoreCase(requireBeanOfType(CassandraProperties.class).getSchemaAction())) {
            sessionFactoryFactoryBean.createTables(false, false, true);
        }
    };
}

Solution 2:[2]

Please refer to this documentation: https://docs.spring.io/spring-data/cassandra/docs/4.0.x/reference/html/#cassandra.schema-management.initializing.config

You still need to create the keyspace before executing the following code:

@Configuration
public class SessionFactoryInitializerConfiguration extends AbstractCassandraConfiguration {

  @Bean
  SessionFactoryInitializer sessionFactoryInitializer(SessionFactory sessionFactory) {

    SessionFactoryInitializer initializer = new SessionFactoryInitializer();
    initializer.setSessionFactory(sessionFactory);

    ResourceKeyspacePopulator populator = new ResourceKeyspacePopulator();
    populator.setSeparator(";");
    populator.setScripts(new ClassPathResource("com/myapp/cql/db-schema.cql"));

    initializer.setKeyspacePopulator(populator);

    return initializer;
  }

  // ...
}

Solution 3:[3]

You can also specify this behavior in your application.yml:

spring:
  data:
    cassandra:
      schema-action: create-if-not-exists

However, you will still need to create the keyspace (with the appropriate data center / replication factor pairs) ahead of time.
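
As a rough illustration of what "data center / replication factor pairs" means, the sketch below assembles the CQL statement that would create the keyspace, for example to be run in cqlsh beforehand. The keyspace name general_log and the data center datacenter1 come from the question; the replication factor of 1 and the helper name createKeyspaceCql are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class KeyspaceCqlDemo {

    // Builds a CREATE KEYSPACE statement using NetworkTopologyStrategy,
    // with one replication entry per data center.
    static String createKeyspaceCql(String keyspace, Map<String, Integer> replicationByDatacenter) {
        StringJoiner replication = new StringJoiner(", ", "{", "}");
        replication.add("'class': 'NetworkTopologyStrategy'");
        replicationByDatacenter.forEach((dc, rf) -> replication.add("'" + dc + "': " + rf));
        return "CREATE KEYSPACE IF NOT EXISTS " + keyspace
                + " WITH replication = " + replication + ";";
    }

    public static void main(String[] args) {
        Map<String, Integer> replication = new LinkedHashMap<>();
        replication.put("datacenter1", 1); // assumed replication factor for a single-node setup
        System.out.println(createKeyspaceCql("general_log", replication));
    }
}
```

For a multi-data-center cluster, each data center would get its own entry in the map with the replication factor chosen for it.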

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1: Gemini Keith
[2] Solution 2: Lingsheng Meng
[3] Solution 3: Aaron