'Not able to run multiple where clause without Cassandra allow filtering

Hi I am new to Cassandra. We are working on IOT project where car sensor data will be stored in cassandra.

Here is the example of one table where I am going to store one of the sensor data. enter image description here

This is some sample data. The way I want to partition the data is based on the organization_id so that different organization data is partitioned.

Here is the create table command:

CREATE TABLE IF NOT EXISTS autonostix360.engine_speed (
id UUID,
engine_speed_rpm text,
position int,
vin_number text,
last_updated timestamp,
organization_id int,
odometer int,
PRIMARY KEY ((id, organization_id), vin_number)
);

This works fine. However all my queries will be as bellow:

select * from engine_speed
where vin_number='xyz'
and organization_id = 1 
and last_updated >='from time stamp' and last_updated <='to timestamp'

Almost all queries in all the table will have similar / same where clause.

I am getting error and it is asking to add "Allow filtering". Kindly let me know how do I partition the table and define right primary key and indexs so that I don't have to add "allow filtering" in the query.

Apologies for this basic question but I'm just starting using cassandra.(using apache cassandra:3.11.12 )



Solution 1:[1]

The order of where clause should match with the order of partition and clustering keys you have defined in your DDL and you cannot skip any part of primary key while applying the WHERE clause before using the next key. So as per the query pattern u have defined, you can try the below DDL:

CREATE TABLE IF NOT EXISTS autonostix360.engine_speed (
vin_number text,
organization_id int,
last_updated timestamp,

id UUID,
engine_speed_rpm text,
position int,
odometer int,
PRIMARY KEY ((vin_number, organization_id), last_updated)
);

But remember,

PRIMARY KEY ((vin_number, organization_id), last_updated)

PRIMARY KEY ((vin_number), organization_id, last_updated)

above two are different in Cassandra, In case 1 your data will be partitioned by combination of vin_number and organization_id while last_updated will act as ordering key. In case 2, your data will be partitioned only by vin_number while organization_id and last_updated will act as ordering key. So you need to figure out which case suits your use case.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 rafi ansari