An Alternative to Counter Columns in ScyllaDB
According to the official documentation, the restrictions on counter columns in ScyllaDB include:
- The only other columns in a table with a counter column can be columns of the primary key (which cannot be updated). No other kinds of column can be included. This limitation safeguards correct handling of counter and non-counter updates by not allowing them in the same operation.
- All non-counter columns in the table must be part of the primary key.
- Counter columns, on the other hand, may not be part of the primary key.
In my table design I have hundreds of rows where each row needs to be linked to a counter column.
The table has the following structure:
K1 K2 C1 C2 C3 V1 V2 V3 COUNTER
where K1 and K2 form the partition key, C1-C3 are clustering key columns, V1-V3 are regular value columns, and COUNTER is a counter column.
Generally, the above table is queried as follows:
SELECT * FROM table WHERE K1 = A AND K2 = B AND C1 = X AND C2 = Y AND C3 = Z;
This results in about 500 rows being returned per query. As seen, each row is linked to a COUNTER column; specifically, each combination of K1, K2, C1, C2, C3 is linked to a different COUNTER value.
How am I supposed to model this table if the COUNTER has to be moved to an entirely different table?
If I understand correctly, the only way to do this is to define another table, table_counter, without any of the value columns (V1-V3):
K1 K2 C1 C2 C3 COUNTER
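In CQL, the split described above might look like the following sketch (the keyspace name, table names, and column types are assumptions for illustration; the counter column is named cnt because COUNTER is a reserved word):

```cql
-- The original table keeps the regular value columns:
CREATE TABLE ks.data (
    k1 int, k2 int,
    c1 int, c2 int, c3 int,
    v1 int, v2 int, v3 int,
    PRIMARY KEY ((k1, k2), c1, c2, c3)
);

-- The counter must live in its own table, in which every
-- non-counter column is part of the primary key:
CREATE TABLE ks.data_counter (
    k1 int, k2 int,
    c1 int, c2 int, c3 int,
    cnt counter,
    PRIMARY KEY ((k1, k2), c1, c2, c3)
);
```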
However, I have several issues with this approach:
1) It seems extremely inelegant to break up a logically cohesive table like this
2) It means that whenever I want to execute the previous query I would essentially need to execute two queries, instead of one, just to get the counter information linked to each K1 K2 C1 C2 C3
combination
3) I would also be forced to combine the results of the two queries above into a single data structure on the client side (for it to be useful)
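With the split schema, issue 2 above means every read becomes two round trips, roughly like this sketch (table names and bind markers are illustrative):

```cql
-- Query 1: fetch the value columns
SELECT * FROM ks.data
WHERE k1 = ? AND k2 = ? AND c1 = ? AND c2 = ? AND c3 = ?;

-- Query 2: fetch the matching counters, to be merged with
-- the first result set on the client side
SELECT c1, c2, c3, cnt FROM ks.data_counter
WHERE k1 = ? AND k2 = ? AND c1 = ? AND c2 = ? AND c3 = ?;
```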
Is this correct? If yes, is there an alternative to the COUNTER column that would let me keep the counter in the first table?
One approach I was thinking of is to use a regular INTEGER as the counter column. Whenever the counter needs to be updated, I can read the current integer value, increment it on the client side, and write the new value back to the database.
I understand that this does not protect against concurrent reads/writes: if two clients read the counter value at the same time and both increment and write it back, one write (update) will be lost (i.e., only the last write is preserved). However, I can live with the occasional lost write, as we are not tracking anything critical where a single (or a handful of) lost writes would make a major difference.
I also understand that this requires a read followed by a write (two operations) each time I want to update the counter column. In exchange, it lets me keep the counter column in the original table, reduces the querying from two tables to one, and removes the need to combine the results of two queries on the client side. It seems more efficient than using a counter column, and much more elegant.
Is this approach a viable alternative to the COUNTER column? Are there any pitfalls I missed? Is there another approach that might work better in my example?
Thanks
Solution 1:[1]
An alternative to the "classic" CRDT-inspired (but not quite) counter column is to use Scylla's lightweight transactions (LWT): your counter becomes a normal integer column, which you can read normally if you wish, but writes use a conditional update (UPDATE ... IF ...). For example, to modify value V and also increment the counter you can:
1. Read the current value of the row.
2. Decide on the new value of V and increment the counter.
3. Write back the row with the new V and the incremented counter, with the condition (IF) that the counter still has the value read in step 1.
4. If step 3 failed, i.e., the IF condition wasn't true, go back to step 1.
This pattern is known as "optimistic locking": steps 1-2 are "optimistic" in assuming that the new item they build will be writable, but if some other concurrent update beat us, steps 1-2 need to be repeated. This contrasts with pessimistic locking approaches, where the client first "takes a lock" and only after knowing it holds the lock bothers to calculate the new value of the item (step 2).
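The conditional write in step 3 might look like this in CQL (names are illustrative; the server reports whether the IF condition held in the special [applied] column of the result, which drivers expose so the client knows whether to retry):

```cql
UPDATE ks.data
SET v1 = ?, cnt = ?            -- new value of V, and old_cnt + 1
WHERE k1 = ? AND k2 = ? AND c1 = ? AND c2 = ? AND c3 = ?
IF cnt = ?;                    -- old_cnt as read in step 1
```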
LWT is much more powerful than counters, and lets you do far more than counters can. Reads can be as efficient as regular reads, but note that writes do become slower. Scylla is working on a next-generation LWT implementation based on Raft (the current implementation is based on Paxos), so you can expect improvements in LWT write performance in the future.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Nadav Har'El |