Database schema design optimized for scoring rows to sort on
I want a table like the one below, which has a score column, and whenever a basket row such as basket1 or basket2 is added to the table, I want to update the scores of the items that basket contains. So say basket1 = [potato, tomato, orange] and basket2 = [potato, apple, orange] are inserted into the DB; I then want to subtract 1 from the score of each item in the baskets and increment the score of the basket rows themselves.
So if I have a table like this, then when I insert rows 5 and 6 I want the following math to happen:
prod_id | name | score
---|---|---
1 | potato | 10 (-1, -1)
2 | tomato | 10 (-1)
3 | orange | 10 (-1, -1)
4 | apple | 10 (-1)
5 | basket1 | 40 (+3)
6 | basket2 | 40 (+3)
Obviously I can walk the basket array and issue N separate DB queries to update the rows. But if a basket has, say, 1000 items, will the DB become slower or lock up when we update that many rows in one query? I am trying to figure out whether there is an optimized way to do this with minimal impact on read-query performance. One alternative I thought of is creating a separate table for scores and keeping track of them there. Is there any other way to lay out the schema intelligently so that the performance impact is minimal?
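For concreteness, the bookkeeping described above could look like the following in SQL. This is only a sketch; the table name `scores` and the literal ids are assumptions taken from the example rows.

```sql
-- Per-basket bookkeeping as a single transaction (a sketch; names and ids assumed).
BEGIN;

-- Subtract 1 from each item contained in basket1 (potato, tomato, orange).
UPDATE scores
SET    score = score - 1
WHERE  prod_id IN (1, 2, 3);

-- Credit the basket1 row itself (+3 matches the example: one per contained item).
UPDATE scores
SET    score = score + 3
WHERE  prod_id = 5;

COMMIT;
```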
Solution 1:[1]
I can't see any issue with your current data model. Keep it smart and simple. Writing data in PostgreSQL never blocks reading, thanks to its multiversion concurrency control (MVCC).
Also, 1000 or 2000 updated rows per transaction shouldn't be an issue either. However, if write throughput is not critical, I would recommend using as few concurrent writers as possible, ideally just one. This helps keep the write jobs from blocking each other.
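To keep round trips down, the per-item updates can also be collapsed into a single statement. A sketch using PostgreSQL's `ANY` with an array parameter (the bind parameter `$1` and the table name are assumptions):

```sql
-- One statement decrements every item of a basket, whether it holds 3 ids or 1000.
-- $1 is a bind parameter holding the array of item ids (an assumption).
UPDATE scores
SET    score = score - 1
WHERE  prod_id = ANY($1::bigint[]);
```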
From a functional point of view I wouldn't call the id "prod_id", because baskets are not products. The table contains scores, so better call the table scores. But of course I don't know the rest of your data model.
General rules:
- always plan your indexes and queries thoroughly
- if you have concurrent writers and you write to multiple tables in one transaction, always do it in the same order to avoid deadlocks
- if you have concurrent writers and you update multiple rows of the same table in one transaction, always do it in the same order, ideally by increasing primary key, to avoid deadlocks (see the sketch after this list)
- use the bulk-update features of the DB framework of your choice instead of one statement per row (see the sketch after this list)
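Putting the last two rules together, here is a sketch of a deadlock-safe bulk update in PostgreSQL: first lock the affected rows in primary-key order, then apply all score deltas from a single VALUES list. Table, column, and literal values are assumptions based on the example above.

```sql
BEGIN;

-- Lock all affected rows in a deterministic (primary-key) order so that
-- concurrent transactions touching the same rows cannot deadlock.
SELECT prod_id
FROM   scores
WHERE  prod_id IN (1, 2, 3, 5)
ORDER  BY prod_id
FOR UPDATE;

-- Apply every delta in one bulk statement instead of one UPDATE per row.
UPDATE scores AS s
SET    score = s.score + d.delta
FROM   (VALUES (1, -1), (2, -1), (3, -1), (5, 3)) AS d(prod_id, delta)
WHERE  s.prod_id = d.prod_id;

COMMIT;
```

The explicit `SELECT ... FOR UPDATE` is what pins the lock order; an `UPDATE ... FROM` on its own makes no guarantee about the order in which rows are locked.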
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source
---|---
Solution 1 | Stack Overflow