'Best and most efficient way for ELO-score calculation for users in database

I'm having a hard time wrapping my head around the issue of an ELO-score-like calculation for a large amount of users on our platform.

For example. For every user in a large set of users, a complex formule, based on variable amounts of "things done", will result in a score for each user for a match-making-like principle.

For our situation, it's based on the amount of posts posted, connections accepted, messages sent, amount of sessions in a time period of one month, .. other things done etc.

I had two ideas to go about doing this:

  • Real-time: On every post, message, .. run the formula for that user
  • Once a week: Run the script to calculate everything for all users.

The concerns about these two I have:

  • Real-time: Would be an overkill of queries and calculations for each action a user performs. If let's say, 500 users are active, all of them are performing actions, the database would be having a hard time I think. There would them also run a script to re-calculate the score for inactive users (to lower their score)

  • Once a week: If we have for example 5.000 users (for our first phase), than that would result into running the calculation formula 5.000 times and could take a long time and will increase in time when more users join.

The calculation-queries for a single variable in a the entire formula of about 12 variables are mostly a simple 'COUNT FROM table', but a few are like counting "all connections of my connections" which takes a few joins.

I started with "logging" every action into a table for this purpose, just the counter values and increase/decrease them with every action and running the formula with these values (a record per week). This works but can't be applied for every variable (like the connections of connections).

Note: Our server-side is based on PHP with MySQL.

We're also running Redis, but I'm not sure if this could improve those bits and pieces.

We have the option to export/push data to other servers/databases if needed.

My main example is the app 'Tinder' which uses a sort-like algorithm for match making (maybe with less complex data variables because they're not using groups and communities that you can join)

I'm wondering if they run that real-time on every swipe, every setting change, .. or if they have like a script that runs continiously for a small batch of users each time.

Where it all comes down to. What would be the most efficient/non-database-table-locking way to do this, with keeping the idea in mind that there will be a moment that we're having 50.000 users for example?



Solution 1:[1]

The way I would handle this:

  1. Implement the realtime algorithm.
  2. Measure. Is it actually slow? Try optimizing
  3. Still slow? Move the algorithm to a separate asynchronous process. Have the process run whenever there's an update. Really this is the same thing as 1, but it doesn't slow down PHP requests and if it gets busy, it can take more time to catch up.
  4. Still slow? Now you might be able to optimize by batching several changes.

If you have 5000 users right now, make sure it runs well with 5000 users. You're not going to grow to 50.000 overnight, so adjust and invest in this as your problem changes. You might be surprised where your performance problems are.

Measuring is key though. If you really want to support 50K users right now, simulate and measure.

Solution 2:[2]

I suspect you should use the database as the "source of truth" aka "persistent storage".

Then fetch whatever is needed from the dataset when you update the ratings. Even lots of games by 5000 players should not take more than a few seconds to fetch and compute on.

Bottom line: Implement "realtime"; come back with table schema and SELECTs if you find that the table fetching is a significant fraction of the total time. Do the "math" in a programming language, not SQL.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Evert
Solution 2 Rick James