Data caching with ClickHouse

Intro

I use ClickHouse as a data warehouse (tables with billions of rows). Users interact with the DWH through my application backend, which generates SQL queries against ClickHouse. Different users can access the same data, although the WHERE filter conditions sometimes differ between queries. ClickHouse is expected to be scaled out across multiple servers in the future.

The task

At the moment, I cache the results of frequent SQL queries by creating new tables from those stored in the database and declaring a TTL of 1 day on each table. If another query hits the table during that day, I run ALTER TABLE and extend the TTL by another day. I doubt that this method is efficient. I also maintain a separate table that records each cache table's name and the time of its last access, so that my application can delete obsolete, empty tables.
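To make the bookkeeping described above concrete, here is a minimal Python sketch of the application-side part only: deriving a cache table name from the query text, recording the last access time, and deciding which cache tables are stale. The names `cache_table_name`, `CacheRegistry`, and `CACHE_LIFETIME` are hypothetical and not taken from my actual code; the sketch keeps the registry in memory instead of in a ClickHouse table and only builds the `DROP TABLE` statements rather than executing them.

```python
import hashlib
from datetime import datetime, timedelta

# Assumption: mirrors the 1-day lifetime described in the question.
CACHE_LIFETIME = timedelta(days=1)


def cache_table_name(sql: str) -> str:
    """Derive a deterministic cache table name from the query text.

    Only leading/trailing whitespace is stripped, so queries that differ
    in their WHERE conditions map to different tables.
    """
    digest = hashlib.sha256(sql.strip().encode("utf-8")).hexdigest()[:16]
    return f"cache_{digest}"


class CacheRegistry:
    """In-memory stand-in for the 'table name + last access time' table."""

    def __init__(self) -> None:
        self._last_access: dict[str, datetime] = {}

    def touch(self, sql: str, now: datetime) -> tuple[str, bool]:
        """Record an access; return (table name, True if newly registered)."""
        name = cache_table_name(sql)
        is_new = name not in self._last_access
        self._last_access[name] = now
        return name, is_new

    def expired(self, now: datetime) -> list[str]:
        """Names of cache tables not accessed within CACHE_LIFETIME."""
        return sorted(
            name
            for name, last in self._last_access.items()
            if now - last > CACHE_LIFETIME
        )

    def drop_statements(self, now: datetime) -> list[str]:
        """Build DROP statements for stale tables and forget them."""
        statements = []
        for name in self.expired(now):
            statements.append(f"DROP TABLE IF EXISTS {name}")
            del self._last_access[name]
        return statements
```

When `touch` reports a new name, the backend would create and fill the cache table; otherwise it would read from the existing one and extend its lifetime. A periodic job could then run whatever `drop_statements` returns.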

Are there established patterns for efficient access to the most frequently used data, or ready-made mechanisms in ClickHouse for this? I would also be grateful for links to literature where I can learn more or approach this issue from a different angle.



Solution 1:[1]

At the time of this answer, ClickHouse had no dedicated cache for query results. Instead, it relies heavily on the file system cache.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    mel