Is there any way to know the bytes processed by query jobs per referenced table?

I would like to know how to calculate the total bytes processed by query jobs over a specified period, broken down by referenced table, in order to reduce BigQuery costs.

I can use INFORMATION_SCHEMA to see the referenced tables and the bytes processed for each query job, but I don't know how to aggregate the processed bytes per table directly.
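For context, here is a sketch of the kind of aggregation I am after, assuming the `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT view and a 7-day window (adjust the region qualifier and interval to your setup). Note that total_bytes_processed is recorded per job, so a job that references several tables is counted once against each of them:

-- Sketch: sum bytes processed per referenced table over the last 7 days.
-- A job's total_bytes_processed is attributed to every table it references,
-- so jobs touching multiple tables are counted more than once.
SELECT
  CONCAT(ref.project_id, '.', ref.dataset_id, '.', ref.table_id) AS referenced_table,
  COUNT(DISTINCT job_id) AS job_count,
  SUM(total_bytes_processed) AS total_bytes_processed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT,
  UNNEST(referenced_tables) AS ref
WHERE job_type = 'QUERY'
  AND state = 'DONE'
  AND creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY referenced_table
ORDER BY total_bytes_processed DESC;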



Solution 1:

There are a couple of ways to do this; see the BigQuery documentation on estimating query costs for more details, but below is a summary:

  1. In the BigQuery console, enter the query in the editor; the query validator displays the estimated bytes the query will process before you run it.
  2. Alternatively, you can issue a bq command with the --dry_run flag like this:
bq query \
--use_legacy_sql=false \
--dry_run \
'SELECT * FROM `bigquery-public-data.austin_311.311_service_requests`;'

It will return the following message: Query successfully validated. Assuming the tables are not modified, running this query will process 259261 bytes of data.

Unfortunately, you will not be able to apply a single per-table figure to every query, because each query reads a different subset of the data depending on the columns it selects, the partitions it prunes, the clustering, and so on.
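To make that concrete, consider a hypothetical table that is partitioned and clustered (the project, dataset, table, and column names below are made up): the same table can yield very different processed-bytes figures depending on how each query filters it.

-- Hypothetical table partitioned by event_date and clustered by user_id.
-- Scanning the user_id column across all partitions:
SELECT COUNT(DISTINCT user_id)
FROM `my_project.my_dataset.events`;

-- The same table with a partition filter reads only one day of data,
-- so a dry run reports far fewer bytes:
SELECT COUNT(DISTINCT user_id)
FROM `my_project.my_dataset.events`
WHERE event_date = DATE '2022-05-01';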

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
