'Connecting Tranco Google BigQuery with Metabase

I am trying to connect third party ranking management system (https://tranco-list.eu/) with metabase. Tranco is giving us an option to see the record on Google BigQuery but when I am trying to connect the Tranco with Metabase then it is asking for dataset from my Google cloud console project. Since Tranco is an external database source and I don't have access to the dataset Id from this.

If you want to get the result of tranco in Google BigQuery then run below query.

select * from `tranco.daily.daily` where domain ='google.com' limit 10

When I am searching Tranco in public dataset then also I am not finding this over their also. Is anyone aware of, how to add the third party dataset to our Google cloud project.

Metabase view

Google cloud console to add the Public dataset

Thanks in advance.



Solution 1:[1]

Unfortunately, you can’t read the Tranco dataset directly from BigQuery; but, what you can do is to load the CSV data from Tranco into a Cloud Storage Bucket and then read your bucket in BigQuery.

When you load data from Cloud Storage into a BigQuery table, the dataset that contains the table must be in the same regional or multi- regional location as the Cloud Storage bucket.

Note that it has the next limitations:

  • CSV files do not support nested or repeated data.

  • Remove byte order mark (BOM) characters. They might cause unexpected issues.

  • If you use gzip compression, BigQuery cannot read the data in parallel. Loading compressed CSV data into BigQuery is slower than loading uncompressed data.

  • You cannot include both compressed and uncompressed files in the same load job.

  • The maximum size for a gzip file is 4 GB. When you load CSV or JSON data, values in DATE columns must use the dash (-) separator and the date must be in the following format: YYYY-MM-DD (year-month-day).

  • When you load JSON or CSV data, values in TIMESTAMP columns must use a dash (-) separator for the date portion of the timestamp, and the date must be in the following format: YYYY-MM-DD (year-month-day). The hh:mm:ss (hour-minute-second) portion of the timestamp must use a colon (:) separator.

Also, you can see this documentation if you don’t know how you can upload and read your CSV data.

And also in the next link I'm sending you is a step by step guide in how yo can create / select the bucket you will use.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1