Category "google-bigquery"

BigQuery public datasets cannot be found (`bigquery-public-data`)

In the left panel of BigQuery, the dataset bigquery-public-data is nowhere to be found. I have no idea how it disappeared. Does anyone have a solution to integ
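A minimal sketch of one answer: the public project is still queryable by its full path even when it is not pinned in the console's left panel (it can usually be re-pinned from the console's add/star-a-project option). Assumes the google-cloud-bigquery package and default credentials with a billing project.

```python
# Sketch: query a public table directly by its full path, which works even
# when bigquery-public-data is not pinned in the console's left panel.
from google.cloud import bigquery

client = bigquery.Client()  # uses your default project for billing

sql = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    GROUP BY name
    ORDER BY total DESC
    LIMIT 5
"""
for row in client.query(sql).result():
    print(row.name, row.total)
```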

Provider com.google.cloud.spark.bigquery.BigQueryRelationProvider could not be instantiated while reading from bigquery in Jupyter lab

I have followed this post, pyspark error reading bigquery: java.lang.ClassNotFoundException: org.apache.spark.internal.Logging$class, and applied the resolution
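That ClassNotFoundException usually points at a Scala/Spark version mismatch between the connector jar and the Spark installation. A hedged sketch of starting a PySpark session in JupyterLab with a connector built for Spark 3 / Scala 2.12; the artifact version below is a placeholder to adjust to your cluster.

```python
# Sketch: pull a spark-bigquery connector whose Scala suffix (_2.12 here)
# matches the Spark build; a mismatched jar raises exactly this
# "provider could not be instantiated" / Logging$class error.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("bq-read")
    .config(
        "spark.jars.packages",
        # placeholder version; pick one matching your Spark/Scala version
        "com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.36.1",
    )
    .getOrCreate()
)

df = (
    spark.read.format("bigquery")
    .option("table", "bigquery-public-data.samples.shakespeare")
    .load()
)
df.show(5)
```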

Make existing bigquery table clustered

I have quite a large existing partitioned table in BigQuery. I want to make the table clustered, at least for the new partition. From the documentation: https:/
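A sketch of one approach with the Python client, assuming the clustering columns already exist in the schema: the clustering spec of an existing table can be updated in place, but only data written afterwards is clustered automatically; existing partitions keep their old layout until they are rewritten or re-clustered. Table and column names below are hypothetical.

```python
# Sketch: set clustering fields on an existing partitioned table via the
# Python client; only newly written data is clustered automatically.
from google.cloud import bigquery

client = bigquery.Client()

table = client.get_table("my_project.my_dataset.my_partitioned_table")  # hypothetical
table.clustering_fields = ["customer_id", "event_type"]                 # hypothetical columns
table = client.update_table(table, ["clustering_fields"])

print(table.clustering_fields)
```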

How can I see the service account that the python bigquery client uses?

To create a default bigquery client I use: from google.cloud import bigquery client = bigquery.Client() This uses the (default) credentials available in the en
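A sketch of one way to inspect this: resolve the application default credentials the same way bigquery.Client() does and read the identity off them; service_account_email is only present for service-account style credentials, not for user credentials.

```python
# Sketch: resolve default credentials (what bigquery.Client() uses under the
# hood) and print the identity they carry.
import google.auth
from google.cloud import bigquery

credentials, project = google.auth.default()
client = bigquery.Client(credentials=credentials, project=project)

print("project:", project)
print(
    "identity:",
    getattr(credentials, "service_account_email", "user credentials (no service account)"),
)
```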

BigQuery SQL JSON Returning additional rows when current row contains multiple values

I have a table that looks like this:
keyA | data:{"value":false}}
keyB | data:{"value":3}}
keyC | data:{"value":{"paid":10,"unpaid"
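Hard to say more without the full question, but as a hedged sketch: JSON_QUERY and JSON_VALUE return at most one result per input row, so extracting the nested value with them (rather than exploding it via JSON_EXTRACT_ARRAY + UNNEST) keeps one output row per key. Table and column names are assumptions.

```python
# Sketch: pull the nested "value" out of the JSON column without multiplying rows.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT
      key,
      JSON_QUERY(data, '$.value') AS value_json,   -- raw JSON fragment (object, number, bool)
      JSON_VALUE(data, '$.value') AS value_scalar  -- NULL when the value is an object
    FROM `my_project.my_dataset.my_table`          -- hypothetical table
"""
for row in client.query(sql).result():
    print(row.key, row.value_json, row.value_scalar)
```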

"pyarrow.lib.ArrowInvalid: Casting from timestamp[ns] to timestamp[ms] would lose data" when sending data to BigQuery without schema

I'm working on a script where I'm sending a dataframe to BigQuery: load_job = bq_client.load_table_from_dataframe( df, '.'.join([PROJECT, DATASET, PROGRAMS
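A sketch of the usual workaround, assuming the offending columns are pandas datetimes carrying nanosecond precision: truncate them to microseconds (BigQuery's TIMESTAMP precision) before handing the frame to load_table_from_dataframe. The sample frame and destination are placeholders.

```python
# Sketch: drop sub-microsecond precision so pyarrow no longer has to perform a
# lossy timestamp cast during the load.
import pandas as pd
from google.cloud import bigquery

client = bigquery.Client()

df = pd.DataFrame(
    {"ts": pd.to_datetime(["2024-01-01 12:00:00.123456789"]), "value": [1]}
)

for col in df.select_dtypes(include=["datetime", "datetimetz"]).columns:
    df[col] = df[col].dt.floor("us")  # BigQuery TIMESTAMP is microsecond precision

load_job = client.load_table_from_dataframe(
    df, "my_project.my_dataset.my_table"  # hypothetical destination
)
load_job.result()
```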

Adding a single static column to SQL query results

I have a pretty big query (no pun intended) written out in BigQuery that returns about 5 columns. I simply want to append an extra column to it that is not join
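A sketch: a literal with an alias in the SELECT list becomes a constant column on every row, no join required. The inner query stands in for the existing big one.

```python
# Sketch: append constant columns to an existing query's output.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT
      q.*,
      'v2' AS report_version,          -- static string column
      CURRENT_DATE() AS snapshot_date  -- constant per run
    FROM (
      SELECT 1 AS id, 'a' AS name      -- placeholder for the real 5-column query
    ) AS q
"""
for row in client.query(sql).result():
    print(dict(row))
```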

Log4net integration with google-big-query

I'm trying to capture logs using the log4net package and store them in a Google BigQuery table. I have successfully captured the logs and stored them in a file. I can a

Function of Dataproc Metastore in a Datalake environment

In a Google Datalake environment, what is the Dataproc Metastore service used for? I'm watching a Google Cloud Tech video and in this video around the 17:33 mar

How to store dynamically generated JSON object in Big Query Table?

I have a use case to store dynamic JSON objects in a column in Big Query. The schema of the object is dynamically generated by the source and not known beforeha
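A sketch using the native JSON column type, assuming it is available to the project; a plain STRING column plus the JSON functions at query time is the usual fallback when it is not. All names below are hypothetical.

```python
# Sketch: a JSON column stores payloads whose internal schema varies per row.
import json
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my_project.my_dataset.dynamic_payloads"  # hypothetical table

schema = [
    bigquery.SchemaField("id", "STRING"),
    bigquery.SchemaField("payload", "JSON"),  # shape of the payload can differ per row
]
client.create_table(bigquery.Table(table_id, schema=schema), exists_ok=True)

rows = [
    {"id": "a", "payload": json.dumps({"value": False})},
    {"id": "b", "payload": json.dumps({"value": {"paid": 10, "unpaid": 5}})},
]
errors = client.insert_rows_json(table_id, rows)
print(errors or "inserted")
```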

BigQuery - DELETE statement to remove duplicates

There are plenty of great posts on SQL that select unique rows and write (truncate) a table so the dups are removed, e.g. WITH ev AS ( SELECT *, ROW_
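For a DELETE-based variant, a hedged sketch that keeps the newest copy per key instead of rewriting the table; it assumes an id key column and an updated_at ordering column with no exact ties, and that the engine can de-correlate the self-referencing EXISTS. All names are hypothetical.

```python
# Sketch: delete every row for which a strictly newer row with the same id exists.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    DELETE FROM `my_project.my_dataset.events` t
    WHERE EXISTS (
      SELECT 1
      FROM `my_project.my_dataset.events` d
      WHERE d.id = t.id
        AND d.updated_at > t.updated_at   -- a newer duplicate of this row exists
    )
"""
client.query(sql).result()
```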

Get MONTH NAME from date in BigQuery SQL

I'm trying to extract the MONTH NAME from a date in BigQuery; the column type is DATE (e.g., 2019-09-19). I tried something like: SELECT PARSE_DATE('%B',CAST(date_
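PARSE_DATE goes from string to DATE, which is the wrong direction here; FORMAT_DATE renders an existing DATE with a format string, where '%B' is the full month name and '%b' the abbreviation. A small sketch:

```python
# Sketch: render the month name of a DATE column with FORMAT_DATE.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT
      d,
      FORMAT_DATE('%B', d) AS month_name,
      FORMAT_DATE('%b', d) AS month_abbrev
    FROM UNNEST([DATE '2019-09-19', DATE '2020-01-05']) AS d
"""
for row in client.query(sql).result():
    print(row.d, row.month_name, row.month_abbrev)
```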

How to Convert Geohash to Geometry in BigQuery?

PostGIS has this function ST_GeomFromGeoHash to get the bounding box geometry of the geohash area (https://postgis.net/docs/ST_GeomFromGeoHash.html), but it has
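BigQuery's closest built-in is ST_GEOGPOINTFROMGEOHASH, which returns the centre point of the geohash cell rather than its bounding-box polygon, so it is only a partial analogue of ST_GeomFromGeoHash; recovering the full cell would need a UDF that decodes the geohash bounds. A sketch of the built-in:

```python
# Sketch: convert geohashes to their centre points with the built-in function.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT
      geohash,
      ST_GEOGPOINTFROMGEOHASH(geohash) AS center_point
    FROM UNNEST(['9q8yy', 'u4pruyd']) AS geohash
"""
for row in client.query(sql).result():
    print(row.geohash, row.center_point)
```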

Partitioning BigQuery Tables via API in python

I'm using Python to hit the BigQuery API. I've been successful at running queries and writing new tables, but would like to ensure those output tables are par
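A sketch with the Python client: attach a TimePartitioning spec (and a destination) to the QueryJobConfig of the query that writes the output table. Project, dataset and the partition column are hypothetical.

```python
# Sketch: write query results into a date-partitioned destination table.
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.QueryJobConfig(
    destination="my_project.my_dataset.daily_output",
    time_partitioning=bigquery.TimePartitioning(
        type_=bigquery.TimePartitioningType.DAY,
        field="event_date",  # DATE/TIMESTAMP column to partition on
    ),
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

sql = "SELECT DATE '2024-01-01' AS event_date, 1 AS value"
client.query(sql, job_config=job_config).result()
```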

DATAFRAME TO BIGQUERY - Error: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp1yeitxcu_job_4b7daa39.parquet'

I am uploading a dataframe to a bigquery table. df.to_gbq('Deduplic.DailyReport', project_id=BQ_PROJECT_ID, credentials=credentials, if_exists='append') And I
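A sketch of one possible workaround, assuming a reasonably recent pandas-gbq: api_method="load_csv" serializes the frame as CSV instead of writing a temporary parquet file under /tmp, which is the step that fails in the traceback. The project id and sample frame are placeholders; the destination name is taken from the question.

```python
# Sketch: load via pandas-gbq's CSV path to avoid the temporary parquet file.
import pandas as pd
import pandas_gbq

df = pd.DataFrame({"day": pd.to_datetime(["2024-01-01"]), "count": [42]})

pandas_gbq.to_gbq(
    df,
    "Deduplic.DailyReport",
    project_id="my-project",   # stand-in for BQ_PROJECT_ID
    if_exists="append",
    api_method="load_csv",     # skip the parquet temp-file path
)
```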

Creating a View in BigQuery from Temporary Function and Dynamic SQL

I want to create a view dynamically with a string generated by a temporary function. The code below fails with Creating views with temporary user-defined functi
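A sketch of one workaround, since views cannot reference temporary UDFs: persist the function with CREATE OR REPLACE FUNCTION and build the view from the generated string with EXECUTE IMMEDIATE. All dataset, function and view names are hypothetical.

```python
# Sketch: persistent UDF + EXECUTE IMMEDIATE instead of a temporary function.
from google.cloud import bigquery

client = bigquery.Client()
script = """
    DECLARE view_sql STRING;

    CREATE OR REPLACE FUNCTION `my_project.my_dataset.label`(x INT64) AS (
      CONCAT('item_', CAST(x AS STRING))
    );

    SET view_sql = '''
      CREATE OR REPLACE VIEW `my_project.my_dataset.labelled_view` AS
      SELECT x, `my_project.my_dataset.label`(x) AS label
      FROM UNNEST([1, 2, 3]) AS x
    ''';

    EXECUTE IMMEDIATE view_sql;
"""
client.query(script).result()
```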

What does "EXCEPT distinct select * " in SQL language mean?

I am following a Qwiklabs tutorial on BigQuery and financial fraud detection and came across a query, below, that I am failing to understand: CREATE OR REPLAC
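In short, EXCEPT DISTINCT is a set-difference operator: it returns the distinct rows of the first SELECT that do not appear in the result of the second. A tiny sketch:

```python
# Sketch: EXCEPT DISTINCT keeps rows of the first query not present in the second.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT id FROM UNNEST([1, 2, 2, 3]) AS id
    EXCEPT DISTINCT
    SELECT id FROM UNNEST([2, 4]) AS id
"""
for row in client.query(sql).result():
    print(row.id)  # prints 1 and 3
"""
"""
```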

Make calculations across multiple tables based on the table suffix in Bigquery

I have a database of daily tables (with prefixes formatted as yyyymmdd) with customer info, and I need to get a 90 day timeline of 90 day ARPUs (average revenu
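A sketch of the usual pattern: query all the daily tables at once through a wildcard table and filter on _TABLE_SUFFIX (wildcards match a trailing suffix, so this assumes date-suffixed table names); the shared prefix, columns and the ARPU formula here are hypothetical stand-ins for the real schema.

```python
# Sketch: aggregate across many date-suffixed tables with a wildcard query.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT
      PARSE_DATE('%Y%m%d', _TABLE_SUFFIX) AS day,
      SUM(revenue) / COUNT(DISTINCT customer_id) AS arpu
    FROM `my_project.my_dataset.daily_*`
    WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY))
                            AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
    GROUP BY day
    ORDER BY day
"""
for row in client.query(sql).result():
    print(row.day, row.arpu)
```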