Category "google-bigquery"

Extracting Excel files from an FTP to BigQuery using Cloud Functions

I am working on creating an automated script to download files from an FTP and store them in BigQuery. The problem is that BigQuery accepts only .csv files. For t

What method does GA4 use for streaming data to BigQuery? In SQL terms, is it just insert, or update too?

Sorry, I'm new to this. I read a few sources, including some Google documentation guides, but still don't quite understand: Every time GA4 streams data into bigqu

Get a rolling order count into session data

I have the following table. One client has two purchases in one session. My goal is to assign an order counter to each row of the table. To reach this goal I am

Ingest RDBMS data to BigQuery

We have on-prem sources such as SQL Server and Oracle. Data from them has to be ingested periodically in batch mode into BigQuery. What should be the architecture

Get Table_Id along with Rowcount for all Tables in a Project

There are >100 datasets in one of my projects and I want to get the Table_id * No_of_rows of each table lying in these 50 datasets. I can get the metadata o

How can I refresh datasets/resources in the new Google BigQuery Web UI?

I'm creating tables via the BigQuery command-line utility, but occasionally run ad-hoc queries with the new web UI. After creating a table via the CLI, how do I

Move BigQuery Data Transfer Service (DCM) data to another project

I have BigQuery Data Transfer Service for Campaign Manager set up in dataset A in GCP project A. I would like to move this to dataset B located in project B. How

BigQuery public datasets cannot be found (`bigquery-public-data`)

In the left panel of BigQuery, the dataset bigquery-public-data is nowhere to be found. I have no idea how it disappeared. Does anyone have a solution to integ

Provider com.google.cloud.spark.bigquery.BigQueryRelationProvider could not be instantiated while reading from bigquery in Jupyter lab

I have followed this post pyspark error reading bigquery: java.lang.ClassNotFoundException: org.apache.spark.internal.Logging$class and followed the resolution

Make existing bigquery table clustered

I have a very large existing partitioned table in BigQuery. I want to make the table clustered, at least for the new partitions. From the documentation: https:/

How can I see the service account that the Python BigQuery client uses?

To create a default BigQuery client I use: from google.cloud import bigquery; client = bigquery.Client(). This uses the (default) credentials available in the en

BigQuery SQL JSON Returning additional rows when current row contains multiple values

I have a table that looks like this keyA | data:{"value":false}} keyB | data:{"value":3}} keyC | data:{"value":{"paid":10,"unpaid"

"pyarrow.lib.ArrowInvalid: Casting from timestamp[ns] to timestamp[ms] would lose data" when sending data to BigQuery without schema

I'm working on a script where I'm sending a dataframe to BigQuery: load_job = bq_client.load_table_from_dataframe( df, '.'.join([PROJECT, DATASET, PROGRAMS

Adding a single static column to SQL query results

I have a pretty big query (no pun intended) written out in BigQuery that returns about 5 columns. I simply want to append an extra column to it that is not join

Log4net integration with google-big-query

I'm trying to capture logs using the log4net package and store them in a Google BigQuery table. I have successfully captured the logs and stored them in a file. I can a

Function of Dataproc Metastore in a Datalake environment

In a Google Datalake environment, what is the Dataproc Metastore service used for? I'm watching a Google Cloud Tech video and in this video around the 17:33 mar

How to store dynamically generated JSON object in BigQuery table?

I have a use case to store dynamic JSON objects in a column in BigQuery. The schema of the object is dynamically generated by the source and not known beforeha

BigQuery - DELETE statement to remove duplicates

There are plenty of great posts on SQL that select unique rows and write (truncate) a table so the duplicates are removed, e.g. WITH ev AS ( SELECT *, ROW_

Get MONTH NAME from date in BigQuery SQL

I'm trying to extract the MONTH NAME from a date in BigQuery, the type is DATE (i.e., 2019-09-19). I tried something like: SELECT PARSE_DATE('%B',CAST(date_