What is the best way to import Google Analytics data into Azure Blob/Data Lake?

I am trying to import Google Analytics data into Azure Blob or Data Lake storage for analysis or reporting. But I don't see a Google Analytics connector in Azure Data Factory.

I see some third-party connectors such as CData, Xplenty, and Stitch Data, but they all require payment. I also tried the Google Analytics API, but with my limited knowledge I am not sure how to use it to bring data into Azure. Is there a way to bring Google Analytics data into Azure for free?



Solution 1:[1]

Based on my research, there are two ways you could transfer data from Google Analytics into Azure Blob.

1. As described in this question (How could I import google analytics data to Google Cloud Platform?), you could transfer data from Google Analytics into Google BigQuery. ADF supports a Google BigQuery connector.


2. ADF supports a REST connector. You could refer to this API document: https://developers.google.com/analytics/devguides/reporting/core/v3/reference
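For the second option, here is a minimal sketch of what calling the Core Reporting API v3 and landing the response in Blob Storage could look like from Python, assuming a service-account credential file, a placeholder view ID, container name, and storage connection string (none of these come from the original answer):

```python
# Hedged sketch: pull a report from the Core Reporting API v3 with the Google
# API client, then land the raw JSON response in Azure Blob Storage.
# The view ID, credential path, container, and connection string are placeholders.
import json

from google.oauth2 import service_account
from googleapiclient.discovery import build
from azure.storage.blob import BlobServiceClient

SCOPES = ["https://www.googleapis.com/auth/analytics.readonly"]

credentials = service_account.Credentials.from_service_account_file(
    "ga-service-account.json", scopes=SCOPES
)
analytics = build("analytics", "v3", credentials=credentials)

# Query sessions and pageviews for the last 7 days from a GA view (placeholder ID).
report = analytics.data().ga().get(
    ids="ga:123456789",
    start_date="7daysAgo",
    end_date="today",
    metrics="ga:sessions,ga:pageviews",
    dimensions="ga:date",
).execute()

# Upload the raw JSON to a Blob container for later analysis.
blob_service = BlobServiceClient.from_connection_string("<storage-connection-string>")
blob_service.get_blob_client(container="ga-raw", blob="ga_report.json").upload_blob(
    json.dumps(report), overwrite=True
)
```

The same request could instead be issued by the ADF REST connector itself; the script just makes the shape of the API call explicit.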

Solution 2:[2]

Unfortunately, Azure Data Factory doesn't support a Google Analytics connector.

Reference: Azure Data Factory supported connectors.

I would suggest you vote up an idea submitted by another Azure customer:

https://feedback.azure.com/d365community/idea/4ca9dce8-6d26-ec11-b6e6-000d3a4f032c

All of the feedback you share in these forums will be monitored and reviewed by the Microsoft engineering teams responsible for building Azure.

Solution 3:[3]

I hope I am not too late for this question.

I have been looking into this as well, and narrowed down my options to the following:

  1. Granular/hit-level data with Google BigQuery: There is a connector available in Azure Data Factory as of November 2019. In order to utilize it, you must connect Analytics 360 to BigQuery. Analytics 360 costs about $150k/year, which I don't think is the most reasonable option.

  2. Sampled data: You can write a worker service to fetch data (live or otherwise) into Azure using the Reporting API v4 or the streaming analytics API. Again, this is sampled data, and I don't think it will bring significant value.

  3. Granular/hit-level raw data using event routing: You can modify the Google Analytics JavaScript code by injecting custom JavaScript that routes hit-level data to your own server (a minimal receiving-endpoint sketch is below). This post describes the approach in more detail: Ingesting raw google analytics data

I am going to work on this one next week and capture everything on Azure using Azure workers and SQL Server. Let me know if I can be of any help there.
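As a rough illustration of option 3 (not the poster's exact setup, which targets SQL Server), the routed hits could be received by an Azure Functions HTTP endpoint and persisted to Blob Storage. The "collect" route, "ga-hits" container, and storage connection string below are placeholder assumptions:

```python
# Hedged sketch: an Azure Functions HTTP endpoint (Python v2 programming model)
# that receives hit-level payloads routed from the injected analytics JavaScript
# and stores each one as its own JSON blob.
import json
import uuid
from datetime import datetime, timezone

import azure.functions as func
from azure.storage.blob import BlobServiceClient

app = func.FunctionApp(http_auth_level=func.AuthLevel.ANONYMOUS)

# Placeholder: in practice read this from application settings, not a literal.
STORAGE_CONNECTION_STRING = "<storage-connection-string>"


@app.route(route="collect", methods=["POST"])
def collect(req: func.HttpRequest) -> func.HttpResponse:
    """Persist one routed Google Analytics hit as a JSON blob."""
    hit = req.get_json()  # raises if the request body is not valid JSON
    blob_name = f"{datetime.now(timezone.utc):%Y/%m/%d}/{uuid.uuid4()}.json"

    blob_service = BlobServiceClient.from_connection_string(STORAGE_CONNECTION_STRING)
    blob_service.get_blob_client(container="ga-hits", blob=blob_name).upload_blob(
        json.dumps(hit)
    )
    return func.HttpResponse(status_code=202)
```

From there the captured blobs can be loaded into SQL Server, Data Lake, or anything else with Data Factory.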

Solution 4:[4]

The Core Reporting API allows you to get quite a lot of dimensions and metrics. Data Factory has a REST connector whose pagination also works out of the box.
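To make the pagination point concrete, here is a hedged sketch of paging through the Analytics Reporting API v4 with pageToken/nextPageToken, the same pattern the ADF REST connector's pagination rules can follow; the view ID and credential path are placeholders:

```python
# Hedged sketch: page through the Analytics Reporting API v4 until no
# nextPageToken is returned. View ID and credential file are placeholders.
from google.oauth2 import service_account
from googleapiclient.discovery import build

credentials = service_account.Credentials.from_service_account_file(
    "ga-service-account.json",
    scopes=["https://www.googleapis.com/auth/analytics.readonly"],
)
reporting = build("analyticsreporting", "v4", credentials=credentials)

rows, page_token = [], None
while True:
    request = {
        "viewId": "123456789",
        "dateRanges": [{"startDate": "30daysAgo", "endDate": "today"}],
        "metrics": [{"expression": "ga:sessions"}],
        "dimensions": [{"name": "ga:date"}, {"name": "ga:sourceMedium"}],
        "pageSize": 10000,
    }
    if page_token:
        request["pageToken"] = page_token

    report = reporting.reports().batchGet(
        body={"reportRequests": [request]}
    ).execute()["reports"][0]

    rows.extend(report["data"].get("rows", []))
    page_token = report.get("nextPageToken")
    if not page_token:
        break

print(f"Fetched {len(rows)} rows")
```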

Another good option is to use BigQuery in between and utilize the Data Factory BigQuery connector.

The third option is to use a GTM callback method to push your data layer to an API that a Functions app in Azure can listen to.

Solution 5:[5]

I have implemented a reasonably low-cost solution without the need for GA 360. This was done using Google Tag Manager, which duplicates hits and pushes them to Azure Event Hubs, where you can then consume them however you want. One option is saving files to Blob storage for later use (using Event Hubs Capture); another is processing them with an Azure Stream Analytics job, or even perhaps Azure Functions.
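If you do want code in the consumption path, a minimal sketch with the azure-eventhub SDK could look like the following, assuming placeholder connection strings, a hub named "ga-hits", and a Blob container of the same name (Event Hubs Capture achieves the Blob-landing part with no code at all):

```python
# Hedged sketch: read the GTM-duplicated hits back out of Azure Event Hubs and
# land each event as a blob. Connection strings, hub name, and container name
# are placeholders.
import uuid

from azure.eventhub import EventHubConsumerClient
from azure.storage.blob import BlobServiceClient

blob_service = BlobServiceClient.from_connection_string("<storage-connection-string>")
container = blob_service.get_container_client("ga-hits")


def on_event(partition_context, event):
    # Persist the raw hit payload as its own JSON blob.
    container.upload_blob(name=f"{uuid.uuid4()}.json", data=event.body_as_str())


consumer = EventHubConsumerClient.from_connection_string(
    "<event-hub-connection-string>",
    consumer_group="$Default",
    eventhub_name="ga-hits",
)
with consumer:
    consumer.receive(on_event=on_event, starting_position="-1")  # from the beginning
```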

This was inspired by this blog post, which pushes to Snowplow.

Solution 6:[6]

This question is very broad and plenty of good solutions have been shared. However, just as an update: the new version of Google Analytics (GA4) offers a free data integration with BigQuery. From there it should be easy to move the data into Azure with Data Factory.
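For example, once the GA4 export is flowing, the event tables can be queried with the BigQuery client and the result copied to Blob Storage; the ADF Google BigQuery connector is the no-code way to do the same copy. The project, dataset name, date range, and storage details below are placeholder assumptions:

```python
# Hedged sketch: query the free GA4 BigQuery export and save the result to
# Azure Blob Storage as CSV. Project, dataset, and storage names are placeholders.
from azure.storage.blob import BlobServiceClient
from google.cloud import bigquery

bq = bigquery.Client(project="my-gcp-project")  # placeholder GCP project

sql = """
    SELECT event_date, event_name, COUNT(*) AS events
    FROM `my-gcp-project.analytics_123456789.events_*`
    WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'
    GROUP BY event_date, event_name
"""
df = bq.query(sql).to_dataframe()

blob_service = BlobServiceClient.from_connection_string("<storage-connection-string>")
blob_service.get_blob_client(container="ga4-export", blob="events_jan.csv").upload_blob(
    df.to_csv(index=False), overwrite=True
)
```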

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Jay Gong
Solution 2 Sam Firke
Solution 3 eAlie
Solution 4 Grigory Zhadko
Solution 5
Solution 6 Simon Breton