'Google Analytics response to Pandas Dataframe in Python
Still a newbie to Python so please be gentle.
I'm trying to parse a Google Analytics Reporting API V4 response to a Pandas dataframe in Python, specifically using Repl if that makes any difference. I've tried a few commonly accepted answers from SO but none seem to be working. The code 'successfully' (no error msgs) executes but even basic print('Done') commands at the end of the script aren't showing up in the terminal.
Can anyone see what might be wrong with my code?
import pandas as pd
from pandas import json_normalize
response = {'reports':[{'columnHeader':{'dimensions':['ga:date','ga:sourceMedium','ga:landingPagePath','ga:deviceCategory','ga:browser'],'metricHeader':{'metricHeaderEntries':[{'name':'ga:sessions','type':'INTEGER'}]}},'data':{'rows':[{'dimensions':['20201126','(direct) / (none)','/test/page.html','desktop','Chrome'],'metrics':[{'values':['1']}]}],'totals':[{'values':['1000']}],'rowCount':100,'minimums':[{'values':['1']}],'maximums':[{'values':['10']}],'isDataGolden':True},'nextPageToken':'1'}]}
def parse_data(response):
reports = response['reports'][0]
columnHeader = reports['columnHeader']['dimensions']
metricHeader = reports['columnHeader']['metricHeader']['metricHeaderEntries']
columns = columnHeader
for metric in metricHeader:
columns.append(metric['name'])
data = pd.json_normalize(reports['data']['rows'])
data_dimensions = pd.DataFrame(data['dimensions'].tolist())
data_metrics = pd.DataFrame(data['metrics'].tolist())
data_metrics = data_metrics.applymap(lambda x: x['values'])
data_metrics = pd.DataFrame(data_metrics[0].tolist())
result = pd.concat([data_dimensions, data_metrics], axis=1, ignore_index=True)
return result
print(result)
result.to_csv('result.csv')
print('Done')
Solution 1:[1]
It's because in a given function, any code that occurs after the return
statement is not executed. Once return
is reached, the function is done. If you put your print
call and to_csv
method before the return
statement, they should run.
Try reorganizing your code as such:
def parse_data(response):
reports = response['reports'][0]
columnHeader = reports['columnHeader']['dimensions']
metricHeader = reports['columnHeader']['metricHeader']['metricHeaderEntries']
columns = columnHeader
for metric in metricHeader:
columns.append(metric['name'])
data = pd.json_normalize(reports['data']['rows'])
data_dimensions = pd.DataFrame(data['dimensions'].tolist())
data_metrics = pd.DataFrame(data['metrics'].tolist())
data_metrics = data_metrics.applymap(lambda x: x['values'])
data_metrics = pd.DataFrame(data_metrics[0].tolist())
result = pd.concat([data_dimensions, data_metrics], axis=1, ignore_index=True)
print(result)
result.to_csv('result.csv')
print('Done')
return result
Solution 2:[2]
I would just modify the above function, in order to add colum names as well:
def parse_data(response):
reports = response['reports'][0]
columnHeader = reports['columnHeader']['dimensions']
metricHeader = reports['columnHeader']['metricHeader']['metricHeaderEntries']
#Get dimenssion names
dim_names = [columnHeader[n].split(":")[1] for n in range (len(columnHeader))]
#Get metric names
metric_names = [metricHeader[n]['name'].split(":")[1] for n in range (len(metricHeader))]
column_names = dim_names+metric_names
columns = columnHeader
for metric in metricHeader:
columns.append(metric['name'])
data = pd.json_normalize(reports['data']['rows'])
data_dimensions = pd.DataFrame(data['dimensions'].tolist())
data_metrics = pd.DataFrame(data['metrics'].tolist())
data_metrics = data_metrics.applymap(lambda x: x['values'])
data_metrics = pd.DataFrame(data_metrics[0].tolist())
result = pd.concat([data_dimensions, data_metrics], axis=1, ignore_index=True)
# Assign columns names to DF
result.columns = column_names
print(result)
result.to_csv('result.csv')
print('Done')
return result
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Kirill Dolbnev |