How to convert a JSON to a pandas DataFrame when the value is entirely in string format

I am trying to convert data from JSON to a DataFrame. My JSON:

{"data":"key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47, key=0Sr4C, age=68, key=CGEqo, age=76,
key=IxKVQ, age=79, key=eD221, age=29, key=XZbHV, age=32, key=k1SN5, age=88, key=4SCsU, age=65, key=q3kG6,
age=33, key=MGQpf, age=13, key=Kj6xW, age=14, key=tg2VM, age=30, key=WSnCU, age=24, key=f1Vvz, age=46, }

I want to create a DataFrame with key and age as columns. So far I have parsed the string, extracted each key and value, built a dict, and converted it to a DataFrame. I know pandas has several built-in functions that make life easier. Is there such a method, or an easier way, to create the DataFrame?

import requests
import pandas as pd

r = requests.get('https://coderbyte.com/api/challenges/json/age-counting')
input_str = (r.json()['data'])

input_str_split = input_str.split(',')
age_dict = {}
i = 0
while i < len(input_str_split) - 2:
    key = input_str_split[i].split('=')[1]
    value = input_str_split[i+1].split('=')[1]
    age_dict[key] = value
    i += 2

data = pd.DataFrame(age_dict.items(),columns = ['Item','Age'])


Solution 1:[1]

You can use a list comprehension and then select every second element with slicing (data[::2] for the keys, data[1::2] for the ages):

data = [x.split("=")[1] for x in input_str.split(", ")]
df = pd.DataFrame({"age": data[1::2], "key": data[::2]})

print(df)
#     age    key
# 0    58  IAfpK
# 1    64  WNVdi
# 2    47  jp9zt
# 3    68  0Sr4C
# 4    76  CGEqo
# ..   ..    ...
# 295  13  lRf1j
# 296  50  0iJGV
# 297   5  cFCfU
# 298  48  J8an1
# 299   5  dkSlj

Explanation:

  1. Split the string into individual items with split: input_str.split(", ")
  2. Split each item on = and keep the value after it: [x.split("=")[1] for x in input_str.split(", ")]
  3. Create the DataFrame by taking every second element (keys at even positions, ages at odd positions): df = pd.DataFrame({"age": data[1::2], "key": data[::2]})

Full illustration:

r = requests.get('https://coderbyte.com/api/challenges/json/age-counting')
input_str = r.json().get('data')

print(input_str.split(", "))
# ['key=IAfpK', 'age=58', 'key=WNVdi', 'age=64', ... 'key=dkSlj', 'age=5']

print([x.split("=") for x in input_str.split(", ")])
# [['key', 'IAfpK'], ['age', '58'], ['key', 'WNVdi'], ['age', '64'],  ... , ['key', 'dkSlj'], ['age', '5']]

print([x.split("=")[1] for x in input_str.split(", ")])
# ['IAfpK', '58', 'WNVdi', '64', ..., 'dkSlj', '5']

data = [x.split("=")[1] for x in input_str.split(", ")]

print(data[1::2])
# ['58', '64', ... , '5']
df = pd.DataFrame({"age": data[1::2], "key": data[::2]})
print(df)
#     age    key
# 0    58  IAfpK
# 1    64  WNVdi
# 2    47  jp9zt
# 3    68  0Sr4C
# 4    76  CGEqo
# ..   ..    ...
# 295  13  lRf1j
# 296  50  0iJGV
# 297   5  cFCfU
# 298  48  J8an1
# 299   5  dkSlj

# [300 rows x 2 columns]
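
Note that the values produced this way are all strings. If the ages are needed as numbers, they can be cast afterwards. Below is a minimal sketch of that step; the short `sample` string is a hypothetical stand-in for the API response, so the snippet runs without a network call:

```python
import pandas as pd

# Hypothetical sample in the same "key=..., age=..." format as the API response.
sample = "key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47"

# Keep only the part after "=" for every item.
values = [item.split("=")[1] for item in sample.split(", ")]
df = pd.DataFrame({"key": values[::2], "age": values[1::2]})

# The parsed ages are strings; cast them so numeric comparisons work.
df["age"] = df["age"].astype(int)

# Example use: count how many ages are 50 or above.
over_50 = int((df["age"] >= 50).sum())
print(over_50)
```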

Solution 2:[2]

Here is a solution you can try out.

zip(split_[::2], split_[1::2]) pairs each key=... item with the age=... item that follows it: key=IAfpK with age=58, key=WNVdi with age=64, and so on.

import pandas as pd

# data is the raw string from the JSON response, e.g. r.json()["data"]
split_ = data.split(",")

df = pd.DataFrame(
    {"Item": i.split("=")[-1], "Age": j.split("=")[-1]}
    for i, j in zip(split_[::2], split_[1::2])
)

     Item Age
0   IAfpK  58
1   WNVdi  64
2   jp9zt  47
3   0Sr4C  68
    ...
    ...
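
To make the pairing step easier to inspect, here is a small sketch that runs the same zip-based idea on a hypothetical three-pair `sample` string instead of the full API response. The names `sample`, `parts`, `pairs`, and `rows` are illustrative only:

```python
import pandas as pd

# Hypothetical sample in the same format as the API response.
sample = "key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47"
parts = sample.split(", ")

# Even positions hold the key items, odd positions hold the age items.
pairs = list(zip(parts[::2], parts[1::2]))

# Build one row per (key, age) pair, keeping only the text after "=".
rows = [
    {"Item": key_item.split("=")[-1], "Age": age_item.split("=")[-1]}
    for key_item, age_item in pairs
]
df = pd.DataFrame(rows)
print(df)
```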

Solution 3:[3]

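Unfortunately, the output of the code in your question is wrong: the loop condition i < len(input_str_split) - 2 appears to stop before the last key/age pair.

Here is an answer that keeps every pair.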

import requests
import re
import pandas as pd

r = requests.get('https://coderbyte.com/api/challenges/json/age-counting')
input_str = (r.json()['data'])
input_str_split = input_str.split(', ')

# Raw strings avoid the invalid "\=" escape; "=" needs no escaping in a regex.
key_pattern = re.compile(r"key=.*")
age_pattern = re.compile(r"age=.*")

key_list = [x[4:] for x in input_str_split if key_pattern.match(x)]
age_list = [x[4:] for x in input_str_split if age_pattern.match(x)]

data = pd.DataFrame({'Item': key_list, 'Age': age_list})

The output is:

      Item Age
0    IAfpK  58
1    WNVdi  64
2    jp9zt  47
3    0Sr4C  68
4    CGEqo  76
..     ...  ..
295  lRf1j  13
296  0iJGV  50
297  cFCfU   5
298  J8an1  48
299  dkSlj   5
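
As a variation on the regex idea, both values can be captured in one pass with re.findall and two capture groups. This is only a sketch, not the answer's own method: it assumes every key is immediately followed by its age, separated by ", ", and it uses a hypothetical `sample` string in place of the API response:

```python
import re
import pandas as pd

# Hypothetical sample in the same format as the API response.
sample = "key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47"

# Assumption: each "key=..." is directly followed by ", age=<digits>".
pair_pattern = re.compile(r"key=(\w+), age=(\d+)")

# With two capture groups, findall returns a list of (key, age) tuples.
pairs = pair_pattern.findall(sample)

# Build the DataFrame and cast the age column from str to int.
df = pd.DataFrame(pairs, columns=["Item", "Age"]).astype({"Age": int})
print(df)
```

If the real data ever has a key without a following age, this pattern silently skips that key, so it is worth comparing len(pairs) against the expected row count.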

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 sushanth
Solution 3 GH KIM