Python - Scraping API and Transform JSON Data to CSV

My experience: I have just started with Python and am not a developer.

My goal: I am trying to scrape the Sofascore API to get soccer lineups into a CSV file. The JSON data needs to be transformed; the final output should contain "Player Name", "substitute" and "avgRating". I also want a loop for scraping multiple lineups ("https://api.sofascore.com/api/v1/event/xxxxxx/lineups").

import pandas as pd
import requests
from io import StringIO
import json
import codecs

url = "https://api.sofascore.com/api/v1/event/9576298/lineups"
headers = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/20100101 Firefox/66.0"}
r = requests.get(url, headers=headers)
data = r.json()  # named "data" so the imported json module is not shadowed

# Show the JSON
print(data)

I now have the data for one lineup, but I don't understand how to transform it into the output described above. Any advice to put me in the right direction would be awesome!



Solution 1:[1]

This should do the job for one lineup:

import requests
import csv

# Get the data from API
url = r"https://api.sofascore.com/api/v1/event/9576298/lineups"
headers = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/20100101 Firefox/66.0"}
r = requests.get(url, headers=headers)
json_data = r.json()

# Load the lists of players from the scraped JSON data
home_players_json = json_data["home"]["players"]
away_players_json = json_data["away"]["players"]


# Open a new file as our CSV file
with open("players.csv", "w", newline='') as csv_file:
    csv_writer = csv.writer(csv_file)

    # Add a header
    csv_writer.writerow(["Player Name", "Substitute", "avgRating"])

    # Combine the lists of home players and away players inside a tuple...
    # ...to iterate through both of them with less code!
    for players_lists in (home_players_json, away_players_json):

        # Iterate through each player's data
        for player in players_lists:

            # Add each player's data to a row
            csv_writer.writerow([
                player["player"]["name"],
                player["substitute"],
                player["avgRating"]
            ])

For multiple lineups, you may need more data checking, because some lineups don't provide substitute or avgRating values for all players. Accessing a missing key with player["avgRating"] raises a KeyError.

Also, some of the imports you used aren't actually needed: pandas, StringIO, json and codecs are never used in your snippet.
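
As a rough sketch of the loop over multiple lineups with the extra data checking mentioned above, something along these lines might work. It assumes the JSON layout used in the code above ("home"/"away", then "players", then "player" -> "name", "substitute", "avgRating"). The helper names (fetch_lineup, lineup_rows, write_lineups), the extra "Event ID" column and the 10-second timeout are my own choices for illustration, not anything the API requires. Missing values are written as empty cells instead of raising a KeyError.

```python
import csv

# URL pattern taken from the question; the event IDs are supplied by the caller.
LINEUP_URL = "https://api.sofascore.com/api/v1/event/{event_id}/lineups"
HEADERS = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/20100101 Firefox/66.0"}


def fetch_lineup(event_id):
    """Download the lineup JSON for one event and return it as a dict."""
    # Imported here so the parsing helpers below also work without requests installed.
    import requests

    response = requests.get(LINEUP_URL.format(event_id=event_id), headers=HEADERS, timeout=10)
    response.raise_for_status()  # stop on HTTP errors instead of parsing an error page
    return response.json()


def lineup_rows(json_data):
    """Return one [name, substitute, avgRating] row per player, home side first."""
    rows = []
    for side in ("home", "away"):
        # .get() with a default avoids a KeyError when a key is missing
        for player in json_data.get(side, {}).get("players", []):
            rows.append([
                player.get("player", {}).get("name", ""),
                player.get("substitute", ""),
                player.get("avgRating", ""),
            ])
    return rows


def write_lineups(event_ids, path="players.csv", fetch=fetch_lineup):
    """Write the players of every event in event_ids to a single CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        csv_writer = csv.writer(csv_file)
        # "Event ID" is an added column (an assumption) so rows can be told apart
        csv_writer.writerow(["Event ID", "Player Name", "Substitute", "avgRating"])
        for event_id in event_ids:
            for row in lineup_rows(fetch(event_id)):
                csv_writer.writerow([event_id] + row)
```

Calling write_lineups([9576298]) would write the lineup from the question; add more event IDs to the list to scrape several lineups into the same file. The fetch parameter only exists so the download step can be swapped out, for example to test the CSV logic with saved JSON.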

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1