'Data Transfers to Big Query error "Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1"

Hello im currently trying to establish daily data transfers from Google Cloud Storage to Big Query tables. Theses tables are just meant to store raw data (JSON files) and i unnest them later with scheduled queries. I have no issue when i create the table manually but im getting the error Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1 when i launch a test transfer (even if I previously deleted the data in the table).

Here is an example of the raw data table/JSON schema:

{
  "fields": [{
    "name": "data",
    "type": "RECORD",
    "mode": "REPEATED",
    "schema": {
      "fields": [{
        "name": "relationships",
        "type": "RECORD",
        "mode": "NULLABLE",
        "schema": {
          "fields": [{
            "name": "createdBy",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "validationWorkflow",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "REPEATED",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "projects",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "REPEATED",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "expensesReport",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "agency",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "files",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "STRING",
                "mode": "REPEATED"
              }]
            }
          }, {
            "name": "resource",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "validations",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "REPEATED",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "orders",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "REPEATED",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }]
        }
      }, {
        "name": "attributes",
        "type": "RECORD",
        "mode": "NULLABLE",
        "schema": {
          "fields": [{
            "name": "plannedTimes",
            "type": "STRING",
            "mode": "REPEATED"
          }, {
            "name": "state",
            "type": "STRING",
            "mode": "NULLABLE"
          }, {
            "name": "exceptionalTimes",
            "type": "STRING",
            "mode": "REPEATED"
          }, {
            "name": "closed",
            "type": "BOOLEAN",
            "mode": "NULLABLE"
          }, {
            "name": "informationComments",
            "type": "STRING",
            "mode": "NULLABLE"
          }, {
            "name": "regularTimes",
            "type": "RECORD",
            "mode": "REPEATED",
            "schema": {
              "fields": [{
                "name": "batch",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "title",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }, {
                "name": "workUnitType",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "name",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "activityType",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "reference",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }, {
                "name": "project",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "reference",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }, {
                "name": "row",
                "type": "INTEGER",
                "mode": "NULLABLE"
              }, {
                "name": "delivery",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "endDate",
                    "type": "DATE",
                    "mode": "NULLABLE"
                  }, {
                    "name": "startDate",
                    "type": "DATE",
                    "mode": "NULLABLE"
                  }, {
                    "name": "title",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }, {
                "name": "startDate",
                "type": "DATE",
                "mode": "NULLABLE"
              }, {
                "name": "duration",
                "type": "FLOAT",
                "mode": "NULLABLE"
              }, {
                "name": "id",
                "type": "INTEGER",
                "mode": "NULLABLE"
              }]
            }
          }, {
            "name": "updateDate",
            "type": "TIMESTAMP",
            "mode": "NULLABLE"
          }, {
            "name": "creationDate",
            "type": "TIMESTAMP",
            "mode": "NULLABLE"
          }, {
            "name": "absencesTimes",
            "type": "RECORD",
            "mode": "REPEATED",
            "schema": {
              "fields": [{
                "name": "workUnitType",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "name",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "activityType",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "reference",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }, {
                "name": "duration",
                "type": "FLOAT",
                "mode": "NULLABLE"
              }, {
                "name": "startDate",
                "type": "DATE",
                "mode": "NULLABLE"
              }]
            }
          }, {
            "name": "term",
            "type": "STRING",
            "mode": "NULLABLE"
          }]
        }
      }, {
        "name": "type",
        "type": "STRING",
        "mode": "NULLABLE"
      }, {
        "name": "id",
        "type": "INTEGER",
        "mode": "NULLABLE"
      }]
    }
  }, {
    "name": "date",
    "type": "DATE",
    "mode": "NULLABLE"
  }]
}

I know that BQ works better with JSON newline delimited format but this table must contains raw data even if its just 1 row in the end. The weirdest thing is that the transfer works for some files with similar schemas.

What should I do to make these transfers work ?

Thank you for your help



Solution 1:[1]

Regarding GCP documentation, I would say that using JSON newline delimited format is a known limitation.

"JSON data must be newline delimited. Each JSON object must be on a separate line in the file."

Solution 2:[2]

Here is the solution:

Some of the fields were auto-detected as "TIMESTAMP" when uploading manually. However, Data Transfer didn't recognize these fields as "TIMESTAMP" during the daily updates of these tables. To solve this I edited the schema and declared these fields as "STRING".

Thank you for your help ;)

Solution 3:[3]

I'm also facing the same issue while I was working with Cloud DataFlow [ Apache Beam ] to migrate .csv data from Cloud Storage to Google BigQuery.

This error occurs during the conversion of .csv file to .json file resulting into an error. I resolved this issue by adding few lines of code to convert .csv file to .json file.

CSV file:

Gender,Math,Physics,Chemistry
male,57,50,53
male,63,66,59
male,65,56,54

CSV to JSON conversion:

Your output data must be in this below format then only it will load data to Google Bigquery.

[ {"Gender":"male","Math":"57","Physics":"50","Chemistry":"53"},
{"Gender":"male","Math":"63","Physics":"66","Chemistry":"59"},
{"Gender":"male","Math":"65","Physics":"56","Chemistry":"54"}, ]

Python code to convert .csv to .json file

count = "start"
lst =[]
final = ""
class csv2json(beam.DoFn):
    def process(self,csvFile):
        global count
        global header
        global lst,final
        min_dict={}
        lst=[]
        print("**************")
        print(csvFile)
        print(type(csvFile))
        print("**************")
    #Consider first row as a header
    if "start" == count:
        for item in range(len(csvFile)):
            min_dict[csvFile[item]]=""
        count = "end"
        header = csvFile
    #Consider rest are the value for headers.
    else:
        for item in range(len(csvFile)):
            min_dict[header[item]]=csvFile[item]
     #converting dict into list of dict
        lst.append(min_dict)
        
        final = lst 
        print("**************")
        print(final)
        print(type(final))
        print("**************")
        
    return final

Hope my solution works for you as well.

For your reference have a glance here: https://beam.apache.org/documentation/io/built-in/google-bigquery/#writing-to-a-table

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 DamPlz
Solution 2 malevil667
Solution 3