'Azure Data Factory - copy task using Rest API is only returning first row upon execution

I have a copy task in ADF that is pulling data from a REST API into an Azure SQL Database. I've created the mappings, and pulled in a collection reference as follows:

preview of json data

source

sink

mappings

output

You will notice it's only outputting 1 row (the first row) when running the copy task. I know this is usually because you are pulling from a nested JSON array, in which the collection reference should resolve this to pull from the array - but I can't for the life of me get it to pull multiple records even after setting the collection.



Solution 1:[1]

Because of Azure Data Factory design limitation, pulling JSON data and inserting into Azure SQL Database isn't a good approach. Even after using the "Collective reference" you might not get the desired results.

The recommended approach is to store the output of REST API as a JSON file in Azure blob storage by Copy Data activity. Then you can use that file as Source and do transformation in Data Flow. Also you can use Lookup activity to get the JSON data and invoke the Stored Procedure to store the data in Azure SQL Database(This way will be cheaper and it's performance will be better).

Use the flatten transformation to take array values inside hierarchical structures such as JSON and unroll them into individual rows. This process is known as denormalization.

Refer this third-party tutorial for more details.

Solution 2:[2]

There's a trick to this. You import schemas, then you put the name of the array in collection reference then you import schemas again then it works Screen shot from azure data factory

Solution 3:[3]

Hey I had this issue and I noticed that the default column names for the json branches were really long and in my target csv the header row got truncated after a bit and I was able to get ADF working by just renaming them in the mapping section. For example i had:

['hours']['monday']['openIntervals'][0]['endTime'] in source and changed it to MondayCloseTime in destination.

Just started working. Can also just turn off the header on the output for a quick test before re writing all the column names as that also got it working for me

I assume it writes out the truncated header row at the same time as the 1st row of data and then tries to use that header row afterwards but as it doesn't match what its expecting it just ends. Bit annoying it doesn't give an error or anything but anyway this worked for me.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 UtkarshPal-MT
Solution 2 Mr E J Cadwallader
Solution 3 Jim