'How to check JSON and add/remove key

I have a JSON file as an export of a nosql database, the sample below is 2 of 1xxx records.

{
    "task": {
        "id1": {
            "completed": true,
            "title": "Testing",
            "desc": "short desc",
            "status": "completed",
            "dueDate": {
                "_seconds": 1622607310,
                "_nanoseconds": 867000000
            },
            "citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
            "datePosted": {
                "_seconds": 1622607408,
                "_nanoseconds": 365000000
            },
            "genre": "Errands"
        },
        "id2": {
            "completed": true,
            "title": "Testing 2",
            "status": "completed",
            "dueDate": {
                "_seconds": 1622608576,
                "_nanoseconds": 476999000
            },
            "citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
            "datePosted": {
                "_seconds": 1622608592,
                "_nanoseconds": 628999000
            },
            "genre": "Errands",
            "latLng": null
        }
    }
}

As the keys inside id1 and id2 are not consistent, im trying to find a way/tool to make sure that the keys are consistant before i use them. also, as the export push the date to as in seconds, im also trying to convert it to a timestamp/string

The final output ideally should be:

{
"task": {
    "id1": {
        "completed": true,
        "title": "Testing",
        "desc": "short desc",
        "status": "completed",
        "dueDate": "Wednesday, 2 June 2021 12:15:10.867 GMT+08:00",
        "citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
        "datePosted": "Wednesday, 2 June 2021 12:16:48.365 GMT+08:00",
        "genre": "Errands"
    },
    "id2": {
        "completed": true,
        "title": "Testing 2",
        "desc": "",
        "status": "completed",
        "dueDate": "Wednesday, 2 June 2021 12:36:16.476 GMT+08:00",
        "citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
        "datePosted": "Wednesday, 2 June 2021 12:36:32.628 GMT+08:00",
        "genre": "Errands"
    }
}

Any help is appreciated

Edit 1: I found the wonderful jq, but im not able to write a good enough query:

    . as $data | 
paths(scalars) | . as $path | 
"\($path[1]):{\($path[2]):\($data | getpath($path))}"

So this gives me:

"id1:{completed:true}"
"id1:{title:Testing}"
"id1:{desc:short desc}"
"id1:{status:completed}"
"id1:{dueDate:1622607310}"
"id1:{dueDate:867000000}"
"id1:{citizenID:Uvr0vZqIPON5UZsgJsMYPe2Qfsi2}"
"id1:{datePosted:1622607408}"
"id1:{datePosted:365000000}"
"id1:{genre:Errands}"
"id2:{completed:true}"
"id2:{title:Testing 2}"
"id2:{status:completed}"
"id2:{dueDate:1622608576}"
"id2:{dueDate:476999000}"
"id2:{citizenID:Uvr0vZqIPON5UZsgJsMYPe2Qfsi2}"
"id2:{datePosted:1622608592}"
"id2:{datePosted:628999000}"
"id2:{genre:Errands}"

which is still far from the ideal output



Solution 1:[1]

Given your input file as input.json and the following JQ script as transform.jq:

def stddate:
    ._seconds | todate;

.task
| map_values(
    .dueDate |= stddate
    | .datePosted |= stddate
)

This command

jq -f transform.jq input.json

returns this output:

{
  "id1": {
    "completed": true,
    "title": "Testing",
    "desc": "short desc",
    "status": "completed",
    "dueDate": "2021-06-02T04:15:10Z",
    "citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
    "datePosted": "2021-06-02T04:16:48Z",
    "genre": "Errands"
  },
  "id2": {
    "completed": true,
    "title": "Testing 2",
    "status": "completed",
    "dueDate": "2021-06-02T04:36:16Z",
    "citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
    "datePosted": "2021-06-02T04:36:32Z",
    "genre": "Errands",
    "latLng": null
  }
}

Of course you can squeeze the complete script into one command line argument.

Missing things:

  • The _nanoseconds are ignored. You can readup the date/time handling of JQ and follow the links to the used C functions.

  • The unnecessary task level is missing. If you really need it, replace .task | with .task |= to get it back.

  • Whatever "consistent keys" mean to you can be done inside map_values.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 A.H.