'How to check JSON and add/remove key
I have a JSON file as an export of a nosql database, the sample below is 2 of 1xxx records.
{
"task": {
"id1": {
"completed": true,
"title": "Testing",
"desc": "short desc",
"status": "completed",
"dueDate": {
"_seconds": 1622607310,
"_nanoseconds": 867000000
},
"citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
"datePosted": {
"_seconds": 1622607408,
"_nanoseconds": 365000000
},
"genre": "Errands"
},
"id2": {
"completed": true,
"title": "Testing 2",
"status": "completed",
"dueDate": {
"_seconds": 1622608576,
"_nanoseconds": 476999000
},
"citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
"datePosted": {
"_seconds": 1622608592,
"_nanoseconds": 628999000
},
"genre": "Errands",
"latLng": null
}
}
}
As the keys inside id1 and id2 are not consistent, im trying to find a way/tool to make sure that the keys are consistant before i use them. also, as the export push the date to as in seconds, im also trying to convert it to a timestamp/string
The final output ideally should be:
{
"task": {
"id1": {
"completed": true,
"title": "Testing",
"desc": "short desc",
"status": "completed",
"dueDate": "Wednesday, 2 June 2021 12:15:10.867 GMT+08:00",
"citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
"datePosted": "Wednesday, 2 June 2021 12:16:48.365 GMT+08:00",
"genre": "Errands"
},
"id2": {
"completed": true,
"title": "Testing 2",
"desc": "",
"status": "completed",
"dueDate": "Wednesday, 2 June 2021 12:36:16.476 GMT+08:00",
"citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
"datePosted": "Wednesday, 2 June 2021 12:36:32.628 GMT+08:00",
"genre": "Errands"
}
}
Any help is appreciated
Edit 1: I found the wonderful jq, but im not able to write a good enough query:
. as $data |
paths(scalars) | . as $path |
"\($path[1]):{\($path[2]):\($data | getpath($path))}"
So this gives me:
"id1:{completed:true}"
"id1:{title:Testing}"
"id1:{desc:short desc}"
"id1:{status:completed}"
"id1:{dueDate:1622607310}"
"id1:{dueDate:867000000}"
"id1:{citizenID:Uvr0vZqIPON5UZsgJsMYPe2Qfsi2}"
"id1:{datePosted:1622607408}"
"id1:{datePosted:365000000}"
"id1:{genre:Errands}"
"id2:{completed:true}"
"id2:{title:Testing 2}"
"id2:{status:completed}"
"id2:{dueDate:1622608576}"
"id2:{dueDate:476999000}"
"id2:{citizenID:Uvr0vZqIPON5UZsgJsMYPe2Qfsi2}"
"id2:{datePosted:1622608592}"
"id2:{datePosted:628999000}"
"id2:{genre:Errands}"
which is still far from the ideal output
Solution 1:[1]
Given your input file as input.json
and the following JQ script as transform.jq
:
def stddate:
._seconds | todate;
.task
| map_values(
.dueDate |= stddate
| .datePosted |= stddate
)
This command
jq -f transform.jq input.json
returns this output:
{
"id1": {
"completed": true,
"title": "Testing",
"desc": "short desc",
"status": "completed",
"dueDate": "2021-06-02T04:15:10Z",
"citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
"datePosted": "2021-06-02T04:16:48Z",
"genre": "Errands"
},
"id2": {
"completed": true,
"title": "Testing 2",
"status": "completed",
"dueDate": "2021-06-02T04:36:16Z",
"citizenID": "Uvr0vZqIPON5UZsgJsMYPe2Qfsi2",
"datePosted": "2021-06-02T04:36:32Z",
"genre": "Errands",
"latLng": null
}
}
Of course you can squeeze the complete script into one command line argument.
Missing things:
The
_nanoseconds
are ignored. You can readup the date/time handling of JQ and follow the links to the used C functions.The unnecessary
task
level is missing. If you really need it, replace.task |
with.task |=
to get it back.Whatever "consistent keys" mean to you can be done inside
map_values
.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | A.H. |