Fluent Bit nested JSON parsing

I have the following log:


{
  "log": {
    "header": {
      "key": "value",
      "nested": "{\"key1\":\"value\",\"key2\":\"value\"}",
      "dateTime": "2019-05-08T20:58:06+00:00"
    },
    "body": {
      "path": "/request/path/",
      "method": "POST",
      "ua": "curl/7.54.0",
      "resp": 200
    }
  }
}

I'm trying to aggregate logs using Fluent Bit, and I want the entire record to be JSON. The specific problem is the "log.header.nested" field, which is a JSON string. How can I parse that string and replace it with its contents?

I tried using Fluent Bit's parser filter, but I have an issue with key_name: it doesn't work well with nested JSON values. Testing locally with non-nested fields, the following configuration works:

[INPUT]
    name             tail
    path             nst.log
    read_from_head   true
    Parser           json
[FILTER]
    name          parser
    Match         *
    Parser        json
    key_name      log
    Reserve_Data  On

[FILTER]
    name          parser
    Match         *
    Parser        json
    key_name      nested
    Reserve_Data  On

[OUTPUT]
    name             stdout
    match            *
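For reference, the json parser referenced above is Fluent Bit's stock one from parsers.conf, loaded via the [SERVICE] section; a minimal equivalent of that definition (the file name is just how my local test is laid out) would be:

[SERVICE]
    # load parser definitions from a separate file (path specific to my test setup)
    Parsers_File  parsers.conf

# parsers.conf - roughly the stock json parser definition
[PARSER]
    Name         json
    Format       json
    Time_Key     time
    Time_Format  %d/%b/%Y:%H:%M:%S %z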

But when I try this filter for nested values:

[FILTER]
    name          parser
    Match         *
    Parser        json
    key_name      log.header.nested
    Reserve_Data  On

It doesn't work, and there is nothing in the Fluent Bit documentation about how to use nested keys in the key_name field, so I tried:

  • log.header.nested
  • log_header_nest
  • log['header']['nest']
  • log[header][nest]

For clarity, I'd like the logs output by Fluent Bit to look like this:

{
  "log": {
    "header": {
      "key": "value",
      "nested": {
          "key1": "value",
          "key2": "value"
      },
      "dateTime": "2019-05-08T20:58:06+00:00"
    },
    "body": {
      "path": "/request/path/",
      "method": "POST",
      "ua": "curl/7.54.0",
      "resp": 200
    }
  }
}



Solution 1:[1]

You can try to combine the Nest filter plugin with the Parser filter plugin. For instance, I managed to parse nested JSON at the first level with the following configuration:

    [FILTER]
        Name          nest
        Match         application.*
        Operation     lift
        Nested_under  log_processed
        Add_prefix    log_
        Wildcard      message

    [FILTER]
        Name          parser
        Match         application.*
        Key_Name      log_message
        Parser        docker
        Preserve_Key  On
        Reserve_Data  On
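(The docker parser used here is the stock one shipped in Fluent Bit's default parsers.conf; if you maintain your own parsers file, a roughly equivalent definition would be:)

    # roughly the stock docker parser from Fluent Bit's default parsers.conf
    [PARSER]
        Name         docker
        Format       json
        Time_Key     time
        Time_Format  %Y-%m-%dT%H:%M:%S.%L
        Time_Keep    On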

Before the two filters, the record is:

{
  "time": "2022-05-10T19:43:04.655207298Z",
  "stream": "stdout",
  "_p": "F",
  "log": "{\"timestamp\":\"2022-05-10 19:43:04.654\",\"level\":\"DEBUG\",\"nested\":\"{\\\"key1\\\":\\\"value\\\",\\\"key2\\\":\\\"value\\\"}\",\"context\":\"default\"}",
  "log_processed": {
    "timestamp": "2022-05-10 19:43:04.654",
    "level": "DEBUG",
    "message": "{\"key1\":\"value\",\"key2\":\"value\"}",
    "context": "default"
  }
}

After the two filters, the one-level nested JSON is parsed to:

{
  "time": "2022-05-10T19:43:04.655207298Z",
  "stream": "stdout",
  "_p": "F",
  "log": "{\"timestamp\":\"2022-05-10 19:43:04.654\",\"level\":\"DEBUG\",\"nested\":\"{\\\"key1\\\":\\\"value\\\",\\\"key2\\\":\\\"value\\\"}\",\"context\":\"default\"}",
  "log_processed": {
    "timestamp": "2022-05-10 19:43:04.654",
    "level": "DEBUG",
    "message": "{\"key1\":\"value\",\"key2\":\"value\"}",
    "context": "default"
  },
  "log_message": "{\"key1\":\"value\",\"key2\":\"value\"}",
  "key1": "value",
  "key2": "value"
}

To reach deeper levels, you could apply the nest plugin with the lift operation more than once.
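Applied to the record from the question, that would mean lifting log first, then the lifted header map, and only then pointing the parser filter at the flattened key. This is an untested sketch; the prefixed key names (log_header, log_header_nested) are what the lift operations should produce with these prefixes:

    [FILTER]
        # lift the contents of "log" to the top level as log_header, log_body
        Name          nest
        Match         *
        Operation     lift
        Nested_under  log
        Add_prefix    log_

    [FILTER]
        # lift the contents of log_header as log_header_key, log_header_nested, ...
        Name          nest
        Match         *
        Operation     lift
        Nested_under  log_header
        Add_prefix    log_header_

    [FILTER]
        # parse the JSON string held in log_header_nested
        Name          parser
        Match         *
        Key_Name      log_header_nested
        Parser        json
        Preserve_Key  On
        Reserve_Data  On

Note that the parser filter merges key1 and key2 into the root of the record rather than back under log.header, so restoring the original nested layout would take additional nest filters with the nest operation.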

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
[1] Solution 1: landal79