'Empty response from druid when using quantile data sketch aggregator

I am trying to compute the the 95th percentile value of a metric in druid. I came across this documentation https://druid.apache.org/docs/latest/development/extensions-core/datasketches-quantiles.html which says that we can build sketches from raw data at query time.

I prepared this druid query

{
    "queryType": "timeseries",
    "intervals": [
        "2020-05-08T11:45:00.000Z/2020-05-08T11:50:00.000Z"
    ],
    "granularity": "minute",
    "dataSource": "datasource",
    "filter": {
        "type": "selector",
        "dimension": "hostName",
        "value": "host"
    },
    "postAggregation": [
        {
            "type": "quantilesDoublesSketchToQuantile",
            "name": "dim",
            "field": "dim",
            "fraction": 0.5
        }
    ]
}

But I am getting empty response from druid when I fire this query. The output is

[
  {
    "timestamp": "2020-05-08T11:45:00.000Z",
    "result": {

    }
  },
  {
    "timestamp": "2020-05-08T11:46:00.000Z",
    "result": {

    }
  },
  {
    "timestamp": "2020-05-08T11:47:00.000Z",
    "result": {

    }
  },
  {
    "timestamp": "2020-05-08T11:48:00.000Z",
    "result": {

    }
  },
  {
    "timestamp": "2020-05-08T11:49:00.000Z",
    "result": {

    }
  }
]

I verified that data is present in druid within that time range. Thanks in advance



Solution 1:[1]

You have a postAggregator of type quantilesDoublesSketchToQuantile, which you're passing the field dim. That function takes a sketch and converts it to a quantile after aggregation, but it doesn't build the sketch, rather it expects the sketch as input. You first need to create an aggregation using the function quantilesDoublesSketch, that's where you pass in dim, you can then use the sketch that results in your postAggregator.

Solution 2:[2]

In case anyone's looking for a working code, here's what works for me:

{
    "queryType": "timeseries",
    "intervals": "2022-04-17T00:00:00+00:00/2101-01-01T00:00:00+00:00",
    "aggregations": [
        {
            "fieldName": "my_dimension",
            "name": "my_dimension_agg",
            "type": "quantilesDoublesSketch"
        }
    ],
    "postAggregations": [
        {
            "type"  : "quantilesDoublesSketchToQuantile",
            "name": "median_value",
            "field"  :  {
                "type": "fieldAccess",
                 "fieldName": "my_dimension_agg"
              },
            "fraction" : 0.5
          }
    ],
    "dataSource": "my_datasource",
    "granularity": "all"
}

Send query using:

curl -X POST 'http://your-server:8082/druid/v2/?pretty' -H 'Content-Type:application/json' -H 'Accept:application/json' -d @/home/directory/druid/your-query.json

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Tristan Reid
Solution 2 Clover