Mark Great Expectations validation as failed or passed based on a percentage of failure
I am using Great Expectations in my ETL data pipeline for a POC. I have a validation which is failing (as expected), and I have the following data in my validation JSON:
"unexpected_count": 205,
"unexpected_percent": 10.25,
"unexpected_percent_nonmissing": 10.25,
"unexpected_percent_total": 10.25
Please note that the unexpected_percent_total is 10.25%. Is there a way to configure the validation so that it shows as a success when the failed percentage is that low? For example, show the validation as failed only if the unexpected_percent_total is more than 50%, and otherwise show it as passed. Please let me know if anyone has configured such a scenario using Great Expectations.
Solution 1:[1]
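Yes. Use the "mostly" keyword argument. It takes a fraction between 0 and 1, and the expectation passes as long as at least that fraction of the non-missing values meet it. To fail only when more than 50% of values are unexpected, set mostly=0.5: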
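import pandas as pd
import great_expectations as ge
# Example data: 'orange' is the only value outside the allowed set.
d = {'fruit': ['apple', 'apple', 'apple', 'orange', 'banana']}
df = pd.DataFrame(data=d)
# Wrap the DataFrame so that expectation methods are available on it.
ge_df = ge.from_pandas(df)
# mostly=0.5: pass as long as at least 50% of the values are in the set.
ge_df.expect_column_values_to_be_in_set('fruit', ['apple', 'banana'], mostly=0.5)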
This expectation returns "success": true even though "orange" is not in the set, because 80% of the values are in the set, which is above the 50% threshold.
{
  "result": {
    "element_count": 5,
    "missing_count": 0,
    "missing_percent": 0.0,
    "unexpected_count": 1,
    "unexpected_percent": 20.0,
    "unexpected_percent_total": 20.0,
    "unexpected_percent_nonmissing": 20.0,
    "partial_unexpected_list": [
      "orange"
    ]
  },
  "exception_info": {
    "raised_exception": false,
    "exception_traceback": null,
    "exception_message": null
  },
  "meta": {},
  "success": true
}
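To see why the result above counts as a success, the threshold check can be reproduced by hand. This is only a minimal sketch in plain Python, assuming that "mostly" is compared against the fraction of non-missing values that meet the expectation. The helper name passes_mostly is hypothetical and is not part of Great Expectations.

```python
def passes_mostly(values, allowed, mostly=1.0):
    """Hypothetical helper (not a Great Expectations API).

    Returns (success, unexpected_percent), assuming the check passes when
    the fraction of non-missing values found in `allowed` is >= `mostly`.
    """
    allowed = set(allowed)
    # Missing values (None here) are left out of the calculation.
    non_missing = [v for v in values if v is not None]
    if not non_missing:
        # Nothing to check, so treat the expectation as met.
        return True, 0.0
    unexpected = [v for v in non_missing if v not in allowed]
    unexpected_percent = 100.0 * len(unexpected) / len(non_missing)
    expected_fraction = (len(non_missing) - len(unexpected)) / len(non_missing)
    return expected_fraction >= mostly, unexpected_percent


fruit = ['apple', 'apple', 'apple', 'orange', 'banana']

# 4 of 5 values are allowed (0.8), so a 0.5 threshold passes.
loose = passes_mostly(fruit, ['apple', 'banana'], mostly=0.5)
# The same data fails a stricter 0.9 threshold.
strict = passes_mostly(fruit, ['apple', 'banana'], mostly=0.9)

print(loose)   # (True, 20.0)
print(strict)  # (False, 20.0)
```

Under the same assumption, the 10.25% unexpected rate from the question would pass with mostly=0.5 and would only fail once mostly is set above 0.8975.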
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Andy Jessen |