Azure AutoML seems to add an extra input?
I'm using Azure Automated ML for a proof of concept: identifying a person based on a few parameters.
My dataset has 4 columns of floats and 1 column containing the name of the person. My goal is to predict the person from the 4 float values.
I have successfully trained some models on this data. The data transformation chart looks like this, which is what I would expect:
It ignores one column (the "person" column, I assume) and uses the remaining 4 as input to a RandomForest classifier. All well and good so far.
When I then deploy the model, I am required to supply a new variable simply called "Column2". This variable seems to have a significant influence on the output.
When I make a request to the endpoint with two inputs whose only difference is the value of "Column2", I get two different probability vectors back:
{'PCA_0': -574.0043295463845, 'PCA_1': 3455.9091610620617, 'PCA_2': 2352.2555893520835, 'PCA_3': -6941.596091271862, 'Column2': '0'} = [0.24, 0.4, 0.06, 0.3]
{'PCA_0': -574.0043295463845, 'PCA_1': 3455.9091610620617, 'PCA_2': 2352.2555893520835, 'PCA_3': -6941.596091271862, 'Column2': '1'} = [0.26, 0.19, 0.54, 0.01]
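For context, this is roughly how such a scoring request can be sent. A minimal sketch in Python, assuming a hypothetical scoring URI and key, and assuming the `Inputs`/`data` request envelope that AutoML-deployed endpoints commonly expect; check the Consume tab of your endpoint for the exact schema:

```python
# Minimal sketch of calling the deployed endpoint; the URI, key, and
# payload envelope are assumptions -- verify against your endpoint's Consume tab.
import json
import requests

scoring_uri = "https://<your-endpoint>/score"  # hypothetical placeholder
api_key = "<your-key>"                         # hypothetical placeholder

payload = {
    "Inputs": {
        "data": [
            {
                "PCA_0": -574.0043295463845,
                "PCA_1": 3455.9091610620617,
                "PCA_2": 2352.2555893520835,
                "PCA_3": -6941.596091271862,
                "Column2": "0",  # the unexpected extra input in question
            }
        ]
    }
}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}

response = requests.post(scoring_uri, data=json.dumps(payload), headers=headers)
print(response.json())  # e.g. per-class probabilities like [0.24, 0.4, 0.06, 0.3]
```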
Does anyone have an idea what I'm doing wrong here?
Solution 1:[1]
Here is the link to the AutoML test dataset support preview, with samples: https://github.com/Azure/automl-testdataset-preview
Testing a model takes a labeled test dataset as input (one not used at all during training or validation) and outputs predictions along with test metrics for those predictions.
The predictions and test metrics are stored with the test run in Azure ML, so users can download the predictions and view the test metrics in the UI (or through the SDK) at any time after testing the model.
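For completeness, here is a minimal sketch of driving such a test run through the SDK, based on the preview repository linked above; the experiment name, run id, and dataset name are placeholders, and `ModelProxy` is the preview API that repository describes:

```python
# Sketch of an AutoML test run using the preview API from the linked repo.
# The experiment name, run id, and dataset name below are placeholders.
from azureml.core import Workspace, Dataset, Experiment
from azureml.train.automl.run import AutoMLRun
from azureml.train.automl.model_proxy import ModelProxy

ws = Workspace.from_config()
experiment = Experiment(ws, "person-classification")     # placeholder name
parent_run = AutoMLRun(experiment, run_id="AutoML_...")  # placeholder run id

# Pick the best child run from the completed AutoML experiment.
best_run, fitted_model = parent_run.get_output()

# A labeled test dataset that was not used for training or validation.
test_dataset = Dataset.get_by_name(ws, name="person-test-data")  # placeholder

# Run the test and retrieve predictions plus test metrics.
model_proxy = ModelProxy(best_run)
predictions, test_metrics = model_proxy.test(test_dataset)

print(test_metrics)  # the same metrics appear on the test run in the Studio UI
```

The returned predictions can be downloaded as a dataset, matching the behavior described above.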
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ram-msft |