Category "scikit-learn"

Python scikit learn pipelines (no transformation on features)

I am running different machine learning models on my data set. I am using sklearn pipelines to try different transforms on the numeric features to evaluate if o

How to pass a trained model to KerasClassifier?

I have a dozen pre-trained DNNs that I wish to add to a sklearn ensemble. The issue is that it seems I cannot provide pre-trained models to KerasClassifier. cl

How to pass custom weights to scikit_learn wrappers (e.g. KerasClassifier) in a multilabel classification problem

I'm building a chain classifier for a multiclass problem that uses a Keras binary classifier model in a chain. I have 17 labels as classification target and datas

Right way to use RFECV and Permutation Importance - Sklearn

There is a proposal to implement this in Sklearn #15075, but in the meantime, eli5 is suggested as a solution. However, I'm not sure if I'm using it the right w

How to create a for loop with checking appended models

I have a list of models that I iterate through in a for loop, getting their performances. I've added catboost to my model list, but when I try to add its best e

scikit-learn GridSearchCV() fit() performance improvement

I am using GridSearchCV() and its fit() method to build a model. I currently have this working, but would like to improve the accuracy of the model by supplying

ValueError: source code string cannot contain null bytes while importing sklearn

I'm trying to import MinMaxScaler from sklearn.preprocessing. When I run my code, this error appears, referring to the import line: ValueError: source code str

How to fit a line using RANSAC in Cartesian coordinates?

I am using a 2D lidar and getting the data as angle and distance with respect to the lidar position. I have to create a floor plan using the lidar, and the data is given

How to install sklearn on Visual Studio 2019?

I'm working on a sample project in Python with the Visual Studio 2019 IDE, and I want to know how/where I can install packages like "sklearn". When I run

SHAP: XGBoost and LightGBM difference in shap_values calculation

I have this code in Visual Studio Code: import pandas as pd import numpy as np import shap import matplotlib.pyplot as plt import xgboost as xgb from sklearn.m

Difference between shuffle and random_state in train_test_split?

I tried both on a small dataset sample and it returned the same output. So the question is, what is the difference between the "shuffle" and the "random_state"
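The distinction asked about here can be illustrated with a minimal sketch, assuming scikit-learn is installed. The toy list `data` and the variable names are made up for illustration. `shuffle` controls whether the rows are permuted before splitting, while `random_state` only seeds that permutation so it is repeatable.

```python
from sklearn.model_selection import train_test_split

# Hypothetical toy data: ten ordered samples.
data = list(range(10))

# shuffle=False: no permutation, the split is a plain head/tail cut.
train_a, test_a = train_test_split(data, test_size=0.3, shuffle=False)

# shuffle=True with a fixed random_state: rows are permuted, but the same
# seed reproduces the same permutation on every call.
train_b, test_b = train_test_split(data, test_size=0.3, shuffle=True, random_state=0)
train_c, test_c = train_test_split(data, test_size=0.3, shuffle=True, random_state=0)

print("unshuffled test set:", test_a)
print("seeded splits identical:", test_b == test_c)
```

On a very small sample, two seeded runs return the same output simply because the seed is the same; leaving `random_state=None` with `shuffle=True` would generally give a different split on each call.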

AttributeError: 'RandomOverSampler' object has no attribute 'fit_sample'

I am trying to use RandomOverSampler from imblearn but I'm getting an error. Looking at other posts, there seems to be a problem with older versions, but I checked

ValueError: Unable to coerce to Series, length must be 1: given n

I have been trying to use RF regression from scikit-learn, but I’m getting an error with my standard (from docs and tutorials) model. Here is the code: im

Get prediction confidence through Decision Tree Regression in sklearn

Is there a way I can attach some sort of confidence to my predictions from Decision Tree Regression output in Python? from sklearn.tree import DecisionTreeR

Difference between cosine similarity and cosine distance

It looks like the cosine distance in scipy.spatial.distance.cdist (link to cos distance 1), 1 - u*v/(||u|| ||v||), is different from sklearn.metrics.pairwis
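The relationship between the two quantities in this question can be sketched in plain Python, without calling either library. The helper names `cosine_similarity` and `cosine_distance` below are defined locally for illustration and are not the scipy or sklearn functions; the distance follows the formula quoted in the excerpt, 1 - u*v/(||u|| ||v||).

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: u.v / (||u|| ||v||)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_distance(u, v):
    """Cosine distance as quoted in the excerpt: 1 - similarity."""
    return 1.0 - cosine_similarity(u, v)

# Two illustrative vectors at a 45-degree angle.
u = [1.0, 0.0]
v = [1.0, 1.0]

similarity = cosine_similarity(u, v)
distance = cosine_distance(u, v)
print(similarity, distance, similarity + distance)
```

Under this definition the two values always sum to 1, so a function returning similarity and one returning distance will disagree on the same inputs without either being wrong.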

Getting a ValueError: how to use string data in model.fit with DecisionTreeClassifier in Jupyter?

This is the code: import pandas as pd from sklearn.tree import DecisionTreeClassifier dataset = pd.read_csv("emotion.csv") X = dataset.drop(columns = ["mood"]) y

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

Importing from pyxdameraulevenshtein gives the following error, I have pyxdameraulevenshtein==1.5.3, pandas==1.1.4 and scikit-learn==0.20.2. Numpy is 1.16.1.

Sklearn - Permutation Importance leads to non-zero values for zero-coefficients in model

I'm confused by sklearn's permutation_importance function. I have fitted a pipeline with a regularized logistic regression, leading to several feature coefficie

How to plot the principal vectors of each variable after performing PCA?

My question mainly comes from this post: https://stats.stackexchange.com/questions/53/pca-on-correlation-or-covariance In the article, the author plotted the v

FeatureUnion vs ColumnTransformer?

What is the difference between FeatureUnion() and ColumnTransformer() in sklearn? Which should I use if I want to build a supervised model with features cont