Category "scikit-learn"

Pandas and scikit-learn: KeyError: [....] not in index

I do not understand why I get the error KeyError: '[ 1351 1352 1353 ... 13500 13501 13502] not in index' when I run this code: cv = KFold(n_splits=10) fo
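
A common cause of this error (a hedged sketch, not necessarily the asker's exact situation): KFold.split yields positional indices, but the data is a pandas DataFrame that is being indexed by label. Indexing with .iloc avoids the KeyError; the frame and column names below are placeholders.

    import pandas as pd
    from sklearn.model_selection import KFold

    # Toy frame standing in for the question's data (names are placeholders).
    X = pd.DataFrame({"a": range(20), "b": range(20)})
    y = pd.Series(range(20))

    cv = KFold(n_splits=10)
    for train_idx, test_idx in cv.split(X):
        # KFold yields positional indices, so index with .iloc;
        # X[train_idx] or X.loc[train_idx] raises "KeyError: ... not in index"
        # whenever the DataFrame's index labels are not simply 0..n-1.
        X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
        y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]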

VS Code: ModuleNotFoundError: No module named 'sklearn'

I am working in VS Code to run a Python script in a conda environment named myenv where sklearn is already installed. However, when I import it and run the script
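
In this situation the script is often being run by an interpreter other than the myenv environment. A minimal diagnostic sketch (an assumption about the cause, not a confirmed fix):

    import sys

    # If this path does not point inside the myenv environment, VS Code is
    # running a different interpreter than the one where sklearn was installed
    # (switch it via the "Python: Select Interpreter" command).
    print(sys.executable)
    print(sys.version)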

True Positive Rate and False Positive Rate (TPR, FPR) for Multi-Class Data in python [duplicate]

How do you compute the true- and false-positive rates for a multi-class classification problem? Say, y_true = [1, -1, 0, 0, 1, -1, 1, 0,
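
A minimal one-vs-rest sketch built on the confusion matrix; the y_true fragment comes from the question, while y_pred is made up purely for illustration.

    import numpy as np
    from sklearn.metrics import confusion_matrix

    y_true = [1, -1, 0, 0, 1, -1, 1, 0]   # from the question (truncated there)
    y_pred = [1, -1, 1, 0, 1, 0, 1, 0]    # hypothetical predictions
    labels = [-1, 0, 1]

    cm = confusion_matrix(y_true, y_pred, labels=labels)
    for i, label in enumerate(labels):
        tp = cm[i, i]
        fn = cm[i, :].sum() - tp
        fp = cm[:, i].sum() - tp
        tn = cm.sum() - tp - fn - fp
        tpr = tp / (tp + fn)              # recall for this class
        fpr = fp / (fp + tn)
        print(label, tpr, fpr)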

sklearn decision tree plot_tree nodes are overlapping

When I plot my sklearn decision tree using sklearn.tree.plot_tree(), the nodes overlap at the deeper levels and I cannot read what is in the nodes. It i
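
Overlap at deep levels usually goes away with a larger figure, an explicit fontsize, or a cap on the plotted depth. A sketch on a toy tree (the figure size and depth limit are illustrative choices):

    import matplotlib.pyplot as plt
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, plot_tree

    X, y = load_iris(return_X_y=True)
    clf = DecisionTreeClassifier(max_depth=6, random_state=0).fit(X, y)

    # A bigger canvas plus a fixed fontsize keeps deep nodes readable;
    # max_depth here only limits how much of the tree is drawn.
    fig, ax = plt.subplots(figsize=(24, 12))
    plot_tree(clf, ax=ax, fontsize=8, filled=True, max_depth=4)
    plt.show()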

Cache entry deserialization failed, entry ignored

C:\Users\deypr>pip3 install sklearn Collecting sklearn Cache entry deserialization failed, entry ignored Retrying (Retry(total=4, connect=None, read=N

AttributeError: 'CRF' object has no attribute 'keep_tempfiles'

I am currently trying to replicate certain methods from this blog https://towardsdatascience.com/named-entity-recognition-and-classification-with-scikit-learn-f

'TimeseriesGenerator' object has no attribute 'shape'

I have an LSTM model which, when I try to fit it, gives the error mentioned in the title. I have an array of time-series data with multiple features I'm feeding as in
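
A minimal sketch assuming TensorFlow 2 Keras: the generator is passed as the only data argument to fit, since supplying it through arguments that expect plain arrays is what typically triggers the missing-shape error. The shapes and layer sizes below are placeholders.

    import numpy as np
    from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    data = np.random.rand(100, 3)        # 100 timesteps, 3 features (toy data)
    targets = np.random.rand(100, 1)

    gen = TimeseriesGenerator(data, targets, length=10, batch_size=8)

    model = Sequential([
        LSTM(16, input_shape=(10, 3)),
        Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

    # Pass the generator itself as the first data argument; it provides both
    # inputs and targets, so no separate y is given.
    model.fit(gen, epochs=2)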

I cannot train tensorflow

I am trying to follow these instructions in order to train tensorflow: https://www.datacamp.com/community/tutorials/tensorflow-tutorial?utm_source=adwords_ppc&a

Is there any place in scikit-learn Lasso/Quantile Regression source code that L1 regularization is applied?

I could not find where the Manhattan distance of the weights is calculated and multiplied by alpha (the L1 regularization coefficient) in the Lasso Regression and the Quantile
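
For Lasso, the L1 term is handled inside the compiled coordinate-descent solver rather than in the Python-level estimator class, which is probably why it is hard to spot in the source. The documented objective can still be checked numerically; a sketch with toy data:

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.RandomState(0)
    X = rng.randn(50, 5)
    y = rng.randn(50)

    alpha = 0.1
    lasso = Lasso(alpha=alpha).fit(X, y)

    # scikit-learn's documented Lasso objective:
    #   (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1
    n = X.shape[0]
    residual = y - X @ lasso.coef_ - lasso.intercept_
    objective = residual @ residual / (2 * n) + alpha * np.abs(lasso.coef_).sum()
    print(objective)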

Get intermediate data state in scikit-learn Pipeline

Given the following example: from sklearn.feature_extraction.text import TfidfVectorizer from sklearn.decomposition import NMF from sklearn.pipeline import Pi
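A fitted Pipeline can be sliced (scikit-learn 0.21 and later), so the data as it looks after the TF-IDF step can be recovered by transforming with everything except the last step, or via named_steps. A sketch with a toy corpus and step names chosen to match the question's imports:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import NMF
    from sklearn.pipeline import Pipeline

    docs = ["the cat sat", "the dog barked", "cats and dogs"]   # toy corpus

    pipe = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("nmf", NMF(n_components=2, init="nndsvda", max_iter=500)),
    ])
    pipe.fit(docs)

    # State after the TF-IDF step: transform with all but the last step.
    intermediate = pipe[:-1].transform(docs)
    # Equivalent, via the fitted step itself:
    intermediate2 = pipe.named_steps["tfidf"].transform(docs)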

What is the data type of X in pca.fit_transform(X)?

I have a word2vec model, abuse_model, trained with Gensim. I want to apply PCA and make a plot of only certain words that I care about (vs. all words in the model).
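
PCA.fit_transform expects a 2-D numeric array-like of shape (n_samples, n_features), so the selected word vectors can be stacked into a numpy array. A sketch where the word list is a placeholder and random vectors stand in for abuse_model.wv[word]:

    import numpy as np
    from sklearn.decomposition import PCA

    words = ["abuse", "threat", "insult"]    # placeholder list of words of interest
    # Stand-in for the trained gensim model; in the question each vector would
    # come from abuse_model.wv[w] instead.
    rng = np.random.RandomState(0)
    vectors = {w: rng.rand(100) for w in words}

    # Stacking the per-word 1-D vectors gives the (n_samples, n_features)
    # array that PCA expects.
    X = np.vstack([vectors[w] for w in words])
    coords = PCA(n_components=2).fit_transform(X)   # shape (len(words), 2)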

Using Scikit's StandardScaler correctly across multiple programs

I have a question that is very similar to this topic, but I want to reuse a StandardScaler instead of a LabelEncoder. Here's what I have done: # in one pro
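
One common pattern (a sketch, with a placeholder filename): persist the fitted scaler with joblib in the first program and load it in the second, so the second program reuses the same mean and scale and never calls fit again.

    import numpy as np
    import joblib
    from sklearn.preprocessing import StandardScaler

    # --- program 1: fit on training data and save ---
    X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
    scaler = StandardScaler().fit(X_train)
    joblib.dump(scaler, "scaler.joblib")       # filename is a placeholder

    # --- program 2: load and reuse (transform only, never fit again) ---
    scaler = joblib.load("scaler.joblib")
    X_new = np.array([[1.5, 15.0]])
    print(scaler.transform(X_new))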

PCA on sklearn - how to interpret pca.components_

I ran PCA on a data frame with 10 features using this simple code: pca = PCA() fit = pca.fit(dfPca) The result of pca.explained_variance_ratio_ shows: array
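pca.components_ has shape (n_components, n_features): each row is a principal axis, and its entries are the loadings of the original features on that axis. Labeling them with the original column names makes the result readable; a sketch with a random frame standing in for dfPca:

    import numpy as np
    import pandas as pd
    from sklearn.decomposition import PCA

    # Toy frame standing in for the question's dfPca (10 features).
    rng = np.random.RandomState(0)
    dfPca = pd.DataFrame(rng.randn(100, 10), columns=[f"f{i}" for i in range(10)])

    pca = PCA()
    pca.fit(dfPca)

    print(pca.explained_variance_ratio_)       # variance share per component
    # Rows = components, columns = original features: components_[i, j] is the
    # weight of feature j in principal component i.
    loadings = pd.DataFrame(
        pca.components_,
        columns=dfPca.columns,
        index=[f"PC{i+1}" for i in range(pca.n_components_)],
    )
    print(loadings)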

In keras/ tensorflow, Is there a way to add a preprocessing layer to the output, similar to TargetTransformRegressor in sklearn?

I want to use keras to build a neural network regression model from X_train -> Y_train. In this example, however, I need to perform a preprocessing transform
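The scikit-learn class being referred to is TransformedTargetRegressor. A plain Keras analogue, sketched here with made-up data and an illustrative network, is to scale Y_train before fitting and invert the scaling on predictions:

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    # Toy data standing in for the question's X_train / Y_train.
    rng = np.random.RandomState(0)
    X_train = rng.randn(200, 4)
    Y_train = 100.0 + 50.0 * rng.randn(200, 1)

    # Fit the target transform on the training targets only.
    y_scaler = StandardScaler().fit(Y_train)
    Y_scaled = y_scaler.transform(Y_train)

    model = Sequential([Dense(16, activation="relu", input_shape=(4,)), Dense(1)])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X_train, Y_scaled, epochs=5, verbose=0)

    # Undo the target scaling at prediction time.
    preds = y_scaler.inverse_transform(model.predict(X_train[:5]))
    print(preds)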

ImportError: No module named 'sklearn.lda'

When I run classifier.py in the openface demos directory using: classifier.py train ./generated-embeddings/ I get the following error message: --> fro
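The sklearn.lda module was removed in later scikit-learn releases; the estimator now lives in sklearn.discriminant_analysis. A sketch of the updated import (the n_components value is illustrative):

    # Old import used by the openface demo (fails on modern scikit-learn):
    #   from sklearn.lda import LDA
    # Current location of the same estimator:
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

    clf = LDA(n_components=1)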

multilayer_perceptron: ConvergenceWarning: Stochastic Optimizer: Maximum iterations reached and the optimization hasn't converged yet. Warning?

I have written a basic program to understand what's happening in the MLP classifier. from sklearn.neural_network import MLPClassifier data: a dataset of body met
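
The warning means the stochastic optimizer hit max_iter before the loss converged. The usual remedies, sketched here on synthetic data with illustrative parameter values, are to standardize the inputs and raise max_iter:

    from sklearn.datasets import make_classification
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)

    # Scaling the features and giving the optimizer more iterations are the
    # two usual ways to make the ConvergenceWarning go away.
    clf = make_pipeline(
        StandardScaler(),
        MLPClassifier(max_iter=2000, random_state=0),
    )
    clf.fit(X, y)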

PLS-DA Loading Plot in Python

How can I make a loading plot with Matplotlib for a PLS-DA model, like the loading plot of PCA? This answer explains how it can be done with PCA: Plot
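
PLSRegression exposes x_loadings_ with shape (n_features, n_components), so the first two columns can be scattered just like PCA loadings. A sketch with toy data and a binary class vector as the PLS-DA target (feature names are placeholders):

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.RandomState(0)
    X = rng.randn(60, 5)                       # toy feature matrix
    y = rng.randint(0, 2, 60)                  # binary class labels (PLS-DA target)

    pls = PLSRegression(n_components=2).fit(X, y)

    # x_loadings_ has shape (n_features, n_components); plot LV1 vs LV2.
    loadings = pls.x_loadings_
    plt.scatter(loadings[:, 0], loadings[:, 1])
    for i in range(X.shape[1]):
        plt.annotate(f"feat{i}", (loadings[i, 0], loadings[i, 1]))
    plt.xlabel("Loadings on LV1")
    plt.ylabel("Loadings on LV2")
    plt.show()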

sklearn lda gridsearchcv with pipeline

pipe = Pipeline([('reduce_dim', LinearDiscriminantAnalysis()),('classify', LogisticRegression())]) param_grid = [{'classify__penalty': ['l1', 'l2'],
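A completed version of the question's fragment, sketched on the iris data with illustrative parameter values. Note that the 'l1' penalty needs a solver that supports it, such as liblinear:

    from sklearn.datasets import load_iris
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline

    X, y = load_iris(return_X_y=True)

    pipe = Pipeline([
        ("reduce_dim", LinearDiscriminantAnalysis()),
        ("classify", LogisticRegression(solver="liblinear")),  # supports l1 and l2
    ])

    param_grid = [{
        "reduce_dim__n_components": [1, 2],
        "classify__penalty": ["l1", "l2"],
        "classify__C": [0.1, 1.0, 10.0],
    }]

    grid = GridSearchCV(pipe, param_grid, cv=5)
    grid.fit(X, y)
    print(grid.best_params_)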

sklearn RandomForestRegressor discrepancy in the displayed tree values

While using the RandomForestRegressor I noticed something strange. To illustrate the problem, here is a small example. I applied the RandomForestRegressor to a tes
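
One likely explanation (a sketch under that assumption): with bootstrap=True, the default, each tree is grown on a bootstrap resample, so the values shown in a plotted tree reflect that resample rather than the original data. Refitting with bootstrap=False makes the displayed node values match averages computed by hand:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.tree import export_text

    X = np.arange(10).reshape(-1, 1)
    y = np.arange(10, dtype=float)

    # bootstrap=False makes every tree see the original samples, so the node
    # values in the exported tree match averages computed from the raw data.
    reg = RandomForestRegressor(n_estimators=3, bootstrap=False, random_state=0)
    reg.fit(X, y)
    print(export_text(reg.estimators_[0]))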