Python 3.7: xgboost.core.XGBoostError
I am new to Python and I am getting this error when running XGBoost: xgboost.core.XGBoostError: [15:49:05] C:/Users/Administrator/workspace/xgboost-win64_release_1.3.0/src/learner.cc:567: Check failed: mparam_.num_feature != 0 (0 vs. 0) : 0 feature is supplied. Are you using raw Booster interface?
I searched for this error but could not find many useful resources.
I guess the error occurs in the prediction stage, but I'm not sure about that.
My dataset consists of two columns: ["Posts Frequency","Likes Count"], as seen below.
Here's my code:
# Load Dependencies
import pandas as pd
from numpy import where
import matplotlib.pyplot as plt
import numpy as np
from numpy import unique
from sklearn import metrics
import warnings
warnings.filterwarnings('ignore')
# Load the Data
# Define Columns
names = ["Posts Frequency","Likes Count"]
data = pd.read_csv("RANKING TEST (1).csv", encoding="utf-8", sep=";", delimiter=None,
                   names=names, delim_whitespace=False,
                   header=0, engine="python")
X = data.values[:,0:1]
y = data.values[:,1]
# Training
from sklearn.model_selection import GroupShuffleSplit
gss = GroupShuffleSplit(test_size=.20, n_splits=1, random_state = 7).split(data, groups=data['Posts Frequency'])
X_train_inds, X_test_inds = next(gss)
train_data= data.iloc[X_train_inds]
X_train = train_data.loc[:, ~train_data.columns.isin(['Posts Frequency','Likes Count'])]
y_train = train_data.loc[:, train_data.columns.isin(['Likes Count'])]
groups = train_data.groupby('Posts Frequency').size().to_frame('size')['size'].to_numpy()
test_data= data.iloc[X_test_inds]
X_test = test_data.loc[:, ~test_data.columns.isin(['Likes Count'])]
y_test = test_data.loc[:, test_data.columns.isin(['Likes Count'])]
import xgboost as xgb
model = xgb.XGBRanker(
    tree_method='gpu_hist',
    booster='gbtree',
    objective='rank:pairwise',
    random_state=42,
    learning_rate=0.1,
    colsample_bytree=0.9,
    eta=0.05,
    max_depth=6,
    n_estimators=110,
    subsample=0.75
)
model.fit(X_train, y_train, group=groups, verbose=True)
# make predictions
def predict(model, data):
    return model.predict(data.loc[:, ~data.columns.isin(['Posts Frequency'])])

predictions = (data.groupby('Posts Frequency')
               .apply(lambda x: predict(model, x)))
Can anyone help me??
Thank you in advance !!
Sofia
Solution 1:[1]
You need to pass at least one feature to XGBoost. Check what you are doing here: this filter excludes both of your columns, so X_train ends up with zero feature columns, which is exactly what the "0 feature is supplied" check is complaining about:
X_train = train_data.loc[:, ~train_data.columns.isin(['Posts Frequency','Likes Count'])]
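For illustration, a minimal sketch of the selection problem and one possible fix. The column names come from the question; the sample data is made up, since the original CSV is not shown:

```python
import pandas as pd

# Hypothetical stand-in for the question's two-column dataset.
data = pd.DataFrame({
    "Posts Frequency": [1, 1, 2, 2, 3, 3],
    "Likes Count":     [10, 12, 20, 18, 30, 33],
})

# The original filter ~columns.isin(['Posts Frequency', 'Likes Count'])
# removes BOTH columns, leaving an empty frame with zero features:
empty = data.loc[:, ~data.columns.isin(["Posts Frequency", "Likes Count"])]
print(empty.shape[1])  # 0 columns -> triggers the num_feature != 0 check

# One way to fix it: keep at least one column as the feature matrix
# and use "Likes Count" as the target.
X_train = data.loc[:, data.columns.isin(["Posts Frequency"])]
y_train = data["Likes Count"]
print(X_train.shape)  # (6, 1) -> one feature column, no more error
```

Note that with only two columns total, and "Posts Frequency" also being used as the group key for the ranker, you will likely need additional feature columns in the real dataset for the model to learn anything useful.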
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Alberto Mario Castillo |