Python 3.7: xgboost.core.XGBoostError

I am new to Python and I am getting this error when running XGBoost: xgboost.core.XGBoostError: [15:49:05] C:/Users/Administrator/workspace/xgboost-win64_release_1.3.0/src/learner.cc:567: Check failed: mparam_.num_feature != 0 (0 vs. 0) : 0 feature is supplied. Are you using raw Booster interface?

I searched for this error but could not find many useful resources.

I guess the error occurs in the prediction stage, but I'm not sure about that. My dataset consists of two columns, ["Posts Frequency", "Likes Count"], as seen below.

Here's my code:

# Load Dependencies
import pandas as pd
from numpy import where
import matplotlib.pyplot as plt
import numpy as np
from numpy import unique
from sklearn import metrics
import warnings
warnings.filterwarnings('ignore')


#  Load the Data
# Define Columns
names = ["Posts Frequency","Likes Count"]


data = pd.read_csv("RANKING TEST (1).csv", encoding="utf-8", sep=";", delimiter=None,
                 names=names, delim_whitespace=False,
                 header=0, engine="python")
X = data.values[:,0:1]
y = data.values[:,1]



# Training 

from sklearn.model_selection import GroupShuffleSplit

gss = GroupShuffleSplit(test_size=.20, n_splits=1, random_state = 7).split(data, groups=data['Posts Frequency'])

X_train_inds, X_test_inds = next(gss)

train_data= data.iloc[X_train_inds]
X_train = train_data.loc[:, ~train_data.columns.isin(['Posts Frequency','Likes Count'])]
y_train = train_data.loc[:, train_data.columns.isin(['Likes Count'])]

groups = train_data.groupby('Posts Frequency').size().to_frame('size')['size'].to_numpy()

test_data= data.iloc[X_test_inds]


X_test = test_data.loc[:, ~test_data.columns.isin(['Likes Count'])]
y_test = test_data.loc[:, test_data.columns.isin(['Likes Count'])]





import xgboost as xgb

model = xgb.XGBRanker(
    tree_method='gpu_hist',
    booster='gbtree',
    objective='rank:pairwise',
    random_state=42,
    learning_rate=0.1,
    colsample_bytree=0.9,
    eta=0.05,
    max_depth=6,
    n_estimators=110,
    subsample=0.75
    )

model.fit(X_train, y_train, group=groups, verbose=True)


# make predictions
def predict(model, data):
    return model.predict( data.loc[:, ~data.columns.isin( ['Posts Frequency'] )] )


predictions = (data.groupby( 'Posts Frequency' )
               .apply( lambda x: predict( model, x ) ))

Can anyone help me?

Thank you in advance!

Sofia



Solution 1:[1]

You need to pass at least one feature to XGBoost. The error message, Check failed: mparam_.num_feature != 0, means the training matrix has zero columns. Check what you are doing here:

X_train = train_data.loc[:, ~train_data.columns.isin(['Posts Frequency','Likes Count'])]

Since the DataFrame contains only those two columns, excluding both of them leaves X_train with no columns at all, so the model is fit on zero features.
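A minimal sketch that reproduces the problem, using a small hypothetical DataFrame with the same two columns as the question (the values are made up for illustration):

```python
import pandas as pd

# Hypothetical data mirroring the question's two-column CSV.
train_data = pd.DataFrame({
    "Posts Frequency": [1, 1, 2, 2],
    "Likes Count": [10, 12, 5, 7],
})

# Excluding BOTH columns leaves a frame with zero columns --
# this is what triggers "0 feature is supplied" in XGBoost.
X_train = train_data.loc[:, ~train_data.columns.isin(["Posts Frequency", "Likes Count"])]
print(X_train.shape)  # (4, 0) -- four rows, no feature columns
```

The fix is to make sure at least one column besides the label (and the grouping key) survives the selection, e.g. by adding real feature columns to the dataset before splitting.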

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Alberto Mario Castillo