How to apply cross_val_score to cross-validate our own model

Usually, we apply cross_val_score to Sklearn models in the following way.

scores = cross_val_score(clf, X, y, cv=5, scoring='f1_macro')

Now I have my own model that I wish to cross-validate. How should I approach it?

tf.keras.backend.clear_session()


model = tf.keras.models.Sequential()
model.add(Masking(mask_value=0.0, input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(Bidirectional(LSTM(128, dropout=dropout, recurrent_dropout=Rdropout, return_sequences=True)))
# model.add(Bidirectional(LSTM(64, dropout=dropout, recurrent_dropout=Rdropout, return_sequences=True)))
# model.add(Bidirectional(LSTM(128, dropout=dropout, recurrent_dropout=Rdropout, return_sequences=True)))
model.add(Bidirectional(LSTM(32, dropout=dropout, recurrent_dropout=Rdropout)))
# model.add(Dense(6, activation='relu'))
# model.add(Dense(4, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

adamopt = tf.keras.optimizers.Adam(lr=0.003, beta_1=0.9, beta_2=0.999, epsilon=1e-8)
RMSopt = tf.keras.optimizers.RMSprop(lr=0.0007,rho=0.9, epsilon=1e-6)

model.compile(loss='binary_crossentropy',
              optimizer=RMSopt,
              metrics=['accuracy'])
print(cross_val_score(model, X_train, y_train, cv=2,scoring='accuracy'))

TypeError: Cannot clone object '<tensorflow.python.keras.engine.sequential.Sequential object at 0x7f86481170f0>' (type <class 'tensorflow.python.keras.engine.sequential.Sequential'>): it does not seem to be a scikit-learn estimator as it does not implement a 'get_params' methods.

I suspect that cross_val_score only works with Sklearn models. Is that right?



Solution 1:[1]

cross_val_score is indeed exclusive to Sklearn models, or to models that implement the same required methods (such as get_params), which is not the case for a Keras model.
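As a rough illustration of what "implementing the required methods" means, here is a minimal sketch of a custom estimator that cross_val_score accepts. The class name MajorityClassifier and the toy data are made up for this example. Inheriting from scikit-learn's BaseEstimator is what supplies the get_params method that the error message complains about.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.model_selection import cross_val_score


class MajorityClassifier(ClassifierMixin, BaseEstimator):
    """Toy estimator (hypothetical): always predicts the most frequent training label."""

    def fit(self, X, y):
        labels, counts = np.unique(y, return_counts=True)
        self.classes_ = labels
        self.majority_ = labels[counts.argmax()]
        return self

    def predict(self, X):
        return np.full(len(X), self.majority_)


# Toy data, invented for the example: 30 samples of class 0, 10 of class 1.
X = np.random.RandomState(0).rand(40, 3)
y = np.array([0] * 30 + [1] * 10)

# BaseEstimator provides get_params/set_params, so cloning the estimator succeeds.
toy_scores = cross_val_score(MajorityClassifier(), X, y, cv=4, scoring='accuracy')
print(toy_scores)
```

A plain Keras Sequential model has fit and predict but no get_params, which is why the clone step fails for it.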

There is no pre-built Keras function that cross-validates your model, so you will need to write the cross-validation loop yourself.

First, decide how many folds you want. Then use the KFold class from sklearn to divide your dataset into that many folds. (Note that KFold.split returns the indices of the data points, not the data points themselves.)

Then train a new model for each split and compute the metrics you want. You can follow this tutorial for more information.
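The steps above could be sketched roughly as follows. The helper name manual_cross_val and the CentroidModel stand-in are hypothetical, chosen so the snippet runs without TensorFlow. With Keras, build_model would be a function that returns a freshly compiled model. The sketch also assumes that y holds integer class labels.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold


def manual_cross_val(build_model, X, y, n_splits=5, **fit_kwargs):
    """Train a fresh model on each fold and return one accuracy score per fold."""
    X, y = np.asarray(X), np.asarray(y)
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = []
    for train_idx, val_idx in kfold.split(X):
        # KFold.split yields index arrays, so the data is sliced explicitly.
        model = build_model()  # new, untrained model for every fold
        model.fit(X[train_idx], y[train_idx], **fit_kwargs)
        preds = np.asarray(model.predict(X[val_idx]))
        if preds.ndim > 1:
            # Softmax-style output (one column per class): pick the most likely class.
            preds = preds.argmax(axis=1)
        scores.append(accuracy_score(y[val_idx], preds))
    return np.array(scores)


class CentroidModel:
    """Stand-in for a Keras model (assumption): nearest-class-mean classifier."""

    def fit(self, X, y, **kwargs):
        self.labels_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.labels_])
        return self

    def predict(self, X):
        dists = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.labels_[dists.argmin(axis=1)]


# Toy data, invented for the example: two well-separated clusters.
rng = np.random.RandomState(0)
X_demo = np.concatenate([rng.normal(0, 1, (30, 2)), rng.normal(5, 1, (30, 2))])
y_demo = np.array([0] * 30 + [1] * 30)

fold_scores = manual_cross_val(CentroidModel, X_demo, y_demo, n_splits=5)
print(fold_scores, fold_scores.mean())
```

With a Keras model, a call such as manual_cross_val(build_lstm, X_train, y_train, n_splits=5, epochs=6, batch_size=64, verbose=0) should work in the same way, where build_lstm is your own function that builds and compiles the network, assuming it ends in a softmax layer and uses a loss that accepts integer labels (for example sparse_categorical_crossentropy). Building the model inside the loop matters: reusing one model object would carry trained weights from one fold into the next.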

Solution 2:[2]

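A Keras model cannot be plugged directly into sklearn utilities or pipelines. To evaluate your Keras model with cross_val_score, use the wrapper module tf.keras.wrappers.scikit_learn, which exposes the sklearn API for Keras models. Note that newer TensorFlow releases have deprecated and then removed this module in favor of the separate SciKeras package, so the snippet applies to versions that still ship it. For example: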

import tensorflow as tf
from sklearn import model_selection
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

def LSTM_Network(neurons=100):

    model = tf.keras.models.Sequential()
    .
    .
    model.compile(loss='binary_crossentropy',
          optimizer=RMSopt,
          metrics=['accuracy'])    
    return model

lstm_clf = KerasClassifier(build_fn=LSTM_Network, epochs=6, batch_size=64, verbose=0)

model_selection.cross_val_score(lstm_clf, X_train, y_train, cv=10, scoring='accuracy')

Solution 3:[3]

Try this:

## How to use cross_val_score for Cross Validation in Keras

def Learn_By_Example_320(): 

    print()
    print(format('How to use cross_val_score for Cross Validation in Keras','*^82'))    

    import warnings
    warnings.filterwarnings("ignore")

    # load libraries
    from keras.wrappers.scikit_learn import KerasClassifier
    from keras.initializers import VarianceScaling
    from keras.regularizers import l2
    from keras.models import Sequential
    from keras.layers import Dense
    from sklearn import datasets
    from sklearn.model_selection import cross_val_score

    # simulated data
    dataset = datasets.make_classification(n_samples=10000, n_features=20, n_informative=5, 
                n_redundant=2, n_repeated=0, n_classes=2, n_clusters_per_class=2, 
                weights=None, flip_y=0.01, class_sep=1.0, hypercube=True, shift=0.0, 
                scale=1.0, shuffle=True, random_state=None)

    X = dataset[0];  y = dataset[1]

    print(X.shape);  print(y.shape)

    # Define a Deep Learning Model
    def create_network(optimizer='RMSprop'):
        model = Sequential()
        model.add(Dense(units=36, input_shape=(X.shape[1],), 
                        kernel_regularizer=l2(0.001),           # weight regularizer
                        kernel_initializer=VarianceScaling(),   # initializer
                        activation='relu'))
        model.add(Dense(units=28, 
                        kernel_regularizer=l2(0.01),            # weight regularizer
                        kernel_initializer=VarianceScaling(),   # initializer                   
                        activation='relu'))
        model.add(Dense(units=1, activation='sigmoid'))
    
        # Compile the Model
        model.compile(loss='binary_crossentropy', optimizer = optimizer, 
                      metrics=['acc','mae'])    
        return model

    # Wrap Keras model so it can be used by scikit-learn
    neural_network = KerasClassifier(build_fn=create_network, epochs=5, batch_size=10,
                                     verbose=0)

    # evaluate using 10-fold cross validation    
    results = cross_val_score(neural_network, X, y, cv=10, scoring='accuracy')
    print(); print(results)
    print(); print("Accuracy: ", results.mean()*100)
    print("Standard Deviation: ", results.std())

Learn_By_Example_320()

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 desertnaut
Solution 2 samrat230599
Solution 3 J R