The Python Oracle

TypeError: 'KFold' object is not iterable

--

Full question
https://stackoverflow.com/questions/4864...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #machinelearning #scikitlearn #crossvalidation

#avk47



ANSWER 1

Score 19


That depends on how you imported KFold.

If you did this:

from sklearn.cross_validation import KFold

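Then your code should work, because that version of KFold takes three parameters: the length of the array, the number of splits, and shuffle. (Note that sklearn.cross_validation was deprecated in scikit-learn 0.18 and removed in 0.20.)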

But if you are doing this:

from sklearn.model_selection import KFold

then it will not work, because this KFold object is not iterable. You only pass the number of splits and shuffle, not the length of the array, and in enumerate() you iterate over fold.split(...) instead of over the fold itself.
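To make the difference concrete, here is a minimal sketch, assuming a scikit-learn version that ships model_selection. The array X is made-up illustrative data, not the data from the question. Iterating the splitter directly reproduces the error from the title, while split() gives the index pairs:

```python
import numpy as np
from sklearn.model_selection import KFold

# Illustrative data only (assumption): 10 samples, 2 features.
X = np.arange(20).reshape(10, 2)

fold = KFold(n_splits=5, shuffle=False)

# Looping over the splitter object itself raises the TypeError.
error_message = None
try:
    for train_index, test_index in fold:
        pass
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Calling split() on the data yields (train_index, test_index) pairs.
splits = list(fold.split(X))
print(len(splits))
```

So the fix is usually just to replace the bare fold in the loop with fold.split(your_data).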

By the way, model_selection is the newer module and the recommended one to use. Try it like this:

fold = KFold(5, shuffle=False)

# split() needs the data to split; here that is the training DataFrame
for train_index, test_index in fold.split(x_train_data):

    # Call the logistic regression model with a certain C parameter
    # (recent scikit-learn versions need a solver that supports L1, e.g. liblinear)
    lr = LogisticRegression(C=c_param, penalty='l1', solver='liblinear')
    # Use the training data to fit the model. In this case, we use the training portion of the fold
    lr.fit(x_train_data.iloc[train_index, :], y_train_data.iloc[train_index, :].values.ravel())

    # Predict values using the test indices in the training data
    y_pred_undersample = lr.predict(x_train_data.iloc[test_index, :])

    # Calculate the recall score and append it to a list of recall scores for the current c_param
    recall_acc = recall_score(y_train_data.iloc[test_index, :].values.ravel(), y_pred_undersample)
    recall_accs.append(recall_acc)
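The loop above relies on variables from the question (x_train_data, y_train_data, c_param, recall_accs). As a rough self-contained sketch, the same pattern can be run on synthetic data. Everything created here (the make_classification data, the "Class" column name, the value of c_param) is an assumption for illustration, not the asker's actual setup:

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import KFold

# Synthetic stand-ins for the question's DataFrames (assumption).
features, labels = make_classification(n_samples=200, n_features=5, random_state=0)
x_train_data = pd.DataFrame(features)
y_train_data = pd.DataFrame({"Class": labels})

c_param = 1.0  # hypothetical regularisation strength
fold = KFold(n_splits=5, shuffle=False)
recall_accs = []

for train_index, test_index in fold.split(x_train_data):
    # liblinear is one of the solvers that supports the L1 penalty
    lr = LogisticRegression(C=c_param, penalty="l1", solver="liblinear")
    lr.fit(x_train_data.iloc[train_index, :],
           y_train_data.iloc[train_index, :].values.ravel())

    # Score the held-out part of this fold
    y_pred = lr.predict(x_train_data.iloc[test_index, :])
    y_true = y_train_data.iloc[test_index, :].values.ravel()
    recall_accs.append(recall_score(y_true, y_pred))

mean_recall = float(np.mean(recall_accs))
print("Mean recall over folds:", mean_recall)
```

This gives one recall score per fold, which you can then average for each candidate C value.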



ACCEPTED ANSWER

Score 15


KFold is a splitter, so you have to give something to split.

Example code:

import numpy as np
from sklearn.model_selection import KFold

X = np.array([[1,1,1,1], [2,2,2,2], [3,3,3,3], [4,4,4,4]])
y = np.array([1, 2, 3, 4])
# Now create the KFold object: you only pass the number of splits and whether to shuffle.
fold = KFold(2, shuffle=False)
# To iterate over the folds, just use split
for train_index, test_index in fold.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    # Then fit the classifier on this fold
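If it helps to see what split() actually hands back, here is a small sketch on the same four-row array. The variable name pairs is just for illustration. With 2 splits and no shuffling, each fold uses one consecutive half of the rows as the test set:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.array([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]])

fold = KFold(n_splits=2, shuffle=False)

# Collect the row indices of each fold as plain lists for easy printing
pairs = [(train.tolist(), test.tolist()) for train, test in fold.split(X)]
for train, test in pairs:
    print("train rows:", train, "test rows:", test)
# train rows: [2, 3] test rows: [0, 1]
# train rows: [0, 1] test rows: [2, 3]
```

Note that split() returns row positions, not the rows themselves, which is why the loop indexes X and y with them.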

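If you also want the fold number inside the train/test loop, just add enumerate. Note the parentheses: enumerate yields (i, (train_index, test_index)), so the index pair has to be unpacked as a tuple.

for i, (train_index, test_index) in enumerate(fold.split(X)):
    print('Iteration:', i)
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]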

I hope this works