'Multiclass-multioutput is not supported' Error in Scikit learn for Knn classifier

radix

I have two variables, X and Y.

The structure of X (an np.array):

[[26777 24918 26821 ...    -1    -1    -1]
[26777 26831 26832 ...    -1    -1    -1]
[26777 24918 26821 ...    -1    -1    -1]
...
[26811 26832 26813 ...    -1    -1    -1]
[26830 26831 26832 ...    -1    -1    -1]
[26830 26831 26832 ...    -1    -1    -1]]

The structure of Y:

[[1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [1252, 26777, 26831], [25197, 26777, 26781], [25197, 26777, 26781], [25197, 26777, 26781], [26764, 25803, 26781], [26764, 25803, 26781], [25197, 26777, 26781], [25197, 26777, 26781], [1252, 26777, 16172], [1252, 26777, 16172]]

Each inner list in Y, for example [1252, 26777, 26831], holds three separate features to predict.

I am using the KNN classifier from scikit-learn:

from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(X, Y)
predictions = classifier.predict(X)
print(accuracy_score(Y, predictions))

But I get an error saying:

ValueError: multiclass-multioutput is not supported

I guess the structure of Y is not supported. What changes do I need to make for the program to run?

Input:

  Deluxe Single room with sea view

Expected output:

c_class = Deluxe
c_occ = single
c_view = sea
Venkatachalam

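The error message comes from accuracy_score rather than from KNN itself: the metric does not accept multiclass-multioutput targets, that is, a Y with several multiclass columns.

For your problem, you can use MultiOutputClassifier(), which fits one classifier per output column and whose score method handles this kind of target.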

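from sklearn.multioutput import MultiOutputClassifier
from sklearn.neighbors import KNeighborsClassifier

# Wrap KNN so that one classifier is fitted per column of Y
knn = KNeighborsClassifier(n_neighbors=3)
classifier = MultiOutputClassifier(knn, n_jobs=-1)
classifier.fit(X, Y)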
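
If you still want accuracy figures, note that accuracy_score rejects a two-dimensional multiclass Y with the ValueError quoted in the question, so one option is to score each output column separately and also compute an exact-match rate over whole rows. The following is only a sketch: the helper name multioutput_accuracy and the toy y_true / y_pred values are made up for illustration and are not part of scikit-learn.

```python
import numpy as np
from sklearn.metrics import accuracy_score


def multioutput_accuracy(y_true, y_pred):
    """Hypothetical helper: per-column accuracy plus exact-match rate."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # accuracy_score accepts a single multiclass column at a time
    per_output = [
        accuracy_score(y_true[:, k], y_pred[:, k])
        for k in range(y_true.shape[1])
    ]
    # A row counts as correct only if all of its outputs are correct
    exact_match = float(np.mean(np.all(y_true == y_pred, axis=1)))
    return per_output, exact_match


# Toy values for illustration only (not real predictions)
y_true = [[1252, 26777, 26831], [25197, 26777, 26781], [26764, 25803, 26781]]
y_pred = [[1252, 26777, 26831], [25197, 26777, 26831], [1252, 25803, 26781]]

per_output, exact_match = multioutput_accuracy(y_true, y_pred)
print(per_output, exact_match)
```

The exact-match rate is the stricter of the two numbers, since a single wrong output makes the whole row count as a miss.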

Working example:

>>> from sklearn.feature_extraction.text import TfidfVectorizer
>>> corpus = [
...     'This is the first document.',
...     'This document is the second document.',
...     'And this is the third one.',
...     'Is this the first document?',
... ]
>>> vectorizer = TfidfVectorizer()
>>> X = vectorizer.fit_transform(corpus)

>>> Y = [[124323,1234132,1234],[124323,4132,14],[1,4132,1234],[1,4132,14]]

>>> from sklearn.multioutput import MultiOutputClassifier
>>> from sklearn.neighbors import KNeighborsClassifier
>>> knn = KNeighborsClassifier(n_neighbors=3)
>>> classifier = MultiOutputClassifier(knn, n_jobs=-1)
>>> classifier.fit(X,Y)
>>> predictions = classifier.predict(X)
>>> predictions
array([[124323,   4132,     14],
       [124323,   4132,     14],
       [     1,   4132,   1234],
       [124323,   4132,     14]])

>>> import numpy as np
>>> classifier.score(X, np.array(Y))
0.5

>>> test_data = ['I want to test this']
>>> classifier.predict(vectorizer.transform(test_data))
array([[124323,   4132,     14]])
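
The same pattern can be applied to the room-description example from the question. This is a rough sketch under assumptions: the training descriptions and labels below are invented for illustration, the labels are kept as strings rather than integer codes, and n_neighbors=1 is chosen only because the toy data set has four rows.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training data, made up for illustration
descriptions = [
    "Deluxe Single room with sea view",
    "Deluxe Double room with garden view",
    "Standard Single room with garden view",
    "Standard Double room with sea view",
]
# One row per description; the columns are c_class, c_occ, c_view
labels = np.array([
    ["Deluxe", "single", "sea"],
    ["Deluxe", "double", "garden"],
    ["Standard", "single", "garden"],
    ["Standard", "double", "sea"],
])

vectorizer = TfidfVectorizer()
X_text = vectorizer.fit_transform(descriptions)

# One KNN model is fitted per label column
room_classifier = MultiOutputClassifier(KNeighborsClassifier(n_neighbors=1))
room_classifier.fit(X_text, labels)

query = vectorizer.transform(["Deluxe Single room with sea view"])
c_class, c_occ, c_view = room_classifier.predict(query)[0]
print(f"c_class = {c_class}")
print(f"c_occ = {c_occ}")
print(f"c_view = {c_view}")
```

With real data you would train on many more descriptions and evaluate on text the model has not seen, since predicting a sentence that is already in the training set says little about how well it generalizes.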
