multiclass classification in xgboost (python)

user3804483

I can't figure out how to pass the number of classes or the eval metric to xgb.XGBClassifier when using the objective function 'multi:softmax'.

I have looked at a lot of documentation, but it only talks about the sklearn wrapper, which accepts n_class/num_class.

My current setup looks like this:

import xgboost as xgb
from sklearn import cross_validation, metrics

kf = cross_validation.KFold(y_data.shape[0], \
    n_folds=10, shuffle=True, random_state=30)
err = []  # to hold per-fold cross-validation accuracy scores
# xgb instance
xgb_model = xgb.XGBClassifier(n_estimators=_params['n_estimators'], \
    max_depth=_params['max_depth'], learning_rate=_params['learning_rate'], \
    min_child_weight=_params['min_child_weight'], \
    subsample=_params['subsample'], \
    colsample_bytree=_params['colsample_bytree'], \
    objective='multi:softmax', nthread=4)

# cross-validation loop
for train_index, test_index in kf:
    xgb_model.fit(x_data[train_index], y_data[train_index], eval_metric='mlogloss')
    predictions = xgb_model.predict(x_data[test_index])
    actuals = y_data[test_index]
    err.append(metrics.accuracy_score(actuals, predictions))
Adrien Renaud

You don't need to set num_class in the scikit-learn API for XGBoost classification. It is done automatically when fit is called. Look at xgboost/sklearn.py at the beginning of the fit method of XGBClassifier:

    evals_result = {}
    self.classes_ = np.unique(y)
    self.n_classes_ = len(self.classes_)

    xgb_options = self.get_xgb_params()

    if callable(self.objective):
        obj = _objective_decorator(self.objective)
        # Use default value. Is it really not used ?
        xgb_options["objective"] = "binary:logistic"
    else:
        obj = None

    if self.n_classes_ > 2:
        # Switch to using a multiclass objective in the underlying XGB instance
        xgb_options["objective"] = "multi:softprob"
        xgb_options['num_class'] = self.n_classes_
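For example, here is a minimal sketch on made-up three-class toy data (the variable names and hyperparameters are just placeholders, and it assumes an xgboost version contemporary with this question, where eval_metric is still a fit argument rather than a constructor argument):

import numpy as np
import xgboost as xgb

# toy three-class data, purely for illustration
rng = np.random.RandomState(0)
x_toy = rng.rand(300, 5)
y_toy = rng.randint(0, 3, size=300)  # classes 0, 1, 2

clf = xgb.XGBClassifier(n_estimators=50, max_depth=3,
                        objective='multi:softmax', nthread=4)

# num_class is inferred from np.unique(y_toy) inside fit, and the wrapper
# switches to a multiclass objective as shown in the excerpt above;
# eval_metric is passed to fit, not to the constructor
clf.fit(x_toy, y_toy, eval_metric='mlogloss')

print(clf.n_classes_)          # -> 3
print(clf.predict(x_toy[:5]))  # predicted class labels in {0, 1, 2}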
