I am working on a machine learning project and using 3 classification methods, namely:
and while modeling I needed to apply a feature-scaling technique, StandardScaler,
to improve the performance of the models.
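As a point of reference, here is a minimal sketch of how StandardScaler is typically used (the data and variable names are illustrative, not from the original post): the scaler is fit on the training data only, and the same statistics are reused to transform the test data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data: two features on very different scales
X_train = np.array([[1.0, 200.0],
                    [2.0, 400.0],
                    [3.0, 600.0]])
X_test = np.array([[2.0, 300.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit on training data only
X_test_scaled = scaler.transform(X_test)        # reuse the same mean/std

# Each training column now has mean 0 and unit variance
print(X_train_scaled.mean(axis=0))  # ≈ [0. 0.]
print(X_train_scaled.std(axis=0))   # ≈ [1. 1.]
```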
I am getting the following results:
Are these results appropriate? And can model performance get worse after standardization, as happened here with the SVM?
Your results are sound and make sense. To understand them in a little more depth, it is worth a quick overview of how these algorithms work:
1- An MLP computes linear combinations of the inputs, which are then passed through non-linear activation functions. If some of your features are on a larger numerical scale than the others, they dominate those pre-activation sums. Standardization helps the MLP learn from all features, since they enter the network on an equal footing.
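A small numerical sketch of that point (the data is synthetic, made up for illustration): with equal weights, a large-scale feature contributes almost all of the pre-activation sum, and standardizing the columns restores the balance.

```python
import numpy as np

rng = np.random.default_rng(0)
# Feature 0 roughly in [0, 1], feature 1 roughly in [0, 1000]
X = np.column_stack([rng.random(100), rng.random(100) * 1000])
w = np.array([1.0, 1.0])  # equal weights

# Per-feature contribution to the linear combination w . x
contrib = np.abs(X * w).sum(axis=0)
print(contrib / contrib.sum())   # feature 1 dominates (~99.9%)

# Standardize each column: zero mean, unit variance
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
contrib_s = np.abs(Xs * w).sum(axis=0)
print(contrib_s / contrib_s.sum())  # now roughly balanced
```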
2- The KNN algorithm is non-parametric: classification depends on how similar the current data point is to already-labeled points in feature space. Similarity is usually measured as a distance in that space (Euclidean distance, for instance). Without standardization, features with large numerical scales dominate the distance; standardizing makes every feature contribute comparably, which usually improves overall performance.
3- The SVM algorithm tries to find the hyperplane that best separates the classes. Standardization changes the geometry of the feature space, so the margin-maximizing hyperplane found after scaling is generally a different one; on some datasets the data points end up more closely packed and the new boundary misclassifies more of them, which is why SVM performance can occasionally drop.
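The comparison above can be reproduced as a sketch (synthetic data, not the asker's dataset; the direction of the difference depends entirely on the data, so no outcome is guaranteed): train the same SVM with and without a StandardScaler step and compare test accuracy.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification problem; exaggerate one feature's scale
X, y = make_classification(n_samples=400, n_features=5, random_state=0)
X[:, 0] *= 1000
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Same SVM, with and without standardization
raw = SVC(kernel="rbf").fit(X_tr, y_tr)
scaled = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X_tr, y_tr)

acc_raw = raw.score(X_te, y_te)
acc_scaled = scaled.score(X_te, y_te)
print("raw:   ", acc_raw)
print("scaled:", acc_scaled)
```

Which variant wins depends on the dataset; the point is only that scaling changes the decision boundary the SVM finds, so the two accuracies will generally differ.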