I'm trying to train an SVC classifier for image data. Yet, when I run this code:
classifier = svm.SVC(gamma=0.001)
classifier.fit(train_set, train_set_labels)
I get this error:
ValueError: setting an array element with a sequence.
I produced the images into an array with Matplotlib: plt.imread(image)
.
The error seems like it's not in an array, yet when I check the types of the data and the labels they're both lists (I manually add to a list for the labels data):
print(type(train_set))
print(type(train_set_labels))
<class 'list'>
<class 'list'>
If I do a plt.imshow(items[0])
then the image shows correctly in the output.
I also called train_test_split
from scikit-learn
:
train_set, test_set = train_test_split(items, test_size=0.2, random_state=42)
Example input:
train_set[0]
array([[[212, 134, 34],
[221, 140, 48],
[240, 154, 71],
...,
[245, 182, 51],
[235, 175, 43],
[242, 182, 50]],
[[230, 152, 51],
[222, 139, 47],
[236, 147, 65],
...,
[246, 184, 49],
[238, 179, 43],
[245, 186, 50]],
[[229, 150, 47],
[205, 122, 28],
[220, 129, 46],
...,
[232, 171, 28],
[237, 179, 35],
[244, 188, 43]],
...,
[[115, 112, 103],
[112, 109, 102],
[ 80, 77, 72],
...,
[ 34, 25, 28],
[ 55, 46, 49],
[ 80, 71, 74]],
[[ 59, 56, 47],
[ 66, 63, 56],
[ 48, 45, 40],
...,
[ 32, 23, 26],
[ 56, 47, 50],
[ 82, 73, 76]],
[[ 29, 26, 17],
[ 41, 38, 31],
[ 32, 29, 24],
...,
[ 56, 47, 50],
[ 59, 50, 53],
[ 84, 75, 78]]], dtype=uint8)
Example label:
train_set_labels[0]
'Picasso'
I'm not sure what step I'm missing to get the data in the form that the classifier needs in order to train it. Can anyone see what may be needed?
The error message you are receiving:
ValueError: setting an array element with a sequence,
normally results when you are trying to put a list somewhere that a single value is required. This would suggest to me that your train_set is made up of a list of multidimensional elements, although you do state that your inputs are lists. Would you be able to post an example of your inputs and labels?
UPDATE Yes, it's as I thought. The first element of your training data, train_set[0], corresponds to a long list (I can't tell how long), each element of which consists of a list of 3 elements. You are therefore calling the classifier on a list of lists of lists, when the classifier requires a list of lists (m rows corresponding to the number of training examples with each row made up of a list of n features). What else is in your train_set array? Is the full data set in train_set[0]? If so, you would need to create a new array with each element corresponding to each of the subelements of train_set[0], and then I believe your code should run, although I am not too familiar with that classifier. Alternatively you could try running the classifier with train_set[0].
UPDATE 2
I don't have experience with scikit-learn.svc so I wouldn't be able to tell you what the best way of preprocessing the data in order for it to be acceptable to the algorithm, but one method would be to do as I said previously and for each element of train_set, which is composed of lists of lists, would be to recurse through and place all the elements of sublist into the list above. For example
new_train_set = []
for i in range(len(train_set)):
for j in range(len(train_set[i]):
new_train_set.append([train_set[i,j])
I would then train with new_train_set and the training labels.
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments