Let's say I generate the following toy dataset from Matlab, and I save it as a mat file:
>> arr = rand(100);
>> whos arr
  Name        Size             Bytes  Class     Attributes
  arr       100x100            80000  double
>> save('arr.mat', 'arr')
The saved arr.mat file is 75829 bytes in size, according to the output of the ls command.
If I load the same file using scipy.io.loadmat()
and save it again using scipy.io.savemat()
:
from scipy import io

arr = io.loadmat('arr.mat')
# mat files are binary, so the file must be opened in 'wb' mode
with open('arrscipy.mat', 'wb') as f:
    io.savemat(f, arr)
I obtain a file with a considerably different size (~4 KB larger):
$ ls -al
75829 Nov 6 11:52 arr.mat
80184 Nov 6 11:52 arrscipy.mat
I now have two binary mat files containing the same data. My understanding is that the size of a binary mat file is determined by the size of its contained variables, plus some overhead due to file headers. However the sizes of these two files are considerably different. Why is this? Is it a data format problem?
I tried this with arrays of structures too, and the result is similar: scipy-saved mat files are larger than Matlab-saved ones.
Look at the signature in the docs:

scipy.io.savemat(file_name, mdict, appendmat=True, format='5',
                 long_field_names=False, do_compression=False, oned_as='row')

Compression is turned off by default in scipy. In MATLAB, the default save format (v7) compresses variables, which is why the MATLAB-saved file is smaller even though both files contain the same data.
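As a sketch of the fix (the file names here are illustrative), you can pass do_compression=True to savemat and compare the resulting file sizes with and without compression:

```python
import os

import numpy as np
from scipy import io

# A toy array comparable to the one in the question
arr = {'arr': np.random.rand(100, 100)}

# Save once without compression (scipy's default) and once with it
io.savemat('arr_plain.mat', arr, do_compression=False)
io.savemat('arr_zip.mat', arr, do_compression=True)

plain = os.path.getsize('arr_plain.mat')
zipped = os.path.getsize('arr_zip.mat')
print(plain, zipped)
```

With do_compression=True, the scipy-saved file should come out close in size to the MATLAB-saved one, since both then store the 80000 data bytes zlib-compressed plus a small header.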