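I am reading a large file and holding it in memory. I need to specify the dtype of every column in the DataFrame, and I would like to do this from a list of dtypes I have already created.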
import pandas as pd
headers=['Record Identifier','Respondent_ID','Agency Code','Loan Type','Property Type','Loan Purpose','Owner Occupancy',
'Loan Amount','Preapprovals','Type of Action Taken','Metropolitan Statistical Area/Metropolitan Division','State Code',
'County Code','Census Tract','Applicant Ethnicity','Co-applicant Ethnicity','Applicant Race: 1','Applicant Race: 2',
'Applicant Race: 3','Applicant Race: 4','Applicant Race: 5','Co-applicant Race: 1','Co-applicant Race: 2',
'Co-applicant Race: 3','Co-applicant Race: 4','Co-applicant Race: 5','Applicant Sex','Co-applicant Sex',
'Applicant Income','Type of Purchaser','Denial Reason: 1','Denial Reason: 2','Denial Reason: 3','Rate Spread',
'HOEPA Status','Lien Status','Population','Minority Population %','FFIEC Median Family Income',
'Tract to MSA/MD Median Family Income %','Number of Owner Occupied Units','Number of 1- to 4-Family units']
dtypes=['int64','object','int64','int64','int64','int64','int64','int64','int64','int64','object','object','object','object',
'int64','int64','int64','int64','int64','int64','int64','int64','int64','int64','int64','int64','int64','int64',
'object','int64','int64','int64','int64','object','object','object','object','float64','int64','float64','int64',
'int64']
df = pd.read_csv('2017_lar.txt', sep="|", header=None, names=headers, dtype=dtypes, nrows=100)
print(df)
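Error: TypeError: data type not understood
You are using the parameter incorrectly. You can only specify a single type name, or a dict that maps column headers to types.
This is covered explicitly in the documentation: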
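dtype
: Type name or dict of column -> type, optional. Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32, 'c': 'Int64'}. Use str or object together with suitable na_values settings to preserve and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.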
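To illustrate the two accepted forms from the documentation, here is a minimal sketch. The sample data and the column names a, b, c are made up for illustration and are not from the original file.

```python
import io
import pandas as pd

raw = "1|2|3\n4|5|6\n"   # hypothetical sample data
names = ["a", "b", "c"]  # hypothetical column names

# Form 1: a single type name, applied to every column.
df_all_str = pd.read_csv(io.StringIO(raw), sep="|", header=None,
                         names=names, dtype=str)

# Form 2: a dict of column -> type. Columns left out of the dict
# ("a" and "c" here) are inferred by pandas as usual.
df_partial = pd.read_csv(io.StringIO(raw), sep="|", header=None,
                         names=names, dtype={"b": str})

# With dtype=str the values are text, so "+" concatenates instead of adding.
print(df_all_str.loc[0, "a"] + df_all_str.loc[0, "b"])
print(df_partial.dtypes)
```

A dict does not have to cover every column, so it is possible to pin down only the columns whose inferred type would be wrong.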
Because you are passing a list, the whole list is taken to be the dtype, which cannot be understood.
Here is the correct usage.
import io
import pandas as pd
i = io.StringIO("""
1|2|3
4|5|6
7|8|9
""")
headers = ['a', 'b', 'c']
dtypes = ['int64', 'object', 'int']
df = pd.read_csv(i, header=None, names=headers, sep='|', dtype=dict(zip(headers, dtypes)))
>>> df
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
>>> df.dtypes
a     int64
b    object
c     int32
dtype: object
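One caveat when applying this to two long lists like the ones in the question: zip stops silently at the shorter list, so a missing entry would leave trailing columns without an explicit dtype. A minimal sketch of a guard follows; build_dtype_map is a hypothetical helper name, not part of pandas, and the sample data is made up.

```python
import io
import pandas as pd


def build_dtype_map(headers, dtypes):
    """Pair each column header with its dtype, refusing mismatched lengths."""
    if len(headers) != len(dtypes):
        raise ValueError(
            f"{len(headers)} headers but {len(dtypes)} dtypes"
        )
    return dict(zip(headers, dtypes))


headers = ["a", "b", "c"]             # hypothetical column names
dtypes = ["int64", "object", "int64"]  # one dtype per header

dtype_map = build_dtype_map(headers, dtypes)

sample = io.StringIO("1|2|3\n4|5|6\n7|8|9\n")  # hypothetical sample data
df = pd.read_csv(sample, sep="|", header=None, names=headers, dtype=dtype_map)
print(df.dtypes)
```

For the file in the question, the same call would presumably take the form dtype=build_dtype_map(headers, dtypes) with the original 42-entry lists. Whether every 'int64' column in that file can actually be parsed as an integer (for example, if some rows contain blanks) is not something the question confirms.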