Importing a CSV file with multiple delimiters in Python

月桂树

I'm trying to import a data file into a notebook using Python.

Here is the actual data: https://drive.google.com/file/d/1Fr5urzbuGx7QIg_2ueMXAAlDM9xU5e4P/view?usp=sharing

Here is how the CSV file is formatted:

"AwardNumber","Title","NSFOrganization","Program(s)","StartDate","LastAmendmentDate","PrincipalInvestigator","State","Organization","AwardInstrument","ProgramManager","EndDate","AwardedAmountToDate","Co-PIName(s)","PIEmailAddress","OrganizationStreet","OrganizationCity","OrganizationState","OrganizationZip","OrganizationPhone","NSFDirectorate","ProgramElementCode(s)","ProgramReferenceCode(s)","ARRAAmount","Abstract"
"1624943","Testing the Impact of Race on Jury Evaluations of Informants","SES","Sociology, Social Psychology, LSS-Law And Social Sciences","08/15/2016","07/17/2017","Mona Lynch","CA","University of California-Irvine","Standard Grant","Reggie Sheehan","06/30/2019","$353,747.00","","[email protected]","141 Innovation Drive, Ste 250","Irvine","CA","926173213","9498247295","SBE","1331, 1332, 1372","9251","$0.00","An important body of legal scholarship has emerged about the justice risks associated with the use of informants, who provide information to law enforcement officials about criminal activity usually in exchange for leniency consideration or dismissal on a pending criminal charge. Despite the increasing concern, there has been very little empirical research on the use of informants as witnesses."
"1917573","States and Security: Border Orientation in the Modern World","SES","Political Science","08/15/2019","08/26/2019","Beth Simmons","PA","University of Pennsylvania","Standard Grant","Brian Humes","07/31/2021","$476,137.00","Michael Kenwick","[email protected]","Research Services","Philadelphia","PA","191046205","2158987293","SBE","1371","","$0.00","Border security is one of the most significant policy issues of our time. How do states benefit from globalization, while at the same time protecting a national space from unwanted influences, people, goods and activities?"
"1931871","CPS: Medium: A Secure, Trustworthy, and Reliable Air Quality Monitoring System for Smart and Connected Communities","SES","CPS-Cyber-Physical Systems","10/01/2019","10/24/2019","Haofei Yu","FL","University of Central Florida","Standard Grant","Sara Kiesler","09/30/2022","$1,198,111.00","Xinwen Fu, Deliang Fan, Haofei Yu, Kelly Stevens, Thomas Bryer","[email protected]","4000 CNTRL FLORIDA BLVD","Orlando","FL","328168005","4078230387","SBE","7918","7924, 9150","$0.00","A critical application of smart technologies is a smart, connected, and secured environmental monitoring network that can help administrators and researchers find better ways to incorporate evidence and data into public decision-making related to the environment."
"1922424","Standard Research: Consensus, Democracy, and the Public Understanding of Science","SES","STS-Sci, Tech & Society","09/01/2019","09/07/2019","James Weatherall","CA","University of California-Irvine","Continuing grant","Frederick Kronz","08/31/2022","$431,892.00","Cailin O'Connor","[email protected]","141 Innovation Drive, Ste 250","Irvine","CA","926173213","9498247295","SBE","7603","1353","$0.00","This award supports a research project that studies how changing social networks influence public belief about science; it will focus specifically on how false beliefs can persist and spread even in evidence-rich environments, and how these beliefs in turn feed back into collective decision-making through democratic institutions."

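The problem I'm running into is that the values are not only separated by commas but also wrapped in quotation marks. The quotes are necessary because one of the columns contains long passages of text.

This is how I would normally import it, but I get an error.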

import pandas as pd
import numpy as np

award = pd.read_csv('ses_awards.csv')
award.head()

Thanks in advance for your help!

亚当·扎尔丁

I tried the file you provided, and what it actually gave me was an encoding error.
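The exact message isn't reproduced here, but an encoding error from read_csv is usually a UnicodeDecodeError: pandas decodes files as UTF-8 by default, and a file saved in a single-byte encoding can contain bytes that are not valid UTF-8. As a rough illustration using only the standard library, with a made-up row rather than the real file:

```python
# Hypothetical sample row (not taken from ses_awards.csv):
# "é" is stored as the single byte 0xE9, as ISO-8859-1 would write it.
raw = '"1","Café study","$1,000.00"\n'.encode("ISO-8859-1")

try:
    raw.decode("utf-8")      # what a UTF-8 reader attempts by default
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False          # 0xE9 followed by a space is not valid UTF-8

# ISO-8859-1 maps every possible byte to a character, so this cannot fail.
latin1_text = raw.decode("ISO-8859-1")
print(utf8_ok, latin1_text.strip())
```

If that is what happened, the quoting and the commas inside the fields are not the cause of the error.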

Try the following encoding:

pd.read_csv('ses_awards.csv', encoding = 'ISO-8859-1')
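If you are not sure which encoding a file uses, one option is to try UTF-8 first and fall back to ISO-8859-1. The sketch below is only an illustration: read_csv_with_fallback is a hypothetical helper, not part of pandas, and the sample row is made up. It also shows that quoted fields containing commas are parsed correctly with the read_csv defaults.

```python
import io

import pandas as pd


def read_csv_with_fallback(source_bytes, encodings=("utf-8", "ISO-8859-1")):
    """Hypothetical helper: try each encoding in turn until one decodes."""
    for enc in encodings:
        try:
            # dtype=str keeps award numbers and ZIP codes as text.
            df = pd.read_csv(io.BytesIO(source_bytes), encoding=enc, dtype=str)
            return df, enc
        except UnicodeDecodeError:
            continue  # this encoding could not decode the bytes; try the next
    raise ValueError("none of the given encodings could decode the data")


# Made-up data in the same shape as the file: every field is quoted,
# and some fields contain commas of their own.
sample = (
    '"AwardNumber","Title","AwardedAmountToDate"\n'
    '"1624943","Café, a made-up title with a comma","$353,747.00"\n'
).encode("ISO-8859-1")

award, used_encoding = read_csv_with_fallback(sample)
print(used_encoding)
print(award)
```

For the real file you could pass the raw bytes, for example the result of open('ses_awards.csv', 'rb').read(). Keep ISO-8859-1 last in the list: it accepts any byte sequence, so nothing after it would ever be tried, and it can silently produce wrong characters if the file is really in some other encoding.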
