How can I use Pandas to read a CSV file without a header, capture only the data in the first column, and delete rows?

good_pro

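I have a CSV file that contains information about people, with various data spread across more than 100 columns. There is no header row, and my main goal is to get only the people's names; none of the other data matters for this. How can I do that?

Here is my CSV file, 'data.csv':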

John   12 34 23 48 14 44 94 24  ...    #extends till 100
Becky  23 40 93 47 84 43 64 31  ...    #extends till 100
Lio    63 90 53 77 14 12 69 20  ...    #extends till 100

Next, suppose my code has a list populated with many names:

names = ['Timothy', 'Joshua', 'Rio', 'Catherine', 'Poorva', 'Gome', 'Lachlan', 'John', 'Lio']

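I opened the CSV file in Python and used a list comprehension to read all the names in the first column, storing them in a list assigned to the variable 'people_list'.

Now, for every element in people_list, if the element does not appear in the 'names' list, I want to delete that element's row from the CSV file. In this example I want to delete Becky, because she does not appear in the names list. This is what I have tried so far...

Demo - data.py: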

import csv
import pandas as pd

names = ['Timothy', 'Joshua', 'Rio', 'Catherine', 'Poorva', 'Gome', 'Lachlan', 'John', 'Lio']
csv_filename = 'data.csv'

with open(csv_filename, 'r') as readfile:
    reader = csv.reader(readfile, delimiter=',')
    people_list = [row[0] for row in reader]

for person in people_list:
    if person not in names:
        id = people_list.index(person) #grab the index of the person in people_list who's not found in the names list.

        #using pandas
        df = pd.read_csv(csv_filename) #read data.csv file
        df.drop(df.index[id], inplace = True) #delete the row id for the person who does not exist in names list.
        df.to_csv(csv_filename, index = False, sep=',')  #write the csv file back with no index
    else:
        print("This person is found in the names list")

Instead of deleting only Becky, this deleted every record in my CSV file (including Becky). Can someone explain how to do this?

jezrael

Add the parameter header=None to read_csv so that the columns get the default labels 0, 1, 2, ...:

df = pd.read_csv(csv_filename,  header=None)

names = ['Timothy', 'Joshua', 'Rio', 'Catherine', 'Poorva', 'Gome', 'Lachlan', 'John', 'Lio']
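
To see why header=None matters here, the following minimal sketch compares the two ways of reading. It assumes a comma-separated file (as the delimiter in the question's code suggests) and uses a made-up sample shortened to four columns; the variable names are illustrative only.

```python
import io
import pandas as pd

# Hypothetical comma-separated sample, shortened to four columns.
sample = "John,12,34,23\nBecky,23,40,93\nLio,63,90,53\n"

# Default behaviour: the first line is consumed as the header row,
# so John's record is no longer part of the data.
with_default = pd.read_csv(io.StringIO(sample))

# header=None: every line is treated as data and the columns are
# labelled with the integers 0, 1, 2, ...
with_none = pd.read_csv(io.StringIO(sample), header=None)

print(len(with_default), len(with_none))
print(list(with_none.columns))
print(with_none[0].tolist())
```

Without header=None, row positions computed from the raw file no longer line up with the DataFrame's rows, which is one reason the code in the question drops the wrong records.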

Then select the first column with df[0], test membership with Series.isin, and filter with boolean indexing:

df = df[df[0].isin(names)]
print (df)
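
As a small illustration of what the mask looks like, here is a sketch on a hand-built DataFrame standing in for the file (the three short rows are made up for the example). Negating the mask with ~ shows which rows would be discarded.

```python
import pandas as pd

names = ['Timothy', 'Joshua', 'Rio', 'Catherine', 'Poorva', 'Gome', 'Lachlan', 'John', 'Lio']

# Stand-in for the headerless file: columns are labelled 0, 1, 2.
df = pd.DataFrame([["John", 12, 34], ["Becky", 23, 40], ["Lio", 63, 90]])

# One boolean per row: True where the first column is in names.
mask = df[0].isin(names)

kept = df[mask]      # rows to keep
dropped = df[~mask]  # rows that would be removed

print(mask.tolist())
print(kept[0].tolist())
print(dropped[0].tolist())
```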

Finally, write the result to a file:

df.to_csv(csv_filename1, index = False, header=None)
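
Putting the three steps together, a minimal end-to-end sketch might look like the following. The helper name keep_named_rows is hypothetical, the file is assumed to be comma-separated, and the demo writes a shortened sample to a temporary directory rather than touching a real data.csv.

```python
import os
import tempfile
import pandas as pd

def keep_named_rows(path, names):
    """Rewrite a headerless CSV in place, keeping rows whose first field is in names."""
    df = pd.read_csv(path, header=None)           # columns become 0, 1, 2, ...
    kept = df[df[0].isin(names)]                  # boolean indexing on the first column
    kept.to_csv(path, index=False, header=False)  # no index column, no header row
    return kept

names = ['Timothy', 'Joshua', 'Rio', 'Catherine', 'Poorva', 'Gome', 'Lachlan', 'John', 'Lio']

# Demo on a temporary copy with made-up, shortened rows.
demo_path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(demo_path, "w") as f:
    f.write("John,12,34,23\nBecky,23,40,93\nLio,63,90,53\n")

keep_named_rows(demo_path, names)

with open(demo_path) as f:
    result = f.read()
print(result)
```

The file is read once, filtered once, and written once, instead of being re-read and rewritten inside a loop. If the file is actually whitespace-separated, as the listing in the question suggests, read_csv would need sep=r"\s+" for the first column to be split out correctly.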


Edited on 2021-01-21
