How to replace "00" with "N/A" in Python while skipping the first row and first column

拉朱·纳塔

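I am working with GWAS data that has 2 million columns and 522 rows. I need to replace every "00" in the data with "N/A". Because the file is huge, I am using the open_reader method. Can anyone help?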

Note: the first row and the first column need to be skipped.

Sample data:

ID,kgp11270025,kgp570033,rs707,kgp7500
1,CT,GT,CA,00
200,00,TG,00,GT
300,AA,00,CG,AA
400,GG,CC,AA,TA

Expected output:

ID,kgp11270025,kgp570033,rs707,kgp7500
1,CT,GT,CA,N/A
200,N/A,TG,N/A,GT
300,AA,N/A,CG,AA
400,GG,CC,AA,TA

The code I wrote:

import re

input_file = "test.csv"
output_file = "testresult.csv"

# print("Processing data from", input_file)
with open(input_file) as f:
    lineno = 0
    for line in f:
        lineno = lineno + 1
        if (lineno == 1):
            #need to skip first line
            # print("Skipping line 1 which is a header")
            print(line.rstrip())
        else:
            # print("Processing line {}".format(lineno))
            line = re.sub(r',00', ',N/A', line.rstrip())
            print(line)
    # print("Processed {} lines".format(lineno))

I tried this, but it still does not work. Any help would be appreciated!

大宇

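When I run it with print(line), the output looks fine.

So all that is left is to write to the output file, using the file keyword argument of print as follows: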

import re

input_file = "test.csv"
output_file = "testresult.csv"

# print("Processing data from", input_file)
with open(input_file) as f, open(output_file, "w") as g:
    lineno = 0
    for line in f:
        lineno = lineno + 1
        if (lineno == 1):
            #need to skip first line
            # print("Skipping line 1 which is a header")
            print(line.rstrip(),file=g)
        else:
            # print("Processing line {}".format(lineno))
            line = re.sub(r',00', ',N/A', line.rstrip())
            print(line,file=g)
    # print("Processed {} lines".format(lineno))

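Note that opening the input file with just its name is enough, because the default mode is reading text, but the output file needs the write mode ("w") to be specified.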
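One caveat with the pattern ',00' is that it also matches the start of any longer field that begins with "00" (for example ',001'). If that could occur in the data, a possible alternative is to split each line on commas and compare whole fields. The following is only a sketch under that assumption; the helper names replace_missing and convert are made up for illustration, and it assumes a plain comma-separated file without quoted fields.

```python
def replace_missing(line, missing="00", replacement="N/A"):
    """Replace fields equal to `missing` in every column except the first."""
    first, *rest = line.rstrip("\n").split(",")
    rest = [replacement if field == missing else field for field in rest]
    return ",".join([first, *rest])


def convert(input_path, output_path):
    """Stream the file line by line, copying the header row unchanged."""
    with open(input_path) as src, open(output_path, "w") as dst:
        header = next(src, None)
        if header is None:
            return  # empty input file, nothing to write
        dst.write(header.rstrip("\n") + "\n")
        for line in src:
            dst.write(replace_missing(line) + "\n")
```

Calling convert("test.csv", "testresult.csv") would then process the file one line at a time, so only a single row is held in memory at once. The first column is never touched, even if an ID happens to be "00".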
