我有一个csv文件和这样的数据模式:
我正在从csv文件导入它。在输入数据中,有一些空格,我通过使用上述模式来处理它。对于输出,我想编写一个函数,将该文件作为输入并输出最低和最高血压。同样,它将返回所有平均值的平均值。另一方面,我不应该使用熊猫。
我写下面的代码博客。
bloods=open(bloodfilename).read().split("\n")
blood_pressure=bloods[4].split(",")[1]
pattern=r"\s*(\d+)\s*\[\s*(\d+)-(\d+)\s*\]"
re.findall(pattern,blood_pressure)
#now extract mean, min and max information from the blood_pressure of each patinet and write a new file called blood_pressure_modified.csv
pattern=r"\s*(\d+)\s*\[\s*(\d+)-(\d+)\s*\]"
outputfilename="blood_pressure_modified.csv"
# create a writeable file
outputfile=open(outputfilename,"w")
for blood in bloods:
patient_id, blood_pressure=bloods.strip.split(",")
mean=re.findall(pattern,blood_pressure)[0]
blood_pressure_modified=re.sub(pattern,"",blood_pressure)
print(patient_id, blood_pressure_modified, mean, sep=",", file=outputfile)
outputfile.close()
输出应如下所示:
这是非常简单的答案。没有正则表达式,大熊猫或其他任何东西。让我知道这是否有效。我可以尝试使其在任何无法使用的情况下更好地工作。
bloods=open("bloodfilename.csv").read().split("\n")
means = []
'''
Also, rather than having two list of mins and maxs,
we can have just one and taking min and max from this
list later would do the same trick. But just for clarity I kept both.
'''
mins = []
maxs = []
for val in bloods[1:]: #assuming first line is header of the csv
mean, ranges = val.split(',')[1].split('[')
means.append(int(mean.strip()))
first, second = ranges.split(']')[0].split('-')
mins.append(int(first.strip()))
maxs.append(int(second.strip()))
print(f'the lowest and the highest blood pressure are: {min(mins)} {max(maxs)} respectively\naverage of mean values is {sum(means)/len(means)}')
您还可以创建函数来执行小的小条形任务。通常,这是一种更好的编码方式。我写得很着急,所以不要介意。
本文收集自互联网,转载请注明来源。
如有侵权,请联系 [email protected] 删除。
我来说两句