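They all start with "rsid_set(variable)". I have almost no coding experience, but I have been experimenting with R and Python. Is there a quick way to get just the columns I want?
Follow-up: is there a way to take the mean of each column and turn it into a normal distribution with 10,000 values?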
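# read in the tab-separated file (base R has no read.tsv; read.delim defaults to sep = "\t")
df <- read.delim("path/to/your/file")
# select only the columns whose names begin with rsid_set
# (the column index goes after the comma; before the comma it would select rows)
df <- df[, grep("^rsid_set", colnames(df)), drop = FALSE]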
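If you want to try the column selection without touching your real file, here is a minimal sketch on a toy data frame. The column names (sample_id, rsid_set_var1, rsid_set_var2, other_col) are made up for illustration, and startsWith() is used in place of grep() only because it avoids regular expressions; both approaches should pick the same columns.

```r
# Toy data frame standing in for the real file (column names are hypothetical).
df <- data.frame(
  sample_id     = c("a", "b", "c"),
  rsid_set_var1 = c(0.1, 0.4, 0.7),
  rsid_set_var2 = c(1.0, 2.0, 3.0),
  other_col     = c(5, 6, 7)
)

# Logical vector: TRUE for column names that begin with "rsid_set".
keep <- startsWith(colnames(df), "rsid_set")

# Index columns (after the comma); drop = FALSE keeps a data frame
# even if only one column matches.
rsid_df <- df[, keep, drop = FALSE]

print(colnames(rsid_df))
```

The drop = FALSE part matters when only one column matches: without it, R would return a plain vector instead of a data frame.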
I don't understand your follow-up, so you'll have to clarify what you want.
# Take the mean of each column (the columns must be numeric):
means <- colMeans(df)
# standard normal distribution (mean 0, sd 1) with 10,000 values
norms <- rnorm(1e4)
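One possible reading of the follow-up is "for each column, draw 10,000 values from a normal distribution centred on that column's mean". This is only a guess at the intent. The sketch below also assumes each column's own standard deviation should be used as the spread, which the question does not specify, and the toy column names are hypothetical.

```r
set.seed(1)  # only so this sketch is reproducible

# Toy numeric data frame standing in for the selected rsid_set columns.
df <- data.frame(
  rsid_set_var1 = c(0.1, 0.4, 0.7),
  rsid_set_var2 = c(1.0, 2.0, 3.0)
)

means <- colMeans(df)    # one mean per column
sds   <- sapply(df, sd)  # assumption: use each column's own sd as the spread

n_draws <- 1e4           # 10,000 values per column

# For each column, draw n_draws values from Normal(mean, sd).
# The result is a matrix with n_draws rows and one column per input column.
sims <- mapply(function(m, s) rnorm(n_draws, mean = m, sd = s), means, sds)

dim(sims)
```

If you would rather fix the spread at 1 for every column, replace sd = s with sd = 1. Each column of sims can then be inspected with hist(sims[, 1]) or summarised with colMeans(sims).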