Mapping SNP coordinates to gene names

AST

I have SNP IDs and coordinates in a BED file obtained from UCSC. I want to map them to their gene names.

chr1    9160974     9160975     rs1013578619    0   +
chr1    164528869   164528870   rs1016074293    0   +
chr1    192216772   192216773   rs1018731047    0   +
chr1    117157669   117157670   rs1022293363    0   +
chr1    33148118    33148119    rs1022386792    0   +

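I have looked at many posts suggesting bedtools intersect, the UCSC Table Browser, and so on, but I could not get a usable result. Please suggest an approach that works for this particular data.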

zx8754

We can use the biomaRt package:

# data
mySNPs <- read.table(text = "chr1    9160974     9160975     rs1013578619    0   +
chr1    164528869   164528870   rs1016074293    0   +
chr1    192216772   192216773   rs1018731047    0   +
chr1    117157669   117157670   rs1022293363    0   +
chr1    33148118    33148119    rs1022386792    0   +")
colnames(mySNPs) <- c("chr", "start", "end", "name", "x", "strand")

library(biomaRt)

snpmart = useMart(biomart = "ENSEMBL_MART_SNP", dataset = "hsapiens_snp")

# Check which filters and attributes we want to use:
# listAttributes(snpmart)
# listFilters(snpmart)

# result
getBM(attributes = c("refsnp_id", "chr_name", "chrom_start", "chrom_end", "ensembl_gene_stable_id"), 
      filters = c("snp_filter"), 
      values = mySNPs$name, 
      mart = snpmart)

#      refsnp_id chr_name chrom_start chrom_end ensembl_gene_stable_id
# 1 rs1013578619        1     9160975   9160975        ENSG00000228526
# 2 rs1016074293        1   164528870 164528870                       
# 3 rs1018731047        1   192216773 192216773        ENSG00000285280
# 4 rs1022293363        1   117157670 117157670        ENSG00000134258
# 5 rs1022386792        1    33148119  33148119        ENSG00000278997
# 6 rs1022386792        1    33148119  33148119        ENSG00000116525
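
The query above returns Ensembl gene IDs, while the question asks for gene names. One possible way to get there is a second lookup against the Ensembl gene mart, then a join back to the SNP table. This is only a sketch: the variable names (snp_ids, genemart, gene_names, annotated) are my own, it needs network access to Ensembl, and the rows you get back can differ between Ensembl releases. Some genes also have no HGNC symbol, so that column may be empty for them.

```r
library(biomaRt)

# rs IDs taken from the BED file in the question
snp_ids <- c("rs1013578619", "rs1016074293", "rs1018731047",
             "rs1022293363", "rs1022386792")

# Step 1: SNP ID -> Ensembl gene ID (same mart and filter as the query above)
snpmart <- useMart(biomart = "ENSEMBL_MART_SNP", dataset = "hsapiens_snp")
snp_genes <- getBM(attributes = c("refsnp_id", "ensembl_gene_stable_id"),
                   filters = "snp_filter",
                   values = snp_ids,
                   mart = snpmart)

# Step 2: Ensembl gene ID -> HGNC symbol.
# SNPs that overlap no gene come back with an empty gene ID, so drop those first.
genemart <- useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                    dataset = "hsapiens_gene_ensembl")
gene_ids <- unique(snp_genes$ensembl_gene_stable_id)
gene_ids <- gene_ids[!is.na(gene_ids) & gene_ids != ""]
gene_names <- getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"),
                    filters = "ensembl_gene_id",
                    values = gene_ids,
                    mart = genemart)

# Step 3: join the two tables; all.x = TRUE keeps SNPs without a gene
annotated <- merge(snp_genes, gene_names,
                   by.x = "ensembl_gene_stable_id", by.y = "ensembl_gene_id",
                   all.x = TRUE)
annotated
```

Note that a SNP can overlap more than one gene (rs1022386792 does in the output above), so the joined table can have more rows than there are SNPs.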
