I have this (simplified) dataset:
x <- read.table(text = ' id seq
1 1 AACCAAGCCCTTGCTCAAATCGAAAAAAAGTTGAGCAAACCGAGTTTTGAG
2 2 AAGTTGAGCAAACCGAGTTTTGAGACTTGGATGAAGTCAACCAAAGCCCAC')
So it looks like this:
id seq
1 1 AACCAAGCCCTTGCTCAAATCGAAAAAAAGTTGAGCAAACCGAGTTTTGAG
2 2 AAGTTGAGCAAACCGAGTTTTGAGACTTGGATGAAGTCAACCAAAGCCCAC
Then I pass it to cSplit:
cSplit(x, 'seq', direction = 'wide', stripWhite = FALSE, sep = '')
It returns TRUE in positions 20 and 32 instead of the characters themselves:
id seq_01 seq_02 seq_03 seq_04 seq_05 seq_06 seq_07 seq_08 seq_09 seq_10 seq_11 seq_12 seq_13 seq_14 seq_15 seq_16 seq_17 seq_18
1: 1 A A C C A A G C C C T T G C T C A A
2: 2 A A G T T G A G C A A A C C G A G T
seq_19 seq_20 seq_21 seq_22 seq_23 seq_24 seq_25 seq_26 seq_27 seq_28 seq_29 seq_30 seq_31 seq_32 seq_33 seq_34 seq_35 seq_36
1: A TRUE C G A A A A A A A G T TRUE G A G C
2: T TRUE T G A G A C T T G G A TRUE G A A G
seq_37 seq_38 seq_39 seq_40 seq_41 seq_42 seq_43 seq_44 seq_45 seq_46 seq_47 seq_48 seq_49 seq_50 seq_51
1: A A A C C G A G T T T T G A G
2: T C A A C C A A A G C C C A C
(If I instead change direction = 'wide' to direction = 'long' and then spread it myself using tidyr::spread, it looks fine.)
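The problem is the type.convert argument, which is TRUE by default. So if a column contains only T or only F, the values are read as TRUE/FALSE rather than the strings "T" or "F", and the column is converted to logical type. Set type.convert = FALSE to keep the characters: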
library(splitstackshape)
cSplit(x, 'seq', direction = 'wide', stripWhite = FALSE,
sep = '', type.convert = FALSE)
# id seq_01 seq_02 seq_03 seq_04 seq_05 seq_06 seq_07 seq_08 seq_09 seq_10 seq_11 seq_12 seq_13 seq_14 seq_15
#1: 1 A A C C A A G C C C T T G C T
#2: 2 A A G T T G A G C A A A C C G
# seq_16 seq_17 seq_18 seq_19 seq_20 seq_21 seq_22 seq_23 seq_24 seq_25 seq_26 seq_27 seq_28 seq_29 seq_30
#1: C A A A T C G A A A A A A A G
#2: A G T T T T G A G A C T T G G
# seq_31 seq_32 seq_33 seq_34 seq_35 seq_36 seq_37 seq_38 seq_39 seq_40 seq_41 seq_42 seq_43 seq_44 seq_45
#1: T T G A G C A A A C C G A G T
#2: A T G A A G T C A A C C A A A
# seq_46 seq_47 seq_48 seq_49 seq_50 seq_51
#1: T T T G A G
#2: G C C C A C
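The conversion described above can be reproduced in base R without splitstackshape. This is a minimal sketch that assumes cSplit applies utils::type.convert to each split column when type.convert = TRUE. The vectors col_T and col_mix are hypothetical stand-ins for the split columns, not part of the original data.

```r
# Hypothetical single-character columns, as they might look right after splitting.
col_T   <- c("T", "T")   # like seq_20 and seq_32: every value is "T"
col_mix <- c("A", "T")   # a column that mixes "T" with another base

# utils::type.convert() guesses a type from the strings it is given.
# "T" and "F" are accepted spellings of logical values, so an all-"T"
# column becomes logical, while a mixed column stays character.
res_T   <- type.convert(col_T, as.is = TRUE)
res_mix <- type.convert(col_mix, as.is = TRUE)

str(res_T)    # logi [1:2] TRUE TRUE
str(res_mix)  # chr [1:2] "A" "T"
```

This is why only positions 20 and 32 are affected: they are the only columns in which every row holds "T".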