I'm trying to split a list of DNA k-mers into dimers. To split them into nucleotides I've been using this tidyverse (stringr) function:
library(stringr)  # provides str_split_fixed()
kmer <- "ATTCCCGG"
ntd_kmr <- str_split_fixed(kmer, "", 8)
and the output is the following:
A,T,T,C,C,C,G,G
I would like to split the k-mer into overlapping dimers, so that the output looks like this:
AT,TT,TC,CC,CC,CG,GG
I know that the seqinr package has a function that does this, but I don't know how to do it with overlapping dimers.
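One possible way to get the overlapping dimers in base R is sketched below. It relies on `substring()` being vectorized over its start and end positions, so each start position i is paired with end position i + 1. The helper name `overlapping_dimers` is made up for this illustration and is not part of seqinr or stringr; it assumes a single k-mer string as input.

```r
# Hypothetical helper (not from any package): returns all overlapping
# 2-mers of a single string, stepping one character at a time.
overlapping_dimers <- function(kmer) {
  n <- nchar(kmer)
  # A string shorter than 2 characters contains no dimers.
  if (n < 2) return(character(0))
  starts <- seq_len(n - 1)
  # substring() is vectorized: one piece per (start, start + 1) pair.
  substring(kmer, starts, starts + 1)
}

kmer <- "ATTCCCGG"
dimers <- overlapping_dimers(kmer)
print(paste(dimers, collapse = ","))
# [1] "AT,TT,TC,CC,CC,CG,GG"
```

For a whole list of k-mers, something like `lapply(kmers, overlapping_dimers)` should give one character vector of dimers per k-mer.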
You can do it on the command line:
Almost... the expected output for your example input should be
at, tg, gc, cc, cg
With awk, you could get it: