Biostar Beta. Not for public use.
How can I retrieve FASTA sequences using IDs listed in another text file?
0
18 months ago
tcf.hcdg • 60
European Union

Dear Biostar community,

I have around 50,000 sequence identifiers in a text file and around 1 million sequences in a large FASTA file.

I want to extract the sequences for these 50,000 identifiers from the FASTA file.

I checked the questions and answers already on the community pages, but most of them use Perl, Linux tools, or Python.

How can this be done in R (fasomerecord, grep, or something else)?

Any suggestions?

Thanks

fasta • 1.0k views
1

You want this done with grep, but not in Linux?

0

I just want to know if this is possible in R.

1

samtools faidx is pretty handy.

6
13 months ago
Freiburg, Germany

If you really want to do this in R, you could use the seqinr package. You would load the FASTA file into memory with read.fasta(), subset the resulting list by matching its names() against your identifiers, and then write that subset to a file with write.fasta(). Presumably you could alternatively pass only the names you want to write.fasta().
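
A minimal sketch of that approach, assuming seqinr is installed and that each identifier matches the first word of a FASTA header exactly. The file paths and the tiny demo records are made up for illustration; replace them with your own files. As noted above, read.fasta() loads the whole file into memory, so 1 million sequences may need a fair amount of RAM.

```r
library(seqinr)

# Hypothetical demo files; replace these paths with your real FASTA and ID files.
fasta_path <- tempfile(fileext = ".fasta")
ids_path   <- tempfile(fileext = ".txt")
out_path   <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "ACGTACGT", ">seq2", "GGGCCC", ">seq3", "TTTTAAAA"), fasta_path)
writeLines(c("seq1", "seq3", "seq9"), ids_path)

# Read the wanted identifiers, one per line, dropping blanks and duplicates.
wanted <- unique(trimws(readLines(ids_path)))
wanted <- wanted[nzchar(wanted)]

# Load every record; as.string = TRUE keeps each sequence as a single string.
seqs <- read.fasta(fasta_path, seqtype = "DNA", as.string = TRUE)

# Keep only the records whose names appear in the ID list.
hits <- seqs[names(seqs) %in% wanted]

# Identifiers that were requested but not found in the FASTA file.
missing_ids <- setdiff(wanted, names(seqs))

# Write the subset to a new FASTA file.
write.fasta(sequences = hits, names = names(hits),
            file.out = out_path, as.string = TRUE)
```

Checking missing_ids afterwards is worthwhile: with 50,000 identifiers, a few usually fail to match because of trailing whitespace or because the header carries extra text after the ID.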

2
5.0 years ago
Indonesia
From Google, this may help: a-little-book-of-r-for-bioinformatics.readthedocs.org/en/latest/src/chapter1.html
