Matching gene id and extracting fasta sequences using shell script
8.8 years ago

Dear All,

I have a list of gene IDs, e.g.

gene01

gene22

gene27 and so on.

I need to extract these gene names together with their FASTA sequences from an assembly file. The assembly file looks like this:

>gene01
ATAGCGATCCCCCTTTTTCCTT
>gene02
ATACCCCCGCGAT
>gene03
ATACCCAAAAAAACCGCGAT

and so on.

Can anyone help me write a shell script that searches the assembly file for the gene names in my list and outputs each matching header with its associated DNA sequence? Example output for gene01:

>gene01
ATAGCGATCCCCCTTTTTCCTT

fasta shell • 3.9k views

I'm not sure if this website is like Stack Overflow, but you should really post what you tried instead of just asking people to do your work for you. It looks like you did not try anything at all. Many people on this site don't have much programming experience, but this is a relatively simple task: you could find the solution on Google or code it yourself in a few minutes.


Thank you for your reply. Your answer helped. I found this link: http://unix.stackexchange.com/questions/156783/getting-matched-fasta-file.
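
For anyone landing here later, a minimal sketch of one way to do this with awk. This is not necessarily the approach from the linked answer. The file names ids.txt, assembly.fasta and selected.fasta are hypothetical, and the script writes the example data from the question into a temporary directory only so that it runs on its own.

```shell
#!/bin/sh
# Sketch only: ids.txt, assembly.fasta and selected.fasta are hypothetical names.
# The sample data below is copied from the question so the script is self-contained.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'gene01\ngene22\ngene27\n' > ids.txt
printf '>gene01\nATAGCGATCCCCCTTTTTCCTT\n>gene02\nATACCCCCGCGAT\n>gene03\nATACCCAAAAAAACCGCGAT\n' > assembly.fasta

# First file (NR == FNR): store every non-empty ID in the "wanted" array.
# Second file: on each header line, strip the leading ">" from the first
# field and check whether that ID is wanted; print all lines of kept records.
awk '
    NR == FNR { if ($1 != "") wanted[$1] = 1; next }
    /^>/      { id = substr($1, 2); keep = (id in wanted) }
    keep
' ids.txt assembly.fasta > selected.fasta

cat selected.fasta
```

Because the ID is compared as a whole string, gene01 does not also pull in something like gene011, which a plain substring grep would. Sequences wrapped over several lines are kept too, since every line after a matching header is printed until the next header. IDs from the list that are absent from the assembly (gene22 and gene27 in this sample) are silently skipped.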
