How to extract a gene sequence by its coordinates on the reference genome?
5.0 years ago
zhangdengwei ▴ 210

Hi, how can I extract a gene sequence according to its coordinates on the reference genome? Thanks!

sequence • 2.3k views

What have you tried? I would suggest using Biopython and string slicing notation, of which there are many examples on the forum.

You also haven’t told us what format your data is in.
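To make the slicing suggestion concrete, here is a minimal sketch in plain Python. It assumes the coordinates are 1-based and inclusive (as in GFF or GenBank style annotations) and that the chromosome is already in memory as a string. The names `extract_region` and `reverse_complement` are made up for illustration. Biopython `Seq` objects accept the same slice notation, so the coordinate arithmetic carries over unchanged.

```python
# Hypothetical helpers; coordinates are ASSUMED to be 1-based and inclusive.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_region(chrom_seq: str, start: int, end: int, strand: str = "+") -> str:
    """Return bases start..end (1-based, inclusive) of chrom_seq.

    For strand '-' the reverse complement of that region is returned.
    """
    if start < 1 or end < start or end > len(chrom_seq):
        raise ValueError(f"invalid coordinates: {start}-{end}")
    # 1-based inclusive -> Python's 0-based half-open slice
    region = chrom_seq[start - 1:end]
    return reverse_complement(region) if strand == "-" else region


chrom = "ACGTTGCAAGGCTTAA"  # toy sequence, not real data
print(extract_region(chrom, 3, 8))       # GTTGCA
print(extract_region(chrom, 3, 8, "-"))  # TGCAAC
```

If the coordinates come from a BED file instead, they are already 0-based and half-open, so the `start - 1` adjustment would have to be dropped.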


This is a very briefly formulated question! Perhaps read this first: How To Ask Good Questions On Technical And Scientific Forums

What kind of input files do you have? Do you want to do this for a single gene, multiple genes, or all genes? Please take the effort to include a bit more information so that you get more suitable answers.


And do you want the whole gene (introns and exons), the cDNA, the CDS, or the protein sequence?


I am sorry I didn't state my question clearly. I am writing a Python script as part of my pipeline, and one step needs to fetch the DNA sequence for an arbitrary pair of start and end positions from a fairly large FASTA file. So what I want to ask is which tool or approach can handle this quickly: Biopython or something else?


If it's within a Python pipeline or project, then yes, Biopython is likely the more sensible option. Otherwise you could, for instance, also get this through BLAST if you have a blastdb-formatted version of your FASTA file.


Take a look at my code here: https://github.com/jrjhealey/bioinfo-tools/blob/master/Genbank_slicer.py

The same approach would work for FASTA files as well as GenBank files etc.


Thank you very much for your time!

5.0 years ago
zhangdengwei ▴ 210

I found a Python module named "pyfaidx" that satisfies my needs. It makes fetching a sequence from a FASTA file simple. Here is the link: https://pypi.org/project/pyfaidx/#description
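For anyone wondering why this is fast on a large file: pyfaidx works from a samtools-style `.fai` index, so a lookup can seek straight to the requested bases instead of reading the whole FASTA. The sketch below illustrates that index-then-seek idea using only the standard library. It is not pyfaidx's actual code. The function names `build_index` and `fetch` are made up, coordinates are assumed to be 1-based and inclusive, and every line within a record is assumed to have the same width except the last one.

```python
import os
import tempfile


def build_index(fasta_path):
    """Scan the FASTA once and record, per sequence, what a seek needs:
    total length, byte offset of the first base, bases per line, bytes per line."""
    index = {}
    name = None
    with open(fasta_path, "rb") as handle:
        while True:
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                index[name] = {"length": 0, "offset": handle.tell(),
                               "line_bases": 0, "line_bytes": 0}
            elif name is not None:
                bases = len(line.rstrip(b"\r\n"))
                entry = index[name]
                if entry["line_bases"] == 0:
                    # The first sequence line defines the line layout.
                    entry["line_bases"] = bases
                    entry["line_bytes"] = len(line)
                entry["length"] += bases
    return index


def fetch(fasta_path, index, name, start, end):
    """Return bases start..end (ASSUMED 1-based, inclusive) of sequence `name`."""
    entry = index[name]
    if start < 1 or end < start or end > entry["length"]:
        raise ValueError(f"invalid coordinates: {start}-{end}")
    per_line, line_bytes = entry["line_bases"], entry["line_bytes"]

    def byte_pos(i):
        # i is a 0-based base index; skip whole lines, then move within the line.
        return entry["offset"] + (i // per_line) * line_bytes + i % per_line

    first, last = byte_pos(start - 1), byte_pos(end - 1)
    with open(fasta_path, "rb") as handle:
        handle.seek(first)
        chunk = handle.read(last - first + 1)
    # The chunk may span line breaks, so drop them.
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()


# Toy demonstration on a throwaway file (not real data).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.fa")
    with open(path, "w", newline="\n") as out:
        out.write(">chr1 toy record\nACGTACGTAC\nGGGGCCCCTT\nAAT\n"
                  ">chr2\nTTTTAAAACC\nGG\n")
    idx = build_index(path)
    print(fetch(path, idx, "chr1", 9, 13))  # ACGGG
    print(fetch(path, idx, "chr2", 8, 12))  # ACCGG
```

In a real pipeline the module itself is the better choice. With pyfaidx installed, `Fasta(path)[name][start - 1:end].seq` should return the same string, since its slicing follows Python's 0-based convention.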


If you're done with it, close it.
