Biostar Beta. Not for public use.
Question: Extract gff of a particular chromosome
1

I have a GFF file that contains the annotation for chromosomes 1 to 14, but I need the GFF information for individual chromosomes. For example, I am running an experiment on chromosome 11 and trying to visualize my results in IGV. When I load the GFF file in IGV, it shows gene information for all chromosomes. How can I get a GFF containing only chromosome 11?

ADD COMMENTlink 3.6 years ago mhasa006 • 50 • updated 16 months ago ahmedferoz20 • 10
0

Why are you bothered by IGV showing you all chromosomes? You just need to double-click on the chromosome you are interested in to select and zoom to just that chromosome (or use the drop-down menu).

ADD REPLYlink 3.6 years ago
genomax
68k
0

Hello, I was trying to grep chromosome X from a gff file. I have an output but it is empty. I am a beginner in this. Please help me.

ADD REPLYlink 16 months ago
ahmedferoz20
• 10
0

OK, you are not supposed to ask questions in other people's threads, but before you revive even more old ones, please post the command you used and show the output of cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c

ADD REPLYlink 16 months ago
ATpoint
17k
0

Thanks a lot. Sorry, I was not aware of that. I am still not getting a file with only chromosome X. I used grep chrX myfile.gff > chrx.gff

ADD REPLYlink 16 months ago
ahmedferoz20
• 10
0

@ahmedferoz20 I deleted your comment because you added it as an answer instead of using Add Reply.

What is the output of cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c

ADD REPLYlink 16 months ago
ATpoint
17k
0

Yes, OK, I am doing it now.

  1. I used grep chrX myfile.gff > chrx.gff to extract chrX.

  2. The output of the command you gave me is

## species https://www.ncbi.nlm.nih.gov/Taxonomoy/Browser/www.tax.cgi?id=70

I hope I have followed your suggestions for getting help.

ADD REPLYlink 16 months ago
ahmedferoz20
• 10
• updated 16 months ago
genomax
68k
1

What? A link to NCBI is certainly not the output. The output of head -n 20 your.gff will also do.

ADD REPLYlink 16 months ago
ATpoint
17k
0

@ATpoint, I admire your patience.

ADD REPLYlink 16 months ago
Carambakaracho
♦ 1.2k
0

We were all inexperienced at some point. I assume the problem is that the chromosomes are labelled 1, 2, 3 ... X rather than chr1, chr2, chr3 ... chrX. That is why I asked for the output of cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c, because it lists the unique chromosome names.
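
A minimal sketch of that check, assuming a hypothetical file demo.gff (made up for this example) whose sequences are named 1 and X rather than chr1 and chrX. Filtering the comment lines before sorting should give the same counts as the pipeline above:

```shell
# Build a small demo GFF (hypothetical data) with unprefixed chromosome names.
printf '##gff-version 3\n' > demo.gff
printf '1\tsrc\tgene\t100\t200\t.\t+\t.\tID=g1\n' >> demo.gff
printf 'X\tsrc\tgene\t300\t400\t.\t-\t.\tID=g2\n' >> demo.gff
printf 'X\tsrc\tgene\t500\t600\t.\t+\t.\tID=g3\n' >> demo.gff

# List the unique sequence names in column 1 together with their line counts.
grep -v '^#' demo.gff | cut -f1 | sort | uniq -c

# Searching for 'chrX' matches nothing in this file, hence an empty output file.
# grep exits non-zero when nothing matches, so '|| true' keeps a strict shell going.
grep -c 'chrX' demo.gff || true

# Matching the name that is actually used in column 1 works.
awk -F'\t' '$1 == "X"' demo.gff > demo_X.gff
```

If the names in your own file turn out to be something else again (for example accession numbers), use exactly the string shown in column 1.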

ADD REPLYlink 16 months ago
ATpoint
17k
0

Thanks a lot, I got it. My next challenge, however, is to create a FASTA file from that chromosome X file: it has to contain the 1000 bp upstream of each gene.
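
One possible first step, sketched here under assumptions: the file demo_chrX.gff and its three genes are made up, features of interest have "gene" in column 3, and "upstream" means the 1000 bp before the gene start on the gene's own strand. The sketch only computes the upstream intervals in BED format (0-based start); it does not clip minus-strand intervals at the chromosome end, which would need the chromosome lengths.

```shell
# Hypothetical input: a tiny tab-separated GFF with genes on both strands.
printf 'chrX\tsrc\tgene\t5000\t6000\t.\t+\t.\tID=geneA\n'  > demo_chrX.gff
printf 'chrX\tsrc\tgene\t8000\t9000\t.\t-\t.\tID=geneB\n' >> demo_chrX.gff
printf 'chrX\tsrc\tgene\t500\t900\t.\t+\t.\tID=geneC\n'   >> demo_chrX.gff

awk -F'\t' -v OFS='\t' -v flank=1000 '
  /^#/ || $3 != "gene" { next }                 # skip comments and non-gene features
  $7 == "-" { s = $5; e = $5 + flank }          # minus strand: region after the gene end
  $7 != "-" { s = $4 - 1 - flank; e = $4 - 1    # plus strand: region before the gene start
              if (s < 0) s = 0 }                # clip at the start of the chromosome
  e > s     { print $1, s, e, $9, ".", $7 }     # BED6: chrom, start, end, name, score, strand
' demo_chrX.gff > upstream.bed

cat upstream.bed
```

The resulting BED file could then be passed, together with the genome FASTA, to a tool such as bedtools getfasta (its -s option reverse-complements minus-strand intervals) to obtain the sequences. That step is not shown here because it depends on your genome file.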

ADD REPLYlink 16 months ago
ahmedferoz20
• 10
5

Assuming your chromosomes are named chrNN, the following should extract chr11:

grep chr11 your_file.gff > chr11.gff

If chr11 is in the first column and you want only those lines, then do:

grep '^chr11' your_file.gff > chr11.gff
ADD COMMENTlink 3.6 years ago genomax 68k
3

Small correction: sometimes when you grep 'chr1' you also end up getting 'chr11', 'chr12', etc. Adding '-w' solves the problem.

grep -w chr11 your_file.gff > chr11.gff
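
To illustrate the difference, a small sketch on a hypothetical file (demo_w.gff is made up for this example). Note that -w still matches the word anywhere on the line, not only in column 1:

```shell
# Hypothetical demo data: one feature each on chr1, chr11 and chr12.
printf 'chr1\tsrc\tgene\t100\t200\t.\t+\t.\tID=a\n'   > demo_w.gff
printf 'chr11\tsrc\tgene\t100\t200\t.\t+\t.\tID=b\n' >> demo_w.gff
printf 'chr12\tsrc\tgene\t100\t200\t.\t+\t.\tID=c\n' >> demo_w.gff

# Plain substring match: chr1 also hits chr11 and chr12 (3 lines).
grep -c 'chr1' demo_w.gff

# Whole-word match: only the chr1 line remains (1 line).
grep -c -w 'chr1' demo_w.gff
```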

ADD REPLYlink 3.6 years ago
EagleEye
6.4k
1

With awk for exact matches, preserving the header:

awk '$1 ~ /^#/ {print $0;next} {if ($1 == "chr11") print}' your_file.gff
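
The same idea can be extended to write one file per chromosome in a single pass. This is only a sketch: demo_all.gff and output names such as split_chr11.gff are made up for the example, and it assumes the header lines come before the first feature line. Comment lines further down the file would be carried into the files of chromosomes that first appear after them.

```shell
# Hypothetical demo data with a header line and two chromosomes.
printf '##gff-version 3\n' > demo_all.gff
printf 'chr1\tsrc\tgene\t100\t200\t.\t+\t.\tID=a\n'  >> demo_all.gff
printf 'chr11\tsrc\tgene\t300\t400\t.\t-\t.\tID=b\n' >> demo_all.gff
printf 'chr11\tsrc\tgene\t500\t600\t.\t+\t.\tID=c\n' >> demo_all.gff

awk -F'\t' '
  /^#/ { header = header $0 "\n"; next }   # collect header/comment lines
  {
    out = "split_" $1 ".gff"               # one output file per value in column 1
    if (!(out in seen)) {                  # first feature seen for this chromosome:
      printf "%s", header > out            # start its file with the collected header
      seen[out] = 1
    }
    print > out                            # awk keeps the file open, so this appends
  }
' demo_all.gff

ls split_*.gff
```

For the 14 chromosomes in the original question, this keeps 14 output files open at once, which is well within the usual limits.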
ADD REPLYlink 16 months ago
ATpoint
17k
