filter genome above N50 value
5.4 years ago

Hi everyone, I want to filter my assembly to get contigs above the N50 value only.

How can this be done?

--Thanks in advance

assembly • 1.5k views

You realize that your new filtered assembly will have a new, higher N50 value and you are just chasing a moving target until you have one contig left?
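To make the moving-target point concrete, here is a minimal sketch that applies the N50 filter repeatedly to a made-up list of contig lengths. The function names `n50` and `iterate_n50_filter` and the example lengths are assumptions for illustration only, and N50 is computed with the common definition (the length of the contig at which the running sum of lengths, sorted longest first, reaches half the assembly size).

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # First contig at which the cumulative length covers half the assembly.
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


def iterate_n50_filter(lengths):
    """Filter at N50 again and again; record (cutoff, contigs kept) per round.

    Stops once a round no longer removes any contig.
    """
    history = []
    current = list(lengths)
    while True:
        cutoff = n50(current)
        kept = [x for x in current if x >= cutoff]
        history.append((cutoff, len(kept)))
        if len(kept) == len(current):
            break
        current = kept
    return history


if __name__ == "__main__":
    # Hypothetical contig lengths, not taken from any real assembly.
    for cutoff, kept in iterate_n50_filter([100, 80, 60, 40, 30, 20, 10]):
        print(f"N50 cutoff = {cutoff}, contigs kept = {kept}")
```

With these example lengths the cutoff rises from 80 to 100 and a single contig remains. If several contigs tie for the longest length, the loop stops with all of them instead.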

5.4 years ago
Michael 54k

Calculate the N50 value, and extract all sequences with length >= N50 from your fasta file. The question is just, why? N50 is not a magical threshold below which contigs are not real.
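If you do want to try it, the two steps above could be sketched roughly as follows in plain Python, with no external libraries. The function names (`read_fasta`, `n50`, `filter_by_n50`) and the file names in the usage note are hypothetical, and the sketch loads the whole assembly into memory, which is an assumption that may not hold for very large genomes.

```python
def read_fasta(path):
    """Yield (header, sequence) tuples from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def n50(lengths):
    """Return the N50 of a list of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


def filter_by_n50(in_path, out_path):
    """Write contigs with length >= N50 to out_path and return the cutoff."""
    records = list(read_fasta(in_path))
    cutoff = n50([len(seq) for _, seq in records])
    with open(out_path, "w") as out:
        for header, seq in records:
            if len(seq) >= cutoff:
                # Sequences are written unwrapped, one line per contig.
                out.write(f">{header}\n{seq}\n")
    return cutoff
```

Usage would be something like `filter_by_n50("assembly.fasta", "assembly.ge_n50.fasta")`, where both file names are placeholders.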


Hi! I have two draft genomes of an organism, and I need to decide which of the two to use as the reference for my downstream analyses (transcriptome, SNP profiling, etc.). For this genome-to-genome comparison I have tried SynMap, LAST, and Mauve, but I still cannot reach a conclusion. That is why I want to filter both assemblies at N50 and see where it goes. Please suggest any alternatives as well, if possible.

--Thanks in advance


Have you tried to check if you can reconcile the two assemblies to see if you can make a better combined one?


There is a "newer" pipeline for combining two assemblies called NucMerge that you might try (https://www.biorxiv.org/content/early/2018/11/30/483701), but I would first consider the suggestions in C: filter genome above N50 value


There are many:

  • remapping completeness of DNA/RNA seq
  • linkage map - linkage errors
  • BUSCO
  • estimate of contamination, e.g. by bacterial contigs
  • repeat rate, GC content, assembly size vs. expected values
  • ....
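
Of the checks listed above, GC content and assembly size are the simplest to compute yourself. Here is a rough sketch; the function name `assembly_stats`, the `expected_size` parameter, and the returned keys are all made up for illustration. It assumes you already have the contig sequences as strings, and it excludes `N` bases from the GC denominator, which is one common convention but not the only one.

```python
def assembly_stats(sequences, expected_size=None):
    """Summarise contig count, total length, and GC fraction of an assembly.

    expected_size is an optional genome size estimate (hypothetical input,
    e.g. from flow cytometry or k-mer analysis) used for a simple ratio.
    """
    n_contigs = 0
    total = 0
    gc = 0
    unknown = 0
    for seq in sequences:
        s = seq.upper()
        n_contigs += 1
        total += len(s)
        gc += s.count("G") + s.count("C")
        unknown += s.count("N")
    called = total - unknown  # bases that are not N
    stats = {
        "n_contigs": n_contigs,
        "total_length": total,
        "gc_fraction": gc / called if called else 0.0,
    }
    if expected_size:
        stats["size_ratio"] = total / expected_size
    return stats
```

Running this on both draft assemblies and comparing the numbers to what is expected for the organism gives a quick first sanity check. BUSCO, read remapping, and contamination screening need their own dedicated tools.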

Thanks for your suggestions!
