I have some paired-end sequencing reads, 300 bp long. According to FastQC, the adapter content is too high for both runs. However, the only overrepresented sequences that FastQC reports are polyA and polyT. Why would the adapter sequence not show up as an overrepresented sequence? And is there an easy way to figure out the sequence of the adapter that is contaminating my reads?
Thanks!
I would be interested in the latter option, genomax2 :)
See the update above.
Ah, yeah, I tried that recently, but my reads didn't overlap enough to call adapters :(
Good tool/idea though.
Can you try this (replace N with the expected length of your adapter) to see if it works?
$ commonkmers.sh in=reads.fq out=kmers.txt k=N count=t display=999
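
If it helps to see the idea spelled out, here is a rough Python sketch of the same kind of k-mer tally. It is only an illustration under my own assumptions, not what commonkmers.sh does internally: the function names `read_sequences` and `top_kmers` are made up for this example, and it assumes a plain, uncompressed, 4-line-per-record FASTQ. A k-mer that comes from an adapter should float to the top of the list if the contamination is heavy.

```python
from collections import Counter

def read_sequences(lines):
    """Yield the sequence line of each record, assuming 4-line FASTQ records."""
    for i, line in enumerate(lines):
        if i % 4 == 1:  # second line of every record holds the bases
            yield line.strip().upper()

def top_kmers(seqs, k, n=10):
    """Count every k-mer in every read and return the n most frequent ones."""
    counts = Counter()
    for seq in seqs:
        # Slide a window of width k along the read
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts.most_common(n)
```

To try it on a file, something like `with open("reads.fq") as fh: print(top_kmers(read_sequences(fh), k=20))` should do, where `reads.fq` and `k=20` are placeholders. On a full run you would probably want to subsample the reads first, since counting all k-mers in memory gets expensive.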