Error of SAMSequenceDictionary
5.9 years ago

Hi there. After sorting and indexing my SAM file with igvtools, I loaded it into IGV and got the following error: Error loading: Cannot add sequence that already exists in SAMSequenceDictionary. My sorted SAM file's header is posted below. Any suggestions? Thanks!

SAM IGV Samtools • 2.9k views
@HD VN:1.4  SO:unsorted
@SQ SN:chr1 LN:249250621
@SQ SN:chr10    LN:135534747
@SQ SN:chr11    LN:135006516
@SQ SN:chr11_gl000202_random    LN:40103
@SQ SN:chr12    LN:133851895
@SQ SN:chr13    LN:115169878
@SQ SN:chr14    LN:107349540
@SQ SN:chr15    LN:102531392
@SQ SN:chr16    LN:90354753
@SQ SN:chr17    LN:81195210
@SQ SN:chr17_ctg5_hap1  LN:1680828
@SQ SN:chr17_gl000203_random    LN:37498
@SQ SN:chr17_gl000204_random    LN:81310
@SQ SN:chr17_gl000205_random    LN:174588
@SQ SN:chr17_gl000206_random    LN:41001
@SQ SN:chr18    LN:78077248
@SQ SN:chr18_gl000207_random    LN:4262
@SQ SN:chr19    LN:59128983
@SQ SN:chr19_gl000208_random    LN:92689
@SQ SN:chr19_gl000209_random    LN:159169
@SQ SN:chr1_gl000191_random LN:106433
@SQ SN:chr1_gl000192_random LN:547496
@SQ SN:chr2 LN:243199373
@SQ SN:chr20    LN:63025520
@SQ SN:chr21    LN:48129895
@SQ SN:chr21_gl000210_random    LN:27682
@SQ SN:chr22    LN:51304566
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr4_ctg9_hap1   LN:590426
@SQ SN:chr4_gl000193_random LN:189789
@SQ SN:chr4_gl000194_random LN:191469
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr6_apd_hap1    LN:4622290
@SQ SN:chr6_cox_hap2    LN:4795371
@SQ SN:chr6_dbb_hap3    LN:4610396
@SQ SN:chr6_mann_hap4   LN:4683263
@SQ SN:chr6_mcf_hap5    LN:4833398
@SQ SN:chr6_qbl_hap6    LN:4611984
@SQ SN:chr6_ssto_hap7   LN:4928567
@SQ SN:chr7 LN:159138663
@SQ SN:chr7_gl000195_random LN:182896
@SQ SN:chr8 LN:146364022
@SQ SN:chr8_gl000196_random LN:38914
@SQ SN:chr8_gl000197_random LN:37175
@SQ SN:chr9 LN:141213431
@SQ SN:chr9_gl000198_random LN:90085
@SQ SN:chr9_gl000199_random LN:169874
@SQ SN:chr9_gl000200_random LN:187035
@SQ SN:chr9_gl000201_random LN:36148
@SQ SN:chrM LN:16571
@SQ SN:chrUn_gl000211   LN:166566
@SQ SN:chrUn_gl000212   LN:186858
@SQ SN:chrUn_gl000213   LN:164239
@SQ SN:chrUn_gl000214   LN:137718
@SQ SN:chrUn_gl000215   LN:172545
@SQ SN:chrUn_gl000216   LN:172294
@SQ SN:chrUn_gl000217   LN:172149
@SQ SN:chrUn_gl000218   LN:161147
@SQ SN:chrUn_gl000219   LN:179198
@SQ SN:chrUn_gl000220   LN:161802
@SQ SN:chrUn_gl000221   LN:155397
@SQ SN:chrUn_gl000222   LN:186861
@SQ SN:chrUn_gl000223   LN:180455
@SQ SN:chrUn_gl000224   LN:179693
@SQ SN:chrUn_gl000225   LN:211173
@SQ SN:chrUn_gl000226   LN:15008
@SQ SN:chrUn_gl000227   LN:128374
@SQ SN:chrUn_gl000228   LN:129120
@SQ SN:chrUn_gl000229   LN:19913
@SQ SN:chrUn_gl000230   LN:43691
@SQ SN:chrUn_gl000231   LN:27386
@SQ SN:chrUn_gl000232   LN:40652
@SQ SN:chrUn_gl000233   LN:45941
@SQ SN:chrUn_gl000234   LN:40531
@SQ SN:chrUn_gl000235   LN:34474
@SQ SN:chrUn_gl000236   LN:41934
@SQ SN:chrUn_gl000237   LN:45867
@SQ SN:chrUn_gl000238   LN:39939
@SQ SN:chrUn_gl000239   LN:33824
@SQ SN:chrUn_gl000240   LN:41933
@SQ SN:chrUn_gl000241   LN:42152
@SQ SN:chrUn_gl000242   LN:43523
@SQ SN:chrUn_gl000243   LN:43341
@SQ SN:chrUn_gl000244   LN:39929
@SQ SN:chrUn_gl000245   LN:36651
@SQ SN:chrUn_gl000246   LN:38154
@SQ SN:chrUn_gl000247   LN:36422
@SQ SN:chrUn_gl000248   LN:39786
@SQ SN:chrUn_gl000249   LN:38502
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@SQ SN:chr1 LN:195471971
@SQ SN:chr10    LN:130694993
@SQ SN:chr11    LN:122082543
@SQ SN:chr12    LN:120129022
@SQ SN:chr13    LN:120421639
@SQ SN:chr14    LN:124902244
@SQ SN:chr15    LN:104043685
@SQ SN:chr16    LN:98207768
@SQ SN:chr17    LN:94987271
@SQ SN:chr18    LN:90702639
@SQ SN:chr19    LN:61431566
@SQ SN:chr1_GL456210_random LN:169725
@SQ SN:chr1_GL456211_random LN:241735
@SQ SN:chr1_GL456212_random LN:153618
@SQ SN:chr1_GL456213_random LN:39340
@SQ SN:chr1_GL456221_random LN:206961
@SQ SN:chr2 LN:182113224
@SQ SN:chr3 LN:160039680
@SQ SN:chr4 LN:156508116
@SQ SN:chr4_GL456216_random LN:66673
@SQ SN:chr4_GL456350_random LN:227966
@SQ SN:chr4_JH584292_random LN:14945
@SQ SN:chr4_JH584293_random LN:207968
@SQ SN:chr4_JH584294_random LN:191905
@SQ SN:chr4_JH584295_random LN:1976
@SQ SN:chr5 LN:151834684
@SQ SN:chr5_GL456354_random LN:195993
@SQ SN:chr5_JH584296_random LN:199368
@SQ SN:chr5_JH584297_random LN:205776
@SQ SN:chr5_JH584298_random LN:184189
@SQ SN:chr5_JH584299_random LN:953012
@SQ SN:chr6 LN:149736546
@SQ SN:chr7 LN:145441459
@SQ SN:chr7_GL456219_random LN:175968
@SQ SN:chr8 LN:129401213
@SQ SN:chr9 LN:124595110
@SQ SN:chrM LN:16299
@SQ SN:chrUn_GL456359   LN:22974
@SQ SN:chrUn_GL456360   LN:31704
@SQ SN:chrUn_GL456366   LN:47073
@SQ SN:chrUn_GL456367   LN:42057
@SQ SN:chrUn_GL456368   LN:20208
@SQ SN:chrUn_GL456370   LN:26764
@SQ SN:chrUn_GL456372   LN:28664
@SQ SN:chrUn_GL456378   LN:31602
@SQ SN:chrUn_GL456379   LN:72385
@SQ SN:chrUn_GL456381   LN:25871
@SQ SN:chrUn_GL456382   LN:23158
@SQ SN:chrUn_GL456383   LN:38659
@SQ SN:chrUn_GL456385   LN:35240
@SQ SN:chrUn_GL456387   LN:24685
@SQ SN:chrUn_GL456389   LN:28772
@SQ SN:chrUn_GL456390   LN:24668
@SQ SN:chrUn_GL456392   LN:23629
@SQ SN:chrUn_GL456393   LN:55711
@SQ SN:chrUn_GL456394   LN:24323
@SQ SN:chrUn_GL456396   LN:21240
@SQ SN:chrUn_JH584304   LN:114452
@SQ SN:chrX LN:171031299
@SQ SN:chrX_GL456233_random LN:336933
@SQ SN:chrY LN:91744698
@SQ SN:chrY_JH584300_random LN:182347
@SQ SN:chrY_JH584301_random LN:259875
@SQ SN:chrY_JH584302_random LN:155838
@SQ SN:chrY_JH584303_random LN:158099

Sorry, I need to post the header separately because the Biostars system thinks it is in German rather than English...


Please edit the original post to add new information.

Always use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

5.9 years ago
GenoMax 141k

If the listing is correct then you seem to have several chromosomes listed in the header twice.

   2 SN:chr1
   2 SN:chr10
   2 SN:chr11
   2 SN:chr12
   2 SN:chr13
   2 SN:chr14
   2 SN:chr15
   2 SN:chr16
   2 SN:chr17
   2 SN:chr18
   2 SN:chr19
   2 SN:chr2
   2 SN:chr3
   2 SN:chr4
   2 SN:chr5
   2 SN:chr6
   2 SN:chr7
   2 SN:chr8
   2 SN:chr9
   2 SN:chrM
   2 SN:chrX
   2 SN:chrY
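
A count like the one above can be reproduced with a short script. The following is only a minimal sketch: the function names are hypothetical, and the sample header is a three-line excerpt of the @SQ lines posted in this thread, not the full file.

```python
from collections import Counter

def count_sq_names(header_lines):
    """Count how often each SN: name appears among @SQ header lines."""
    counts = Counter()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        # Real SAM headers are tab-separated; split() also tolerates the
        # space-separated form in which the header was pasted here.
        for field in line.split()[1:]:
            if field.startswith("SN:"):
                counts[field[3:]] += 1
    return counts

def duplicated_names(header_lines):
    """Return only the sequence names that occur more than once."""
    return {name: n for name, n in count_sq_names(header_lines).items() if n > 1}

# Excerpt of the posted header (illustrative subset only).
header = [
    "@HD VN:1.4  SO:unsorted",
    "@SQ SN:chr1 LN:249250621",
    "@SQ SN:chr10    LN:135534747",
    "@SQ SN:chr1 LN:195471971",
]
print(duplicated_names(header))  # prints {'chr1': 2}
```

Any name reported here appears in more than one @SQ line, which is the condition the error message describes.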

See the solution proposed in this samtools thread to address the error you are seeing.


Thanks! That post proposes deleting the duplicate lines by hand. However, I'm not sure which duplicates I should delete.


I have shown you the duplicated entries in my answer above.


For example, chr1 has @SQ SN:chr1 LN:249250621, @SQ SN:chr1_gl000191_random LN:106433, @SQ SN:chr1_gl000192_random LN:547496, etc. Which should I delete? Thanks!


Those are not duplicated. The main chromosome listings are duplicated, e.g.:

@SQ SN:chr3 LN:198022430
@SQ SN:chr3 LN:198022430

Sorry, but I still don't know what to do with the header. Do you have any suggestions? For example, I have @SQ SN:chr3 LN:198022430 and @SQ SN:chr3 LN:160039680. What should I do? Thanks!


Do you have any suggestions? Thanks!


This is the solution posted in the link I had included above:

Use samtools view to convert your BAM to SAM, and then edit the SAM to remove the extraneous @SQ line in the header, and then convert back to BAM using either samtools or Picard.
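
The header-editing step could be sketched roughly as follows, working on the SAM text that samtools view produces. This is an illustration under assumptions, not the method from the linked thread: drop_repeated_sq and its keep_lengths parameter are hypothetical names. Which copy of a repeated name to keep depends on the reference the reads were actually aligned to, so the default "keep the first occurrence" rule is only a placeholder.

```python
def drop_repeated_sq(lines, keep_lengths=None):
    """Yield SAM lines, keeping at most one @SQ line per sequence name.

    keep_lengths (hypothetical option): dict mapping a sequence name to the
    LN value that should survive. For names not listed, the first @SQ line
    seen is kept. All non-@SQ lines pass through unchanged.
    """
    keep_lengths = keep_lengths or {}
    seen = set()
    for line in lines:
        if line.startswith("@SQ"):
            # Parse TAG:VALUE fields; split() handles tabs or spaces.
            fields = dict(f.split(":", 1) for f in line.split()[1:] if ":" in f)
            name = fields.get("SN")
            wanted = keep_lengths.get(name)
            if name in seen:
                continue  # a line for this name was already kept
            if wanted is not None and fields.get("LN") != str(wanted):
                continue  # not the length chosen for this name
            seen.add(name)
        yield line

# The two chr3 lines quoted in this thread.
sam = [
    "@HD VN:1.4  SO:unsorted",
    "@SQ SN:chr3 LN:198022430",
    "@SQ SN:chr3 LN:160039680",
]
print(list(drop_repeated_sq(sam)))                         # keeps the first chr3 line
print(list(drop_repeated_sq(sam, {"chr3": 160039680})))    # keeps the second chr3 line
```

In practice the lines would be read from the SAM file and written to a new file, which is then converted back to BAM as described above.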


I don't quite understand what an "extraneous @SQ line" is. For example, I have @SQ SN:chr3 LN:198022430 and @SQ SN:chr3 LN:160039680. Should I remove both of them?


Hmm. I thought you had two lines that were identical. Like this example.

@SQ SN:chr3 LN:198022430
@SQ SN:chr3 LN:198022430
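
Whether a repeated name is an exact duplicate like the pair above, or the same name with different lengths as in the question, can be checked mechanically. The sketch below is an assumption-laden illustration with hypothetical function names; the sample lines are taken from this thread.

```python
from collections import defaultdict

def sq_lengths_by_name(header_lines):
    """Map each @SQ sequence name to the list of LN values seen for it."""
    lengths = defaultdict(list)
    for line in header_lines:
        if line.startswith("@SQ"):
            fields = dict(f.split(":", 1) for f in line.split()[1:] if ":" in f)
            if "SN" in fields:
                lengths[fields["SN"]].append(fields.get("LN"))
    return lengths

def classify_repeats(header_lines):
    """Label each repeated name as 'identical' or 'conflicting' by its lengths."""
    report = {}
    for name, lns in sq_lengths_by_name(header_lines).items():
        if len(lns) > 1:
            report[name] = "identical" if len(set(lns)) == 1 else "conflicting"
    return report

header = [
    "@SQ SN:chr3 LN:198022430",
    "@SQ SN:chr3 LN:198022430",   # exact duplicate, as in the example above
    "@SQ SN:chr4 LN:191154276",
    "@SQ SN:chr4 LN:156508116",   # same name, different length, as in the posted header
]
print(classify_repeats(header))  # prints {'chr3': 'identical', 'chr4': 'conflicting'}
```

A "conflicting" result means the header carries two different lengths for one name, so deciding which line to keep requires knowing which reference was used for alignment.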

Where do you see that? I don't actually have two identical lines like that. Thanks!


When I grabbed copies of the data from your two posts, they contained duplications for the chromosome names I originally listed in my answer.

Are you saying that you don't have any duplications in your @SQ lines and are still getting that error?


Yes. I don't have any duplications in my @SQ lines and still got the error.


Do you have any suggestions? Thanks!
