Question

Bam to mapped fasta

1


4.9 years ago

bioz ▴ 20

Hi,

I have a BAM file and I need two FASTA files: one for what mapped exactly to the reference, and one for what did not map to the reference.

How can I do this with samtools?

RNA-Seq • 1.3k views

ADD COMMENT • link updated 4.9 years ago by Devon Ryan 104k • written 4.9 years ago by bioz ▴ 20

score 1 · Answer 1 · 2019-05-13

1


4.9 years ago

Devon Ryan 104k

You already have the one with exact matches: it's the reference fasta file you used for mapping. For the other fasta file, see the many questions like this and the links provided by Genomax: How to get the consensus sequence from a BAM alignment
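If what is actually wanted is the reads themselves, split by mapping status, something along these lines might work. This is only a sketch: the function name and the output file names are made up for illustration, and it assumes a samtools version that provides the `fasta` subcommand with the `-f`/`-F` flag filters. Flag 4 marks unmapped reads; 0x900 (secondary plus supplementary) is excluded so each read appears only once.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write mapped and unmapped reads to two FASTA files.
# Usage: split_bam_to_fasta input.bam output_prefix
split_bam_to_fasta() {
    local bam="$1" prefix="$2"

    # Mapped reads: drop unmapped (0x4), secondary (0x100) and
    # supplementary (0x800) records, i.e. -F 0x904.
    samtools fasta -F 0x904 "$bam" > "${prefix}.mapped.fa"

    # Unmapped reads: require flag 4, still dropping secondary/supplementary.
    samtools fasta -f 4 -F 0x900 "$bam" > "${prefix}.unmapped.fa"
}

# Example call (file names are placeholders):
# split_bam_to_fasta markdup.bam sample
```

Note that "mapped" here only means the aligner placed the read, not that it matches the reference without mismatches.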

ADD COMMENT • link 4.9 years ago by Devon Ryan 104k

0

I wasn't given the reference fasta file that was used. Is there any way I could extract it from the BAM file that was provided to me?

ADD REPLY • link 4.9 years ago by vaish01kv • 0

0


It should be apparent from the BAM header where the fasta file was from.

ADD REPLY • link 4.9 years ago by Devon Ryan 104k

0

[garbled binary output omitted] This was the header when I tried to open the BAM file.

ADD REPLY • link 4.9 years ago by vaish01kv • 0

1


samtools view -H
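A BAM file is compressed binary, so it has to be decoded by samtools rather than opened in a text editor. As a rough sketch of how the header could then be searched for hints about the reference, the helper below (the name `pg_command_lines` is made up) reads header text on standard input and prints the command line recorded in each @PG record. It assumes the header fields are tab-separated, as the SAM format specifies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the CL: (command line) field of every @PG
# header record read from stdin.
pg_command_lines() {
    awk -F'\t' '
        /^@PG/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^CL:/) {
                    print substr($i, 4)   # strip the leading "CL:"
                }
            }
        }'
}

# Example call (the BAM file name is a placeholder):
# samtools view -H markdup.bam | pg_command_lines
```

The aligner's command line usually names the reference fasta it was given, which is often enough to identify the genome build.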

ADD REPLY • link 4.9 years ago by Devon Ryan 104k

0


@RG ID:SampleName_S1_L005_001_1 PL:ILLUMINA PU:SampleName_S1_L005_001_1 LB:SampleName_S1_L005_001   SM:SampleName_S1_L005_001
@PG ID:bwa  CL:/usr/local/sentieon-genomics-201808/libexec/bwa mem -t 32 genome/hs38DH.fa /home/dnanexus/in/reads_fastqgzs/0/SampleName_S1_L005_R1_001.fastq.gz /home/dnanexus/in/reads2_fastqgzs/0/SampleName_S1_L005_R2_001.fastq.gz -K 100000000 -M -R @RG\tID:SampleName_S1_L005_001_1\tPL:ILLUMINA\tPU:SampleName_S1_L005_001_1\tLB:SampleName_S1_L005_001\tSM:SampleName_S1_L005_001  PN:bwa  VN:0.7.15-r1140
@PG ID:sentieon-sort    CL:/usr/local/sentieon-genomics-201808/libexec/util sort -o sorted_0.bam -t 32 --sam2bam --bam_compression 6 --block_size 2G -i -   PN:sentieon-sort    PP:bwa  VN:sentieon-genomics-201808
@PG ID:sentieon-Dedup   CL:/usr/local/sentieon-genomics-201808/libexec/driver --traverse_param 1000000/10000 -t 32 -i sorted_0.bam --algo Dedup --bam_compression 6 --score_info score.txt --metrics dedup_metrics.txt markdup.bam  PN:sentieon-Dedup   PP:sentieon-sort    VN:sentieon-genomics-201808

Please help me find out!

ADD REPLY • link 4.9 years ago by vaish01kv • 0

1

That's hs38DH, which you can google for; it probably comes from the 1000 Genomes Project.
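The name comes from the `bwa mem` command line in the @PG record above, where the reference is given as `genome/hs38DH.fa`. A minimal sketch of picking that token out automatically follows. The helper name is hypothetical, and it simply assumes the reference is the first space-separated argument ending in `.fa`, `.fasta` or `.fna` (optionally gzipped), which will not hold for every pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given an aligner command line on stdin, print the
# first argument that looks like a FASTA file name.
ref_from_command_line() {
    tr ' ' '\n' | grep -E '\.(fa|fasta|fna)(\.gz)?$' | head -n 1
}

# Example with a shortened version of the bwa command line from the header:
# echo 'bwa mem -t 32 genome/hs38DH.fa R1.fastq.gz R2.fastq.gz' | ref_from_command_line
```

This only recovers the file name, not the sequence; the fasta itself still has to be downloaded from wherever that build is distributed.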