Another newbie question, I'm afraid. I was informed that my BAM file was generated against 'Human Genome Reference 38' (is this GRCh38?), yet in my BAM file header I see what appears to be a mapping against 'hg19':
@PG ID:rtg-49A3938E VN:RTG Core 3.6.1 / Core 891de81 (2016-01-25) CL:map -t /opt/annai/levobox/ref/hg19.sdf -i /tmp/box4jh2n06g/reads.sdf -o /tmp0/boxyyljj8_l/map_412857990_495429577 --start-read 412857990 --end-read 495429578 PN:rtg
Is hg19 the same as Human Genome Reference 38 (GRCh38)? I'm a bit confused, as I have seen hg19 directly linked to GRCh37.
Hope you can help. Thank you.
Thank you so much for confirming that. Is it within my 'newbie' realms to use something like samtools to convert my BAM file back to its 'virgin' state (FASTQ?) and apply reference 38 against it, or is that not possible? Thank you so much for helping.
There is no need to "apply" reference 38 against your fastq files. As I said in your other thread, you started with VCF/BAM files, which are the end point of a WGS analysis. You were working backwards to get the original sequence data, which is the fastq files.
Thank you. My mistake, I thought I had to return everything back to its 'virgin' state before I could apply another reference. Going to try and read up a bit more before asking questions. Thank you for your help.
No mistake. You would want to get the fastq files back in order to align the data against a different reference. So you were correct in that sense.
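For the conversion step itself, here is a minimal sketch in Python that only assembles the two samtools commands commonly used to turn a coordinate-sorted BAM back into FASTQ. It assumes your reads are paired-end and that samtools 1.x is installed; the file names are placeholders, not taken from your data. It prints the pipeline rather than running it, so you can inspect it first.

```python
import shlex

def bam_to_fastq_pipeline(bam, fq1, fq2):
    """Build a shell pipeline that regroups read pairs and writes two FASTQ files.

    Assumption: paired-end data. Singletons and reads without pair flags
    are sent to /dev/null, i.e. discarded.
    """
    # collate places mates next to each other; -u = uncompressed, -O = write to stdout
    collate = ["samtools", "collate", "-u", "-O", bam]
    # fastq writes read 1 and read 2 to separate files; -n leaves read names
    # without /1 and /2 suffixes; the final "-" means "read from stdin"
    fastq = ["samtools", "fastq", "-1", fq1, "-2", fq2,
             "-0", "/dev/null", "-s", "/dev/null", "-n", "-"]
    return " | ".join(shlex.join(cmd) for cmd in (collate, fastq))

if __name__ == "__main__":
    # Hypothetical file names for illustration only
    print(bam_to_fastq_pipeline("merged.sorted.bam", "reads_R1.fastq", "reads_R2.fastq"))
```

If your data turn out to be single-end, the `-1`/`-2` split does not apply, so check that with `samtools flagstat` on the BAM before relying on this.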
A newer reference build (in your case GRCh38) is considered an improvement on the older one (GRCh37/hg19). That is one reason why it has an incremented genome build number. In general, one should not be going back to an older reference but go forward. If and when GRCh39 (if that is the name they choose for next major genome build) comes out that would be the time to align your fastq files against the new reference.
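If you want to double-check which build a BAM header points to, something like the sketch below may help. It pulls the reference path that follows `-t` in an `@PG` line like the one you posted, and separately guesses the build from the chr1 length in an `@SQ` line (249,250,621 bp in GRCh37/hg19, 248,956,422 bp in GRCh38). The function names are my own, and the `@SQ` lines in the example are made up for illustration, not taken from your file.

```python
import re

# chr1 lengths of the two builds discussed here; used only as a sanity check
CHR1_LENGTHS = {249250621: "GRCh37/hg19", 248956422: "GRCh38/hg38"}

def reference_from_pg(line):
    """Return the value following `-t` in an @PG header line, or None."""
    match = re.search(r"(?:^|\s)-t\s+(\S+)", line)
    return match.group(1) if match else None

def build_from_sq(line):
    """Guess the genome build from an @SQ line describing chromosome 1."""
    name = re.search(r"SN:(\S+)", line)
    length = re.search(r"LN:(\d+)", line)
    if not name or not length or name.group(1) not in ("chr1", "1"):
        return None
    return CHR1_LENGTHS.get(int(length.group(1)))

if __name__ == "__main__":
    pg = ("@PG ID:rtg-49A3938E VN:RTG Core 3.6.1 / Core 891de81 (2016-01-25) "
          "CL:map -t /opt/annai/levobox/ref/hg19.sdf -i /tmp/box4jh2n06g/reads.sdf PN:rtg")
    print(reference_from_pg(pg))
    # Hypothetical @SQ line, as it would appear in `samtools view -H` output
    print(build_from_sq("@SQ\tSN:chr1\tLN:249250621"))
```

The header lines themselves come from `samtools view -H your.bam`. The chr1 length is the more reliable of the two checks, since a file name such as `hg19.sdf` is only a label chosen by whoever built the reference.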
It sounds like you have done your own genome sequencing (outside of a scientific establishment). Were you doing this for fun, or are you motivated by some specific reason (e.g. trying to understand a phenotype)? Just curious.
No, I haven't done my own sequencing. How I got here is this: medically I am 'out of the box', so I thought I would have a WGS done in the USA that may help highlight something. From that I received a Promethease Report, a VCF file and a BAM file in 5 parts. The Promethease Report returned something disturbing that not only affected me but possibly my family. I live in the UK and my health is looked after by the NHS, so I took my Promethease Report to them. During that consultation they told me they were unable to merge, process or diagnose anything from my BAM or VCF file as they didn't have the facilities to do that. They questioned the results in my Promethease Report and almost dismissed them. After this consultation I decided that I had to find a way to merge my BAM files. I have done that with all the help given here (including yours). I have merged, sorted and indexed my BAM file. This morning I spotted the hg19 reference in the header, so I raised this post, as I was under the impression my BAM file was aligned to Human Genome Reference 38. I haven't converted to FASTQ yet as I am not sure what command I need to use to do that, i.e. 'samtools fastq' (do I need to use any of the available options for it to be accurate?). The company where I had the WGS done is unable to give any help as they are regulated by the FDA and not allowed to give advice. That leaves me with my data and nothing to do other than try and delve into it myself. My next step is to produce the FASTQ file. Because there may be implications for my family, I will not stop until I have everything sorted. Hence all the newbie questions. Hope that helps explain where I am coming from. Thank you for helping.
None of us are qualified to give you advice beyond what we have done so far. There are many mysteries that remain beyond the linear sequence of the genome and the differences in it (SNPs). Your best bet may be to concentrate on the VCF report you have (for now) and try to understand if something in there correlates with your phenotypic history (more than likely it won't). If you can, find a medical geneticist to consult with in the NHS or a local researcher that you can talk with in person (forums are always going to be limited).
We wish you luck.