Is 'hg19' the same as 'Human Genome Reference 38' ?
6.7 years ago
ycsm ▴ 10

Another newbie question I'm afraid...... I was informed that my BAM file was generated against 'Human Genome Reference 38' (is this GRCh38?), yet in my BAM file header I see what appears to be a mapping against 'hg19'.

@PG ID:rtg-49A3938E VN:RTG Core 3.6.1 / Core 891de81 (2016-01-25) CL:map -t /opt/annai/levobox/ref/hg19.sdf -i /tmp/box4jh2n06g/reads.sdf -o /tmp0/boxyyljj8_l/map_412857990_495429577 --start-read 412857990 --end-read 495429578 PN:rtg

Is hg19 the same as Human Genome Reference 38 (is this GRCh38)? I'm a bit confused, as I have seen hg19 directly linked to GRCh37.

Hope you can help. Thank you.

hg19 grch38 genome sequencing alignment • 10.0k views
6.7 years ago

hg19 is the same as GRCh37 and is not at all the same as GRCh38 (aka, hg20 or hg38). Either the person who produced the file made a mistake or they have very odd file names. The chromosome sizes and contigs included will be a bit different between these two, so have a look at those for 100% confirmation.
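The chromosome-size check suggested above can be sketched in a few lines of Python. This is only an illustration: it assumes you have saved the header text (for example the output of `samtools view -H your.bam`) and it looks only at the length of chromosome 1, which is 249,250,621 bases in GRCh37/hg19 and 248,956,422 bases in GRCh38/hg38. The function name `guess_build` is made up for this example.

```python
# Length of chromosome 1 in the two builds discussed in this thread.
CHR1_LENGTHS = {
    249250621: "GRCh37/hg19",
    248956422: "GRCh38/hg38",
}


def guess_build(header_text):
    """Guess the reference build from the @SQ lines of a SAM/BAM header.

    header_text is the plain-text header, with tab-separated fields.
    Returns a build name, or "unknown" if chr1 is missing or unrecognised.
    """
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field looks like TAG:value, e.g. SN:chr1 or LN:249250621.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if fields.get("SN") in ("chr1", "1"):
            return CHR1_LENGTHS.get(int(fields["LN"]), "unknown")
    return "unknown"


# Toy header containing a single @SQ line for chromosome 1.
example_header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr1\tLN:249250621\n"
print(guess_build(example_header))
```

A single chromosome is usually enough to tell the two builds apart, but comparing every @SQ line against the full list of chromosome sizes for each build is the more thorough check.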


Thank you so much for confirming that. Is it within my 'newbie' realms to use something like samtools to convert my BAM file back to its 'virgin' state (fastq?) and apply reference 38 against it, or is that not possible? Thank you so much for helping.


There is no need to "apply" reference 38 against your fastq files. As I said in your other thread, you started with VCF/BAM files, which are the end point of a WGS analysis. You were working backwards to get the original sequence data, which is the fastq files.


Thank you. My mistake, I thought I had to return everything back to its 'virgin' state before I could apply another reference. Going to try and read up a bit more before asking questions. Thank you for your help.


No mistake. You would want to get the fastq files back in order to align the data against a different reference. So you were correct in that sense.
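As a rough sketch of what getting the fastq files back could look like, the small Python helper below only builds the command line and does not run it. It assumes paired-end reads and follows the collate-then-fastq pattern shown in the samtools manual; the helper name `bam_to_fastq_command` and the `_R1`/`_R2` output names are made up for this example.

```python
import shlex


def bam_to_fastq_command(bam_path, out_prefix):
    """Build a shell pipeline that turns a BAM file back into paired FASTQ.

    Assumption: the reads are paired-end. `samtools collate` groups mates
    together first, because `samtools fastq` expects the two reads of a
    pair to be adjacent, which is not the case in a coordinate-sorted BAM.
    """
    bam = shlex.quote(bam_path)
    r1 = shlex.quote(out_prefix + "_R1.fastq")
    r2 = shlex.quote(out_prefix + "_R2.fastq")
    return (
        f"samtools collate -u -O {bam} | "
        # -1/-2: files for first and second mates; -0 and -s discard
        # unflagged reads and singletons; -n keeps read names unchanged.
        f"samtools fastq -1 {r1} -2 {r2} -0 /dev/null -s /dev/null -n"
    )


print(bam_to_fastq_command("merged.bam", "sample"))
```

Whether the result matches the original fastq files depends on how the BAM was produced: reads that were filtered out or hard-clipped before the BAM was written cannot be recovered this way.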

A newer reference build (in your case GRCh38) is considered an improvement on the older one (GRCh37/hg19). That is one reason why it has an incremented genome build number. In general, one should not be going back to an older reference but go forward. If and when GRCh39 (if that is the name they choose for next major genome build) comes out that would be the time to align your fastq files against the new reference.

It sounds like you have done your own genome sequencing (outside of a scientific establishment). Were you doing this for fun, or are you motivated by some specific reason (e.g. trying to understand a phenotype)? Just curious.


No, I haven't done my own sequencing. How I got here is ... medically I am 'out of the box', so I thought I would have a WGS done in the USA that might help highlight something. From that I received a Promethease Report, a VCF file and a BAM file in 5 parts. The Promethease Report returned something disturbing that not only affected me but possibly my family. I live in the UK and my health is looked after by the NHS, so I took my Promethease Report to them. During that consultation they told me they were unable to merge, process or diagnose anything from my BAM or VCF file as they didn't have the facilities to do that. They questioned the results in my Promethease Report and almost dismissed them.

After this consultation I decided that I had to find a way to merge my BAM files. I have done that with all the help given here (including yours). I have merged, sorted and indexed my BAM file. This morning I spotted the hg19 reference in the header, so I raised this post as I was under the impression my BAM file was aligned to Human Genome Reference 38. I haven't converted to FASTQ yet as I am not sure what command I need to use to do that, i.e. 'samtools fastq' (do I need to use any of the available options for it to be accurate?). The company where I had the WGS done is unable to give any help as they are regulated by the FDA and not allowed to give advice. That leaves me with my data and nothing to do but try and delve into it myself. My next step is to produce the FASTQ file.

Hope that helps explain where I am coming from. Because there may be implications for my family I will not stop until I have everything sorted. Hence all the newbie questions. Thank you for helping.


None of us are qualified to give you advice beyond what we have done so far. There are many mysteries that remain beyond the linear sequence of the genome and the differences (SNPs). Your best bet may be to concentrate on the VCF report you have (for now) and try to understand if something in there correlates with your phenotypic history (more than likely it won't). If you can, find a medical geneticist to consult with in the NHS or a local researcher that you can talk with in person (forums are always going to be limited).

We wish you luck.

6.7 years ago
skim ▴ 60

The answer to your question: not even close. Use CrossMap (http://crossmap.sourceforge.net/) to do a liftover between the two assemblies. You can get chain files from the UCSC website if you need them. The CrossMap site links to them, but they are hosted here: http://hgdownload.cse.ucsc.edu/downloads.html

You might not need this, but do check your BAM file.
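To give a feel for what a liftover does with a chain file, here is a toy Python sketch. A chain describes a series of ungapped blocks shared by the two assemblies, each followed by the size of the gap on either side before the next block. The function `lift_position` and the numbers below are invented for illustration, and the sketch handles forward-strand chains only; real tools such as CrossMap handle strands, many chains and whole files.

```python
def lift_position(pos, old_start, new_start, blocks):
    """Map a 0-based position from the old assembly to the new one.

    blocks is a list of (size, gap_old, gap_new) tuples: the length of an
    ungapped aligned block, then the gap before the next block in the old
    and in the new assembly. Use gaps of 0 for the last block.
    Returns None if the position is in a gap or outside the chain.
    """
    old, new = old_start, new_start
    for size, gap_old, gap_new in blocks:
        if old <= pos < old + size:
            # Inside an aligned block: keep the same offset from its start.
            return new + (pos - old)
        old += size + gap_old
        new += size + gap_new
    return None


# Toy chain: a 100-base block, 10 bases present only in the old assembly,
# then a 50-base block.
toy_blocks = [(100, 10, 0), (50, 0, 0)]
print(lift_position(1050, 1000, 2000, toy_blocks))  # inside the first block
print(lift_position(1105, 1000, 2000, toy_blocks))  # inside the gap
```

Positions that fall in a gap have no equivalent in the new assembly, which is one reason a liftover can drop records that a fresh alignment of the fastq files would place.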


Thank you, Looks like that will keep me quiet for a while. Will let you know how I get on. Thank you so much for the information and links. I've downloaded the 'Learn Bioinformatics' ebook (from the top of this page) to try and help me understand things and a book 'Bioinformatics for Dummies' (that's me) has just arrived here. Hopefully they will help also. Once again thank you so much for your help. Off to study and play .........


Well, I am also a newbie who seeks a career as a bioinformatics researcher. (I am a 16 year old boy currently having headaches with graph data structures and Campbell Biology in school...) I hope bioinformatics turns out to be a good thing to study for both of us :)
