Forum:TCGA: How to tell if the data is exome or WGS?
3.4 years ago
Les Ander • 110
United States

Hi

I am trying to obtain variants identified from whole genome sequencing (not exome sequencing) for various tumors sequenced by the TCGA consortium. I looked at the TCGA Data Access Matrix (https://tcga-data.nci.nih.gov/tcga/dataAccessMatrix.htm?mode=ApplyFilter), but there does not appear to be a clear way to do this.

Can you please help?

Thanks

If you are using TCGA MAF files (or MAF files from Broad Firehose) as your variant source, look at the Sequence_Source column: if the data is from exome sequencing, the value will be 'WXS'; if it is from genome sequencing, it will be 'WGS'.

See the MAF specification for details.
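
As a rough sketch of that check, assuming a tab-delimited MAF with a header row and optional comment lines starting with '#': the function name and the toy rows below are made up for illustration, and only the Sequence_Source column name comes from the answer above. Blank values are tallied separately, since some files leave the column empty.

```python
import csv
import io
from collections import Counter


def count_sequence_sources(handle):
    """Tally the Sequence_Source values in a MAF-like, tab-delimited stream.

    Hypothetical helper for illustration, not part of any TCGA tool.
    """
    # Skip comment lines (for example a leading '#version' line).
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    if reader.fieldnames is None or "Sequence_Source" not in reader.fieldnames:
        raise ValueError("no Sequence_Source column found")
    # Blank or missing values are grouped under '(empty)'.
    return Counter(
        (row["Sequence_Source"] or "").strip() or "(empty)" for row in reader
    )


# Toy data standing in for a real MAF file (sample barcodes are invented).
example = io.StringIO(
    "#version 2.4\n"
    "Hugo_Symbol\tTumor_Sample_Barcode\tSequence_Source\n"
    "TP53\tSAMPLE-A\tWXS\n"
    "KRAS\tSAMPLE-B\tWGS\n"
    "EGFR\tSAMPLE-C\tWXS\n"
)
counts = count_sequence_sources(example)
print(counts)
```

For a real file, pass an open file handle instead of the in-memory example. A count under 'WGS' means those variants came from whole genome sequencing.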

@poisonAlien Awesome, thanks!

As far as I can see, this column is not filled out in files from the harmonized portal. Does anybody have any idea why this is the case? And how can I find out whether the variants are from WXS or WGS?

16 months ago
Ying W ♦ 3.9k
South San Francisco, CA

You can try using cgquery to identify the samples you are interested in. The library_strategy field will tell you whether a sample is WGS or WXS. You might need to specify your key to use this tool, though (it's the same tool that you would use to download controlled data).

Thank you so much. It is good to know I can get this from UCSC.

However, is there a way to simply get this from TCGA (https://tcga-data.nci.nih.gov/tcga/dataAccessMatrix.htm) or Broad Firehose (http://gdac.broadinstitute.org/)?

It seems like this information should be present somewhere in TCGA or Broad Firehose.

@poisonAlien's response is correct: look at the MAF fields. I misread your question and thought you were looking for sequence files, not variants.
