hg19 or hg38 for variant calling
6.2 years ago

I've recently been troubleshooting an error in part of my variant calling pipeline, which I traced back to using BAM files aligned to hg38 as input for an Agilent deduplication tool that has yet to migrate from hg19 to hg38. My current workaround is to align to hg19, deduplicate, then convert the resulting SAM back into FASTQ files and re-align to hg38, which seems convoluted.

Should I continue working with hg38 once I'm past this step, or should I stick with hg19 all the way? How do other people balance pipelines when some tools/datasets are in hg38 and others have yet to switch over from hg19? Any advice on this whole hg19 v. hg38 issue would be appreciated.

Edit: The tool is LocatIt, which is used for deduplication of reads by the molecular barcodes used in the HaloPlex HS Target Enrichment System. https://www.agilent.com/cs/library/software/Public/AGeNT%20ReadMe.pdf

genome variant calling Assembly • 2.5k views

And you have to use this Agilent deduplication tool? There are alternatives, unless it's something specific you need.


The tool is LocatIt, which is used for deduplication of reads by the molecular barcodes used in the HaloPlex HS Target Enrichment System. https://www.agilent.com/cs/library/software/Public/AGeNT%20ReadMe.pdf


What is the tool? What kind of data? Is it the AgilentMBCDedup tool, used to process the Molecular Barcodes (MBC) of HaloPlex runs?

6.2 years ago
h.mon 35k

From the documentation you linked, LocatIt does not necessarily expect or use hg19; it just expects the chromosome names to follow its default naming scheme. Maybe you have random / unplaced / alt chromosomes? Anyway, did you try the -H parameter?

-H SAM header file: By default, LocatIt expects hg19 names, chr1-chrM. If the contig names are different (for example, GRCh37 names or nonhuman), one can use this option and provide a SAM header file containing a dictionary of the contigs used by the data files, SAM/BAM and, optionally, the bed file.
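As a rough illustration of the two points above, here is a minimal Python sketch that builds a SAM header file from a reference's `.fai` index and lists the contigs that fall outside the `chr1`-`chrM` names. The function names (`read_fai`, `sam_header`, `non_primary`) and the exact set of "default" names in the regex are my own assumptions, not anything taken from the LocatIt documentation, and I have not checked how LocatIt itself treats hg38 alt or unplaced contigs.

```python
import re

# Assumption: "hg19 names, chr1-chrM" is read here as chr1..chr22, chrX,
# chrY and chrM. Adjust the pattern if the tool's actual list differs.
PRIMARY = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y|M)$")


def read_fai(path):
    """Return (name, length) pairs from a samtools-style .fai index.

    Each .fai line is tab-separated; the first two columns are the
    contig name and its length in bases.
    """
    contigs = []
    with open(path) as handle:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            name, length = line.rstrip("\n").split("\t")[:2]
            contigs.append((name, int(length)))
    return contigs


def sam_header(contigs):
    """Build SAM header text: one @HD line plus one @SQ line per contig."""
    lines = ["@HD\tVN:1.6"]
    lines += [f"@SQ\tSN:{name}\tLN:{length}" for name, length in contigs]
    return "\n".join(lines) + "\n"


def non_primary(contigs):
    """List contig names that do not match the chr1-chrM pattern."""
    return [name for name, _ in contigs if not PRIMARY.match(name)]
```

The text returned by `sam_header(read_fai("reference.fa.fai"))` could be written to a file and supplied through the `-H` option quoted above (the file name is a placeholder). If a BAM aligned to hg38 already exists, `samtools view -H` on that BAM prints its header and is likely the simpler route. A non-empty result from `non_primary` would point to random, unplaced or alt contigs as a possible source of the error.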
