FASTA File for ExomeDepth and CNV Calling
10.2 years ago
Jimbou ▴ 950

Hello,

I have a problem using ExomeDepth with a reference FASTA file. I analysed approx. 500 target genes (>7000 exons, hg19), and without a FASTA file (i.e. without the GC content) everything worked fine.

Since GC content influences amplification, coverage and CNV detection, I want to include this factor. I therefore tried several ways to get appropriate FASTA sequences.

  1. BioMart

    library(biomaRt)
    mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
    FASTA <- getSequence(chromosom, start, stop, type = "hgnc_symbol", seqType = "gene_exon", mart = mart)

Error:

    Reference fasta file provided so exomeDepth will compute the GC content in each window
    Error in (function (classes, fdef, mtable)  : 
      unable to find an inherited method for function ‘scanFa’ for signature ‘"data.frame", "GRanges"’

  2. BSgenome

    FASTA<-getSeq(BSgenome.Hsapiens.UCSC.hg19,chromosom,start,stop)

    show(FASTA)

      A DNAStringSet instance of length 10
        width seq
    [1]   845 GTTGAAAAGTGATCAGGTTCATTTTATTGACTACACAGAAGCAATTCCATTT...GAGGAGGCAGATCACGGCGAAGACAATGAAGCTGTACGGGCCGAGGCCCTC
    [2]   129 CCTGGATGAACGGGAAGATCAAGCCCACGGTGAAGTTGGAGAGCCAGTGCAC...GCCGAGAGGACTGCAGGAAGATCTCAGTGATGAGCAGCGCGGGTATGGGAC
    [3]    77 CTGGGCCCGAGGGCATGTCCTATGACGTAGGAGATGACACAGACGATGCTGATGTATGGCATCCAGGACACTGTGTC
    [4]   103 CCTGCAGTGCCAGAGCTGCAGTGAGCACGCAGCAGGCTATGAGGCAGATGGAGAAGCCCAGCAGCAGCAGCAGCCTCCGACCCAGGAGCTCCACCACGAACAC
    [5]   112 CGGCGCAGAAGGTCATGACCACGTTCACGGCCCCGGTGCCGGCCGTCACGTA...CTCCTCCGGCACGCCGGCGCTCAGGTAGATCTGGTCCGCGTAGTAGTAGAT

Error:

    Reference fasta file provided so exomeDepth will compute the GC content in each window
    Error in (function (classes, fdef, mtable)  : 
      unable to find an inherited method for function ‘scanFa’ for signature ‘"DNAStringSet", "GRanges"’

  3. Downloading human_g1k_v37.fasta.gz

Chromsome, Start and Stop contain only the ~7000 exon locations. With methods 1 and 2 I was able to get the target sequences, but the counting step (getBamCounts) didn't work: different errors occurred, warning that something is wrong with the FASTA file. I think there are some file formatting issues.

    my.counts <- getBamCounts(data.frame(Chromsome, Start, Stop), bam.files = bam.files, include.chr = FALSE, referenceFasta = FASTA)

I haven't tried human_g1k_v37.fasta.gz so far, because I don't know how to load it into R.
Do you have an idea how to transform the sequences so that they work, or how to load the whole-genome FASTA file?

ngs cnv calling exome exon r • 4.4k views

It is always a good idea to include the output of sessionInfo() as well as any error messages you get when posting an R question.

I edited the post and attached the errors.

10.2 years ago

The reference FASTA file should be the same as that used for your mapping step, so you should have it already.
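
Judging by the `scanFa` errors in the question, `referenceFasta` appears to be handed to `Rsamtools::scanFa`, which reads from a FASTA file on disk rather than from a `data.frame` or `DNAStringSet`. A minimal sketch of how the downloaded file might be prepared, assuming the `R.utils` and `Rsamtools` packages are installed; `prepare_reference` is a hypothetical helper name, and decompressing first is my assumption, since as far as I know a plain-gzip FASTA cannot be indexed directly:

```r
# Hypothetical helper (not part of ExomeDepth): make sure an uncompressed,
# indexed FASTA exists on disk and return its path.
prepare_reference <- function(fasta.gz) {
  if (!file.exists(fasta.gz)) stop("FASTA not found: ", fasta.gz)

  # Strip a trailing ".gz" to get the name of the uncompressed file.
  fasta <- sub("\\.gz$", "", fasta.gz)

  # Decompress only if the plain FASTA is not there yet; keep the original.
  if (!file.exists(fasta)) {
    R.utils::gunzip(fasta.gz, destname = fasta, remove = FALSE)
  }

  # Create the .fai index if it is missing.
  if (!file.exists(paste0(fasta, ".fai"))) {
    Rsamtools::indexFa(fasta)
  }

  fasta
}

# Usage (not run): pass the PATH, not an R sequence object.
# library(ExomeDepth)
# ref <- prepare_reference("human_g1k_v37.fasta.gz")
# exons <- data.frame(chromosome = Chromsome, start = Start, end = Stop)
# my.counts <- getBamCounts(bed.frame = exons, bam.files = bam.files,
#                           include.chr = FALSE, referenceFasta = ref)
```

The file name is the one mentioned in the question; whether it is actually the reference the BAM files were mapped against still has to be confirmed.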

The problem is that I didn't map the BAM files myself. I only have the already mapped files.

You'll need to establish what the reference file was so that you can get it or simply remap the files yourself. When inheriting something other than raw data, it is always a good idea to establish the data provenance, including auxiliary files so that you can publish your results.
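
One place to look for that provenance is the BAM header itself: the `@SQ` lines may carry assembly (`AS`) and reference location (`UR`) tags, and the `@PG` lines may record the aligner command line (`CL`). These tags are optional, so they may simply be absent. A rough sketch, assuming the header has been read with `Rsamtools::scanBamHeader`; `reference_hints` is a hypothetical helper, not a library function:

```r
# Hypothetical helper: pull reference-related tags out of a BAM header.
# `header_text` is expected to look like scanBamHeader(bam)[[1]]$text,
# i.e. a list named by record type ("@HD", "@SQ", "@PG", ...) whose
# elements are character vectors of "TAG:value" fields.
reference_hints <- function(header_text) {
  sq <- header_text[names(header_text) == "@SQ"]
  pg <- header_text[names(header_text) == "@PG"]

  # Extract the value of one tag from a vector of "TAG:value" fields.
  tag <- function(fields, key) {
    pattern <- paste0("^", key, ":")
    sub(pattern, "", grep(pattern, fields, value = TRUE))
  }
  collect <- function(records, key) {
    as.character(unlist(lapply(records, tag, key)))
  }

  list(
    seqnames      = collect(sq, "SN"),
    assembly      = unique(collect(sq, "AS")),
    uri           = unique(collect(sq, "UR")),
    command_lines = collect(pg, "CL")
  )
}

# Usage (not run):
# hdr <- Rsamtools::scanBamHeader(bam.files[1])[[1]]$text
# reference_hints(hdr)
```

If `assembly`, `uri` and `command_lines` all come back empty, the sequence names and lengths in the header are the remaining clue to which build was used.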

Ahh, OK. So I have to use exactly the same reference file that was used for mapping the reads. It will not work to "build" a new FASTA file for the target regions, even though I'm sure about the details (human, hg19 and so on)?

You'll save yourself some headache by getting the reference used for the mapping, but you can always experiment to see if something works. In particular, you'll want to get the right "build" of the genome and make sure that at least the chromosome names match the alignment chromosome names.
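
A quick way to check the chromosome names before counting is to compare the sequence names in the BAM header with those in the FASTA index. The sketch below is only illustrative: `compare_seqnames` is a hypothetical helper, and the commented usage assumes `Rsamtools` and `GenomeInfoDb` are installed and that `ref` holds the path to an indexed FASTA file.

```r
# Hypothetical helper: compare sequence names from a BAM header with
# those from a FASTA index and flag a pure "chr" prefix mismatch.
compare_seqnames <- function(bam_names, fasta_names) {
  strip <- function(x) sub("^chr", "", x)
  shared <- intersect(bam_names, fasta_names)

  list(
    shared        = shared,
    only_in_bam   = setdiff(bam_names, fasta_names),
    only_in_fasta = setdiff(fasta_names, bam_names),
    # TRUE when nothing matches as-is but names agree once "chr" is removed,
    # e.g. "1" (b37/g1k style) versus "chr1" (UCSC hg19 style).
    prefix_only_difference =
      length(shared) == 0 &&
      length(intersect(strip(bam_names), strip(fasta_names))) > 0
  )
}

# Usage (not run):
# bam_names   <- names(Rsamtools::scanBamHeader(bam.files[1])[[1]]$targets)
# fasta_names <- as.character(
#   GenomeInfoDb::seqnames(Rsamtools::scanFaIndex(ref)))
# compare_seqnames(bam_names, fasta_names)
```

Matching names are necessary but not sufficient: two builds can share chromosome names and still differ in sequence, so comparing the sequence lengths as well is a sensible extra check.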
