Hi! Thanks a lot for your useful comments!
I have tried the two suggested methods, but neither has worked so far :(
- Yes, you are right (Ido Tamir), I was running these commands:
library(GenomicFeatures)
txdb <- makeTranscriptDbFromUCSC(genome='hg19',tablename='ensGene')
aligns <- readBamGappedAlignments("my.bam")
counts <- countOverlaps(transcripts(txdb), aligns)
counts.dframe <- data.frame(counts, stringsAsFactors=FALSE)
rownames(counts.dframe) <- names(txdb)
With these commands I obtained the number of reads overlapping each gene:
CC209858.4_FG002   214
CC209858.4_FG003     0
CC209858.4_FG004    17
CC209859.2_FG006     0
CC209859.2_FG008     0
Now I wanted to add an extra column with the information stored in the metadata of txdb. I tried the two suggested command lines, but both gave an error:
rownames(counts.dframe) <- elementMetadata(txbd)[,"tx_name"]
Error in elementMetadata(txbd)[, "tx_name"] : selecting cols: cannot subset by character when names are NULL
elementMetadata(txbd)[,"my.counts"] <- count.dframe
*tmp*, , "my.counts", value = list(counts = c(1L, : replacing cols: cannot subset by character when names are NULL
- About merging both files: the problem with merging the two files (counts and genes) by gene ID is that I have to coerce both into data frames. When I do that, the number of rows of txbd.dframe doesn't match the number of elements the object has as a GRangesList (I don't get why...). So I cannot merge the two data frames.
Original files                         Files coerced into data frame
counts (integer)                       counts.dframe (110185 obs. of 1 variable)
txbd (GRangesList of length 110185)    txbd.dframe (392978 obs. of 8 variables)
So I don't know how to access the information stored in the metadata and use it to complete the information about the reads overlapping each gene. I am thinking about exporting the files and writing a Perl script to do it, but there should be a way to do it in R (I guess...).
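To make the goal concrete, here is a minimal base R sketch of the kind of merge I am after. The two small data frames and their column names (tx_name, counts, gene_id) are made up for illustration and are not my real data. It also assumes there is exactly one metadata row per transcript, which is precisely what I have not managed to get so far.

```r
# Toy stand-ins for the real objects (all names and values are hypothetical).
counts.toy <- data.frame(
  tx_name = c("tx1", "tx2", "tx3"),
  counts  = c(214L, 0L, 17L),
  stringsAsFactors = FALSE
)
meta.toy <- data.frame(
  tx_name = c("tx1", "tx2", "tx3"),
  gene_id = c("geneA", "geneA", "geneB"),
  stringsAsFactors = FALSE
)

# Join on the shared ID column; all.x = TRUE keeps every row of counts.toy
# even if a transcript has no matching metadata row.
merged.toy <- merge(counts.toy, meta.toy, by = "tx_name", all.x = TRUE)
print(merged.toy)
```

With the real data, the result should have one row per transcript (110185 rows), with the count and the annotation side by side.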