Gene expression for a specific gene across multiple cancer types in TCGA datasets
12 days ago
pt.taklifi ▴ 60

Hello everyone, I am trying to make a box plot of the expression of a gene across multiple cancer types (BRCA, COAD and PRAD). I know the RTCGA package in R can produce box plots like this:

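library(RTCGA)         # expressionsTCGA(), theme_RTCGA()
library(RTCGA.rnaseq)  # BRCA.rnaseq, COAD.rnaseq, PRAD.rnaseq
library(dplyr)         # %>%, rename(), filter()
library(ggplot2)       # ggplot(), geom_boxplot()

expressionsTCGA(BRCA.rnaseq, COAD.rnaseq, PRAD.rnaseq,
                extract.cols = NULL) %>%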
  rename(cohort = dataset,
         VENTX = `VENTX|27287`) %>%  
  filter(substr(bcr_patient_barcode, 14, 15) == "01") %>% #cancer samples
  ggplot(aes(y = log1p(VENTX),
             x = reorder(cohort, log1p(VENTX), median),
             fill = cohort)) + 
  geom_boxplot() +
  theme_RTCGA() +
  scale_fill_brewer(palette = "Dark2")


But I'm not sure how to specify the gene of interest. For example, how can I get the expression of the "KLK2" gene across all the cancer types I mentioned above?

gene_expression TCGA RTCAG • 87 views
12 days ago
Nitin Narwade ▴ 670

I think the TCGA dataset uses Ensembl IDs as the identifier, and the gene symbols should also be in the expression matrix (I checked this for the PRAD dataset). So you just have to subset the expression matrix by the Ensembl ID or the gene name of your gene of interest.

exp.mat[exp.mat$GeneSymbols == "KLK2", ] # Rows as genes and Columns as samples
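If the data frame instead has samples as rows and genes as columns named `SYMBOL|ID`, as the `VENTX|27287` column in the question's code suggests, the lookup is by column name rather than by row. Here is a minimal sketch of that idea in base R. The `expr` data frame is a made-up stand-in for the output of `expressionsTCGA()`, the expression values are invented, and the `KLK2|3817` column name is an assumption that should be checked against `names()` of the real data.

```r
# Toy stand-in for the output of expressionsTCGA(); all values are made up.
# The "KLK2|3817" column name is an assumption, verify it on the real data.
expr <- data.frame(
  bcr_patient_barcode = c("TCGA-AA-0001-01A", "TCGA-AA-0002-11A", "TCGA-BB-0003-01A"),
  dataset = c("BRCA.rnaseq", "BRCA.rnaseq", "PRAD.rnaseq"),
  `KLK2|3817` = c(1.5, 0.2, 950.0),
  `VENTX|27287` = c(3.1, 2.4, 0.7),
  check.names = FALSE,        # keep the "|" in the column names
  stringsAsFactors = FALSE
)

# Find the column whose name starts with the gene symbol followed by "|"
gene <- "KLK2"
gene_col <- grep(paste0("^", gene, "\\|"), names(expr), value = TRUE)
stopifnot(length(gene_col) == 1)  # fail loudly if the symbol is missing or ambiguous

# Keep tumour samples only (sample type code "01" at barcode positions 14-15)
tumour <- substr(expr$bcr_patient_barcode, 14, 15) == "01"

klk2 <- data.frame(
  cohort = expr$dataset[tumour],
  KLK2   = expr[[gene_col]][tumour]
)
print(klk2)
```

With the real data, the column name found this way could replace `VENTX|27287` in the `rename()` call from the question, and it could probably also be passed as `extract.cols` instead of `NULL` so that only that gene is extracted.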
