Biostar Test Site


Gene expression for a specific gene across multiple cancer types in TCGA datasets
12 days ago
pt.taklifi ▴ 60

Hello everyone, I am trying to make a box plot of the expression of a gene across multiple cancer types (BRCA, COAD and PRAD). I know the RTCGA package in R can produce box plots like this:

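library(RTCGA)          # expressionsTCGA(), theme_RTCGA()
library(RTCGA.rnaseq)   # BRCA.rnaseq, COAD.rnaseq, PRAD.rnaseq
library(dplyr)          # %>%, rename(), filter()
library(ggplot2)        # ggplot(), geom_boxplot()

expressionsTCGA(BRCA.rnaseq, COAD.rnaseq, PRAD.rnaseq,
                extract.cols = NULL) %>%
  rename(cohort = dataset,
         VENTX = `VENTX|27287`) %>%
  filter(substr(bcr_patient_barcode, 14, 15) == "01") %>% # cancer samples
  ggplot(aes(y = log1p(VENTX),
             x = reorder(cohort, log1p(VENTX), median),
             fill = cohort)) +
  geom_boxplot() +
  theme_RTCGA() +
  scale_fill_brewer(palette = "Dark2")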

However, I'm not sure how to specify the gene of interest. For example, how can I get the expression of the "KLK2" gene across all the cancer types I mentioned above?

gene_expression TCGA RTCAG • 87 views
12 days ago
Nitin Narwade ▴ 670

I think the TCGA dataset uses Ensembl IDs as the identifier, and the gene symbols should also be present in the expression matrix (I checked this for the PRAD dataset). So you just have to subset the expression matrix by the Ensembl ID of your gene of interest or by its gene name.

exp.mat[exp.mat$GeneSymbols == "KLK2", ] # Rows as genes and Columns as samples
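
For the layout used in the question, where samples are rows and each gene is a column named like `VENTX|27287`, the same idea becomes a column lookup by symbol prefix. This is only a minimal sketch on a toy data frame, not real TCGA data: the expression values, the shortened barcodes, the placeholder suffix in `KLK2|0000`, and the helper `find_gene_column()` are all made up for illustration.

```r
# Toy stand-in for the data frame produced in the question's pipeline.
# Values, barcodes and the "0000" suffix are invented for illustration.
expr <- data.frame(
  bcr_patient_barcode = c("TCGA-AA-0001-01A", "TCGA-AA-0002-11A",
                          "TCGA-BB-0003-01A", "TCGA-CC-0004-01A"),
  dataset = c("BRCA.rnaseq", "BRCA.rnaseq", "COAD.rnaseq", "PRAD.rnaseq"),
  `KLK2|0000`   = c(1.5, 0.2, 3.0, 250.0),
  `VENTX|27287` = c(0.0, 0.1, 2.0, 0.5),
  check.names = FALSE  # keep the "|" in the column names
)

# Hypothetical helper: return the single column whose name starts
# with "<symbol>|", and fail loudly if there is none or more than one.
find_gene_column <- function(df, symbol) {
  hits <- names(df)[startsWith(names(df), paste0(symbol, "|"))]
  if (length(hits) != 1) {
    stop("expected exactly one column for ", symbol,
         ", found ", length(hits))
  }
  hits
}

gene_col <- find_gene_column(expr, "KLK2")

# Same sample filter as in the question: barcode characters 14-15 == "01".
tumour <- expr[substr(expr$bcr_patient_barcode, 14, 15) == "01", ]

# Small data frame with one row per sample, ready for a box plot by cohort.
plot_df <- data.frame(cohort   = tumour$dataset,
                      log_expr = log1p(tumour[[gene_col]]))
```

On the real data, the column name found this way is what you would put in place of `VENTX|27287` in the `rename()` step of the question's code, and presumably it could also be passed as `extract.cols` so that only that gene is extracted.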

FYI, the little hand icon at the bottom lets you grab and drag your posts, so you could simply have dragged the comment into the answer field without reposting. Just for the future ;-)

