5.6 years ago
modarzi
Hi,
I am working with TCGA RNA-seq data. I downloaded the annotation file in GTF format; the file (name and version) is gencode.v22.genes.gtf. Its columns are:
colnames(gencode.v22.genes)
"seqname", "source", "feature", "start", "end", "score", "strand", "frame", "gene_id", "gene_name", "gene_status", "gene_type", "havana_gene", "level", "tag", "full_length", "exon_length", "exon_num", "first_exon", "last_exon", "canonical_transcript", "one_transcript", "one_transcript_start", "one_transcript_end"
Now I need LocusLink IDs (Entrez Gene IDs) for gene enrichment analysis and for a GO study. Specifically, the WGCNA code below requires LocusLink IDs:
annot = read.csv(file = "GeneAnnotation.csv");
# Match probes in the data set to the probe IDs in the annotation file
probes = names(datExpr)
probes2annot = match(probes, annot$substanceBXH)
# Get the corresponding LocusLink IDs
allLLIDs = annot$LocusLinkID[probes2annot];
# Choose interesting modules
intModules = c("brown", "red", "salmon")
for (module in intModules)
{
# Select module probes
modGenes = (moduleColors==module)
# Get their entrez ID codes
modLLIDs = allLLIDs[modGenes];
# Write them into a file
fileName = paste("LocusLinkIDs-", module, ".txt", sep="");
write.table(as.data.frame(modLLIDs), file = fileName,
row.names = FALSE, col.names = FALSE)
}
# As background in the enrichment analysis, we will use all probes in the analysis.
fileName = paste("LocusLinkIDs-all.txt", sep="");
write.table(as.data.frame(allLLIDs), file = fileName,
row.names = FALSE, col.names = FALSE)
However, I don't know how to get these IDs from my annotation file.
I would appreciate any suggestions.
Best Regards,
Mohammad