R programming: match and rearrange
9.0 years ago
MAPK ★ 2.1k

Hi guys,

I have an R programming question. I have more than 1000 samples (1:1000), each with both a GT and an AD entry in (Genotype). For every GeneX.GT entry in (Names), I want to find the matching gene in (Genotype) and pull in both GeneX.GT and GeneX.AD, keeping the order of (Names), to get the (Result) listed below. Thank you.

Names <- c("cebi", "pithe", "Gene1.GT", "sapiens", "Gene2.GT", "calli", "Gene3.GT")
Genotype <- c("Gene1.GT", "Gene1.AD", "Gene2.GT", "Gene2.AD", "Gene3.GT", "Gene3.AD")

Result:

-> "cebi", "pithe", "Gene1.GT", "Gene1.AD", "sapiens", "Gene2.GT", "Gene2.AD", "calli", "Gene3.GT", "Gene3.AD"
R • 2.2k views

It's a little unclear how you are mapping between names and genotypes. Can you explain a bit more about how the result relates to the input?


Thank you for your reply. I want to match the part Gene1, Gene2, Gene3... and get both GTs and ADs for them. For example, I want to match "Gene1" common in both objects and get Gene1.GT and Gene1.AD from (Genotype) and get the (Result). So, I want to match Gene1:Gene1000 and get all the corresponding GTs and ADs in the same order it matches with the (Names).
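
To make the mapping described above concrete, here is a minimal sketch that groups the (Genotype) entries by their shared gene prefix. The names `gene_key` and `by_gene` are illustrative only, and the sketch assumes every entry ends in either `.GT` or `.AD`.

```r
Genotype <- c("Gene1.GT", "Gene1.AD", "Gene2.GT", "Gene2.AD", "Gene3.GT", "Gene3.AD")

# Strip the .GT/.AD suffix to get the gene key shared by both vectors
gene_key <- sub("\\.(GT|AD)$", "", Genotype)

# Group the Genotype entries by gene key (hypothetical helper object)
by_gene <- split(Genotype, gene_key)

# Both entries for one gene, in the order they appear in Genotype
by_gene[["Gene1"]]
```

Each element of `by_gene` then holds the GT and AD entries that should end up next to each other in the result.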

9.0 years ago
gtho123 ▴ 260

I don't think I quite understand what you are trying to do, but from your example it seems you want to insert the appropriate GeneX.AD value from the Genotype vector immediately after the corresponding GeneX.GT element in the Names vector.

If this is the case you could use regular expressions and a loop like this:

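# Keep only the .AD entries
ADs <- Genotype[grep("\\.AD$", Genotype)]

for (ad in ADs) {
  # Find the matching GeneX.GT exactly, so that Gene1 does not also match Gene10
  GT_loc <- match(sub("\\.AD$", ".GT", ad), Names)
  if (is.na(GT_loc)) next  # no matching GT in Names; skip this AD
  Names <- append(Names, ad, after = GT_loc)
}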

Given your input vectors this creates your desired result. This will not be the most efficient way in R, especially if your sample size is large. However it does reproduce your example.


If you really want to insert the corresponding GeneX.AD immediately after each GeneX.GT, you can also use the following approach. It should be faster than a loop.

Names <- c("cebi", "pithe", "Gene1.GT", "sapiens", "Gene2.GT", "calli", "Gene3.GT")
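# Get the positions of the ".GT" entries (escape the dot so it matches literally)
id <- grep("\\.GT$", Names)
# Create an index: each old element gets its rank, each AD gets a half-rank
Seq <- c(seq_along(Names), id + 0.5)
# Append the ADs, built by swapping the .GT suffix for .AD
Names <- append(Names, sub("\\.GT$", ".AD", Names[id]))
# Order so that each AD lands right after its GT
Names[order(Seq)]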
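
The approach above builds the AD names from the GT names rather than looking them up in (Genotype). As a hedged variant, the sketch below wraps the same half-rank idea in a function and only inserts ADs that are actually present in Genotype. The function name `insert_ad` and its internal variables are my own, not from the thread, and it assumes the entries end in `.GT` / `.AD`.

```r
# Hypothetical helper: insert each GeneX.AD found in Genotype right after GeneX.GT in Names
insert_ad <- function(Names, Genotype) {
  # Positions of the .GT entries in Names
  id <- grep("\\.GT$", Names)
  # AD names expected for those entries
  ad <- sub("\\.GT$", ".AD", Names[id])
  # Keep only the ADs that really exist in Genotype
  keep <- ad %in% Genotype
  # Original elements get integer ranks; each AD gets the rank of its GT plus 0.5
  rank <- c(seq_along(Names), id[keep] + 0.5)
  c(Names, ad[keep])[order(rank)]
}

Names <- c("cebi", "pithe", "Gene1.GT", "sapiens", "Gene2.GT", "calli", "Gene3.GT")
Genotype <- c("Gene1.GT", "Gene1.AD", "Gene2.GT", "Gene2.AD", "Gene3.GT", "Gene3.AD")
Result <- insert_ad(Names, Genotype)
Result
```

Because nothing is grown inside a loop, this should scale to 1000+ genes without trouble, and a GT with no AD in Genotype is simply left as it is.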