Replace symbols in R table
5.6 years ago

Hello,

I have a table in R with 70k rows and 37 columns. Many cells contain "./.", which I want to replace with "ab". I tried gsub(), but it does not give the required output.

I used:

file <- gsub("./.","ab",file)

I want the change to apply throughout the table. Is there another way to do this? Thanks in advance.

Example input:

S.no  chr  pos   gene_name  S1   S2   S3
1     1    1290  X          ./.  1/1  ./.
2     1    5822  Y          0/1  ./.  ./.

Expected output:

S.no  chr  pos   gene_name  S1   S2   S3
1     1    1290  X          ab   1/1  ab
2     1    5822  Y          0/1  ab   ab

The replacement can be either "ab" or NA.

R gsub vcf • 3.3k views
5.6 years ago
zx8754 11k

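Try a fixed (literal) match, applied column by column, since gsub() expects a character vector rather than a data frame (note that this converts every column to character):

file[] <- lapply(file, gsub, pattern = "./.", replacement = "ab", fixed = TRUE)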

Or

file[ file == "./." ] <- "ab"
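
To see why the original call misbehaves, it may help to compare a regex match with a literal match on a small character vector. In a regular expression "." matches any character, so the pattern "./." also matches "1/1" and "0/1". The vector name gt below is made up for illustration only.

```r
# Hypothetical genotype vector, used only for illustration
gt <- c("./.", "1/1", "0/1")

# Regex match: "." is a wildcard, so every genotype is replaced
gsub("./.", "ab", gt)
# [1] "ab" "ab" "ab"

# Literal match: only the missing genotype is replaced
gsub("./.", "ab", gt, fixed = TRUE)
# [1] "ab"  "1/1" "0/1"

# Equivalent regex with the dots escaped
gsub("\\./\\.", "ab", gt)
# [1] "ab"  "1/1" "0/1"
```

Either fixed = TRUE or escaped dots should avoid touching called genotypes such as 1/1.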

Edit: Using example data provided by OP.

# example input data
df1 <- read.table(text = "
S.no chr pos gene_name S1 S2 S3
1        1    1290    X            ./.   1/1  ./.
2        1     5822    Y           0/1   ./.   ./.", header = TRUE, stringsAsFactors = FALSE)

df1
#   S.no chr  pos gene_name  S1  S2  S3
# 1    1   1 1290         X ./. 1/1 ./.
# 2    2   1 5822         Y 0/1 ./. ./.

df1[, c("S1", "S2", "S3")][ df1[, c("S1", "S2", "S3")] == "./." ] <- "ab"
df1
#   S.no chr  pos gene_name  S1  S2 S3
# 1    1   1 1290         X  ab 1/1 ab
# 2    2   1 5822         Y 0/1  ab ab
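
Since the question says NA is also acceptable, one possible alternative is to mark "./." as missing while reading the data, using the na.strings argument of read.table(). This is only a sketch on the same example data; the name df2 is made up for illustration.

```r
# Same example data; na.strings marks "./." (and the default "NA") as missing
df2 <- read.table(text = "
S.no chr pos gene_name S1 S2 S3
1 1 1290 X ./. 1/1 ./.
2 1 5822 Y 0/1 ./. ./.", header = TRUE, stringsAsFactors = FALSE,
                  na.strings = c("NA", "./."))

# Count missing genotypes per sample column
colSums(is.na(df2[, c("S1", "S2", "S3")]))
# S1 S2 S3 
#  1  1  2 
```

With real NA values, functions such as is.na() and complete.cases() work directly on the table.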

Thanks for your answer. The second one ran, but the cells do not show "ab"; they are blank or contain NA instead.


Please provide reproducible example input data and the expected output.


I have edited my question with an example input and output. Thanks


Edited my answer, see if it works.

5.6 years ago
df1 <- read.csv("test.txt", header = TRUE, strip.white = TRUE, stringsAsFactors = FALSE, sep = "\t")
library(stringr)
library(dplyr)
# funs() is deprecated in dplyr; a formula lambda does the same job
> df1 %>% mutate_all(~ str_replace_all(., "\\.[/|]\\.", "ab"))
  S.no chr  pos gene_name  S1  S2 S3
1    1   1 1290         X  ab 1/1 ab
2    2   1 5822         Y 0/1  ab ab

You can also use apply function:

> library(stringr)
> apply(df1,2, function(x) str_replace_all(x,"\\.[/|\\|]\\.","ab"))
     S.no chr pos    gene_name S1    S2    S3  
[1,] "1"  "1" "1290" "X"       "ab"  "1/1" "ab"
[2,] "2"  "1" "5822" "Y"       "0/1" "ab"  "ab"

This should replace both ./. and .|. (test.txt contains the OP's input text). Note that apply() returns a character matrix rather than a data frame.
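
If the numeric columns should keep their types, one option is to restrict the replacement to the genotype columns with base R lapply(). This is a sketch under assumptions: the data frame is rebuilt by hand, one .|. value is added to S2 purely to illustrate the phased case, and sample_cols is a made-up name for the columns assumed to hold genotypes.

```r
# Hand-built example; the ".|." in S2 is hypothetical, added for illustration
df1 <- data.frame(
  S.no = 1:2, chr = c(1L, 1L), pos = c(1290L, 5822L),
  gene_name = c("X", "Y"),
  S1 = c("./.", "0/1"), S2 = c("1/1", ".|."), S3 = c("./.", "./."),
  stringsAsFactors = FALSE
)

# Assumed genotype columns
sample_cols <- c("S1", "S2", "S3")

# Anchored pattern: the whole cell must be ./. or .|.
df1[sample_cols] <- lapply(df1[sample_cols],
                           function(x) sub("^\\.[/|]\\.$", "ab", x))

str(df1)  # pos and chr remain integer; only S1..S3 were modified
```

Anchoring with ^ and $ means a cell is replaced only when it consists entirely of the missing genotype.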


I want only the ./. to be modified, not the 1/1.


Oops, typo. It doesn't replace anything other than the missing genotypes (./. or .|.). I have edited the post. Inquisitive8995


Moved your post to an answer. You might want to clean up your comments above or edit them into this post.


Thanks zx8754
