Find mismatch in two columns in a data frame in R
8.1 years ago
dirranrak ▴ 20

Hi all,

I am quite new to R. I have looked for an answer on many websites but did not find a clear way to solve my problem. I have a data frame with two columns, each holding a list of SNPs in more than 1000 rows, but the two columns do not have the same number of rows.

SNP1          SNP2
rs3094315     rs3094315
rs3131972     rs3131972
rs11240777    rs11240777
rs6681022     rs6681049
rs4970383     rs4970383
rs7537756     rs7537756
rs13302982

I ran match(df$SNP1, df$SNP2) and found the indices of the rows that have an NA value, which are the mismatches. Now I want to get the rs# instead of the row indices. How can I do that?

Thank you

R • 119k views
8.1 years ago
dan.shea ▴ 10

If I understand your question correctly, you want everything in df$SNP1 that is not in df$SNP2.

Small example using two vectors:

a <-c('a','b','c','d','e')
b <-c('a','b','d','e')

> a[a %in% b]
[1] "a" "b" "d" "e"
> a[!(a %in% b)]
[1] "c"

Read the R documentation on value matching, found at https://stat.ethz.ch/R-manual/R-devel/library/base/html/match.html

If you use %in% you will get a logical vector back of TRUE and FALSE values that you can then use to access the values in the column.
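To connect this back to the match() call in the question, here is a small sketch using the same two vectors. It shows that the NA positions returned by match() line up with the FALSE positions returned by %in%, so either one can be used to index the original vector and get the values themselves. The variable names idx and in_b are made up for illustration.

```r
a <- c("a", "b", "c", "d", "e")
b <- c("a", "b", "d", "e")

# match() gives the position of each element of a in b, or NA if it is absent
idx <- match(a, b)

# %in% gives TRUE/FALSE for the same question
in_b <- a %in% b

# The two views agree: an element is in b exactly when match() is not NA
stopifnot(identical(in_b, !is.na(idx)))

# Index a with the NA positions to get the values rather than the row numbers
not_in_b <- a[is.na(idx)]
print(not_in_b)
```

So if you already have the output of match(), wrapping it in is.na() and using that to subset the column gives the names directly.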

Here is the same data as a data frame if it helps visualize what is going on:

> a <-c('a','b','c','d','e')
> b <-c('a','b','d','e', NA)
> a[!(a %in% b)]
[1] "c"
> ab <- data.frame(a,b)
> ab$a[!(ab$a %in% ab$b)]
[1] c
Levels: a b c d e
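
Applied to the SNP data from the question, this might look like the sketch below. The data frame is rebuilt from the rows shown in the question, and it assumes the shorter SNP2 column is padded with NA in the last row, which may not be how your data is actually stored. stringsAsFactors = FALSE is set so the result is a plain character vector rather than a factor.

```r
# Rows copied from the question; the trailing NA in SNP2 is an assumption
df <- data.frame(
  SNP1 = c("rs3094315", "rs3131972", "rs11240777", "rs6681022",
           "rs4970383", "rs7537756", "rs13302982"),
  SNP2 = c("rs3094315", "rs3131972", "rs11240777", "rs6681049",
           "rs4970383", "rs7537756", NA),
  stringsAsFactors = FALSE
)

# rs# present in SNP1 but absent from SNP2
only_in_snp1 <- df$SNP1[!(df$SNP1 %in% df$SNP2)]

# rs# present in SNP2 but absent from SNP1, ignoring the NA padding
only_in_snp2 <- setdiff(df$SNP2, c(df$SNP1, NA))

print(only_in_snp1)
print(only_in_snp2)
```

Note that the comparison is against the whole other column, not row by row, so an rs# counts as a mismatch only if it appears nowhere in the other column.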

Hi dan.shea, I have a similar problem. I want to extract the values that are unique to column A when compared with columns B, C, D, E and F. That is, I want to extract the gene names that are present in column A but not present in any of the other five columns. Thanks.


Hi dan.shea,

Thank you very much, you saved my day. I was using Excel and waited almost a day for the comparison because the data is so large, so I was trying to find out how to do it in R.

Thank you
