I'm having a hard time understanding how R treats the AND and OR operators when I use filter from dplyr.
Here's an example to illustrate:
library(dplyr)
xy <- data.frame(x=1:6, y=c("a", "b"), z= c(rep("d",3), rep("g",3)))
> xy
x y z
1 1 a d
2 2 b d
3 3 a d
4 4 b g
5 5 a g
6 6 b g
Using filter, I want to eliminate all rows where x == 1 and z == "d". This would lead me to believe I want to use the AND operator, &:
> filter(xy, x != 1 & z != "d")
x y z
1 4 b g
2 5 a g
3 6 b g
But this removes all rows that have either x == 1 or z == "d". What's more confusing is that when I use the OR operator, |, I get the desired result:
> filter(xy, x != 1 | z != "d")
x y z
1 2 b d
2 3 a d
3 4 b g
4 5 a g
5 6 b g
Also, this works, but it is less desirable if I were stringing together == and != in the same conditional evaluation:
> filter(xy, !(x == 1 & z == "d"))
x y z
1 2 b d
2 3 a d
3 4 b g
4 5 a g
5 6 b g
Can someone explain what I'm missing?
This is a question of Boolean algebra. The logical expression !(x == 1 & z == "d") is equivalent to x != 1 | z != "d" (De Morgan's law), just as -(x + y) is equivalent to -x - y. When you eliminate the parentheses, you change every == to != and every & to | (and vice versa). It follows that
!(x == 1 & z == "d")
is NOT the same as
x != 1 & z != "d"
but rather
x != 1 | z != "d"
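The equivalence can be checked directly on the data from the question. The following is a minimal sketch in base R (so it runs without dplyr); the variable names negated, with_or, and with_and are made up for illustration and are not part of the original post.

```r
# Same data as in the question; y is written out with rep() for clarity.
xy <- data.frame(x = 1:6,
                 y = rep(c("a", "b"), 3),
                 z = c(rep("d", 3), rep("g", 3)))

# Three candidate "keep this row" conditions, as plain logical vectors.
negated  <- !(xy$x == 1 & xy$z == "d")  # drop rows where BOTH conditions hold
with_or  <- xy$x != 1 | xy$z != "d"     # De Morgan equivalent of the line above
with_and <- xy$x != 1 & xy$z != "d"     # drops rows where EITHER condition holds

identical(negated, with_or)   # TRUE
identical(negated, with_and)  # FALSE

xy[with_or, ]   # keeps original rows 2 to 6
xy[with_and, ]  # keeps original rows 4 to 6
```

The same logical expressions can be passed to filter; the vectors are built separately here only so they can be compared with identical().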