How to use NSE in dplyr to refer to one variable?

Aaron left Stack Overflow

I want to write a function for use in a dplyr chain that arranges grouped data by a given variable and then checks that the variable consists of consecutive integers starting at 1 (e.g., 1,2,3,...). To clarify, I mean every integer in order, not just increasing integers, so 1,2,4,... should fail.

The idea is to end up with something like this, which would raise an error if x was not 1,2,3,... within every group.

d %>% group_by(group) %>% check(x) 

I've written an SE version of this that seems to work, as follows, but am stuck on the NSE version.

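check_ <- function(.data, var) {
  # Fail unless x is exactly 1, 2, ..., length(x)
  checkint <- function(x) { stopifnot(x == seq_along(x)) }
  do(.data, {
    # Within each group: sort by var, then check the sorted values
    . <- dplyr::arrange_(., var)
    checkint(lazyeval::lazy_eval(var, data=.))
    .
  })
}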
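
To make the check itself concrete, here is a minimal base R sketch of the comparison that checkint relies on. The helper name is_consecutive is hypothetical (it is not part of the code above), and it returns TRUE/FALSE instead of raising an error so that the behaviour is easy to inspect.

```r
# Hypothetical helper: TRUE only when x is exactly 1, 2, ..., length(x).
# isTRUE() turns an NA result (e.g. from missing values) into FALSE.
is_consecutive <- function(x) {
  isTRUE(all(x == seq_along(x)))
}

ok      <- is_consecutive(c(1, 2, 3))  # every integer, in order
skipped <- is_consecutive(c(1, 2, 4))  # 3 is missing
shuffle <- is_consecutive(c(2, 1, 3))  # right values, wrong order
```

The last case is why the data are arranged by the variable before the check is run.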

In the documentation, it looks like I should be using lazy to process a single variable, but this doesn't work right when the variable I'm passing in also exists in the global environment.

checkX <- function(.data, var) {
  check_(.data, lazyeval::lazy(var))
}

d <- expand.grid(group=1:2, x=3:1)
x <- 5 ## put an "x" in the global environment
d %>% group_by(group) %>% checkX(x)

## Error: incorrect size (1), expecting : 3 

I do have a version of the NSE that seems to work, but calling lazy_dots feels wrong because I only ever want one variable.

check <- function(.data, ...) {
  check_(.data, lazyeval::lazy_dots(...)[[1]])
}
MrFlick

Looks like lazyeval has been changing. The latest vignette doesn't even reference the lazy() function, which does seem to have problems with variables in scope (more on that at the bottom). There are now new functions being encouraged, though they still haven't made their way into all of the "tidyverse" yet.

It looks like the function you want is expr_find. If we define checkX as

checkX <- function(.data, var) {
  check_(.data, lazyeval::expr_find(var))
}

Then this will work

x <- 5
d %>% group_by(group) %>% checkX(x)

(or at least it does with lazyeval_0.2.0 and dplyr_0.5.0)

But going back to the first example from the old vignette, the captured expression changes once x exists:

library(lazyeval)
# `x` does not exist here
f <- function(x = a - b) {
  lazy(x)
}
f()
# <lazy>
#   expr: a - b
#   env:  <environment: 0x000000000663d618>
exists("x")
# [1] FALSE
f(x)
# <lazy>
#   expr: x
#   env:  <environment: R_GlobalEnv>
x <- 101
f(x)
# <lazy>
#   expr: 101
#   env:  <environment: R_GlobalEnv>

Or an even simpler example:

# rm(x)
lazy(x)
# <lazy>
#   expr: x
#   env:  <environment: R_GlobalEnv>
x <- 100
lazy(x)
#  <lazy>
#   expr: 100
#   env:  <environment: R_GlobalEnv>

Somewhere it's evaluating the parameter x, so the expression is never preserved in the lazy object if x exists in the environment it's coming from.
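
For comparison, base R's substitute() captures the expression supplied for a function argument without evaluating it, so it does not matter whether a variable of the same name exists in the calling environment. This is only a minimal sketch of that behaviour, not a drop-in replacement for lazy(); the name capture_var is hypothetical.

```r
# Hypothetical helper: return the unevaluated expression passed as `var`
capture_var <- function(var) {
  substitute(var)
}

x <- 100                  # an `x` exists in the calling environment
e <- capture_var(x)

captured_name <- deparse(e)                   # the symbol's name as a string
in_data <- eval(e, data.frame(x = 1:3))       # evaluate against a data frame
```

Here e is still the symbol x rather than the value 100, and evaluating it against a data frame picks up the column instead of the global variable.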
