How can I make a data frame of the unique values in an existing column?

JKO

I need to make a new data frame with a single column (col.3) that contains the values of one column (col.1) of an existing data frame, taken only from the rows where another column (col.2) shows a value for the first time, so that there is one row per unique col.2 value.

I need this:

df1
col.1   col.2
    1       1
    1       3
    1       7
    1       7
    2      12
    2      14
    2      14
    2      14

df2
col.3
    1
    1
    1
    2
    2

I have tried this:

new.col <- cbind(df$col.1[unique(df$col.2)])

But this gives me a column that is too long and that does not include the complete set of col.1 values.

I suspect that plyr has a simple solution to this, but I have not figured that (or any other solution) out.

How can I achieve my desired result? Preferably using plyr, but base is fine too.

akrun

We can use duplicated to create a logical index and use it to subset the rows:

df2 <- data.frame(col.3 = df$col.1[!duplicated(df$col.2)])
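
As a rough illustration of why this works where indexing by unique(df$col.2) does not, here is a small base R sketch on the sample data given at the end of this answer. The names vals, wrong, keep and right are only illustrative; the point is that unique returns values, while duplicated returns one logical per row.

```r
# Sample data, same values as the 'data' section of this answer (7 rows).
df <- data.frame(
  col.1 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  col.2 = c(1L, 3L, 7L, 7L, 12L, 14L, 14L)
)

# unique() returns the distinct VALUES of col.2, not row positions.
vals <- unique(df$col.2)

# Using those values as positions asks for rows 1, 3, 7, 12 and 14.
# Rows 12 and 14 do not exist here, so NA appears in the result.
wrong <- df$col.1[vals]

# duplicated() returns one logical per row, TRUE when the value has
# already appeared, so !duplicated() keeps the first row of each value.
keep <- !duplicated(df$col.2)
right <- df$col.1[keep]

print(wrong)
print(right)
```

On this sample, right holds the five values wanted for col.3, while wrong mixes misplaced values with NA.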

Or with subset

subset(df, !duplicated(col.2), select = col.1)

Or with dplyr, use distinct on col.2 and then select col.1, renaming it to col.3:

library(dplyr)
df %>%
   distinct(col.2, .keep_all = TRUE) %>%
   select(col.3 = col.1)
#  col.3
#1     1
#2     1
#3     1
#4     2
#5     2

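If duplicates should only be detected between adjacent elements (so that a value that reappears after a gap counts as new), then use rleid from data.table:

library(dplyr)
library(data.table)
# rleid() gives each run of equal adjacent values its own id
df %>% 
    filter(!duplicated(rleid(col.2))) %>% 
    select(col.3 = col.1)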
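
The difference between the two notions of "duplicate" is not visible on the sample data, so here is a minimal base R sketch on a hypothetical vector x (not from the question) in which a value recurs after a gap. It does not call rleid; for a vector without NA, comparing each element with its predecessor should select the same elements as !duplicated(rleid(x)).

```r
# Hypothetical vector in which the value 7 recurs after a gap.
x <- c(7L, 7L, 3L, 7L, 7L)

# Global de-duplication: keep only the first occurrence of each value.
first_overall <- !duplicated(x)

# Adjacent de-duplication: keep an element when it differs from the
# one immediately before it (the first element is always kept).
first_in_run <- c(TRUE, x[-1] != x[-length(x)])

print(x[first_overall])  # 7 3
print(x[first_in_run])   # 7 3 7
```

Which of the two you want depends on whether a value that comes back later should produce a new row in the result.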

If we convert to a data.table, unique also has a by argument (note that setDT converts df by reference, so df itself becomes a data.table):

library(data.table)
unique(setDT(df), by = 'col.2')[, .(col.3 = col.1)]

data

df <- structure(list(col.1 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L), col.2 = c(1L, 
3L, 7L, 7L, 12L, 14L, 14L)), class = "data.frame", row.names = c(NA, 
-7L))
