Remove duplicate rows based on three columns

clubkli

I have a data frame with prices of products on different dates. If a product has the same price on several dates, I would like to keep only the row with the most recent date.

Example of my data frame:

            Date           Price             Product
1         2019-08-28       10               product 1
2         2019-08-27       10               product 1
3         2019-08-28       15               product 2
4         2019-08-27       14               product 2
5         2019-08-23       15               product 2
6         2019-08-27       10               product 3

So I would like to get rid of row 2 and row 5 and only have:

            Date           Price             Product
1         2019-08-28       10               product 1
3         2019-08-28       15               product 2
4         2019-08-27       14               product 2
6         2019-08-27       10               product 3

Any suggestions? I could not find a question with a similar problem.

Ape

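Order the data by Date, most recent first, then remove the rows that are duplicated in the other two columns. duplicated() marks every repeat after the first occurrence, so once the data is sorted the most recent row of each Price/Product pair is the one that is kept: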

df <- read.table(text = "        Date Price  Product
1 2019-08-28    10 product1
2 2019-08-27    10 product1
3 2019-08-28    15 product2
4 2019-08-27    14 product2
5 2019-08-23    15 product2
6 2019-08-27    10 product3",
                 header = TRUE, stringsAsFactors = FALSE)

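# Sort by the first column (Date), most recent first
df <- df[order(df[, 1], decreasing = TRUE), ]
# Keep only the first row of each Price/Product combination
df[!duplicated(df[, -1]), ]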

#        Date Price  Product
#1 2019-08-28    10 product1
#3 2019-08-28    15 product2
#4 2019-08-27    14 product2
#6 2019-08-27    10 product3
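
The same idea can be written so that it does not depend on column positions. The following is only a sketch: the helper name keep_latest is hypothetical, and it assumes the columns are called Date, Price and Product and that Date is stored as "YYYY-MM-DD" text. Converting to the Date class is optional for this format, which already sorts correctly as text, but it protects against other date formats.

```r
# Hypothetical helper (not part of the original answer): for each
# Price/Product combination, keep only the row with the most recent Date.
keep_latest <- function(df) {
  # Convert to Date so the sort is chronological rather than alphabetical
  df$Date <- as.Date(df$Date)
  # Most recent dates first
  df <- df[order(df$Date, decreasing = TRUE), ]
  # duplicated() flags every repeat after the first occurrence, so the
  # first (most recent) row of each Price/Product pair survives
  df[!duplicated(df[, c("Price", "Product")]), ]
}

# Same example data as in the question, built without read.table()
df <- data.frame(
  Date = c("2019-08-28", "2019-08-27", "2019-08-28",
           "2019-08-27", "2019-08-23", "2019-08-27"),
  Price = c(10, 10, 15, 14, 15, 10),
  Product = c("product1", "product1", "product2",
              "product2", "product2", "product3"),
  stringsAsFactors = FALSE
)

result <- keep_latest(df)
print(result)
```

For the example data this drops rows 2 and 5, as in the answer above. Selecting the columns by name means the function keeps working if the column order changes.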
