Generate n distinct values x times in R

Jash Shah

I would like to create a vector of length 30 that contains every integer from 1 to 20 at least once, with counts that are not uniform across values.

For example, there can be four counts of 1, one count of 2, two counts of 3, etc. The counts must add up to thirty, and all 20 distinct values must be present.

I tried:

set.seed(3) 
sample(x = 1:20, size = 30, replace = TRUE)

But it does not always return every value from 1 to 20: some values are returned several times and others are not returned at all.
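A quick way to confirm this is to list the values that are absent from a draw. This is a minimal sketch in base R; the variable names `x` and `missing_values` are only illustrative.

```r
set.seed(3)
x <- sample(x = 1:20, size = 30, replace = TRUE)

# Values from 1:20 that never appear in the draw
missing_values <- setdiff(1:20, x)
missing_values

# TRUE only if every value from 1 to 20 is present
all(1:20 %in% x)
```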

I would like a vector in which all 20 distinct values appear, and the numbers must be integers.

etienne

You can do it in three steps:

  • generate a size-20 sample without replacement: this gives you every value exactly once

  • generate a size-10 sample with replacement

  • shuffle the two samples together

Here is the result:

a <- sample(1:20, 20)
b <- sample(1:20, 10, replace = TRUE)
result <- sample(c(a, b), 30)

# result
#  [1]  1 10 20 11 16 12  9  8 20  4 15  2  7  5 19 18  6 13 14 17 11  5  1  7  4 19  6 16  3  3

# table(result) # every value appears at least once
# result
#  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 
#  2  1  2  2  2  2  2  1  1  1  2  1  1  1  1  2  1  1  2  2 
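
The same idea can be wrapped in a small function so that it works for other value sets and lengths. This is only a sketch: the name `sample_cover` and its arguments are hypothetical, not part of base R, and it assumes the requested size is at least the number of distinct values.

```r
# Hypothetical helper: draw `size` values from `values` so that
# every element of `values` appears at least once.
sample_cover <- function(values, size) {
  n <- length(values)
  stopifnot(size >= n)
  # Extra draws (with replacement) on top of one copy of each value;
  # indexing avoids sample()'s special case for a single number
  extra <- values[sample.int(n, size - n, replace = TRUE)]
  # Shuffle so the guaranteed copies are not grouped together
  combined <- c(values, extra)
  combined[sample.int(length(combined))]
}

set.seed(3)
result <- sample_cover(1:20, 30)
length(result)          # 30
all(1:20 %in% result)   # TRUE
table(result)
```

Because one copy of each value is included before the extra draws are added, the coverage check holds for any seed; only the distribution of the ten extra values changes.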

Note that you can do it with a one-liner:

sample(c(sample(1:20, 20), sample(1:20, 10, replace = TRUE)), 30)

# [1]  4 13 15 20  6  5  9 11 11 14 17  1 10  9  3 10 11 12 18 17  8  7 18 12 19 16  2 13 13  4

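Thanks to James's comment, there is a faster solution. Since 1:20 already contains every value once, the first inner sample() is unnecessary, and sample() called without a size returns a permutation of its argument: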

sample(c(1:20,sample(20,10,replace=TRUE)))

Here is the microbenchmark comparison:

# Unit: relative
#     expr      min       lq     mean   median     uq       max neval
#  etienne 1.727202 1.538411 1.529077 1.571341 1.5998 0.6855444  1000
#    james 1.000000 1.000000 1.000000 1.000000 1.0000 1.0000000  1000
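
The code behind this comparison is not shown. Below is a sketch of how it could be reproduced, assuming the `microbenchmark` package is installed and assuming the two labelled expressions were the one-liner above and James's version. The wrapper functions `etienne` and `james` are named here only to match the labels in the output.

```r
# Assumed expressions for the two benchmark labels
etienne <- function() {
  sample(c(sample(1:20, 20), sample(1:20, 10, replace = TRUE)), 30)
}
james <- function() {
  sample(c(1:20, sample(20, 10, replace = TRUE)))
}

# Run the benchmark only if the package is available
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  bench <- microbenchmark::microbenchmark(
    etienne = etienne(),
    james   = james(),
    times   = 1000,
    unit    = "relative"
  )
  print(bench)
}
```

Timings will differ between machines and runs, so the figures above should be read as indicative only.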
