Given a discrete distribution, how do I round a number to the closest value in that distribution?

Scott

What I ultimately want to do is round the expected value of a discrete random variable to a valid value of its distribution. For example, if I am drawing uniformly from the numbers [1, 5, 6], the expected value is 4, but I want to return the closest valid number to that (i.e., 5).

import numpy as np
from scipy.stats import rv_discrete

xk = (1, 5, 6)
pk = np.ones(len(xk)) / len(xk)  # equal probability for each value
custom = rv_discrete(name='custom', values=(xk, pk))
print(custom.expect())
# 4.0

def round_discrete(discrete_rv_dist, val):
    # do something here
    return answer

print(round_discrete(custom, custom.expect()))
# 5.0

I don't know a priori what distribution will be used (i.e., the values might not be integers, or the distribution might be unbounded), so I'm really struggling to think of an algorithm that is sufficiently generic. Edit: I just learned that rv_discrete doesn't work on non-integer xk values.

As to why I want to do this: I'm putting together a Monte Carlo simulation and want a "nominal" value for each distribution. I think the expected value (EV) is more physically appropriate than the mode or median. Some values in the downstream simulation have to be one of several discrete choices, so passing a value that is not in that set is not acceptable.

If there's already a nice way to do this in Python that would be great, otherwise I can interpret math into code.

Scott

Figured it out and tested that it works. If I plug my value X into the cdf, I get a probability P = cdf(X) that I can plug into the ppf. The values at ppf(P ± epsilon) are the valid values in the set that sit on either side of X.

Or, more geometrically: for a discrete pmf, the point (X, P) lies on a horizontal portion of the corresponding cdf. When you invert the cdf, (P, X) lies on a vertical section of the ppf. Taking P ± eps gives you the two flat portions of the ppf connected to that vertical jump, which correspond to the valid values X1 and X2. You can then take a simple difference to figure out which is closer to your target value.

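import numpy as np
eps = np.finfo(float).eps  # machine epsilon for float64

ev = custom.expect()
p = custom.cdf(ev)  # the cdf is flat between valid values
# Probing the ppf just below, at, and just above p brackets ev
# with its neighbouring valid values.
ev_candidates = custom.ppf([p - eps, p, p + eps])
ev_candidates_distance = np.abs(ev_candidates - ev)
ev_closest = ev_candidates[np.argmin(ev_candidates_distance)]
print(ev_closest)
# 5.0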
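
For completeness, here is one way the snippet above might be wrapped into the round_discrete function from the question. This is only a sketch under a few assumptions of mine, not a tested general solution. The clipping of the probe probabilities and the filtering of non-finite results are my additions: ppf returns nan for arguments outside [0, 1], and ppf(0) returns a point one below the support rather than a valid value. It also assumes every valid value has probability well above machine epsilon.

```python
import numpy as np
from scipy.stats import rv_discrete


def round_discrete(discrete_rv_dist, val):
    """Return the valid value of the distribution closest to val (sketch)."""
    eps = np.finfo(float).eps
    p = discrete_rv_dist.cdf(val)
    # Assumption: keep the probes inside [eps, 1] so that ppf never
    # sees an out-of-range probability and never returns ppf(0).
    q = np.clip([p - eps, p, p + eps], eps, 1.0)
    candidates = discrete_rv_dist.ppf(q)
    # Drop non-finite results, e.g. ppf(1.0) of an unbounded distribution.
    candidates = candidates[np.isfinite(candidates)]
    return candidates[np.argmin(np.abs(candidates - val))]


xk = (1, 5, 6)
pk = np.ones(len(xk)) / len(xk)
custom = rv_discrete(name='custom', values=(xk, pk))

print(round_discrete(custom, custom.expect()))
# 5.0
```

The same call should also accept frozen built-in distributions such as scipy.stats.poisson(3.2), but values far out in a tail, where the cdf has already saturated at 1.0 in floating point, cannot be resolved this way.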

Terms:
pmf - probability mass function
cdf - cumulative distribution function (cumulative sum of the pmf)
ppf - percent point function (inverse of the cdf)
eps - machine epsilon (the gap between 1.0 and the next larger representable float)
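
To make these terms concrete, the following small sketch checks how they relate to each other on the example distribution from the question. The variable names are just for illustration.

```python
import numpy as np
from scipy.stats import rv_discrete

xk = (1, 5, 6)
pk = np.ones(len(xk)) / len(xk)
custom = rv_discrete(name='custom', values=(xk, pk))

pmf = custom.pmf(xk)  # probability mass at each valid value
cdf = custom.cdf(xk)  # cumulative probability up to each valid value

# The cdf is the cumulative sum of the pmf.
print(np.allclose(cdf, np.cumsum(pmf)))

# The ppf inverts the cdf: each cdf value maps back to its valid value.
print(custom.ppf(cdf))

# Between valid values the cdf is flat, so cdf(4) equals cdf(1).
print(custom.cdf(4.0) == custom.cdf(1))
```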
