Using heapq.nlargest with a Percentage - Python

Bake.G

Does anyone know how to use heapq.nlargest with a percentage rather than a number? At the moment I have

heapq.nlargest(187030, y)

But this gets me the top 187030 numbers. I need it to get me the top 10% of the numbers in each array, because not all arrays have 1.8 million elements.

Cheers

Willem Van Onsem

Yes. Behind the scenes a heap is just a list with certain properties (a so-called implicit data structure), and heapq.nlargest accepts any iterable, so y can be an ordinary list or array.

So we can first take len(y) to obtain the number of elements. Dividing that by 10 gives the number of elements in the top 10%, so we can use:

heapq.nlargest(len(y)//10, y)
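One thing to watch with the integer division above: for an array with fewer than 10 elements, len(y)//10 is 0 and nlargest returns an empty list. If you always want at least one element, a minimal sketch (the sample list and the variable names are made up for illustration) could look like this:

```python
import heapq

y = [5, 1, 9, 3, 7]      # hypothetical small array

n = len(y) // 10         # 0, because there are fewer than 10 elements
empty = heapq.nlargest(n, y)

# Clamp to 1 so that a small array still yields its single largest value.
at_least_one = heapq.nlargest(max(1, n), y)

print(empty)
print(at_least_one)
```

Whether clamping to 1 is what you want depends on your data; without it you get the strict "floor of 10%" behaviour.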

Or, in case you want to pass the percentage as a parameter:

p = 17  # top 17 percent

heapq.nlargest(len(y)*p//100, y)
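Since you need this for each array, it may be easier to wrap the expression in a small function. The following is only a sketch: top_percent and the sample arrays are names I made up, and it assumes percent is an integer between 0 and 100 and that every array supports len().

```python
import heapq

def top_percent(values, percent):
    """Return the largest `percent` percent of `values`, largest first."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Integer arithmetic only: floor of len(values) * percent / 100.
    n = len(values) * percent // 100
    return heapq.nlargest(n, values)

# Hypothetical arrays of different sizes.
arrays = [list(range(50)), list(range(200))]
tops = [top_percent(a, 10) for a in arrays]

print([len(t) for t in tops])
```

Each array then gets a count that scales with its own length, instead of the hard-coded 187030.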

You can also use a fraction (for instance the top 0.14):

p = 0.14  # top 14 percent; round() is a built-in, so no import is needed

heapq.nlargest(round(len(y)*p), y)

