Is there a good way to find the proper index?

유성민

I am looking for a package or module that finds an index faster than my own code.

Suppose there is a numpy array like

a = numpy.array([1,2,3,4,5,6])

So a[0]=1, a[1]=2, a[2]=3, a[3]=4, a[4]=5, a[5]=6. In this situation, I'd like to find the two indices (just before and just after) around 3.5. So the answer is a[2] and a[3], right?

Here is what I tried:

for i in range(len(a) - 1):
    difference1 = 3.5 - a[i]
    difference2 = 3.5 - a[i + 1]
    if difference1 * difference2 < 0:
        print(i)

So I can find a[2], and I know that a[3] is the smallest of my values that is larger than 3.5. But this is just an example. I deal with much larger data, so this takes a lot of time. Is there any tool (module or package) in Python that finds this faster?

meisen99

Here's one solution using numpy. I'm assuming from your example that your array is sorted.

  • Add the value that you are looking for to a copy of the array
  • Re-sort the new array
  • Find the value you are looking for. The index of the value "to the left" is the index you are looking for.

I mocked up this version, and on a 10M-element array it ran about 10x faster than the original loop. For clarity, I've kept each numpy step separate rather than merging them (you could do it all in one long line if you want).

Specific answer to your example:

import numpy
a = numpy.array([1,2,3,4,5,6])
# Insert 3.5, re-sort, locate it, and step one index to the left
print(numpy.where(numpy.sort(numpy.concatenate((a,[3.5])))==3.5)[0][0] - 1)

Longer example for more explanation:

import time
import numpy
a = numpy.array([1,2,3,4,5,6])
f = 3.5

# Replacing with a larger range of values to search for timing test
a = numpy.arange(10000000)
f = 500000.5

print("starting new version")
start = time.time()
b = numpy.concatenate((a,[f]))   # copy of the array with f appended
c = numpy.sort(b)                # re-sort so f lands in its proper place
d = numpy.where(c==f)[0][0] - 1  # index just to the left of f
print(d)
end = time.time()
print(end-start)

print("do it in one unreadable line")
start = time.time()
print(numpy.where(numpy.sort(numpy.concatenate((a,[f])))==f)[0][0] - 1)
end = time.time()
print(end-start)

print("starting original version")
start = time.time()
for i in range(len(a) - 1):
    difference1 = f - a[i]
    difference2 = f - a[i + 1]
    # A negative product means a[i] and a[i + 1] lie on opposite sides of f
    if difference1 * difference2 < 0:
        print(i)

end = time.time()
print(end-start)
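
As a side note, the sign-change test from the original loop can probably be written without a Python-level loop. This is only a sketch of that same idea using array slices, not something I timed; the function name bracketing_indices is made up for illustration.

```python
import numpy

def bracketing_indices(a, f):
    """Return every i where a[i] and a[i + 1] lie strictly on opposite sides of f.

    Hypothetical helper name; same test as the loop, applied to whole slices.
    """
    a = numpy.asarray(a)
    diffs = f - a                          # difference1 / difference2 for all i at once
    crossing = diffs[:-1] * diffs[1:] < 0  # negative product means a sign change
    return numpy.nonzero(crossing)[0]

a = numpy.array([1, 2, 3, 4, 5, 6])
print(bracketing_indices(a, 3.5))  # [2], so a[2] and a[3] bracket 3.5
```

Like the loop, this does not need the array to be sorted, but it still touches every element.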

EDIT: Put the longer example second so it doesn't look like my suggested solution is 30 lines long!
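
If the array really is sorted in ascending order (the same assumption as above), a binary search with numpy.searchsorted is a different technique that may be worth trying, since it avoids copying and re-sorting the array. A minimal sketch, where the helper name neighbours and the choice to return None for out-of-range values are my own assumptions:

```python
import numpy

def neighbours(a, f):
    """Return (i, i + 1) with a[i] < f <= a[i + 1], or None if no such pair exists.

    Hypothetical helper; assumes `a` is sorted in ascending order.
    """
    a = numpy.asarray(a)
    # Index of the first element that is >= f (binary search, no copy of `a`)
    right = int(numpy.searchsorted(a, f, side="left"))
    if right == 0 or right == len(a):
        return None  # f is at or below the first element, or above the last
    return right - 1, right

a = numpy.array([1, 2, 3, 4, 5, 6])
print(neighbours(a, 3.5))  # (2, 3)
```

Note that if f is equal to an element of the array, this returns that element's index as the upper neighbour; use side="right" if you want it treated as the lower one instead.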
