Why does the Python heapq _siftup(...) call _siftdown(...) at the end?

aksg87

The code for _siftup in python/cpython/Lib/heapq.py on GitHub ends with a call to _siftdown:

def _siftup(heap, pos):
    endpos = len(heap)
    startpos = pos
    newitem = heap[pos]
    # Bubble up the smaller child until hitting a leaf.
    childpos = 2*pos + 1    # leftmost child position
    while childpos < endpos:
        # Set childpos to index of smaller child.
        rightpos = childpos + 1
        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
            childpos = rightpos
        # Move the smaller child up.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    # The leaf at pos is empty now.  Put newitem there, and bubble it up
    # to its final resting place (by sifting its parents down).
    heap[pos] = newitem
    _siftdown(heap, startpos, pos)

It seems the loop in _siftup(...) is already enough to place newitem in the correct position while maintaining the heap invariant. Why is the call to _siftdown() required?

trincot

This is the consequence of a particular choice the authors made in the algorithm.

More common is an algorithm where this final _siftdown() is not necessary, but then the loop must stop as soon as newitem <= heap[childpos] (with childpos being the smaller child), after which pos is a valid spot for newitem and no more sifting is needed.
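To illustrate that more common variant, here is a minimal sketch. The name siftup_early_exit is hypothetical and not part of heapq; the body only mirrors the loop from the question with an added early exit.

```python
def siftup_early_exit(heap, pos):
    """Hypothetical textbook-style sift: stop as soon as newitem fits at pos."""
    endpos = len(heap)
    newitem = heap[pos]
    childpos = 2 * pos + 1          # leftmost child position
    while childpos < endpos:
        # Set childpos to the index of the smaller child.
        rightpos = childpos + 1
        if rightpos < endpos and heap[rightpos] < heap[childpos]:
            childpos = rightpos
        # Early exit: newitem <= smaller child, so pos is already valid.
        if not heap[childpos] < newitem:
            break
        # Otherwise move the smaller child up and keep descending.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2 * pos + 1
    heap[pos] = newitem             # no trailing bubble-up step is needed


heap = [2, 1, 10, 5, 6]             # only the root violates the invariant
siftup_early_exit(heap, 0)
print(heap)                         # [1, 2, 10, 5, 6]
```

Note that each level costs up to two comparisons here: one to pick the smaller child and one to test newitem against it.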

In this version however, the loop continues until a leaf is found, and newitem is placed at a leaf spot. This may not be a valid spot for newitem, so the extra call is needed to go back up to a valid spot.
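A small walk-through may make this concrete. The sketch below splits the function from the question into its two phases so the intermediate state can be inspected. The names sink_to_leaf and bubble_up are made up for illustration; bubble_up plays the role that _siftdown has in the real module.

```python
def sink_to_leaf(heap, pos):
    """Phase 1: pull the smaller child up at every level until pos is a leaf."""
    endpos = len(heap)
    newitem = heap[pos]
    childpos = 2 * pos + 1
    while childpos < endpos:
        rightpos = childpos + 1
        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
            childpos = rightpos
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2 * pos + 1
    heap[pos] = newitem             # newitem now sits at a leaf
    return pos


def bubble_up(heap, startpos, pos):
    """Phase 2: move the item at pos up while it is smaller than its parent."""
    newitem = heap[pos]
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        parent = heap[parentpos]
        if not newitem < parent:
            break
        heap[pos] = parent
        pos = parentpos
    heap[pos] = newitem


heap = [2, 1, 10, 5, 6]             # root was just replaced by 2
leaf = sink_to_leaf(heap, 0)
after_loop = list(heap)
print(after_loop)                   # [1, 5, 10, 2, 6]  (2 sits below 5: invalid)
bubble_up(heap, 0, leaf)
after_fix = list(heap)
print(after_fix)                    # [1, 2, 10, 5, 6]  (invariant restored)
```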

In the comment block that precedes this function, the authors have explained why they made this choice, which at first seems to be less efficient, but in practice turns out to result in fewer comparisons:

We could break out of the loop as soon as we find a pos where newitem <= both its children, but turns out that's not a good idea, and despite that many books write the algorithm that way. During a heap pop, the last array element is sifted in, and that tends to be large, so that comparing it against values starting from the root usually doesn't pay (= usually doesn't get us out of the loop early). See Knuth, Volume 3, where this is explained and quantified in an exercise.
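One way to check this claim yourself is to count comparisons for both strategies. The following is a rough experiment, not a benchmark of heapq itself. The Counted wrapper, both sift functions and pop_all are illustrative names written for this sketch; only heapq.heapify comes from the standard library. The exact counts depend on the input data.

```python
import heapq
import random


class Counted:
    """Illustrative wrapper that counts every `<` comparison made on it."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


def sift_leaf_first(heap, pos):
    """Sink to a leaf without looking at newitem, then bubble it back up."""
    endpos, startpos, newitem = len(heap), pos, heap[pos]
    childpos = 2 * pos + 1
    while childpos < endpos:
        rightpos = childpos + 1
        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
            childpos = rightpos
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2 * pos + 1
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        if not newitem < heap[parentpos]:
            break
        heap[pos] = heap[parentpos]
        pos = parentpos
    heap[pos] = newitem


def sift_early_exit(heap, pos):
    """Textbook-style variant: stop once newitem <= its smaller child."""
    endpos, newitem = len(heap), heap[pos]
    childpos = 2 * pos + 1
    while childpos < endpos:
        rightpos = childpos + 1
        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
            childpos = rightpos
        if not heap[childpos] < newitem:
            break
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2 * pos + 1
    heap[pos] = newitem


def pop_all(values, sift):
    """Pop every item using `sift`; return (popped values, comparison count)."""
    heap = [Counted(v) for v in values]
    heapq.heapify(heap)
    Counted.comparisons = 0         # count only the work done by the pops
    popped = []
    while heap:
        last = heap.pop()           # the last array element is sifted in
        if heap:
            smallest, heap[0] = heap[0], last
            sift(heap, 0)
        else:
            smallest = last
        popped.append(smallest.value)
    return popped, Counted.comparisons


random.seed(0)
values = [random.random() for _ in range(2000)]
order_leaf, cmp_leaf = pop_all(values, sift_leaf_first)
order_early, cmp_early = pop_all(values, sift_early_exit)
print("leaf-first:", cmp_leaf, "early-exit:", cmp_early)
```

On random input like this, the leaf-first count should come out lower. That fits the quoted comment: the item sifted in during a pop usually belongs near the bottom, so the early-exit test is paid at almost every level without ending the loop early.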
