Python heapq module

The heapq module of Python supports a heap queue, also as called a priority queue. The heap is an incomplete binary tree. In such a tree the parent node is less than (in a min heap; the default heap type that Python uses) its child node(s). Due to this, it is very efficient in retrieving repeatedly the smallest element, which it does by working bottom-up.

Let me talk about the heapq module, which includes its methods and examples to explain functionality.

Key Features of heapq

  1. Min-Heap Implementation: Python’s heapq module provides a min-heap, meaning the smallest element is always at index 0.
  2. Efficient Operations:
    • Insertion and deletion of the smallest element are O(log⁡n).
    • Accessing the smallest element is O(1).
  3. In-Place Operations: The functions work directly on a regular Python list, modifying it in place.

Basic Methods

1. heapq.heappush(heap, item)

  • It adds a new item to the heap while still keeping the heap property.
  • Parameters:
    • heap: A list that has already been converted into a heap.
    • item: The item to be added.
  • Time Complexity: O(log⁡n).

Example:

import heapq

heap = []
heapq.heappush(heap, 10)
heapq.heappush(heap, 5)
heapq.heappush(heap, 15)
print("Heap after pushes:", heap)

Output:

Heap after pushes: [5, 10, 15]

2. heapq.heappop(heap)

  • Removes and returns the smallest element from the heap.
  • Parameters:
    • heap: A heapified list.
  • Time Complexity: O(log⁡n).

Example:

heap = [5, 10, 15]
smallest = heapq.heappop(heap)
print("Smallest element popped:", smallest)
print("Heap after pop:", heap)

Output:

Smallest element popped: 5
Heap after pop: [10, 15]

3. heapq.heappushpop(heap, item)

  • It pushes a new item onto the heap and pops the smallest element in a single operation.
  • Use Case: More efficient than using heappush() followed by heappop() for the same purpose.
  • Time Complexity: O(log⁡n).

Example:

heap = [5, 10, 15]
result = heapq.heappushpop(heap, 8)
print("Smallest element popped:", result)
print("Heap after pushpop:", heap)

Output:

Smallest element popped: 5
Heap after pushpop: [8, 10, 15]

4. heapq.heapreplace(heap, item)

  • Pops the smallest element and pushes a new item onto the heap.
  • Difference from heappushpop: This assumes the heap is non-empty.
  • Time Complexity: O(log⁡n).

Example:

heap = [5, 10, 15]
result = heapq.heapreplace(heap, 8)
print("Smallest element replaced:", result)
print("Heap after replace:", heap)

Output:

Smallest element replaced: 5
Heap after replace: [8, 10, 15]

Additional Functions

5. heapq.heapify(x)

  • Transforms a list into a heap, in-place, in O(n)O(n)O(n) time.
  • This is useful if you already have a list and want to treat it as a heap.

Example:

nums = [10, 20, 15, 30, 40]
heapq.heapify(nums)
print("Heapified list:", nums)

Output:

Heapified list: [10, 20, 15, 30, 40]

6. heapq.merge(*iterables, key=None, reverse=False)

  • Merges multiple sorted inputs into a single sorted iterator.
  • More efficient than concatenating and sorting manually.

Example:

nums1 = [1, 3, 5]
nums2 = [2, 4, 6]
result = heapq.merge(nums1, nums2)
print("Merged sorted result:", list(result))

Output:

Merged sorted result: [1, 2, 3, 4, 5, 6]

7. heapq.nlargest(n, iterable, key=None)

  • Returns the n largest elements of the dataset.
  • Parameters:
    • n: Number of elements to return.
    • iterable: The data to search through.
    • key: A function to extract a comparison key (optional).

Example:

nums = [10, 20, 15, 30, 40]
largest = heapq.nlargest(2, nums)
print("Two largest elements:", largest)

Output:

Two largest elements: [40, 30]

8. heapq.nsmallest(n, iterable, key=None)

  • Returns the n smallest elements from the dataset.

Example:

nums = [10, 20, 15, 30, 40]
smallest = heapq.nsmallest(2, nums)
print("Two smallest elements:", smallest)

Output:

Two smallest elements: [10, 15]

Using Heaps for Custom Objects

To use a heap with custom objects, you could also specify a key function at the time of sorting or heapifying. Another option is using a tuple in which the first element serves as the priority.

Example using a Custom Key:

class Task:
    def __init__(self, priority, description):
        self.priority = priority
        self.description = description

    def __repr__(self):
        return f"Task({self.priority}, {self.description})"

tasks = [Task(3, 'low'), Task(1, 'high'), Task(2, 'medium')]

# Convert tasks into a heap
heap = [(task.priority, task) for task in tasks]
heapq.heapify(heap)

# Access tasks by priority
smallest_task = heapq.heappop(heap)
print("Task with highest priority:", smallest_task[1])

Output:

Task with highest priority: Task(1, high)

Summary

The heapq module is very important to manage data in a priority queue format efficiently. Mastering its functions will allow you to:

  • Handle priority-based scheduling.
  • Retrieve the smallest/largest elements quickly.
  • Merge and sort data correctly.