Python heapq module
The heapq module of Python supports a heap queue, also as called a priority queue. The heap is an incomplete binary tree. In such a tree the parent node is less than (in a min heap; the default heap type that Python uses) its child node(s). Due to this, it is very efficient in retrieving repeatedly the smallest element, which it does by working bottom-up.
Let me talk about the heapq module, which includes its methods and examples to explain functionality.
Key Features of heapq
- Min-Heap Implementation: Python’s
heapqmodule provides a min-heap, meaning the smallest element is always at index0. - Efficient Operations:
- Insertion and deletion of the smallest element are O(logn).
- Accessing the smallest element is O(1).
- In-Place Operations: The functions work directly on a regular Python list, modifying it in place.
Basic Methods
1. heapq.heappush(heap, item)
- It adds a new item to the heap while still keeping the heap property.
- Parameters:
heap: A list that has already been converted into a heap.item: The item to be added.
- Time Complexity: O(logn).
Example:
import heapq
heap = []
heapq.heappush(heap, 10)
heapq.heappush(heap, 5)
heapq.heappush(heap, 15)
print("Heap after pushes:", heap)
Output:
Heap after pushes: [5, 10, 15]
2. heapq.heappop(heap)
- Removes and returns the smallest element from the heap.
- Parameters:
heap: A heapified list.
- Time Complexity: O(logn).
Example:
heap = [5, 10, 15]
smallest = heapq.heappop(heap)
print("Smallest element popped:", smallest)
print("Heap after pop:", heap)
Output:
Smallest element popped: 5
Heap after pop: [10, 15]
3. heapq.heappushpop(heap, item)
- It pushes a new item onto the heap and pops the smallest element in a single operation.
- Use Case: More efficient than using
heappush()followed byheappop()for the same purpose. - Time Complexity: O(logn).
Example:
heap = [5, 10, 15]
result = heapq.heappushpop(heap, 8)
print("Smallest element popped:", result)
print("Heap after pushpop:", heap)
Output:
Smallest element popped: 5
Heap after pushpop: [8, 10, 15]
4. heapq.heapreplace(heap, item)
- Pops the smallest element and pushes a new item onto the heap.
- Difference from
heappushpop: This assumes the heap is non-empty. - Time Complexity: O(logn).
Example:
heap = [5, 10, 15]
result = heapq.heapreplace(heap, 8)
print("Smallest element replaced:", result)
print("Heap after replace:", heap)
Output:
Smallest element replaced: 5
Heap after replace: [8, 10, 15]
Additional Functions
5. heapq.heapify(x)
- Transforms a list into a heap, in-place, in O(n)O(n)O(n) time.
- This is useful if you already have a list and want to treat it as a heap.
Example:
nums = [10, 20, 15, 30, 40]
heapq.heapify(nums)
print("Heapified list:", nums)
Output:
Heapified list: [10, 20, 15, 30, 40]
6. heapq.merge(*iterables, key=None, reverse=False)
- Merges multiple sorted inputs into a single sorted iterator.
- More efficient than concatenating and sorting manually.
Example:
nums1 = [1, 3, 5]
nums2 = [2, 4, 6]
result = heapq.merge(nums1, nums2)
print("Merged sorted result:", list(result))
Output:
Merged sorted result: [1, 2, 3, 4, 5, 6]
7. heapq.nlargest(n, iterable, key=None)
- Returns the
nlargest elements of the dataset. - Parameters:
n: Number of elements to return.iterable: The data to search through.key: A function to extract a comparison key (optional).
Example:
nums = [10, 20, 15, 30, 40]
largest = heapq.nlargest(2, nums)
print("Two largest elements:", largest)
Output:
Two largest elements: [40, 30]
8. heapq.nsmallest(n, iterable, key=None)
- Returns the
nsmallest elements from the dataset.
Example:
nums = [10, 20, 15, 30, 40]
smallest = heapq.nsmallest(2, nums)
print("Two smallest elements:", smallest)
Output:
Two smallest elements: [10, 15]
Using Heaps for Custom Objects
To use a heap with custom objects, you could also specify a key function at the time of sorting or heapifying. Another option is using a tuple in which the first element serves as the priority.
Example using a Custom Key:
class Task:
def __init__(self, priority, description):
self.priority = priority
self.description = description
def __repr__(self):
return f"Task({self.priority}, {self.description})"
tasks = [Task(3, 'low'), Task(1, 'high'), Task(2, 'medium')]
# Convert tasks into a heap
heap = [(task.priority, task) for task in tasks]
heapq.heapify(heap)
# Access tasks by priority
smallest_task = heapq.heappop(heap)
print("Task with highest priority:", smallest_task[1])
Output:
Task with highest priority: Task(1, high)
Summary
The heapq module is very important to manage data in a priority queue format efficiently. Mastering its functions will allow you to:
- Handle priority-based scheduling.
- Retrieve the smallest/largest elements quickly.
- Merge and sort data correctly.