Heap Sort in Python
Heap Sort is a simple and efficient sorting algorithm based on the binary heap data structure. It is split into two significant stages: constructing a max heap (or min heap, in case of a reversal in the sorting order) and then repeatedly extracting the largest (or smallest, in case of reverse sort order) element from the heap to build up the sorted array.
Here’s how Heap Sort works, described step-by-step in Python:
Key Concepts
- Binary Heap:
- A binary heap is a binary tree that has all the elements of the heap property satisfied.
- Max Heap: Parent node should be greater or equal to that of children node
- Min Heap: Parent should be lesser in value compared with children node
- Heap can typically be represented using an array that
- Left child of index
iis at2*i + 1. - Right child of index
iis at2*i + 2. - Parent of index
iis at(i-1)//2.
- Left child of index
2. Heapify Operation:
- It maintains the heap property. This is done recursively by comparing a node with its children and swapping them if necessary.
3. Algorithm Workflow:
- Build a Max Heap: Rearrange the input array into a max heap.
- Elements Extraction: Move the root to the end of the array repeatedly and decrease the size of the heap, then re-heapify the rest of the elements.
Python Implementation
Here’s the code for Heap Sort with detailed comments:
def heapify(arr, n, i):
"""
Ensures the subtree rooted at index 'i' follows the max heap property.
:param arr: List of elements
:param n: Size of the heap
:param i: Index of the root node of the subtree
"""
largest = i # Initialize largest as root
left = 2 * i + 1 # Left child
right = 2 * i + 2 # Right child
# If left child is larger than root
if left < n and arr[left] > arr[largest]:
largest = left
# If right child is larger than largest so far
if right < n and arr[right] > arr[largest]:
largest = right
# If largest is not root
if largest != i:
arr[i], arr[largest] = arr[largest], arr[i] # Swap
# Recursively heapify the affected subtree
heapify(arr, n, largest)
def heap_sort(arr):
"""
Perform Heap Sort on the given array.
:param arr: List of elements to be sorted
"""
n = len(arr)
# Build a max heap
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i)
# Extract elements from the heap
for i in range(n - 1, 0, -1):
arr[i], arr[0] = arr[0], arr[i] # Move current root to end
heapify(arr, i, 0) # Heapify the reduced heap
# Example Usage
if __name__ == "__main__":
data = [12, 11, 13, 5, 6, 7]
print("Original array:", data)
heap_sort(data)
print("Sorted array:", data)
Detailed Explanation of Code
- Building the Max Heap:
- Starting from the last non-leaf node and
heapify. - Nodes from index
n//2 - 1to0are non-leaf nodes. - This makes sure that the whole array obeys the max heap property.
2. Heapify Function:
- Check if the left or right child is greater than the root.
- If so, swap them and recursively call
heapifyon the affected subtree.
3. Sorting:
- Swap the root (largest element) with the last element of the array.
- Reduce the size of the heap by one.
- Re-heapify the remaining heap to maintain the max heap property.
4. Time Complexity:
- Building the heap: O(n).
- Extracting elements: O(nlogn).
- Total: O(nlogn).
5. Space Complexity:
- In-place algorithm: O(1) auxiliary space.
Output Example
Original array: [12, 11, 13, 5, 6, 7]
Sorted array: [5, 6, 7, 11, 12, 13]
This code sorts the array in ascending order. To sort in descending order, use a min heap instead of a max heap by adjusting the comparison operators in heapify.
Advantages and Disadvantages
Advantages:
- In-place sorting (requires no extra space).
- O(nlogn) worst-case time complexity.
Disadvantages:
- Not stable (relative order of equal elements may not be preserved).
- Performance is generally slower than Quick Sort for real-world inputs.
Heap Sort is a sorting algorithm that is both systematic and efficient, especially useful for datasets that don’t require stability and don’t have extra memory. It works well in scenarios like implementing priority queues or sorting large datasets in place.