Merge Sort in Python

What is Merge Sort?

Merge Sort is a divide-and-conquer approach-based recursive sorting algorithm. It works by splitting an array repeatedly into two halves, each half sorted individually, and then merged back together. The process is methodical, and the result is a sorted array.

Steps of Merge Sort

  1. Divide: Recursively divide the array into two halves until each subarray contains only one element (a single element is trivially sorted).
  2. Conquer: Combine the sorted subarrays to create new sorted arrays.
  3. Combine: Repeat merging until all subarrays become combined into one sorted array.

Merge Sort Algorithm

  1. Base Case: An array containing 0 or 1 elements is already sorted.
  2. Recursive Case:
    • Split the array into two halves.
    • Recursively apply merge sort to each half.
    • Merge the sorted halves.

Python Implementation

def merge_sort(arr):
    """
    Performs Merge Sort on the given list.
    
    :param arr: List of elements to be sorted
    :return: A sorted version of the list
    """
    # Base case: a list of 0 or 1 element is already sorted
    if len(arr) <= 1:
        return arr
    
    # Split the array into two halves
    mid = len(arr) // 2
    left_half = arr[:mid]
    right_half = arr[mid:]
    
    # Recursively sort each half
    sorted_left = merge_sort(left_half)
    sorted_right = merge_sort(right_half)
    
    # Merge the sorted halves
    return merge(sorted_left, sorted_right)

def merge(left, right):
    """
    Merges two sorted lists into one sorted list.
    
    :param left: The first sorted list
    :param right: The second sorted list
    :return: A merged and sorted list
    """
    merged = []
    i = j = 0  # Pointers for left and right lists
    
    # Compare elements from left and right and add the smallest one to merged
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    
    # If there are remaining elements in either list, append them
    merged.extend(left[i:])
    merged.extend(right[j:])
    
    return merged

# Example usage
unsorted_array = [38, 27, 43, 3, 9, 82, 10]
sorted_array = merge_sort(unsorted_array)
print("Sorted Array:", sorted_array)

How the Algorithm Works

  1. Splitting:
  • Recursively divides up the array so that only two-element arrays remain.

2. Merging:

  • The merge function combines two sorted arrays into a single sorted array. It compares the smallest elements of both arrays and appends the smaller one to the result.

3. Recursive Call Stack:

  • This recursion processes a subarray breaking it down in half until they can be broken down no more, at this point the merger begins combining them in sorted form.

Example Execution

Input: [38, 27, 43, 3, 9, 82, 10]

  1. Split into [38, 27, 43] and [3, 9, 82, 10].
  2. Further split [38, 27, 43] into [38] and [27, 43].
  3. Split [27, 43] into [27] and [43].
  4. Merge [27] and [43] into [27, 43].
  5. Merge [38] and [27, 43] into [27, 38, 43].
  6. Repeat for [3, 9, 82, 10]:
    • Merge [3, 9] and [10, 82] into [3, 9, 10, 82].
  7. Merge [27, 38, 43] and [3, 9, 10, 82] into [3, 9, 10, 27, 38, 43, 82].

Key Features of Merge Sort

  1. Time Complexity:
    • Best, Worst, and Average: O(nlogn), because:
      • Dividing the array into halves is O(logn).
      • Merging takes O(n).
  2. Space Complexity:
    • Merge Sort requires additional memory for temporary arrays, so its space complexity is O(n).
  3. Stability:
    • If the two are equal, they have their natural order in a sorted array.
  4. Recursive Nature:
    • Merge Sort heavily relies on recursion, which makes it easier to implement but may use a lot of stack memory for large arrays.

Advantages and Disadvantages

Advantages:

  • Efficient for large datasets.
  • Stable sorting algorithm.
  • It is suitable for linked lists because merging can be done without extra space.

Disadvantages:

  • Requires O(n) additional memory for arrays, which can be a drawback for memory-constrained environments.
  • Recursive calls can lead to high memory usage in systems with limited stack space.