Python Collections Module

The Python collections module is part of the standard library and provides specialized container data types. These types extend the built-in containers (dict, list, tuple, and str) so that common tasks can be implemented with less code. The most commonly used features of the module are described below:

1. Counter

A Counter is a dictionary subclass for counting hashable objects. It maps each item to the number of times it has occurred.

Usage

from collections import Counter

# Example
data = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
counter = Counter(data)
print(counter)  # Output: Counter({'apple': 3, 'banana': 2, 'orange': 1})

# Accessing counts
print(counter['apple'])  # Output: 3
print(counter['grape'])  # Output: 0 (returns 0 for missing items)

# Common methods
print(counter.most_common(2))  # Output: [('apple', 3), ('banana', 2)]
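
Counters also support arithmetic and in-place updates, which is handy when tallies arrive in batches. The following is a minimal sketch; the names morning and evening are made up for illustration.

```python
from collections import Counter

# Two hypothetical batches of observations
morning = Counter(['apple', 'banana', 'apple'])
evening = Counter(['banana', 'orange'])

# Adding two counters sums the counts per key
total = morning + evening
print(total['banana'])  # 2

# Subtracting keeps only keys whose resulting count is positive
diff = morning - evening
print(dict(diff))  # {'apple': 2}

# update() adds counts in place from another iterable
morning.update(['apple', 'kiwi'])
print(morning['apple'])  # 3
```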

2. defaultdict

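A defaultdict is a dictionary subclass that supplies a default value for missing keys. When a missing key is accessed with d[key], the factory function passed to the constructor is called to create the value, and the key is inserted into the dictionary.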

Usage

from collections import defaultdict

# Example
default_dict = defaultdict(int)
default_dict['a'] += 1
print(default_dict)  # Output: defaultdict(<class 'int'>, {'a': 1})

# Using list as default factory
default_dict_list = defaultdict(list)
default_dict_list['key'].append(10)
print(default_dict_list)  # Output: defaultdict(<class 'list'>, {'key': [10]})
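
A common use of the list factory is grouping items by some key. The sketch below groups words by first letter and also shows one thing to watch for: indexing a missing key inserts it, while get() does not. The word list is illustrative.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# Group words by their first letter; no need to check whether the key exists
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# Indexing a missing key calls the factory and stores the result
_ = by_letter['z']
print('z' in by_letter)  # True

# get() looks up a key without inserting it
print(by_letter.get('q'))  # None
print('q' in by_letter)    # False
```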

3. namedtuple

A namedtuple is a lightweight, immutable data structure that provides access to fields by name as well as by index.

Usage

from collections import namedtuple

# Define a named tuple
Point = namedtuple('Point', ['x', 'y'])

# Create instances
p = Point(10, 20)
print(p.x, p.y)  # Output: 10 20

# Accessing fields
print(p[0], p[1])  # Output: 10 20

# Convert to dictionary
print(p._asdict())  # Output: {'x': 10, 'y': 20}
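
Because a namedtuple is immutable, "changing" a field means building a new instance. The sketch below illustrates this with _replace() and also shows default field values; Point3D is a hypothetical type introduced only for this example, and the defaults argument assumes Python 3.7 or later.

```python
from collections import namedtuple

# defaults are applied to the rightmost fields (here only z); Python 3.7+
Point3D = namedtuple('Point3D', ['x', 'y', 'z'], defaults=[0])

p = Point3D(1, 2)
print(p)  # Point3D(x=1, y=2, z=0)

# Fields cannot be reassigned
try:
    p.x = 5
except AttributeError:
    print("fields are read-only")

# _replace() returns a new instance and leaves the original untouched
moved = p._replace(x=5)
print(moved)  # Point3D(x=5, y=2, z=0)
print(p)      # Point3D(x=1, y=2, z=0)

# Instances unpack like ordinary tuples
x, y, z = moved
```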

4. OrderedDict

An OrderedDict remembers the order in which keys were inserted. (Since Python 3.7, regular dictionaries also preserve insertion order, but OrderedDict offers additional functionality, such as move_to_end() and order-sensitive equality.)

Usage

from collections import OrderedDict

# Example
ordered_dict = OrderedDict()
ordered_dict['a'] = 1
ordered_dict['b'] = 2
ordered_dict['c'] = 3
print(ordered_dict)  # Output: OrderedDict([('a', 1), ('b', 2), ('c', 3)])
# On Python 3.12 and later the repr is OrderedDict({'a': 1, 'b': 2, 'c': 3})
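
The features that set OrderedDict apart from a regular dict are mostly about reordering. A short sketch of the main ones follows; the variable names are illustrative.

```python
from collections import OrderedDict

od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])

# Move an existing key to the end
od.move_to_end('a')
print(list(od))  # ['b', 'c', 'a']

# Move an existing key to the front
od.move_to_end('c', last=False)
print(list(od))  # ['c', 'b', 'a']

# Pop from the front (FIFO) instead of the default back (LIFO)
oldest = od.popitem(last=False)
print(oldest)  # ('c', 3)

# Equality between two OrderedDicts is order-sensitive; between dicts it is not
print(OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1))  # False
print(dict(a=1, b=2) == dict(b=2, a=1))                # True
```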

5. deque

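A deque, or double-ended queue, is optimized for appending and popping at both ends. These operations take roughly constant time on a deque, whereas inserting or removing at the front of a list takes time proportional to the length of the list.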

Usage

from collections import deque

# Example
d = deque([1, 2, 3])
d.append(4)  # Add to the right
d.appendleft(0)  # Add to the left
print(d)  # Output: deque([0, 1, 2, 3, 4])

# Pop from both ends
d.pop()  # Removes from the right
d.popleft()  # Removes from the left
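
Two other deque features are often useful: a bounded length and rotation. The sketch below shows both; the names recent and ring are illustrative.

```python
from collections import deque

# With maxlen set, appending to a full deque discards items from the other end
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
print(recent)  # deque([2, 3, 4], maxlen=3)

# rotate(n) shifts items n steps to the right; a negative n shifts left
ring = deque([1, 2, 3, 4, 5])
ring.rotate(2)
print(ring)  # deque([4, 5, 1, 2, 3])
ring.rotate(-2)
print(ring)  # deque([1, 2, 3, 4, 5])
```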

6. ChainMap

A ChainMap groups multiple dictionaries into a single, unified view. Lookups search the underlying dictionaries in order and return the first match.

Usage

from collections import ChainMap

# Example
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 3, 'c': 4}
chain = ChainMap(dict1, dict2)
print(chain['b'])  # Output: 2 (first dict takes precedence)
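
A typical use is layering overrides on top of defaults. The sketch below also shows that writes go only to the first mapping, that the view reflects later changes to the underlying dictionaries, and that new_child() adds a temporary layer. The settings keys are made up for illustration.

```python
from collections import ChainMap

defaults = {'color': 'blue', 'size': 'M'}
overrides = {'color': 'red'}
settings = ChainMap(overrides, defaults)

print(settings['color'])  # red  (found in overrides)
print(settings['size'])   # M    (falls through to defaults)

# Writes and deletions affect only the first mapping
settings['size'] = 'L'
print(overrides)  # {'color': 'red', 'size': 'L'}
print(defaults)   # {'color': 'blue', 'size': 'M'}

# The ChainMap is a live view: changes to the underlying dicts show through
defaults['font'] = 'mono'
print(settings['font'])  # mono

# new_child() returns a new ChainMap with an extra mapping in front
scoped = settings.new_child({'color': 'green'})
print(scoped['color'])    # green
print(settings['color'])  # red
```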

7. UserDict

A UserDict is a wrapper around dictionary objects that lets you customize their behavior by subclassing. The contents are stored in a regular dictionary available through the data attribute.

Usage

from collections import UserDict

class MyDict(UserDict):
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("Keys must be strings")
        super().__setitem__(key, value)

# Example
d = MyDict()
d['key'] = 'value'
# d[10] = 'value'  # Raises TypeError
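
One practical reason to subclass UserDict rather than dict is that its other methods, such as update() and the constructor, are routed through __setitem__. The sketch below compares the two under CPython; StrKeyDict and StrKeyPlainDict are hypothetical names used only for this comparison.

```python
from collections import UserDict

class StrKeyDict(UserDict):
    """UserDict subclass that accepts only string keys."""
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("Keys must be strings")
        super().__setitem__(key, value)

class StrKeyPlainDict(dict):
    """Same check, but on a direct dict subclass."""
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("Keys must be strings")
        super().__setitem__(key, value)

# UserDict.update() assigns item by item, so the check runs
checked = StrKeyDict()
try:
    checked.update({10: 'value'})
    user_dict_rejected = False
except TypeError:
    user_dict_rejected = True
print(user_dict_rejected)  # True

# In CPython, dict.update() does not call the overridden __setitem__
plain = StrKeyPlainDict()
plain.update({10: 'value'})
print(10 in plain)  # True
```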

8. UserList

Similar to UserDict, a UserList is a wrapper around list objects that allows customization.

Usage

from collections import UserList

class MyList(UserList):
    def append(self, item):
        if not isinstance(item, int):
            raise TypeError("Only integers are allowed")
        super().append(item)

# Example
ml = MyList([1, 2, 3])
ml.append(4)
# ml.append('a')  # Raises TypeError
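
Note that overriding append() alone validates only calls to append(); the constructor, insert(), and extend() write to the underlying list directly. The sketch below covers those paths as well. IntList and its _check helper are hypothetical names, and the class is not exhaustive: item assignment and += are still unchecked.

```python
from collections import UserList

class IntList(UserList):
    """List that checks items added via the constructor, append, insert, extend."""

    @staticmethod
    def _check(item):
        # Hypothetical helper: return the item if it is an int, else raise
        if not isinstance(item, int):
            raise TypeError("Only integers are allowed")
        return item

    def __init__(self, initlist=None):
        super().__init__()
        if initlist is not None:
            self.extend(initlist)  # route initial items through the check

    def append(self, item):
        super().append(self._check(item))

    def insert(self, index, item):
        super().insert(index, self._check(item))

    def extend(self, other):
        # Validate everything first so a bad item leaves the list unchanged
        super().extend([self._check(item) for item in other])

numbers = IntList([1, 2, 3])
numbers.insert(0, 0)
numbers.extend([4, 5])
print(numbers)  # [0, 1, 2, 3, 4, 5]

try:
    IntList([1, 'a'])
except TypeError as exc:
    print(exc)  # Only integers are allowed
```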

9. UserString

A UserString is a wrapper around string objects that allows customization.

Usage

from collections import UserString

class MyString(UserString):
    def append(self, s):
        # self.data holds the underlying str; += rebinds it to a new string
        self.data += s

# Example
ms = MyString("Hello")
ms.append(" World")
print(ms)  # Output: Hello World

Key Benefits of the collections Module:

  1. Efficiency: Several of the data structures are optimized for specific use cases, such as deque for operations at both ends.
  2. Readability: Code becomes more readable and its intent clearer.
  3. Functionality: The module provides specialized tools for tasks that would otherwise require extra code on top of the built-in types.