Python Collection Module
The Python collections
module is part of the standard library composed of specialized container data types. With these data types, extending the functionalities of built-in Python types can be easily accomplished to ease the implementation of different tasks with minimal code. Here is a detailed description of the top-most commonly used features available in the collections module:
1. Counter
A counter
is a dictionary subclass for counting hashable objects. It counts how many times each item has occurred.
Usage
from collections import Counter
# Example
data = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
counter = Counter(data)
print(counter) # Output: Counter({'apple': 3, 'banana': 2, 'orange': 1})
# Accessing counts
print(counter['apple']) # Output: 3
print(counter['grape']) # Output: 0 (returns 0 for missing items)
# Common methods
print(counter.most_common(2)) # Output: [('apple', 3), ('banana', 2)]
2. defaultdict
A defaultdict
is a dictionary that returns a default value for missing keys. The default value is specified by a factory function.
Usage
from collections import defaultdict
# Example
default_dict = defaultdict(int)
default_dict['a'] += 1
print(default_dict) # Output: defaultdict(<class 'int'>, {'a': 1})
# Using list as default factory
default_dict_list = defaultdict(list)
default_dict_list['key'].append(10)
print(default_dict_list) # Output: defaultdict(<class 'list'>, {'key': [10]})
3. namedtuple
A namedtuple
is a light, immutable, data structure that provides easy access to fields by the name rather than an index.
Usage
from collections import namedtuple
# Define a named tuple
Point = namedtuple('Point', ['x', 'y'])
# Create instances
p = Point(10, 20)
print(p.x, p.y) # Output: 10 20
# Accessing fields
print(p[0], p[1]) # Output: 10 20
# Convert to dictionary
print(p._asdict()) # Output: {'x': 10, 'y': 20}
4. OrderedDict
An OrderedDict
remembers the order in which keys were inserted. (From Python 3.7, regular dictionaries also maintain order, but OrderedDict
offers some additional functionality.)
Usage
from collections import OrderedDict
# Example
ordered_dict = OrderedDict()
ordered_dict['a'] = 1
ordered_dict['b'] = 2
ordered_dict['c'] = 3
print(ordered_dict) # Output: OrderedDict([('a', 1), ('b', 2), ('c', 3)])
5. deque
A deque
or double-ended queue is more optimized for appending and popping from both ends. More efficient than a list in such operations.
Usage
from collections import deque
# Example
d = deque([1, 2, 3])
d.append(4) # Add to the right
d.appendleft(0) # Add to the left
print(d) # Output: deque([0, 1, 2, 3, 4])
# Pop from both ends
d.pop() # Removes from the right
d.popleft() # Removes from the left
6. ChainMap
A ChainMap
groups multiple dictionaries into a single, unified view.
Usage
from collections import ChainMap
# Example
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 3, 'c': 4}
chain = ChainMap(dict1, dict2)
print(chain['b']) # Output: 2 (first dict takes precedence)
7. UserDict
A UserDict
is a wrapper around dictionary objects, allowing you to customize their behavior by subclassing.
Usage
from collections import UserDict
class MyDict(UserDict):
def __setitem__(self, key, value):
if not isinstance(key, str):
raise TypeError("Keys must be strings")
super().__setitem__(key, value)
# Example
d = MyDict()
d['key'] = 'value'
# d[10] = 'value' # Raises TypeError
8. UserList
Similar to UserDict
, a UserList
is a wrapper around list objects that allows customization.
Usage
from collections import UserList
class MyList(UserList):
def append(self, item):
if not isinstance(item, int):
raise TypeError("Only integers are allowed")
super().append(item)
# Example
ml = MyList([1, 2, 3])
ml.append(4)
# ml.append('a') # Raises TypeError
9. UserString
A UserString
is a wrapper around string objects that allows customization.
Usage
from collections import UserString
class MyString(UserString):
def append(self, s):
self.data += s
# Example
ms = MyString("Hello")
ms.append(" World")
print(ms) # Output: Hello World
Key Benefits of collections
Module:
- Efficiency: Many data structures are optimized for specific use cases.
- Readability: Code becomes more readable and intent is clear.
- Functionality: Provide specialized tools for difficult tasks that cannot be performed by standard data types.