Python has several built-in data types, such as int, str, list, tuple, and dict. Python's collections module builds on these built-in types by providing several additional container types.

namedtuple(): a factory function that creates tuple subclasses whose fields can be accessed by name
deque: a list-like container with fast appends and pops at both ends
Counter: a dict subclass for counting hashable objects
OrderedDict: a dict subclass that remembers the order in which entries were added
defaultdict: a dict subclass that calls a factory function to supply default values for missing keys
UserDict: a wrapper around dictionary objects that simplifies dict subclassing
UserList: a wrapper around list objects that simplifies list subclassing
UserString: a wrapper around string objects that simplifies string subclassing
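
As a rough illustration of how UserDict simplifies subclassing, here is a minimal sketch. The class name LowerKeyDict and its lowercasing behavior are hypothetical, chosen only for this example. UserDict keeps its contents in a regular dict exposed as the data attribute, and both the constructor and update() route through __setitem__, so overriding that one method is enough here.

```python
from collections import UserDict


class LowerKeyDict(UserDict):
    """Hypothetical example: a dictionary that lowercases its string keys."""

    def __setitem__(self, key, value):
        # Normalize the key, then let UserDict store it in self.data
        super().__setitem__(key.lower(), value)


d = LowerKeyDict({'Name': 'Ada'})  # the constructor goes through __setitem__
d.update(AGE=36)                   # so does update()
d['City'] = 'London'

print(d.data)  # {'name': 'Ada', 'age': 36, 'city': 'London'}
```

Subclassing the built-in dict directly would not give the same result with this little code, because dict's constructor and update() do not call an overridden __setitem__.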

namedtuple()

namedtuple creates tuple subclasses whose fields can be accessed by name as well as by index. It is often used to make code more readable, because an attribute such as website.url says more than website[1].

# -*- coding: utf-8 -*-

from collections import namedtuple

websites = [
    ('百度', 'https://www.baidu.com/', '李彦宏'),
    ('阿里', 'https://www.taobao.com/', '马云'),
    ('腾讯', 'http://www.qq.com/', '马化腾')
]

Website = namedtuple('Website', ['name', 'url', 'founder'])

for row in websites:
    # _make builds a Website from any iterable of field values
    website = Website._make(row)
    print(website)

# Output:
# Website(name='百度', url='https://www.baidu.com/', founder='李彦宏')
# Website(name='阿里', url='https://www.taobao.com/', founder='马云')
# Website(name='腾讯', url='http://www.qq.com/', founder='马化腾')
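
To show what access by name looks like in practice, here is a small sketch that reuses the Website type from the example above. The field values are illustrative only.

```python
from collections import namedtuple

Website = namedtuple('Website', ['name', 'url', 'founder'])

site = Website(name='Baidu', url='https://www.baidu.com/', founder='Robin Li')

# A field can be read by name or by position
by_name = site.url
by_index = site[1]

# namedtuples are immutable; _replace returns a modified copy
updated = site._replace(url='https://baidu.com/')

# _asdict converts the record to a dictionary
as_dict = site._asdict()

print(by_name == by_index)  # True
print(updated.url)          # https://baidu.com/
print(site.url)             # https://www.baidu.com/ (the original is unchanged)
```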

deque

deque is short for double-ended queue. Its biggest benefit is that objects can be added to and removed from the head of the queue quickly, using .appendleft() and .popleft().

You may say, the native list can also add and remove objects from the head, right? Like this.

l.insert(0, v)
l.pop(0)

However, both of these list operations are O(n): every remaining element has to be shifted, so the time taken grows linearly with the number of elements. The deque equivalents, .appendleft() and .popleft(), are O(1). deque also provides some other useful methods, such as rotate.

from collections import deque

q = deque(['a', 'b', 'c'])
q.append('x')
q.appendleft('y')
print(q)

# Output:
# deque(['y', 'a', 'b', 'c', 'x'])
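
The other methods mentioned above can be sketched as follows. This example covers popleft and rotate, plus the optional maxlen argument, which the text does not discuss but which is part of the standard deque constructor.

```python
from collections import deque

q = deque(['a', 'b', 'c', 'd'])

first = q.popleft()  # removes and returns 'a' in O(1)
q.rotate(1)          # shift every element one step to the right
rotated = list(q)
print(first, rotated)  # a ['d', 'b', 'c']

# With maxlen set, appending to a full deque drops items from the other end
recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)
print(list(recent))  # [2, 3, 4]
```

A bounded deque like this is a convenient way to keep only the most recent items of a stream.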

Counter

Counter counts hashable objects, for example the number of occurrences of each character in a string.

# -*- coding: utf-8 -*-
from collections import Counter

s = '''A Counter is a dict subclass for counting hashable objects. 
It is an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values. 
Counts are allowed to be any integer value including zero or negative counts. 
The Counter class is similar to bags or multisets in other languages.'''

c = Counter(s)
# Get the 5 most common characters
print(c.most_common(5))

# Output:
# [(' ', 54), ('e', 32), ('s', 25), ('a', 24), ('t', 24)]
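
Counter works on any iterable of hashable items, not just characters. The sketch below counts words instead and shows two behaviors worth knowing: a missing key yields 0 rather than raising KeyError, and counters support addition and subtraction. The sample sentence is made up for illustration.

```python
from collections import Counter

words = 'the quick brown fox jumps over the lazy dog the end'.split()
word_counts = Counter(words)

top = word_counts.most_common(1)
missing = word_counts['cat']  # a missing key counts as 0, no KeyError
print(top, missing)  # [('the', 3)] 0

a = Counter(a=3, b=1)
b = Counter(a=1, b=2)
total = a + b  # counts are added per key
diff = a - b   # results that are zero or negative are dropped
print(total)  # Counter({'a': 4, 'b': 3})
print(diff)   # Counter({'a': 2})
```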

OrderedDict

An OrderedDict is like a regular dictionary, but with some extra features for reordering its entries. It has become less important now that the built-in dict class remembers insertion order (a behavior guaranteed since Python 3.7).
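
A brief sketch of the extra features that a plain dict still lacks: move_to_end, popitem with a last argument, and order-sensitive equality. The keys and values are arbitrary.

```python
from collections import OrderedDict

od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])

od.move_to_end('a')              # order is now b, c, a
od.move_to_end('c', last=False)  # order is now c, b, a
keys = list(od)
print(keys)  # ['c', 'b', 'a']

oldest = od.popitem(last=False)  # pop from the front instead of the back
print(oldest)  # ('c', 3)

# Comparing two OrderedDicts takes order into account; plain dicts do not
ordered_equal = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
plain_equal = dict(a=1, b=2) == dict(b=2, a=1)
print(ordered_equal, plain_equal)  # False True
```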

defaultdict

We all know that with Python's native dict, accessing a key that doesn't exist with something like d[key] raises a KeyError exception. With defaultdict, however, if you pass in a default factory function, then requesting a non-existent key calls the factory, stores the result under that key, and returns it as the default value.

# -*- coding: utf-8 -*-
from collections import defaultdict

members = [
    # Sex, name
    ['male', 'John'],
    ['male', 'Jack'],
    ['female', 'Lily'],
    ['male', 'Pony'],
    ['female', 'Lucy'],
]

result = defaultdict(list)
for sex, name in members:
    result[sex].append(name)

print(result)

# Result:
# defaultdict(<class 'list'>, {'male': ['John', 'Jack', 'Pony'], 'female': ['Lily', 'Lucy']})
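
The same mechanism works with other factories. The sketch below contrasts a plain dict with defaultdict(int), a common counting idiom, and shows one side effect to be aware of: looking up a missing key with [] inserts it, while .get() does not. The sample string is arbitrary.

```python
from collections import defaultdict

plain = {}
try:
    plain['missing']
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# int() returns 0, so every new key starts counting from zero
counts = defaultdict(int)
for ch in 'banana':
    counts[ch] += 1
print(dict(counts))  # {'b': 1, 'a': 3, 'n': 2}

# Reading a missing key with [] inserts it; .get() does not
_ = counts['z']
inserted = 'z' in counts
not_inserted = counts.get('q') is None and 'q' not in counts
print(inserted, not_inserted)  # True True
```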