I have a problem

I wrote code that looks roughly like the following. After both appends, every element in the list ends up exactly the same.

>>> students_list = []
>>> student = {}
>>> student["name"] = "zhangsan"
>>> student["age"] = 18
>>> students_list.append(student)
>>> student["name"] = "zhaosi"
>>> student["age"] = 25
>>> students_list.append(student)
>>> print(students_list)
[{'name': 'zhaosi', 'age': 25}, {'name': 'zhaosi', 'age': 25}]

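Reason: append() does not copy the dictionary; it stores a reference to the same dictionary object in the list. Assigning to a key then modifies that one object in place, so every list element that refers to it shows the new values. That is why the first entry appears to be overwritten. Comparing id() values confirms that student and students_list[0] are the same object: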

>>> students_list = []
>>> student = {}
>>> student["name"] = "zhaosi"
>>> student["age"] = 25
>>> students_list.append(student)
>>> id(student)
140240891529024
>>> id(students_list[0])
140240891529024

Solution: Append a copy of the dictionary instead, using copy() or copy.deepcopy(). Here copy() is enough because the values are immutable strings and integers.

>>> students_list = []
>>> student = {}
>>> student["name"] = "zhangsan"
>>> student["age"] = 18
>>> students_list.append(student.copy())
>>> student["name"] = "zhaosi"
>>> student["age"] = 25
>>> students_list.append(student.copy())
>>> print(students_list)
[{'name': 'zhangsan', 'age': 18}, {'name': 'zhaosi', 'age': 25}]
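
Another way to avoid the shared reference is to build a new dictionary for each student, so that there is nothing to copy. This is only a sketch of that alternative, not part of the original code; the build_students helper and the records argument are hypothetical names chosen for illustration.

```python
def build_students(records):
    """Return a list with one independent dict per (name, age) pair."""
    students_list = []
    for name, age in records:
        # A new dict object is created on every iteration,
        # so no two list elements refer to the same dict.
        student = {"name": name, "age": age}
        students_list.append(student)
    return students_list


students_list = build_students([("zhangsan", 18), ("zhaosi", 25)])
print(students_list)
# [{'name': 'zhangsan', 'age': 18}, {'name': 'zhaosi', 'age': 25}]
```

Which approach reads better probably depends on whether the dictionary is built in one place or filled in step by step.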

What are shallow copy and deep copy

The difference between a deep copy and a shallow copy is often described as the difference between passing a value and passing a reference. In Python it helps to separate three cases:

  • Deep copy creates a new object and recursively copies every object nested inside it, so the copy and the original are completely independent. Changing one never affects the other.

  • Shallow copy creates a new outer object but fills it with references to the same nested objects as the original. Adding or removing top-level elements affects only one of them, but a change to a shared nested object is visible through both.

  • Plain assignment makes no copy at all. Both names refer to the same object, so any change made through one name is visible through the other.
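
In Python, the difference between plain assignment, a shallow copy, and a deep copy can be checked with the is operator, which tests whether two names refer to the same object. The following is a minimal sketch; the variable names are chosen only for illustration.

```python
import copy

original = [[1, 2], [3, 4]]

alias = original                  # plain assignment: no copy at all
shallow = copy.copy(original)     # new outer list, same inner lists
deep = copy.deepcopy(original)    # new outer list and new inner lists

print(alias is original)          # True
print(shallow is original)        # False
print(shallow[0] is original[0])  # True
print(deep is original)           # False
print(deep[0] is original[0])     # False
```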

Assignment of objects in Python

Python passes object references in many places, which is very different from C/C++, where values are passed by default unless a parameter is declared as a reference.

In Python, assigning an object to another variable does not copy the object; Python only copies the reference, so both variables point to the same object.

Some functions and methods in the Python standard library store references in the same way. If you overlook this, it is easy to end up with bugs that are hard to explain.
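
A small sketch of this behavior follows. The add_student function and the name "wangwu" are hypothetical, used only to show that both assignment and argument passing hand over a reference, while rebinding a name leaves the original object alone.

```python
def add_student(roster, name):
    # `roster` refers to the caller's list, so append() mutates it in place.
    roster.append(name)


a = ["zhangsan"]
b = a                 # b is a second name for the same list
add_student(b, "zhaosi")

print(a)              # ['zhangsan', 'zhaosi']
print(a is b)         # True

b = ["wangwu"]        # rebinding b points it at a new list; a is untouched
print(a)              # ['zhangsan', 'zhaosi']
```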

append() function

To append an element to the end of a Python list, we usually use the append() method. Note that append() does not copy its argument; it stores a reference to the object it is given.

This is exactly what happened in the example at the beginning.
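
The same effect can be reproduced with a list of lists. This is an illustrative sketch with made-up names (row, grid, fixed): appending the same object twice stores two references to it, while appending a copy stores a separate object each time.

```python
row = [0, 0]
grid = []
grid.append(row)      # stores a reference to row
grid.append(row)      # stores the same reference again

row[0] = 1            # mutate the one shared list

print(grid)                   # [[1, 0], [1, 0]]
print(grid[0] is grid[1])     # True

# Appending a copy instead stores a separate list each time.
fixed = []
fixed.append(row.copy())
row[0] = 2
fixed.append(row.copy())
print(fixed)                  # [[1, 0], [2, 0]]
```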

When to use deep copy

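Deep copy is needed when the object being copied contains other mutable objects, such as a list nested inside a list, and the copy must be fully independent of the original.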
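
Applied to the student dictionary from the beginning, the distinction matters as soon as a value is itself mutable. The sketch below assumes a hypothetical "scores" key holding a list, which is not part of the original example.

```python
import copy

student = {"name": "zhangsan", "scores": [90, 85]}

shallow = student.copy()          # new dict, but "scores" is the same list
deep = copy.deepcopy(student)     # new dict and a new "scores" list

student["scores"].append(70)      # mutate the nested list
student["name"] = "zhaosi"        # rebind a top-level key

print(shallow)  # {'name': 'zhangsan', 'scores': [90, 85, 70]}
print(deep)     # {'name': 'zhangsan', 'scores': [90, 85]}
```

The shallow copy keeps its own "name" but still sees the appended score, because only the outer dictionary was duplicated.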

Classic example

Example 1

old_list = [[1, 2, 3], [4, 5, 6], [7, 8, 'a']]
new_list = old_list

new_list[2][2] = 9

print('Old List:', old_list)
print('ID of Old List:', id(old_list))

print('New List:', new_list)
print('ID of New List:', id(new_list))

Output:

Old List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of Old List: 140425275713928
New List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of New List: 140425275713928

From the output you can see that the two variables old_list and new_list share the same id, i.e. 140425275713928.

So if you change any value through new_list or old_list, the change is visible through both.

Sometimes, however, you want to leave the original values unchanged and modify only the new ones, or vice versa. Python offers two ways to create a copy:

  1. shallow copy
  2. deep copy

Both are provided by the copy module.
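
Besides copy.copy(), lists have a few built-in ways to make a shallow copy. The sketch below lists the ones I am aware of and checks that each produces a new outer list that still shares the nested lists; it is an aside and not needed for the examples that follow.

```python
import copy

old_list = [[1, 1, 1], [2, 2, 2]]

copies = [
    copy.copy(old_list),   # copy module
    old_list.copy(),       # list method
    old_list[:],           # full slice
    list(old_list),        # constructor
]

for c in copies:
    # Each result is a new outer list that still shares the inner lists.
    print(c is old_list, c[0] is old_list[0])   # False True
```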

Example 2

import copy

old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.copy(old_list)

old_list.append([4, 4, 4])
new_list.append([5, 5, 5])

print("Old list:", old_list)
print("New list:", new_list)

Output:

Old list: [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3], [5, 5, 5]]

In the above program, we created new_list as a shallow copy of old_list. We then appended a new element to each list, and the output shows that the two outer lists are independent of each other.

Example 3

import copy

old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.copy(old_list)

old_list[1][1] = 'AA'

print("Old list:", old_list)
print("New list:", new_list)

Output:

Old list: [[1, 1, 1], [2, 'AA', 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 'AA', 2], [3, 3, 3]]

In the above program, we again created new_list as a shallow copy of old_list. This time we modified an element of a nested list inside old_list, and new_list changed along with it. This is because both lists hold references to the same nested lists.

Example 4

import copy

old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.deepcopy(old_list)

old_list[1][0] = 'BB'

print("Old list:", old_list)
print("New list:", new_list)

Output:

Old list: [[1, 1, 1], ['BB', 2, 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

As you can see, after using a deep copy, modifying the elements of old_list no longer affects new_list. The two lists are fully independent because deepcopy() copies old_list recursively, including all nested objects.