
Creating Copies of a List


Creating copies of an existing list is a common need in Python code. Having a copy ensures that when you change a given list, that change doesn’t affect the original data or the data in other copies.
Note
In Python, an object’s identity is an identifier that distinguishes it from every other object that exists at the same time. You can use the built-in id() function to get the identity of any Python object. In CPython, the reference implementation of Python, an object’s identity coincides with the memory address where the object is stored.
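As a quick illustration of the note above, here is a minimal sketch that compares identities with id() and with the is operator, which checks whether two names refer to the same object. The variable names are made up for this example:

```python
first = ["Canada", "Poland"]
second = first                 # A second name for the same list object
third = ["Canada", "Poland"]   # Equal content, but a separate list object

print(id(first) == id(second))  # True: same identity
print(first is second)          # True: "is" compares identities
print(first == third)           # True: "==" compares content
print(first is third)           # False: two distinct objects
```

Comparing with is is usually more readable than comparing the results of id() by hand.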
Python gives you two kinds of copies of an existing list. You can create either:
- A shallow copy
- A deep copy
Both types of copies have specific characteristics that will directly impact their behavior. In the following sections, you’ll learn how to create shallow and deep copies of existing lists in Python. First, you’ll take a glance at aliases, a related concept that can cause some confusion and lead to issues and bugs.
Aliases of a List
In Python, you can create aliases of variables using the assignment operator (=). Assignments don’t create copies of objects in Python. Instead, they create bindings between the variable and the object involved in the assignment. Therefore, when you have several aliases of a given list, changes in an alias will affect the rest of the aliases.
To illustrate how you can create aliases and how they work, consider the following example:
countries = ["United States", "Canada", "Poland", "Germany", "Austria"]
nations = countries
id(countries) == id(nations)
#
# True
countries[0] = "United States of America"
nations
#
# ['United States of America', 'Canada', 'Poland', 'Germany', 'Austria']
In this code snippet, the assignment nations = countries creates nations as an alias of countries. Note how both variables point to the same object, which you know because the object’s identity is the same. Then, the assignment countries[0] = "United States of America" updates the object at index 0 in countries. This change is reflected in the nations alias.
Assignment statements like nations = countries don’t create copies of the right-hand object. They just create aliases, or variables that point to the same underlying object.
In general, aliases can come in handy in situations where you need to avoid name collisions in your code or when you need to adapt the names to specific naming patterns.
To illustrate, say that you have an app that uses your list of countries as countries in one part of the code. The app requires the same list in another part of the code, but there’s already a variable called countries with other content.
If you want both pieces of code to work on the same list, then you can use nations as an alias for countries. A handy way to do this would be to use the as keyword for creating the alias through an implicit assignment, for example, when you import the list from another module.
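The app’s own module isn’t shown here, so the following sketch uses the standard library’s keyword module as a stand-in. Its kwlist attribute is a list, and the reserved_words alias is a name chosen just for this example:

```python
import keyword
from keyword import kwlist as reserved_words  # "as" binds the alias

# The import doesn't copy the list. Both names refer to the same object.
print(reserved_words is keyword.kwlist)  # True
print(type(reserved_words).__name__)     # list
```

Because reserved_words is only an alias, mutating it would also change keyword.kwlist, which is why the example only reads from it.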
Shallow Copies of a List
A shallow copy of an existing list is a new list containing references to the objects stored in the original list. In other words, when you create a shallow copy of a list, Python constructs a new list with a new identity. Then, it inserts references to the objects in the original list into the new list.
There are at least three different ways to create shallow copies of an existing list. You can use:
- The slicing operator ([:])
- The .copy() method
- The copy() function from the copy module
These three tools demonstrate equivalent behavior. So, to kick things off, you’ll start exploring the slicing operator:
countries = ["United States", "Canada", "Poland", "Germany", "Austria"]
nations = countries[:]
nations
#
# ['United States', 'Canada', 'Poland', 'Germany', 'Austria']
id(countries) == id(nations)
#
# False
The assignment nations = countries[:] creates nations as a shallow copy of countries by using the slicing operator with one colon only. This operation takes a slice from the beginning to the end of countries. In this case, nations and countries have different identities. They’re completely independent list objects.
However, the elements in nations are aliases of the elements in countries:
id(nations[0]) == id(countries[0])
#
# True
id(nations[1]) == id(countries[1])
#
# True
As you can see, items under the same index in both nations and countries share the same object identity. This means that you don’t have copies of the items. You’re really sharing them. This behavior allows you to save some memory when working with lists and their copies.
Now, how would this impact the behavior of both lists? If you reassigned an item in countries, would the change be reflected in nations? The code below will help you answer these questions:
countries[0] = "United States of America"
countries
#
# ['United States of America', 'Canada', 'Poland', 'Germany', 'Austria']
nations
#
# ['United States', 'Canada', 'Poland', 'Germany', 'Austria']
id(countries[0]) == id(nations[0])
#
# False
id(countries[1]) == id(nations[1])
#
# True
On the first line of this piece of code, you update the item at index 0 in countries. This change doesn’t affect the item at index 0 in nations. Now the first items in the lists are completely different objects with their own identities. The rest of the items, however, continue to share the same identity. So, they’re the same object in each case.
Because making copies of a list is such a common operation, the list class has a dedicated method for it. The method is called .copy(), and it returns a shallow copy of the target list:
countries = ["United States", "Canada", "Poland", "Germany", "Austria"]
nations = countries.copy()
nations
#
# ['United States', 'Canada', 'Poland', 'Germany', 'Austria']
id(countries) == id(nations)
#
# False
id(countries[0]) == id(nations[0])
#
# True
id(countries[1]) == id(nations[1])
#
# True
countries[0] = "United States of America"
countries
#
# ['United States of America', 'Canada', 'Poland', 'Germany', 'Austria']
nations
#
# ['United States', 'Canada', 'Poland', 'Germany', 'Austria']
Calling .copy() on countries gets you a shallow copy of this list. Now you have two different lists that share the same elements. Again, if you reassign an element in one of the lists, then the change won’t be reflected in the other.
You’ll find yet another tool for creating shallow copies of a list. The copy() function from the copy module allows you to do just that:
from copy import copy
countries = ["United States", "Canada", "Poland", "Germany", "Austria"]
nations = copy(countries)
nations
#
# ['United States', 'Canada', 'Poland', 'Germany', 'Austria']
id(countries) == id(nations)
#
# False
id(countries[0]) == id(nations[0])
#
# True
id(countries[1]) == id(nations[1])
#
# True
countries[0] = "United States of America"
countries
#
# ['United States of America', 'Canada', 'Poland', 'Germany', 'Austria']
nations
#
# ['United States', 'Canada', 'Poland', 'Germany', 'Austria']
When you feed copy() with a mutable container data type, such as a list, the function returns a shallow copy of the input object. This copy behaves the same as the previous shallow copies that you’ve built in this section.
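To see the three shallow-copy tools side by side, you can run a small comparison like the sketch below. The shallow_copies dictionary and its labels are illustrative names, not part of any library:

```python
from copy import copy

countries = ["United States", "Canada", "Poland", "Germany", "Austria"]

# One shallow copy per technique covered in this section
shallow_copies = {
    "slicing": countries[:],
    "list.copy()": countries.copy(),
    "copy.copy()": copy(countries),
}

for label, duplicate in shallow_copies.items():
    is_new_list = duplicate is not countries
    shares_items = all(a is b for a, b in zip(duplicate, countries))
    print(f"{label}: new list={is_new_list}, shared items={shares_items}")
```

Each technique reports a new list object whose items are the very same objects as those in countries.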
Deep Copies of a List
Sometimes you may need to build a complete copy of an existing list. In other words, you want a copy that creates a new list object and also creates new copies of the contained elements. In these situations, you’ll have to construct what’s known as a deep copy.
When you create a deep copy of a list, Python constructs a new list object and then inserts copies of the objects from the original list recursively.
To create a deep copy of an existing list, you can use the deepcopy() function from the copy module. Here’s an example of how this function works:
from copy import deepcopy
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix_copy = deepcopy(matrix)
id(matrix) == id(matrix_copy)
#
# False
id(matrix[0]) == id(matrix_copy[0])
#
# False
id(matrix[1]) == id(matrix_copy[1])
#
# False
In this example, you create a deep copy of your matrix list. Note how both the lists and their corresponding nested lists have different identities.
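To get a feel for what “recursively” means here, consider the following simplified sketch. The deep_copy_nested_lists() helper is hypothetical and only handles nested lists. The real deepcopy() also supports other types, keeps track of objects it has already copied so that self-referencing structures work, and honors custom .__deepcopy__() methods:

```python
from copy import deepcopy

def deep_copy_nested_lists(value):
    """Hypothetical helper: copy lists recursively, reuse everything else."""
    if isinstance(value, list):
        # Build a new list and recurse into each item
        return [deep_copy_nested_lists(item) for item in value]
    return value  # Non-list objects are shared, not copied

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
manual_copy = deep_copy_nested_lists(matrix)

print(manual_copy == deepcopy(matrix))  # True: same content
print(manual_copy is matrix)            # False: new outer list
print(manual_copy[0] is matrix[0])      # False: new nested list
```

In real code, prefer deepcopy() over a hand-written helper like this one.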
Why would you need to create a deep copy of matrix, anyway? For example, if you only create a shallow copy of matrix, then you can face some issues when trying to mutate the nested lists:
from copy import copy
matrix_copy = copy(matrix)
matrix_copy[0][0] = 100
matrix_copy[0][1] = 200
matrix_copy[0][2] = 300
matrix_copy
#
# [[100, 200, 300], [4, 5, 6], [7, 8, 9]]
matrix
#
# [[100, 200, 300], [4, 5, 6], [7, 8, 9]]
In this example, you create a shallow copy of matrix. If you change items in a nested list within matrix_copy, then those changes affect the original data in matrix. The way to avoid this behavior is to use a deep copy:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix_copy = deepcopy(matrix)
matrix_copy[0][0] = 100
matrix_copy[0][1] = 200
matrix_copy[0][2] = 300
matrix_copy
#
# [[100, 200, 300], [4, 5, 6], [7, 8, 9]]
matrix
#
# [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Now the changes in matrix_copy or any other deep copy don’t affect the content of matrix, as you can see in the final output, where matrix keeps its original values.
Finally, it’s important to note that when you have a list containing only immutable objects, such as numbers, strings, or tuples of immutable objects, the behavior of deepcopy() mimics what copy() does:
countries = ["United States", "Canada", "Poland", "Germany", "Austria"]
nations = deepcopy(countries)
id(countries) == id(nations)
#
# False
id(countries[0]) == id(nations[0])
#
# True
id(countries[1]) == id(nations[1])
#
# True
In this example, even though you use deepcopy(), the items in nations are aliases of the items in countries. That behavior makes sense because you can’t change immutable objects in place. Again, this behavior optimizes the memory consumption of your code when you’re working with multiple copies of a list.
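Tuples deserve a closer look, because a tuple can hold mutable objects even though the tuple itself is immutable. The sketch below, which uses made-up variable names, suggests that the standard library’s deepcopy() reuses a tuple only when everything inside it can be reused too:

```python
from copy import deepcopy

flat = ("Canada", 1, 2.5)      # Tuple holding only immutable objects
mixed = ("Canada", [1, 2, 3])  # Tuple holding a mutable list

flat_copy = deepcopy(flat)
mixed_copy = deepcopy(mixed)

print(flat_copy is flat)          # True: nothing needed copying
print(mixed_copy is mixed)        # False: a new tuple was built
print(mixed_copy[1] is mixed[1])  # False: the nested list was copied
print(mixed_copy[0] is mixed[0])  # True: the string is still shared
```

So a list of tuples is only safe to treat like a list of immutable items when the tuples themselves contain nothing mutable.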