When you modify a copied list in Python and the original list changes too, you're probably dealing with a shallow copy issue. This is one of the most common sources of unexpected bugs in Python, especially when working with nested data structures.
Step 1: Understanding the Error
Let's start with code that demonstrates this problem:
# Buggy code - shallow copy issue
original_list = [[1, 2, 3], [4, 5, 6]]
copied_list = original_list.copy()
# Modify the copied list
copied_list[0][0] = 999
print("Original:", original_list)
print("Copied:", copied_list)
Output:
Original: [[999, 2, 3], [4, 5, 6]]
Copied: [[999, 2, 3], [4, 5, 6]]
The original list changed even though we only modified the copied list. This happens because list.copy() creates a shallow copy, which only copies references to nested objects rather than the objects themselves.
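The distinction between replacing a top-level element and mutating a nested one is easy to miss, so here is a minimal sketch of both on the same shallow copy. The variable names are illustrative only.

```python
original = [[1, 2, 3], [4, 5, 6]]
shallow = original.copy()

# Rebinding a top-level slot only changes the copy's own outer list.
shallow[0] = [7, 8, 9]

# Mutating a nested list that is still shared changes it for both lists.
shallow[1].append(0)

print(original)  # [[1, 2, 3], [4, 5, 6, 0]]
print(shallow)   # [[7, 8, 9], [4, 5, 6, 0]]
```

Only the second operation leaks into the original, because only the second one touches an object that both lists reference.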
Here's another common scenario with dictionaries:
# Dictionary shallow copy bug
user_data = {
    'name': 'Alice',
    'scores': [85, 90, 92]
}
backup_data = user_data.copy()
backup_data['scores'].append(95)
print("Original scores:", user_data['scores'])
print("Backup scores:", backup_data['scores'])
Output:
Original scores: [85, 90, 92, 95]
Backup scores: [85, 90, 92, 95]
Both dictionaries share the same list object, so modifications affect both.
Step 2: Identifying the Cause
Python provides multiple ways to copy objects, but they work differently:
Shallow copy creates a new object but inserts references to the objects found in the original. Methods that create shallow copies include:
list.copy(), dict.copy(), slice notation (list[:]), and copy.copy().
Deep copy creates a new object and recursively copies all objects found in the original, creating completely independent copies.
The problem occurs because shallow copy only goes one level deep. For nested structures like lists of lists or dictionaries containing lists, the inner objects are still shared between the original and the copy.
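The shallow-copy methods named above can be checked side by side. This is a small sketch with made-up sample data, not an exhaustive test: it confirms that each method builds a new outer container while leaving the inner objects shared.

```python
import copy

original = [[1, 2], [3, 4]]

# Each entry is one of the shallow-copy techniques applied to the same list.
candidates = {
    "original.copy()": original.copy(),
    "original[:]": original[:],
    "copy.copy(original)": copy.copy(original),
}

for label, candidate in candidates.items():
    new_outer = candidate is not original
    shared_inner = all(a is b for a, b in zip(candidate, original))
    print(f"{label}: new outer list={new_outer}, shared inner lists={shared_inner}")

# dict.copy() behaves the same way for dictionary values.
settings = {"tags": ["a", "b"]}
settings_copy = settings.copy()
print("dict.copy() shares values:", settings_copy["tags"] is settings["tags"])
```

Every line should report a new outer container and shared inner objects.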
The is operator makes the difference visible by comparing object identity at each level:
import copy
original = [[1, 2], [3, 4]]
# Shallow copy
shallow = original.copy()
print("Are outer lists same?", original is shallow) # False
print("Are inner lists same?", original[0] is shallow[0]) # True
# Deep copy
deep = copy.deepcopy(original)
print("Are outer lists same?", original is deep) # False
print("Are inner lists same?", original[0] is deep[0]) # False
Output:
Are outer lists same? False
Are inner lists same? True
Are outer lists same? False
Are inner lists same? False
The shallow copy creates a new outer list but references the same inner lists. The deep copy creates entirely new objects at every level.
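One detail worth knowing: copy.deepcopy() keeps track of objects it has already copied, so sharing that exists inside the original is reproduced inside the copy rather than duplicated. A minimal sketch, using an illustrative list that contains the same row twice:

```python
import copy

shared_row = [0, 0]
original = [shared_row, shared_row]  # the same list object appears twice

deep = copy.deepcopy(original)

print(deep[0] is original[0])  # False: the copy is independent of the original
print(deep[0] is deep[1])      # True: sharing inside the copy is preserved

deep[0][0] = 1
print(deep)      # [[1, 0], [1, 0]]
print(original)  # [[0, 0], [0, 0]]
```

So a deep copy is independent of the original, but it does not untangle aliasing that was already present in the data.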
Step 3: Implementing the Solution
The fix depends on your data structure. Use the copy module's deepcopy() function for nested structures:
import copy
# Fixed code - using deep copy
original_list = [[1, 2, 3], [4, 5, 6]]
copied_list = copy.deepcopy(original_list)
# Now modifications are independent
copied_list[0][0] = 999
print("Original:", original_list)
print("Copied:", copied_list)
Output:
Original: [[1, 2, 3], [4, 5, 6]]
Copied: [[999, 2, 3], [4, 5, 6]]
For the dictionary example:
import copy
user_data = {
    'name': 'Alice',
    'scores': [85, 90, 92]
}
# Deep copy the dictionary
backup_data = copy.deepcopy(user_data)
backup_data['scores'].append(95)
print("Original scores:", user_data['scores'])
print("Backup scores:", backup_data['scores'])
Output:
Original scores: [85, 90, 92]
Backup scores: [85, 90, 92, 95]
Now the lists are independent.
Working Code Examples
Example 1: Game State Management
This is a real-world scenario where shallow copy bugs commonly occur:
import copy
# Buggy version
class GameState:
    def __init__(self):
        self.board = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
        self.history = []
    def save_state_buggy(self):
        # Wrong: shallow copy
        self.history.append(self.board.copy())
    def save_state_fixed(self):
        # Correct: deep copy
        self.history.append(copy.deepcopy(self.board))
# Test the buggy version
game = GameState()
game.board[0][0] = 1
game.save_state_buggy()
game.board[1][1] = 2
game.save_state_buggy()
print("History (buggy):", game.history)
# Both states show [1, 0, 0], [0, 2, 0], [0, 0, 0] because each saved outer list still references the board's row lists
# Test the fixed version
game2 = GameState()
game2.board[0][0] = 1
game2.save_state_fixed()
game2.board[1][1] = 2
game2.save_state_fixed()
print("History (fixed):", game2.history)
# States are correctly preserved
Output:
History (buggy): [[[1, 0, 0], [0, 2, 0], [0, 0, 0]], [[1, 0, 0], [0, 2, 0], [0, 0, 0]]]
History (fixed): [[[1, 0, 0], [0, 0, 0], [0, 0, 0]], [[1, 0, 0], [0, 2, 0], [0, 0, 0]]]
Example 2: Configuration Management
Another common case involves nested configuration dictionaries:
import copy
# Default configuration template
default_config = {
    'database': {
        'host': 'localhost',
        'port': 5432,
        'credentials': {
            'user': 'admin',
            'password': 'default'
        }
    },
    'cache': {
        'enabled': True,
        'ttl': 3600
    }
}
# Buggy: shallow copy
prod_config = default_config.copy()
prod_config['database']['host'] = 'prod.example.com'
prod_config['database']['credentials']['password'] = 'prod_secret'
print("Default password:", default_config['database']['credentials']['password'])
# Prints: prod_secret (unintended modification!)
# Fixed: deep copy
default_config_reset = {
    'database': {
        'host': 'localhost',
        'port': 5432,
        'credentials': {
            'user': 'admin',
            'password': 'default'
        }
    },
    'cache': {
        'enabled': True,
        'ttl': 3600
    }
}
prod_config_fixed = copy.deepcopy(default_config_reset)
prod_config_fixed['database']['host'] = 'prod.example.com'
prod_config_fixed['database']['credentials']['password'] = 'prod_secret'
print("Default password (fixed):", default_config_reset['database']['credentials']['password'])
# Prints: default (correct!)
Example 3: Matrix Operations
When working with 2D lists representing matrices:
import copy
def create_matrix_buggy(rows, cols, initial_value=0):
    # Wrong: repeats references to one row list instead of creating new rows
    row = [initial_value] * cols
    matrix = [row] * rows
    return matrix
def create_matrix_fixed(rows, cols, initial_value=0):
    # Correct: creates independent rows
    return [[initial_value for _ in range(cols)] for _ in range(rows)]
# Test buggy version
matrix1 = create_matrix_buggy(3, 3)
matrix1[0][0] = 1
print("Buggy matrix:")
for row in matrix1:
    print(row)
# All rows are modified!
print()
# Test fixed version
matrix2 = create_matrix_fixed(3, 3)
matrix2[0][0] = 1
print("Fixed matrix:")
for row in matrix2:
    print(row)
# Only first row is modified
Output:
Buggy matrix:
[1, 0, 0]
[1, 0, 0]
[1, 0, 0]
Fixed matrix:
[1, 0, 0]
[0, 0, 0]
[0, 0, 0]
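To see why the two constructions behave differently, it can help to count how many distinct row objects each one produces. The sketch below inlines both constructions with illustrative variable names rather than calling the functions above.

```python
rows, cols = 3, 3

aliased = [[0] * cols] * rows                    # one row object, referenced three times
independent = [[0] * cols for _ in range(rows)]  # a new row object on each iteration

# Collect the identities of the rows to count distinct objects.
print(len({id(row) for row in aliased}))      # 1
print(len({id(row) for row in independent}))  # 3
```

Multiplying the inner list by cols is safe in both cases because the elements are immutable integers; the aliasing comes only from multiplying the outer list.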
Additional Tips and Related Errors
When Shallow Copy Is Enough
Shallow copy works fine for flat data structures containing only immutable objects:
# These are safe with shallow copy
numbers = [1, 2, 3, 4, 5]
copied = numbers.copy()
strings = ['apple', 'banana', 'cherry']
copied_strings = strings.copy()
tuples = [(1, 2), (3, 4), (5, 6)]
copied_tuples = tuples.copy()
Integers, strings, and tuples are immutable, so the only way to change an element of the copied list is to replace it with a different object. Replacing an element in one list never affects the other list.
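The safety of tuples depends on what they contain. A tuple that holds a mutable object, such as a list, can still be changed through a shallow copy. A minimal sketch with made-up records:

```python
records = [("alice", [85, 90]), ("bob", [70, 75])]
copied_records = records.copy()

# The tuple cannot be reassigned, but the list inside it can still be mutated.
copied_records[0][1].append(100)

print(records[0])         # ('alice', [85, 90, 100])
print(copied_records[0])  # ('alice', [85, 90, 100])
```

If the elements may contain mutable objects at any depth, treat the structure as nested.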
Performance Considerations
Deep copy is slower and uses more memory because it recursively copies all nested objects:
import copy
import time
# Large nested structure
large_data = [[i] * 1000 for i in range(1000)]
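# Measure shallow copy (perf_counter has the resolution needed for very short intervals)
start = time.perf_counter()
shallow = large_data.copy()
shallow_time = time.perf_counter() - start
# Measure deep copy
start = time.perf_counter()
deep = copy.deepcopy(large_data)
deep_time = time.perf_counter() - start
print(f"Shallow copy: {shallow_time:.6f} seconds")
print(f"Deep copy: {deep_time:.4f} seconds")
# Guard against a zero reading on clocks with coarse resolution
if shallow_time > 0:
    print(f"Deep copy is {deep_time/shallow_time:.1f}x slower")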
Only use deep copy when you actually need independent nested objects. For read-only operations or when you know you won't modify nested structures, shallow copy is more efficient.
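When the structure has a known, fixed depth, there is a middle ground: copy exactly the levels that will be mutated. The sketch below assumes a two-level list whose rows contain only immutable values; it is one possible approach, not a general replacement for copy.deepcopy().

```python
grid = [[i] * 4 for i in range(3)]

# Copy exactly two levels: a new outer list and a new list for each row.
two_level = [row.copy() for row in grid]

two_level[0][0] = 999
print(grid[0])       # [0, 0, 0, 0]
print(two_level[0])  # [999, 0, 0, 0]
```

This avoids the recursive traversal of a deep copy, but it silently becomes a shallow copy again if a third level of nesting is ever added.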
Custom Objects and Copy Behavior
When copying custom objects, you might need to define __copy__ and __deepcopy__ methods:
import copy
class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []
    def __deepcopy__(self, memo):
        # Custom deep copy implementation
        new_node = Node(self.value)
        # Register the copy before recursing so shared or cyclic references reuse it
        memo[id(self)] = new_node
        new_node.children = copy.deepcopy(self.children, memo)
        return new_node
# Test custom deep copy
root = Node(1)
root.children = [Node(2), Node(3)]
copied_root = copy.deepcopy(root)
copied_root.children[0].value = 999
print("Original first child:", root.children[0].value)
print("Copied first child:", copied_root.children[0].value)
Output:
Original first child: 2
Copied first child: 999
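The __copy__ hook works the same way for copy.copy(). The sketch below uses a hypothetical Playlist class, invented here for illustration, whose shallow copy gets its own track list while still reusing the track objects themselves.

```python
import copy

class Playlist:
    """Hypothetical class used only for illustration."""

    def __init__(self, name, tracks=None):
        self.name = name
        self.tracks = tracks if tracks is not None else []

    def __copy__(self):
        # Give the copy its own list, but reuse the objects stored in it.
        return Playlist(self.name, list(self.tracks))

original = Playlist("road trip", ["song a", "song b"])
duplicate = copy.copy(original)
duplicate.tracks.append("song c")

print(original.tracks)   # ['song a', 'song b']
print(duplicate.tracks)  # ['song a', 'song b', 'song c']
```

Without __copy__, copy.copy() would produce a new Playlist whose tracks attribute pointed at the same list as the original.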
Common Gotcha with Default Arguments
Be careful with mutable default arguments, which create a similar shared reference issue:
# Buggy: mutable default argument
def add_item_buggy(item, items=[]):
    items.append(item)
    return items
list1 = add_item_buggy(1)
list2 = add_item_buggy(2)
print("List1:", list1) # [1, 2] - unexpected!
print("List2:", list2) # [1, 2] - same object!
# Fixed: use None as default
def add_item_fixed(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items
list3 = add_item_fixed(1)
list4 = add_item_fixed(2)
print("List3:", list3) # [1]
print("List4:", list4) # [2]
Debugging Tip: Use id() to Check Object Identity
When you're unsure whether two variables reference the same object, use the id() function:
import copy
original = [[1, 2], [3, 4]]
shallow = original.copy()
deep = copy.deepcopy(original)
print("Original inner list id:", id(original[0]))
print("Shallow inner list id:", id(shallow[0]))
print("Deep inner list id:", id(deep[0]))
Two objects that exist at the same time never share an ID. Matching IDs for original[0] and shallow[0] therefore mean both refer to the same list, while the different ID for deep[0] confirms it is a separate object.
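Comparing raw ID numbers by eye gets tedious, so the same check can be written with the is operator. The helper below, shared_elements, is a hypothetical name introduced for this sketch; it reports the positions at which two sequences hold the very same object.

```python
import copy

def shared_elements(a, b):
    """Return the indexes at which two sequences hold the same object."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x is y]

original = [[1, 2], [3, 4]]

print(shared_elements(original, original.copy()))         # [0, 1]
print(shared_elements(original, copy.deepcopy(original)))  # []
```

The result is only meaningful for mutable elements: Python may reuse a single object for equal immutable values such as small integers, and sharing those is harmless.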
Understanding the difference between shallow and deep copy prevents subtle bugs that can be difficult to track down. Use shallow copy for simple, flat structures and deep copy when working with nested data structures that need to be truly independent.