How to Fix Python Bugs Caused by Shallow Copy vs Deep Copy Confusion


When you modify a copied list in Python and the original list changes too, you're probably dealing with a shallow copy issue. This is one of the most common sources of unexpected bugs in Python, especially when working with nested data structures.


Step 1: Understanding the Error


Let's start with code that demonstrates this problem:

# Buggy code - shallow copy issue
original_list = [[1, 2, 3], [4, 5, 6]]
copied_list = original_list.copy()

# Modify the copied list
copied_list[0][0] = 999

print("Original:", original_list)
print("Copied:", copied_list)


Output:

Original: [[999, 2, 3], [4, 5, 6]]
Copied: [[999, 2, 3], [4, 5, 6]]


The original list changed even though we only modified the copied list. This happens because list.copy() creates a shallow copy, which only copies references to nested objects rather than the objects themselves.


Here's another common scenario with dictionaries:

# Dictionary shallow copy bug
user_data = {
    'name': 'Alice',
    'scores': [85, 90, 92]
}

backup_data = user_data.copy()
backup_data['scores'].append(95)

print("Original scores:", user_data['scores'])
print("Backup scores:", backup_data['scores'])


Output:

Original scores: [85, 90, 92, 95]
Backup scores: [85, 90, 92, 95]


Both dictionaries share the same list object, so modifications affect both.


Step 2: Identifying the Cause


Python provides multiple ways to copy objects, but they work differently:


Shallow copy creates a new object but inserts references to the objects found in the original. Methods that create shallow copies include:

  • list.copy()
  • dict.copy()
  • list[:] slice notation
  • copy.copy()


Deep copy creates a new object and recursively copies all objects found in the original, creating completely independent copies.


The problem occurs because shallow copy only goes one level deep. For nested structures like lists of lists or dictionaries containing lists, the inner objects are still shared between the original and the copy.


Here's a visualization of what happens:

import copy

original = [[1, 2], [3, 4]]

# Shallow copy
shallow = original.copy()
print("Are outer lists same?", original is shallow)  # False
print("Are inner lists same?", original[0] is shallow[0])  # True

# Deep copy
deep = copy.deepcopy(original)
print("Are outer lists same?", original is deep)  # False
print("Are inner lists same?", original[0] is deep[0])  # False


Output:

Are outer lists same? False
Are inner lists same? True
Are outer lists same? False
Are inner lists same? False


The shallow copy creates a new outer list but references the same inner lists. The deep copy creates entirely new objects at every level.


Step 3: Implementing the Solution


The fix depends on your data structure. Use the copy module's deepcopy() function for nested structures:

import copy

# Fixed code - using deep copy
original_list = [[1, 2, 3], [4, 5, 6]]
copied_list = copy.deepcopy(original_list)

# Now modifications are independent
copied_list[0][0] = 999

print("Original:", original_list)
print("Copied:", copied_list)


Output:

Original: [[1, 2, 3], [4, 5, 6]]
Copied: [[999, 2, 3], [4, 5, 6]]


For the dictionary example:

import copy

user_data = {
    'name': 'Alice',
    'scores': [85, 90, 92]
}

# Deep copy the dictionary
backup_data = copy.deepcopy(user_data)
backup_data['scores'].append(95)

print("Original scores:", user_data['scores'])
print("Backup scores:", backup_data['scores'])


Output:

Original scores: [85, 90, 92]
Backup scores: [85, 90, 92, 95]


Now the lists are independent.


Working Code Examples


Example 1: Game State Management

This is a real-world scenario where shallow copy bugs commonly occur:

import copy

# Buggy version
class GameState:
    def __init__(self):
        self.board = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
        self.history = []
    
    def save_state_buggy(self):
        # Wrong: shallow copy
        self.history.append(self.board.copy())
    
    def save_state_fixed(self):
        # Correct: deep copy
        self.history.append(copy.deepcopy(self.board))

# Test the buggy version
game = GameState()
game.board[0][0] = 1
game.save_state_buggy()
game.board[1][1] = 2
game.save_state_buggy()

print("History (buggy):", game.history)
# Both states show [1, 0, 0], [0, 2, 0], [0, 0, 0] because they reference the same board

# Test the fixed version
game2 = GameState()
game2.board[0][0] = 1
game2.save_state_fixed()
game2.board[1][1] = 2
game2.save_state_fixed()

print("History (fixed):", game2.history)
# States are correctly preserved


Output:

History (buggy): [[[1, 0, 0], [0, 2, 0], [0, 0, 0]], [[1, 0, 0], [0, 2, 0], [0, 0, 0]]]
History (fixed): [[[1, 0, 0], [0, 0, 0], [0, 0, 0]], [[1, 0, 0], [0, 2, 0], [0, 0, 0]]]


Example 2: Configuration Management

Another common case involves nested configuration dictionaries:

import copy

# Default configuration template
default_config = {
    'database': {
        'host': 'localhost',
        'port': 5432,
        'credentials': {
            'user': 'admin',
            'password': 'default'
        }
    },
    'cache': {
        'enabled': True,
        'ttl': 3600
    }
}

# Buggy: shallow copy
prod_config = default_config.copy()
prod_config['database']['host'] = 'prod.example.com'
prod_config['database']['credentials']['password'] = 'prod_secret'

print("Default password:", default_config['database']['credentials']['password'])
# Prints: prod_secret (unintended modification!)

# Fixed: deep copy
default_config_reset = {
    'database': {
        'host': 'localhost',
        'port': 5432,
        'credentials': {
            'user': 'admin',
            'password': 'default'
        }
    },
    'cache': {
        'enabled': True,
        'ttl': 3600
    }
}

prod_config_fixed = copy.deepcopy(default_config_reset)
prod_config_fixed['database']['host'] = 'prod.example.com'
prod_config_fixed['database']['credentials']['password'] = 'prod_secret'

print("Default password (fixed):", default_config_reset['database']['credentials']['password'])
# Prints: default (correct!)


Example 3: Matrix Operations

When working with 2D lists representing matrices:

import copy

def create_matrix_buggy(rows, cols, initial_value=0):
    # Wrong: creates shallow copies of the same list
    row = [initial_value] * cols
    matrix = [row] * rows
    return matrix

def create_matrix_fixed(rows, cols, initial_value=0):
    # Correct: creates independent rows
    return [[initial_value for _ in range(cols)] for _ in range(rows)]

# Test buggy version
matrix1 = create_matrix_buggy(3, 3)
matrix1[0][0] = 1
print("Buggy matrix:")
for row in matrix1:
    print(row)
# All rows are modified!

print()

# Test fixed version
matrix2 = create_matrix_fixed(3, 3)
matrix2[0][0] = 1
print("Fixed matrix:")
for row in matrix2:
    print(row)
# Only first row is modified


Output:

Buggy matrix:
[1, 0, 0]
[1, 0, 0]
[1, 0, 0]

Fixed matrix:
[1, 0, 0]
[0, 0, 0]
[0, 0, 0]


Additional Tips and Related Errors


When Shallow Copy Is Enough

Shallow copy works fine for flat data structures containing only immutable objects:

# These are safe with shallow copy
numbers = [1, 2, 3, 4, 5]
copied = numbers.copy()

strings = ['apple', 'banana', 'cherry']
copied_strings = strings.copy()

tuples = [(1, 2), (3, 4), (5, 6)]
copied_tuples = tuples.copy()


Since integers, strings, and tuples are immutable, modifying the copied list elements creates new objects rather than changing existing ones.


Performance Considerations

Deep copy is slower and uses more memory because it recursively copies all nested objects:

import copy
import time

# Large nested structure
large_data = [[i] * 1000 for i in range(1000)]

# Measure shallow copy
start = time.time()
shallow = large_data.copy()
shallow_time = time.time() - start

# Measure deep copy
start = time.time()
deep = copy.deepcopy(large_data)
deep_time = time.time() - start

print(f"Shallow copy: {shallow_time:.4f} seconds")
print(f"Deep copy: {deep_time:.4f} seconds")
print(f"Deep copy is {deep_time/shallow_time:.1f}x slower")


Only use deep copy when you actually need independent nested objects. For read-only operations or when you know you won't modify nested structures, shallow copy is more efficient.


Custom Objects and Copy Behavior

When copying custom objects, you might need to define __copy__ and __deepcopy__ methods:

import copy

class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []
    
    def __deepcopy__(self, memo):
        # Custom deep copy implementation
        new_node = Node(self.value)
        new_node.children = copy.deepcopy(self.children, memo)
        return new_node

# Test custom deep copy
root = Node(1)
root.children = [Node(2), Node(3)]

copied_root = copy.deepcopy(root)
copied_root.children[0].value = 999

print("Original first child:", root.children[0].value)
print("Copied first child:", copied_root.children[0].value)


Output:

Original first child: 2
Copied first child: 999


Common Gotcha with Default Arguments

Be careful with mutable default arguments, which create a similar shared reference issue:

# Buggy: mutable default argument
def add_item_buggy(item, items=[]):
    items.append(item)
    return items

list1 = add_item_buggy(1)
list2 = add_item_buggy(2)
print("List1:", list1)  # [1, 2] - unexpected!
print("List2:", list2)  # [1, 2] - same object!

# Fixed: use None as default
def add_item_fixed(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

list3 = add_item_fixed(1)
list4 = add_item_fixed(2)
print("List3:", list3)  # [1]
print("List4:", list4)  # [2]


Debugging Tip: Use id() to Check Object Identity

When you're unsure whether two variables reference the same object, use the id() function:

import copy

original = [[1, 2], [3, 4]]
shallow = original.copy()
deep = copy.deepcopy(original)

print("Original inner list id:", id(original[0]))
print("Shallow inner list id:", id(shallow[0]))
print("Deep inner list id:", id(deep[0]))


Different IDs mean different objects in memory. If shallow and original have the same inner list ID, they share the same object.


Understanding the difference between shallow and deep copy prevents subtle bugs that can be difficult to track down. Use shallow copy for simple, flat structures and deep copy when working with nested data structures that need to be truly independent.


How to Fix Python JSON KeyError and TypeError During Data Parsing