You're running your Python script and suddenly hit this:
UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 15: ordinal not in range(128)
This error shows up when Python tries to convert text containing special characters to a format that doesn't support them. Let's fix it with practical examples you can use right away.
Step 1: Understanding the Error
The UnicodeEncodeError happens when Python attempts to encode a string using a character encoding that can't represent certain characters in your text. Here's a simple example that triggers the error:
# This will fail on some systems
text = "Hello, it's a café"
print(text.encode('ascii'))
Running this on macOS terminal:
$ python3 test.py
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 17: ordinal not in range(128)
The error tells us exactly what went wrong. The character 'é' in "café" can't be represented in ASCII encoding, which only covers code points 0-127. The '\xe9' is the escape for U+00E9, the Unicode code point of 'é'.
Here's what each part of the error means:
- 'ascii' codec: The encoding method being used
- character '\xe9': The problematic character
- position 17: The index in the string where the character appears
- ordinal not in range(128): ASCII only supports 128 characters
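You don't have to parse the message by hand: the exception object exposes each of these parts as attributes (encoding, object, start, end, reason are part of Python's built-in UnicodeError API):

```python
text = "Hello, it's a café"

try:
    text.encode('ascii')
except UnicodeEncodeError as e:
    # Each piece of the error message is available as an attribute
    print(e.encoding)               # 'ascii'
    print(e.object[e.start:e.end])  # 'é' - the offending character(s)
    print(e.start)                  # 17 - position in the string
    print(e.reason)                 # 'ordinal not in range(128)'
```

This is useful when you want to log or report exactly which characters caused the failure.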
Step 2: Identifying the Cause
UnicodeEncodeError typically appears in these scenarios:
Scenario 1: Writing to Files
# This triggers the error
text = "User's comment: 你好"
with open('output.txt', 'w') as f:
    f.write(text)
On systems whose locale encoding can't represent Chinese characters, this fails: when no encoding argument is given, open() falls back to locale.getpreferredencoding(), which varies by platform and configuration.
Scenario 2: Printing to Terminal
# May fail depending on terminal settings
names = ["José", "François", "Björk"]
for name in names:
    print(name)
If your terminal's encoding is set to ASCII, printing non-ASCII characters causes the error.
Scenario 3: API Responses
# Common with web scraping
import requests
response = requests.get('https://example.com')
data = response.text
print(data.encode('ascii')) # Fails if response contains special chars
Scenario 4: CSV File Writing
import csv
data = [["Name", "Comment"], ["User1", "Great product™"]]
with open('data.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(data) # May fail with trademark symbol
The root cause is always the same: you're trying to encode Unicode text using an encoding that doesn't support all the characters in your string.
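To see that root cause concretely: Python 3 strings are Unicode, and the exception is raised at the encode step, not when the text is created. A minimal sketch:

```python
text = "café"                      # a Unicode str - no error yet
utf8_bytes = text.encode('utf-8')  # succeeds: UTF-8 covers 'é'
print(utf8_bytes)                  # b'caf\xc3\xa9'

try:
    text.encode('ascii')           # fails: ASCII has no 'é'
except UnicodeEncodeError:
    print("ASCII can't represent 'é'")
```

The string itself is never the problem; the mismatch between the string's characters and the target encoding is.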
Step 3: Implementing the Solution
Solution 1: Use UTF-8 Encoding
UTF-8 is the most common solution because it supports virtually all characters. When writing to files, explicitly specify UTF-8:
# Working version
text = "User's comment: 你好 café™"
with open('output.txt', 'w', encoding='utf-8') as f:
    f.write(text)
print("File written successfully!")
$ python3 test.py
File written successfully!
For reading the file back:
# Always use the same encoding for reading
with open('output.txt', 'r', encoding='utf-8') as f:
    content = f.read()
print(content)
This works because UTF-8 can represent over a million different characters, including emoji, Asian languages, mathematical symbols, and accented letters.
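It does this by using a variable number of bytes per character, which you can observe directly:

```python
text = "café 你好 🎉"
encoded = text.encode('utf-8')
print(len(text))     # 9  - characters
print(len(encoded))  # 17 - bytes: ASCII chars take 1 byte each, 'é' takes 2,
                     #      each Chinese character 3, and the emoji 4
print('é'.encode('utf-8'))  # b'\xc3\xa9'
```

This is also why you should never assume one character equals one byte when working with encoded data.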
Solution 2: Handle Encoding in Print Statements
If you need to print text but aren't sure about terminal encoding:
text = "José's café serves crème brûlée"
# Option 1: Encode with error handling
print(text.encode('ascii', errors='ignore').decode('ascii'))
# Output: Jos's caf serves crme brle (removes special chars)
# Option 2: Replace unknown characters
print(text.encode('ascii', errors='replace').decode('ascii'))
# Output: Jos?'s caf? serves cr?me br?l?e
# Option 3: Use XML entities
print(text.encode('ascii', errors='xmlcharrefreplace').decode('ascii'))
# Output: Jos&#233;'s caf&#233; serves cr&#232;me br&#251;l&#233;e
# Option 4: Best approach - print the string directly; Python 3 strings
# are already Unicode, so no encode/decode round-trip is needed
print(text)
# Output: José's café serves crème brûlée (preserves everything)
The error handling options:
- ignore: Removes characters that can't be encoded
- replace: Substitutes with '?'
- xmlcharrefreplace: Uses XML character references
- backslashreplace: Uses Python backslash escapes
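The last option, backslashreplace, isn't shown above; it keeps the information in escaped form, which is handy for logging:

```python
text = "José's café serves crème brûlée"
print(text.encode('ascii', errors='backslashreplace').decode('ascii'))
# Output: Jos\xe9's caf\xe9 serves cr\xe8me br\xfbl\xe9e
```

Unlike ignore or replace, no information is lost: the original characters can be recovered from the escape sequences.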
Solution 3: CSV Files with Proper Encoding
import csv
# Data with special characters
data = [
["Name", "Product", "Review"],
["José García", "Coffee™", "Très bien! 很好"],
["François", "Tea®", "Excellent café"],
]
# Write with UTF-8 encoding
with open('reviews.csv', 'w', encoding='utf-8', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(data)
# Read it back
with open('reviews.csv', 'r', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
$ python3 csv_example.py
['Name', 'Product', 'Review']
['José García', 'Coffee™', 'Très bien! 很好']
['François', 'Tea®', 'Excellent café']
Solution 4: Environment Variables and System Encoding
Sometimes the issue is your system's default encoding. Check it:
import sys
import locale
print(f"Default encoding: {sys.getdefaultencoding()}")
print(f"File system encoding: {sys.getfilesystemencoding()}")
print(f"Preferred encoding: {locale.getpreferredencoding()}")
If your system uses ASCII by default, you can set an environment variable:
$ export PYTHONIOENCODING=utf-8
$ python3 your_script.py
Or set it programmatically at the start of your script:
import sys
import io
# Force UTF-8 for stdout
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
text = "Special chars: é ñ 中文 🎉"
print(text) # Now works reliably
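On Python 3.7 and later, text streams have a reconfigure() method, which avoids replacing sys.stdout by hand:

```python
import sys

# Python 3.7+: change the stream's encoding in place
sys.stdout.reconfigure(encoding='utf-8')
text = "Special chars: é ñ 中文 🎉"
print(text)
```

This is the cleaner choice in new code, since wrapping sys.stdout manually can break code that holds a reference to the old stream.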
Solution 5: Working with APIs and Web Content
When dealing with web content:
import requests
# Fetch content with special characters
response = requests.get('https://example.com/international-page')
# Don't encode to ASCII - work directly with Unicode
text = response.text
# If you must save to file
with open('webpage.html', 'w', encoding='utf-8') as f:
    f.write(text)
# If you need to process byte data
byte_data = response.content # Already bytes
decoded_text = byte_data.decode('utf-8') # Decode explicitly
Here's a practical web scraping example:
import requests
from bs4 import BeautifulSoup
url = "https://example.com"
response = requests.get(url)
# Requests guesses the encoding from the HTTP headers; override it only
# if you know the page is UTF-8 (response.apparent_encoding is a slower,
# content-based guess)
response.encoding = 'utf-8'
soup = BeautifulSoup(response.text, 'html.parser')
# Extract and save content
titles = soup.find_all('h1')
with open('titles.txt', 'w', encoding='utf-8') as f:
    for title in titles:
        f.write(title.text + '\n')
Solution 6: Handling User Input
When working with user input that might contain special characters:
def save_user_comment(comment):
    """Safely save user comments with any characters"""
    try:
        with open('comments.txt', 'a', encoding='utf-8') as f:
            f.write(f"{comment}\n")
        return True
    except UnicodeEncodeError as e:
        print(f"Encoding error: {e}")
        # Fallback: remove problematic characters
        safe_comment = comment.encode('ascii', errors='ignore').decode('ascii')
        with open('comments.txt', 'a', encoding='utf-8') as f:
            f.write(f"{safe_comment}\n")
        return False
# Test with various inputs
comments = [
    "Great product!",
    "Très bon! 🎉",
    "素晴らしい製品",
    "Отличный продукт",
]
for comment in comments:
    save_user_comment(comment)
Solution 7: Database Operations
When inserting data into databases:
import sqlite3
# Create database with proper encoding
conn = sqlite3.connect('users.db')
cursor = conn.cursor()
# Create table
cursor.execute('''
    CREATE TABLE IF NOT EXISTS users (
        id INTEGER PRIMARY KEY,
        name TEXT,
        bio TEXT
    )
''')
# Insert data with special characters - SQLite handles UTF-8 automatically
users = [
    ("José García", "Software engineer from España"),
    ("李明", "Developer from 中国"),
    ("François", "Designer from France 🇫🇷"),
]
for name, bio in users:
    cursor.execute('INSERT INTO users (name, bio) VALUES (?, ?)', (name, bio))
conn.commit()
# Retrieve and display
cursor.execute('SELECT * FROM users')
for row in cursor.fetchall():
    print(f"ID: {row[0]}, Name: {row[1]}, Bio: {row[2]}")
conn.close()
Additional Tips and Common Pitfalls
Tip 1: Check Your File Before Processing
Before processing files of unknown origin, you can guess their encoding with the third-party chardet library (pip install chardet):
import chardet
def detect_encoding(file_path):
    with open(file_path, 'rb') as f:
        raw_data = f.read()
    result = chardet.detect(raw_data)
    return result['encoding']
# Use detected encoding
file_path = 'unknown_encoding.txt'
encoding = detect_encoding(file_path)
print(f"Detected encoding: {encoding}")
with open(file_path, 'r', encoding=encoding) as f:
    content = f.read()
Tip 2: Python 3 Default Behavior
Python 3 strings are Unicode, and source files are read as UTF-8 by default, but open() uses your system's locale encoding unless you say otherwise. Always specify encoding explicitly:
# Bad - uses system default
with open('file.txt', 'w') as f:
    f.write(text)
# Good - explicit UTF-8
with open('file.txt', 'w', encoding='utf-8') as f:
    f.write(text)
Tip 3: Working with JSON
JSON handles Unicode automatically, making it a safe choice:
import json
data = {
    "users": [
        {"name": "José", "comment": "Très bien!"},
        {"name": "李明", "comment": "很好"},
    ]
}
# JSON automatically handles UTF-8
with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False, indent=2)
# Read it back
with open('data.json', 'r', encoding='utf-8') as f:
    loaded_data = json.load(f)
print(loaded_data)
The ensure_ascii=False parameter prevents JSON from escaping Unicode characters.
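The difference is easy to see with json.dumps:

```python
import json

data = {"name": "José"}
print(json.dumps(data))                      # {"name": "Jos\u00e9"}  (default)
print(json.dumps(data, ensure_ascii=False))  # {"name": "José"}
```

Both forms are valid JSON and decode to the same data; ensure_ascii=False just produces smaller, human-readable output.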
Tip 4: Debugging the Error
When you encounter the error, inspect the problematic character:
def find_problematic_chars(text):
    """Identify characters that can't be ASCII encoded"""
    problems = []
    for i, char in enumerate(text):
        try:
            char.encode('ascii')
        except UnicodeEncodeError:
            problems.append((i, char, hex(ord(char))))
    return problems
text = "Hello café™ 你好"
issues = find_problematic_chars(text)
for pos, char, code in issues:
    print(f"Position {pos}: '{char}' (Unicode: {code})")
$ python3 debug.py
Position 9: 'é' (Unicode: 0xe9)
Position 10: '™' (Unicode: 0x2122)
Position 12: '你' (Unicode: 0x4f60)
Position 13: '好' (Unicode: 0x597d)
Related Error: UnicodeDecodeError
The opposite problem occurs when reading files:
# Reading a UTF-8 file with wrong encoding causes UnicodeDecodeError
try:
    with open('utf8_file.txt', 'r', encoding='ascii') as f:
        content = f.read()
except UnicodeDecodeError as e:
    print(f"Decode error: {e}")
# Solution: use correct encoding
with open('utf8_file.txt', 'r', encoding='utf-8') as f:
    content = f.read()
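If you can't fix the mismatch, the errors parameter works for reading too; it trades the exception for U+FFFD replacement characters:

```python
# Write UTF-8 bytes, then read them back with the wrong codec
with open('utf8_file.txt', 'w', encoding='utf-8') as f:
    f.write("café")

# errors='replace' substitutes '\ufffd' instead of raising
with open('utf8_file.txt', 'r', encoding='ascii', errors='replace') as f:
    print(f.read())  # 'caf' plus two replacement chars ('é' is two UTF-8 bytes)
```

This keeps the program running, but the replaced characters are gone for good, so use it only when lossy reading is acceptable.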
Windows-Specific Issues
Windows sometimes uses different encodings. For cross-platform compatibility:
import locale
import platform
# Windows consoles and legacy tools often default to a code page such as cp1252
print(platform.system(), locale.getpreferredencoding())
# Regardless of platform, specify UTF-8 explicitly
with open('file.txt', 'w', encoding='utf-8') as f:
    f.write("Cross-platform text: café")
The safest approach is always using UTF-8 explicitly, regardless of platform. Modern systems handle UTF-8 well, and it's the standard for web content, APIs, and international text.
When in doubt, use UTF-8 encoding, handle errors explicitly, and test with international characters during development. This prevents surprises when your code encounters real-world data with special characters.