What Those Underscores Mean in Python
Python uses underscores in weird ways. When I was learning Python, I kept seeing different underscore patterns and had no idea what they meant. _name, __name, __name__, just _ - it looked like the language was having an identity crisis.
Turns out each pattern has a specific meaning, and understanding them is crucial for writing idiomatic Python. This isn't just trivia - these conventions affect how your code behaves, how it's imported, and how other developers will interact with your APIs.
Here's the complete guide to Python underscore conventions, with real-world examples and the gotchas I wish someone had told me about.
Single leading underscore: _name
This is a convention for 'private' or 'internal use only'. Python doesn't actually enforce it - you can still access _name from outside the class. But it signals to other developers: 'Hey, this is an implementation detail. Don't rely on it.'
class UserService:
    def __init__(self):
        self._cache = {}       # Internal cache, don't access directly
        self._db = Database()  # Internal database connection

    def _validate_email(self, email):  # Internal helper method
        return '@' in email and '.' in email

    def create_user(self, email, name):  # Public method
        if not self._validate_email(email):
            raise ValueError('Invalid email')
        # ... create user ...
Linters can flag this kind of access. Ruff, for example, has a rule (SLF001, private member access) that warns when code reaches into another object's _ prefixed attributes. It's not enforcement, but it's a helpful reminder.
The wildcard import gotcha
One important gotcha: when you do from module import *, names starting with _ are NOT imported. This is the one place where the convention has actual behavior.
# In mymodule.py
public_function = lambda: 'public'
_private_function = lambda: 'private'
# In another file
from mymodule import *
public_function() # Works
_private_function() # NameError! Not imported
You can override this behavior using __all__:
# In mymodule.py
__all__ = ['public_function', '_private_function']
public_function = lambda: 'public'
_private_function = lambda: 'private'
_secret_function = lambda: 'secret'
# In another file
from mymodule import *
public_function() # Works
_private_function() # Now works! Explicitly included in __all__
_secret_function() # Still NameError - not in __all__
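If you want to check both behaviors without creating files, here's a rough sketch that builds throwaway modules in memory. The helpers make_module and star_import and the module names demo_plain and demo_all are made up for this illustration:

```python
import sys
import types


def make_module(name, source):
    """Create an in-memory module from source text and register it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


def star_import(name):
    """Run `from <name> import *` in a fresh namespace and list what arrived."""
    namespace = {}
    exec(f"from {name} import *", namespace)
    return sorted(key for key in namespace if key != "__builtins__")


make_module(
    "demo_plain",
    "public_function = lambda: 'public'\n"
    "_private_function = lambda: 'private'\n",
)
make_module(
    "demo_all",
    "__all__ = ['public_function', '_private_function']\n"
    "public_function = lambda: 'public'\n"
    "_private_function = lambda: 'private'\n"
    "_secret_function = lambda: 'secret'\n",
)

plain_names = star_import("demo_plain")
all_names = star_import("demo_all")
print(plain_names)  # ['public_function']
print(all_names)    # ['_private_function', 'public_function']
```

Without __all__, the underscore name is skipped. With __all__, exactly the listed names come through, underscore or not.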
When to use single underscore
I use single underscore for:
Internal helper methods that shouldn't be called by users of the class:
class DataProcessor:
    def process(self, data):
        cleaned = self._clean(data)
        validated = self._validate(cleaned)
        return self._transform(validated)

    def _clean(self, data):
        """Remove null bytes, trim whitespace, etc."""
        return data.strip().replace('\x00', '')

    def _validate(self, data):
        """Check data meets requirements."""
        if not data:
            raise ValueError('Empty data')
        return data

    def _transform(self, data):
        """Apply business logic transformations."""
        return data.upper()
Implementation details that might change between versions:
class Cache:
    def __init__(self):
        self._data = {}   # Might change to Redis later
        self._hits = 0    # Tracking metric, might remove
        self._misses = 0

    def get(self, key):
        """Public API - won't change."""
        if key in self._data:
            self._hits += 1
            return self._data[key]
        self._misses += 1
        return None
The rule: if you might want to change it later without breaking external code, prefix it with _.
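To make that rule concrete, here's a minimal sketch that extends the Cache example. The set method and the hit_rate property are my own additions for illustration, not part of the original class. Callers read hit_rate, so the underscore-prefixed counters stay free to change:

```python
class Cache:
    def __init__(self):
        self._data = {}    # internal storage, free to change later
        self._hits = 0
        self._misses = 0

    def set(self, key, value):  # hypothetical helper, added for this sketch
        self._data[key] = value

    def get(self, key):
        if key in self._data:
            self._hits += 1
            return self._data[key]
        self._misses += 1
        return None

    @property
    def hit_rate(self):  # hypothetical public view of the internal counters
        total = self._hits + self._misses
        return self._hits / total if total else 0.0


cache = Cache()
cache.set('a', 1)
cache.get('a')  # hit
cache.get('b')  # miss
print(cache.hit_rate)  # 0.5
```

If the counters later move somewhere else, only hit_rate has to be updated, and code outside the class keeps working.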
Module-level privacy
Single underscore also works at module level, not just classes:
# api.py
import requests

API_KEY = 'public-key'
_SECRET_KEY = 'secret-key'  # Don't import this elsewhere

def fetch_data():
    return _make_request('/data')

def _make_request(endpoint):
    """Internal request helper."""
    return requests.get(
        f'https://api.example.com{endpoint}',
        headers={'Authorization': f'Bearer {_SECRET_KEY}'},
    )
This keeps your module's public API clean and signals what's safe to depend on.
Double leading underscore: __name
This one's different - it's not just a convention. Python actually modifies these names through a process called 'name mangling'.
When you write __name inside a class (two or more leading underscores, at most one trailing underscore), Python renames it to _ClassName__name. This prevents accidental overrides in subclasses.
class Parent:
    def __init__(self):
        self.__secret = 'parent secret'

    def reveal(self):
        return self.__secret

class Child(Parent):
    def __init__(self):
        super().__init__()
        self.__secret = 'child secret'  # This is actually _Child__secret

    def reveal_child(self):
        return self.__secret

child = Child()
print(child.reveal())        # 'parent secret' - Parent's __secret unchanged
print(child.reveal_child())  # 'child secret' - Child's own __secret
Without name mangling, the child's __secret would overwrite the parent's. The mangling keeps them separate.
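You can see the two separate attributes by inspecting the instance. A minimal sketch that repeats the two classes and uses vars() to list what's actually stored:

```python
class Parent:
    def __init__(self):
        self.__secret = 'parent secret'   # stored as _Parent__secret


class Child(Parent):
    def __init__(self):
        super().__init__()
        self.__secret = 'child secret'    # stored as _Child__secret


child = Child()
stored = vars(child)
print(stored)
# {'_Parent__secret': 'parent secret', '_Child__secret': 'child secret'}
```

There is no attribute literally called __secret on the instance, only the two mangled names.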
When name mangling actually helps
Honestly, I rarely use this. Single underscore is enough for most cases. Double underscore is useful when you're writing a library that's meant to be subclassed and want to keep subclasses from accidentally stepping on your internal attributes. It doesn't make anything truly private, though. For everyday code? Overkill.
Here's a real scenario where it matters:
class Counter:
    def __init__(self):
        self.__count = 0  # Name mangled to _Counter__count

    def increment(self):
        self.__count += 1

    def get_count(self):
        return self.__count

class DebugCounter(Counter):
    def __init__(self):
        super().__init__()
        self.__count = []  # Name mangled to _DebugCounter__count
        # Doesn't conflict with parent's __count!

    def increment(self):
        super().increment()
        self.__count.append(self.get_count())  # Track history

    def get_history(self):
        return self.__count

c = DebugCounter()
c.increment()
c.increment()
print(c.get_count())    # 2 - parent's counter works
print(c.get_history())  # [1, 2] - child's list works
Without name mangling, both classes trying to use self.__count would collide. With mangling, they coexist peacefully.
The escape hatch (and why you shouldn't use it)
You can still access mangled names if you really want to:
print(child._Parent__secret) # 'parent secret'
print(child._Child__secret) # 'child secret'
But if you're doing this in production code, you should probably reconsider your design. This is for debugging and introspection only.
A common mistake with name mangling
Name mangling only happens inside class definitions. This trips people up:
class Foo:
    def __init__(self):
        self.__private = 42

foo = Foo()
foo.__private # AttributeError: 'Foo' object has no attribute '__private'
foo._Foo__private # 42 - this works
# But watch out for this:
foo.__dict__['__private'] = 99 # Creates NEW attribute, doesn't modify mangled one!
foo._Foo__private # Still 42
foo.__dict__ # {'_Foo__private': 42, '__private': 99}
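The second __private wasn't mangled because mangling only rewrites identifiers that appear in the source code of a class body. A string key, or an attribute set from outside the class, is stored exactly as written. Now we have two attributes. Confusing!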
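Here's a small sketch of that "source code of the class body" rule, using a made-up Account class. Inside the class, mangling applies even when the attribute belongs to a different object. Outside the class, a plain assignment creates a separate, unmangled attribute:

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance  # compiled as self._Account__balance

    def richer_than(self, other):
        # Both names are mangled here, including the one on `other`,
        # because the rewrite happens on the text of the class body.
        return self.__balance > other.__balance


a = Account(10)
b = Account(5)
print(a.richer_than(b))  # True

a.__balance = 999        # outside the class: no mangling, new attribute
print(vars(a))           # {'_Account__balance': 10, '__balance': 999}
print(a.richer_than(b))  # True - still compares the mangled attributes
```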
Double underscores on both sides: __name__
These are 'dunder' methods (double underscore). They're Python's magic methods - the language calls them automatically in specific situations.
class Vector:
    def __init__(self, x, y):  # Called when you create an instance
        self.x = x
        self.y = y

    def __repr__(self):  # Called by repr() and in the REPL
        return f'Vector({self.x}, {self.y})'

    def __str__(self):  # Called by str() and print()
        return f'({self.x}, {self.y})'

    def __add__(self, other):  # Called for + operator
        return Vector(self.x + other.x, self.y + other.y)

    def __len__(self):  # Called by len()
        return 2

    def __eq__(self, other):  # Called for == comparison
        return self.x == other.x and self.y == other.y

v1 = Vector(1, 2)
v2 = Vector(3, 4)
print(v1 + v2)        # (4, 6) - __add__ builds the Vector, print() uses __str__
print(repr(v1 + v2))  # Vector(4, 6) - uses __repr__
print(len(v1))        # 2 - uses __len__
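Two gotchas with __eq__ are worth a sketch. As written above, comparing a Vector with something that has no x attribute raises AttributeError, and a class that defines __eq__ without __hash__ becomes unhashable. One common way to handle both (not the only one) is to return NotImplemented for foreign types and to add a matching __hash__:

```python
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented  # let Python try the other operand
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Equal vectors must hash equally, so hash the same fields __eq__ uses
        return hash((self.x, self.y))


print(Vector(1, 2) == Vector(1, 2))         # True
print(Vector(1, 2) == 'not a vector')       # False, no AttributeError
print(len({Vector(1, 2), Vector(1, 2)}))    # 1 - usable in sets
```

This assumes the vector is treated as immutable after creation. If x and y can change, hashing on them is risky and leaving the class unhashable may be the better choice.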
The most useful dunder methods
There are dozens of dunder methods, but here are the ones I use constantly:
Object lifecycle:
- __init__: Initialize an instance
- __del__: Called when object is about to be destroyed (rarely needed)
- __new__: Create a new instance (runs before __init__; mostly needed when subclassing immutable types or writing metaclasses)
String representation:
- __repr__: Official string representation, for developers
- __str__: Informal string representation, for end users
- __format__: Used by f-strings and format()
Comparison operators:
- __eq__: Equality (==)
- __ne__: Not equal (!=)
- __lt__: Less than (<)
- __le__: Less than or equal (<=)
- __gt__: Greater than (>)
- __ge__: Greater than or equal (>=)
Math operators:
- __add__: Addition (+)
- __sub__: Subtraction (-)
- __mul__: Multiplication (*)
- __truediv__: Division (/)
- __floordiv__: Floor division (//)
- __mod__: Modulo (%)
- __pow__: Power (**)
Container emulation:
- __len__: len() function
- __getitem__: Indexing (obj[key])
- __setitem__: Assignment to index (obj[key] = value)
- __delitem__: Delete item (del obj[key])
- __contains__: Membership test (in operator)
- __iter__: Make object iterable
Context managers:
- __enter__: Entering a with block
- __exit__: Exiting a with block
Here's a practical example using container emulation:
class CaseInsensitiveDict:
    """A dictionary that treats keys case-insensitively."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __contains__(self, key):
        return key.lower() in self._data

    def __len__(self):
        return len(self._data)

    def __repr__(self):
        return f'CaseInsensitiveDict({self._data})'

d = CaseInsensitiveDict()
d['Name'] = 'Alice'
print(d['name'])    # 'Alice' - case insensitive!
print(d['NAME'])    # 'Alice'
print('NaMe' in d)  # True
print(len(d))       # 1
This makes your class behave like a built-in type. Users can interact with it naturally.
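The context-manager pair from the list above works the same way: implement __enter__ and __exit__, and your object plugs into the with statement. A minimal sketch with a made-up Timer class:

```python
import time


class Timer:
    """Measure how long a with block takes (illustrative only)."""

    def __enter__(self):
        self._start = time.perf_counter()
        return self  # this is what `as t` receives

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self._start
        return False  # False means: don't swallow exceptions from the block


with Timer() as t:
    total = sum(range(1000))

print(total)             # 499500
print(t.elapsed >= 0)    # True
```

__exit__ runs even if the block raises, which is why it's the usual place for cleanup.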
Never invent your own dunders
Never invent your own dunder names. They're reserved for Python's use, and future Python versions might add dunders that conflict with yours.
# DON'T DO THIS
class Bad:
    def __my_method__(self):  # Bad! You don't own this namespace
        pass

# DO THIS INSTEAD
class Good:
    def my_method(self):  # Good! Plain names are fine
        pass

    def _my_internal_method(self):  # Good! Internal with single underscore
        pass
Stick to the documented ones. There's a complete list in the Python data model docs.
Single underscore by itself: _
The lone underscore has two uses. First, as a throwaway variable when you don't care about a value:
import os

# We just want to run something 10 times, don't care about index
for _ in range(10):
    do_something()

# Unpacking when we only care about some values
first, _, _, last = ['a', 'b', 'c', 'd']

# Ignoring a return value
_, extension = os.path.splitext(filename)
Advanced unpacking with _
Reusing _ for several ignored values has always been legal, and the starred form *_ (extended iterable unpacking) has been available since Python 3.0:
# Works, but numbering the throwaways adds noise
first, _1, _2, _3, last = [1, 2, 3, 4, 5]
# Better way - reuse _ for all ignored values
first, _, _, _, last = [1, 2, 3, 4, 5]
# Even better with * operator
first, *_, last = [1, 2, 3, 4, 5]
The *_ syntax means "capture everything in between, but I don't care about it."
Here's a practical example:
def parse_log_line(line):
    # Log format: timestamp|level|module|message
    # We only care about level and message
    _, level, _, message = line.split('|')
    return level, message

# Or with *_:
def parse_csv_row(row):
    # name,email,phone,address,city,state,zip
    # We only need name and email
    name, email, *_ = row.split(',')
    return name, email
The REPL magic
Second, in the interactive interpreter, _ holds the last expression's result:
>>> 2 + 2
4
>>> _ * 3
12
>>> _
12
>>> import math
>>> math.sqrt(144)
12.0
>>> _ / 2
6.0
This is super handy for quick calculations in the REPL. But be careful - it only works in interactive mode, not in scripts:
# This WILL NOT WORK in a .py file
x = 2 + 2
print(_) # NameError: name '_' is not defined
When _ is actually a variable
One gotcha: _ is still a valid variable name. If you assign to it, you overwrite the REPL behavior:
>>> 2 + 2
4
>>> _ = 100 # Oops, overwrote it
>>> 3 + 3
6
>>> _
100 # Still 100, not 6!
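Here's why the assignment sticks, at least in CPython: the interactive interpreter stores each result in builtins._ (through sys.displayhook), and a global variable named _ simply shadows it. The sketch below simulates that outside the REPL. The session dict and the lookup_underscore helper are stand-ins I made up for the REPL's global namespace:

```python
import builtins
import sys


def lookup_underscore(namespace):
    """Resolve `_` the way normal name lookup does: globals, then builtins."""
    return eval('_', namespace)


session = {}                       # stands in for the REPL's globals

sys.__displayhook__(2 + 2)         # prints 4 and stores it in builtins._
after_expression = lookup_underscore(session)   # 4, found in builtins

session['_'] = 100                 # like typing `_ = 100` at the prompt
sys.__displayhook__(3 + 3)         # prints 6 and updates builtins._
while_shadowed = lookup_underscore(session)     # 100, the global wins

del session['_']                   # like typing `del _`
after_del = lookup_underscore(session)          # 6, builtins._ is visible again
```

In a real session, del _ is how you get the last-result behavior back.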
This is why some linters warn about using _ as an actual variable in loops. It's technically allowed, but can be confusing:
# Confusing - is _ throwaway or actual variable?
for _ in range(10):
    if _ > 5:  # Wait, we're using it?
        print(_)

# Better - use descriptive name if you need the value
for i in range(10):
    if i > 5:
        print(i)
In numeric literals: 1_000_000
Python 3.6 added underscore separators in numbers. They're purely visual - the interpreter ignores them completely.
# These are all the same number
million = 1000000
million = 1_000_000 # Much easier to read
million = 10_00_000 # Any single underscore between digits is legal, but please don't
# Works for all numeric types
binary = 0b1111_0000 # Binary with separator
hex_color = 0xFF_00_FF # Hex with separator
big_float = 3.14159_26535 # Float with separator
Use them for long numbers where grouping helps readability. Don't go overboard - 1_0 is technically valid but confusing.
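The same separator shows up in two related places: format specs can produce it, and int() and float() accept it when parsing strings. A quick sketch, including the placements that are rejected (is_valid_int is a helper I made up for the demo):

```python
# Format specs accept _ (or ,) as a thousands separator
pretty = f'{1234567:_}'        # '1_234_567'
with_commas = f'{1234567:,}'   # '1,234,567'

# int() and float() accept underscores in strings too
parsed_int = int('1_000_000')
parsed_float = float('1_000.5')


def is_valid_int(text):
    """Return True if int() accepts the string."""
    try:
        int(text)
    except ValueError:
        return False
    return True


# Only a single underscore between digits is allowed
checks = {text: is_valid_int(text) for text in ['1_000', '1__000', '_1000', '1000_']}
print(checks)
# {'1_000': True, '1__000': False, '_1000': False, '1000_': False}
```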
The pattern summary
| Pattern | Name | Meaning |
|---|---|---|
| `_name` | Single leading | Convention for internal use |
| `__name` | Double leading | Name mangling (subclass protection) |
| `__name__` | Double both sides | Magic/dunder methods |
| `_` | Single underscore | Throwaway variable |
| `1_000` | In numbers | Visual separator |
The one you should actually use
For most code, just use single leading underscore for internal stuff and dunders for magic methods. That's 95% of what you need.
Double leading underscore (name mangling) is niche. I've used it maybe twice in a decade of Python.