What Those Underscores Mean in Python

Python uses underscores in weird ways. When I was learning Python, I kept seeing different underscore patterns and had no idea what they meant. _name, __name, __name__, just _ - it looked like the language was having an identity crisis.

Turns out each pattern has a specific meaning, and understanding them is crucial for writing idiomatic Python. This isn't just trivia - these conventions affect how your code behaves, how it's imported, and how other developers will interact with your APIs.

Here's the complete guide to Python underscore conventions, with real-world examples and the gotchas I wish someone had told me about.

Single leading underscore: _name

This is a convention for 'private' or 'internal use only'. Python doesn't actually enforce it - you can still access _name from outside the class. But it signals to other developers: 'Hey, this is an implementation detail. Don't rely on it.'

class UserService:
    def __init__(self):
        self._cache = {}  # Internal cache, don't access directly
        self._db = Database()  # Internal database connection

    def _validate_email(self, email):  # Internal helper method
        return '@' in email and '.' in email

    def create_user(self, email, name):  # Public method
        if not self._validate_email(email):
            raise ValueError('Invalid email')
        # ... create user ...

Linters can warn when you access _-prefixed attributes from outside the class that owns them (pylint calls this protected-access, and ruff has a similar rule). It's not enforcement, but it's a helpful reminder.
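
To make the "not enforced" part concrete, here's a minimal sketch. The Account class and its attribute names are made up for illustration; the only point is that Python lets outside code read and write an underscore-prefixed attribute without complaint.

```python
class Account:
    """Hypothetical class, used only to illustrate the convention."""

    def __init__(self, owner):
        self.owner = owner        # public attribute
        self._balance = 0         # internal by convention only

    def deposit(self, amount):
        self._balance += amount
        return self._balance


account = Account("alice")
account.deposit(50)

# Nothing stops outside code from reading or writing the "private" name.
peeked = account._balance
account._balance = -1
print(peeked, account._balance)
```

The interpreter treats _balance like any other attribute. The underscore is a message to humans and linters, nothing more.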

The wildcard import gotcha

One important gotcha: when you do from module import *, names starting with _ are NOT imported. This is the one place where the single-underscore convention changes what Python actually does.

# In mymodule.py
public_function = lambda: 'public'
_private_function = lambda: 'private'

# In another file
from mymodule import *
public_function()   # Works
_private_function()  # NameError! Not imported

You can override this behavior using __all__:

# In mymodule.py
__all__ = ['public_function', '_private_function']

public_function = lambda: 'public'
_private_function = lambda: 'private'
_secret_function = lambda: 'secret'

# In another file
from mymodule import *
public_function()    # Works
_private_function()  # Now works! Explicitly included in __all__
_secret_function()   # Still NameError - not in __all__
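
If you want to check this without creating files, here's a rough sketch that builds a throwaway module in memory. The module name demo_module is an assumption made for the demo; registering it in sys.modules is just a trick to make the import statement find it.

```python
import sys
import types

# Build a throwaway module in memory; "demo_module" is a made-up name.
demo = types.ModuleType("demo_module")
demo.public_function = lambda: "public"
demo._private_function = lambda: "private"
sys.modules["demo_module"] = demo

# Without __all__, the wildcard import skips underscore-prefixed names.
without_all = {}
exec("from demo_module import *", without_all)

# With __all__, only the listed names are imported, underscore or not.
demo.__all__ = ["_private_function"]
with_all = {}
exec("from demo_module import *", with_all)

print("public_function" in without_all, "_private_function" in without_all)
print("public_function" in with_all, "_private_function" in with_all)
```

Note the second result: once __all__ exists, it is the complete list. A public name that isn't listed gets left out of the wildcard import too.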

When to use single underscore

I use single underscore for:

Internal helper methods that shouldn't be called by users of the class:

class DataProcessor:
    def process(self, data):
        cleaned = self._clean(data)
        validated = self._validate(cleaned)
        return self._transform(validated)

    def _clean(self, data):
        """Remove null bytes, trim whitespace, etc."""
        return data.strip().replace('\x00', '')

    def _validate(self, data):
        """Check data meets requirements."""
        if not data:
            raise ValueError('Empty data')
        return data

    def _transform(self, data):
        """Apply business logic transformations."""
        return data.upper()

Implementation details that might change between versions:

class Cache:
    def __init__(self):
        self._data = {}  # Might change to Redis later
        self._hits = 0   # Tracking metric, might remove
        self._misses = 0

    def get(self, key):
        """Public API - won't change."""
        if key in self._data:
            self._hits += 1
            return self._data[key]
        self._misses += 1
        return None

The rule: if you might want to change it later without breaking external code, prefix it with _.

Module-level privacy

Single underscore also works at module level, not just classes:

# api.py
import requests

API_KEY = 'public-key'
_SECRET_KEY = 'secret-key'  # Don't import this elsewhere

def fetch_data():
    return _make_request('/data')

def _make_request(endpoint):
    """Internal request helper."""
    return requests.get(f'https://api.example.com{endpoint}',
                       headers={'Authorization': f'Bearer {_SECRET_KEY}'})

This keeps your module's public API clean and signals what's safe to depend on.

Double leading underscore: __name

This one's different - it's not just a convention. Python actually modifies these names through a process called 'name mangling'.

When you write __name inside a class, Python renames it to _ClassName__name. This prevents accidental overrides in subclasses.

class Parent:
    def __init__(self):
        self.__secret = 'parent secret'

    def reveal(self):
        return self.__secret

class Child(Parent):
    def __init__(self):
        super().__init__()
        self.__secret = 'child secret'  # This is actually _Child__secret

    def reveal_child(self):
        return self.__secret

child = Child()
print(child.reveal())       # 'parent secret' - Parent's __secret unchanged
print(child.reveal_child()) # 'child secret' - Child's own __secret

Without name mangling, the child's __secret would overwrite the parent's. The mangling keeps them separate.
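
One way to see the renaming for yourself is to look at what actually gets stored on the instance. This is a small sketch that repeats the Parent/Child idea and inspects the result with vars():

```python
class Parent:
    def __init__(self):
        self.__secret = "parent secret"   # stored as _Parent__secret


class Child(Parent):
    def __init__(self):
        super().__init__()
        self.__secret = "child secret"    # stored as _Child__secret


child = Child()
stored_names = sorted(vars(child))
print(stored_names)  # ['_Child__secret', '_Parent__secret']
```

There is no attribute literally called __secret on the object. Each class wrote to its own prefixed name.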

When name mangling actually helps

Honestly, I rarely use this. Single underscore is enough for most cases. Double underscore is useful when you're writing a library and want to keep subclasses from accidentally stepping on your internal attributes. But for everyday code? Overkill.

Here's a real scenario where it matters:

class Counter:
    def __init__(self):
        self.__count = 0  # Name mangled to _Counter__count

    def increment(self):
        self.__count += 1

    def get_count(self):
        return self.__count

class DebugCounter(Counter):
    def __init__(self):
        super().__init__()
        self.__count = []  # Name mangled to _DebugCounter__count
        # Doesn't conflict with parent's __count!

    def increment(self):
        super().increment()
        self.__count.append(self.get_count())  # Track history

    def get_history(self):
        return self.__count

c = DebugCounter()
c.increment()
c.increment()
print(c.get_count())    # 2 - parent's counter works
print(c.get_history())  # [1, 2] - child's list works

Without name mangling, both classes trying to use self.__count would collide. With mangling, they coexist peacefully.

The escape hatch (and why you shouldn't use it)

You can still access mangled names if you really want to:

print(child._Parent__secret)  # 'parent secret'
print(child._Child__secret)   # 'child secret'

But if you're doing this in production code, you should probably reconsider your design. This is for debugging and introspection only.

A common mistake with name mangling

Name mangling only happens inside class definitions. This trips people up:

class Foo:
    def __init__(self):
        self.__private = 42

foo = Foo()
foo.__private        # AttributeError: 'Foo' object has no attribute '__private'
foo._Foo__private    # 42 - this works

# But watch out for this:
foo.__private = 99   # Outside the class there's no mangling - creates a NEW attribute!
foo._Foo__private    # Still 42
foo.__dict__         # {'_Foo__private': 42, '__private': 99}

The second __private wasn't mangled because the assignment happened outside the class body. Mangling is applied when the class body is compiled, and only to identifiers written there, never to strings. Now we have two attributes. Confusing!
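
Here's a hedged sketch of the same trap from another angle. The method names are invented for the demo. It shows that mangling is applied to identifiers in the class body, while a string passed to getattr() is left alone, even inside the class.

```python
class Foo:
    def __init__(self):
        self.__private = 42               # compiled as self._Foo__private

    def read_with_attribute(self):
        return self.__private             # mangled, finds the real value

    def read_with_getattr(self):
        # Strings are never mangled, even inside the class body.
        return getattr(self, "__private", "missing")


foo = Foo()
before = foo.read_with_getattr()          # no attribute named '__private' yet

foo.__private = 99                        # outside the class: not mangled
after = foo.read_with_getattr()           # finds the new, unmangled attribute
real = foo.read_with_attribute()          # the mangled one is untouched

print(before, after, real)
print(vars(foo))
```

So if you need dynamic access to a mangled attribute, you have to spell out the mangled name yourself, which is usually a hint to use a single underscore instead.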

Double underscores on both sides: __name__

These are 'dunder' methods (double underscore). They're Python's magic methods - the language calls them automatically in specific situations.

class Vector:
    def __init__(self, x, y):  # Called when you create an instance
        self.x = x
        self.y = y

    def __repr__(self):  # Called by repr() and in the REPL
        return f'Vector({self.x}, {self.y})'

    def __str__(self):  # Called by str() and print()
        return f'({self.x}, {self.y})'

    def __add__(self, other):  # Called for + operator
        return Vector(self.x + other.x, self.y + other.y)

    def __len__(self):  # Called by len()
        return 2

    def __eq__(self, other):  # Called for == comparison
        return self.x == other.x and self.y == other.y

v1 = Vector(1, 2)
v2 = Vector(3, 4)
print(v1 + v2)  # (4, 6) - __add__ builds the new Vector, print() displays it via __str__
print(len(v1))  # 2 - uses __len__
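
One thing the Vector above glosses over: its __eq__ raises AttributeError if you compare against something that isn't a Vector, and defining __eq__ without __hash__ makes instances unhashable. Here's a slightly more defensive sketch. Treat it as one reasonable way to harden the class, not the only one.

```python
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

    def __eq__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented         # let Python try the other operand
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Defining __eq__ sets __hash__ to None unless you provide one.
        return hash((self.x, self.y))

    def __add__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.x + other.x, self.y + other.y)


same = Vector(1, 2) == Vector(1, 2)
different = Vector(1, 2) == "not a vector"   # no AttributeError
unique = {Vector(1, 2), Vector(1, 2)}        # usable in sets again
print(same, different, len(unique))
```

Returning NotImplemented (not raising it) tells Python to fall back to the other operand or the default behavior. The hash is only safe if you treat x and y as read-only after creation.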

The most useful dunder methods

There are dozens of dunder methods, but here are the ones I use constantly:

Object lifecycle:
- __init__: Initialize an instance
- __del__: Called when object is about to be destroyed (rarely needed)
- __new__: Create a new instance, runs before __init__ (for immutable types, singletons, and metaclass stuff - fairly advanced)

String representation:
- __repr__: Official string representation, for developers
- __str__: Informal string representation, for end users
- __format__: Used by f-strings and format()

Comparison operators:
- __eq__: Equality (==)
- __ne__: Not equal (!=)
- __lt__: Less than (<)
- __le__: Less than or equal (<=)
- __gt__: Greater than (>)
- __ge__: Greater than or equal (>=)

Math operators:
- __add__: Addition (+)
- __sub__: Subtraction (-)
- __mul__: Multiplication (*)
- __truediv__: Division (/)
- __floordiv__: Floor division (//)
- __mod__: Modulo (%)
- __pow__: Power (**)

Container emulation:
- __len__: len() function
- __getitem__: Indexing (obj[key])
- __setitem__: Assignment to index (obj[key] = value)
- __delitem__: Delete item (del obj[key])
- __contains__: Membership test (in operator)
- __iter__: Make object iterable

Context managers:
- __enter__: Entering a with block
- __exit__: Exiting a with block
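
The context manager pair is the one people tend to find least obvious, so here's a minimal sketch. Timer is a hypothetical helper I made up for this example. It records how long the with block took.

```python
import time


class Timer:
    """Hypothetical helper that measures how long a with block takes."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self                       # this is the value bound by "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self.start
        return False                      # do not swallow exceptions


with Timer() as timer:
    total = sum(range(1000))

print(total, timer.elapsed >= 0)
```

__exit__ runs even if the block raises. Returning False (or None) lets the exception propagate, while returning True would suppress it.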

Here's a practical example using container emulation:

class CaseInsensitiveDict:
    """A dictionary that treats keys case-insensitively."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __contains__(self, key):
        return key.lower() in self._data

    def __len__(self):
        return len(self._data)

    def __repr__(self):
        return f'CaseInsensitiveDict({self._data})'

d = CaseInsensitiveDict()
d['Name'] = 'Alice'
print(d['name'])      # 'Alice' - case insensitive!
print(d['NAME'])      # 'Alice'
print('NaMe' in d)    # True
print(len(d))         # 1

This makes your class behave like a built-in type. Users can interact with it naturally.

Never invent your own dunders

Never invent your own dunder names. They're reserved for Python's use, and future Python versions might add dunders that conflict with yours.

# DON'T DO THIS
class Bad:
    def __my_method__(self):  # Bad! You don't own this namespace
        pass

# DO THIS INSTEAD
class Good:
    def my_method(self):  # Good! Plain names are fine
        pass

    def _my_internal_method(self):  # Good! Internal with single underscore
        pass

Stick to the documented ones. There's a complete list in the Python data model docs.

Single underscore by itself: _

The lone underscore has two uses. First, as a throwaway variable when you don't care about a value:

# We just want to run something 10 times, don't care about index
for _ in range(10):
    do_something()

# Unpacking when we only care about some values
first, _, _, last = ['a', 'b', 'c', 'd']

# Ignoring a return value
_, extension = os.path.splitext(filename)

Advanced unpacking with _

You can reuse _ for every value you want to ignore, and the starred form *_ (available since Python 3.0) swallows any number of them:

# Verbose way - a separate name for each ignored value
first, _1, _2, _3, last = [1, 2, 3, 4, 5]

# Better way - reuse _ for all ignored values
first, _, _, _, last = [1, 2, 3, 4, 5]

# Even better with * operator
first, *_, last = [1, 2, 3, 4, 5]

The *_ syntax means "capture everything in between, but I don't care about it."

Here's a practical example:

def parse_log_line(line):
    # Log format: timestamp|level|module|message
    # We only care about level and message
    _, level, _, message = line.split('|')
    return level, message

# Or with *_:
def parse_csv_row(row):
    # name,email,phone,address,city,state,zip
    # We only need name and email
    name, email, *_ = row.split(',')
    return name, email
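
Keep in mind that _ is still an ordinary variable after unpacking. This small sketch, with made-up sample data, shows what it ends up holding:

```python
first, _, _, last = ["a", "b", "c", "d"]
# _ was assigned twice; it keeps the last value it received.
leftover = _

name, email, *_ = "Ann,ann@example.com,555-0100,Main St".split(",")
# With a star, _ is a list of everything that was not unpacked.
rest = _

print(leftover)
print(rest)
```

Nothing is discarded early. The ignored values stay referenced by _ until it's reassigned or goes out of scope, which can matter if they're large.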

The REPL magic

Second, in the interactive interpreter, _ holds the last expression's result:

>>> 2 + 2
4
>>> _ * 3
12
>>> _
12
>>> import math
>>> math.sqrt(144)
12.0
>>> _ / 2
6.0

This is super handy for quick calculations in the REPL. But be careful - it only works in interactive mode, not in scripts:

# This WILL NOT WORK in a .py file
x = 2 + 2
print(_)  # NameError: name '_' is not defined

When _ is actually a variable

One gotcha: _ is still a valid variable name. If you assign to it, you overwrite the REPL behavior:

>>> 2 + 2
4
>>> _ = 100  # Oops, overwrote it
>>> 3 + 3
6
>>> _
100  # Still 100, not 6!
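
Why does that happen? In CPython the interactive result is stored in builtins._ by the default display hook, and a global named _ simply shadows it. Here's a rough simulation you can run as a script. The session dict stands in for the REPL's global namespace, which is an assumption made for the demo.

```python
import builtins
import sys

# The default display hook prints the value and stores it in builtins._
sys.__displayhook__(2 + 2)
stored = builtins._

# Simulate a user session with its own global namespace.
session = {}
exec("_ = 100", session)          # user assignment creates a global _
sys.__displayhook__(3 + 3)        # the hook still updates builtins._
shadowed = eval("_", session)     # the global wins the lookup

exec("del _", session)            # remove the global
restored = eval("_", session)     # lookup falls back to builtins._

print(stored, shadowed, restored)
```

So the fix in a real REPL session is del _, after which the last-result behavior comes back.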

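The same ambiguity shows up outside the REPL: _ is a normal variable, so nothing stops you from reading it, and some linters flag code that uses a name that looks like a throwaway. It's technically allowed, but can be confusing: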

# Confusing - is _ throwaway or actual variable?
for _ in range(10):
    if _ > 5:  # Wait, we're using it?
        print(_)

# Better - use descriptive name if you need the value
for i in range(10):
    if i > 5:
        print(i)

In numeric literals: 1_000_000

Python 3.6 added underscore separators in numbers. They're purely visual - the interpreter ignores them completely.

# These are all the same number
million = 1000000
million = 1_000_000  # Much easier to read
million = 10_00_000  # Any position between digits is allowed, but please don't

# Works for all numeric types
binary = 0b1111_0000  # Binary with separator
hex_color = 0xFF_00_FF  # Hex with separator
big_float = 3.14159_26535  # Float with separator

Use them for long numbers where grouping helps readability. Don't go overboard - 1_0 is technically valid but confusing.
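
Underscores also show up when converting to and from strings. A quick sketch of the behavior I'm confident about (Python 3.6+):

```python
million = 1_000_000

parsed_int = int("1_000_000")         # string parsing accepts underscores too
parsed_float = float("1_000.5")
formatted = f"{million:_}"            # "_" is also a format grouping option

try:
    int("1__000")                     # consecutive underscores are rejected
    doubled_ok = True
except ValueError:
    doubled_ok = False

print(parsed_int, parsed_float, formatted, doubled_ok)
```

The same placement rules apply in source code: an underscore has to sit between two digits, so 1__000, _1000 (that's a variable name) and 1000_ are not valid number literals.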

The pattern summary

Pattern	Name	Meaning
`_name`	Single leading	Convention for internal use
`__name`	Double leading	Name mangling (subclass protection)
`__name__`	Double both sides	Magic/dunder methods
`_`	Single underscore	Throwaway variable; last result in the REPL
`1_000`	In numbers	Visual separator

The one you should actually use

For most code, just use single leading underscore for internal stuff and dunders for magic methods. That's 95% of what you need.

Double leading underscore (name mangling) is niche. I've used it maybe twice in a decade of Python.