structlog > logging
The Python logging module is fine. But structlog is better. And once you try it, you won't go back.
The problem with traditional logging
import logging
logger = logging.getLogger(__name__)
logger.info(f'User {user_id} logged in from {ip_address}')
# Output: INFO:auth:User 123 logged in from 192.168.1.1
This works, but now try answering: 'Show me all logins from IP 192.168.1.1'. You'd have to parse that string with regex or something. Gross.
And what if you log it differently elsewhere?
logger.info(f'{user_id} authenticated successfully from {ip_address}')
Now your parsing breaks. The data is there, but it's buried in human-readable text with inconsistent formatting.
Structured logging: data first
import structlog
logger = structlog.get_logger()
logger.info('user_logged_in', user_id=user_id, ip_address=ip_address)
Output (in JSON mode):
{"event": "user_logged_in", "user_id": 123, "ip_address": "192.168.1.1", "timestamp": "2025-06-15T10:30:00Z"}
Now every field is queryable. Want all logins from an IP? Filter on ip_address. Want all events for a user? Filter on user_id. No parsing required.
The key insight
Log data, not messages. Everything you might want to search for should be a separate field, not buried in a string.
Bad:
logger.info(f'Payment of ${amount} processed for order {order_id}')
Good:
logger.info(
    'payment_processed', order_id=order_id, amount=amount, currency='USD'
)
Context binding
This is where structlog really shines. You can bind context that persists across multiple log calls:
logger = logger.bind(request_id=request_id, user_id=user_id)
# Now every log includes request_id and user_id automatically
logger.info('processing_started')
# do stuff...
logger.info('processing_complete')
Both log lines will have request_id and user_id without you typing them again. This is huge for tracing requests through your system.
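With the JSON renderer, those two calls come out with the bound fields already attached; the values here are made up for illustration:
{"event": "processing_started", "request_id": "req-8f14e45f", "user_id": 123, "timestamp": "2025-06-15T10:30:00Z"}
{"event": "processing_complete", "request_id": "req-8f14e45f", "user_id": 123, "timestamp": "2025-06-15T10:30:01Z"}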
Pretty output in development
JSON is great for production (machines read it), but annoying in development. structlog can switch between formats:
import structlog
if settings.debug:
    structlog.configure(
        processors=[
            structlog.dev.ConsoleRenderer()
        ]
    )
else:
    structlog.configure(
        processors=[
            structlog.processors.JSONRenderer()
        ]
    )
In dev you get colorful, human-readable output. In production you get JSON.
Getting started
uv add structlog
Basic setup:
import structlog
structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt='iso'),
        structlog.processors.JSONRenderer(),
    ]
)
logger = structlog.get_logger()
logger.info('app_started', version='1.0.0')
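That last call prints a single JSON line, roughly like this (your timestamp will differ):
{"event": "app_started", "version": "1.0.0", "timestamp": "2025-06-15T10:30:00Z"}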
That's enough to get going. There's more you can configure, but start simple.
Processors: The power under the hood
Processors are what make structlog flexible. They're a chain of functions that transform your log events. Here's a more complete setup:
import logging
import structlog
structlog.configure(
    processors=[
        # Add log level to the event dict
        structlog.stdlib.add_log_level,
        # Add a timestamp in ISO format
        structlog.processors.TimeStamper(fmt='iso'),
        # Stack traces for exceptions
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        # Final renderer
        structlog.processors.JSONRenderer(),
    ],
    # Drop anything below INFO before the processor chain runs
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
)
Each processor does one thing: the timestamper adds timestamps, the exception formatter handles tracebacks, and level filtering happens in the wrapper class before the chain runs at all. You can write custom processors too:
def drop_debug_logs(logger, method_name, event_dict):
    if method_name == 'debug':
        raise structlog.DropEvent
    return event_dict
Add it to your processor chain and debug logs disappear. Simple.
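Wiring it in just means putting it in the chain before the renderer. A minimal sketch, reusing the basic setup from earlier (the cache events are only example calls):
import structlog
structlog.configure(
    processors=[
        drop_debug_logs,  # runs first, raises DropEvent for debug calls
        structlog.processors.TimeStamper(fmt='iso'),
        structlog.processors.JSONRenderer(),
    ]
)
logger = structlog.get_logger()
logger.debug('cache_hit', key='user:123')   # silently dropped
logger.info('cache_miss', key='user:123')   # rendered as JSON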
Playing nice with stdlib logging
Your dependencies probably use Python's logging module. structlog can capture those logs too:
import logging
import structlog
structlog.configure(
    processors=[
        structlog.stdlib.filter_by_level,
        structlog.stdlib.add_logger_name,
        structlog.stdlib.add_log_level,
        structlog.processors.TimeStamper(fmt='iso'),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        # Hand the event dict off to ProcessorFormatter below
        structlog.stdlib.ProcessorFormatter.wrap_for_formatter,
    ],
    wrapper_class=structlog.stdlib.BoundLogger,
    logger_factory=structlog.stdlib.LoggerFactory(),
)
# Route stdlib logging through the same renderer
formatter = structlog.stdlib.ProcessorFormatter(
    # Processors applied to stdlib records before rendering
    foreign_pre_chain=[
        structlog.stdlib.add_log_level,
        structlog.processors.TimeStamper(fmt='iso'),
    ],
    processor=structlog.processors.JSONRenderer(),
)
handler = logging.StreamHandler()
handler.setFormatter(formatter)
root_logger = logging.getLogger()
root_logger.addHandler(handler)
root_logger.setLevel(logging.INFO)
Now when some library does logging.getLogger(__name__).info('something'), it gets processed through structlog's pipeline. Everything ends up in the same format.
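A quick sanity check is to log from both sides and watch the two lines come out in the same JSON shape (the library name here is invented):
import logging
import structlog
# A dependency still using stdlib logging
logging.getLogger('some.library').warning('connection retry scheduled')
# Your code using structlog
structlog.get_logger().warning('retry_scheduled', attempt=2, backoff_seconds=5)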
Why this matters in production
When something breaks at 2am, you're not going to want to read through pages of text logs. You want to query:
- 'Show me all errors in the last hour'
- 'Show me all requests from this user'
- 'Show me everything with this request_id'
Structured logs + a log aggregator (Datadog, Elasticsearch, whatever) make this trivial. Unstructured logs make it painful.
The 10 minutes it takes to switch to structlog will save you hours of debugging.
Real-world patterns that actually work
Let's talk about how to use this in actual applications. Here are patterns I use constantly.
Request tracking across services
You know how a single user action can trigger multiple backend services? Tracking that flow is a nightmare with traditional logging. With structlog and context binding, it's easy:
# In your API gateway or initial handler
import uuid
from contextvars import ContextVar
import structlog
request_id_var = ContextVar('request_id', default=None)
@app.middleware('http')
async def add_request_id(request, call_next):
    request_id = request.headers.get('X-Request-ID', str(uuid.uuid4()))
    request_id_var.set(request_id)
    logger = structlog.get_logger().bind(request_id=request_id)
    logger.info('request_started', method=request.method, path=request.url.path)
    response = await call_next(request)
    logger.info('request_completed', status_code=response.status_code)
    return response
Now every log in that request context automatically includes the request_id. When you make calls to other services, pass it along:
async def call_payment_service(order_id, amount):
    logger = structlog.get_logger()
    logger.info('calling_payment_service', order_id=order_id, amount=amount)
    response = await http_client.post(
        '/payments',
        json={'order_id': order_id, 'amount': amount},
        headers={'X-Request-ID': request_id_var.get()}
    )
    logger.info('payment_service_responded', status=response.status_code)
    return response
In your payment service, extract the request_id and bind it there too. Now you can trace a single request across your entire stack. Just search your logs for that request_id.
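On the payment service side it's roughly the mirror image; this sketch assumes a FastAPI-style app (payment_app and the event names are made up for illustration):
import structlog
@payment_app.middleware('http')
async def bind_incoming_request_id(request, call_next):
    # Reuse the caller's request_id so both services share one trace key
    request_id = request.headers.get('X-Request-ID')
    logger = structlog.get_logger().bind(request_id=request_id, service='payments')
    logger.info('payment_request_received', path=request.url.path)
    response = await call_next(request)
    logger.info('payment_request_handled', status_code=response.status_code)
    return response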
Logging database queries
You want to know which queries are slow, but you don't want to spam logs. Here's how I handle it:
import time
import structlog
logger = structlog.get_logger()
async def execute_query(query, params=None):
    start = time.perf_counter()
    logger.debug('query_started', query=query[:100])  # Truncate long queries
    result = await db.execute(query, params)
    duration = time.perf_counter() - start
    # Only log slow queries as warnings
    if duration > 0.1:  # 100ms threshold
        logger.warning(
            'slow_query',
            query=query[:100],
            duration_ms=round(duration * 1000, 2),
            row_count=len(result)
        )
    else:
        logger.debug(
            'query_completed',
            duration_ms=round(duration * 1000, 2),
            row_count=len(result)
        )
    return result
In production, you set the log level to INFO or WARNING, so you only see slow queries. But in development with DEBUG level, you see everything. And because it's structured, you can easily graph query performance over time in your monitoring tool.
Error context that actually helps
When an exception happens, you want context. Not just the stacktrace, but the state of your application when it died:
def process_order(order_id):
    logger = structlog.get_logger().bind(order_id=order_id)
    try:
        logger.info('order_processing_started')
        order = get_order(order_id)
        logger = logger.bind(
            user_id=order.user_id,
            total_amount=order.total,
            item_count=len(order.items)
        )
        logger.info('order_loaded')
        payment = charge_card(order.total, order.payment_method)
        logger = logger.bind(payment_id=payment.id)
        logger.info('payment_charged')
        shipment = create_shipment(order)
        logger = logger.bind(shipment_id=shipment.id)
        logger.info('shipment_created')
        logger.info('order_processing_complete')
    except Exception as e:
        logger.exception(
            'order_processing_failed',
            error_type=type(e).__name__,
            error_message=str(e)
        )
        raise
When this fails, your logs show exactly where it failed and what the order looked like. The bound context accumulates as you go, so by the time an error happens, you have all the relevant IDs.
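For example, if charge_card raises, the failure event already carries everything bound before that point; with the JSON renderer it looks roughly like this (values invented, traceback omitted):
{"event": "order_processing_failed", "order_id": 98765, "user_id": 123, "total_amount": 59.99, "item_count": 3, "error_type": "CardDeclinedError", "error_message": "card declined", "timestamp": "2025-06-15T10:30:02Z"}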
Testing with structlog
Testing logging is usually painful, but structlog makes it reasonable. You can capture log events and assert against the structured data:
import structlog
from structlog.testing import LogCapture
def test_user_creation_logs():
    cap = LogCapture()
    structlog.configure(processors=[cap])
    logger = structlog.get_logger()
    # Your code that logs
    create_user('alice@example.com')
    # Assert against structured logs
    assert len(cap.entries) == 2
    assert cap.entries[0]['event'] == 'user_creation_started'
    assert cap.entries[0]['email'] == 'alice@example.com'
    assert cap.entries[1]['event'] == 'user_creation_completed'
    assert 'user_id' in cap.entries[1]
Way better than parsing strings or using mocks.
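If you'd rather not touch global configuration inside a test, structlog.testing also provides a capture_logs() context manager; here's a minimal sketch using the same hypothetical create_user:
from structlog.testing import capture_logs
def test_user_creation_logs_events():
    with capture_logs() as cap_logs:
        create_user('alice@example.com')
    events = [entry['event'] for entry in cap_logs]
    assert 'user_creation_started' in events
    assert 'user_creation_completed' in events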
Common gotchas and how to avoid them
Don't log sensitive data
This should be obvious, but when you're logging structured data, it's easy to accidentally log passwords, credit cards, or API keys:
# BAD - logs the password
logger.info('user_authenticated', username=username, password=password)
# GOOD
logger.info('user_authenticated', username=username)
Write a custom processor to scrub sensitive fields:
SENSITIVE_KEYS = {'password', 'credit_card', 'ssn', 'api_key', 'secret'}
def scrub_sensitive_data(logger, method_name, event_dict):
    for key in SENSITIVE_KEYS:
        if key in event_dict:
            event_dict[key] = '***REDACTED***'
    return event_dict
structlog.configure(
    processors=[
        scrub_sensitive_data,
        # ... other processors
    ]
)
Now even if someone accidentally logs a password, it gets scrubbed.
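So the 'BAD' call from above would come out with the secret blanked, something like this (values illustrative):
{"event": "user_authenticated", "username": "alice", "password": "***REDACTED***"}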
Log level discipline
Just because you can log everything doesn't mean you should. Use levels appropriately:
- DEBUG: Detailed information for diagnosing problems. Should be noisy.
- INFO: General informational messages. Normal operation events.
- WARNING: Something unexpected but not broken. Slow queries, deprecated API usage, etc.
- ERROR: Something broke but the application can continue.
- CRITICAL: Something broke and the application might be dying.
In production, you typically run at INFO level. So anything you log at DEBUG won't show up. Reserve DEBUG for things like 'cache hit' or 'parsing started'. Use INFO for 'user logged in' or 'order completed'.
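How you set that minimum level depends on your setup. With the stdlib integration it's the normal logging level; with plain structlog you can use the filtering wrapper class. A sketch of both:
import logging
import structlog
# Plain structlog: drop anything below INFO in the wrapper class
structlog.configure(
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
)
# With the stdlib integration from earlier: set the level on the root logger
logging.getLogger().setLevel(logging.INFO)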
Watch your cardinality
Every unique field value creates a new series in your monitoring system. If you log something like:
logger.info('request', url=request.url)
And your app has a million different URLs, that's a million different series. Some log aggregators will hate you for this. Instead, log the route pattern:
# BAD - high cardinality
logger.info('request', url='/users/12345/orders/67890')
# GOOD - low cardinality
logger.info('request', route='/users/:user_id/orders/:order_id', user_id=12345, order_id=67890)
Now you can still filter by user_id or order_id, but you're not creating a new series for every unique URL.
Performance considerations
Logging has overhead. Structured logging has slightly more overhead than string formatting (you're building dicts, running processors, serializing to JSON). In practice, it's negligible, but here are some tips:
Lazy evaluation
Don't do expensive work before calling the logger:
import logging
# BAD - computes stats even if debug logging is disabled
expensive_stats = compute_statistics(huge_dataset)
logger.debug('stats_computed', stats=expensive_stats)
# GOOD - only computes if debug is enabled
# (isEnabledFor is available when you use the stdlib BoundLogger wrapper)
if logger.isEnabledFor(logging.DEBUG):
    expensive_stats = compute_statistics(huge_dataset)
    logger.debug('stats_computed', stats=expensive_stats)
Avoid logging in hot loops
If you're processing thousands of items per second, don't log each one:
# BAD - will kill your performance
for item in millions_of_items:
    logger.debug('processing_item', item_id=item.id)
    process(item)
# GOOD - log batches
for i, item in enumerate(millions_of_items):
    process(item)
    if i % 1000 == 0:
        logger.info('batch_processed', count=i)
Or use sampling:
import random
for item in millions_of_items:
    process(item)
    if random.random() < 0.01:  # Log 1% of items
        logger.debug('processing_item', item_id=item.id)
Working with different environments
You want different logging behavior in development, staging, and production. Here's a config pattern that works:
import structlog
def configure_logging(environment):
    processors = [
        structlog.stdlib.add_log_level,
        structlog.processors.TimeStamper(fmt='iso'),
    ]
    if environment == 'development':
        processors.append(structlog.dev.ConsoleRenderer())
    else:
        processors.extend([
            structlog.processors.StackInfoRenderer(),
            structlog.processors.format_exc_info,
            structlog.processors.JSONRenderer()
        ])
    structlog.configure(
        processors=processors,
        wrapper_class=structlog.stdlib.BoundLogger,
        cache_logger_on_first_use=True,
    )
# In your app startup
configure_logging(settings.ENVIRONMENT)
Development gets pretty colors. Production gets JSON. Easy.
Integration with monitoring tools
The real power of structured logging shows up when you connect it to modern observability platforms.
Datadog
import structlog
structlog.configure(
    processors=[
        structlog.stdlib.add_log_level,
        structlog.processors.TimeStamper(fmt='iso'),
        structlog.processors.JSONRenderer()
    ]
)
logger = structlog.get_logger()
logger.info('user_signup', user_id=123, plan='pro', source='organic')
In Datadog, you can now create dashboards and alerts based on these fields:
- Graph signups by plan over time
- Alert when signup failures spike
- Filter all events for a specific user_id
Elasticsearch/Kibana
Same logs, different platform. Your JSON logs go straight into Elasticsearch:
{"event": "user_signup", "user_id": 123, "plan": "pro", "timestamp": "2025-06-15T10:30:00Z"}
Now in Kibana you can:
- Create visualizations of user behavior
- Search for all events in a time range
- Build dashboards tracking business metrics
The key is that you wrote the logging code once, and it works with any platform that understands JSON. You're not locked into a vendor-specific API.
Sentry integration
For error tracking, you can bind Sentry context to structlog:
import structlog
import sentry_sdk
def add_sentry_context(logger, method_name, event_dict):
    # Add structured log data to Sentry breadcrumbs
    # (assumes sentry_sdk.init() was already called at startup)
    sentry_sdk.add_breadcrumb(
        category='log',
        message=event_dict.get('event', ''),
        level=method_name,
        data=event_dict
    )
    return event_dict
structlog.configure(
    processors=[
        add_sentry_context,
        structlog.processors.TimeStamper(fmt='iso'),
        structlog.processors.JSONRenderer()
    ]
)
Now when an error gets sent to Sentry, it includes all your structured log context as breadcrumbs. You can see exactly what the user was doing before the error happened.
Migrating from stdlib logging
Already have a codebase using logging? You don't have to rewrite everything at once. Start with new code and gradually migrate:
import structlog
import logging
# Old code can keep working
old_logger = logging.getLogger(__name__)
old_logger.info('This still works')
# New code uses structlog
new_logger = structlog.get_logger()
new_logger.info('new_feature_started', user_id=123)
Both will work side-by-side. When you have time, convert the old logging calls. The migration is usually just:
# Before
logger.info(f'User {user_id} completed action {action}')
# After
logger.info('action_completed', user_id=user_id, action=action)
Takes like 30 seconds per call.
Migration strategy that works
Here's how I approach it:
- Configure structlog to capture stdlib logs (shown earlier). This ensures everything ends up in the same format.
- Start with new features. Any new code you write uses structlog from day one.
- Convert as you touch files. When you're fixing a bug or adding a feature to an existing file, convert its logging calls while you're in there.
- Don't stress about 100% coverage. Some old code that rarely changes can stay on stdlib logging. It's fine.
The gradual approach works. I've migrated multiple production codebases this way, and it's never caused issues. The two systems coexist peacefully.
The bottom line
I've used both stdlib logging and structlog in production. I've debugged incidents at 3am with both. And I can tell you: structured logging is just better.
You spend less time grepping through logs looking for patterns. You spend less time writing regex parsers. You spend more time actually fixing problems.
Is it more to set up initially? Barely. Is it worth it? Absolutely.
Start your next project with structlog. Or add it to your current one. Future you will thank present you.