Making CLI Tools in Python

Python is great for CLI tools. I've built a bunch of them: tools for automating workflows, processing data, and talking to APIs. Here's how I structure them.

Why Python for CLI tools?

Before diving into the how, let's talk about why Python is my go-to for CLI tools. It's not the fastest language, and it's not the most portable (you need Python installed), but it hits a sweet spot:

  • Quick to write: You can prototype a tool in minutes, not hours
  • Batteries included: The standard library covers most common tasks
  • Great ecosystem: Need to parse JSON, YAML, TOML? There's a package. Need to make HTTP requests? requests or httpx. Need to parse HTML? beautifulsoup4. You get the idea.
  • Easy to distribute: Once you know the tricks (which I'll show you), sharing tools is trivial

I've written CLI tools in Rust, Go, and shell script too. Rust is great when you need performance or single-binary distribution. Go is solid for system tools. But for most tasks? Python gets me to done faster.

The basic structure

I usually organize CLI tools like this:

my-tool/
  pyproject.toml
  README.md
  src/
    my_tool/
      __init__.py
      cli.py      # Entry point
      core.py     # Actual logic
      utils.py    # Helper functions
  tests/
    test_core.py

Keep the CLI parsing separate from the business logic. That way you can test the logic independently and potentially reuse it as a library. I learned this the hard way after writing a data processing tool where the parsing logic was so tangled with argparse that I couldn't reuse any of it when I needed the same functionality in a web service.

Here's a concrete example. Instead of this:

# Bad: everything mixed together
import argparse
import json

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('file')
    args = parser.parse_args()

    # Logic buried in CLI code
    with open(args.file) as f:
        data = json.load(f)
        for item in data:
            ...  # 50 lines of processing

Do this:

# Good: separated concerns
# core.py
import json
from pathlib import Path
from typing import List

def process_data(file_path: Path) -> List[Result]:
    with open(file_path) as f:
        data = json.load(f)
        return [transform(item) for item in data]

# cli.py
import argparse
from pathlib import Path

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('file')
    args = parser.parse_args()

    results = process_data(Path(args.file))
    display_results(results)

Now process_data can be imported and tested independently. You can use it in a web app, a notebook, wherever.

Start with argparse

It's built in and good enough for most things:

import argparse
import sys

def main():
    parser = argparse.ArgumentParser(
        description='Process some files'
    )
    parser.add_argument('filename', help='File to process')
    parser.add_argument(
        '-v', '--verbose',
        action='store_true',
        help='Enable verbose output'
    )
    parser.add_argument(
        '-o', '--output',
        default='output.txt',
        help='Output file (default: output.txt)'
    )

    args = parser.parse_args()
    # Do stuff with args.filename, args.verbose, args.output

if __name__ == '__main__':
    main()

argparse gives you --help for free and handles a lot of edge cases. Don't reinvent this wheel.
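
For the parser above, --help prints something like this (the program name comes from the script name, and Pythons before 3.10 say 'optional arguments:' instead of 'options:'):

usage: my-tool [-h] [-v] [-o OUTPUT] filename

Process some files

positional arguments:
  filename              File to process

options:
  -h, --help            show this help message and exit
  -v, --verbose         Enable verbose output
  -o OUTPUT, --output OUTPUT
                        Output file (default: output.txt)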

argparse tips I wish I'd known earlier

Use type for validation:

def positive_int(value):
    ivalue = int(value)
    if ivalue <= 0:
        raise argparse.ArgumentTypeError(f"{value} must be positive")
    return ivalue

parser.add_argument('--timeout', type=positive_int, default=30)

Now argparse validates the input for you; on bad input it prints your error message and exits with status 2. No need to check it manually later.

Use choices for enums:

parser.add_argument(
    '--format',
    choices=['json', 'yaml', 'toml'],
    default='json',
    help='Output format'
)

Users get validation for free, and the allowed values show up in --help. (Shell tab completion takes extra setup, e.g. via the argcomplete package.)

Multiple values with nargs:

# Accept multiple files
parser.add_argument('files', nargs='+', help='Files to process')

# Optional multiple values
parser.add_argument('--exclude', nargs='*', default=[])

nargs='+' means "one or more", nargs='*' means "zero or more".
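
To make the difference concrete, here's a throwaway parser fed explicit argv lists, just for illustration:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('files', nargs='+')
parser.add_argument('--exclude', nargs='*', default=[])

args = parser.parse_args(['a.txt', 'b.txt', '--exclude', 'tmp', 'cache'])
print(args.files)    # ['a.txt', 'b.txt']
print(args.exclude)  # ['tmp', 'cache']

args = parser.parse_args(['a.txt'])
print(args.exclude)  # [] (the default)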

Make it installable

The magic is in pyproject.toml:

[project]
name = 'my-tool'
version = '0.1.0'

[project.scripts]
my-tool = 'my_tool.cli:main'

After pip install -e . (or uv pip install -e .), you can run my-tool from anywhere on your system. No more python -m my_tool or path shenanigans.
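
One caveat: to build and publish (more on that below), you'll also want a [build-system] table telling pip which backend to use. A minimal sketch with hatchling - setuptools or flit work just as well:

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"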

Publishing your tool

When you're ready to share it, uv makes publishing trivial:

uv build
uv publish

This builds a wheel and source distribution, then uploads to PyPI. Users can then install it with pip install my-tool or, even better, uv tool install my-tool, which installs it in an isolated environment. I've found uv tool to be the cleanest way to distribute CLI tools - no virtual environment pollution, and users can upgrade or uninstall without affecting anything else.

Make output pretty with rich

The rich library makes terminal output look professional:

from rich.console import Console
from rich.progress import track
from rich.table import Table

console = Console()

# Colored output
console.print('[green]Success![/green]')
console.print('[red]Error:[/red] Something went wrong')

# Progress bars
for item in track(items, description='Processing...'):
    process(item)

# Nice tables
table = Table(title='Results')
table.add_column('Name')
table.add_column('Status')
table.add_row('thing1', '[green]OK[/green]')
table.add_row('thing2', '[red]Failed[/red]')
console.print(table)

rich makes your tool feel polished with minimal effort. Users notice.

More rich tricks

Spinners for long operations:

# console.status gives you a spinner without managing one yourself
with console.status('[bold green]Fetching data...') as status:
    data = fetch_from_api()  # Long operation
    status.update('[bold green]Processing...')
    process(data)  # Another long operation

Syntax highlighting:

from rich.syntax import Syntax

code = '''
def hello():
    print("world")
'''

syntax = Syntax(code, "python", theme="monokai", line_numbers=True)
console.print(syntax)

This is incredible for tools that generate code or show diffs.

Panels for important info:

from rich.panel import Panel

console.print(Panel(
    "[bold red]Warning:[/bold red] This will delete all files!",
    title="Danger Zone",
    border_style="red"
))

Handle errors gracefully

Don't just let exceptions bubble up. Catch them and show a nice message:

def main():
    try:
        args = parse_args()
        run(args)
    except FileNotFoundError as e:
        console.print(f'[red]Error:[/red] File not found: {e.filename}')
        sys.exit(1)
    except KeyboardInterrupt:
        console.print('\n[yellow]Cancelled by user[/yellow]')
        sys.exit(130)
    except Exception as e:
        console.print(f'[red]Unexpected error:[/red] {e}')
        sys.exit(1)

Exit codes matter. 0 means success, non-zero means failure. Use them correctly.

Common exit codes

Here are the conventions I follow:

  • 0: Success
  • 1: General error
  • 2: Misuse of command (bad arguments)
  • 130: Terminated by Ctrl+C (128 + SIGINT)

You can also use specific codes for your tool:

# exit_codes.py
SUCCESS = 0
INVALID_INPUT = 2
FILE_NOT_FOUND = 3
NETWORK_ERROR = 4
PERMISSION_ERROR = 5

# cli.py
from .exit_codes import FILE_NOT_FOUND, PERMISSION_ERROR

def main():
    try:
        result = process()
    except FileNotFoundError:
        console.print('[red]File not found[/red]')
        sys.exit(FILE_NOT_FOUND)
    except PermissionError:
        console.print('[red]Permission denied[/red]')
        sys.exit(PERMISSION_ERROR)

This is helpful for scripts that call your tool and need to know why it failed.

Logging done right

Most CLI tools don't need logging, but when yours does, keep it simple:

import logging

def setup_logging(verbose: bool):
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(
        format='%(asctime)s - %(levelname)s - %(message)s',
        level=level
    )

def main():
    args = parse_args()
    setup_logging(args.verbose)

    logging.info('Starting process')
    logging.debug('Using config: %s', config)

For tools that run unattended (cron jobs, CI), logging is essential. Add a --log-file option to write logs somewhere permanent. But for interactive tools, keep it minimal - users don't want walls of debug output cluttering their terminal.
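
Here's a sketch of how I'd wire up that --log-file option, extending setup_logging from above (the flag name is just my convention, nothing standard):

import logging

def setup_logging(verbose: bool, log_file: str | None = None):
    handlers: list[logging.Handler] = [logging.StreamHandler()]
    if log_file:
        # Keep a permanent copy of the logs for cron/CI runs
        handlers.append(logging.FileHandler(log_file))
    logging.basicConfig(
        format='%(asctime)s - %(levelname)s - %(message)s',
        level=logging.DEBUG if verbose else logging.INFO,
        handlers=handlers,
    )

# In the parser:
# parser.add_argument('--log-file', help='Write logs to this file')
# setup_logging(args.verbose, args.log_file)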

When to use click instead

If you have subcommands (like git status, git commit), click makes it much easier:

import click

@click.group()
def cli():
    pass

@cli.command()
@click.argument('name')
def create(name):
    '''Create a new thing'''
    click.echo(f'Creating {name}')

@cli.command()
@click.option('--all', is_flag=True)
def list(all):
    '''List all things'''
    click.echo('Listing...')

if __name__ == '__main__':
    cli()

click uses decorators which some people love and some hate. I reach for it when I need subcommands; otherwise argparse is simpler.

Testing CLI tools

The trick is to separate your logic from the CLI parsing:

# In core.py - pure logic, easy to test
def process_file(path: Path, verbose: bool) -> Result:
    ...

# In cli.py - thin wrapper
def main():
    args = parse_args()
    result = process_file(Path(args.filename), args.verbose)
    display_result(result)

Now you can test process_file directly without dealing with command-line parsing.
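
For example, with pytest you can hand process_file a temp file directly - a sketch, since the real Result type is elided above and the assertion depends on what it exposes:

# tests/test_core.py
from my_tool.core import process_file

def test_process_file(tmp_path):
    sample = tmp_path / 'input.json'
    sample.write_text('[{"name": "thing1"}]')
    result = process_file(sample, verbose=False)
    assert result is not None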

For integration tests that do need to test the CLI:

import subprocess
import sys

def test_cli_basic():
    result = subprocess.run(
        [sys.executable, '-m', 'my_tool', 'input.txt'],
        capture_output=True,
        text=True
    )
    assert result.returncode == 0
    assert 'Success' in result.stdout

def test_cli_error_handling():
    result = subprocess.run(
        [sys.executable, '-m', 'my_tool', 'nonexistent.txt'],
        capture_output=True,
        text=True
    )
    assert result.returncode != 0
    assert 'not found' in result.stderr.lower()

This actually runs your CLI as a subprocess, testing the whole thing end-to-end. (Note that python -m my_tool only works if the package has a __main__.py that calls main().)
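
Subprocess tests are thorough but slow. A lighter-weight option is to let the parsing helper accept an explicit argv list - parse_args(None) falls back to sys.argv[1:], so normal CLI use is unchanged, but tests can run everything in-process:

import argparse

def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('filename')
    parser.add_argument('-v', '--verbose', action='store_true')
    return parser.parse_args(argv)  # None means "use sys.argv[1:]"

def test_flag_parsing():
    args = parse_args(['input.txt', '--verbose'])
    assert args.verbose is True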

Bonus: config files

For tools with lots of options, let users put them in a config file:

from pathlib import Path
import tomllib  # Standard library in Python 3.11+; use the tomli package on older versions

def load_config():
    config_path = Path.home() / '.config' / 'my-tool' / 'config.toml'
    if config_path.exists():
        return tomllib.loads(config_path.read_text())
    return {}

Command-line args override config file, which overrides defaults. Users can set their preferences once and forget about them.

Here's how to merge them properly:

def get_settings(args) -> dict:
    # Start with defaults
    settings = {
        'timeout': 30,
        'retries': 3,
        'verbose': False,
    }

    # Override with config file
    config = load_config()
    settings.update(config)

    # Override with command line args
    if args.timeout is not None:
        settings['timeout'] = args.timeout
    if args.retries is not None:
        settings['retries'] = args.retries
    if args.verbose:
        settings['verbose'] = args.verbose

    return settings
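
One detail that makes this layering work: the argparse defaults must be None. Otherwise you can't tell "the user didn't pass --timeout" apart from "the user passed the built-in default":

parser.add_argument('--timeout', type=int, default=None)
parser.add_argument('--retries', type=int, default=None)
parser.add_argument('--verbose', action='store_true')  # False unless passed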

Working with stdin/stdout

Good CLI tools play well with pipes. Here's the pattern:

import sys

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('file', nargs='?', help='Input file (or stdin)')
    args = parser.parse_args()

    # Read from file or stdin
    if args.file and args.file != '-':
        with open(args.file) as f:
            data = f.read()
    else:
        data = sys.stdin.read()

    result = process(data)

    # Write to stdout
    print(result)

Now your tool works in pipelines:

cat data.txt | my-tool | grep important
my-tool input.txt | other-tool
echo "test" | my-tool

Detecting interactive vs piped output

When piped, you might want to skip colors and progress bars:

import sys

def is_interactive() -> bool:
    return sys.stdout.isatty()

def main():
    if is_interactive():
        console = Console()  # Colors and fancy output
    else:
        console = Console(force_terminal=False)  # Plain text

    # Or just check before showing progress
    if is_interactive():
        for item in track(items):
            process(item)
    else:
        for item in items:
            process(item)

Environment variables

For sensitive data (API keys, tokens), use environment variables:

import os

def get_api_key() -> str:
    key = os.environ.get('MY_TOOL_API_KEY')
    if not key:
        console.print('[red]Error:[/red] MY_TOOL_API_KEY not set')
        console.print('Set it with: export MY_TOOL_API_KEY=your-key')
        sys.exit(1)
    return key

Or use python-dotenv to load from a .env file:

from dotenv import load_dotenv

load_dotenv()  # Loads .env from current directory or parent
api_key = os.environ.get('MY_TOOL_API_KEY')

Never put secrets in config files that might get committed to git.

Performance tips

Python CLI tools can be slow to start. Here's how to keep them snappy:

Lazy imports

Don't import heavy libraries until you need them:

# Instead of this at the top
import pandas as pd
import numpy as np

def main():
    if args.analyze:
        # Do pandas stuff
        pass

# Do this
def main():
    if args.analyze:
        import pandas as pd  # Only import if needed
        # Do pandas stuff
        pass

This can shave hundreds of milliseconds off startup time. Run python -X importtime -m my_tool to see exactly where startup time goes.

Cache expensive operations

If you're fetching data from an API or processing large files repeatedly:

from pathlib import Path
import json
import hashlib
from datetime import datetime, timedelta

def get_cache_path(url: str) -> Path:
    cache_dir = Path.home() / '.cache' / 'my-tool'
    cache_dir.mkdir(parents=True, exist_ok=True)
    filename = hashlib.md5(url.encode()).hexdigest()
    return cache_dir / f'{filename}.json'

def fetch_with_cache(url: str, max_age: timedelta = timedelta(hours=1)):
    cache_path = get_cache_path(url)

    # Check if cache exists and is fresh
    if cache_path.exists():
        age = datetime.now() - datetime.fromtimestamp(cache_path.stat().st_mtime)
        if age < max_age:
            return json.loads(cache_path.read_text())

    # Fetch fresh data
    import httpx
    response = httpx.get(url)
    data = response.json()

    # Save to cache
    cache_path.write_text(json.dumps(data))
    return data

Users appreciate fast tools.

Real-world example: putting it together

Here's a complete example of a tool that fetches GitHub repo stats:

# github_stats.py
import argparse
import sys
from pathlib import Path
from rich.console import Console
from rich.table import Table
import httpx

console = Console()

def fetch_repo_stats(owner: str, repo: str, token: str | None = None):
    headers = {}
    if token:
        headers['Authorization'] = f'token {token}'

    url = f'https://api.github.com/repos/{owner}/{repo}'

    try:
        response = httpx.get(url, headers=headers, timeout=10)
        response.raise_for_status()
        return response.json()
    except httpx.HTTPStatusError as e:
        if e.response.status_code == 404:
            console.print(f'[red]Repository not found:[/red] {owner}/{repo}')
        else:
            console.print(f'[red]HTTP error:[/red] {e}')
        sys.exit(1)
    except httpx.RequestError as e:
        console.print(f'[red]Network error:[/red] {e}')
        sys.exit(1)

def display_stats(data):
    table = Table(title=data['full_name'])
    table.add_column('Metric', style='cyan')
    table.add_column('Value', style='green')

    table.add_row('Stars', str(data['stargazers_count']))
    table.add_row('Forks', str(data['forks_count']))
    table.add_row('Open Issues', str(data['open_issues_count']))
    table.add_row('Language', data['language'] or 'N/A')

    console.print(table)

def main():
    parser = argparse.ArgumentParser(
        description='Fetch GitHub repository statistics'
    )
    parser.add_argument('repo', help='Repository in owner/repo format')
    parser.add_argument(
        '--token',
        help='GitHub token (or set GITHUB_TOKEN env var)'
    )

    args = parser.parse_args()

    # Parse owner/repo
    try:
        owner, repo = args.repo.split('/')
    except ValueError:
        console.print('[red]Error:[/red] Repository must be in owner/repo format')
        sys.exit(2)

    # Get token from args or env
    import os
    token = args.token or os.environ.get('GITHUB_TOKEN')

    # Fetch and display
    with console.status('[bold green]Fetching stats...'):
        data = fetch_repo_stats(owner, repo, token)

    display_stats(data)

if __name__ == '__main__':
    main()

This demonstrates:

  • Argument parsing with validation
  • Error handling for network and HTTP errors
  • Environment variables for secrets
  • Rich output with tables and spinners
  • Proper exit codes
  • Separation of concerns (fetch, display, main)

Going further

Once you've mastered the basics, explore:

  • Typer: A modern alternative to argparse and click, built on type hints
  • Questionary: For interactive prompts (great for setup wizards)
  • sh: For running shell commands cleanly
  • textual: For building full TUI (terminal UI) apps with widgets
  • invoke: For task automation (like make but in Python)

Wrapping up

CLI tools are one of my favorite things to build. They're practical, they solve real problems, and they're satisfying to use. Start simple with argparse and rich, keep your logic separate from your parsing, and iterate based on feedback.

The best CLI tools are the ones that just work. They give helpful errors, they're fast, they play nice with other tools. Focus on the user experience and the rest follows.
