Pants Build System: The Sane Way to Run a Python Monorepo

You’ve got five services, three internal libraries, and a data pipeline all living in one repo. Congratulations — you’ve built a monorepo, and now you’re about to find out why everyone who did this before you looks slightly haunted.

The classic Python toolchain falls apart here fast. pip has no concept of a workspace. Poetry’s monorepo support is bolted on and fragile. Makefiles scale until they don’t, and then you spend a Friday debugging why make test-service-b silently skips the library it depends on. Bazel works, but you’ll spend two weeks writing BUILD files before you run a single test.

Pants sits in a different category. It was built specifically for this problem — and unlike Bazel, it actually has a workable Python story out of the box. This is a ground-up guide to adopting it, including the parts the documentation quietly glosses over.

Official repo: https://github.com/pantsbuild/pants


What Pants Actually Does

Before touching config, the mental model matters.

Pants is a build orchestrator with dependency inference. You don’t declare what depends on what in most cases — Pants reads your imports and figures it out. You write from mylib.utils import something, and Pants knows mylib needs to be on the path when running your tests.
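
To make the idea concrete, here is a toy sketch of import-based inference using Python’s standard ast module. It is not how Pants implements it (the real inference is considerably more involved); the function names and the first_party set are made up for illustration.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the dotted module names that a piece of source code imports."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped here.
            modules.add(node.module)
    return modules


def infer_first_party_deps(source: str, first_party: set[str]) -> set[str]:
    """Keep only the imports whose top-level package lives in this repo."""
    return {m for m in imported_modules(source) if m.split(".")[0] in first_party}


example = "import os\nfrom mylib.utils import something\nimport httpx\n"
deps = infer_first_party_deps(example, first_party={"mylib", "service_a"})
print(sorted(deps))  # ['mylib.utils']
```

The point of the sketch is that nothing beyond the source text is needed: the import statement alone is enough to tell that this file depends on mylib.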

It builds everything in hermetic sandboxes. Tests don’t share state. A test for service-a can’t accidentally pull in service-b’s virtualenv. This sounds annoying until the third time a CI flake turns out to be a path contamination bug you’d never catch locally.

Results are cached, locally and optionally in a remote cache. Run pants test :: on a clean checkout, change one line in service_a, and run again. Only the tests that depend on the changed file re-execute; everything else is served from cache.
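
The caching behaviour is easier to reason about with a small model. The sketch below hashes each target’s content together with the keys of its dependencies, so a change invalidates only the targets downstream of it. It is an illustration of the principle under simplifying assumptions (an acyclic graph, one blob per target, hypothetical target names), not Pants’ actual cache key, which covers more of the sandbox’s inputs.

```python
import hashlib


def cache_key(target: str, files: dict[str, bytes], deps: dict[str, list[str]]) -> str:
    """Hash a target's own content plus the keys of everything it depends on."""
    digest = hashlib.sha256(files[target])
    for dep in sorted(deps.get(target, [])):
        digest.update(cache_key(dep, files, deps).encode())
    return digest.hexdigest()


files = {
    "mylib": b"def something(): return 1\n",
    "service_a_tests": b"from mylib.utils import something\n",
    "service_b_tests": b"assert True\n",
}
deps = {"service_a_tests": ["mylib"]}  # service_b_tests depends on nothing

before = {t: cache_key(t, files, deps) for t in files}
files["mylib"] = b"def something(): return 2\n"  # edit one line in the library
after = {t: cache_key(t, files, deps) for t in files}

# Targets whose key changed would be re-run; the rest are cache hits.
stale = sorted(t for t in files if before[t] != after[t])
print(stale)  # ['mylib', 'service_a_tests']
```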

That’s the pitch. Now let’s make it work.


Repository Layout

Pants doesn’t mandate a directory structure, but it works best when you’re intentional. A typical Python monorepo:

repo-root/
├── pants.toml          # main Pants config
├── pants               # bootstrap script (checked in)
├── BUILD               # root build file (usually minimal)
├── 3rdparty/
│   └── python/
│       └── default.lock  # generated lockfile
├── src/
│   ├── service_a/
│   │   ├── BUILD
│   │   ├── main.py
│   │   └── tests/
│   │       ├── BUILD
│   │       └── test_main.py
│   ├── service_b/
│   │   └── ...
│   └── mylib/
│       ├── BUILD
│       └── utils.py
└── pyproject.toml      # used for tool config (black, isort, etc.)

The BUILD files are where you declare targets. They look like Python but they’re a DSL — don’t confuse them with setup.py.


Bootstrap and Initial Config

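Pants is fetched and run by a small bootstrap script that you check into the repo. Be aware that recent releases, including the 2.21 line pinned below, are normally run through the standalone pants launcher installed by get-pants.sh, so check the installation docs for the version you pin. The classic script setup: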

curl -L -o pants https://static.pantsbuild.org/setup/pants && chmod +x pants

Now pants.toml — this is where you spend most of your early time:

[GLOBAL]
pants_version = "2.21.0"
backend_packages = [
  "pants.backend.python",
  "pants.backend.python.lint.black",
  "pants.backend.python.lint.isort",
  "pants.backend.python.lint.flake8",
  "pants.backend.python.typecheck.mypy",
]

[python]
# Pin the interpreter constraint for the whole repo.
# Pants will refuse to run with a Python version outside this range.
interpreter_constraints = ["CPython>=3.11,<3.13"]

# Where to find third-party requirements
enable_resolves = true
default_resolve = "default"

[python.resolves]
# Named resolves — each maps to a lockfile
default = "3rdparty/python/default.lock"

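[black]
# Install black from the "default" resolve, so its version is pinned by the
# lockfile (requirements.txt must list it). As far as I know, the older
# per-tool version option is no longer accepted by recent Pants releases.
install_from_resolve = "default"
# black reads pyproject.toml automatically

[isort]
install_from_resolve = "default"
args = ["--profile=black"]

[mypy]
install_from_resolve = "default"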

The interpreter_constraints field is critical. It tells Pants which Python versions are valid across the repo. If CI uses Python 3.12 and a developer only has 3.10 installed, Pants fails on that machine because it cannot find an interpreter that satisfies the constraint. Get this pinned early.
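
A small preflight script can tell a developer whether their local interpreter falls inside that range before they hit a Pants error. This is a sketch, not part of Pants: the bounds are copied by hand from the constraint string in pants.toml, the satisfies helper is hypothetical, and it checks only the version number, not whether the interpreter is CPython.

```python
import sys

# Mirrors interpreter_constraints = ["CPython>=3.11,<3.13"] from pants.toml.
MIN_VERSION = (3, 11)    # inclusive lower bound
MAX_EXCLUSIVE = (3, 13)  # exclusive upper bound


def satisfies(version: tuple[int, int]) -> bool:
    """True if a (major, minor) version falls inside the repo's constraint."""
    return MIN_VERSION <= version < MAX_EXCLUSIVE


current = sys.version_info[:2]
status = "ok" if satisfies(current) else "outside the repo's interpreter constraint"
print(f"Python {current[0]}.{current[1]}: {status}")
```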


BUILD Files: What You Actually Write

Most targets are short. Here’s a real-world BUILD for an internal library:

# src/mylib/BUILD

python_sources(
    name="mylib",
    # sources defaults to ["*.py"] — no need to list files
)

And for a service with tests:

# src/service_a/BUILD

python_sources(
    name="service_a",
)

pex_binary(
    name="service_a_bin",
    entry_point="main.py",
    # Pants infers dependencies automatically
)

# src/service_a/tests/BUILD

python_tests(
    name="tests",
    # dependencies inferred from imports in test files
)

That’s it. No DEPS = [...] listing every transitive import. Pants reads from mylib.utils import something and wires it up.

Gotcha #1: Dependency inference works on Python imports, not on filesystem paths. If a file does dynamic imports (importlib.import_module(some_variable)), Pants can’t infer them, and you’ll need to add an explicit dependencies list on that target. The # pants: no-infer-dep comment does the opposite job: it tells Pants to ignore an import it would otherwise infer.
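
You can find most of these spots up front instead of waiting for a test to fail. The following sketch uses the standard ast module to flag calls to importlib.import_module or __import__ in a source string; each hit is a candidate for an explicit dependencies entry. The helper name is hypothetical and the check is deliberately crude (it matches on the function name only).

```python
import ast

DYNAMIC_IMPORT_FUNCS = {"import_module", "__import__"}


def dynamic_import_lines(source: str) -> list[int]:
    """Return the line numbers of calls that import modules at runtime."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # importlib.import_module(...) is an Attribute; __import__(...) is a Name.
            name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
            if name in DYNAMIC_IMPORT_FUNCS:
                hits.append(node.lineno)
    return sorted(hits)


example = (
    "import importlib\n"
    "import os\n"
    "plugin = importlib.import_module(plugin_name)\n"
)
print(dynamic_import_lines(example))  # [3]
```

Running something like this over src/ once gives you the initial list of targets to annotate by hand.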


Lockfiles: The Right Way

This is where Pants diverges sharply from Poetry or pip-tools.

Pants has named resolves — each resolve is an isolated Python environment with its own lockfile. You can have default, data-pipeline, lambda-compat resolves with different constraints and packages. No more "but the lambda needs an older boto3."

First, declare your requirements. You can use a python_requirement target per package, or a single python_requirements target pointing to a requirements.txt:

# 3rdparty/python/BUILD

python_requirements(
    name="default_reqs",
    source="requirements.txt",
)

# 3rdparty/python/requirements.txt
fastapi==0.111.0
httpx==0.27.0
pydantic==2.7.1
pytest==8.2.0
black==24.4.2
mypy==1.10.0
isort==5.13.2

Generate the lockfile:

pants generate-lockfiles --resolve=default

This produces 3rdparty/python/default.lock — a fully resolved, hash-pinned file that goes into version control. Every developer, every CI run, every Docker build uses exactly the same package versions.

To update a package, change requirements.txt and re-run generate-lockfiles. The diff in the lockfile is your audit trail.

Gotcha #2: Lockfile generation requires access to a package index. If you’re in an air-gapped environment or behind a corporate proxy, point [python-repos] at your internal PyPI mirror. Don’t count on a PIP_INDEX_URL from your shell being picked up, since Pants runs its tools in a scrubbed environment; set the index explicitly in pants.toml to avoid surprises:

[python-repos]
indexes = ["https://pypi.internal.corp/simple/"]

Gotcha #3: Multiple resolves means multiple lockfiles. If service_a uses the default resolve and you’re writing tests that span two resolves, you’ll get an error. The right fix is usually to unify resolves, not to fight the error. Treat a proliferation of resolves as a smell — more than 2-3 is usually a sign your dependency graph has real problems that lockfiles can’t fix.


Running Things

# Run all tests
pants test ::

# Run tests for one service only
pants test src/service_a/tests::

# Typecheck everything
pants check ::

# Format and lint
pants fmt ::
pants lint ::

# Build a PEX binary (self-contained Python executable)
pants package src/service_a:service_a_bin

The :: glob means "everything recursively from here." src/service_a:: means "everything under service_a."
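
As a rough mental model of how these specs select targets, here is a toy matcher. It is a simplification written for this article, not Pants’ spec parser: the spec_matches function is hypothetical, and it compares directories only, ignoring the target name after the colon.

```python
def spec_matches(spec: str, target_dir: str) -> bool:
    """Toy check: does an address spec select targets defined in target_dir?"""
    if spec.endswith("::"):
        root = spec[:-2].rstrip("/")
        # "::" alone means the whole repo; otherwise match the directory itself
        # or anything below it. The "/" guard stops "src/a" matching "src/ab".
        return root == "" or target_dir == root or target_dir.startswith(root + "/")
    # "dir:name" addresses a target defined in exactly that directory.
    return target_dir == spec.split(":", 1)[0]


print(spec_matches("::", "src/mylib"))                                   # True
print(spec_matches("src/service_a::", "src/service_a/tests"))            # True
print(spec_matches("src/service_a::", "src/service_b"))                  # False
print(spec_matches("src/service_a:service_a_bin", "src/service_a/tests"))  # False
```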

The first run is slow: Pants is bootstrapping itself, resolving and installing packages, and warming the cache. Subsequent runs on unchanged code are near-instant. This is the core value proposition: you pay the cost once.


Plugins: Extending Pants in Python

This is where Pants gets genuinely interesting. Plugins are written in Python and use the same rule-based engine that Pants itself is built on. You can add new target types, new linters, custom code generators, anything.

A minimal plugin lives inside the repo under a directory, usually pants-plugins/:

pants-plugins/
├── BUILD
└── my_plugin/
    ├── BUILD
    ├── __init__.py
    ├── register.py     # entry point Pants loads for the backend
    └── rules.py

Register it in pants.toml:

[GLOBAL]
pants_version = "2.21.0"
pythonpath = ["%(buildroot)s/pants-plugins"]
backend_packages = [
  "pants.backend.python",
  "my_plugin",  # your plugin package name
]

Here’s a real example: a plugin that enforces that every python_sources target in src/ has a corresponding BUILD entry with a tags field. Useful for tracking ownership.

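# pants-plugins/my_plugin/rules.py
# Sketch against the Pants 2.21 lint plugin API; check the names against
# the plugin docs for the version you pin.

from dataclasses import dataclass

from pants.backend.python.target_types import PythonSourceField
from pants.core.goals.lint import LintResult, LintTargetsRequest
from pants.core.util_rules.partitions import PartitionerType
from pants.engine.rules import collect_rules, rule
from pants.engine.target import FieldSet, Tags
from pants.option.option_types import SkipOption
from pants.option.subsystem import Subsystem


class OwnershipCheck(Subsystem):
    options_scope = "ownership-check"
    help = "Checks that Python sources under src/ carry ownership tags."
    skip = SkipOption("lint")


@dataclass(frozen=True)
class OwnershipFieldSet(FieldSet):
    # Only applies to targets that own a Python source file.
    required_fields = (PythonSourceField,)
    tags: Tags


class OwnershipLintRequest(LintTargetsRequest):
    field_set_type = OwnershipFieldSet
    tool_subsystem = OwnershipCheck
    # Lint all matching targets in one batch.
    partitioner_type = PartitionerType.DEFAULT_SINGLE_PARTITION


@rule
async def check_ownership(request: OwnershipLintRequest.Batch) -> LintResult:
    violations = [
        f"{field_set.address}: missing 'tags' field (needed for ownership tracking)"
        for field_set in request.elements
        if field_set.address.spec_path.startswith("src/") and not field_set.tags.value
    ]
    return LintResult(
        exit_code=1 if violations else 0,
        stdout="" if violations else "All targets have ownership tags",
        stderr="\n".join(violations),
        linter_name=OwnershipCheck.options_scope,
    )


def rules():
    # The request's own rules wire it into the `lint` goal.
    return [*collect_rules(), *OwnershipLintRequest.rules()]

# pants-plugins/my_plugin/register.py
# Pants loads <backend>.register, not __init__.py (which can stay empty).

from my_plugin import rules as ownership_rules


def rules():
    # Importing the module under an alias avoids shadowing this function.
    return ownership_rules.rules()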

Gotcha #4: The Pants plugin API changes between minor versions. Rules that work on 2.19 may need a one-line fix on 2.21. Before writing plugins, check the changelog. It’s not painful, but it’s not zero-friction either. Pin pants_version tightly and do intentional upgrades.


The Adoption Story: What Nobody Warns You About

Here’s the honest version of what a migration from "a pile of requirements.txt files and a Makefile" to Pants actually looks like.

Week one is spent on the bootstrap, not features. You’ll fight interpreter detection. Pants looks for interpreters that satisfy your constraints on the [python-bootstrap] search path, and if your CI image has a non-standard Python installation (common with pyenv or conda), expect friction. The fix is usually to spell the search path out:

[python-bootstrap]
# Tell Pants exactly where to look for Python interpreters.
# <PYENV> is a special value that expands to pyenv-managed interpreters.
search_path = ["/usr/bin", "/usr/local/bin", "<PYENV>"]

BUILD file generation is your friend. Use it. Pants has a tailor goal that scans your repo and generates BUILD files:

pants tailor ::

Don’t trust it blindly — it generates generic targets that may not match your actual structure. But it’s a far better starting point than writing BUILD files from scratch. Run it, review the output, commit what makes sense.

Dependency inference fails on ~5-10% of your targets. Plan for it. Dynamic imports, __init__.py re-exports, C extensions, generated code — all require manual dependencies = [] annotations. Keep a list of these as you find them; they cluster around the same patterns.

Lockfile conflicts are the silent killer. When two different parts of the repo need the same package at different versions, you’ll find out at generate-lockfiles time, not at test time. This surfaces real dependency problems you might have been hiding with venv isolation. Treat the surfacing as a feature. Resolve the conflict in requirements.txt, don’t split resolves to paper over it.
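
If you are starting from a pile of requirements.txt files, you can surface these conflicts before Pants does. The sketch below scans several requirements files (passed in as strings) and reports packages pinned at more than one version. The file names and version numbers in the example are invented, and the parser only understands simple name==version lines, so treat it as a starting point.

```python
import re
from collections import defaultdict

# Matches simple pins such as "boto3==1.26.0"; extras and markers are not handled.
PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;#]+)")


def conflicting_pins(requirement_files: dict[str, str]) -> dict[str, dict[str, list[str]]]:
    """Map package -> {version: [files]} for packages pinned at more than one version."""
    seen = defaultdict(lambda: defaultdict(list))
    for path, text in requirement_files.items():
        for line in text.splitlines():
            match = PIN.match(line)
            if match:
                # Normalize names so that "Foo_Bar" and "foo-bar" compare equal.
                name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
                seen[name][match.group(2)].append(path)
    return {pkg: dict(versions) for pkg, versions in seen.items() if len(versions) > 1}


files = {
    "service_a/requirements.txt": "fastapi==0.111.0\nboto3==1.34.0\n",
    "lambda/requirements.txt": "boto3==1.26.0  # older runtime\nhttpx==0.27.0\n",
}
conflicts = conflicting_pins(files)
print(conflicts)
```

Each entry in the result is a decision you have to make before the single requirements.txt for a resolve can exist.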

CI integration. Add this to your CI pipeline:

# .github/workflows/pants.yml (GitHub Actions)

- name: Setup Pants cache
  uses: actions/cache@v4
  with:
    path: |
      ~/.cache/pants/
    key: pants-${{ runner.os }}-${{ hashFiles('pants.toml', '3rdparty/**/*.lock') }}
    restore-keys: |
      pants-${{ runner.os }}-

- name: Lint
  run: pants lint ::

- name: Typecheck
  run: pants check ::

- name: Test
  run: pants test ::

There is no separate lockfile step here on purpose. As long as [python].invalid_lockfile_behavior stays at "error" (the default, as far as I know), Pants compares each lockfile against the requirements it was generated from whenever it uses the resolve, so the steps above fail if someone changed requirements.txt without committing a regenerated lockfile. This is the guard you want.


Remote Caching: The CI Multiplier

Once the local setup works, add remote caching. Pants supports the Remote Execution API (REAPI), which means it works with BuildBuddy, EngFlow, or a self-hosted bazel-remote instance.

A minimal self-hosted setup with bazel-remote:

# docker-compose.yml for your CI infrastructure

services:
  bazel-remote:
    image: buchgr/bazel-remote-cache
    ports:
      - "9090:9090"
      - "9092:9092"
    volumes:
      - bazel-cache:/data
    command:
      - --dir=/data
      - --max_size=20
      - --grpc_address=0.0.0.0:9092

volumes:
  bazel-cache:

Then in pants.toml:

[GLOBAL]
remote_cache_read = true
remote_cache_write = true
remote_store_address = "grpc://your-bazel-remote:9092"

With this in place, CI job #2 that runs after a PR push reuses everything CI job #1 computed. For a repo with 50 services, this turns a 20-minute CI run into a 3-minute one for typical PRs.

Gotcha #5: Remote cache write permissions. Don’t give every CI job write access. Have only your main branch builds write to cache; PR builds only read. Otherwise a flaky test that writes a bad cache entry poisons the cache for everyone.
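
One way to enforce this is to compute the cache settings per CI job and pass them to Pants as environment variables, which override pants.toml. The sketch below assumes the usual PANTS_<OPTION> naming for global options; the remote_cache_env helper and the branch names are hypothetical.

```python
def remote_cache_env(branch: str, write_branches: tuple[str, ...] = ("main",)) -> dict[str, str]:
    """Environment overrides for a CI job: every job reads, only trusted branches write."""
    can_write = branch in write_branches
    return {
        "PANTS_REMOTE_CACHE_READ": "true",
        "PANTS_REMOTE_CACHE_WRITE": "true" if can_write else "false",
    }


print(remote_cache_env("main"))
print(remote_cache_env("feature/login"))
```

Keeping the decision in one small function means the policy lives in the repo and is reviewable, rather than being scattered across CI job definitions.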


Production-Ready Checklist

Before you call the migration done:

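Lockfile hygiene. Make stale lockfiles a hard failure: keep [python].invalid_lockfile_behavior at "error" so any Pants goal that touches the resolve fails when requirements.txt and the lockfile disagree, and make the CI job a required status check on merge. Without this, you’ll drift.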

Interpreter pinning. Set interpreter_constraints at the root [python] level, not per-target. Per-target overrides create a maintenance headache and usually signal that something else is wrong.

PEX for deployments. Stop shipping Docker images that run pip install at container startup. Build a PEX binary with pants package and copy the single file into the image. It’s reproducible, and the container has nothing left to install when it starts.

FROM python:3.12-slim
# pants package writes to dist/<target path with dots>/<target name>.pex by default
COPY dist/src.service_a/service_a_bin.pex /app/service_a.pex
RUN chmod +x /app/service_a.pex
CMD ["/app/service_a.pex"]

Tag your targets. Use tags = ["team:platform", "svc:service_a"] in BUILD files. Then you can run pants --tag='team:platform' test :: to scope CI to your team’s targets. Sounds optional, becomes essential once you have more than 10 services.

Watch the pants.log. It lives under .pants.d/ (the exact subdirectory depends on your Pants version). When something breaks mysteriously, that file has the full backtrace, and reading it first answers a lot of questions before you need to ask in the Pants Slack.


When Pants Isn’t the Answer

Pants is the right tool for Python monorepos with 3+ services or internal libraries that need to be versioned together. It’s overkill for a single-service repo where Poetry or uv will serve you better and faster.

If your team is already deep into Bazel, Pants probably isn’t worth the migration cost — Bazel’s Python support has improved substantially, and the remote execution story is more mature.

If your CI runs take under 5 minutes and you have fewer than 3 developers touching the repo, Pants adds complexity you won’t feel the benefit of. Don’t optimize infrastructure that isn’t a bottleneck.


Where to Go From Here

The Pants documentation at pantsbuild.org is genuinely good for a build tool — it covers the concepts well. The community Slack (pantsbuild.slack.com) is active and the maintainers respond fast.

Start with pants tailor ::, get the tests running, commit the lockfile. Everything else — plugins, remote caching, PEX deployments — can come incrementally. The biggest mistake is trying to do everything in one migration PR. Get the fundamentals working, ship it, then iterate.

The second-biggest mistake is not pinning pants_version. Pin it. Bump it deliberately. The upgrade notes are always worth reading.
