I spent four years extracting services from a monolith and the next four quietly re-merging them. The modular monolith isn’t a defeat — it’s a design choice that keeps optionality open and the deploy story boring. This post is the case for picking it on purpose, with the receipts from the migration that taught me the lesson the hard way.
The extraction era
From 2014 to 2018, I worked on hospital platforms (AFAQ, HAKEEM),
recruitment platforms (Talentera), and a few others where the conversation
always drifted the same way: “we should break this up into services.”
Services were how you proved the codebase was Modern. Kubernetes was how
you proved your infra team was Modern. A dozen repos with pyproject.toml
files was how you proved your engineering practice was Modern.
We did it. Most of those extractions shipped. Some of them even worked.
Here’s the part nobody says out loud: the thing that made most of those services “work” was that they stayed tightly coupled to the original database. The service boundaries were illusions. The transaction boundaries were not. When we had to coordinate changes across the new services — which happened roughly weekly, because our domain boundaries didn’t match our service boundaries — we did it with cron jobs, Slack coordination, and hope.
The re-merge era
From 2020 at Bytro onward, the pattern reversed. We inherited an event-driven legacy system that had been split into plausible-looking services, and we quietly pulled them back into a single deployable that we called, depending on the audience:
- “Modular monolith” in design docs.
- “The big thing” in engineering meetings.
- “A reasonable architecture that nobody writes conference talks about” in private.
This was not a downgrade. The things we got back:
- Single deployment. One CI pipeline. One rollback. One version-skew problem to reason about during an incident.
- Refactorable boundaries. Want to move a function from module A to module B? It’s an IDE action, not an RFC.
- Honest transactions. If two operations needed to be atomic, they could be. If they didn’t, we still split them at the module boundary — but we didn’t lie about the atomicity guarantee to buy a deployment story.
- Faster local dev. `docker compose up` opens one process, not nine.
- Cheaper observability. One service's traces are a tree. Nine services' traces are a graph, and the graph is lying to you about half the edges because half your spans didn't propagate the trace context properly.
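The "honest transactions" point deserves a concrete shape. A minimal sketch, assuming a hypothetical `orders`/`billing` module split and SQLite as a stand-in database: because both modules run in one process against one connection, a single transaction can span the boundary, which is exactly what a network split between services takes away.

```python
import sqlite3

# Hypothetical schema for two modules (orders, billing) living in one monolith.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE invoices (order_id TEXT, amount_cents INTEGER)")

def place_order(cur: sqlite3.Cursor, order_id: str) -> None:
    """orders module: record a new order."""
    cur.execute("INSERT INTO orders VALUES (?, 'placed')", (order_id,))

def raise_invoice(cur: sqlite3.Cursor, order_id: str, cents: int) -> None:
    """billing module: raise the matching invoice."""
    cur.execute("INSERT INTO invoices VALUES (?, ?)", (order_id, cents))

# One transaction spans both modules: either both rows land or neither does.
with conn:  # commits on success, rolls back on any exception
    cur = conn.cursor()
    place_order(cur, "ord-1")
    raise_invoice(cur, "ord-1", 4999)

print(conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0])  # → 1
```

If either insert raised, the `with conn:` block would roll both back, which is the atomicity guarantee a saga across two services can only approximate.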
The thing nobody tells you about microservices
The reason microservices ship is organisational, not technical. They let teams deploy independently. That’s the whole pitch, and it’s a good one — if you have more than one team whose deploy cadence is actually in conflict.
If you have three teams that deploy to the same staging environment on the same days in the same order, you don’t have a deployment-coupling problem. You have three teams and one monolith, and that’s a working arrangement.
The pathology I’ve watched, at three companies now, is this:
- An eight-person team builds a monolith.
- They start to feel friction.
- Someone reads a post about how Netflix does microservices.
- They spend 18 months extracting services.
- At the end, they have eight people and 14 services.
- Each service is owned by nobody in particular.
- Every change still requires coordination — but now across service boundaries, with eventual consistency, with distributed tracing, with a brand-new failure mode called “service X is down but service Y is up and what should service Z do?”
The friction they were feeling in step 2 was not a service-boundary problem. It was an internal-module-boundary problem. Those can be fixed without introducing a network.
What modular actually means
The “modular” in “modular monolith” does work. The discipline looks like:
- Module boundaries enforced by the build system, not by convention. If `orders` can import from `billing` at compile time, you do not have a boundary. You have a wish. Use whatever your language supports: Go's internal packages, Rust's crates, TypeScript's project references, Java's module system. Pick one, enforce it, break the build on violations.
- Module boundaries mirror domain boundaries. This is the hard part. You don't get them right the first time. You refactor them twice, then the right shape emerges. The fact that refactoring them is cheap inside a monolith is precisely why it's the right place to do the refactoring.
- A pact about what crosses boundaries. Values, never references. Events, not method calls. Published types, not internal types. Once those are in place, later extracting a module to a service is a weekend of work. Without them, the extraction is an 18-month project that produces a distributed monolith.
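"Break the build on violations" can be a very small script. A minimal sketch, assuming hypothetical `orders` and `billing` module names and a hand-rolled policy table (real projects would reach for a dedicated linter, but the mechanism is this simple):

```python
import ast

# Hypothetical policy: which modules an importer is forbidden to touch.
FORBIDDEN = {"orders": {"billing"}}

def boundary_violations(module_name: str, source: str) -> list[str]:
    """Return a description of every forbidden import found in `source`."""
    banned = FORBIDDEN.get(module_name, set())
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if name.split(".")[0] in banned:
                violations.append(
                    f"{module_name} imports {name} (line {node.lineno})"
                )
    return violations

print(boundary_violations("orders", "from billing.internal import Invoice\n"))
# → ['orders imports billing.internal (line 1)']
```

A CI step runs this over every module's files and exits non-zero on any violation, which turns the boundary from a wish into a build failure.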
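The "values, never references; published types, not internal types" pact also has a small concrete shape. A minimal sketch, assuming a hypothetical `InvoicePaid` published type for a billing module: the frozen dataclass is a value, so consumers cannot reach back and mutate billing's state through it, and the translation function is the only place internal records leak across the boundary.

```python
from dataclasses import dataclass

# Hypothetical published type: the only shape of "invoice paid"
# that other modules are allowed to see.
@dataclass(frozen=True)
class InvoicePaid:
    invoice_id: str
    amount_cents: int
    currency: str

def to_public(internal_row: dict) -> InvoicePaid:
    """Translate billing's internal record into the published type."""
    return InvoicePaid(
        invoice_id=internal_row["id"],
        amount_cents=internal_row["amount_cents"],
        currency=internal_row["currency"],
    )

event = to_public({"id": "inv-42", "amount_cents": 1999, "currency": "EUR"})
print(event.invoice_id)  # → inv-42
```

Because the published type is the contract, extracting the module later means serialising `InvoicePaid` onto a queue instead of passing it in-process, and nothing else has to change.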
When to actually break out a service
I am not anti-service. I’ve extracted dozens. The signals I now look for before doing it are all organisational or operational — not technical:
- A specific team wants to deploy on a different cadence, right now, not theoretically in a year.
- A specific workload has a runtime or scaling profile the monolith can’t serve cheaply — e.g., a CPU-hungry ML inference path.
- A specific capability has a different compliance or tenancy boundary — e.g., a PII-heavy surface that needs its own audit trail.
- A vendor integration is naturally a service — e.g., a webhook receiver that has to keep running during a monolith deploy.
The common factor: every one of these is a real, named, concrete pressure. “To be Modern” is not on this list and never will be.
The receipts
This is the part where I’d normally cite a specific p99 number or a quantified deploy-time improvement. Instead I’ll cite the thing that most convinced me:
At Bytro, after we consolidated a previously over-fragmented service topology back into a tight modular monolith, the team’s on-call rotation got quieter. Not because we had fewer incidents — we had roughly the same number. But a “single service got paged” incident is a one-engineer, one-terminal, one-rollback problem. A “six services are confused about each other’s state” incident is a six-engineer conference-call problem, and you don’t get back the sleep you lost to those.
The on-call rotation is the truth-teller about your architecture. If your rotation is healthy, the architecture is working. If your rotation is a sleep tax, the architecture is wrong, regardless of what shape it's drawn in on the whiteboard.
The modern part
I’ll admit the quiet part: in 2025 and 2026, the monolith + boring-tech stack is the one I’d ship the most exciting things on. AI-assisted development is vastly easier when the whole codebase fits in an agent’s context window. A modular monolith fits. Nine services with four different languages and sixteen config formats do not.
If you want your AI tooling to be useful, give it a codebase it can read. The 2026 case for the monolith is, unexpectedly, that it’s the fastest architecture for an LLM-driven engineering workflow. Which is, somehow, the most 2026-coded sentence I’ve ever written.