This analysis is targeted at organizations that have not yet locked in their strategy for managing protobufs/gRPC, or have the capacity to pivot.  I expect the principles involved to generalize to other organizational questions.  Some of the tradeoffs are only relevant from a "protos as service interfaces" (gRPC/twirp/etc) orientation, but others are more general.

Meta-level conflict of interest notice: This is a linkpost to a blog for a (commercial) side-project of mine.

Object-level conflict of interest notice: Protocall currently only supports single-repo workspaces, not multi-repo workspaces. This analysis represents an accurate accounting of my view on the tradeoffs involved in this decision, and is not motivated by any desire to make my own life easier. Implementing basic support for multi-repo workspaces would not be an overwhelming technical challenge; the edge cases around e.g. namespace collisions would need solving but I would only need to solve them once.


Most engineering organizations should keep their proto files in one repo. If you're already operating with a single monorepo for your entire codebase, nothing about protos changes the calculus. If you're not, the benefits of keeping your proto files in one place may not be obvious.

Here are some (valid) reasons for not keeping your proto files in one repo.

Locality

If your proto files are primarily used to define service interfaces, keeping them in the same repo as the service code can reduce overhead for developers working on that service. They only need to issue one PR to update the service interface, rather than two. Working across multiple repos is challenging for most build systems, so there might also be less tooling work involved in maintaining a quick development loop.

Differential Processes, Tooling, and Norms

If your org has multiple repos, it's likely that teams own their own repos. Those teams may have wildly different engineering processes, tooling, norms, etc. Depending on the details, those may be possible to maintain in harmony within a single repo, but it can be less overhead to have teams manage their own processes with the repo as the dividing line of responsibility.

Avoiding (Inappropriate) Shared Dependencies

Protos come with the ability to import other protos. This makes for a very tempting footgun. While there are some rare situations where it makes sense to import a proto from another service or business domain, the most common reason this happens is an invocation of DRY: "a message schema that fits my needs already exists somewhere else, so why would I write a new one identical to the old one?"

This is a mistake you should avoid at all costs. The domain you are attempting to model with your schema is not the domain being modeled by the existing proto. If that proto changes, it will be evolving in a different direction, leaving you with a dependency on a model that becomes an increasingly poor fit for your use of it over time.

Keeping protos in separate repos is a helpful (but not wholly sufficient) way to avoid falling into this trap, since importing protos from another repo requires writing fairly specialized build tooling.[1]


However, the upsides of keeping your proto files in one repo are much stronger.

Locality

Hold on, wasn't this an upside for keeping protos with the service code? Yes, for the maintainers of that service. The consumers (clients) of that service benefit much more from having all the proto files in a single, predictable location. Instead of needing to figure out where the service code lives in order to examine its interface, every engineer in the organization knows there's exactly one place they need to look. The mental overhead stays constant instead of scaling with organization size.

Consistent Processes, Tooling, and Norms

Protos are complicated. There's both essential and accidental complexity.

The essential complexity mostly comes from ensuring your schema accurately models whatever it is you're trying to model, and forward-looking considerations such as extensibility.[2]

The accidental complexity is where protos can really bite you, and where a monorepo can make your life much easier. Among other things, you need to think about:

Linting

Protos are code! Naming conventions, structure, casing - nearly every concern that makes a linter desirable for your codebase also applies to your proto files. Applying proto code standards across multiple repos requires either that each team reinvent the wheel, inevitably leading to drift, or that the broader organization invest in tooling to make standards seamless across repos. A monorepo neatly resolves this issue.
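As a sketch of what centralized lint configuration can look like, assuming you use buf (one popular proto linter), a single config at the proto monorepo root can enforce conventions for every team's protos at once:

```yaml
# buf.yaml at the proto monorepo root (buf v1 config schema)
version: v1
lint:
  use:
    - DEFAULT                   # naming, casing, and structure conventions
  except:
    - PACKAGE_VERSION_SUFFIX    # example: relax one rule org-wide if needed
breaking:
  use:
    - FILE                      # also catch backwards-incompatible changes
```

One file, one review process, zero per-team drift.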

Namespace management

Imagine you work at the company Foobar. There are two teams at Foobar, the Blue team and the Green team.[3] These teams maintain their own repositories, and colocate their proto files with their service code. The Blue team maintains a service defined by this proto file:

syntax = "proto3";

package foobar.auth;

message AuthRequest {
  ...
}

message AuthResponse {
  ...
}

service AuthService {
  rpc Auth (AuthRequest) returns (AuthResponse);
}

The Green team also happens to maintain a service described by a very similar proto file! (The fields in AuthRequest and AuthResponse may differ, but those aren't salient here.)
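To make this concrete, suppose the Green team's file (hypothetical, with different field contents) declares the same package:

```protobuf
syntax = "proto3";

// Green team's repo - note the identical package declaration.
package foobar.auth;

message AuthRequest {
  ...
}

message AuthResponse {
  ...
}

service AuthService {
  rpc Auth (AuthRequest) returns (AuthResponse);
}
```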

One day, an engineer on the Green team needs to implement a feature that involves one of their services talking to the Blue team's AuthService. The engineer adds a dependency on the generated AuthService client to the build of a Green-team service.

Can you guess what happens next?

What actually happens next depends on many details unspecified in this blog post, including the language and build system used, but a very likely outcome is a namespace collision.
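What a collision looks like in practice depends on the language and build system, but here is a minimal, hypothetical Python sketch. It does not invoke protoc at all; it just simulates two repos whose generated code lands at the same import path, which is what happens when two proto files declare the same package:

```python
import pathlib
import sys
import tempfile

# Simulate two repos that each generated code for `package foobar.auth`.
# Both emit a module at the same import path: foobar/auth_pb2.py.
root = pathlib.Path(tempfile.mkdtemp())
for repo, marker in [("blue_repo", "BLUE"), ("green_repo", "GREEN")]:
    pkg = root / repo / "foobar"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "auth_pb2.py").write_text(f"TEAM = {marker!r}\n")

# Both repos' generated packages end up on the import path...
sys.path[:0] = [str(root / "blue_repo"), str(root / "green_repo")]

from foobar import auth_pb2

# ...but only one wins the import. The Green team's code silently gets
# the Blue team's module (or vice versa, depending on path order).
print(auth_pb2.TEAM)  # -> BLUE
```

In compiled languages the failure is usually louder (duplicate symbol or duplicate class errors at build time), but the root cause is the same: two independently owned protos claiming one namespace.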

Truly, every developer dreams of spending hours, days, or maybe even weeks refactoring their services to use renamed protos instead of doing feature development.

You, too, can ensure your developers experience this unique joy! (Or you could not.)

Builds (artifact generation, publishing, language support, etc)

Just as with linting, there are three options: you can require that teams roll their own build and release tooling for protos, dedicate organizational resources to standardizing it across repos, or dodge the issue entirely by keeping it in one place. Conveniently, keeping your protos in one place also makes it trivial to detect and prevent namespace collisions!
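As with linting, centralizing builds means the generation config lives in exactly one place. A sketch of what that might look like with buf (hypothetical output paths, and assuming the corresponding code generators, e.g. protoc-gen-go, are available to buf):

```yaml
# buf.gen.yaml at the proto monorepo root (v1 schema)
version: v1
plugins:
  - name: go                 # assumes protoc-gen-go is on PATH
    out: gen/go
    opt: paths=source_relative
  - name: python             # assumes a Python code generator is available
    out: gen/python
```

Every team's artifacts are generated, versioned, and published by the same pipeline, rather than by N slightly different ones.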


The bottom line is straightforward. To summarize:

A monorepo imposes a small, fixed per-team cost in terms of locality on service owners, so the total cost scales linearly with organizational size, while multiple repos impose costs that scale polynomially[4] with organizational size.

With a monorepo, the amount of tooling you need to build to support a seamless developer experience with protos scales only with the number of build systems you need to support, and even then you only need to build most of those integrations once. This may not even require a full-time engineer after the initial groundwork has been done.

With protos in multiple repos, you need to build and maintain tooling to ensure a consistent experience across repos. This is more work, with many more subtle challenges and failure modes, both technical and organizational.

Letting teams handle proto management themselves is taking on tech debt with an extremely high interest rate and no payback plan. At the level of organizational scale where it makes sense to adopt protos at all, this is probably a bad idea.


Thanks to Justis Mills for their valuable feedback on this post.

  1. ^

    If you find yourself writing such tooling, stop and ask yourself: "Is there any other way I can achieve the same functionality?"

    There are almost always functional alternatives. They are probably better than signing yourself up for managing cross-repo proto imports.

    If you aren't sure about your situation, drop me a line (robert@protocall.dev).

  2. ^

A concern occasionally raised about adopting protobufs/gRPC is the additional complexity some features bring, like a strict schema and the requirement that changes be backwards-compatible, compared to alternatives like JSON. This concern mostly elides the fact that the essential complexity is effectively unchanged by the choice of IDL and serialization format; some choices (such as protobufs) simply make the complexity more legible and difficult to ignore. This is a topic that deserves its own post, so I might come back to it in the future.

  3. ^

    Blue and green are inspired by this fable.

  4. ^

Each engineer would pay a cost that scales linearly with the number of other repos they might need to interact with, so the total cost across the organization could be modeled by (# of engineers * # of repos). If the number of repos scales linearly with engineering headcount, which seems like a reasonable approximation, then n engineers at r engineers per repo maintain n/r repos, giving roughly n * n/r = n²/r potential engineer-repo interactions. This is similar to Graičiūnas' model for how many relationships between subordinates a manager would need to oversee.

Comments (4)

Upvoted for discussion, but I disagree with the advice. I've spent way too much time trying to harmonize dependencies and track down weird versioning problems between dev and production systems.  These things aren't easy in independent repos, but they're nigh-impossible in giant monorepos with almost-disjoint projects and very weak separation between service models and clients.

Was this specifically with protos?  "very weak separation between service models and clients" doesn't sound like something that'd happen with protos, since clients are generated from the service models directly.

 

Can you go into more detail on the specific failure modes you ran into that seemed to be downstream of everything living in a monorepo?  I agree you need to be more careful about maintaining proper separation of concerns, but I'm not seeing how monorepos would be more likely to cause versioning issues across environments.  I can imagine that if protos didn't have a build step, you might run into problems similar to e.g. dynamic linking (or resolution at runtime like with protobufjs), and it might be easier to establish a versioning strategy other than "just pull latest main, it's fine!" with separate repos, but that's typically not how protos work.  I guess there can be a similar failure mode if you tell your build system to resolve proto dependencies at build time by compiling from whatever protos are currently in the repo, instead of using a fixed published version for each target?  I might not be understanding what you're pointing at, though.

this was not specifically with protos, it's possible that it makes sense to centralize those in some way (but it still worries me that you'd have to version a bunch of diverse service definitions together).

Protos have some relatively unique characteristics and pathologies that push me to recommend this pattern, where I wouldn't necessarily do so for other "shared dependencies" (though I think I probably still favor a monorepo in most situations).

They're fairly distinct as far as build targets go, since you often end up generating packages in multiple languages, and the relevant versioning strategy is per-build-target, not per-source-file. It is important to avoid the issues with inappropriate shared dependencies across protos that I mentioned in the article because that is one way you run into trouble with versioning, but imo the solution to that is "programmatically enforce proper separation of concerns within the monorepo".