This post was inspired by reading the book Clean Architecture by Robert C. Martin (Uncle Bob) while working on a project where we were discussing how code should be shared. Elsewhere in the organization, another team was working on almost exactly the same project, domain-wise, but for a different customer segment. The two projects were colocated in the same office so they could work closer together, share knowledge, and swap resources and code.
All of this sounded sane, since we knew how similar the projects were. The hard part was determining how code should be shared between them. From this I learned when to use a shared library, a shared API, or duplicated code, and why you should optimize for development speed early in a project and for maintainability later.
Ways of sharing code between projects
Let’s get some definitions in place. From my experience, there are four ways of sharing code between projects (ranging from least coupled to most coupled):
- Duplicate code
- Shared library
- Monorepo
- Shared API
All of these have different pros and cons, and each can be a good fit for a given code-sharing situation. Let’s go through them.
Duplicating code is the simplest of them all. You simply copy and paste the code from the other project’s repository into your own. This gives you complete decoupling from the other project’s code base, as you are managing your own copy.
Pros:
- The easiest approach; it requires no setup.
- No technical dependency on the provider of the shared code.
- The copied code can easily be changed without coordinating with the shared code provider.
Cons:
- The copied code is not automatically maintained by the shared code provider, so fixes and improvements have to be copied over by hand.
A shared library means packaging the shared code into a module, versioning it with semantic versioning (major, minor and patch numbers) and distributing it through a package registry (e.g. npm). This sharing method requires technical setup from the provider, as it needs tooling for packaging (which should be automated and run on the CI server), versioning (which can be automated using Git commit conventions and a tool such as standard-version) and distribution (either a public or a private package registry).
Pros:
- Eases maintenance with relatively low coupling, as you can depend on a specific build of the shared module and update when you choose to.
- Separation of concerns, as the shared library lives in its own module and is an external library from the application code’s point of view.
Cons:
- Technical dependencies between teams start to occur.
- Teams need to agree on what should be developed in the shared library, which slows down development.
- Needs people to maintain the shared library with correct semantic versioning.
- Tedious to upgrade the shared library in all clients for every change.
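To make the idea of automating versioning from Git commit conventions a bit more concrete, here is a minimal sketch. It assumes commit messages loosely follow the Conventional Commits style (`fix:` for patches, `feat:` for minor releases, a `!` or a `BREAKING CHANGE:` footer for major releases) and plain `major.minor.patch` versions. The names `bumpForCommit` and `nextVersion` are made up for this illustration; this is not how any particular release tool is implemented.

```typescript
type Bump = "major" | "minor" | "patch" | "none";

// Decide which bump a single commit message asks for.
// Assumed convention: "feat" -> minor, "fix" -> patch,
// "!" after the type/scope or a "BREAKING CHANGE:" footer -> major.
function bumpForCommit(message: string): Bump {
  const header = message.split("\n")[0] ?? "";
  const match = /^(\w+)(\([^)]*\))?(!)?:/.exec(header);
  if (/^BREAKING CHANGE:/m.test(message) || (match !== null && match[3] === "!")) {
    return "major";
  }
  if (match === null) return "none";
  if (match[1] === "feat") return "minor";
  if (match[1] === "fix") return "patch";
  return "none";
}

// Ranking used to pick the largest bump among all commits.
const rank: Record<Bump, number> = { none: 0, patch: 1, minor: 2, major: 3 };

// Compute the next version from the current one and the commits since the last release.
function nextVersion(current: string, commits: string[]): string {
  const [major = 0, minor = 0, patch = 0] = current.split(".").map(Number);
  let bump = "none" as Bump;
  for (const commit of commits) {
    const candidate = bumpForCommit(commit);
    if (rank[candidate] > rank[bump]) bump = candidate;
  }
  switch (bump) {
    case "major":
      return `${major + 1}.0.0`;
    case "minor":
      return `${major}.${minor + 1}.0`;
    case "patch":
      return `${major}.${minor}.${patch + 1}`;
    default:
      return current; // nothing release-worthy happened
  }
}
```

For example, `nextVersion("1.2.3", ["fix: typo", "feat: add login"])` picks the larger of the two bumps and yields a minor release. The point is that the version number becomes a consequence of how commits are written, which removes one manual step but makes commit discipline part of the library's maintenance cost.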
A monorepo fixes many of the pains of the shared library approach, which is why it is used by some of the dominant companies in the industry, like Google and Facebook. A monorepo means having all the code in one repository, which makes code sharing easier, as the different projects in the monorepo can just import the shared code. Making a piece of code shared is equally simple: you just move it to a shared folder in the monorepo.
Pros:
- The easiest and most effective way of working with code sharing, given that the right tooling is in place.
- Easy to share code.
- No need to upgrade consuming clients.
- No need to version the shared code.
- No need for npm link during local development.
Cons:
- Requires more tooling on CI, such as only running tests and builds for affected projects.
This is my recommended approach for code-sharing. Read more about how to implement the monorepo approach here.
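The CI tooling mentioned above, running tests and builds only for affected projects, boils down to a graph traversal. The sketch below assumes a hand-written dependency graph with made-up project names (`app-a`, `shared-auth`, and so on) and a made-up function `affectedProjects`; real monorepo tools derive the graph from imports or package manifests and work out the changed projects from the Git diff.

```typescript
// Hypothetical project graph: each project lists the projects it imports from.
type DependencyGraph = Record<string, string[]>;

// Return the changed projects plus every project that depends on them,
// directly or transitively. These are the ones CI has to rebuild and retest.
function affectedProjects(graph: DependencyGraph, changed: string[]): string[] {
  // Invert the graph: dependency -> projects that depend on it.
  const dependents = new Map<string, string[]>();
  for (const project of Object.keys(graph)) {
    for (const dependency of graph[project] ?? []) {
      const list = dependents.get(dependency) ?? [];
      list.push(project);
      dependents.set(dependency, list);
    }
  }

  // Breadth-first walk from the changed projects along the inverted edges.
  const affected = new Set<string>(changed);
  const queue = changed.slice();
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of dependents.get(current) ?? []) {
      if (!affected.has(dependent)) {
        affected.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return Array.from(affected).sort();
}

// Example graph (all names are illustrative).
const graph: DependencyGraph = {
  "app-a": ["shared-auth", "shared-logging"],
  "app-b": ["shared-logging"],
  "shared-auth": ["shared-logging"],
  "shared-logging": [],
};
```

With this graph, a change in `shared-auth` only affects `shared-auth` and `app-a`, while a change in `shared-logging` affects all four projects. The more widely a piece of code is shared, the more of the monorepo a single change touches, which is exactly why this kind of tooling is needed.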
Another option is to create a shared API. This requires a server for hosting the API, and breaking changes must be versioned according to the service level agreement (SLA) between the clients and the shared API provider.
Pros:
- Clients interact with the shared code over a communication protocol (e.g. HTTP), which makes it platform agnostic: clients and the shared code provider can be implemented in completely different technologies, like .NET and Node.js.
Cons:
- The shared API needs to be hosted and maintained.
- More semantic coupling between client and shared code, as the client doesn’t have a snapshot of the shared code stored locally, as it does with a shared library.
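Because the client has no local snapshot of the shared code, a change on the API side can break it without any change in its own repository. One way to at least make such breakage visible is to pin the major version in the URL and validate responses at the boundary. The sketch below is only an illustration: the `/v1/users/...` route, the `UserDto` shape and the helper names are all assumptions, not part of any real API.

```typescript
// Hypothetical response shape the client expects from version 1 of the shared API.
interface UserDto {
  id: string;
  email: string;
}

// Build the request URL with the major API version pinned by the client.
function userUrl(baseUrl: string, majorVersion: number, userId: string): string {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmed}/v${majorVersion}/users/${encodeURIComponent(userId)}`;
}

// Runtime type guard: checks that an unknown value has the fields the client relies on.
function isUserDto(value: unknown): value is UserDto {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record["id"] === "string" && typeof record["email"] === "string";
}

// Parse a response body and fail loudly if the API no longer matches expectations.
function parseUser(json: string): UserDto {
  const parsed: unknown = JSON.parse(json);
  if (!isUserDto(parsed)) {
    throw new Error("Shared API response does not match the expected UserDto shape");
  }
  return parsed;
}
```

The guard does not remove the coupling; it only turns a silent mismatch into an explicit error at the point where the two systems meet.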
How do these code-sharing techniques fit the different phases of a project?
One of the main points of books like The Phoenix Project and The Goal is that any optimization that is not made at the bottleneck is an illusion. Translated to this context, it means that you should focus on solving the biggest pains you have right now. Don’t guess what the future problems might be and optimize too far ahead, because chances are you aren’t gonna need it (YAGNI).
When working on a brand new project, your focus should be on getting a proof of concept (POC) or a minimum viable product (MVP) ready as fast as possible, so you can get through your first feedback cycle. That means you should optimize for development early in the project and can postpone worrying about how the system should be maintained.
If you connect this with the ways of sharing code, it means that you should probably just duplicate code at this stage, as this is the easiest and fastest way and you don’t know yet if and how your project is going to diverge from the shared code base. As in agile methodology, you postpone what you don’t need right now until it becomes necessary and you have the most knowledge.
As the project matures, you start getting a better sense of what is going to be hard to maintain in a shared code library and what is going to be more stable. You might find that basic parts of the application that are not specific to your application domain, like auth and logging, can be generalized and shared for other projects to benefit from, because they won’t diverge much in the future. For this, you might use a shared library, as semantic versioning still gives you some decoupling: the clients decide when to update the shared code.
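The decoupling that semantic versioning gives comes from version ranges: the client states which releases of the shared library it is willing to accept. The sketch below checks a version against a caret range in the way npm interprets it for plain `major.minor.patch` versions (no pre-release tags). The helper names `parseVersion` and `satisfiesCaret` are made up for this example; in a real project you would rely on the package manager rather than hand-rolling this.

```typescript
type Version = [number, number, number];

// Parse a plain "major.minor.patch" string; pre-release tags are not supported here.
function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (match === null) {
    throw new Error(`Not a plain major.minor.patch version: ${text}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Does `candidate` satisfy a caret range such as "^1.2.3"?
// A caret range allows changes that do not modify the left-most non-zero number.
function satisfiesCaret(range: string, candidate: string): boolean {
  if (range.charAt(0) !== "^") {
    throw new Error(`Only caret ranges are handled in this sketch: ${range}`);
  }
  const [baseMajor, baseMinor, basePatch] = parseVersion(range.slice(1));
  const [major, minor, patch] = parseVersion(candidate);

  if (major !== baseMajor) return false; // a new major is never accepted
  if (baseMajor > 0) {
    // ^1.2.3 accepts 1.2.3 and anything newer below 2.0.0
    return minor > baseMinor || (minor === baseMinor && patch >= basePatch);
  }
  if (baseMinor > 0) {
    // ^0.2.3 accepts 0.2.3 and anything newer below 0.3.0
    return minor === baseMinor && patch >= basePatch;
  }
  // ^0.0.3 accepts only 0.0.3
  return minor === 0 && patch === basePatch;
}
```

A client depending on `^1.2.3` picks up `1.4.0` on its next install but never `2.0.0`, so breaking changes in shared auth or logging code only reach it when it deliberately moves to the new major version.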
Once your application is mature, you have probably moved more of the code that is not specific to your application domain into a shared library, because that code doesn’t change drastically and other projects can easily benefit from it.
When the project becomes very mature and has passed the test of time, you will even know which parts of the application-domain code are not going to change drastically. If other projects in the organization work with the same application domain, there might be parts that could be moved to a shared library.
If the sharing projects use different technologies and don’t need a specific build of the shared code stored locally, it might make sense to push some of the library code to a shared API. As mentioned before, this creates more coupling (and even semantic coupling, the worst kind), as you are depending on an API that probably only versions major changes. This can make the systems more fragile and requires more coordination between the projects as well as a service level agreement (SLA).
We went through the different ways of sharing code between projects, going from the ones that create the least coupling to the most: duplicating code, a shared library, a monorepo and a shared API, along with the pros and cons of each. We looked at which code-sharing technique to use in which phase of a project, and my recommendation is simple: optimize for development early in the project and ease maintenance later, when you know what code should be shared and how. This means starting with the technique that creates the least coupling, duplicating code, and then, later in the project, easing maintenance with a shared library, a monorepo or a shared API. Just understand that you pay for this with more coordination effort, because the tighter coupling between projects slows down development.