How to setup our codebase for efficient code sharing and development?

Question

Our situation

At first, our company had 1 product. Custom hardware with firmware we wrote ourselves.

Now more projects are starting to be added. Many can reuse most of the components of our first product, but of course the business logic is different. Also the hardware could change, and the remote device monitoring interfaces, as the sensors and available data could change.

Now we are looking at how to structure and manage our codebase. Currently we are leaning towards making a repository that will include all the non-project-specific firmware code. This includes battery management, remote device management skeleton, hardware drivers, etc. Everything that the different projects may share. This way, fixes and new features for these modules only need to be committed once.

Furthermore, we would create repositories per project, where the project-specific code is stored.

I think this is called multi-repo.

My thoughts

1. Project setup and management becomes harder (it would e.g. perhaps need a script to get the right version of the non-project-specific repo)
2. Each project can have its own rules (branching strategy)
3. We would have to setup CI for each extra repo (build validation, code style, policies)
4. Because of 1-3, would monorepo be better? Won't build validation and such become a lot harder because not all code is meant to be together (e.g. different projects)? How do we keep our freelancers out of the code they don't need?
5. Are there other (better) alternatives in our case?

Would you consider what you have to be a software product line? If so, there are different techniques for managing product lines that can be applied. You can derive different repository structures based on how you build and configure variants. If you don't have a product line, that's a different problem with a different set of possible solutions. — Thomas Owens, Commented Jan 18, 2021 at 12:25
@ThomasOwens Yes, that describes our goal remarkably closely. We will have products with very much overlap, and it's very desirable if most changes would be configuration-only. What complicates it, is our support of third party external sensors. In that case, we would always need to roll out a specific software version that supports that new hardware. — Kodiak, Commented Jan 18, 2021 at 12:36

Flater · Accepted Answer · 2021-01-18 11:15:22Z

If you've never seen or heard about snow in your life, and I put you in the middle of a snowy field, you're going to be asking yourself things that seem naive to someone who has a modicum of experience with snow.

That is what you've done here. You come from what most of us would call a very outdated concept, you're on the cusp of entering our world (i.e. that of modern-day development standards), and you're asking questions from a "funnily outdated" perspective, worrying about things that we don't even have to think about.

It's perfectly understandable to not really "get it" right away, as you haven't worked in this new system yet and don't quite see how it balances itself. But it's also hard to respond to every possible question you come up with, because the premise of some of them is very misguided and it requires a repeated back and forth to find out where the premise went off the rails.

I did my best to answer your specific worries below, but it often boils down to "believe me that it is X". It is good to be a critical thinker, and you've done your due diligence on this question. But it's also good to sometimes realize that you don't quite understand something yet, and thus rely on the fact that (a) others tell you it is good and (b) many, many others are using and advocating for this system you're inexperienced with.

Project setup and management becomes harder (it would e.g. perhaps need a script to get the right version of the non-project-specific repo)

Quite logically, having more than one repository brings with it the overhead cost of having to handle multiple repositories. That is inevitable logic.

But without a shadow of a doubt, the overhead cost is massively worth what you get in return. There's a reason why pretty much every modern development team uses it. It cuts down on so many conflicts, so much repetition, and so much code juggling.

Each project can have its own rules (branching strategy)

Unless there is a project that has a concrete reason to follow a different branching strategy, I strongly suggest that every project follows the same branching strategy.

Uniformity brings with it an innate understanding of how things work even when you're new to the project. If you've worked with projects A, B and C, all with the same structure and branching strategy, then you're going to be able to hit the ground running when you start on project D with the same structure and branching strategy.

Note that a lot of modern day development principles are all about reducing complexity as much as possible to keep things manageable. Having wildly varying branching strategies is an added complexity that you don't need.

We would have to setup CI for each extra repo (build validation, code style, policies)

You said you were going from one company project to now having multiple projects. Having multiple build pipelines was always going to happen.

Note that coding style validation should be uniform across projects (see previous point), and potentially whatever you mean by "policies" as well.

Secondly, having extra pipelines isn't particularly an issue. They're mostly copy/pasteable, with some small alterations between projects. It's usually a matter of setting it up once and then potentially tweaking it a handful of times.

Setting up a build pipeline for a project is not a meaningful amount of work relative to the development of the project itself.
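
To make the "copy/paste with small alterations" point concrete, here is a minimal sketch in Python. It is not tied to any real CI product: the step names, the toolchain strings and the `pipeline_for` helper are all made up for illustration. The idea is only that every project starts from the same shared definition and states its few differences explicitly.

```python
import copy

# Hypothetical shared defaults. Every value here is a placeholder, not a
# setting of any real CI system.
BASE_PIPELINE = {
    "steps": ["checkout", "style-check", "build", "test"],
    "style_config": "shared/code-style.cfg",
    "toolchain": "default-toolchain",
}


def pipeline_for(project, **overrides):
    """Return a pipeline definition for one project.

    The project gets a private copy of the shared defaults, with only the
    keys named in `overrides` replaced.
    """
    config = copy.deepcopy(BASE_PIPELINE)
    config.update(overrides)
    config["project"] = project
    return config


# Project A takes the defaults as they are; project C only swaps the toolchain.
pipeline_a = pipeline_for("project_a")
pipeline_c = pipeline_for("project_c", toolchain="other-toolchain")
```

Because the style check lives in the shared defaults, it stays uniform across projects unless a project overrides it on purpose.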

[..] Won't build validation and such become a lot harder because not all code is meant to be together (e.g. different projects)?

Your end products, i.e. the solutions that are the deliverable project, will still run their build including the common libraries (i.e. your non-project-specific code).

So what you're worried about is a non-issue. The build of a project still validates the entire stack of code, as it should.
The build of your non-project-specific code obviously doesn't include any project-specific code, because it is specifically project-agnostic. But that's the idea, so not an issue.

[..] How do we keep our freelancers out of the code they don't need?

I don't quite understand. How would you be doing it on a monolithic repository?

Having separate repositories enables you to provide access control to specific repositories. It doesn't create a problem here, it provides a solution.

Because of 1-3, would monorepo be better?

Other than the feedback I already provided on points 1-3, monorepo is just not a good approach.

Every time we develop a new technology, we first make it monolithic. The first car had its parts welded together. The first computer had no discrete components and was just one big circuit. The first application was a single-file single-project application. The first code was not OOP and instead used global statics.

And with all those cases, once they started improving on it, they noticed that they needed to subdivide things more, so that things would be easier to build, easier to fix, and easier to configure.

Monorepo (for multiple projects with an independent life cycle) has many, many disadvantages. You're not really perceiving them because they're currently your daily bread and you've got a handle on them. But take it from me who's standing on the other side of the bridge, that those disadvantages will dissipate when you move over. In exchange, there'll be some overhead management required, but the good outweighs the bad by several orders of magnitude.

Currently we are leaning towards making a repository that will include all the non-project-specific firmware code.

You've not really discussed how your projects are going to reference your libraries. I suspect you're thinking of having developers (and the build server) check out both the project and library repos, and link them directly.

An example to show why that's a bad idea: Project A uses library A v2.0, but project C still uses library A v1.0 (and cannot upgrade). So any developer working on both A and C is going to have to constantly check out the correct version of the library in order to keep their project working.

What you want instead is to have a versioned release system. Essentially, a "release collection" where you find all released versions of all of your libraries.
Speaking as a .NET developer: NuGet does precisely this. It's an online collection of published releases. But in its very essence, you could even just have a shared network drive that houses the DLLs.

In the end, the most important part is that this resource has all published versions of your library available. This way, projects A and C can each reference their own specific library versions, without the developer needing to constantly hop from one to the other.
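
As a rough sketch of what that looks like (in Python, with made-up names), assume the release collection is nothing more than a lookup from library and version to a published artifact, and that each project keeps a small manifest of the versions it pins. The library name, the paths and the `resolve` helper below are hypothetical. A real package manager such as NuGet does far more than this.

```python
# Hypothetical release collection: every published version of a library stays
# available. The paths are placeholders for wherever the artifacts are hosted.
RELEASES = {
    "library_a": {
        "1.0.0": "releases/library_a/1.0.0/library_a.bin",
        "2.0.0": "releases/library_a/2.0.0/library_a.bin",
    },
}

# Each project pins the exact version it was built and validated against.
PROJECT_A = {"library_a": "2.0.0"}
PROJECT_C = {"library_a": "1.0.0"}  # cannot upgrade yet


def resolve(manifest, releases=RELEASES):
    """Map each pinned library to its published artifact.

    Raises LookupError when a manifest asks for something that was never
    published, so a bad pin fails the build instead of silently drifting.
    """
    resolved = {}
    for library, version in manifest.items():
        try:
            resolved[library] = releases[library][version]
        except KeyError:
            raise LookupError(f"{library} {version} has not been published") from None
    return resolved


artifacts_a = resolve(PROJECT_A)
artifacts_c = resolve(PROJECT_C)
```

Both projects resolve side by side from the same collection, so a developer working on A and C never has to switch a shared checkout back and forth.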

One more thing to consider: should you do one repository per library or one repository for all of them?
Well, here is where we get to the part where we do consider the overhead cost. If these libraries are small, and the total repository size would not cause problems, then you can argue that the cost of having separate repositories for every tiny library is becoming silly.

However, each individually published library should have its own build pipeline, since it has its own life cycle and release schedule. You wouldn't want to have to rebuild and release your entire collection just because one of them had a minor change. This is going to lead to many "unchanged" new releases of the other libraries.

In the end, you should theoretically split up your published libraries into repositories of their own, but this can be ignored for sufficiently tiny libraries. In all cases, have a separate build/release pipeline for each library, and host all releases on a commonly available resource to minimize the hassle when consuming these libraries.
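
If several small libraries do end up sharing one repository, the separate pipelines still need a way to tell which library a change belongs to. Here is a minimal Python sketch of that decision, assuming (this layout is my assumption, not something from the question) that each library lives in its own top-level directory. The directory names and the `libraries_to_release` helper are hypothetical.

```python
from pathlib import PurePosixPath


def libraries_to_release(changed_files, library_dirs):
    """Return the sorted names of the libraries touched by a change.

    `changed_files` are repository-relative paths using forward slashes.
    `library_dirs` is the set of top-level directories that each hold one
    individually published library.
    """
    touched = set()
    for changed in changed_files:
        parts = PurePosixPath(changed).parts
        if parts and parts[0] in library_dirs:
            touched.add(parts[0])
    return sorted(touched)


# Only the battery library changed, so only its pipeline should publish.
changed = ["battery_management/src/charge.c", "README.md"]
libraries = {"battery_management", "hardware_drivers"}
to_release = libraries_to_release(changed, libraries)
```

With this, a minor fix in one library triggers one release rather than a batch of "unchanged" new versions of everything else.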

Thank you very much for this extremely thorough response, I appreciate it a lot and have taken the time to read it multiple times. I can see that for multi-repos, the repeated (copy/paste) work for setting up the CI pipelines (building, coding style, etc.) is minimal and definitely acceptable. As for NuGet, would having versioned release branches/tags be ok? Also, I think those should include the source code, or should they be strictly precompiled code only (e.g. .a or .dll)? — Kodiak, Commented Jan 18, 2021 at 12:25
@Kodiak: I suggest only giving them the DLL. That's not to say you can't give them read access to the source repository if you really need to (it may be useful for some fringe debugging issues), but I wouldn't particularly bundle it in the release. One of the major points of separating your libraries from your projects is so that your project developers don't have to bother about the library's internals. Giving them the source code is effectively a distraction, when they should really only be interested in the released DLL. — Flater, Commented Jan 18, 2021 at 12:28
And as I see it now: The workflow of setting up a new project would be like this: Create a project specific repository and local directory. Write a script that gets all the required modules from the different repositories, and add that script to the dedicated repository so that other team members can easily use it. For the rest, nothing new. The rest of the workflow would be just as it is already? — Kodiak, Commented Jan 18, 2021 at 12:29
@Kodiak: You're suggesting to reinvent the wheel, which is never a good approach to default to. NuGet provides a browsable/searchable list, and because you can configure multiple sources it includes many (if not most) third-party libraries on top of any of your own NuGet feeds. It also allows you to (optionally) configure a reference so that it auto-updates when a new version is released, or so that it doesn't update unless you manually tell it to. It also runs automatically in most NuGet-integrated IDEs (Visual Studio, Rider) whenever you build your application. — Flater, Commented Jan 18, 2021 at 13:16
@Kodiak: "but I don't have any experience with it and therefore fail to see the advantages of using it" This is a dangerous line of thinking when you're specifically starting out with a new approach that you don't have any experience with yet. It sets you up for refusing to modernize anything until you've personally vetted it up and down. I suggest taking the other route: do things the new way, until you find a reason not to. It'll be significantly easier to find your footing in this new approach. — Flater, Commented Jan 18, 2021 at 13:18
