60

There are very complex open source projects out there, and I think I could contribute to some of them, and I wish I could, but the barrier to entry is too high for a single reason: to change one line of code in a big project you have to understand all of it.

You don't need to read all the code (even if you did, it wouldn't be sufficient) and understand everything every single line does and why, because the code is probably modularized and compartmentalized, so there are abstractions in place. But even then you need an overview of the project so you know where the modules are, where one module interfaces with another, what exactly each module does and why, and in which directories and files each of these things happens.

I'm calling this a code overview, as the name of a section that open source projects could have on their website or in their documentation, explaining their code to outsiders. I think it would benefit potential contributors, as they would be able to identify places where they could build; the primary coders involved, as writing everything down would let them reorganize their thoughts; and users, as it would help them understand and better report the bugs they experience, and maybe even become contributors.

But still, I have never seen one of these "code overviews". Why? Are there things like these and I'm missing them? Things that do the same job as I am describing? Or is this a completely useless idea, as everybody except me can easily understand projects with thousands of lines of code?

18
  • 7
    You mean a design document? I've seen the rare project with a description of each package but that's usually an API already. Commented Nov 30, 2014 at 18:28
  • 14
    Why? Because there are few projects whose maintainers want to invest the effort to write and maintain high-quality documentation, and often they may not understand the benefits either.
    – Alex D
    Commented Nov 30, 2014 at 19:39
  • 9
    Documentation can be out-of-date or inaccurate relative to actual behavior. Code can't. So most projects prefer code. Commented Dec 1, 2014 at 8:31
  • 5
    Also it's easy to underestimate how much you can learn about a project if you set a kitchen timer for 2 hours or so and Just Read It (tm).
    – Kos
    Commented Dec 1, 2014 at 9:57
  • 43
    Welcome to the community-driven world: if it's not done, that's because you haven't done it :) Commented Dec 1, 2014 at 15:17

6 Answers

59

Because it's extra effort to create and maintain such a document, and too many people don't understand the associated benefits.

Many programmers aren't good technical writers (although many are); they rarely write documents strictly for human consumption, therefore they don't have practice and don't like doing it. Writing a code overview takes time that you can't spend on coding, and the initial benefit to a project is always greater if you can say "We support all three encoding variants" rather than "We have really neat explanations of our code!" The notion that such a document will attract more developers so that in the long run more code will get written isn't exactly foreign to them, but it's perceived as an uncertain gamble; will this text really make the difference between snagging a collaborator or not? If I keep coding right now, we will certainly get this thing done.

A code overview document can also make people feel defensive; it's hard to describe higher-level decisions without feeling the need to justify them, and very often people make decisions without a reason that "sounds good enough" when actually written down. There's also an effect related to the aforementioned one: since updating the text to suit the changing code takes additional effort, this can discourage sweeping changes to the code. Sometimes this stability is a good thing, but if the code really does need a mid-level rewrite, it turns into a liability.

7
  • 6
    Well, it seems the answer is yes: gnunet.org/gnunet-source-overview
    – fiatjaf
    Commented Nov 30, 2014 at 23:02
  • 5
    If you want it to exist, volunteer to write it. The whole point of open-source projects is that people can and should contribute what they can, subject to the community agreeing that it's worth integrating.
    – keshlam
    Commented Dec 1, 2014 at 5:37
  • 8
    @keshlam - that makes sense if you're already a contributor to the project... but if you're a potential contributor who is trying to get a basic idea of how the code works, you're the worst person possible to write that document....
    – Jon Story
    Commented Dec 1, 2014 at 11:49
  • 13
    @JonStory Your point is a good one, but in practice I've found the opposite is true sometimes, too. In some projects I've ended up writing a bunch of documentation based on notes I made while learning an undocumented code base. It was better documentation because I had to start at the API I could see and then dig deeper and deeper. The developers who had written the code already had a model of the code in their heads, and so had lots of assumptions about what someone would already know. Documentation by someone new to the project can be better documentation for someone new to the project. Commented Dec 1, 2014 at 13:09
  • 6
    @JonStory: If you're getting involved in a less-than-perfectly-documented project, you're going to have to start figuring this out anyway. Making your notes part of the project helps save the next person work. (I don't know that anyone would use the presence or absence of docs as a deciding factor on whether to contribute.) Simply improving the javadoc comments (or equivalent) can be a valuable way to start contributing. Seriously, that's the basic principle behind open-source: If you see something that needs to be done, DO it rather than waiting for someone else to.
    – keshlam
    Commented Dec 1, 2014 at 13:13
14

The dry, harsh truth?

Documentation is not made because projects can do without it.

Even open source projects often face stiff competition. Most such projects don't start with broad shoulders; they start from a bright idea, often one person's bright idea.

As such, they can't afford the time and cost of taking on human documentors, even if those offered to cooperate for free. A documented project, in fact, has usually gone through several early iterations first. It often starts with 1-3, maybe 5 guys writing their novel idea down and showing it to the world as a proof of concept. If the idea proves good, "followers" may join, and they tend to start asking for extensions, new options, translations... At this point the code is still a prototype, usually with hard-coded options and messages.

Not all open source projects go beyond this phase, only those that break the "critical mass" needed to attract public interest. Moreover, one of the beginning developers has to "think big and far" and plan for expansions and so on. He might as well become the project "evangelist" and sometimes also "project manager" (other times it's different people). That's a necessary step to bring the project up, from proof of concept to an industry established reality.

Even then, the project manager may opt to not create documentation.

A dynamic, growing project would be slowed down by it, and the documentation would still lag behind the code while the code is being heavily enhanced to implement translations, options, plug-in managers...

What usually happens is:

  1. A brief introductory document is made, about what the project is and where it's going to (the famous "roadmap").
  2. If possible, an API is developed and that one is elected as "documented code" over the bulk of the underlying code.
  3. The API especially, but also the other code, is reformatted and "PHPdoc" / "Javadoc" etc. special comments are added. They offer a decent compromise between time spent and reward: even a modest programmer usually knows how to write a one-liner describing his functions, parameters get "auto" documented as well, and the whole is tied to the code it describes, which avoids documentation "desyncing" and lagging behind development.
  4. Most often, a forum gets created. It's a powerful social medium where end users and programmers can talk to each other (and among their peers, possibly in "devs only" subforums). This allows a lot of knowledge to slowly emerge and get consolidated into community-made (read: not weighing on the developer team) FAQs and HOWTOs.
  5. In really large projects, a wiki is also produced. I state "large projects" because they are often those with enough followers to create a wiki (a dev does) and then actually fill it beyond the "bare bones" (the community does).
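As a rough illustration of the doc-comment compromise in step 3, here is a minimal sketch using Python docstrings as a stand-in for the "PHPdoc" / "Javadoc" style. The function names and the `config` dictionary are hypothetical, invented only for this example. The first docstring is the kind of filler that merely restates the name; the second says something the signature does not.

```python
import inspect


def get_timeout_filler(config: dict) -> float:
    """Returns the timeout."""  # Filler: restates the function name, adds nothing.
    return float(config.get("timeout", 30.0))


def get_timeout(config: dict) -> float:
    """Return the network timeout in seconds.

    Falls back to 30 seconds when ``config`` has no ``"timeout"`` key,
    so callers never need to handle a missing value themselves.
    """
    return float(config.get("timeout", 30.0))


# The docstring lives next to the code it describes, so documentation
# tools (or a plain help() call) can extract it without a separate file.
print(inspect.getdoc(get_timeout))
```

Because the comment sits directly above the body, whoever changes the fallback value is looking at the sentence that describes it, which is the anti-"desyncing" property the list item refers to.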
4
  • 2
    WOW!! we live (and work) in two totally different worlds. Wherever you are currently working, get out of there fast & find a company (there are many) where it gets done correctly because that actually saves you money. Don't ever let pointy headed managers / cowboy coders try to tell you otherwise.
    – Mawg
    Commented Dec 1, 2014 at 12:21
  • 6
    +1, I agree with almost all of your points, the only statement I strongly reject is that parameters get "auto" documented. When we think of explanations rather than the mere syntax/type constraints, nothing gets "auto-documented"; a generated comment in the style Returns the X. for a getX method is not helpful documentation, it's just a filler without any extra information. Commented Dec 1, 2014 at 13:29
  • 3
    @Mawg providing good documentation is an investment, you forego developer time in return for (hopefully) more contributors in the future, and some other benefits. But like many of its kind, it's only worthwhile if you know there's a good chance the project will succeed, and most software projects fail. It's important to be aware of survivorship bias when you lament the lack of documentation in successful projects. Commented Dec 2, 2014 at 0:13
  • Isn't it possible that those projects fail because they don't document? And by document, I mean plan, so that you understand, rather than sit down at the keyboard & pound away. Here's my estimate for a project life-cycle, all figures +/- 5%. Up front stuff (requirements & use cases, architecture, detailed design) 50%, coding 10 to 15%, testing, the rest. "If you fail to plan, you plan to fail"
    – Mawg
    Commented Dec 2, 2014 at 21:12
6

Overview documents such as you describe are rare even on commercial projects. They require extra effort with little value for the developers. Also, developers tend not to write documentation unless they really need to. Some projects are lucky to have members who are good at technical writing, and as a result have good user documentation. Developer documentation, if it exists, is unlikely to be updated to reflect code changes.

Any well organized project will have a directory tree which is relatively self-explanatory. Some projects will document this hierarchy and/or the reasons it was chosen. Many projects follow relatively standard code layouts, so if you understand one you will understand the layout of other projects using the same layout.
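For a project that does not document its hierarchy, a first outline can be produced mechanically. The following is only a sketch: the `outline` function and its `max_depth` parameter are names made up for this example, and it lists directory names without explaining why the layout was chosen, which is the part a human still has to write.

```python
import os


def outline(root: str, max_depth: int = 2) -> list[str]:
    """Return an indented listing of directories at most max_depth levels below root."""
    lines = []
    root = os.path.abspath(root)
    base_depth = root.rstrip(os.sep).count(os.sep)
    for current, dirs, _files in os.walk(root):
        # Skip hidden directories such as .git and keep the order stable.
        dirs[:] = sorted(d for d in dirs if not d.startswith("."))
        depth = current.rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            dirs[:] = []  # Do not descend any further.
        name = os.path.basename(current) or current
        lines.append("  " * depth + name + "/")
    return lines


# Print an outline of the current working directory.
print("\n".join(outline(".")))
```

Run from the top of a checkout, this prints the top two levels of directories, which is often enough to guess where a given kind of function lives.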

To change a line of code you need a limited understanding of the surrounding code. You should never have to understand the whole code base in order to do so. If you have a reasonable idea of the kind of function that is broken, it is often possible to navigate the directory hierarchy rather quickly.

To change a line of code you need to understand the method within which the line is found. If you understand what the expected behavior of the method is, you should be able to make corrective changes, or extensions to the functionality.

For languages which provide scoping, you can refactor privately scoped methods. In this case you may need to change the callers as well as the refactored method or methods. This requires a broader, but still limited, understanding of the code base.

See my article Adding SHA-2 to tinyca for an example of how such changes can be done. I have an extremely limited understanding of the code used to generate the interface.

2
  • 1
    The important point here wasn't to assert how much you need to know about the code in order to make a change. Of course this will depend on a lot of things, and you'll never need to understand the whole code, nor will an overview give you that understanding. But even to find the line of code you'll change, you need a certain knowledge of the general project structure.
    – fiatjaf
    Commented Nov 30, 2014 at 22:19
  • +1 There is nothing special about open source. In my more than 10 years of experience working in industry I've never once seen an overview document. What typically happens is that employers expect the first month of your employment to have zero productivity because you're studying the codebase. "Overviews" are usually implemented as asking your co-workers questions.
    – slebetman
    Commented Dec 2, 2014 at 3:45
5

Are there things like these and I'm missing them? Things that do the same job as I am describing?

There is an excellent book called The Architecture of Open Source Applications that provides detailed descriptions of a variety of high-profile open source software projects. However, I'm not sure if it exactly fills the role you're imagining, because I believe its primary audience is intended to be developers looking for patterns to follow when creating their own applications, not new contributors to the projects featured in the book (though I'm sure it could be helpful there).

6
  • this reads more like a comment, see How to Answer
    – gnat
    Commented Dec 1, 2014 at 19:14
  • 4
    I don't find your comment constructive. What, specifically, do you feel is lacking? Many of the other answers here are lengthy speculation about possible reasons why developers might not write overview documentation. I've linked to a specific example of good overview documents.
    – bjmc
    Commented Dec 1, 2014 at 20:06
  • 1
    I feel an answer to the question asked is lacking, "Why aren't there code overviews for open-source projects?"
    – gnat
    Commented Dec 1, 2014 at 20:08
  • 3
    I would argue it's not possible to respond accurately to the question as written when, in fact, there are code overviews for some open-source projects. I've edited my answer to make it clear that I'm narrowly responding to a request for examples the user may have missed.
    – bjmc
    Commented Dec 1, 2014 at 20:14
  • 1
    The question as written asks "Are there things like these and I'm missing them?" This answer responds definitively, pointing to an existing collection of such code overviews. As such I think it's a great (and appropriate) answer to the question. Commented Dec 1, 2014 at 20:40
4

Because there are far more open-source programmers than open-source technical writers.

Documentation takes maintenance and time to keep up to date. The more bulky the documentation, the more it takes. And documentation that isn't in sync with the code is worse than useless: it misleads and conceals instead of revealing.

A well documented code base is better than one less documented, but documentation can easily take as long as writing the code in the first place. So your question is, is it better to have a well documented code base, or a code base that is twice as large? Is the cost to keep the documentation up to date whenever code changes worth the contributions of extra developers it may or may not bring?

Shipping code wins. Reducing the amount of effort put into things other than shipping code can make code ship more often, and be more likely to ship before it runs out of resources.

This doesn't mean that things besides shipping don't matter. Documentation adds value to the project, and with a large enough project the interconnect cost of adding another developer might be far higher than that of adding a documentor. And, as noted, documentation can increase investment in the project (by making it easier for new programmers to join).

However, nothing sells like success: a project that isn't working or doing anything interesting rarely attracts developers either.

Documentation of a code base is a form of meta-work. You can spend a lot of time writing up fancy documents describing a code base that doesn't do much of value, or you can spend time making stuff that consumers of your code base want and make your code base have value.

Sometimes making things harder makes those who do the task better. Either due to a higher degree of commitment to the project (spending hours upon hours learning the architecture), or because of skill bias (if you are already an expert in related tech, getting up to speed will be faster, so the barrier of lack of such documentation is less important: thus more experts join the team, and fewer beginners).

Finally, for reasons noted above the current developers are likely to be experts on the code base. Writing such documentation doesn't help them understand the code base much, as they already have the knowledge, it only helps other developers. Much of open source development is based off of "scratching an itch" that the developer has with the code: lack of documentation that already says what the developer knows rarely itches.

1
  • +1 "documentation can easily take as long as writing the code in the first place" -- or longer!
    – Marco
    Commented Dec 4, 2014 at 14:55
-1

Besides it being extra effort, some open source projects cripple their documentation on purpose, in order to get freelancing jobs for their maintainers (to implement something, or to hold trainings). Not only do they lack a code overview, but their API docs and tutorials are bad or missing lots of things.

Just to name one quite popular project: bluez. Good luck finding a good tutorial, other than one for scanning for nearby devices.

10
  • 8
    No matter how many examples you can list for badly documented open source projects, in my opinion, the claim that they "are crippling their documentations on purpose" needs to be supported by conclusive evidence (and even then it probably doesn't hold as a general statement). Commented Dec 1, 2014 at 13:32
  • @O.R.Mapper Let's start with "Bluez - greatest linux mystery". As it is the only bluetooth library for linux, I find it hard to believe that it has no documentation because it is an extra effort. Hell, there is doxygen, and how hard is it to write simple tutorials? Commented Dec 2, 2014 at 12:06
  • @O.R.Mapper Then there is the linux kernel. If you are missing something (like a kernel driver) and your company is missing the expertise, you can either hire someone, or find a freelancer or a company that will do it for you. So, it is open source, but it comes with a price. Commented Dec 2, 2014 at 12:08
  • @O.R.Mapper Then there are open source projects with documentation in paper format. So you buy a book, and no other documentation is given. Is this documentation crippling, or not? Commented Dec 2, 2014 at 12:10
  • 2
    For what it's worth, i've seen enough profiteering off of shoddy documentation to at least wonder whether it's intentional. When the same groups putting half-assed documentation online are more than happy to sell you a book or a training class, it doesn't take much cynicism at all to reach that conclusion.
    – cHao
    Commented Dec 2, 2014 at 14:27
