About repository usage

Introduction

It goes by several names: source code control, source management, software configuration management, and so on. No matter what you call it, or which software tool you use, keeping track of changes in source code is of the utmost importance for software development.

Source control has several benefits. It helps document a project's evolution, safeguards source code files, and lets you roll back changes in case of a regression. It is also a valuable tool for helping the programmers on a team share source code and ideas, especially when integrated with an e-mail notification system that reports each commit. As a result, it enables even large teams to work together smoothly and makes it easier for team leaders and managers to assess a project's progress.

Even with all these advantages, some surveys indicate that only a minority (a rough estimate of 25% to 40%) of software development teams use source code control, and the majority of universities don't teach programmers to use it.

Even if you already use some tool, there is a set of good practices for source code control that you may not be aware of. You should try to follow these rules about repository usage, since all these practices combined improve collective code ownership, thus helping to avoid the much-dreaded pitfall of 'knowledge islands'.

Remember that source control is not just about source code backup; it is all about project history.

Commit early and commit often!

Whenever possible, commit your changes as soon as you have a small piece of code working. Keeping lots of code exclusively on your own computer can create merging headaches later. Committing often also gives management an idea of how a given sprint is progressing.

Commits should be atomic

As far as possible, keep your changes atomic, not mixing several unrelated changes in a single commit. In any given commit you should wear only one of four hats:

  • Refactoring code, but only changing the interface
  • Refactoring code, but only changing the implementation
  • Adding new functionality, but only changing the interface
  • Adding new functionality, but only changing the implementation

Large commits can break the build and be difficult to merge. If your SCS (Source Control System) doesn't check for updates automatically, remember to always do so yourself before trying to commit.

Trunk code should be buildable

The primary source code trunk must be able to produce a full working copy of the project at any time. To achieve this, each new piece of code should be tested to ensure that it at least compiles before it is committed to the main repository.

If you are working on a new feature that requires widespread changes in other components, you can hold your changes back and make one big commit to guarantee that you do not break the build.

An even better way to handle a change with multiple dependencies is to write and commit the interfaces first and only later introduce the function/object calls into the client code. Say, for example, that you need to create a new function called 'foobar()' and also client code that calls it. You can break the whole thing into 2 commits with the following logs:

  1. 'adding new function foobar(), it does something';
  2. 'calling foobar(), so app now is not a dummy one';
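
As a rough illustration of the split above, here is a minimal Python sketch. The body of foobar() and the main() client are invented for the example; only the name foobar() and the two log messages come from the text.

```python
# Commit 1: 'adding new function foobar(), it does something'
# The function is in the trunk, but nothing calls it yet,
# so existing client code keeps building exactly as before.
def foobar(value):
    """Hypothetical body for illustration: double the input."""
    return value * 2


# Commit 2: 'calling foobar(), so app now is not a dummy one'
# Only the client code changes here; foobar() is already available.
def main():
    return foobar(21)
```

Because each commit leaves the trunk buildable on its own, either one can be reviewed or rolled back without touching the other.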

What to commit?

Commit only files that cannot be automatically regenerated. Object files, library files, automatically generated shell scripts, backup files, etc. must not be added to the project.
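
The rule can be sketched as a small filter applied to a list of candidate files. This is only an illustration: the suffix list and the function names are assumptions made for the example, and in practice most tools offer their own ignore mechanism for this job.

```python
from pathlib import PurePath

# Hypothetical suffix list: adjust it to whatever your build really produces.
GENERATED_SUFFIXES = {".o", ".obj", ".a", ".so", ".lib", ".dll", ".pyc", ".bak", ".orig"}


def is_generated(path):
    """Return True if the file looks like build output or an editor backup."""
    name = PurePath(path).name
    # Editor backups such as "main.c~" are recognized by their trailing tilde.
    if name.endswith("~"):
        return True
    return PurePath(name).suffix.lower() in GENERATED_SUFFIXES


def files_to_commit(paths):
    """Keep only the files that cannot be regenerated from other sources."""
    return [p for p in paths if not is_generated(p)]
```

For example, files_to_commit(["src/main.c", "src/main.o", "libfoo.a", "README", "main.c~"]) keeps only "src/main.c" and "README".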

Commits must have a log

A commit without a log is like a child without a father. Logs can be a decisive tool for determining project progress and for discovering what happened when things get nasty (and rest assured that they do!).

Logs also help you know which component a given team member is currently working on and, finally, improve the exchange of ideas among the team's programmers.

To be useful, a log must first and foremost describe why the developer made the commit (the what is obvious from simply reading the code!). It also needs to provide enough context that a developer can grasp an idea of the whole project just by reading the repository logs (without making 'diffs' between commits).

Poor programmers don't write logs. Good programmers write logs that need the code for full understanding. Really great programmers write logs that even include references to external documentation or bug tracking tickets.
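
Part of this advice can be checked mechanically before a commit is accepted. The sketch below is a hypothetical helper, not a feature of any particular tool: the ticket format ("#123" or "BUG-42") and the four-word minimum are assumptions chosen for the example, and no script can judge whether a log really explains the why.

```python
import re

# Hypothetical ticket formats such as "#123" or "BUG-42"; adapt to your tracker.
TICKET_PATTERN = re.compile(r"(#\d+|\b[A-Z]+-\d+\b)")


def review_log(message):
    """Return a list of complaints about a commit log; an empty list means it passes."""
    text = message.strip()
    if not text:
        return ["log is empty"]
    problems = []
    # Arbitrary threshold: a handful of words can rarely explain a reason.
    if len(text.split()) < 4:
        problems.append("log is too short to explain why the change was made")
    if not TICKET_PATTERN.search(text):
        problems.append("no reference to a ticket or external document")
    return problems
```

A log such as "Cache parsed config because reloading it on every request was slow (#42)" passes both checks, while a bare "fix" fails both.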

Acknowledgments

This text is a write-up of the good repository usage practices followed by the now-defunct Conectiva Manaus team. It also benefits from discussions with, and suggestions from, Ademar de Souza Reis Jr. <ademar@ademar.org>.

Any omissions or errors in this text are my very own failure and shame. ;-)

- Adenilson Cavalcanti <savagobr@yahoo.com>