CODING: programming guidelines and best practices

There's a post in my blog about this text, but in Brazilian Portuguese.


INTRODUCTION
------------------------------------------------------------------------------
This coding-style and best-practices text is based on a document
produced by the Mandriva Development Team at Manaus (2004-2006).

It covers the following topics:
  - C coding style
  - Bash coding style
  - Code documentation and instrumentation
  - Commits and changelogs
  - Unit tests
  - Build-system

It is supposed to be a simple text and in direct language (KISS). The
idea is to keep it on the root of the project repository so that
developers and contributors can consult it easily.

(C) 2005-2006 Mandriva Conectiva S/A.
(C) 2006-2008 Ademar de Souza Reis Jr. <ademar@ademar.org>

Licensed under GNU/FDL. The latest version can always be found at
https://www.ademar.org/texts/coding.html.

Please send comments, contributions and fixes to <ademar@ademar.org>.

$Id: CODING 241 2008-02-05 14:49:14Z ademar $

C CODING STYLE
------------------------------------------------------------------------------
Read linux/Documentation/CodingStyle if you haven't done so yet.

Some highlights:

  - Outside of comments and documentation, never use spaces. Indentation
    is done using tabs only;

  - Do not use tabs to align text documentation. Although the preferred
    tab width is 8 characters, changing tab width should not interfere
    with the layout/alignment of code, only indentation;

  - Maximum text width is 80 columns, with very few exceptions;

  - Do not put multiple statements on a single line:

      wrong:
      if (condition) do_this;

      right:
      if (condition)
          do_this;

  - Variables should always be in lower case, like:
      char *tmp_buf;
      int length;
      int i, j;

  - Macros (#defines) for constants should be in upper
    case, like:
      #define MAX_NUM_ENTRIES 10
      #define WHATEVER 0x220

  - Do not use typedefs to hide structs or enums, unless
    it's absolutely necessary to create an abstract type. If
    that's the case add a '_t' suffix to identify it;

  - Never use typedefs just to hide pointers, unless
    it's a function pointer;

  - CamelCase is discouraged, with one exception:
      - In cases where it better integrates with other coding-style,
        to avoid mixing different cases (for example, if the code
        is *heavily* tied to some library which uses such notation);

  - When using a (semi) object-oriented paradigm, separate
    "classes" from methods/attributes using '__', as below:

    foobar__do_something()        ["class":foobar, method:do_something()]
    user_adm__change_pass()       ["class":user_adm, method:change_pass()]
    bobex__lock                   ["class":bobex, attribute:lock]
    backend_sysfs__methods        [...]

    This must be done carefully, since it may cause unwanted code
    pollution;

  - All non-static functions should have a prefix to avoid symbol names
    clashing, as well as new defined types and exported macros, as in
    the example below:

      atc_dos2unix()
      ^^^

  - When performing error checking, use goto to help creating a single
    return point. Do not abuse goto usage though (avoid going up, going
    too far, too much labels, etc);

  - The most important lesson about coding style is to keep your source
    consistent. For example, using non-idiomatic (alien) constructions
    is strongly discouraged;

  About header files and modularization:

     - Source code has to be split into modules (AKA units), which are
       defined as a collection of implementation files (.c) with an
       interface exported through a header file (.h);

     - The inclusion (#include) of headers must reflect the dependencies
       between modules in the implementation. The most important
       implications of this statement are:

         . implementation files (.c) *must* include *all* headers it
           directly depends on;
         . implementation files (.c) *must not* include headers it
           doesn't directly depend on;
         . headers should only include headers (nested headers) when
           there's a need for a definition or declaration;
         . headers should never include other headers just to create a
           "single point of inclusion".

     - Local headers (from the same component) are included using "",
       with an exception for the auto-generated files "config.h" and
       "system.h", which can be included using <>.


BASH CODING STYLE
------------------------------------------------------------------------------

  - Never create hardcoded files/dirs on /tmp and alikes. Use mktemp
    instead. Example:
      TMP1=$(mktemp -d /tmp/fstests.XXXXXX) || exit 1

  - When implementing test scripts, do not write on directories/files
    outside your temp dir. And never read files available only in your
    machine;

  - Maximum text width is 80 columns, with very few exceptions;

  - Be portable (Mandriva, Fedora, SuSE/Novell, Ubuntu, etc);

  - Use bash2 syntax (e.g. $(command) instead of `command`);

  - Make sure the programs you invoke exist before calling them,
    preferably at the beginning of your script:
    Examples:
      [ "$(which "$PROG")" ] || { echo "Error: $PROG not found" >&2; exit 1; }
      [ -x /bin/mail ] || { echo "Error: /bin/mail not found" >&2; exit 1; }

  - Use functions and modules when implementing complex scripts;

  - Check the result of every command and exit with an error message
    if it fails. Use "set -e" by default (unless of course your script
    depends on the return code of some command);

  - Always wrap variables with "" in tests, to avoid failures if the
    variable is empty. Examples:
      [ -x "$PROG" ]
      [ -f "$FILE" ]
      [ -e "$ANSWER" ]

  - Use getopt() to parse command-line arguments (better yet: use getoptx
    from Grigoriy Strokin);

  - CODE DOCUMENTATION and CODE INSTRUMENTATION sections have
    insightful remarks which are valid for BASH programming as well.


CODE DOCUMENTATION
------------------------------------------------------------------------------

  - Add FIXME, TODO and XXX comments appropriately;

  - Never commit commented-out code without adding the reason
    why it's still there but disabled. Removing the code is
    better than commenting-out it;

  - Use doxygen for code documentation;

  - Add simple READMEs and pictures as needed, but concentrate
    on doxygen;

  - Never use a language different than English.


CODE INSTRUMENTATION
------------------------------------------------------------------------------

  - use const for non modifiable function parameters;
  - const is usually better than #define;
  - use static for internal functions;
  - use safer functions like snprintf(), strncpy() and alikes;
  - check validity of all received parameters;
  - use assert() where appropriate;
  - do not use alloca() or other non-recommended functions;
  - check for return errors even from malloc() and other
    standard system calls;
  - compile and run your code with ElectricFence enabled while
    developing;
  - never commit debug printf() and alike calls unless they're
    inside debug-enabled macros;
  - use valgrind to profile you application and find memleaks
    regularly.


COMMITS AND CHANGELOGS
------------------------------------------------------------------------------

  - in a changelog, what matters is not "what" but "why";
  - several commits are usually better than a big one;
  - do not commit unrelated modifications at once unless they're
    really trivial;
  - commit early, commit often;
  - always run "svn diff" and "svn status" before a commit and
    document all changes in the changelogs;
  - your changelog must be useful in a "svn log" operation, not
    just to people receiving the diff by e-mail (think about
    its usage years later, by someone not involved in the project)
  - changelog messages should respect the code text width limit
    (around 80 columns);
  - as expected, code in the repository should always compile,
    without warnings.


UNIT TESTS
------------------------------------------------------------------------------

  - All code should be written along with unit-tests. The tool
    used to implement the tests is "check", available on
    http://check.sourceforge.net/.

    The build-system infra-structure provides a configure option to
    allow check usage.

    The tests must be implemented inside a sub-directory called utests
    with the following modules:

    README                 --> duu :-)
    check_<component name> --> the test-manager
    check_<unit name>      --> implements tests for interfaces
                               exported by unit
    check_<...>

    Just to remember, a unit, or module, is a collection of
    source-code files (.c) with an interface exported through
    a header file (.h).

    All interfaces exported by units must be tested (that is,
    all non-static functions). The tests should implement at
    least the following test cases:

      - standard usage
      - upper and bottom limits of pre-allocated buffers
      - upper and bottom limits of variables (like sizes and
        lengths)
      - error conditions
      - invalid input parameters
      - test each possible use-case scenario

    Use incremental tests, that is, test functions before using them in
    other test-cases (for example, if you need to call function A to
    prepare the environment to test function B, then test function A
    first).

    Whenever possible, static functions should also be tested. In this
    case, you can include the .c file in the utest file, something like:

      #include "mymodule.c"

    If the test needs an external entity to work (for example,
    it needs to communicate with a hardware device), or if the test
    requires specific privileges, put the test inside a condition for
    an environment variable and document it in a README file.


BUILD-SYSTEM
------------------------------------------------------------------------------
Standard instructions:

  - Code should compile with no warnings, using the following GCC
    options:

    C:
       -Wall
       -W
       -Wmissing-declarations
       -Wmissing-prototypes
       -Wredundant-decls
       -Wshadow
       -Wbad-function-cast
       -Wcast-qual
       -Wconversion
    C++:
       -Wall
       -W
       -Wredundant-decls
       -Wshadow
       -Wcast-qual
       -Wconversion
       -Weffc++

  - Minor components or proof-of-concept code can have a simple
    buildsystem (Makefile based), but main components should use
    the complete template.

# vim:tw=70:ts=4:et: