Wartbed:Code style
From Dark Omen Wiki
Contents |
Note: | This page is primarily aimed at WARTBED developers. Currently it is written and maintained more as a guide to myself to help me with coding consistency. Some parts of WARTBED consists of code predating this Standard and may thus deviate from some of the guidelines. If parties interested in assisting with WARTBED feel dissuaded for aesthetics reasons due to this Standard then remember the golden rule: "Ignore all rules to achieve the best possible result". |
Also see the WARTBED design document. |
Syniopsis:
Basic Principles: Code Pedagogics
//-------------------------------------------------------- // Avoid mixing responsibilities // This function should be split into CookFood() // and RepairCar() functions // void CookFoodAndRepairCar() //-------------------------------------------------------- { Recipe howto = GetRecipe( "Rabbit casserole" ); Meal nomnom = CookFood( howto ); Todo items = ExamineCar( "Chevy" ); Cost topay = Repair( items ); } //-------------------------------------------------------- // Avoid mixing levels of abstraction // The low-level code in the middle of the function should // be replaced by a call to a "PlayGame()" function // void PlayDungeonGame() //-------------------------------------------------------- { CreateCharacter(); SelectDungeon(); for (Monsters::iterator i = currentDungeon.begin(); i != currentDungeon.end(); ++i) KillMonster( *i ); ImproveCharacter(); SaveProgess(); } //-------------------------------------------------------- // Avoid side effects // Why is the network and character data adjusted // in this function? // void PrintDebugMessage() //-------------------------------------------------------- { network.initialise(); debugStream << character.name << " is level " << character.level << endl; character.resetAllData(); }
Ultimately, everything in the WARTBED code style aims at code pedagogics, which is about the expressiveness and understandability of source code for human readers. Good code pedagogics lead to fast overview and understanding and higher maintainability of code and less frustration for programmers.
Some principles have emerged over time as best practices to achieve good code pedagogics:
- Accessibility
- Natural language: code should read as natural as possible: the closer code is to natural English the easier it is to scan and grok.
- Self-explanatory naming: the intent and use of identifiers should be evident from their names
- KISS: Avoid unnecessary complication.
- Abstraction and separation
- Separation of responsibilities: Different functionality should be separated and not mixed.
- Consistent level of abstraction: All code in a source segment (block and function) should be of the same level of abstraction; high- and low-level code should not be mixed.
- Principle of least surprise
- No side effects: Code should do what its name implies it should do and nothing else.
- Consistent styles and conventions: Code should stick to one set of metaphors and styles.
- Architecture and design
- Prefer standard language facilities
- Prefer to design by patterns
- Prefer short and concise functions
- Prefer fewer arguments to functions
C++ Language Facilities
// Use a smart pointer class to store dynamically allocated objects typedef auto_ptr<Model> PTR_MODEL; typedef auto_ptr<Renderable> PTR_RENDERABLE; void GoodConventions( std::string const &rParam1, DataStruct const &rData ) { // Variables are declared where they are used PTR_MODEL pShip = ModelFactory::createModel( "Spaceship" ); PTR_RENDERABLE pHero = SkinModel( pShip, "WingCommander" ); PTR_MODEL pAlien = ModelFactory::createModel( "Alien" ); PTR_RENDERABLE pEnemy = SkinModel( pAlien, "Kilrathi" ); if (pHero && pEnemy) { for (unsigned i = 0; i < 10; ++i) { BattleStats stats = static_cast<BattleStats &>( DoBattle(pHero, pEnemy) ); PresentStatistics( stats ); logger << stats; } } }
// Don't do this! // BAD: param is called by-value, not by-reference (const reference) // BAD: pData is sent as a pointer, requiring checks for NULL void PoorConventions( std::string param, DataStruct *pData ) { BattleStats stats; // BAD: Declared outside where it is used // BAD: Storing dynamically allocated object in raw pointer Model *pModel = new Model( "BFG9000" ); PTR_MODEL pShip = ModelFactory::createModel("Spaceship"); PTR_MODEL pAlien = ModelFactory::createModel("Alien"); // BAD: If this fails then Alien was created unnecessarily PTR_RENDERABLE pHero = SkinModel( pShip, "WingCommander" ); PTR_RENDERABLE pEnemy = SkinModel( pAlien, "Kilrathi" ); if (pHero && pEnemy) { // BAD: Trusting the compiler to optimise the post-incrementor for (unsigned i = 0; i < 10; i++) { // BAD: Using C-style cast where C++'s is more suitable stats = *(BattleStats *) &DoBattle( pHero, pEnemy ); PresentStatistics( stats ); logger << stats; } } }
Calling conventions
- Call-by-reference is preferred over call-by-value
- References are preferred over pointers
- Const is preferred over non-const
- The "const" keyword is placed after the data type ("Class const &rC = myC;")
Conform to the Standard
- Avoid language extensions. MSVC allows several, all of which makes the code not compliant with the Standard and makes WARTBED less portable.
- Treat any "recommendations" related to deprecation issued by VC with utmost scepticism.
- Avoid the #pragma once compiler directive and use header guards
- some #pragma directives can be used if desired, for instance for code folding, since they will not affect compilation on other platforms.
- Note that all upper-case identifiers (generally enums) can clash with compiler-internal symbols, which is partly why the mixed-capitalisation form of above is promoted. Likewise, #defines lead by two underscores are generally reserved by the compiler and could result in symbol conflicts if used in client code. WARTBED's header guards avoid this by being intentional verbose.
Variables
- Unless where warranted for batching or special purposes, variables are declared in close proximity to where they are used. This is to ease locating the declaration of symbols, emphasise code flow, align code for MRU stack access, avoid allocation and assignment overhead (f.i. due to pre-emptive code termination), and promote automatic scoping.Background: Back in the days, for compiler parsing reasons, some languages (f.i. early versions of C and Pascal) required you to declare all variables immediately following the function signature or the principal open token. This practice still survives among some programmers and is commonly rationalised as "making it easier to find symbols by collating them in one place".
- Make the scope of a variable as small as possible.
- Local variables, including loop control variables, should generally be declared in their code structure (i.e. "for (int i = 0; i < 5; ++i)").
- Explicitly initialise atomic variables, since they are not default-initialised by the compiler
- Set numerics to a sensible value ("int count = 0;")
- Initialise pointers to 0 (prefer 0 over NULL) ("int *pInt = 0;")
- To allow objects to be default-initialised, where suitable provide a default constructor setting the object's state to workable default values
Operators
- Always prefer pre-increment and pre-decrement operators to their postfix alternatives.
- C++ casts are preferred over C-style casts.
- Operators are padded with one space (see horizontal indentation and padding, below).
Pointers
Contrary to some vocal opinions, pointers are not evil or bad. Pointers can be enormously powerful and efficient: thus, with the significant exception for dynamic allocation of memory, do use pointers wherever warranted, if judiciously, with restraint and with discipline.
- Never use raw pointers by themselves to store dynamically allocated memory. All dynamically allocated objects should be immediately stored inside smart pointer containers to avoid automated resource management and deallocation.
Bad: "MyObject *pObject = new MyObject;"
Good: "shared_ptr<MyObject> pObject = shared_ptr<MyObject>( new MyObject );"- Corollary 1: Never return new's pointers as a result from a function, instead always return smart pointers.
- Corollary 2: This also means passing a new'd pointer in function call parameter lists, as well as any other place where the allocated resource isn't immediately stored in a automated container is prohibited.
STL
// Might be inefficient // for (Collection::iterator i = items.begin(); i != items.end(); ++i) i->doMyThing(); // More efficient alt. 1 // Collection::iterator end = items.end(); for (Collection::iterator i = items.begin(); i != end; ++i) i->doMyThing(); // More efficient alt. 2 // for (Collection::iterator i = items.begin(), end = items.end(); i != end; ++i) i->doMyThing();
- STL: Always prefer STL solutions when available, unless for very good reasons.
- Iterators: In time-sensitive loops, define the end criterion either outside the loop body or in the initialisation clause to avoid lookup overhead.
- Pre-Standard C++ Headers (header files with ".h" ending): Avoid. The library isn't generic. Especially streams are incompatible. The STL covers all of pre-Standard C++.
Other C++ Language Facilities
C++ is a feature-rich language. These features make it an expert-friendly but sometimes beginner-hostile language. The feature set of C++ makes it a multi-paradigm language that is well-suited to represent problems of many domains and allows the programmer to chose the solution (how to address the problem), unlike other languages where the form of the solution is always already made. This comes at the cost of requiring skill and discipline of the programmer.
- struct vs. class: In C++ both keywords denotes classes indentical in all ways except default access permissions. Using struct when defining classes characterised by openness creates cleaner code and provides additional indications of intended usage, and is encouraged.
- Operator overloading: Operator overloading is extremely well-suited to address many specific OO-requirements in games development. Mathematics and quasi-mathematical notations is well expressed using this form. However, operator overloading should only be used where it naturally expresses the intention or function of the code, and when it contributes to code pedagogics.
- Multiple Inheritance: Game programming is one of a few development domains that truly benefits from MI. Just as single inheritance, it should be used judiciously.
- Multiple return statements: Edsger Dijkstra popularised the "every function should have one entry point and one exit point" notion. This was very good advice in the days when computers were programmed with hole punches. In modern languages well-written code can be helped made more efficient, readable and maintainable by aborting early and returning where suitable.
- continue and break statements: Similar to the above, by convention especially continue, but sometimes also break, are considered bad in loops. This is nonsensical: both keywords are highly beneficial in well-written algorithms.
- goto: Its reputation for evil to the contrary, even the goto statement can be useful - there are times and places where its use is warranted and contributes to code pedagogics.
- UNICODE isn't naturally well represented in C++. Nonetheless, wide chars and strings are preferred for all IO actions and textual contents.
- Legacy C code and calling conventions: Avoid. We're not writing kernel drivers. C buffer format functions do not support wide strings. new and delete completely replaces malloc() and free().
Stylistic Conventions
Naming conventions
Fundamental naming principles
//Don't do this! // Don't decorate classes with type information class IMyInterface {}; class CMyClass { // Don't tell me that the variable is a constant zero-terminated old-style string // What is this to be used for? "Text" isn't very descriptive... char const *szcText; // This terse variable name will not help reading the source code int dbofs; //database offset };
- Identifiers' names MUST be descriptive of their usage or intention. Terseness for its own sake is dissuaded. Single-letter identifiers are prohibited except from loop control variables or obvious usages.
- Naming should express usage, not data types. Therefore Hungarian notation is dissuaded as are pseudo-Hungarian prefixes such as "C", "S", "T", "I" etc. for classes, structs or interfaces.
- Exception 1: Lower-case "p" and "r" prefixes to indicate pointers/smart pointers and references, respectively, are encouraged , since these indicate fundamental code affordances and helps avoid mistakes when writing
- Exception 2: The common "g_" (or "g") global and "m_" (or "m") member prefixes are allowed though not recommended.
Capitalisation
template<typename T> class Regiment { Renderable *pRenderable; void member() {} int anotherMethod() {} Regiment& getMe() { return *this; } }; typedef Regiment<int> REGIMENT; enum MOVE_ORDER { ORDER_approach, ORDER_retreat, };
- All class- and function names are Captialised. All class members or any global variables are initiated by lower-case.
- Generally, CamelCase is used to separate words in identifiers. There are exceptions where appropriate and for better adherence to STL naming conventions.
- UPPER_CASED_IDENTIFIERS use underscores to separate words.
- Namespaces are always lower-cased and moderately concise.
- Typedef and enumeration identifiers are generally upper-cased. Enumeration members need not be upper-cased but are always initiated by a common word in upper case (followed by an underscore to separate the enum ID marker from usage description).
Error avoidance and pedagogics
template<typename T> class Alien { ... }; typedef Alien<Evil> EVIL_ALIEN; typedef std::vector<EVIL_ALIEN> ALIENS; // Gotcha: So, what data type is _2, below really? int* _1, _2; // Don't do this!
- Pointer and reference tokens ("*" and "&") belong with the identifier, not their data type. Exceptions can be made for function return types to emphasise different code structures (and because it looks a bit silly and confusing to keep return data type specifiers with the function name).
Correct: | int *pInt = 0; |
Wrong: | int* pInt = 0; |
OK: | int& MyFunc(); |
- Provide typedefs for the most commonly used template signatures, especially for collections.
- typedefs of collections end in "S" (for nouns in strong plural) to indicate their nature.
- Templates accept both the typename and class keywords as synonymous to denote a generic type parameter. "typename" should be preferred at all times over "class" since it is more expressive of intent and less confusable with class's other type declaration usage.
Braces and indentation
namespace wb { class Widget { typedef std::vector<Widget *> WIDGETS; WIDGETS children; std::string name; Widget* getChild( std::string const &rName ) { if (!children.empty()) for (WIDGETS::iterator i = children.begin(), end = children.end(); i != end; ++end) if (*i && (**i).name == rName) return *i; return 0; } void performAction( Action what ) { if (ACT_first == what) { DoAnAction(); log << "Doing action one" << std:::endl; } else if (ACT_second == what) { DoAnotherAction(); log << "Doing action two" << std:::endl; } else return; log << "Action performed" << std:::endl; } }; }
- For braces and indentation WARTBED uses an adaptation of the Allman indentation style
- With "braces" (the "{" and "}" characters) are meant the limit markers of all S-structures, which in C++ are all code structures delimited by {}-brace pairs. S-structures includes:
- classes, structs and enums
- functions/methods and inner scopes
- flow control structures (if, do, while, switch)
- error handling mechanisms (try, catch)
- namespace declarations
- Explicit array initialisation lists are S-structures but are excepted from the style convention outlined here.
- All braces are set on their own lines, except where silly or better aesthetics is achieved otherwise. Avoid K&R, Pico or similar schemes where braces follow their clause on the same line, since those braces tend to get lost in clutter and makes visual scanning more time consuming. Braces on their own lines stronger emphasise code and scope structure.
- Braces are set underneath the first character of their owning keyword. Avoid GNU, Whitesmith or similar indentation conventions where braces are indented further compared to their keywords.
- Braces are not mandatory for trivial bodies of conditionals or loops. In fact, in some situations using braces may make code messier, more spread out and arduous to scan.
- Lines are broken to assist with reading the code. Lines do not have to be broken for short/trivial cases.
- When lines are broken and braces are not used, the next line is still indented.
Horizontal indentation and padding
namespace wb { class Weapon { std::string name; void setName( std::string const &rNewName ) { name = rNewName; } }; void Function( MOVE_ORDER order ) { AnotherFunc( std::string("Text") ); switch (order) { case ORDER_approach: break; case ORDER_retreat: break; }; } }
- All tabs ("indentations") are translated to space. Tab width is set to 4 spaces.
- Switch/case-structures use indented case clauses, generally with code bodies underneath the case and indented one additional tab.
- Function signatures and calls are padded with one space. ("Call( argument )")
- Nested calls (parenthesis inside parenthesis) are not padded. ("Call( outer(inner) )")
- All operators are padded with a single space ("unsigned a = 10;")
- Flow control parameters (if, for, while, catch etc) are indented one space from their keyword and use no internal padding. ("while (condition) Func();")
- C-style cast data types are padded one space from the castee, type information is also padded
No type info: | (unsigned) &my_int |
Type info: | (unsigned *) &my_int |
- Indenting and otherwise aligning member signatures where appropriate is encouraged.
Comments and vertical spacing
/********************************************************* * Source file.cpp * * Copyright notice and documentation * * * *********************************************************/ namespace wb { //-------------------------------------------------------- class Weapon //-------------------------------------------------------- { std::string name; void setName( std::string const &rNewName ) { name = rNewName; } }; //-------------------------------------------------------- // SpecialWeapon is a specialisation of Weapon. It adds // many special capabilities over Weapon. // class SpecialWeapon //-------------------------------------------------------- { }; //-------------------------------------------------------- void Function( std::string const &rStr ) //-------------------------------------------------------- { ObtainString( rStr ); for (unsigned i = 0; 1 < rStr.length(); ++i) { } Wants wanna = GawkAtTrinkets(); Haves gots = StealShinies( wanna ); } }
- Block comments are dissuaded in the main code since they interfere with temporary commenting out large blocks of code.
- Temporary commenting out of lines generally place the comment token at column 0 of the line.
- Class and function names are emphasised by a leading and following comment dash-line. Non-documentation (/** ... */ or /// comments) descriptions are placed underneath the first dash-line. While this might look a bit uncommon, it has proven to be highly useful for good code block separation and object identification.
- Separate code blocks (classes, functions) are separated by three linefeeds.
- Sections inside a function body that should be distinct are separated by two line feeds.
- Internal structure in code is maintained by liberal usage of blank lines -- it was a long time since we had to save vertical space on 25-line consoles.
- Members and functions should be grouped when applicable. Grouping criteria should be obvious. (Related to that return types, names and parameters of functions should be indented and aligned in the declaration section where suitable.)
Structure of source and header files
#ifndef __HEADER_FILE_INCLUDE_GUARD__ #define __HEADER_FILE_INCLUDE_GUARD__ namespace Ogre { class SceneNode; } namespace myapp { Ogre::SceneNode* DoSomethingWith( Ogre::SceneNode &rSceneNode ); } #endif
- Header files should never #include more other headers than is absolutely necessary for the signatures to be valid: headers should instead be included in the .cpp file. This may help avoid recompile cascades when changing a header file and leverage incremental build capacities.
- If any signature contains a pointer or a reference, a forward to that symbol should be declared instead of including the corresponding header file.
- All headers possible should always be included in the .cpp file.
- Note: This can be critical for compile efficiency on large projects or when including headers with many dependencies. Especially when including STL or Boost headers.