GCC Coding Conventions Rationale and Discussion

C and C++ Language Conventions

Language Use

Inlining Functions

Inlining functions has a potential cost in object size, working set size, compile time, and debuggability. These costs should not be borne without some evidence that the inlining pays for itself.

C++ Language Conventions

Language Use

Variable Definitions

Defining variables when they are first needed reduces the cognitive burden on the programmer. It also eases converting common sub-expressions into a defined and reused variable.

Defining and testing variables in control expressions has the same advantages as above. In addition, it restricts the scope of the variable to the region in which it is known to be sensible.


if (info *q = get_any_available_info ()) {
  // q is known to be non-NULL, and so do useful processing.
}

Struct Definitions

In the C++ standard, structs and classes differ only in the default access rules. In the past, there was a mild preference among some GCC developers for using these two keywords to indicate whether or not a class met the criteria for a Plain Old Data (POD) type. However, this convention was never consistently adhered to or fully socialized. A review of a patch to add support for POD struct convention (PR 61339) revealed that the convention lacked broad enough support within the GCC developer community. As a result, the convention was removed.

Class Definitions

Forgetting to write a special member function is a known programming problem. Some authors recommend always defining all special member functions. Such classes are less efficient. First, these definitions may prevent the compiler from passing the class in a register. Second, these definitions may prevent the compiler from using more efficient methods for copying. Adding a comment that the default is intended preserves the performance while ensuring that the function was not forgotten.
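
As a minimal sketch of the comment-based approach, consider the two hypothetical classes below (neither is a GCC type). The first documents that the defaults are intended. The second spells out a copy constructor by hand, and the type trait shows that it loses the trivially copyable property that efficient copying relies on. The sketch assumes a C++11 compiler for the type traits.

```cpp
#include <type_traits>

// Hypothetical value class; the name and fields are illustrative only.
struct source_span
{
  // Default constructor, copy constructor, copy assignment and destructor:
  // the compiler-generated defaults are intended.

  unsigned start;
  unsigned finish;
};

// The same data with a hand-written copy constructor that merely repeats
// what the compiler would have generated.
struct source_span_verbose
{
  source_span_verbose () : start (0), finish (0) {}
  source_span_verbose (const source_span_verbose &other)
    : start (other.start), finish (other.finish) {}

  unsigned start;
  unsigned finish;
};

// The commented version stays trivially copyable, so the compiler remains
// free to copy it bytewise and, ABI permitting, to pass it in registers.
static_assert (std::is_trivially_copyable<source_span>::value,
               "implicit members keep the class trivially copyable");
static_assert (!std::is_trivially_copyable<source_span_verbose>::value,
               "a user-provided copy constructor is never trivial");
```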

Classes generally are either value classes or identity classes. Copy constructors and assignment operators are fine for value classes. They are often not appropriate for identity classes, which should delete, or otherwise disable, these functions. Marking such functions as follows allows compiling as C++03 now and converting them to deleted functions under C++11 later.


TypeName(const TypeName&) /*DELETED*/;
void operator=(const TypeName&) /*DELETED*/;
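
For comparison, here is a sketch of what the markers could become once C++11 is available. The class name `pass_registry` is hypothetical and stands in for any identity class.

```cpp
#include <type_traits>

// Hypothetical identity class; the name is illustrative only.
class pass_registry
{
public:
  pass_registry () {}

  // C++11 form of the /*DELETED*/ markers: copying or assigning a
  // pass_registry is rejected at compile time.
  pass_registry (const pass_registry &) = delete;
  void operator= (const pass_registry &) = delete;
};

static_assert (!std::is_copy_constructible<pass_registry>::value,
               "copy construction is disabled");
static_assert (!std::is_copy_assignable<pass_registry>::value,
               "copy assignment is disabled");
```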

Multiple inheritance is confusing and rarely useful. When it is needed though, there may be no substitute. Seek guidance, review and feedback from the wider community.

Using virtual functions increases the size of instances of the class by at least one pointer. In heavily allocated types, such as trees, GIMPLE or RTL, this size increase could have adverse performance impacts. On the other hand, virtual functions can often reduce the size of instances by binding information into the virtual tables and the virtual functions. For example, various type tags need not be present. Other attributes can be inferred from type and more general information, or from extending the class hierarchy at the leaves. So, even though trees are heavily allocated, it remains to be seen whether virtual functions would increase the size. Virtual functions are implemented as indirect function calls, which can inhibit some optimization, particularly inlining. Therefore virtual functions should be added in heavily allocated classes only after size and performance studies. However, virtual functions are acceptable where we use hooks today, as they are already essentially virtual tables.
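
The size trade-off can be measured rather than guessed. The sketch below uses two hypothetical node types (not real GCC structures): one stores an explicit type tag, the other derives the tag from the dynamic type. The resulting sizes depend on the ABI, so they are computed, not assumed. On a typical 64-bit ABI the virtual version is likely to be larger, while on a 32-bit ABI the two may well be the same size.

```cpp
#include <cstddef>

// Hypothetical node with an explicit tag stored in every instance.
struct tagged_node
{
  unsigned char code;
  int value;
};

// Hypothetical node whose tag is derived from the dynamic type.
struct virtual_node
{
  virtual ~virtual_node () {}
  virtual int code () const { return 0; }
  int value;
};

// ABI-dependent sizes, available for inspection or for a size study.
const std::size_t tagged_size = sizeof (tagged_node);
const std::size_t virtual_size = sizeof (virtual_node);
```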

There are good reasons to make all data members of non-POD classes private. However, converting from C to C++ necessarily involves a transition period during which data members remain public.

Constructors and Destructors

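The compiler implicitly initializes all non-POD fields before the constructor body runs. Any initialization in the body of the constructor therefore implies extra work: the field is initialized once implicitly and then assigned again.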
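
The extra work can be made visible with a small counting type. Everything in this sketch is hypothetical (`counted`, `assigns_in_body`, `uses_initializer`, `extra_work`); it only illustrates the difference between assigning in the constructor body and using a member initializer.

```cpp
#include <cassert>

// Hypothetical member type that counts default constructions and assignments.
struct counted
{
  static int default_ctors;
  static int assignments;

  counted () : m_value (0) { ++default_ctors; }
  explicit counted (int value) : m_value (value) {}
  counted &operator= (const counted &other)
  {
    ++assignments;
    m_value = other.m_value;
    return *this;
  }

  int m_value;
};

int counted::default_ctors = 0;
int counted::assignments = 0;

// m_field is default-constructed first, then overwritten in the body.
struct assigns_in_body
{
  explicit assigns_in_body (int value) { m_field = counted (value); }
  counted m_field;
};

// m_field is initialized exactly once, in the member initializer list.
struct uses_initializer
{
  explicit uses_initializer (int value) : m_field (value) {}
  counted m_field;
};

// Returns the number of default constructions plus assignments performed
// while constructing one T.
template <typename T>
int extra_work ()
{
  counted::default_ctors = 0;
  counted::assignments = 0;
  T object (7);
  assert (object.m_field.m_value == 7);
  return counted::default_ctors + counted::assignments;
}
```

Here `extra_work<assigns_in_body> ()` reports one default construction and one assignment, while `extra_work<uses_initializer> ()` reports none.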

Polymorphic classes without a virtual destructor are almost certainly incorrect.
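
A short sketch of the two cases, using hypothetical hook interfaces. GCC's `-Wnon-virtual-dtor` option warns about classes shaped like the first one. The type traits assume C++11.

```cpp
#include <type_traits>

// Hypothetical polymorphic base with an implicit, non-virtual destructor.
// Deleting a derived object through a pointer to this base has undefined
// behavior.
struct hooks_without_virtual_dtor
{
  virtual int priority () const { return 0; }
};

// The same interface with a virtual destructor, safe to delete through.
struct hooks_with_virtual_dtor
{
  virtual ~hooks_with_virtual_dtor () {}
  virtual int priority () const { return 0; }
};

static_assert (std::is_polymorphic<hooks_without_virtual_dtor>::value
               && !std::has_virtual_destructor<hooks_without_virtual_dtor>::value,
               "polymorphic, but the destructor is not virtual");
static_assert (std::has_virtual_destructor<hooks_with_virtual_dtor>::value,
               "destructor is virtual");
```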

Conversions

C++ uses single-argument constructors for implicit type conversion. Wide use of implicit conversion can cause some very surprising results.
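
A sketch of how such a surprise arises and how `explicit` prevents it. The types and functions are hypothetical; the type traits assume C++11, although `explicit` constructors themselves are available in C++03.

```cpp
#include <type_traits>

// Hypothetical type with a converting constructor.
struct implicit_width
{
  implicit_width (int bits) : m_bits (bits) {}
  int m_bits;
};

// The same type with the conversion made explicit.
struct explicit_width
{
  explicit explicit_width (int bits) : m_bits (bits) {}
  int m_bits;
};

inline int bits_of (implicit_width w) { return w.m_bits; }
inline int bits_of_explicit (explicit_width w) { return w.m_bits; }

// bits_of (32) compiles and silently builds an implicit_width from the int.
// bits_of_explicit (32) is rejected; the caller must write
// bits_of_explicit (explicit_width (32)).
static_assert (std::is_convertible<int, implicit_width>::value,
               "int converts implicitly");
static_assert (!std::is_convertible<int, explicit_width>::value,
               "int does not convert implicitly");
```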

C++03 has no explicit conversion operators, and hence using them cannot avoid surprises. Wait for C++11.
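
For reference, a sketch of what a C++11 explicit conversion operator looks like, using a hypothetical wrapper type. The object can still be tested in a condition, but it no longer converts silently to `bool` or, through `bool`, to an arithmetic type.

```cpp
#include <type_traits>

// Hypothetical wrapper around a possibly-null pointer.
struct maybe_node
{
  explicit maybe_node (const void *node) : m_node (node) {}

  // C++11: usable in conditions and explicit casts only.
  explicit operator bool () const { return m_node != 0; }

  const void *m_node;
};

inline int is_present (const maybe_node &n)
{
  if (n)   // contextual conversion to bool is permitted
    return 1;
  return 0;
}

// "int count = n + 1;" is rejected: there is no implicit route to int.
static_assert (!std::is_convertible<maybe_node, bool>::value,
               "no implicit conversion to bool");
static_assert (!std::is_convertible<maybe_node, int>::value,
               "no implicit conversion to int");
```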

Overloading Functions

Function overloading can be confusing. However, in some cases introducing new function names adds little value, as in the current distinction between build_index_type and build_index_2_type.

The essential problem is to use overloading in a way that improves conciseness without introducing confusion. To that end, consider the following advice.

You may overload if the overloaded name expresses a single action notion, such as the C++ standard's notion of swap.

You may not overload when implicit conversions among argument types may yield unintended effects. For example,


void swizzle (int arg);
void swizzle (const char *arg);
... swizzle (NULL); ...

results in an unintended call to the int overload on some systems. In practice, the problem that this restriction addresses arises more from bad user-defined implicit conversion operators. See ISO C++ N2437 and ISO C++ N2514.
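
The overload choice can be checked at compile time. In this sketch the two `swizzle` overloads are adapted to return a marker value (an assumption made purely so the selection is observable), and C++11's `nullptr` is used in place of `NULL`.

```cpp
// Adapted from the declarations above: each overload returns a marker.
constexpr int swizzle (int) { return 1; }
constexpr int swizzle (const char *) { return 2; }

// A literal 0 is an int before it is a null pointer constant.
static_assert (swizzle (0) == 1, "the int overload is chosen");

// nullptr converts only to pointer types.
static_assert (swizzle (nullptr) == 2, "the pointer overload is chosen");
```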

You may overload if a single argument, in a single position, has multiple representations. For example,


void append (const char *str);
void append (std::string str);

You may not overload if more than one argument constitutes the representation of some data notion. For example, in


void register_address (int house, const char *street, int zipcode);

the arguments are a representation of addresses. Instead, the overload should be on addresses.


void register_address (const Address &addr);

This restriction cannot apply to constructors, where the whole point is to collect representational data.


Address::Address (int house, const char *street, int zipcode);

Notwithstanding the restrictions above, you may overload to detect errors. That is, if unsigned numbers are good, but signed numbers are bad, you could overload


void munch (unsigned int arg);
void munch (int arg);

and simply not define the signed int version. Anyone using it would get a link-time error. (The C++11 standard has a syntax that enables compile-time detection of the problem.)
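
The C++11 syntax in question is the deleted function. A sketch follows; the `can_munch` trait is a hypothetical helper, included only to show that the signed call is rejected during compilation and not at link time.

```cpp
#include <type_traits>
#include <utility>

inline void munch (unsigned int) {}
void munch (int) = delete;   // C++11: signed callers fail to compile

// Hypothetical helper that reports whether munch (T) is a valid call.
template <typename T, typename = void>
struct can_munch : std::false_type {};

template <typename T>
struct can_munch<T, decltype (munch (std::declval<T> ()))> : std::true_type {};

static_assert (can_munch<unsigned int>::value, "unsigned arguments are fine");
static_assert (!can_munch<int>::value, "signed arguments are rejected");
```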

Overloading Operators

Using [] to index a vector is unsurprising, but using [] to query a database over a network is bound to cause performance problems.

Default Arguments

Expensive default arguments can cause hard-to-identify performance problems.

Default arguments cause confusion when attempting to take the address of a function. They cause client code that takes the address of a function to break when a default argument is replaced by a specialized overload. So, default arguments should generally not be used in customer-facing interfaces. Consider function overloading instead.
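
A sketch of both effects, using hypothetical functions `scale` and `scale_v2`. The default argument is not part of the function's type, and once it is replaced by an overload, the bare expression `&scale_v2` no longer names a single function.

```cpp
#include <type_traits>

typedef int (*binary_fn) (int, int);
typedef int (*unary_fn) (int);

// Hypothetical interface using a default argument.
inline int scale (int value, int factor = 2) { return value * factor; }

// The default is invisible in the type: &scale takes two parameters, so
// "unary_fn f = &scale;" is rejected.
static_assert (std::is_same<decltype (&scale), binary_fn>::value,
               "the address has the two-parameter type");

// The same interface with the default replaced by an overload.
inline int scale_v2 (int value, int factor) { return value * factor; }
inline int scale_v2 (int value) { return scale_v2 (value, 2); }

// Each overload has its own address, but clients must now name the target
// type; "decltype (&scale_v2)" alone would be ambiguous.
const binary_fn two_args = &scale_v2;
const unary_fn one_arg = &scale_v2;
```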

Namespaces

Putting using directives or namespace-scope using declarations into header files can change client code in surprising ways.

Using them within an implementation file can help conciseness.

RTTI and dynamic_cast

Disabling RTTI will save space in the compiler.

Other Casts

C++-style casts are very explicit about the intended transformation. Making intent clear avoids bugs and helps other programmers read the code.
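
As an illustration, each cast in this sketch states which kind of transformation is meant. The helper functions are hypothetical, and `constexpr` (C++11) is used only so that one result can be checked at compile time.

```cpp
// static_cast: a deliberate, visible narrowing of a value.
constexpr unsigned char low_byte (unsigned int word)
{
  return static_cast<unsigned char> (word & 0xffu);
}

// const_cast: removes qualification and nothing else.  Writing through the
// result is only valid if the underlying object is not actually const.
inline char *without_const (const char *text)
{
  return const_cast<char *> (text);
}

// reinterpret_cast: views the same storage as raw bytes.
inline const unsigned char *raw_bytes (const int *value)
{
  return reinterpret_cast<const unsigned char *> (value);
}

static_assert (low_byte (0x1234u) == 0x34, "keeps only the low eight bits");
```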

Exceptions

The current compiler code is not exception safe.

Disabling exceptions will permit the compiler itself to be slightly more optimized.

Aborting the compiler is a reasonable response to unexpected problems.

We would like the compiler to be exception safe, to permit reconsideration of the exception convention. This change would require a significant change in style, adopting "resource acquisition is initialization" (RAII). We would be using shared_ptr (from TR1's <memory>) or unique_ptr (from C++11).
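
A minimal sketch of the RAII style with C++11's `unique_ptr`. All names here (`scratch_buffer`, `risky_step`, `run_pass`, `try_pass`) are hypothetical, and the sketch assumes exceptions are enabled, which is not how GCC itself is built today.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resource that tracks how many instances are alive.
struct scratch_buffer
{
  static int live_count;
  scratch_buffer () { ++live_count; }
  ~scratch_buffer () { --live_count; }
};

int scratch_buffer::live_count = 0;

// Hypothetical step that may fail by throwing.
inline void risky_step (bool fail)
{
  if (fail)
    throw std::runtime_error ("step failed");
}

// The unique_ptr owns the buffer, so the buffer is released on every exit
// path, including the exceptional one.
inline void run_pass (bool fail)
{
  std::unique_ptr<scratch_buffer> buffer (new scratch_buffer);
  assert (scratch_buffer::live_count > 0);
  risky_step (fail);
}

// Returns true if the pass completed, false if it threw.
inline bool try_pass (bool fail)
{
  try
    {
      run_pass (fail);
      return true;
    }
  catch (const std::runtime_error &)
    {
      return false;
    }
}
```

Whether or not `risky_step` throws, `scratch_buffer::live_count` is back to zero after `try_pass` returns, without any explicit cleanup code.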

The Standard Library

At present, C++ provides no great advantage for i18n. GCC type-checks printf arguments, so the type insecurity of printf is moot, while its clarity of layout remains.
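
The same checking can be requested for a printf-style wrapper through GCC's `format` function attribute. The wrapper `format_note` below is hypothetical. With the attribute in place, GCC can diagnose a mismatch between the format string and the arguments at the call site, for example with `-Wformat`.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical wrapper.  The format string is parameter 3 and the variadic
// arguments start at parameter 4.
int format_note (char *buffer, std::size_t size, const char *format, ...)
  __attribute__ ((format (printf, 3, 4)));

int format_note (char *buffer, std::size_t size, const char *format, ...)
{
  assert (buffer != 0 || size == 0);

  va_list args;
  va_start (args, format);
  int written = std::vsnprintf (buffer, size, format, args);
  va_end (args);
  return written;
}
```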

Formatting Conventions

Names

Prefixing data members with m_ highlights the extra overhead of access to fields over local variables. Think of the leading m_ as being similar to the * dereference operator.

When using the above convention, the constructor parameter names and getter member function names can use the more concise non-underscore form.
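
A sketch of the naming convention applied to a hypothetical class. The data members carry the m_ prefix, which leaves the concise names free for constructor parameters and getters. The `constexpr` keyword (C++11) is incidental and is there only so the values can be checked at compile time.

```cpp
// Hypothetical class, for illustration only.
class line_span
{
public:
  // Parameters take the concise names.
  constexpr line_span (unsigned first, unsigned last)
    : m_first (first), m_last (last) {}

  // Getters take the concise names as well.
  constexpr unsigned first () const { return m_first; }
  constexpr unsigned last () const { return m_last; }

  // Inside member functions, the m_ prefix marks each field access.
  constexpr unsigned length () const { return m_last - m_first + 1; }

private:
  unsigned m_first;
  unsigned m_last;
};
```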