The Theory of Binary Compatibility
In a software system made up of a number of independently evolving parts--many of which are evolving, yet will be built to work with last year's version of that system--it is of vital importance to be able to predict and control the effect of changes made to one component on untold numbers of unknown components that may be dependent on it.
When a change can be shown not to break any law-abiding software that worked with the component before the change, then that change is a backward compatible change. When a change is such that all software that functions correctly within the new system will also work with the old system, then that change is said to be forward compatible. Backward compatibility is generally the overriding requirement, with forward compatibility an added bonus. (Bug fixes, for instance, are by their very nature not forward compatible.)
In descending order of difficulty: Binary compatibility is to do with the effects of putting separately built yet interdependent parts together; ie, compatibility across link unit boundaries. When we consider the effects of linking independently compiled pieces of code together we're talking about link compatibility. And finally, a change is source compatible when the source code of dependent components works unchanged when compiled with the new component.
All such considerations are aspects of configuration management--the discipline concerned with assembling working parts into working systems.
When shipping, one is engaging in a particularly awkward form of configuration management--one which is concerned with binary compatibility and can't even physically put the resulting system together for testing: that is done by customers in the field.
We have to do our configuration management in the abstract. We need to be confident about our work's compatibility with whole classes of independently created software products. To gain this confidence we need to make reasonable assumptions about the behaviour of such software, and derive rules as to the kind of change it will withstand.
[Bear in mind that these rules are just guidelines. If a million-selling killer application for a Symbian OS-based product bends the rules in a way that rules out a certain desirable change then that's tough. You'll have to find another way. Conversely, if the cost of fixing a problem causing high levels of returns is to break one or two obscure pieces of software then we'll just have to do that.]
The principal working assumption we'll make about software produced to work with our components is that it adheres to reasonable software engineering practice; ie, rogue code fraudulently gaining access to all sorts of private bits we'll break with joyful abandon. (Unless it happens to be that best-selling piece of software, of course.) On the other hand, if ever some facility or piece of information has been--or given the impression of being--a legitimate part of a supported interface, we are honour-bound to maintain it forever.
Additional assumptions we'll be making are to do with the behaviour of the C++ implementation we're working with. We'll assume certain things that aren't in the C++ spec, but are nonetheless fairly reasonable. (This is effectively limiting implementations of Symbian OS to C++ compilers for which those assumptions are true.)
For instance, whereas C++ doesn't define the layout of class members across access specifiers, we will assume that access specifiers have no effect on layout, and that we can therefore relax a data member's access restrictions as long as we don't change declaration order. Similarly, we'll assume that the order of declaration of virtual member functions is the only thing that affects their order in the virtual table. We'll assume pointers and references share a representation, and, with respect to multiple inheritance, we'll make the assumption that a pointer to a certain class's representation remains unchanged when it is converted to a pointer to the first base class in declaration order. (The first base class comes first in the class's layout.)
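These assumptions can be probed on whatever compiler is at hand. The following is a minimal sketch, not part of any Symbian OS header: the names TBaseA, TBaseB, TDerived and TRefHolder are hypothetical, and the checks only report what one particular compiler does, since the C++ specification guarantees none of it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classes, used only to probe the compiler's layout choices.
struct TBaseA { std::int32_t iA; };
struct TBaseB { std::int32_t iB; };
struct TDerived : TBaseA, TBaseB { std::int32_t iC; };

// Hypothetical holder used to compare a reference's footprint with a pointer's.
struct TRefHolder { int& iRef; };

// True if a TDerived* keeps its value when converted to a pointer to the
// first base class in declaration order.
inline bool FirstBaseSharesAddress()
{
    TDerived d;
    const void* whole = &d;
    const void* first = static_cast<TBaseA*>(&d);
    return whole == first;
}

// True if conversion to the second base class adjusts the pointer, which is
// why only the first base class is covered by the assumption.
inline bool SecondBaseIsOffset()
{
    TDerived d;
    const void* whole = &d;
    const void* second = static_cast<TBaseB*>(&d);
    return whole != second;
}

// True if a reference member occupies the same space as a pointer.
constexpr bool ReferenceIsPointerSized()
{
    return sizeof(TRefHolder) == sizeof(int*);
}

// Intended to be called once in a debug build of a component that relies
// on the assumptions.
inline void CheckLayoutAssumptions()
{
    assert(FirstBaseSharesAddress());
    assert(SecondBaseIsOffset());
    assert(ReferenceIsPointerSized());
}
```

A compiler for which CheckLayoutAssumptions() fails is, for the purposes of these rules, outside the set of supported implementations.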
Furthermore, we'll assume that C++'s type-safe linkage of compilation units is not in force across link units; ie, we'll be able to make well-considered changes to the name and signature of link unit entrypoints. (This requires link by ordinal everywhere.)
An interface is a contract a provider of services enters into with a client. Either party can "own" the interface. I'll use the term client interface if the interface is best thought of as belonging to the service provider. If the interface is defined by the client of the service I'll call it a provider interface. Client interfaces are normally monomorphic whereas provider interfaces can be--and generally are--polymorphic.
In C++ terms, the public interface of a class is an example of a client interface, and so is the protected interface (which defines services the class provides to its subclasses). A class's virtual interface is a provider interface. The base class specifies services to be provided by derived classes.
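As an illustration of the three kinds of class interface, here is a small sketch. CShape and CCircle are hypothetical names invented for this example; they are not Symbian OS classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class showing the three interfaces side by side.
class CShape
{
public:
    virtual ~CShape() {}

    // Public client interface: a service the class offers to its users.
    std::string Describe() const
    {
        const std::string name = Name();
        assert(!name.empty());   // the provider must supply a name
        return "shape " + name;
    }

protected:
    // Protected client interface: a service the class offers to subclasses.
    static std::string Tag(const std::string& text) { return "<" + text + ">"; }

private:
    // Virtual provider interface: a service derived classes must supply.
    virtual std::string Name() const = 0;
};

// A provider: it implements the virtual interface and, as a client of the
// base class, uses the protected service.
class CCircle : public CShape
{
private:
    std::string Name() const override { return Tag("circle"); }
};

inline std::string DescribeCircle()
{
    CCircle circle;
    return circle.Describe();
}
```

Here the base class owns Describe() and Tag(), so it is the one promising to keep them stable, whereas Name() is a demand the base class places on every derived class.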
In terms of link units--DLLs--a client interface setup exists when a DLL is being used as a shared library. The DLL's interface is a table of exported functions, knowledge of the indices and meaning of which is shared between DLL and clients. A shared library comes complete with an import library, which encapsulates this shared knowledge. The preservation of this information across versions is a necessary condition for compatibility. To this end we maintain a definitions file (.def) which is the source equivalent of the import library. Freeze files (.frz) are just DEFs by another name. Most Symbian OS DLLs serve as shared libraries.
DLLs also work well in a provider interface setup, both in monomorphic and polymorphic flavours. Monomorphic provider DLLs are useful to choose a particular service provider at configuration time. Symbian OS examples are the console library (ECONS.dll) and the various libraries adapting the O/S to hardware variants. The client links to an import library generated from a well-defined definitions file, which is published for use by providers. Providers link their implementation DLLs using the interface DEF file.
Polymorphic provider DLLs (drivers, in a broad sense) allow the client to choose a service provider at runtime. Providers once again link using a DEF file published by the interface owner. However, the latter doesn't use an import library this time, but dynamically loads the chosen provider DLL instead. It then uses its knowledge of the exported entrypoints to invoke the provider's services. In this case, the DEF file is to be treated as true source code since the DLL's functions are looked up programmatically.
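The runtime lookup can be pictured with a portable simulation. This sketch does not use any real loader API: TExportTableSketch stands in for a loaded DLL's export table, the two provider namespaces stand in for two provider DLLs built against the same published DEF file, and the 1-based ordinals and their meanings are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical entrypoint type shared by all providers.
typedef int (*TEntryPointSketch)();

// Stand-in for a DLL's export table; ordinals are assumed to be 1-based
// and to follow the order of the published DEF file.
class TExportTableSketch
{
public:
    void Export(TEntryPointSketch function) { iFunctions.push_back(function); }

    // Returns the entrypoint at the given ordinal, or nullptr if there is none.
    TEntryPointSketch Lookup(std::size_t ordinal) const
    {
        if (ordinal == 0 || ordinal > iFunctions.size())
            return nullptr;
        return iFunctions[ordinal - 1];
    }

private:
    std::vector<TEntryPointSketch> iFunctions;
};

// The knowledge the interface owner shares with providers (hypothetical).
const std::size_t KCreateOrdinal = 1;
const std::size_t KVersionOrdinal = 2;

// Two hypothetical providers implementing the same interface.
namespace provider_a { inline int Create() { return 100; } inline int Version() { return 1; } }
namespace provider_b { inline int Create() { return 200; } inline int Version() { return 2; } }

// Simulates choosing and loading a provider at runtime.
inline TExportTableSketch LoadProvider(char which)
{
    TExportTableSketch table;
    if (which == 'a')
    {
        table.Export(&provider_a::Create);
        table.Export(&provider_a::Version);
    }
    else
    {
        table.Export(&provider_b::Create);
        table.Export(&provider_b::Version);
    }
    return table;
}

// The client never sees a name, only an ordinal.
inline int CallByOrdinal(const TExportTableSketch& library, std::size_t ordinal)
{
    TEntryPointSketch function = library.Lookup(ordinal);
    assert(function != nullptr);
    return function();
}
```

Because the client hard-codes KCreateOrdinal and KVersionOrdinal, reordering the DEF file would silently redirect its calls, which is why the DEF file has to be treated as source code here.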
Binary Compatibility in Practice
There are a number of things all component owners need to do in order to gain control over their binary interface. Once entry-level binary compatibility is assured we can talk about the sorts of changes you will be able to make and how to tweak your interfaces in order to maximize the options.
2.0 Exports - DEF Files
To maintain BC from one release to the next (that is, whilst making only implementation changes), every DLL interface involved needs to have its definitions file preserved for all builds, ie, Debug, Release, Unicode Debug and Unicode Release. At ACME this means archiving the definitions files using the version control system. The definitions file lists the export definitions, ie, a description of those functions that are exported from your DLL.
For both MARM and WINS builds you can specify the frozen (master) DEF files to build with. The build process will then generate new DEFs, identical to the specified frozen ones unless you've added entry points (see later - "Adding Services"), and these new DEFs will be used to link the DLL.
You can tell MAKMAKE to build using the frozen export files by adding the following to your .MMP project files:
DEFFILE component.DEF
MAKMAKE will look for these files in the BMARM and BWINS directories by default.
This will ensure that the exports from a subsequent command-line build of the component are compatible with the current version.
(The build process automatically mangles the DEF file name so that the correct one is used, eg, componentD.def for a Narrow Debug build, componentUD.def for a Unicode Debug build.)
(Note that some components do not yet conform to this standard. For MARM builds, some components rename the DEFs to FRZs, that is, freeze files, but these components are, over time, being converted to the new build system as described here.)
2.1 MARM Exports file
So where do the .DEF files come from in the first place then ?
For MARM, the build process will always generate a new DEF exports file, which it leaves lying around in the intermediate build directory. This is a link-by-ordinal DEF file, so all you have to do is create a BMARM project directory and copy the DEF files there, adding the correct suffix for the variant.
eg, copy...
\epoc32\build\comp\marmd\rel\dll.def --> \project\bmarm\dll.def
\epoc32\build\comp\marmd\deb\dll.def --> \project\bmarm\dllD.def
\epoc32\build\comp\marmd\urel\dll.def --> \project\bmarm\dllU.def
\epoc32\build\comp\marmd\udeb\dll.def --> \project\bmarm\dllUD.def
2.2 WINS Exports file
With WINS, things are a little more complicated. The basic idea is the same and involves archiving the DEF exports file. First you use a tool called DEFMAKE to generate the initial DEF files (into a BWINS project directory), as follows...
defmake \epoc32\release\wins\rel\dll.dll \project\bwins\dll.def
defmake \epoc32\release\wins\deb\dll.dll \project\bwins\dllD.def
defmake \epoc32\release\wins\urel\dll.dll \project\bwins\dllU.def
defmake \epoc32\release\wins\udeb\dll.dll \project\bwins\dllUD.def
...DEFMAKE reads the Win32 PE file (the WINS binary you have built), extracts names and ordinals, and produces a link-by-ordinal DEF file, which from now on you will use to link the DLL. Once you have rebuilt your DLL using this new link-by-ordinal DEF file, the resulting PE file will no longer contain names for DEFMAKE to extract. Luckily you will be archiving the exports files (won't you ?) and you won't normally need to regenerate them, except when you're adding services. (see later - "Adding Services")
3.0 Adding Services
In general, it is possible to add exported services to your published interface. There are restrictions on the type of services that may be added. See below ("Allowed Changes") for a description of these.
For both MARM and WINS builds, this is pretty straightforward. All new services get added at the end of the automatically generated DEF file, following a build of the component. Simply replacing the original DEF file with this new one will give these new entry points permanent status. (The reason a DEF file is specified in the .MMP file is that this is used as a template when the build process generates the new one. All matching exports maintain the same ordinal, and all new exports are appended to the new DEF file).
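The template behaviour described above can be modelled in a few lines. This is a sketch of the rule only, not of the actual build tools: MergeExports and OrdinalOf are hypothetical helpers, export names are reduced to plain strings, ordinals are assumed to be 1-based, and the case of an export that has disappeared from the build is simply left frozen because the text does not say how the tools treat it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Models the generation of a new DEF file from a frozen one.
// Frozen exports keep their position (and therefore their ordinal);
// exports found only in the new build are appended in the order given.
inline std::vector<std::string> MergeExports(const std::vector<std::string>& frozen,
                                             const std::vector<std::string>& built)
{
    std::vector<std::string> result(frozen);
    for (const std::string& name : built)
    {
        if (std::find(result.begin(), result.end(), name) == result.end())
            result.push_back(name);
    }
    // Postcondition: the frozen entries form an unchanged prefix.
    assert(std::equal(frozen.begin(), frozen.end(), result.begin()));
    return result;
}

// Returns the 1-based ordinal of an export, or 0 if it is not listed.
inline std::size_t OrdinalOf(const std::vector<std::string>& exports,
                             const std::string& name)
{
    const auto it = std::find(exports.begin(), exports.end(), name);
    if (it == exports.end())
        return 0;
    return static_cast<std::size_t>(it - exports.begin()) + 1;
}
```

For instance, merging a frozen list of Open and Close with a build that exports Close, Flush and Open leaves Open at ordinal 1 and Close at ordinal 2, and gives Flush ordinal 3, whatever order the build happened to emit them in.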
4.0 Allowed Changes
Now that we have the mechanism in place, let's look at some of the changes you can or can't make to an interface while preserving backward binary compatibility. Naturally, these remarks only apply to constructs which are part of an external interface. You are free to arrange your internal interfaces as you see fit.
- Add services to a shared library.
Adding classes, global functions, static member functions and non-virtual member functions is fairly straightforward. Remember to avoid name collisions (as always) and to freeze the new entry points as soon as the new version of the library is released.
- You can't generally add or delete virtual member functions, or even change their order of declaration.
You can't even override an existing virtual function that was previously inherited. (Existing derived classes would be left inheriting the wrong function).
- All changes to private non-virtual (static and non-static) member functions are OK as long as they are not accessed through public or protected inline functions; either directly or indirectly.
Some friends or member functions affected by the change may be in a different link-unit, in which case you must make sure that the relevant binaries are kept in synch at all times. If this is not practical then the change must be disallowed.
- You can make changes to private data members that are not accessed through any public or protected inline functions - directly or indirectly - provided that the size of the class remains unchanged and that the offset of all public or protected data members or private members accessible through public or protected inline functions, directly or indirectly, stay the same. If friends or members of the class exist in a different link unit then all relevant binaries must be kept in synch.
- You can relax access specifiers; ie, a protected member can become public, a private member public or protected. The reverse is not allowed because it would make it impossible to draw any conclusions from a member's current access specification. An exception to this rule can be made if a forwarding inline function is left in its place.
- Similarly, you can bestow friendship upon additional classes or functions, but, once given it can't be withdrawn. Friendship is forever.
- You can change the size of a class provided it has only private, non-inline constructors and either has a virtual destructor or, if its destructor is non-virtual, does not declare or inherit an operator delete() of the two-argument form. In this case only friends and members can allocate memory for and construct instances of the class. The operator delete() requirement in the presence of a non-virtual destructor exists because in that case the compiler will supply the second argument - the size of the object - to the delete operator based on the version of the class declaration it has seen. Further restrictions are as for changes to private data members. Note that a class without constructors gets a compiler-generated, public default constructor and a class without a copy constructor gets a public default copy constructor. (Constructor generation, however, can be inhibited higher up in the class hierarchy, as is the case for copy constructors in classes derived from CBase.)
- You can widen the range of valid inputs to a function, or narrow the range of possible outputs. You can't change the interpretation of an existing valid input or change the meaning of an existing output value. An enumeration can be added to but not reordered, say.
- You can change the name and/or signature of a function if the change preserves or changes the input or output ranges along acceptable lines. A non-const parameter can be made const, a reference to a class can be changed to a reference to first base class, etc.
- As a single, unlikely exception, you can add a virtual function to a class or implement a previously inherited virtual function that wasn't public if classes derived from the previous incarnation of the class wouldn't have been able to be instantiated; ie, if the class had only private constructors or had a private destructor. (Yeah, right.)
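The rule on private data members in the list above lends itself to a mechanical check. The following is a sketch under stated assumptions: v1::TRecord and v2::TRecord are hypothetical "before" and "after" versions of one class, placed in namespaces only so that both can be compiled together, and the constructor and Total() are deliberately non-inline so that no client code ever touches the private members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace v1 {
class TRecord                       // hypothetical class as first released
{
public:
    TRecord();                      // non-inline: clients never see the members
    std::int32_t Total() const;     // non-inline accessor
    std::int32_t iId;               // public: its offset is part of the interface
private:
    std::int16_t iLow;              // private, not reachable through inlines
    std::int16_t iHigh;
};
TRecord::TRecord() : iId(0), iLow(1), iHigh(2) {}
std::int32_t TRecord::Total() const { return iLow + iHigh; }
}

namespace v2 {
class TRecord                       // the same class after the change
{
public:
    TRecord();
    std::int32_t Total() const;
    std::int32_t iId;
private:
    std::int32_t iPacked;           // replaces iLow and iHigh, same total size
};
TRecord::TRecord() : iId(0), iPacked(3) {}
std::int32_t TRecord::Total() const { return iPacked; }
}

// Measures where iId sits inside an object of type T on this compiler.
template <class T>
std::size_t OffsetOfId()
{
    T object;
    return static_cast<std::size_t>(reinterpret_cast<const char*>(&object.iId) -
                                    reinterpret_cast<const char*>(&object));
}

// True if the change kept both the class size and the public member's offset.
inline bool RecordChangeIsLayoutCompatible()
{
    return sizeof(v1::TRecord) == sizeof(v2::TRecord) &&
           OffsetOfId<v1::TRecord>() == OffsetOfId<v2::TRecord>();
}

// Intended for a debug build or a unit test run before freezing a release.
inline void CheckRecordChange()
{
    assert(RecordChangeIsLayoutCompatible());
}
```

Had the constructor been inline, client binaries would contain code writing to iLow and iHigh, and the same change would no longer be safe even though the size and offsets match.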
I guess there's a general flavour to these rules, which can be used to derive some guidelines for defensive interface definition. It goes something like: A change is OK if either
- You can pin down and fix every single line of code affected by it, and make sure that the fixed code goes everywhere the change goes. This only works if no aspect of the change escapes from the private domain.
- The change is demonstrably compatible with all possible clients, not just current ones or ones you are aware of.
Unless you are confident that an interface will never need changing (this is a valid attitude, especially if you turn out to be right), you should be defensive about what leaks out of your interfaces. No information should escape for no particular reason. Quite often information seems to make it out into the public domain by "accident". Panic numbers, message numbers, purely private definitions such as hard-coded directory names, and indeed entire private headers are but a few examples. Scores of private libraries have their import libraries released. Avoid doing this if at all possible. What you do not publish you do not have to freeze.
Perhaps the most common violation of this principle is overuse of the protected keyword. Protected is often just slapped on by default, on the grounds that it allows more flexibility. The reverse is true. Unnecessary protected interfaces have to be supported in perpetuity just as legitimate ones have to be. Protected belongs only in classes designed as base classes in a framework.
Perhaps another thing that is apparent from the above is that the options with virtual functions are severely limited. In frameworks with high fluidity it may therefore be appropriate to add in one or two "spare" virtual functions. A restricted class of changes may be supported by pressing such a spare into service, for instance when a framework suddenly, courtesy of a new category of concrete classes, acquires new attributes along a new "dimension". As a contrived example, should controls all of a sudden require a degree of transparency, so that they can be layered with lower layers filtering through, then a spare virtual function could be given a default behaviour that suits existing (ie, opaque) controls and new, transparent controls can be introduced. This is certainly no panacea but may be useful in some cases.
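The transparency example might look like the sketch below. All names are hypothetical, and the sketch assumes that the first release declared the spare in the same position with the same signature (here imagined as "virtual int Reserved_1() const" returning 0) and that no existing derived class overrode it.

```cpp
#include <cassert>

// Hypothetical framework base class, second release. The third virtual
// function occupies the slot that the first release declared as a spare,
// so the virtual table has the same shape as before.
class CControl
{
public:
    virtual ~CControl() {}
    virtual void Draw() const {}

    // Formerly the spare. The default suits every existing, opaque control.
    virtual int TransparencyPercent() const { return 0; }
};

// A control written against the first release: it knows nothing about
// transparency and inherits the default.
class COldButton : public CControl
{
public:
    void Draw() const override {}
};

// A control written against the second release.
class CGlassPanel : public CControl
{
public:
    int TransparencyPercent() const override { return 40; }
};

// Framework code can now ask any control, old or new.
inline int TransparencyOf(const CControl& control)
{
    const int percent = control.TransparencyPercent();
    assert(percent >= 0 && percent <= 100);
    return percent;
}
```

The rename from the reserved name to TransparencyPercent is itself only acceptable under the earlier assumptions: virtual table order follows declaration order, and nothing outside the class refers to the slot by name.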