Software Development
August 2002

Universe of Composition

Components can mesh via objects (plugging into a hub), connections (wiring electronic boards together) or contexts (a cell phone calling a base station). Which to use when? Part 1 of 2.

By Clemens Szyperski

Components are units of composition. This claim seems redundant—isn't that what they do by the very name they're given? Indeed. However, exploiting the full potential of components requires a thorough understanding of composition. This column investigates the surprisingly wide spectrum of composition techniques explored thus far. In all likelihood, more will be discovered as our field evolves: We're nowhere near a universal understanding of the techniques of composition.

Perhaps surprisingly, although they can be combined, there is no single technology that supports all the composition techniques discussed here. (Actually, the CORBA Component Model gets close to combining them all, but it remains to be seen whether it will leave its mark in actual applications.) Perhaps this is due to a lack of understanding of this field's richness. I hope that this short article will inspire many to go forth and explore composition.

Objects, Containers and Connections
Presently, three composition approaches dominate the scene. The first, which doesn't actually work, is object-oriented composition. The second is connection-oriented composition, and the third is container-based composition. The first two are symmetric, while the third is asymmetric in the sense that it relates components and containers, which typically aren't components.

Before we begin with object-oriented composition, let's take a step backward. How do we compose functions (real ones, not the unrelated items of the same name found in C and elsewhere)? Simply by nesting. The result of one function is used as an argument for another: h(f(x), g(y)). The same method works for functions (or procedures) in programming languages, provided these functions don't use nonlocal variables. Well, almost. Procedural functions don't compose if they have side effects. Here, global variables are the most obvious, but not the only, sin. Passing around mutable data structures by reference leads to the same problem when there is aliasing: the duplicate references that occur whenever two different computations interfere by referring to the same mutable data structure. Global variables are just the most brutal form of aliasing: All code can reference a global variable, and aliasing ensues if at least two functions do.

Why do aliased references interfere with composition? Consider the previous example, h(f(x), g(y)). If we have another function, say d, that, given the same arguments, computes the same results as function f, we'd expect that h(d(x), g(y)) would yield the same result as our original expression did. However, if f and g or d and g interfere, the new expression will produce a different result. Since this would be unexpected, we'd have to resort to a detailed analysis of the implementations of these functions, not just their specifications, in order to determine whether the particular expression does what is required. Composition fails: These functions do not compose or are not composable. Of course, things might work serendipitously, due not to rigorous design but merely to luck. (In the real world, luck is usually manifested as "correctness by exhaustive testing," something that can't actually be done, since a truly exhaustive test of anything but trivial code would not finish in any humanly acceptable time. In other words, if something passes all tests and then works in practice, it's still due to luck.)
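
To make the substitution argument concrete, here is a minimal Python sketch. The function bodies are invented for illustration (the column names only f, d, g and h), and the hidden in-place sort stands in for any undocumented side effect on an aliased argument.

```python
# Illustrative only: f and d share the same specification ("return the sum"),
# but f also mutates its argument, which the specification does not mention.

def f(xs):
    xs.sort()          # hidden side effect on the caller's list
    return sum(xs)

def d(xs):
    return sum(xs)     # same result as f, no side effect

def g(ys):
    return ys[0]       # reads the list that f may have reordered

def h(a, b):
    return (a, b)

shared = [3, 1, 2]
with_f = h(f(shared), g(shared))   # x and y alias the same list

shared = [3, 1, 2]
with_d = h(d(shared), g(shared))   # swap in d, which "does the same thing"

print(with_f)  # (6, 1)
print(with_d)  # (6, 3)
```

Going by the specifications alone, the two expressions should agree. They differ only because f and g interfere through the aliased list, which is exactly the kind of fact we could discover only by reading the implementations.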

Side Effects
Alternatively, we can say that these side effects were meant to happen; that is, we make them part of the specification. There's nothing wrong with this approach. For instance, we could specify that both d and f use their argument, dereference it and change the referent in a specific way. In that case, our two expressions would again yield the same result, and they would do so as a result of compositional rigor and specification, not as a result of luck. Consider the practical example of a function that writes to a file: write(file, pos, char), where the file is passed as a reference. The function may not even return a value, yet it performs its intended purpose by means of a side effect: It writes the character to the file at the specified position. A useful specification of write must state that this side effect is intended.
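
As a rough sketch of what such a specification might look like, the Python below models the file as a bytearray (an assumption made only to keep the example self-contained) and states the side effect in the docstring. The second function, write_via_slice, is a hypothetical alternative implementation added to show that two implementations honoring the same specified side effect can be exchanged safely.

```python
def write(file, pos, char):
    """Write one character into `file` at index `pos`.

    Specified side effect: afterwards file[pos] == ord(char), and every
    other byte of `file` is unchanged. Returns None.
    """
    file[pos] = ord(char)

def write_via_slice(file, pos, char):
    """Different implementation, same specification as `write`."""
    file[pos:pos + 1] = char.encode("ascii")

buf_a = bytearray(b"cat")
buf_b = bytearray(b"cat")
result = write(buf_a, 0, "b")       # no return value; the effect is the point
write_via_slice(buf_b, 0, "b")

print(buf_a == buf_b)  # True
```

Because the mutation is part of the contract rather than an accident of the implementation, a caller can reason about either function without looking inside it.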

Objects Do Not Compose
What has all this to do with objects or their compositionality? Consider a typical class and how it is specified: All attention is paid to the abstract interface of the class, perhaps by providing pre- and postconditions. The focus is on what a method does when it's invoked. As long as methods are pure functions, this is fine. However, that's not what objects are about. Methods tend to operate over the state of their object, which upon invocation is passed by reference to every one of its methods. If the specification is complete and if that object does not rely on any other objects to perform its work, things are under control. (Well, there are some wrinkles even in that case, related to the wonders of inheritance, but we may visit these in another column.)

Real objects tend not to be that simple. Their implementing classes draw on other classes, either by instantiating them or by taking references to their instances. However, if we focus our specifications only on the side of incoming method calls, we can't actually talk about what objects do to other objects, since that's done with outgoing calls (calls that originate in the object of interest and are directed at other objects). If we captured all such outgoing dependencies in our specifications, we'd be in a position similar to the one we reached with functions that have side effects. The picture may not be pretty, but composition would be possible. Now, have a look and ask yourself how many specifications for typical class libraries you know that actually provide that level of specification. Close to none is probably the norm. Even a class specified using the full design-by-contract approach (with pre- and postconditions as well as a class invariant) fails to capture outgoing calls. The only outgoing calls that are frequently documented in some form are events that listeners can register for.
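
The gap can be illustrated with a small Python sketch. Account and AuditLog are hypothetical classes invented for this example; the assertions play the role of a design-by-contract specification, and the call to the log is the outgoing call that the contract never mentions.

```python
class AuditLog:
    """Some other object that Account happens to call."""
    def __init__(self):
        self.entries = []

    def record(self, message):
        self.entries.append(message)

class Account:
    """Contract, incoming side only:
    deposit(amount): pre amount > 0; post balance == old balance + amount
    invariant: balance >= 0
    """
    def __init__(self, log):
        self.balance = 0
        self._log = log

    def deposit(self, amount):
        assert amount > 0                        # precondition
        old = self.balance
        self.balance += amount
        self._log.record(f"deposit {amount}")    # outgoing call, absent from the contract
        assert self.balance == old + amount      # postcondition
        assert self.balance >= 0                 # class invariant

log = AuditLog()
account = Account(log)
account.deposit(10)
print(account.balance, log.entries)  # 10 ['deposit 10']
```

Every assertion holds, yet a client reading only the contract has no way of knowing that deposit also changes the state of the log.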

Object Models Do Compose ... Sort Of
Surprisingly, the same observation doesn't hold at the level of object models as described using, say, UML. All objects in such a model have well-defined boundaries that cover both incoming and outgoing dependencies: Relationships between objects connect outgoing to incoming dependencies. Thus, if you have an accurate model of a given class library, the information we seek is available or at least derivable. Unfortunately, even if such models are available, they tend to be used at a rather abstract level. To gain the necessary level of precision, the models must be decorated with a precise specification language—such as the object constraint language (OCL), in the case of UML.

Even if all this information is available and we therefore find confidence when composing objects, another problem looms. When using an object that is an instance of a huge class library, we're likely to be creating a dependency on a large number of classes. After all, every object we use is likely to use other objects, which again use further objects and so on. Ultimately, a dependency on a good part of the entire system can result.

Such dramatic transitive dependencies make it difficult to argue convincingly that composition will work. They also make it difficult to gain confidence that all reasonable composition products can actually be realized. More likely than not, there will be large classes of cases in which a particular class or object would be just fine to do the job, but not if that meant pulling in all its transitive dependencies that don't do what is required.

From Objects to Components
Technically, every class that's specified well enough to allow some compositional use and that's packaged such that independent deployment is possible qualifies as (part of) a software component. "Specified well enough" means that all dependencies of the component have been made explicit.

Explicit dependencies are but the first step. One way to gain compositional flexibility is to move from explicit to parametric dependencies. For example, instead of making a class depend on another class, make it depend on an interface and equip it with a mechanism to connect to any class that implements that interface. Now this class can be used in many different compositions. Class libraries that support the reading and writing of streams are typically factored this way: A variety of reader and writer objects specialize in data formatting and can be connected to a variety of stream objects that specialize in the actual stream implementation. Another example is the event/listener pattern found in many component class models such as JavaBeans or .NET Windows Forms, or even at the language level in C#.
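
The reader/writer factoring might be sketched in Python as follows. All names here (Stream, MemoryStream, CountingStream, LineWriter) are hypothetical and do not come from any particular library; typing.Protocol is used to play the role of the interface.

```python
from typing import Protocol

class Stream(Protocol):
    """The interface a writer depends on, instead of a concrete class."""
    def write_bytes(self, data: bytes) -> None: ...

class MemoryStream:
    """One stream implementation: keeps everything in memory."""
    def __init__(self) -> None:
        self.data = bytearray()

    def write_bytes(self, data: bytes) -> None:
        self.data.extend(data)

class CountingStream:
    """Another implementation: only counts the bytes it receives."""
    def __init__(self) -> None:
        self.count = 0

    def write_bytes(self, data: bytes) -> None:
        self.count += len(data)

class LineWriter:
    """Specializes in formatting; connects to any Stream it is given."""
    def __init__(self, stream: Stream) -> None:
        self._stream = stream

    def write_line(self, text: str) -> None:
        self._stream.write_bytes((text + "\n").encode("utf-8"))

memory = MemoryStream()
LineWriter(memory).write_line("hello")

counter = CountingStream()
LineWriter(counter).write_line("hello")

print(bytes(memory.data), counter.count)  # b'hello\n' 6
```

LineWriter never names a concrete stream class, so the same formatting code takes part in both compositions unchanged.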

Classes that are designed to depend only on interfaces and therefore have fully parametric dependencies are a rare exception. However, it is these classes that form the most useful foundation for components, since the same class can now be used in separate compositions without any fear of cross-composition side effects. Security is another interesting aspect: If a class can use only what it has been connected to, overall behavior is much easier to contain. If taken to the limit, where all class interaction is based on connections and where composites can abstract compositions hierarchically, we arrive at connection-oriented composition.
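
A minimal sketch of the idea in Python, using an event source and an event sink. Button and Counter are hypothetical names, and a plain callable stands in for a listener interface; the point is only that instances interact through nothing but the connections made for them.

```python
class Button:
    """Event source: its only outgoing dependency is whatever gets connected."""
    def __init__(self):
        self._listeners = []

    def connect(self, listener):
        self._listeners.append(listener)

    def click(self):
        for listener in self._listeners:
            listener("clicked")

class Counter:
    """Event sink: sees only the events delivered over a connection."""
    def __init__(self):
        self.count = 0

    def on_event(self, event):
        self.count += 1

# Two separate compositions built from the same two classes.
button_a, counter_a = Button(), Counter()
button_b, counter_b = Button(), Counter()
button_a.connect(counter_a.on_event)
button_b.connect(counter_b.on_event)

button_a.click()
button_a.click()
button_b.click()

print(counter_a.count, counter_b.count)  # 2 1
```

Neither class reaches for shared state, so activity in one composition cannot leak into the other.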

Contextual Composition
The final form of composition in our little list is contextual composition. The typical examples here are EJB containers and .NET enterprise services contexts. Unlike connection-oriented composition, contextual composition is implicit and asymmetric. A component instance inside a container instance benefits from container-supplied services and from a container-maintained abstraction (simplification) of the world outside the container. For example, a container can add transactional processing and limit what a contained instance can and cannot see.
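
The flavor of contextual composition can be hinted at with a toy Python sketch. This is not how EJB or .NET contexts are implemented; TransactionalContainer and Ledger are hypothetical, and the "transaction" is a shallow snapshot of the instance's attributes, restored if a call fails.

```python
class Ledger:
    """Component: plain business logic with no transaction code of its own."""
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount
        if self.balance < 0:
            raise ValueError("insufficient funds")

class TransactionalContainer:
    """Toy container: callers reach the instance only through the container."""
    def __init__(self, instance):
        self._instance = instance

    def call(self, method_name, *args):
        snapshot = dict(vars(self._instance))   # shallow copy of the state
        try:
            return getattr(self._instance, method_name)(*args)
        except Exception:
            vars(self._instance).clear()        # roll back to the snapshot
            vars(self._instance).update(snapshot)
            raise

    def read(self, attribute):
        return getattr(self._instance, attribute)

container = TransactionalContainer(Ledger())
container.call("deposit", 50)
try:
    container.call("withdraw", 80)              # fails partway through
except ValueError:
    pass

print(container.read("balance"))  # 50
```

Ledger contains no rollback logic at all. It gains that behavior implicitly, simply by being placed inside the container, which is the asymmetry described above.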

How is a container different from a platform? It depends. Some platforms actually have container properties. For example, operating systems that strongly isolate processes form process containers. As a result, two applications running in separate processes of such an operating system can communicate with each other only through the mechanisms provided and controlled by the operating system. Other platforms don't form containers: They don't intercept all incoming and all outgoing communications and thus can be bypassed.

How is a container different from a set of connections? A container completely encloses its contained instances, while the same can be established only by convention for peer-to-peer connections. Also, a container is implicitly connected to all instances inside—all enclosed instances uniformly benefit from their single shared container instance's services and all are uniformly constrained by that container's policies.

Composition to No End
In this first segment, we've explored the three most commonly used forms of composition in today's component technologies: object-oriented, connection-oriented and contextual composition. Object-oriented composition prevails in most class-based frameworks. Connection-oriented composition is commonly used for event source/event listener connections, as exemplified by JavaBeans or .NET Windows Forms. Contextual composition goes back to the Microsoft Transaction Server (MTS) and is now prominently used in Enterprise JavaBeans in the form of EJB containers, as well as in .NET Enterprise Services, which today essentially use COM+ contexts that evolved from MTS contexts. The CORBA Component Model is currently unique in that it combines object-oriented (traditional CORBA), contextual (EJB-like container) and connection-oriented (facets/receptacles and event sources/sinks) composition techniques. Yet it doesn't provide architectural guidance in combining these techniques to produce effective compositional solutions.

Next month, we'll venture beyond current common practice by looking at some taxonomies that help us to better understand the trade-offs involved when using the different composition techniques. We'll also take a peek into the probable future by examining some further possible composition techniques.