Object-oriented decomposition -- why and why not

My friend Nels Beckman is writing on how to do functional programming in Java. He asked me why I thought people build systems that follow an OO decomposition. Since my boss has been relentless about getting the book done, I have not posted a blog recently, so here’s my note to Nels.

When you decompose systems, at some point you have a chunk (module, component, …) that you want to structure internally. How do you do that? One of the strategies is to mirror domain concepts. This actually works fine in most IT cases (and probably most others too, once you get used to it). There is an argument that the domain concepts (nouns) change very slowly, while their behaviors (verbs) seem to always be evolving. So if you align your structure with the nouns you get less churn induced by the domain changing. This OO strategy implicitly promotes modifiability.

Your suggestion to use a functional programming style is similar to the “orthogonal abstraction” I describe in the book — in the functional case it is usually some math formalism, hence the desire to work with tuples instead of a domain-specific concept. I can see two big advantages to the orthogonal abstraction. First, it could be much faster (or some other quality attribute). In fact it would be surprising if the OO strategy was the fastest — why would the way things are “in real life” naturally align with performance? And second, it could be that there is a domain that has already been well studied (e.g. compilers, databases, static analysis) that has its own set of (stable) abstractions. Experts have decided that these are the essential concepts and they work well.

A day in the life of an IT programmer is a frustrating exercise in what is not known. It is not about engineering a great solution to a known problem. Perhaps this is an under-appreciated fact in academics, and it would explain many prejudices. You’re just some guy trying to make the system do what the marketing / sales / customers want it to do. You never understand the domain (banking, inventory, insurance, etc.) as well as the experts. You are always building an approximation of what they’d really like to have. They always wish it was already built and you always build it too slowly. What you built yesterday doesn’t really support what they want tomorrow. I think systems/OS guys feel less like this — they have less churn in their domain and they spend more time working on optimizing the internal implementation of a module/component.

There’s a story about building a system to track boats on a lake. The first design is an OO one, with classes representing boats, trips, and fares; it can tell you what the cash register should hold at the end of the day. The second design is minimal: just keep two counters that sum the departure and return times of each boat. At the end of the day, subtract the departure counter from the return counter to get total minutes and multiply by the rate. Ta da. But if you ask for any change to the OO system (say discount fares after 5pm) it’s easy to add because the change is relative to the domain, which is already encoded in the design, while the other solution has pruned the problem to its core, so you’d have to start over to change anything. I’m not saying this story is accurate (it’s a fable, so it’s exaggerated), but the nugget of truth is there, including the extreme simplicity of non-OO solutions. (If anyone knows the origin of this fable please let me know.)
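Here is a rough Java sketch of the two designs, just to make the fable concrete. None of these names come from the original story: Trip, TripLedger, EveningDiscountLedger, and CounterTill are made up for illustration, times are minutes since midnight, money is in cents, and the "half price if you depart at or after 5pm" rule is my own assumption about what the discount might be.

```java
import java.util.ArrayList;
import java.util.List;

// Design 1: domain concepts become types. All names are illustrative.
final class Trip {
    final String boatId;
    final int departMinute;   // minutes since midnight
    final int returnMinute;   // minutes since midnight

    Trip(String boatId, int departMinute, int returnMinute) {
        this.boatId = boatId;
        this.departMinute = departMinute;
        this.returnMinute = returnMinute;
    }

    int minutes() {
        return returnMinute - departMinute;
    }
}

class TripLedger {
    private final List<Trip> trips = new ArrayList<>();
    private final long centsPerMinute;

    TripLedger(long centsPerMinute) {
        this.centsPerMinute = centsPerMinute;
    }

    void record(Trip trip) {
        trips.add(trip);
    }

    // Fare for one trip; a pricing change only needs to touch this method.
    long fareCents(Trip trip) {
        return trip.minutes() * centsPerMinute;
    }

    long totalCents() {
        long total = 0;
        for (Trip trip : trips) {
            total += fareCents(trip);
        }
        return total;
    }
}

// The "discount after 5pm" change, expressed against the domain types.
// Assumption: a trip departing at or after 17:00 is charged half price.
class EveningDiscountLedger extends TripLedger {
    private static final int FIVE_PM = 17 * 60;

    EveningDiscountLedger(long centsPerMinute) {
        super(centsPerMinute);
    }

    @Override
    long fareCents(Trip trip) {
        long full = super.fareCents(trip);
        return trip.departMinute >= FIVE_PM ? full / 2 : full;
    }
}

// Design 2: two running sums and nothing else.
class CounterTill {
    private long departSum = 0;
    private long returnSum = 0;
    private final long centsPerMinute;

    CounterTill(long centsPerMinute) {
        this.centsPerMinute = centsPerMinute;
    }

    void departed(int minute) {
        departSum += minute;
    }

    void returned(int minute) {
        returnSum += minute;
    }

    // Total minutes on the water times the flat rate.
    long totalCents() {
        return (returnSum - departSum) * centsPerMinute;
    }
}
```

With a flat rate both designs report the same total. The counter version, though, has already thrown away which minutes belong to which trip, so there is nowhere to hang a per-trip pricing rule, while the ledger version absorbs the change in a single overridden method.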

Comments

Thanks George.

It’s true, I don’t disagree with what you say here. Here’s all I am trying to say:

I always learned the “modeling first” approach to OO design. And even today, no matter which language I use, I do pretty much always model my domain concepts with types in that language. However, what I don’t normally hear from the “modeling first” approach is, what do I do about all that other code I have to write but that has no manifestation in the domain?

It sounds like your answer is that most of the infrastructure code for your average programmer already exists in the form of libraries and frameworks. This is probably true.

All I can say is that, in my experience, using this approach often led to designs where non-domain functionality, like display for instance, would get lumped in with the domain classes, since domain decomposition didn’t really tell me where this stuff needed to go.

Thanks bro.
 Nels

your code shouldn't resemble your model

Nels, I agree with you that the view, “model your domain and turn each concept into a class”, is too simplistic. An OO program is invariably much messier than its domain model (e.g., because of language shortcomings, such as the lack of direct support for bi-directional relationships in Java), but that’s not even the whole problem. I think it is fundamentally a mistake to think about the problem (domain model) and the solution (OO program) as mirror images. The OO program is designed to meet non-functional requirements, and those first of all dictate certain decisions to make the program, I don’t know, faster or more maintainable. And then it turns out that what we do in the model world, mixing data and behavior, is not what we do in the program.

In my experience it is very very beneficial to separate out data and behavior. In other words, some classes in the program just encode the data, like the famous “Student” class with fields like firstName and lastName. But that class doesn’t have a “graduate” method; that method is in some “manager” class. So fundamentally, we’re ripping our domain model apart and allocating behavior essentially arbitrarily somewhere in the program. What remains is the data, which encapsulates the “state of the world” part of our domain model.
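To make that concrete, here is a minimal Java sketch of the split. Only Student, firstName, lastName, and the idea of a “graduate” method in a “manager” class come from the paragraph above; the creditsEarned and graduated fields, the StudentManager name, and the credit-threshold rule are assumptions I made up so the example has something to do.

```java
// Data only: mirrors the domain concept and is the part a persistence
// layer would store. Fields beyond firstName/lastName are hypothetical.
class Student {
    private final String firstName;
    private final String lastName;
    private int creditsEarned;
    private boolean graduated;

    Student(String firstName, String lastName, int creditsEarned) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.creditsEarned = creditsEarned;
    }

    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
    int getCreditsEarned() { return creditsEarned; }
    void setCreditsEarned(int creditsEarned) { this.creditsEarned = creditsEarned; }
    boolean isGraduated() { return graduated; }
    void setGraduated(boolean graduated) { this.graduated = graduated; }
}

// Behavior lives in a separate "manager" class rather than on Student.
class StudentManager {
    private final int creditsRequired;  // hypothetical graduation rule

    StudentManager(int creditsRequired) {
        this.creditsRequired = creditsRequired;
    }

    // Marks the student as graduated if they have enough credits.
    // Returns true when the student was graduated, false otherwise.
    boolean graduate(Student student) {
        if (student.getCreditsEarned() < creditsRequired) {
            return false;
        }
        student.setGraduated(true);
        return true;
    }
}
```

A caller would write something like `new StudentManager(120).graduate(student)`; the Student object itself never decides anything, it only carries state.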

Now that data part, I think, can, modulo practical issues like bi-directionality, resemble the model very closely in code. One may want to re-group things here and there, or make distinctions that are not in the model, but that’s about it. Incidentally, this is the part we like to store in databases, and so this separation also makes sense because we can hand those data objects, which is what they’re actually sometimes called (or “value” objects), to Hibernate or some other persistence layer. This is really really common in enterprise systems. EJBs practically force you to make a separation like this, and they even have names for it: Entity Beans for the data and Session Beans for methods that operate on that data (i.e., for the behavior).

I have a suspicion that other separations like this may also be beneficial, but this is a very common, useful, and familiar one to me. And it makes the domain model be fundamentally different from the code.

OK, enough for now, let me know what you think, and I’ll try to come up with more wisdom about the differences between problem and solution. Btw, I learned to call these two parts “analysis” and “design”, and even if they use the same notation (UML) it doesn’t mean they’re the same at all.