Whenever OOP (well, I should say C++/Java/C#) programmers think of boundaries, coupling, etc., the tendency is to think in terms of “names”. I won’t go into the nouns vs verbs diatribe. However, boundaries and coupling are thought of in terms of components, where a component is arbitrarily defined by each person. Inevitably these boundaries become muddled. Usually when we think of complex problems, it helps to go back to basics. There was one system I’ve used in my life that forced me to never have these problems around boundaries. It did it so elegantly and effectively that even today I can’t easily find anything as good.
When we think of calling methods, we commonly run into the challenge of sending data across component boundaries (whether your language calls it a “class” or not is irrelevant). In the way a program is written or thought of, it isn’t always clear what the right granularity for information should be. Let’s take the example of a quicksort in C.
I can write the signature as:
quicksort(array, left, right); // My recursive calls progressively narrow left->right windows.
quicksort(array, right);       // In C I can pass a pointer to "left", I bound on the right.
quicksort(array);              // I somehow found a way to NULL-terminate my sub-arrays. Call it magic.
What’s interesting about this example is, until you look at all three signatures, you don’t realize that each of them conveys very different meaning. If you saw any one of these in a code review, it wouldn’t raise a large red flag. It’s all plain pure C. Nothing out of the ordinary is happening here. (Someone coming from a traditionally functional/declarative language might throw a fit just on principle, but that’s because in their world, all arithmetic has to be built from a single primitive function “increment”. I may be wrong here – someone in the comments is probably going to point out that increment is built using some other more basic functions.)
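To make the difference concrete, here is a minimal sketch of the first two signatures in plain C. The names quicksort_window and quicksort_bounded, the Lomuto partition scheme, and the ptrdiff_t index type are assumptions made for illustration, not anything prescribed above. The third, sentinel-terminated variant is left out, since it really would need magic.

```c
#include <assert.h>
#include <stddef.h>

/* Swap two ints in place. */
static void swap_ints(int *x, int *y) {
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Lomuto partition of a[lo..hi] (inclusive); returns the pivot's final index. */
static ptrdiff_t partition(int *a, ptrdiff_t lo, ptrdiff_t hi) {
    int pivot = a[hi];
    ptrdiff_t i = lo;
    for (ptrdiff_t j = lo; j < hi; j++) {
        if (a[j] < pivot) {
            swap_ints(&a[i], &a[j]);
            i++;
        }
    }
    swap_ints(&a[i], &a[hi]);
    return i;
}

/* Signature 1: every call receives the whole array plus a window into it. */
void quicksort_window(int *array, ptrdiff_t left, ptrdiff_t right) {
    assert(array != NULL);
    if (left >= right) return;
    ptrdiff_t p = partition(array, left, right);
    quicksort_window(array, left, p - 1);
    quicksort_window(array, p + 1, right);
}

/* Signature 2: "left" is folded into the pointer, so every call is handed
 * only its own sub-array. `right` is the last valid index of that sub-array
 * (-1 means the sub-array is empty). */
void quicksort_bounded(int *array, ptrdiff_t right) {
    assert(array != NULL);
    if (right <= 0) return;
    ptrdiff_t p = partition(array, 0, right);
    quicksort_bounded(array, p - 1);                 /* elements left of the pivot  */
    quicksort_bounded(array + p + 1, right - p - 1); /* elements right of the pivot */
}
```

Calling quicksort_window(a, 0, n - 1) and quicksort_bounded(a, n - 1) on the same n-element input produces the same sorted result. The difference is what each recursive call is told: the first is handed the caller's whole array and trusted to stay inside its window, the second is handed nothing but the piece it is responsible for.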
Now imagine doing the exact same thing in Java or C#. When was the last time you even got to consider the three options as being options? Due to memory constraints and the lack of pointers or null-terminability of arrays, you almost always end up using the first version of the method signature. Doing anything else would mean copying the subarray into new arrays (not that there’s anything wrong with that; it’s just that the languages choose to make it expensive).
This is what I mean by object serialization. Object composition or casting is orders of magnitude more expensive than passing the message to the other side. So you inevitably end up sending everything you know. “Here’s three globals, two locals, my array, another array I was working on, and the past 10 US presidents for good measure. You figure out what you need to sort this array.”
This is where we come to the title of this article. One of the common pushbacks I get when I ask for sending the “proper message” across a boundary is how difficult it is to serialize an object. That is, somehow the challenge message-passing systems have to “overcome” is to figure out how Java and C# can serialize their stuff better. I submit to you that this is where we lost our componentizing battles. The challenge presented to Java and C# should instead have been: make it easier to compose your stuff into messages.
You want empirical evidence? Here goes. In 1995, pure-C programmers using multiple different compilers were able to build a Microsoft Office that could be extended almost indefinitely. Now these people weren’t security experts, but the proliferation of IE’s ActiveX nightmare only proves the case for their APIs. Internet Explorer using COM was the closest the world got to Jeff Goldblum being able to execute PowerPC machine code on an Alien mothership (where Steve Jobs probably designed their systems before coming to earth). While it was a godforsaken nightmare for Microsoft, how many of you wish your legitimate components could work THAT well with each other, huh? Be honest!
The takeaway is, the next time you define an interface, ask yourself this question:
If I was sending this information across a compiler and memory-layout boundary, is this the way I would send it?
(Also, all your dangling unclosed file descriptors will go away in the next few minutes.)
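As one way to picture that question, here is a minimal sketch of a "sort these numbers" request composed as a message rather than passed as a view into the caller's memory. The wire format (a 4-byte little-endian count followed by 4-byte little-endian values) and the names sort_request_encode and sort_request_decode are hypothetical, chosen only for illustration. The point is that the receiver gets exactly what it needs to sort, in a layout that does not depend on either side's compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit value as 4 little-endian bytes, independent of host layout. */
static void put_u32_le(uint8_t *out, uint32_t v) {
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)((v >> 8) & 0xFFu);
    out[2] = (uint8_t)((v >> 16) & 0xFFu);
    out[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Read 4 little-endian bytes back into a 32-bit value. */
static uint32_t get_u32_le(const uint8_t *in) {
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}

/* Compose the request. Returns the number of bytes written,
 * or 0 if the buffer is too small to hold the message. */
size_t sort_request_encode(const int32_t *items, uint32_t count,
                           uint8_t *buf, size_t cap) {
    assert(buf != NULL);
    if (count > (SIZE_MAX - 4) / 4) return 0;  /* size would overflow */
    size_t need = 4 + (size_t)count * 4;
    if (cap < need) return 0;
    put_u32_le(buf, count);
    for (uint32_t i = 0; i < count; i++) {
        uint32_t bits;
        memcpy(&bits, &items[i], sizeof bits); /* reinterpret, no sign games */
        put_u32_le(buf + 4 + (size_t)i * 4, bits);
    }
    return need;
}

/* Parse the request on the receiving side. Returns 0 on success,
 * -1 if the message is truncated or does not fit in `out`. */
int sort_request_decode(const uint8_t *buf, size_t len,
                        int32_t *out, uint32_t out_cap, uint32_t *count_out) {
    assert(buf != NULL && count_out != NULL);
    if (len < 4) return -1;
    uint32_t count = get_u32_le(buf);
    if (count > out_cap) return -1;
    if ((len - 4) / 4 < count) return -1;      /* fewer bytes than promised */
    for (uint32_t i = 0; i < count; i++) {
        uint32_t bits = get_u32_le(buf + 4 + (size_t)i * 4);
        memcpy(&out[i], &bits, sizeof bits);
    }
    *count_out = count;
    return 0;
}
```

Nothing in the message refers to the sender's globals, locals, or pointers, so the receiver could live in another DLL, another process, or another language. Whether a real system should use a hand-rolled format like this one is a separate decision; the sketch only shows what "composing a message" looks like when the boundary is taken seriously.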
There’s a reason I call out the three most common commercial object-oriented languages out there. I’m going to say something that’ll polarize my audience: do you really want to know the best commercially used and widely deployed OOP system in existence today? It’s COM. COM isn’t easy to understand. It sure as hell isn’t simple. However, that thing is so darned elegant, you fall in love with it the more you use it. Elegance comes from using a very small number of powerful concepts repeatedly.
I’ve spent two years writing very performant OOP code for mobile devices using COM, and later when I switched to C# for more mainstream server programming, I found that C# programmers had an array of things they couldn’t do which I was doing comfortably using plain-old-C. Sure, you save some verbosity using “var foo = <Large type declaration>”, or you save some filtering verbosity using LINQ, and whatnot.
COM was built out of need, out of desperation, and the elegance shows. When you write code that talks to a COM component, you have no idea what language it was written in (a proper OO boundary). You don’t quite know what methods it exposes (duck typing, dynamic typing). The object itself can expose methods not physically compiled by the language (it can have extension methods, or runtime-exposed methods, etc.). However, the one key benefit of all this was that communication between two components was always very well enforced, and that is the benefit C++/C#/Java programmers tend to miss. We got this as C++Light programmers (C++Light is when you’re mainly writing C code on a C++ compiler but use some C++ for strategic things like smart pointers).
OOP was never about the language. It was about “objects”. COM forced you to think of components. You couldn’t have global statics because… well how do you do that across language boundaries? When you’re linking two DLLs with vastly different memory layouts, you just can’t pull it off. So you are forced to think how you share stuff.
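For readers who have never seen the shape of such a component, here is a minimal sketch in plain C of a COM-style object. It is not the real COM API: the names (Counter, CounterVtbl, counter_create, IID_BASE, IID_COUNTER), the small integer interface ids, and the 0/-1 return codes are all hypothetical stand-ins, whereas real COM uses IUnknown, 128-bit GUIDs, and HRESULT values. What it tries to show is the idea: callers see only a table of function pointers, discover capabilities at runtime, and all state lives in the object rather than in a global static.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical interface ids (real COM uses 128-bit GUIDs). */
enum { IID_BASE = 1, IID_COUNTER = 2 };

typedef struct Counter Counter;

/* The contract between caller and component: a table of function pointers. */
typedef struct CounterVtbl {
    int      (*query_interface)(Counter *self, uint32_t iid, void **out);
    uint32_t (*add_ref)(Counter *self);
    uint32_t (*release)(Counter *self);
    int32_t  (*increment)(Counter *self);
} CounterVtbl;

/* The vtable pointer comes first; callers should treat the rest as opaque. */
struct Counter {
    const CounterVtbl *vtbl;
    uint32_t refs;   /* lifetime is shared through counting, not ownership */
    int32_t  value;  /* per-object state instead of a global static */
};

static uint32_t counter_add_ref(Counter *self) {
    return ++self->refs;
}

/* Drops one reference; frees the object when the last one goes away. */
static uint32_t counter_release(Counter *self) {
    uint32_t left = --self->refs;
    if (left == 0) free(self);
    return left;
}

/* Runtime discovery: ask the object whether it supports an interface. */
static int counter_query_interface(Counter *self, uint32_t iid, void **out) {
    assert(out != NULL);
    if (iid == IID_BASE || iid == IID_COUNTER) {
        *out = self;
        counter_add_ref(self);  /* the caller now holds another reference */
        return 0;
    }
    *out = NULL;
    return -1;
}

static int32_t counter_increment(Counter *self) {
    return ++self->value;
}

static const CounterVtbl counter_vtbl = {
    counter_query_interface,
    counter_add_ref,
    counter_release,
    counter_increment
};

/* Factory: the only symbol a caller needs. Returns NULL if allocation
 * fails; otherwise the caller starts with one reference. */
Counter *counter_create(void) {
    Counter *c = malloc(sizeof *c);
    if (c == NULL) return NULL;
    c->vtbl = &counter_vtbl;
    c->refs = 1;
    c->value = 0;
    return c;
}
```

A caller would write c->vtbl->increment(c) and, when done, c->vtbl->release(c). Two objects created by counter_create count independently, because there is nowhere global for them to share state by accident.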