I began this rant in one of my older BCI-related posts, about why I’m not going to use “Frameworks”. This is a full post on why I hate most frameworks and platforms out there, and why people should really just take it up with their companies to improve the promotion systems if writing small tools doesn’t seem “fancy enough” (which sounds to me like either an HR problem for workers, or a faulty evaluation system for academics.)
So what’s wrong? Some frameworks are good. Some are great even. I wouldn’t think of doing matrix algebra without Matlab (and I appreciate the fact that they allow us to write plugins to add new features and extend it.) I love how I run most of my code above the .Net CLR and it is “managed” for me. No more processes gone awry. Very few memory leak opportunities. Great library to work with. Same can be said for the JVM, or Python.
If you notice, each of the above is a platform with few (or no) missing pieces. However, every so often I come across a platform or framework that is composed entirely of missing pieces. Something very typical I commented on in that BCI post is as simple as file-copy. Back in college, if someone needed a file-copy, you’d see them build a five-line program to copy files from one place to another. A good program would have buffering, network FS mounting, etc. What it would still not have, and this is something that would never even cross the mind of the developer, is to make it a “data copying platform, where file-copying is a subset of the functionality.”
So what’s wrong with building a platform beyond your core problem? Lots. First, if you don’t know a requirement, you can’t plan for it. I’m making that a very hard statement. When you read the above, you have no idea what that “data” is going to be. If it is just binary opaque data, then it maps to file-copying. If it isn’t binary opaque data, then you have no clue what the requirement will be. So you end up providing extension hooks for events such as “precopy”, “postcopy”, “copyprogress”, etc. Chances are (and I’d call it a near-certainty) that when that specialized data really needs to be copied, your provided hooks are at best completely useless (in which case, a decision not to use your platform is easy to justify), or they miss some last-mile option that requires a complex hack. I’d bet almost everyone reading this has used at least one fancy solve-it-all framework that lacked one key feature and required elaborate mechanisms to fit within that framework. If the framework were a tool instead, you could write your mechanism around it, instead of within it.
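To make the hook problem concrete, here is a minimal sketch of what such a “data copying platform” tends to look like (every name here is hypothetical, not a real library): the hooks fire around the transfer, but the transfer itself is hard-wired, which is exactly the part your specialized data may need to change.

```python
# Hypothetical "data copying platform" -- for illustration only.
# Hooks fire before and after the copy, but the transfer itself is
# fixed, so a data type that needs, say, resumable or chunked copying
# cannot be expressed through any of them.
class CopyPlatform:
    def __init__(self):
        self.pre_copy = []    # callbacks run before the transfer
        self.post_copy = []   # callbacks run after the transfer

    def copy(self, src, dst):
        for hook in self.pre_copy:
            hook(src, dst)
        with open(src, "rb") as f, open(dst, "wb") as g:
            g.write(f.read())  # the one step no hook lets you replace
        for hook in self.post_copy:
            hook(src, dst)
```

The moment your data needs that middle line to change, you are hacking around the platform rather than using it.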
Taking the file-copy example above, I could generalize that task to something even more abstract. “Operation.” Instead of a file-copy program, I write a program to “Do Operations”, one of them being “File Copy”. So I’d have a framework for operations and each OperationProvider must provide a well-defined entry point called “DoOperation”. That way, when I need to do any other operations such as “FileZip” in the future, I can just write a plugin and leverage my framework. Either you keep the parameters to this entry-point super strict (a list of files), or you keep them super-generic (a Variant, or Object.) Either way, there is one common theme that begins to emerge. What you’re writing is an operating system shell!
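As a sketch (every name below is made up for illustration), the “Do Operations” framework boils down to something like this. Note how the generic parameter means each plugin reinvents its own argument parsing, which is precisely what argv and the shell already provide:

```python
import zipfile

# Hypothetical over-generic framework -- for illustration only.
class OperationProvider:
    def do_operation(self, args: object) -> object:
        # "object" is the Variant in disguise: the framework promises
        # nothing about args, so every plugin must decode it itself.
        raise NotImplementedError

class FileCopy(OperationProvider):
    def do_operation(self, args):
        src, dst = args  # plugin-private convention, invisible to the framework
        with open(src, "rb") as f, open(dst, "wb") as g:
            g.write(f.read())

class FileZip(OperationProvider):
    def do_operation(self, args):
        src, archive = args  # a *different* private convention
        with zipfile.ZipFile(archive, "w") as z:
            z.write(src)
```

Rename `do_operation` to `main`, `args` to `argv`, and the plugin registry to a directory on `PATH`, and you have re-derived the executable file.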
The shell provides a generic method for “Doing Operations”, which are represented by distinct plugins called “executable files”, which are loaded dynamically at runtime, whose parameters are generic but are strongly type-checked within, and each plugin provides an entry-point, and blah blah blah…..
What I’ve observed is, 9 out of 10 frameworks, over a few iterations, end up becoming shell replacements. There is only a perceived benefit of abstraction, and you have none of the loose-binding power of the system shell. Whatever features the framework itself was intended to provide remain a small part of the system, largely superseded by a multitude of plugins, which now can’t be used for other tasks because they depend on the framework’s loading mechanisms.
I’m not without my own skeletons in the closet. During college, and in my first three years at Microsoft, I’ve tried every single time (and once almost got away with it) to build these generic monsters. It’s very tempting really. When you’re working on a problem, you start to see generalizations. To be fair, a lot of them are real, while a small number of them are perceived - or you can just make something up like “Do Operation”, which is so generic you can fit anything in there. There’s a certain elegance in the Unix-style design (and even today, I won’t use a machine unless I have trusty VI and GREP installed.) The fact that tools are just that – tools. They do stuff they’re supposed to do, while remaining agnostic to the environment they run in. They can be combined or reconfigured to build bigger tools. And they depend on a framework too – the system shell.
Any modern OS comes with a shell capable of powerful script execution. The fact is, platforms only work when they specialize more than the shell, and the specialization has to be fairly targeted. The .Net CLR collects garbage. It is a framework that provides garbage collection to programs. The garbage collection itself is not up for debate. The type system is not open to plugin-based changes. Matlab’s language is immutable. Its matrices are not something you can plug into and modify.
The deeper problem that one begins to face very fast is that of coupling. Imagine a complex workflow where you define a billion operations that depend on a million other operations. Sounds like a perfect problem for “Framework Man” (if I were any good at Photoshop, you’d see a superhero here.) In fact, that’s what most frameworks do. That generic “DoOperation” I wrote about above isn’t as apparently ridiculous to others as one would think. A lot of what I’m about to say depends on interpretation, so many of you may disagree.
For the layperson, think of coupling as how, under the hood of your car, your spark plug depends on the cylinder size you have, and your cylinder size dictates the power you get, and also the fuel injection, and also the cooling required, and the cooling system eats some of that power, so that takes away a little from what you can use to run. The actual situation is a lot more complicated. Now here’s the thing with cars or operating systems – people fight tooth and nail to reduce coupling as much as humanly possible, and then some. Unless absolutely justified with no way out, you want to reduce coupling. I’m not exaggerating how seriously this is done.
The problem with that file-copy example I gave above is that it seems almost intentional, or at worst a tongue-in-cheek joke about how many people may end up using it because it provides: 1. file-copy and 2. extensibility.
Problems with coupling are well-known. One thing depends on another, depends on another, depends on another…. It is a nightmare to deploy, understand, maintain and, most importantly, fix! You see, if I found a better tool for file-copy, the best I’d do in an OS script is modify the parameters to that tool. For a framework, I either write a shell wrapper (in which case, what was wrong with a shell script anyway?), or modify the code to be hosted in it, and if the code is in a different language than the one it supports, use another framework to write marshalling code around it.
I think what I’m saying here is, management and tasks must never mix. If you need to manage and handle complex workflows for file-copies, that’s fine. Have a framework that manages tasks. Let each of those tasks be handled by the native OS. Even if you can write a fancy, powerful task, write it as an independent tool that can be plugged out and replaced or reused. Call it on the shell. That way, others can quickly reproduce any problems they face with that task in the future when they need to debug. Fixing the task, if code isn’t available, is as easy as finding another tool and fitting it in. Undoing the action of that task is simpler. These scenarios aren’t as far-fetched as one would think. Imagine if my nth task in a workflow were writing registry keys that my (n+1)th task were reading. If my (n+1)th task fails, it’s quite a nightmare to reproduce that failure without creating another workflow that can feed the whole chain (or mocking the task prior to it and hosting the whole framework beast). Trying different parameters involves running the mock workflow over and over again, and depending on what kind of checks and initializations it does, it’s a massive waste of time (is it too hard for the tool to first check for updates, then check for updates to each component, then check security keys, etc.?)
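A minimal sketch of that separation, assuming standard Unix `cp` and `gzip` are on the path (the file names and the two-step workflow are made up): the “framework” only sequences tasks, and each task is an ordinary command line, so any step can be reproduced, replaced, or debugged on the shell by itself.

```python
import subprocess

# The manager only sequences tasks; it knows nothing about what they do.
def run_workflow(tasks):
    for cmd in tasks:
        # check=True: stop on the first failing task, like `set -e`
        subprocess.run(cmd, check=True)

# Hypothetical two-step workflow.  Each task is a plain command line:
# paste either one into a shell to reproduce it, or swap in a better
# tool by editing one list entry -- no re-hosting inside a framework.
workflow = [
    ["cp", "input.txt", "staging.txt"],
    ["gzip", "-f", "staging.txt"],
]
```

Because every task is also a standalone command, debugging the (n+1)th task doesn’t require mocking the first n and hosting the whole beast.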
Small distinct stateless tools, data piped by the OS, a lightweight scripting language for glue, and console control FTW!