Archis's Blog

June 29, 2014

Why Programming Languages matter (and how you may choose wisely.)

A couple of months ago, a colleague asked me what determines my choice of language, or even paradigm, and that had already motivated me to draft a post on the subject. I spent this past week at a conference in Silicon Valley. It was the first time I had actually interacted with the “hack3rs”, startup junkies, techies and whatnot. I learnt a lot of buzzwords I had never heard before, but I walked away with one major takeaway: even startups that are trying to be lean and simple tend to complicate things beyond belief!

You can separate out the ones who “get it” from those who don’t. The talks by Google, Facebook and Akamai engineers were captivating. They operate at a higher level of the problem space. This is not because of money, size or scale; it is simply because they choose to brutally simplify what does not matter, so that they can focus on what does.

So, against the backdrop of me pushing my own team to consider bringing in higher-level languages, or even using full object-oriented programming as it was meant to be used, I found that a lot of new-age startups didn’t “get it” either. They use JavaScript and Python and Ruby, but they have no clue why.

Now admittedly there is value in writing a more compact loop, or avoiding boilerplate code. But if that is the only reason you are considering a programming language, you’re doing it entirely wrong. I’ll take an FFT written in BASIC any day of the week and twice on a weekend, as opposed to a correlation-DFT on a multi-core parallel futures-dependent async map-reduce framework on top of hyper-optimized vectorized C++.

So what should influence the choice of language? One question and one question alone: Am I saying what I want to say?

After putting aside reasonable performance requirements, all the requisite functionality, etc., the language needs to allow you to express your intent, not only to the compiler, but to a future reader. It is my belief that 99% of the reason behind failed attempts at software maintenance is the inability of the original programmers to express their intent. What is documentation if not expression of intent? What is a UML diagram if not expression of intent? What is Object Oriented Programming, if not expressing intent of WHAT operations are possible on WHAT data? It isn’t as if the old C-style ModifyWindowEx(HWND wnd) didn’t work, but Window.modify() tells you and the compiler what is possible on that window, and what is not. It is expression of intent.
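As a rough illustration of the Window.modify() point, here is a minimal C++ sketch. The Window class, its rename() method and its title() accessor are hypothetical names invented for this example; they do not belong to any real windowing API.

```cpp
#include <string>
#include <utility>

// Hypothetical Window type, for illustration only.
class Window {
public:
    explicit Window(std::string title) : title_(std::move(title)) {}

    // The one mutation a caller may perform is stated right here.
    void rename(std::string new_title) { title_ = std::move(new_title); }

    // const tells reader and compiler alike that reading the title changes nothing.
    const std::string& title() const { return title_; }

private:
    std::string title_;  // not reachable from outside, so no other operation can touch it
};
```

A caller who writes Window w("Untitled"); w.rename("Report"); can see from the class alone which operations exist on a window, without opening any documentation.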

Fortran was huge back in the day, because it expressed your formula. Instead of reading:

MOV AX, $5D
ADD AX, $6F
MOV $7F, AX

You could say:

c = a + b

So you know that the entity “a” was to be added to entity “b”, and the result stored in “c”. You don’t even need to know computers to tell me what that means.

The common misinterpretation of this concept is that “Functional languages allow you to say what you want, and imperative languages allow you to say how you want.”

That is a terrible way of looking at it. Because sometimes “how” you want something is what you’re trying to express.

Like all my blog posts, I’m going to give you that fundamental question to ask yourself when choosing your language:

“Did I tell people what my intent here was?”

If you can’t answer that question directly in the language, you’re using a non-optimal fit. When you have to write documentation and code comments, that means your language failed at expressing intent. Take the example of this prototype:

char* reverseString(const char *foo);

Without extensive documentation on treatment of nulls, empty strings, and exception-handling capabilities, there is no way to understand what the author of the function intended this to be used for. This is BAD! Sure, there may be tons of input validation inside the function, but now you have to write a dozen unit tests for a dozen scenarios to ensure those assumptions aren’t broken.

What do I mean by intent-expressivity? Suppose C++ allowed hypothetical annotations that could be made part of the prototype metadata?

char* @Nullable reverseString(@NonNullable const char *foo);

If those annotations were stored in the prototype metadata, you would get two benefits:

1. You never need tests to ensure foo is non-null. Your compiler did whatever it had to, to give you a non-null char pointer.

2. You expressed to your caller, in no uncertain terms, that you will not tolerate nulls. You expressed it in a way that the compiler understood, and a smart static analysis tool would catch a class of bugs that cannot be caught in plain old C.

While this appears to be cosmetic syntactic sugar, it is far, far more than that: it is semantic sugar. Now any analyser, man or machine, knows that a null foo will not be entertained by my function. In effect, you’re locking down the domain and range of the function. It may look very, very silly that I would care about such a thing.
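Standard C++ has no @Nullable or @NonNullable annotations, but its type system can carry roughly the same intent. The sketch below is one possible reading, assuming C++17: a reference parameter stands in for “never null”, and std::optional stands in for “may have no result”. The name reverse_string and the rule that an empty input yields no result are assumptions made for this example only.

```cpp
#include <optional>
#include <string>

// A reference cannot be null, so "foo must exist" is enforced by the compiler.
// Assumed policy for this sketch: an empty input has no meaningful reversal,
// so the function reports "no result" rather than handing back a null pointer.
std::optional<std::string> reverse_string(const std::string& foo) {
    if (foo.empty()) {
        return std::nullopt;
    }
    // Build the reversed copy from the reverse iterators of the input.
    return std::string(foo.rbegin(), foo.rend());
}
```

The caller has to unwrap the optional before using the value, so the “might be absent” case is visible at every call site instead of living in a comment.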

Functional Programming isn’t the answer to everything:

Another common misconception about me is that I want pure-functional languages. Oh boy do I love them dearly, and for good reason. See that expression above?

c = a + b

What if I wanted to add the result of two expressions?

c = (expr1) + (expr2)

What if expr1 has side-effects that affect expr2? This isn’t an unseen situation:

c = (a++) + (a + b);

The problem here is not the one you think it is. I know what you’re thinking: “Who knows how this language interprets that statement? What happens if the evaluation order got changed?”

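And you’re WRONG! That kind of thinking is what allows such features to live. There is an easy answer to what you thought was the problem: read the language spec. Java, for instance, evaluates operands left to right, while C and C++ declare this particular expression undefined behavior.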

The real problem with the expression above is that I have no way of knowing whether that sequencing was incidental or intentional. Given a language that defines the evaluation order, I can deterministically answer what will happen above. What I cannot answer is: was that intended? Suppose I had to optimize the method above to run in a loop. Suppose I had to make it safe to invoke from multiple threads, possibly running on different cores. Suppose someone asked me, “If I set the value of variable z to 10 instead of 20, will it affect your value of c?”

Then it is theoretically impossible[1] to answer that question. Sure, we could heuristically make some assertion after adding a thousand caveats (or just one caveat), but as a reasoned outcome, we cannot say that z somehow didn’t affect a or b. Furthermore, every evaluation of the expression above changes “a”, so repeated evaluations cause “c” to change.
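One way to make the sequencing visibly intentional is to separate the pure arithmetic from the side effect. The following C++ sketch assumes the left-to-right reading of c = (a++) + (a + b), that is, the old value of a on the left and the incremented value on the right. The function names add and add_then_advance are made up for this illustration.

```cpp
// A pure function: the result depends only on its arguments,
// so evaluating it twice with the same inputs gives the same answer.
constexpr int add(int a, int b) { return a + b; }

// The side effect is spelled out one step at a time, so a reader can tell
// that the ordering is deliberate rather than an accident of the compiler.
int add_then_advance(int& a, int b) {
    const int before = a;  // value used by the left operand
    a = a + 1;             // the side effect, as a statement of its own
    return add(before, add(a, b));
}
```

With a = 1 and b = 2, this returns 1 + (2 + 2) = 5 and leaves a at 2. The non-const reference in the signature also tells the caller up front that a will be modified.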

Why is that important?

Because the ability to reason is the ability to maintain. You want to know why CSS sucks? It isn’t horrible, like most people think, because people write it wrong, or because designers mix font rules with layout rules. CSS sucks because it literally removes any and all ability to reason about the intent behind any rule, without massive comments.

Remember that a rule-based declarative language isn’t exactly new or revolutionary. Prolog gave us CSS-style declarations more than 40 years ago. Erlang gives them to us today in a widely used industrial language.

If I showed you the code below:

div .title #subtitle {color: blue}

I bet you, you would have absolutely no freakin clue what effect this has on a particular page, without actually loading the page. It makes no mention of how it is supposed to be interpreted in relation to other rules. It makes no mention of how it relates to conflicting matches.
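The information that decides conflicts lives outside the rule, in the cascade’s specificity ordering. As a rough sketch of that hidden step, the C++ below counts ids, classes and element types for a deliberately simplified selector grammar: space-separated parts, each of which is a single #id, .class or type name. The Specificity alias and the specificity function are hypothetical helpers for this example; real CSS selectors have many more forms.

```cpp
#include <sstream>
#include <string>
#include <tuple>

// (ids, classes, types). Tuples compare left to right, which matches how
// specificity is compared: ids first, then classes, then element types.
using Specificity = std::tuple<int, int, int>;

// Simplified on purpose: handles only space-separated parts such as
// "div .title #subtitle", not compound selectors like "div.title".
Specificity specificity(const std::string& selector) {
    int ids = 0, classes = 0, types = 0;
    std::istringstream parts(selector);
    std::string part;
    while (parts >> part) {
        if (part[0] == '#') {
            ++ids;
        } else if (part[0] == '.') {
            ++classes;
        } else {
            ++types;
        }
    }
    return Specificity{ids, classes, types};
}
```

For the rule above this yields one id, one class and one type. Whether the text actually turns blue still depends on every other rule that matches the same element, which is exactly the part the declaration cannot express.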

So for all you Ruby/Python/Node.js users out there, I have one piece of advice: if you truly want to out-do the “establishment” and gain an edge, do what Google and Facebook do. They use experimental technology, but they don’t do it to reduce boilerplate code in for-loops. They use it to express their intent for their loops. Rapid development is a good enough reason to pick an easier language. Accurate expression of intent is the best reason to pick any language.

When Imperative matters:

To finish up, I wanted to explain why imperative programming matters. Look at a device driver for instance:

setlpt1(00000000b);

setlpt1(00010000b);

setlpt1(00000000b);

That’s some primitive protocol I made up for the parallel port. Those statements are organized chronologically. Even 200 years from now, that is EXACTLY what they mean and what they must do. Using imperative programming where necessary provides a strong signal to the reader that this code is NOT to be messed with. There is no opportunity to reorder those operations. There is no opportunity to apply them to abstract “ports”; they only work for the “parallel port”, the “printer port” of old.

Writing the above in a functional language, and then adding synchronization primitives to ensure they run sequentially, is folly.
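The made-up protocol above can be written in compilable C++ along these lines. This is only a sketch: FakePort is a hypothetical stand-in that records writes in memory so the ordering can be inspected, and pulse_bit4 is an invented name. A real driver would write to a hardware register, which is outside the scope of this example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the parallel port: it records each byte
// in the order it was written, instead of touching real hardware.
struct FakePort {
    std::vector<std::uint8_t> written;
    void set(std::uint8_t bits) { written.push_back(bits); }
};

// Pulse bit 4 low, high, low. The statements ARE the protocol:
// their order on the page is their order in time.
void pulse_bit4(FakePort& lpt1) {
    lpt1.set(0b00000000);
    lpt1.set(0b00010000);
    lpt1.set(0b00000000);
}
```

After a call to pulse_bit4, the recorded sequence is 0x00, 0x10, 0x00, in exactly that order, which is the whole point of writing it imperatively.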

Conclusion:

If there’s one take-away from this post, it is this: the next time you write ANY code/spec/program, ask yourself: did you express your intent properly? Did your choice of tool/language allow you to express it semantically, without room for ambiguous interpretation? Was it done in such a way that a future maintainer would know what your constraints are, without reading a single code comment or piece of documentation? If you answered yes to most of the above, you’re probably using the right language, and using the right language right.

8 Comments »

  1. Thanks for a great post. I agree on many of your points.

    Compilers (and future me’s) should be given more data on our intent, like in your example:
    char* @Nullable reverseString(@NonNullable const char *foo);

    This is sorely needed in C/C++, but here’s another syntax for a fictional language (and somehow I see this as a much more important thing than const, so I’ll just leave const out, until someone convinces me that I actually need it):

    (opt string) reverseString (ref string foo)

    Comment by sadesaapuu — June 30, 2014 @ 3:43 am

  2. Great post. The only problem is:

    “Writing the above in a functional language, and then adding synchronization primitives to ensure they run sequentially, is folly.”

    This is how you would typically write it in Haskell:

    foo = do
      setlpt1([00000000])
      setlpt1([00010000])
      setlpt1([00000000])

    No need for any “synchronization primitives”.

    “do” in Haskell makes it easy to work with “imperative” style code.
    However it is *way* more powerful than just doing simple IO.

    Google “Haskell Monads” for more info. Guaranteed to blow your mind (when you get it that is) :-)

    Comment by mbrodersen — June 30, 2014 @ 12:16 pm

    • Yes I know. Come on, don’t make me defend functional. :-) I thought my subtlety wouldn’t go unnoticed. I’m coming down HARD in FAVOR of functional programming. I saw the reddit thread happening about this, and if you want to post back there, then here goes:

      Note how I said, “If compressing a for-loop is the only reason you’re using Python/Ruby/Node.js/Closure, you’re doing it wrong?” My entire point here is, use them for all the power they provide. It isn’t about writing more compact imperative code. It is about writing mind-blowing code that 10 years later elegantly explains itself – to a compiler and a person. This allows future maintainers to know what to do when running on multiple cores, symmetric-multiprocessing, distributed processing with eventual memory-consistency, etc.

      What I’m advocating is just what you said – when you want to express intent of chronology, go blunt about it. However, consider the example above and if I had to reset 3 ports, what would I do?

      setlpt1([00F])
      setlpt2([00F])
      setlpt3([00F])

      Now, in this case, if I express the same as:

      setlpt1([00F]) && setlpt2([00F]) && setlpt3([00F])

      Then I’m giving a strong hint that I don’t care about order of evaluation. If the processor had 3 cores with three independent IO busses, then it’s fair game to run all three instructions in one cycle. If not, run them serially in any order. All sides of a logical AND are equal to me.

      And yes, I love Monads. Remember that comment of old? “Any sufficiently complicated software system, in any language, ends up implementing a lisp interpreter unknowingly?” With large scale integration of disparate systems, the new rule should be, “Any sufficiently diverse system in any language, ends up reimplementing Monads unknowingly.”

      Comment by archisgore — June 30, 2014 @ 7:09 pm

  3. char* @Nullable reverseString(@NonNullable const char *foo);

    Use this:

    std::string reverse_string(const std::string& arg);

    or, if you know about how to use rvalues, use this:

    std::string reverse_string(std::string arg);

    And if you really, really, really desire for return-values that are null:

    boost::optional reverse_string(std::string);

    Comment by FJW — June 30, 2014 @ 3:39 pm

  4. OK, the blog stole the angle-brackets, the last line was this:

    boost::optional opening-angle-bracket std::string closing-angle-bracket reverse_string(std::string);

    Comment by FJW — June 30, 2014 @ 3:42 pm

    This is why Perl is an awful language. They actually brag about how many different ways there are to do the same thing. So to understand the intent, you have to learn every different accent and linguistic quirk of each regional dialect. No thank you.

    Comment by chris — June 30, 2014 @ 9:35 pm

    • Chris I sort of agree, but also slightly disagree. Perhaps this is because I’ve been burned in the past with this line of thinking. If you read my previous posts, I’m a sworn opponent of frameworks and rigidity. Because then you end up with “hacks” to workaround the unnecessary rules imposed. In my short career so far, I’ve gone full-circle from being the guy who wanted to enforce all sorts of crazy strict rules, to the guy who wants to remove ALL rules except for a few KEY rules.

      Every attempt at using rigidity, no matter how well-intentioned, produces the kind of pain that I’m arguing against. While I’m a pure-functional LISP guy (who hates that Haskell allows imperative programming), I’m also a pragmatist. Java tried to solve C++’s operator-overloading idiocy by removing that ability. What made C++’s operator overloading idiotic had nothing to do with the ability to overload operators, but the idiots who overloaded it all the time everywhere without knowing why. Java however didn’t allow class extensions like Smalltalk. That leads to this kinda stuff:

      class OperatorContainer{
          Class1 addClass1ToClass1(Class1 clazz1, Class1 clazz2);
      }

      So basically people write a ton of satellite classes to contain what should rightly be extension functions or overloaded operators. That’s why Apache Commons provides a dozen StringUtils, MapUtils, ListUtils, etc. instead of exposing the API on the class itself.

      At the end of the day, it is up to the programmer to ensure that they use the correct feature/option at the correct place. I am of the view that the feature be made available regardless. To prevent a good-intentioned use of a feature because someone may do something dumb with it is folly. Give someone the purest strictest language on earth, and I’ll bet 20 bucks they’ll find a way to do something dumb in it by end of the day.

      Having said that Perl still pisses me off, but that’s just biased personal opinion due to the horrid horrid Perl code I’ve had to deal with. It has nothing to do with the language itself – which I’m sure has many facets of beauty.

      Comment by archisgore — June 30, 2014 @ 11:36 pm

    This is why we love Python over old-fashioned Java/C++ family languages. I agree with many of your views; in spite of these, functional programming language usage is increasing day by day. So, can we say that “expressing the intent” doesn’t always work?

    Comment by Murat — August 14, 2014 @ 11:13 am

