A story of space…
When I was 10, the space age was already 35 years old. Humankind had made it onto the moon. I was fascinated, and read as much as I could get my hands on about the Apollo and Gemini missions. That was my first introduction to redundancy. This was the era when computers were hilariously primitive. With no software development discipline, no formal verification methods, and no historical learnings to depend upon, how would these machines be kept reliable? The answer was both elegant and ingenious: two separate contractors would build two independent computers in isolation, designed to perform the same calculations in real time. When they disagreed on their computations, the pilot would be alerted.
It took me 20 years to understand what they were really doing: they weren’t providing redundancy in the true sense of the word. They were testing the system LIVE. One of the systems (picked arbitrarily) would be treated as the product, while the other system would be treated as the validation test.
It is a simple and powerful concept. When in doubt, simply rebuild the entire system all over and see if it matches. The chances of making mistakes in both systems are, without a doubt, very high. However, if it can be shown that the chances of making the same mistake in both systems are very low, then you might be on to something. Even with very little knowledge of any software discipline, you could still build a pretty damn good quality system.
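As a rough sketch of the idea (not a description of how the flight computers were actually built), here are two deliberately different implementations of the same calculation in Python, plus a comparator that only reports disagreement. The function names and the choice of calculation are invented for illustration.

```python
import math


def hypotenuse_a(x: float, y: float) -> float:
    """'Contractor A': the textbook formula."""
    return math.sqrt(x * x + y * y)


def hypotenuse_b(x: float, y: float) -> float:
    """'Contractor B': the same quantity, computed by a different routine."""
    return math.hypot(x, y)


def computers_agree(x: float, y: float, rel_tol: float = 1e-9) -> bool:
    """Return True when both computations agree within a relative tolerance.

    A False result only says that something is off. It does not say which
    side is wrong, or whether this comparator itself is at fault.
    """
    return math.isclose(hypotenuse_a(x, y), hypotenuse_b(x, y), rel_tol=rel_tol)


if not computers_agree(3.0, 4.0):
    print("ALERT: the two computers disagree")
```

Note that the comparator returns a single bit of information, which is exactly why the question of what a mismatch tells you matters.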
While you continue reading the rest of this post, try to answer the following question: What does a mismatch in calculations tell you? Try to enumerate the implications of that predicate. It’s going to be important later, because that will directly determine what we can do with that information. If we do nothing about it, then having the validation in place achieves nothing but costs us a lot.
The story of ECC and Parity bits…
Anyone who’s taken basic CS classes knows about a parity bit. It goes something like this:
You know that game where you stand in a row of people on stage? The game show host whispers a particularly tough-to-remember phrase to the person standing at one end. They have a limited amount of time to repeat it to the next person, and so forth. Eventually the person standing at the other end has to repeat the phrase as best they know it.
Let us assume this phrase was “How much wood would a woodchuck chuck if a woodchuck could chuck wood”.
Now there are a few tricks each participant can use to ensure the person they relayed it to got it correctly. One way, which I just made up, would be to add two bits of metadata, counted by ear: “wood: 5, chuck: 4”.
The modified phrase to pass around, at the cost of a little extra memorization, is:
“How much wood would a woodchuck chuck if a woodchuck could chuck wood. Wood: 5, Chuck: 4”.
If something was missed, the last participant could then generally reconstruct the phrase from the rules of grammar (implicit in the universe) and from far simpler metadata: the numbers 5 and 4. It also gives the preceding participant a chance to validate that the relay has happened correctly.
This is how all electronic communication works. There are relays upon relays upon relays. Mediums change and encodings change. This is in a pure universe with no malicious attacker. We’re not talking security here. We’re talking reliability. Just like in the example above, when bits are transmitted, two parties can make simple agreements with each other.
First, that any numeral can only be a zero or a one (just like English grammar being implicit). Usually this rule is enforced by the medium itself.
Second, that the number of 1s must necessarily be even (or odd).
So say I’m transmitting a number: 10011000
With no knowledge of electronics or binary coding, and with only the most basic ability to write messages, I can tell whether my message, as relayed to my friend, has an error in it. There are three 1s and five 0s in the number above. Suppose we agreed on an even number of 1s. Then I would simply append an extra 1 on the right, making four 1s and five 0s, with the agreement being that my friend always reads the left-most 8 digits as the message. If the message arrives with an odd number of 1s, a digit was flipped along the way.
This is how parity bits work. ECCs (error-correcting codes) use a similar principle but are able to detect which digit was broken. Thanks to our first rule that only two digits are allowed, recovery is trivial: since it can only be 0 or 1, and we know that it is wrong, we make it right by changing it to the opposite.
You’ll notice that in this example, I gave you the answer to: what a mismatch in expectations tells you.
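Here is a minimal Python sketch of both ideas. The first half is the even-parity scheme described above, using the same 8-digit number. The second half uses Hamming(7,4), one well-known error-correcting code picked purely for illustration; the post does not say which ECC it has in mind, and all function names are made up.

```python
from typing import List, Tuple


def add_even_parity(bits: str) -> str:
    """Append one digit so that the total number of 1s is even."""
    return bits + ("1" if bits.count("1") % 2 else "0")


def has_even_parity(frame: str) -> bool:
    """True if the frame still has an even number of 1s."""
    return frame.count("1") % 2 == 0


frame = add_even_parity("10011000")  # three 1s, so a 1 is appended
payload = frame[:8]                  # the receiver reads the left-most 8 digits
corrupted = "0" + frame[1:]          # simulate one flipped digit in transit


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 data bits as 7 bits; parity bits sit at positions 1, 2, 4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(code: List[int]) -> Tuple[List[int], int]:
    """Return (corrected codeword, 1-based error position or 0 if none)."""
    s1 = code[0] ^ code[2] ^ code[4] ^ code[6]
    s2 = code[1] ^ code[2] ^ code[5] ^ code[6]
    s3 = code[3] ^ code[4] ^ code[5] ^ code[6]
    position = s1 + 2 * s2 + 4 * s3
    fixed = list(code)
    if position:
        fixed[position - 1] ^= 1  # only two digits exist, so flipping repairs it
    return fixed, position


codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1  # flip the digit at position 5
repaired, where = hamming74_correct(damaged)
```

The single parity bit answers only "did an odd number of digits flip?", and two flips cancel out and go unnoticed. The Hamming sketch answers "which digit flipped?", but only under the assumption that at most one digit did.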
Testing, QA and Validation
The main purpose of this post was to comment on how to write tests, when, for what and why. The two examples above are great references to keep in mind when designing a test.
When writing any test you should be able to answer two questions crisply:
1. What does passing this test tell me?
2. What does failing this test tell me?
If you cannot answer these questions based on a test, you need to rework your test-case.
Why is this so important? Because like NASA, your test’s primary job is to smell that something is fishy. You don’t know what is fishy. You’re not sure if it’s cosmic radiation, or a programming bug, or even a bug in the “comparator” (remember that the thingy comparing the output of two simultaneous computers is also human-made, so it is susceptible to all the wonder and amazement and limitations that come along with that).
Furthermore, a quality test should, like the ECC, tell me what is wrong. More often than not, think of tests as being published, not written. Your tests are a publication to your peers, broadcasting to them the expectation you have of something. So when that expectation is broken, the test has to tell them what it was.
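In FOPL (first-order predicate logic) terms, with P being “this test passes” and Q being “you did nothing wrong”, a test should guarantee that
1. P -> Q (P implies Q) (If this test passes, then you did nothing wrong.)
Read the other way around, as the logically equivalent contrapositive, the same guarantee says
2. ~Q -> ~P (Not Q implies Not P) (If you did something wrong, then this test MUST fail.)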
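To make the idea of a test as a published expectation concrete, here is a minimal Python sketch of a test whose failure message states which expectation was broken. The function apply_discount and its rule are hypothetical, invented only for this illustration.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical function under test: take `percent` off a price."""
    return price_cents - (price_cents * percent) // 100


def test_discount_never_increases_price() -> None:
    """Expectation being published: a discount keeps the price in [0, price]."""
    for price in (0, 1, 999, 10_000):
        for percent in (0, 15, 100):
            result = apply_discount(price, percent)
            # The message is written for the peer who breaks this later.
            assert 0 <= result <= price, (
                f"apply_discount({price}, {percent}) returned {result}; "
                f"expected a value between 0 and {price}"
            )


test_discount_never_increases_price()
```

Passing tells the reader that the discounted price stayed in range for these inputs, and failing names the exact input and the range that was expected.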
“Your passwords do not match”
I finally come to an explanation of the title. When you create an account on any modern web service, you are asked to enter a new password twice. What is the purpose of that? What predicates does it allow us to make, both when the passwords match and when they do not?
The answer is very few. As in, all we can say is, the passwords you entered do not match. Is any one password more valid than the other? Is one password more proper than the other? We simply do not know.
However, when we write tests, we derive a false sense of security when, qualitatively, all that the test is doing is the above. Let’s go into a few examples.
1. A sorting test:
In order to catch a possible memorization scenario, I argue that a real sorting test generates previously unseen inputs, sorts them itself, and compares the outputs. This ensures that someone hasn’t simply programmed if (condition) then return “solution” kind of functions.
However, it isn’t as trivial as it seems. What if they did write a real sort? If they implemented a non-stable sort (Quicksort), and your test validates against a stable sort (Mergesort), who is more “right”?
Remember the two questions posited above. If you implement a quicksort all over again, then you mandate quicksort behavior on the implementation, which means you’re not really writing a test. You’re doing the “two passwords don’t match” thingy. If you DON’T implement a quicksort, you’re imposing rules that are not part of the expectation.
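One way to sidestep the “who is more right” question is sketched below in Python. It assumes the real expectation is only that the output is ordered by key and contains the same items as the input, and it checks those two properties directly instead of comparing against a second sort. This is a different technique from re-implementing the sort, and the helper name is made up.

```python
from collections import Counter


def is_valid_sort(original, result, key) -> bool:
    """Check the expectation itself, not agreement with another sort."""
    in_order = all(key(a) <= key(b) for a, b in zip(result, result[1:]))
    same_items = Counter(original) == Counter(result)
    return in_order and same_items


def by_number(record):
    """Sort key: only the first field of each record is compared."""
    return record[0]


records = [(1, "b"), (1, "a"), (0, "c")]

# Two orderings that differ only in how the tie on key 1 is broken.
stable_result = [(0, "c"), (1, "b"), (1, "a")]  # keeps input order of ties
other_result = [(0, "c"), (1, "a"), (1, "b")]   # what an unstable sort might do

# Both satisfy the stated expectation...
assert is_valid_sort(records, stable_result, by_number)
assert is_valid_sort(records, other_result, by_number)

# ...while an exact comparison would flag one of them as a "failure".
assert stable_result != other_result
```

If stability really is part of the expectation, then it belongs in the check as a stated property, not as a side effect of whichever reference sort the test author happened to pick.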
This is a common annoyance. Let’s say you’re Startup and you’re writing a program that automatically generates a terms-and-conditions statement for every new service you spin off. The idea is that you take in a template, and replace the company name with Startup (abstracted out with getCompanyName()) and the product name with AprilFoolsDeuxExMachina (abstracted out with getNewProductName()).
A temptation, to ensure nobody accidentally changes the company name, is to guard it with a test. In this case it is literally the equivalent of the two-passwords-matching case.
You would write
"Startup" == getCompanyName();
What is this test ensuring? That the company name is right? Who’s to say you didn’t get it wrong in the test? When the company gets acquired, is the test wrong or the code wrong?
Would it be better expressed if we simply wrote two identical functions (a sort of two-factor authentication) that increase the barrier to entry for modifying the company name, rather than making it a test “failure”?
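One possible reading of that question, sketched in Python under loud assumptions: get_company_name is a Python rendering of the post’s getCompanyName(), while get_company_name_confirmation and confirmed_company_name are hypothetical names invented here. The point is only that a mismatch is reported as a disagreement, not as one side being wrong.

```python
def get_company_name() -> str:
    """Python rendering of the post's getCompanyName()."""
    return "Startup"


def get_company_name_confirmation() -> str:
    """Hypothetical duplicate, assumed to live in a separate file so that
    changing the name takes two conscious edits."""
    return "Startup"


def confirmed_company_name() -> str:
    """Return the company name only when both definitions agree."""
    primary = get_company_name()
    confirmation = get_company_name_confirmation()
    if primary != confirmation:
        # Like two passwords that do not match: we know they differ,
        # but not which one is correct.
        raise ValueError(
            f"company name entries disagree: {primary!r} vs {confirmation!r}"
        )
    return primary


print(confirmed_company_name())
```

When the company gets acquired, both functions have to change together, and nobody has to argue about whether the test or the code is the wrong one.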
There are many more examples. This is simply an observation I had when reading nearly any code… that chasing code coverage looks very good in principle and on paper, but there are times when an argument needs to be made to not add a test, if it doesn’t answer either (and especially if it answers neither) of the two questions above.