Archis's Blog

September 17, 2011

The rise of context-free language

Filed under: philosophy, Science – archisgore @ 6:48 am

Here’s an intriguing thought. I have a super-intelligent friend (one of those whose guide is a Turing Award winner) who works on NLP. We have our occasional long phone calls where some topic or the other comes up for discussion. This time it was worth blogging about.

Quick overview – languages have rules, structures, etc. Sometimes, the rules become too complex, or at times, they are so specialized, they turn into a look-up table (i.e., not a lot of generalization.) Whenever you can’t generalize, you add entropy. Putting aside, for a moment, the poetic beauty of a language and the art of elocution, many rules are redundant.

Consider language as simply a tool, a means to an end rather than the end in itself, designed to express a thought. If so, the less ambiguity something has, the easier it is, and the better it solves its purpose. When one first begins to learn computer languages, or even when they think of “parsing” English, every single person that I know goes through the thought process above. Why not just have a language that isn’t as nuanced? Why not design a simpler language? Esperanto certainly came out of a need, but building out a complete new language may not have been the solution. It appears that the need is already being met by modification to English itself.

I am beginning to believe that the very efficiency computational linguists want in a simple-to-parse language is also the kind of simplicity the human brain wants. There is a certain idea you want to express. The nuances of whether I will do something, as opposed to whether something will be done by me, while undoubtedly helpful, may not be as necessary as we think. Facebook/Twitter are helping reinforce that idea. If you look at most non-proofread contemporary speech, it almost feels like a context-free language. It appears that what NLP wants, NLP may end up getting, simply because what makes NLP so hard is what also makes language itself so hard for most people.

Texting is the classic blatant example. Most texts are simply a gathering of words put together. There is a certain amount of context and syntax present to avoid ambiguity, but overall, the tools used to eliminate ambiguity are the ones that can do it in as blatant a way as possible, with as little subtlety as possible. Similarly, few FB/Twitter posts seem to be carefully crafted treatises, but generally just words that present an idea. The less context necessary for the idea to be parsed, the better it is communicated. Five years ago, a lot of “old school” people, including me, would complain of the utter lack of punctuation in sentences. Instead of adopting punctuation correctly, I found that people learnt to phrase their text in such a way that addition of commas and full-stops became unnecessary. A modern FB post is as decipherable without punctuation as it is with. That’s some creative adaptation, right there.

Another reason for this is search engines. Very rarely do you search for something like, “Give me movie times for today evening in Redmond.”

The same idea is expressed as simply as, “Movies redmond today”

Over time, it is not difficult to imagine this is how I might begin communication with a friend. Even the verb is implied and not explicitly stated! The parsing rules for this language are just ridiculously simple – tokenize the sentence, and you know what it’s saying.
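To make the "just tokenize it" claim concrete, here is a minimal sketch in Python. It assumes the query can be treated as an unordered bag of lowercase keywords, which is an illustration of the idea and not how any real search engine works; the names `tokenize` and `interpret` are made up for this example.

```python
def tokenize(query: str) -> list[str]:
    """Lowercase the query and split it on whitespace."""
    return query.lower().split()


def interpret(query: str) -> frozenset[str]:
    """Treat the query as an unordered set of keywords (an assumption)."""
    return frozenset(tokenize(query))


# Word order carries no meaning under this assumption, so these two
# queries are interpreted identically.
same = interpret("Movies redmond today") == interpret("today Redmond movies")
print(tokenize("Movies redmond today"), same)
```

Nothing here resembles a grammar: there is no verb to find and no clause structure to resolve, which is the whole point of the observation above.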

Then again, I’m not blaming the internet or machines for this phenomenon. I think it has simply been the first time that a large population across the entire earth is literate (even 30 years ago, when I was born, I knew plenty of people who couldn’t read or write.) Written language was, no matter how many people may dislike this, an elite privilege – and to some extent, an end in itself. When you are a club of a handful of people, you can end up in an ego-pissing match. What we might call spoken ‘peasant’ language was always utterly simple and efficient (though I find a lot of ideas I cannot express to them due to the lack of a vocabulary that can convey subtle differences.)

I’m not advocating anything here, but we have to admit that any complex and large system always tends towards reducing entropy over time. It does not mean literary art will have no appreciation, but it is an interesting thought. This would be an interesting hypothesis to test out, if only for the academic validity of the idea. Is modern human language finding a path towards reduction in the energy and ambiguity required to express an idea? Is it a dual-feedback loop where NLP systems are getting better with feedback, but also driving certain generalizations back into the human world?

September 1, 2011

On code reuse and maintainability

Filed under: Science, Technology – archisgore @ 12:32 am

A wise man once said, “Any procedural program, given a sufficient level of complexity, will end up implementing some form of Lisp.” (If you can’t find a paraphrase of that quote, then attribute it to me – but I’m pretty sure I read it somewhere 10 years ago.)

Today, we continue on the rant against frameworks, and look into code reuse and maintainability. Unlike regular posts, this is one area I’m not too sure about, and would love comments or counter-examples. Spare no punches!

What is reuse exactly? Syntax or semantics?

Let’s start with an example. Given a procedural language, I write a framework to do the following. Instead of the coder having to decide at coding-time what operation they want to do on a float, they can delegate it to my run-time operation-definition framework that allows data-driven dynamically loaded operations.
1. The code itself will look like this:
y = f(x)
2. The executable will add a configuration file for my framework that says:
<define function="f" definition="System.Math.Sin"/>

This one raises a lot of questions that I want to ask, but for now, I’d like to know, would you call this code reuse? I can certainly make great arguments for this style of coding (keep quiet functional-programmers, this one’s for someone who _has_ chosen a procedural language.) It allows me to change the definition of ‘f’ at runtime. I don’t have to worry if tomorrow my computation changes, because a simple config-change will make my code work for Cos, or Tan or whatever else the user needs. I can replace the definition of “Sin” to use a different implementation whenever I choose.
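For readers who want to see the pattern spelled out, here is a rough Python sketch of such a framework. Everything in it is hypothetical: `CONFIG` stands in for the configuration file, `ALLOWED` for whatever loader the framework would use, and `resolve` and `compute` are names invented for this illustration.

```python
import math

# Hypothetical stand-in for the configuration file: maps the name used in
# the code to the function that should actually run.
CONFIG = {"f": "math.sin"}

# Hypothetical registry of functions the "framework" is able to load.
ALLOWED = {"math.sin": math.sin, "math.cos": math.cos, "math.tan": math.tan}


def resolve(name, config=CONFIG):
    """Look up which implementation the configuration assigns to `name`."""
    target = config.get(name)
    if target is None:
        raise LookupError(f"Please define '{name}' in the configuration file.")
    return ALLOWED[target]


def compute(x, config=CONFIG):
    """The 'code itself': y = f(x), with f decided outside this file."""
    f = resolve("f", config)
    return f(x)


print(compute(0.0))                       # uses math.sin
print(compute(0.0, {"f": "math.cos"}))    # same code, different meaning
```

Note that reading `compute` alone tells you nothing about what it calculates; that information lives entirely in the configuration.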

For me personally, this is bullshit! It’s the worst kind of code I have ever had the displeasure of dealing with. Only the last argument made any sense, and there are ways around that. It is the most irresponsible style of coding too – the programmer, instead of taking responsibility for ensuring correctness of code, delegates every function out of their own scope. If you really want to do that, use a functional language already! People have been advocating them for over four decades now, and this is exactly the reason why! Stop contaminating my procedural code with a smarty-pants half-assed implementation of something for which robust implementations exist already. You can replace your code at runtime and any interpreter worth its two cents has a decent JITer. Semantically, what would be the difference in sending the interpreter a new file to interpret, versus changing configuration for a running program? y=f(x) is certainly not going to have bugs (and can be tested easily.) Your probable bugs are going to be inside ‘f’ anyway. So while your core ‘executable’ can be assured of being stable, it’s a false perception.

The problem with this snippet, functional language or not, is that instead of ensuring correctness, it actually reduces it.

For one, what you see above, is an example of syntactic reuse. You are reusing the syntax for making a function call. You may disagree with me on this, but to me personally, what is really valuable isn’t syntax reuse but semantic reuse. Implementing a good Sine function is damned difficult. That’s what I want to reuse. Calling into the Sine function isn’t what I worry about when I open up my editor. The correctness of my Sine function is what I want to reuse. If there’s a bug, and someone fixes it, I want the new Sine function. If I may ever need to use a Sine function from a different library/implementation, well, seriously – change references to the new library and recompile your code (I know I’m making some atrocious demands here.)

The second problem is the really serious one. When I’m writing code as y = f(x), what the heck am I thinking? I mean seriously. If I am writing a program, I write it for a specific purpose. If I’m computing some vector component across one axis, I know why I’m computing it along that axis. Which means, when I write f(x), I had better damned well know what that ‘f’ should be. If that ‘f’ is ever going to change to ‘g’, then that’s because my problem statement has changed. It alters completely what I am doing (two axes are never the same.) If I start computing Cos(x), it is very very different from computing Sin(x) and I would have serious justifications for why I want the Cosine now. I sure as hell don’t want to reconfigure a running program to do that. I may do a host of things with a running program – use a more accurate Sin implementation, use a faster Sin implementation. If I’m fundamentally changing the definition of the function, I’m in big problems from the outset because I’m changing what my code is guaranteed to do.

Copy-fidelity

I know a lot of people don’t consider the fidelity their code preserves when it is xcopied from one place to another, but I assign it a very high value. The problem with the above snippet, is that 9 out of 10 times, someone’s going to only pick up the code file, without caring what the configuration file is. This is certainly not unreasonable, regardless of how senior or experienced you are. If you see a file called “eigenvalues.java”, you think to yourself, “Hmm…. maybe I should copy eigenvalues.java and use it to compute my eigenvalues.”

I find nothing wrong with this thinking. Very soon though, you see a compiler error: “Function ‘f’ not found.” You spend a couple of days (and if you’re lucky, you’d find a comment) figuring out that you have to host this class file in another loader called ‘function-replacement-framework.jar’. No big deal you say, I love myself some helper tools! This is when you go mad. “functionReplacementParserError: Please define ‘f’ in the configuration file.”

Now you are, in effect, figuring out how to compute eigenvalues so you can define what ‘f’ should be. I’m sorry, but you really should NEVER have to write a config file to define _what_ your code does! Regardless of how much computer-sciency stuff your parser and interpreter are doing, and how they’re generating binary classes at run-time which are processor-optimized by the JITer, this is some pretty bad design there.

Maintainability

So I’ve been thinking about this for a few weeks now, and want opinions on it. I came to a good working definition that I think I’m going to use for a while in the near future. I draw this definition from how real-time systems are defined.

A brief overview for the uninitiated – while the common-sense notion of real-time systems is ‘they’re really really fast’, the working definition is ‘time-deterministic’. Meaning that, given what they do, they must do it in deterministic time – meaning, they have to guarantee that something will happen at (‘by’ if you’re soft-realtime) a specific time. Even in light-hearted situations such as filling ketchup in a bottle, you need hard time-determinism, to ensure the bottle is under the nozzle ‘at’ a certain time, not ‘by’ a certain time (if it passes from under it too early, you’ve got a mess to clean up.)

In a similar fashion, I was wondering if maintainability is “doing less work”, or “ensuring correctness even if it is more work.” If I had to make the choice between say…
1. Being able to replace all definitions of “Sin” with “Cos”, using a single config change, which is admittedly a lot less work, but depends on the hope that everyone has taken care to handle the cases where x=0,
2. Or replace Sin with Cos in all the code manually, which is a lot more work, but you ensure that that specific place really is worthy of a “Cos” function as per your replacement-intention. This would guarantee determinism in correctness, but increase work significantly.

If the above sounds unlikely I can certainly come up with more concrete stuff. We frequently use stdout and stderr to send output of a program. Would you prefer to control what goes where implicitly, or would you prefer to modify code yourself? (There’s a difference between code consciously having an if-then based on a configuration parameter, vs. your code just having fprintf(outstream, "<stuff>") where you don’t know during code-time what outstream could be, and can’t function without a complex config file.)
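The two styles can be put side by side in a small Python sketch. The function names and the `warnings_to_stderr` flag are invented for illustration; the only real APIs used are `print(..., file=...)`, `sys.stdout` and `sys.stderr`.

```python
import io
import sys


def warn_explicit(message, warnings_to_stderr, out=None, err=None):
    """Explicit style: a visible if-then decides where the warning goes."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    if warnings_to_stderr:
        print(message, file=err)
    else:
        print(message, file=out)


def warn_implicit(message, outstream):
    """Implicit style: the destination is whatever was injected from outside."""
    print(message, file=outstream)


# Small demonstration using in-memory streams instead of the real ones.
fake_out, fake_err = io.StringIO(), io.StringIO()
warn_explicit("disk almost full", True, out=fake_out, err=fake_err)
warn_implicit("disk almost full", fake_out)
```

In the first function a reader can see both possible destinations and the condition that chooses between them. In the second, the code is shorter, but the answer to "where does this go?" is somewhere else entirely.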

I intentionally chose the stderr vs stdout example because it is vague. I can see it from both ways. In some cases I am conflicted as to whether a warning should go to stderr or stdout. It depends on what kind of tools are going to capture the output, and how they may want to parse/interpret it (some tools, for instance, may consider anything being spit to stderr as indication of program failure.)

How would you define maintainable (or is it maintenable) code? How do you draw the line between abstraction of details vs. core purpose of the tool, which is what makes the tool what it is? Does configuration, at some point, become complex enough that you’re really just programming in a declarative language through your config files, whereas your “code” is simply an interpreter at that point? If so, is it configuration any longer? If declarative code is more maintainable, why not use a declarative language from the ground up?

June 21, 2011

Fact

Filed under: philosophy, Science – archisgore @ 1:14 pm

This one comes from long introspection. Over the course of my career, I’ve been involved in sufficient debates and arguments where a misinterpretation of semantics has led to difficulty in communication. I try to be very precise in choice of language and generally demand that others do the same. Any student of mathematics ought to appreciate the value of accurate semantics, and interpretation of data. However, not everyone is a mathematician, and predicate logic isn’t for everyone.

Last weekend I had the opportunity to meet someone whose opinion I value highly, and someone I hold in high regard in terms of intellect. Those discussions helped me formulate this with precision.

Regular readers will know that I frequently do change my opinions and definitions, or the premises upon which I build arguments. Therefore, if this conflicts with any previous posts, I would be glad to have that pointed out to me, as a reminder that I am in fact, learning.

I’ve been in plenty of debates where I try to get my opponent to state their premises. In many cases, they find me rude for interrupting them, because when I find a premise inaccurate, flawed, or misstated, I feel the need to get that clarified (and I foresee the need to write a post on how a good premise build-up could make arguments efficient.) In many cases, premises are assumed to be facts (or at least, in a good argument they should be.)

So what is fact? I’ve heard plenty of opinions. I’ve heard people say, “What is fact for you, may not be fact for others.”

I think it’s worth dedicating a paragraph to clarify that fact is not Truth. Truth is best left for philosophers, and perhaps even theologians. While it is common that mathematical truth is based on fact, we shall not open that can of worms here.

Going back to what fact is. I must heavily insist that I disagree with the comment, “What is fact for you, may not be fact for others.”

For one, if that were true, there would never be progress in the debate, and there would be no point in holding a debate. As I understand it, a debate operates on information that is an order higher than fact. It operates on two things: Interpretation of fact, and Opinion. I won’t go into analyzing opinion here.

After various attempts at defining fact, I arrived at a definition that fits the need quite snugly – Fact is observation (as in a scientific experimental observation.) There is a reason fact may not be truth – and that’s because the observation process may contain a flaw.

First, let’s look at whether it satisfies the rigorous requirements of building a premise.

1. An observation is, to put it in terms of signal processing, an ‘impulse response’. An observation is the effect a process has on a quantity you can measure. An observation is, whether correct or incorrect, indisputable by parties in an argument. Let me explain. If we were debating why a certain number of jobs are reduced, we may disagree on the reason of decline, however, our observation cannot change.

2. If the observation does change based on who is making the observation, it is all the better. Now we reach the very depths of predicate logic. The reason some arguments seem to go around in “circles” is not because two parties have incompatible observations, but because different parties have incompatible processes to make those observations. The power of defining fact as an observation comes into forcing both parties to agree on the process used to make the observation. You may argue that such deep definitions are irrelevant for a debate. I disagree. I argue that almost all debates are very simple and can follow simple First-Order-Predicate-Logic if we agreed on our observations. The real debate is in the details – can we agree on observation? Here we reach an opinion (and my job ends.)

So why this clarification? I strongly believe, and have observed, that many of the truths we cling to depend on our personal convenience (misquoting Obi-Wan a bit.) We try to appear to debate on a higher-order plane because it is a convenient arena where interpretations and opinions can be inserted without premises.

You can’t really disagree with observation. You can disagree with the process used to make the observation. A classic example is clinical trials. Common arguments about drug-approval testing go, “While the FDA thinks the drug is unfit for use, they don’t know the whole story.”

The fact of the matter is, the FDA isn’t interested in the whole story. They know the story they care about. Their process is well-documented and available for peer review. It is open to criticism and suggestions for improvement. Whether or not that process is followed for a certain clinical trial is also a matter of opinion (and to a large extent quantifiable.) However, having raised no objections to the process of the trial, nor any concerns regarding its execution, there really isn’t a different observation you can arrive at, and therefore, the fact cannot change for you.

A final question I want to address is, “Is observation good enough?” Observation is the “effect” a process is having on a measurable quantity. That is what I meant by ‘impulse response’ in DSP terms. Observation is all we really have. Almost all of science depends on observation. Do not misinterpret observation as ‘seeing’ (as in, ‘to see’.) Observation is quite literally measurement of an effect. When we observe gravity, we observe the effect gravity has on an object (change of force, shape, etc.) Even if you strive for truth, you will end up at a dead-end – observation.

June 2, 2011

Astrologers…

Filed under: Entertainment, philosophy, Science – archisgore @ 2:10 am

Did I spell that right? I’m supposed to write a document that’s going to take 2 hours, and I’ve pushed it too far. Good time to get all my thoughts out to the world one at a time. Today, let’s rip on Astrologers a bit.

To give you an idea of the motivation, I picked up a hilarious book at the airport during my last India visit called “Am I a Hindu.” That’s going to lead to a few posts, but you’ll have to wait until the next time I run out of things to do, and face the inevitable document-writing task. Today, I began reading this book to take my mind off some blocking issues. I had read a part of it during my flight, and I recalled an emotional rollercoaster between humor, apathy and perhaps anger (or annoyance). I’m giving you this context because this post is regarding one argument that book made (I’m willing to discuss other arguments.)

Let’s get this out of the way – lack of disproof is not a proof. I’d love to talk to anyone who believes that isn’t true (that was sarcastic; if you think lack of disproof is proof itself, I probably don’t ever want to speak to you in my life). The Indian Government proclaims it’s a science, and I claim I’m king of the world. Neither of the clauses is relevant for this discussion. A common argument we hear from pro-Astrology people is, “Why is it so difficult to believe stars could affect the physical processes within you?”

It’s not difficult to believe at all. I never claimed a remote planet doesn’t have gravitational influence on me. I’m claiming you’re full of crap. When I rip on Astrologers and Prophesizers, I’m making fun of them. I’m claiming they’re full of bullshit. It’s about them! That’s as direct as I can say it (offense intended). I have no problem believing that we might be able to model those interactions, and what the result of that influence would be. I’m not saying it can’t be done. I’m saying you’re not the ones doing it.

“If people can predict weather, why can’t people predict fortunes?”

We could extend this argument infinitely – If people can predict weather, and people can predict fortunes, why can’t people predict when we’ll get a man on Mars? If people can predict weather, the stock market, the next Tsunami, and some Earthquakes, sure, it may be possible to predict fortunes too. Doesn’t change the fact that you’re not the one to do it.

I think the modern Astrology debate has gotten too impersonal. Perhaps we’re trying to be too politically correct, or the Astrologers are just better at reframing the problem than we are at noticing that it got reframed right under our noses. I believe in open-heart surgery, however, if you’re any one of the people reading this blog, I can safely say I’m not letting you come anywhere near my heart. If you tried to convince me, I’d find it mildly humorous and highly annoying. It’s the same with Astrologers – science doesn’t deny modelling. Modelling is a fundamental tenet of Science. What we’re saying is, you’re no good at it, and that you have no idea what you’re talking about.

It’s true that all models are merely approximations. That’s why in addition to predicting weather, weather-predictors are also constantly ‘learning’ from the outliers. They’re on the search for new variables, and better sampling methods. I’ve not seen major publications that have indicated the discovery of any new variables or processes, or models that provide a better fit than what historic Astrology demonstrates. Even if that’s accomplished, a theory that does not demonstrate a prediction record significantly higher than a random process is not considered a theory at all. Are astrologers willing to submit themselves to a controlled experiment where they can demonstrate their predictors are any better than a random predictor?
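One simple way to score such an experiment is a one-sided binomial test, sketched below in Python. The setup is an assumption made for illustration: each prediction is either a hit or a miss, a random predictor hits independently with probability `chance`, and the function name `p_value_at_least` is invented here. This is only one of several reasonable ways to analyze such a trial.

```python
from math import comb


def p_value_at_least(hits, trials, chance):
    """Probability of scoring `hits` or more out of `trials` by pure chance.

    Assumes independent predictions, each correct with probability `chance`
    under the null hypothesis of a random predictor.
    """
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# Hypothetical scenario: 60 correct yes/no predictions out of 100.
print(p_value_at_least(60, 100, 0.5))
```

A small value means a random predictor would rarely do this well; a large value means the record is indistinguishable from guessing. The threshold for "significantly higher" should be agreed on before the predictions are made, not after.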

Hence the title of this post – ‘Astrologers’. Don’t make this about the “Science of Astrology”. I don’t claim it won’t work. This is about you – I claim you don’t work.

April 1, 2011

Making a Brain-Computer Interface

After five years of consistently blogging, and consistently failing to do something about it, the BCI-building has begun. The academician in me needed an outlet and I’ve been craving for something hardcore technical for a while now. So begins the first formal attempt at building a cheap home-made BCI. I’m going to label all entries so that it will be easy to follow progress on this.

Unlike my regular posts which are composed with at least some thought, this series of entries will be more like a journal. I’ll post entries with what I do, what methods I follow, everything I try. With any luck, there should be enough detail that anyone could replicate what I am doing with full fidelity. The attempt at any original research is not even a long-term goal. The goal here is to, and I say this as directly as possible, with no misconceptions whatsoever, and no subtext, “to have fun.”

If you have any interesting theories, experiments, I would love to hear your thoughts. If you’d like to participate, you’re welcome to join. Most followups will be brain-dumps of my thoughts – unedited, raw, and naive. A side-effect of this blog is to also demonstrate any points of failure, or stuff that doesn’t work. At this point, I have nothing working. I don’t want to claim that I knew everything, in case I succeed.  I won’t use the excuse that I never wanted to succeed in case I fail. I want to make this work, and even if it doesn’t, it won’t change the fact that I still wanted it to work. If I make a mistake it’s going to be published, since I will try and publish what I intend to do before I do it as a validation.

At the moment, I picked the PocketEEG from PocketNeurobics (apparently an Australian company) to get started. I’ve been out of this world for a while (four years), and it takes a long time to catch up on the IEEE Journal of Biomedical Engineering where most BCI work is (used to be?) published. The WaveRider system is a clinical device but it costs too much for me at the moment, but if this device fails, I’ll save up for the WaveRider and go with it. It’s a two-channel device.

I haven’t got the electrodes yet, so haven’t taken any readings, but plenty of work has to happen before the electrodes become relevant. The recommended software to be used with it is BioExplorer which I think costs too much. I instead tried to connect the open-source BioEra software. The UI takes me back to the good old college days before the polished world of iPhones and iPads. It has a way to define the processing pipeline but I rather hate doing it graphically. The dongle does provide a fake serial port on your PC that you can read from, but I didn’t want to go through that much trouble. I intend to use BioEra to capture the signal and send it across a local TCP channel into a server I’ll write that does what I want it to. BioEra seems to support plugins, but I love the flexibility of having my own executable as opposed to being “hosted in” another exe.

This pipeline investigation should take until the weekend, and hopefully I’ll have it coded over the weekend. If I’m lucky, and the electrodes arrive, I’ll be sure to post some recordings of motor activity of right and left hands, captured from the C3 and C4 points at the beta band.

The reason I don’t want to use the provided FFT block is because I much rather enjoy playing with parameters initially. Do I want a sliding window at every point, or will I perform it on intervals? If I intend to do something like auto-regression, I’d much rather use my own buffers and optimize the pipeline to operate on them. I’ll post what happens.
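The sliding-window versus interval question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 128 Hz sample rate, the 64-sample window, and the 13 to 30 Hz definition of the beta band are choices made for the example and are not taken from the device or from BioEra, and the plain DFT below is a slow stand-in for a real FFT.

```python
import cmath
import math


def windows(samples, size, step):
    """Yield windows of `size` samples; step=1 slides at every point,
    step=size gives disjoint intervals."""
    for start in range(0, len(samples) - size + 1, step):
        yield samples[start:start + size]


def band_power(window, sample_rate, low_hz, high_hz):
    """Sum squared DFT magnitudes over bins whose frequency is in the band."""
    n = len(window)
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if low_hz <= freq <= high_hz:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(window)
            )
            power += abs(coeff) ** 2
    return power


# Assumed parameters, for illustration only.
SAMPLE_RATE = 128
signal = [math.sin(2 * math.pi * 20 * i / SAMPLE_RATE) for i in range(256)]

per_interval = [band_power(w, SAMPLE_RATE, 13, 30) for w in windows(signal, 64, 64)]
sliding_count = sum(1 for _ in windows(signal, 64, 1))
print(len(per_interval), sliding_count)
```

The trade-off is visible in the counts alone: sliding at every point produces one estimate per incoming sample at a much higher compute cost, while disjoint intervals produce far fewer estimates with more latency between them.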

December 26, 2010

Mythbusters’ method of derivation by first-principles

Filed under: Entertainment, Personal, Preaching, Science – archisgore @ 1:20 pm

I’m the type of person who loves deriving from first-principles and one who admires people who like to do the same. This post goes in honour (British English, people – I come from an ex-colony) of the Mythbusters.

To figure things out, to derive things when no knowledge exists is a concept that seems rare today, and yet I’m sure it has been rare for as long as humanity has existed. It is simply observation bias that made me believe the Renaissance period was any better than today. I didn’t read about all the billion people over the world who didn’t do anything while Da Vinci was doing something. Science and Technology ‘exist’ just as a lot of other things.

Most people know things – they don’t find out things, or learn things; they just know things. I came back from a road trip an hour ago on a route that everyone knew had no places to stay or eat, and yet I stayed in warm lodgings, clean beds, and ate some of the best American food in 20 years. We know toast is made by heating bread. We know we’re supposed to go ‘ahhh’ when we eat French or Italian food, and we know bread can’t be made any other way because – well wouldn’t we know about it already? Red wine is better than White wine. Gas pumps have gas, because… they just do, don’t they? Two drinks are never dangerous for driving because I always have control.

Unfortunately, a lot of science education programs also follow this pattern. There exists the earth. It is round. We live on it. It revolves around the sun. I literally don’t know a single person (including myself) around me who can devise a simple experiment right now on the spot to test whether or not the earth is round. I went through five years of college being told Knowledge (no, not a grammatical error there, I was literally told Knowledge – as in a proper noun).

The one thing that really defines the Renaissance was the spirit of individuality and discovery. Leonardo didn’t make ‘great’ paintings, as if God had said, “Let there be a definition of great paintings that humans can aim for. There was hence a definition for what maketh paintings great.” Leonardo made paintings – they were appreciated. Others couldn’t make much better than his, and his paintings obtained value. The renaissance evolved and nurtured the process of independent thought and opinionated thinking (two things I value most.)

I wrote once before about how a process (also called a model) is what defines everything about science, and perhaps what defines science itself versus… well, lack-of-science. I can’t be more precise than that because process is all-encompassing. String theories don’t define one outcome, but define a process by which outcomes for all situations can be predicted. ‘Solutions of equations of the n’th degree’ in mathematics are really the processes used to solve any system of any number of equations with any number of variables of the n’th degree. As a child I had the opportunity to read some interesting books by 20th century scientists, and one difference I noted from modern populist writing is their emphasis on their line of reasoning, their attempts at scientific enquiry, the setbacks, the necessity for designing creative experiments to test hypotheses.

For the last three years, I’d been trying to figure out just what makes me such a mad fan of the Mythbusters, and the answer is that they are more old-school scientists than many I have met in my life in a university. Of course one does chance upon those rare inquisitive individuals who want to know, but they are few and far between. I must say that the Mythbusters remind me of some of the influential people from my past who made me who I am today – people who genuinely wanted to find out. I will put this out there – Adam and Jamie are two of the very best science teachers that exist on earth today, and the reason is precisely because they are not scientists (while that’s clearly not true, we’ll go by their claim for now.)

They love to discover. They love to figure out. Sure, you’d say, why figure out what’s already known? If you really just said that, then you don’t know squat! :-) To design an experiment to test a hypothesis is a complex task, heck there’s a whole specialization one can study in design of experimentation. Designing an experiment for a theory that cannot be easily tested rarely happens through dreams, no matter how much we want to believe that that’s how we’ll get rich some day. It comes through practice. Let’s be honest, half the things Adam and Jamie test are not known – sure we can make an educated guess at them, but we don’t know them do we? Chickens are not spheres with point mass.

The Mythbusters teach true, pure science, while selflessly claiming they’re not scientists. They derive from first-principles. Instead of assuming chickens are spheres of point-mass, they start with chickens as chickens, and spherical-masses as spherical-masses. If the two being shot out of a cannon demonstrate the same result, Adam goes, “Hmm… Jamie, what if we replaced our chickens with small spherical balls?” (such a thing has not really happened on the show, I made that up.) This casual remark teaches tons more science than all of high-school physics put together. It demonstrates how generalizations come to be in the first place. What the phrase, ‘without loss of generality’ means. What substitutions are allowed. How experiments must be broken down. How do you discover theory in the first place?

Deriving from basics is one of the key overlooked abilities of this decade. Yes, we know the earth is round. We know we can go in space and figure it out. We also know that some ancients figured it out long ago. Most readers of this blog, I’m sure, are at least self-styled techies who are ‘in the know’ about all things technology. I doubt there’s anyone who can come up with an experiment to test the veracity of that hypothesis right now without leaving this page. That’s the kind of stuff the Mythbusters do daily. Some of the tests they are asked to conduct are impossible to imagine being tested. It is like a classic Sherlock Holmes mystery – when you know the answer, it’s obvious, it couldn’t have been anyone else! I’ve been racking my own brains for the last hour trying to figure out how, given that I don’t even know what “The earth” is, I would attempt to figure out its shape. I’ve had formal education in high-school physics.

They also follow a pre-declared results-based experimentation process. A lot of experiments in my high-school physics were dead-on in conclusions, but they never defined what a set of outcomes would have implied before the results were described. The Mythbusters’ approach is truly scientific. They would first ask: Why do you decide that you want to put an apple in liquid nitrogen? Then they would define what each outcome would imply: Suppose it were to come out soft, what would that tell us? Suppose it were to come out hard, what would that tell us?

Objectivity is very hard to learn – and is a constant struggle. We all hope our very first outcome is favourable, and it rarely is. Data is manipulated, conclusions are creatively worded, because the results don’t quite imply what we expect them to. The Mythbusters are not just unafraid to fail; heck, they love to fail! Almost every other episode they are proven wrong. They love it! It doesn’t get any purer science than that!

If today’s kids are going to break new barriers, then they must have the ability to derive from first-principles – from the very basic axioms. This, however, must be done without compromising clear and hard science. Plenty of out-of-the-box thinkers who promote unlearning what universities teach get too carried away in philosophy, spirituality or just plain stupidity. Deriving from first-principles never causes you to unlearn what you have learnt, but rather causes you to confirm what you have learnt. If you were to put an apple in liquid nitrogen, no matter how out-of-the-box you are, it must have the same results as anyone else doing it. If not, you’ve hit upon something and must find out why.

I’m glad President Obama recognized such brilliant men who love to discover and figure out. It is heartening to see them teaching principles of science (and I know secretly that they too know they are following the scientific method) without making it ‘science’. Under the casual tone of ‘obvious necessary steps’, they are secretly teaching some very fundamental methods of scientific enquiry that took me years to learn.

EDIT: Some people just asked me, why this is important. We come across people treating simple problems as if they were obstacles created by God, or treating solutions as if they just came into existence of their own accord, without appreciating that there was some human being who developed that solution. If one cannot appreciate that, then one can never look at current problems as solvable, since they inevitably ‘exist’. Brings to me frightful visions of the Eloi.

August 19, 2010

Polynomial approximations of neural networks

Filed under: Science — archisgore @ 11:23 am

Still in draft-purge mode, so lots of older shit from five years ago is coming out now. This post particularly makes no sense whatsoever anymore, partly because the neural network scene is much more mature today, and AI isn’t the sexy-new-thing grads look forward to. But…. since I had this written, wanted to go ahead and throw it out there. I must say I am very proud of the attempt even today, even though it was an embarrassment in the end.

Back in college I was bigtime into signal processing and brain-computer interfaces. The biggest challenge in such a system is data-filtering. You end up with a ton of data per second, and your system has to find the right data to process at the right time. To give you an idea (and this may be outdated stuff, since today people use ECoGs and fMRIs), an EEG has about 64 channels, each sampling at about 100 Hz (assume 8-bit accuracy of each sample – in whatever units – usually microvolts).
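As a back-of-the-envelope check on those figures, here is a minimal Python sketch of the raw data rate. The function and parameter names are illustrative only; the numbers are the approximate ones quoted above (64 channels, about 100 Hz, 8 bits per sample).

```python
def eeg_data_rate(channels=64, sample_rate_hz=100, bits_per_sample=8):
    """Return (samples per second, bits per second, bytes per second)."""
    samples_per_second = channels * sample_rate_hz
    bits_per_second = samples_per_second * bits_per_sample
    return samples_per_second, bits_per_second, bits_per_second // 8


SAMPLES, BITS, BYTES = eeg_data_rate()
print(f"{SAMPLES} samples/s, {BITS} bits/s, {BYTES} bytes/s")
```

Under these assumptions every second delivers 6,400 new values spread across 64 channels, and the filtering problem is deciding which of them matter at any given moment.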

Obviously there are a lot of conventional tools to process this data (Principal Component Analysis, Independent Component Analysis – which wasn’t of much help since it turned out there is very little cross-talk between channels, Support Vector Machines, etc.) But who in college is content with what we have? I secretly hoped the data had non-linear components (which it didn’t – all non-linear methods gave about 10% more accuracy than linear ones, and it seemed a lot like the reason was memorization rather than a better fit.)

So I had this crazy idea for sensitivity analysis. I had a well-trained neural network, and while the old method of branch-and-bound on the input vector to check variance in the output is well known, I somehow wanted to model it in an equation. The reason: through sampling, you can find out that the answer is very sensitive to dimension X1 but not so sensitive to dimension X2. However, I wanted to know which Xi’s were most important, which weren’t, and by how much. It’s the same concept you use in PCA – you want to pick ‘r’ channels out of ‘n’ channels so that you still get, say, 90% accuracy. PCA gives you the contribution index of each channel to the final output. The idea is a trivial extension of the sampling method above, but simply substitutes polynomials in place of the logarithmic/exponential functions used.
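For comparison, the plain sampling approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_network` is a made-up stand-in for a trained network (its weights are invented), and the sketch does a simple one-dimension-at-a-time sweep rather than a full branch-and-bound search.

```python
import numpy as np


def toy_network(x):
    """Stand-in for a trained 2-layer tanh network; weights are invented."""
    w1 = np.array([[0.8, 0.05], [-0.6, 0.02]])
    b1 = np.array([0.1, -0.2])
    w2 = np.array([0.7, -0.5])
    return np.tanh(0.05 + w2 @ np.tanh(w1 @ x + b1))


def sampled_sensitivity(f, base, lo=-1.0, hi=1.0, samples=21):
    """Variance of f's output as each input is swept over [lo, hi] in turn,
    with all other inputs held at their base values."""
    base = np.asarray(base, dtype=float)
    sweep = np.linspace(lo, hi, samples)
    result = []
    for i in range(base.size):
        outputs = []
        for value in sweep:
            x = base.copy()
            x[i] = value
            outputs.append(f(x))
        result.append(float(np.var(outputs)))
    return result


SENSITIVITY = sampled_sensitivity(toy_network, base=[0.0, 0.0])
print(SENSITIVITY)  # one variance per input dimension
```

This tells you which dimension the output reacts to most around one base point, but not how the dimensions interact or by how much in closed form, which is the gap the polynomial idea tries to fill.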

Enter the brand new idea of sampling each neuron at ‘n’ intervals in its range, and using a curve-fitting method to generate an ‘n’th order polynomial for each neuron, and then injecting that into the next neuron it feeds. Here’s the idea:

1. Each input’s domain is determined. Whether it’s an Int32, or Int64 or double or quad, you know the exact bounds of the set. Therefore, whatever function you use (in my case TanH), and since backpropagation requires the function to be continuous and 2nd-order differentiable, you know that its range is finite and well-known. This range feeds into the next neuron, so it applies to all neurons.

2. Since most used networks (most used by me) were 2-layer, it was a tractable problem.

3. What you do is, you sample each neuron in the first layer at ‘n’ evenly-spaced points in its domain, and you compute output-values. Then you fit a polynomial (Spline, Bezier, whatever) to mimic that exact shape (you don’t do this earlier, because this is a fixed graph that can’t be “trained” anymore using backpropagation).

4. You feed in the polynomial as a symbolic variable into the next neuron’s polynomial too, and then simplify the result.

5. What you would get is an n^2 (2 being the number of layers) order polynomial for each output in “m” unknowns (‘m’ being the number of input variables).

Based on the constants and exponents for each variable, you can judge what it contributes to the output. It felt like a good way to quantify in relative terms and symbolically, the importance of each variable, and to find which channels/dimensions could be dropped entirely without losing significant accuracy.
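The five steps above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the original implementation: the network weights, the helper names (`fit_tanh`, `affine`, `compose`, `importance`) and the dict-of-exponents polynomial representation are all invented for illustration. The fit is a least-squares `np.polyfit` over more sample points than the polynomial order, and the importance score (sum of absolute coefficients of the terms containing a variable) is just one plausible way to "judge what it contributes".

```python
import itertools
import numpy as np

# Hypothetical 2-input, 2-hidden-neuron, 1-output tanh network (weights invented).
W1 = np.array([[0.8, 0.05], [-0.6, 0.02]])
B1 = np.array([0.1, -0.2])
W2 = np.array([0.7, -0.5])
B2 = 0.05
M = 2        # number of input variables, each assumed to lie in [-1, 1]
DEGREE = 5   # polynomial order used for every neuron


def fit_tanh(degree, lo=-1.5, hi=1.5, samples=41):
    """Sample tanh at evenly spaced points and fit a polynomial to it.
    Returns coefficients ordered from the constant term upwards."""
    z = np.linspace(lo, hi, samples)
    return np.polyfit(z, np.tanh(z), degree)[::-1]


# A multivariate polynomial is a dict: exponent tuple -> coefficient.
def poly_add(p, q):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0.0) + c
    return out


def poly_mul(p, q):
    out = {}
    for (e1, c1), (e2, c2) in itertools.product(p.items(), q.items()):
        e = tuple(a + b for a, b in zip(e1, e2))
        out[e] = out.get(e, 0.0) + c1 * c2
    return out


def affine(weights, bias):
    """Polynomial for bias + sum_i weights[i] * x_i (a neuron's pre-activation)."""
    m = len(weights)
    p = {(0,) * m: float(bias)}
    for i, w in enumerate(weights):
        p[tuple(1 if k == i else 0 for k in range(m))] = float(w)
    return p


def compose(coeffs, p, m):
    """Substitute polynomial p into the 1-D polynomial coeffs (Horner's rule)."""
    zero = (0,) * m
    out = {zero: float(coeffs[-1])}
    for c in coeffs[-2::-1]:
        out = poly_add(poly_mul(out, p), {zero: float(c)})
    return out


def poly_eval(p, x):
    return sum(c * np.prod(np.power(x, e)) for e, c in p.items())


def network_polynomial():
    act = fit_tanh(DEGREE)
    # Step 3: replace each first-layer neuron by its fitted polynomial.
    hidden = [compose(act, affine(W1[j], B1[j]), M) for j in range(len(B1))]
    # Step 4: inject those polynomials into the next neuron and expand.
    pre_out = {(0,) * M: float(B2)}
    for v, h in zip(W2, hidden):
        pre_out = poly_add(pre_out, {e: float(v) * c for e, c in h.items()})
    return compose(act, pre_out, M)


def true_network(x):
    return np.tanh(B2 + W2 @ np.tanh(W1 @ x + B1))


def importance(p, m):
    """Sum of |coefficient| over all terms in which variable i appears."""
    return [sum(abs(c) for e, c in p.items() if e[i] > 0) for i in range(m)]


POLY = network_polynomial()
print("max total degree:", max(sum(e) for e in POLY))
print("importance per input:", importance(POLY, M))
```

With inputs confined to [-1, 1], every monomial is bounded by 1 in magnitude, so the summed absolute coefficients give a crude upper bound on how much the terms containing a variable can move the output. A variable whose score is tiny relative to the others is a candidate for dropping.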

Frankly, if I worked with ANNs ever again, I’d still go this route, especially for problems where the nonlinearity is unknown and you find an ANN that gives you a good fit. Of course, if I worked with ANNs again, I’d catch up on where the world is today, but that’s another matter. :-)
