Archis's Blog

June 8, 2008

College project idea: Failure-tolerant systems enabled by "sync"

Back in the days of DOS, shutting down a machine was merely about turning off the power, and at times, directly the mains. One didn’t have fancy file systems that needed maintaining consistency and atomicity. If something was bad, you just ran chkdsk or scandisk to fix it. Consistency was maintained very optimistically – we just assumed the best and prepared for the worst. If files got lost, we would have xcopy’d backups elsewhere and we’d compared the whatever little was saved against our backup and do a merge….. sounds very familiar doesn’t it?

Then came relational databases, normalization, concurrency control, etc. All kinds of fancy computer science was dedicated to complex programming techniques and systems that were debugging nightmares. As if designing and maintaining a system that acquires locks over time, then makes changes, and rolls back failed transactions wasn’t fun enough, we went ahead and created distributed versions of those systems. Either everything was in realtime, synchronous and consistent, or nothing was happening at all. But this can go only so far – trying to maintain redundancy and failover in such a complex system isn’t just non-trivial, but impossible on the scale the web today requires.

Then again, all the million people on your site don’t really operate on the same data at the same time – and perhaps such tight locking across all resources wasn’t only unnecessary, but also inefficient. Perhaps you could have two broken databases that would periodically exchange information with each other about the changes they underwent in the meantime. After many years of research in locking and concurrency control, and write-ahead logging, and such stuff, we ended up with the same model we used in the early 80′s – sync!

It’s not only a lot simpler to implement, maintain, fix, patch, test, but also a lot simpler to understand and use as well. Moreover, every user, every device, every application, has it’s own version of their personalized history. It allows every user to view what’s going on in the world from their own point of view (Obiwan Kenobi, et al).

Yesterday evening I met a friend who is in India for a couple of months and the HoD of Fergusson College’s CS Dept over dinner. We began discussing the kind of cool projects kids could do for fun over the summer, and Atul mentioned beginning by creating a simple MPI cluster out of all machines in the lab (for the uninitiated, for kids in B.Sc. computer science, it would be a first in the history of the course.) Since we’ve both worked-on/interested-in loosely coupled massively scalable distributed architectures (and my project at ISSC was to attempt to build one called YODA), we all began pushing our agenda (before you assume anything, “sync” wasn’t really my original agenda).

The outcome came to be something so awesome and cool that we’re meeting up on Monday (16th of June 2008) to discuss something constructive. As with anything I do, I’m on the lookout for enthusiastic crazy kids who want to do something fun just for the heck of it.

Mrs. Page, the HoD, wanted some real tangible experience for her students. I still have the itch to complete YODA to fruitition – I still believe it was the most awesome project I ever worked on with the best project group in my life with a lot of potential. Atul has his interests in high-perf computations for language processing (the perfect candidate for MapReduce).

The idea is exactly what YODA was all about, but perhaps at a less ambitious level – handing off the network/node/data management jobs to MPI (yes, we take the hit to have a geographically localised cluster). To have a thin “loading” framework on top (my agenda) so that you can submit “jobs” (concepts stolen from BOINC) which may be of the MapReduce type (Atul’s interest), but with one major twist to it – instead of the conventional master-slave topology, why not go for a peer-to-peer sync topology? (perhaps working on Live Mesh has made me a religious sync fanatic)

Look at it this way – today, MapReduce/BOINC keeps working if any/all worker nodes go down. What happens if the entire data center went down? Including all your master nodes? The last few months, if Ray Ozzie has taught me anything, it is that you can add all the UPSes you want and all the security guards you want, but the only way to truly protect data is to simply replicate it as far and wide as possible. Its cheap, it’s effective, and it’s simple. GFS (Google File System) claims to do it. Then why not do it for compute-intensive tasks to share merge results?

Let’s say a BOINC-like cluster where every machine is a peer. You “inject” your task onto one of these machines and based on availability of bandwidth the task starts replicating across all machines like a torrent (the parts that have not gone out before, go out first). When any node finishes it’s processing, it simply “sync’s” the results with any peer nodes that may have completed results. When it has any meaningful results that it can merge (“Reduce”), it does so, or it acts as a proxy to sync those results further through the swarm. (You may notice this follows a lot of principles of git as well.)

Instead of being “assigned” a workunit by a master, each peer simply picks up a random workunit out of the ones it currently has but has no recorded solution for. When it gets more than one unit that is mergable, it merges them. This is like dynamic-programming on stereoids. If the bittorrent implementation is done right, each node should have as minimum overlap as possible, and yet, we do need some overlap which will be a function of the design as opposed to any deliberate actions by the master. (The overlaps are used to validate results in case something went wrong.)

Just imagine if your college lab turned into a swarm of independently operating units who are sync’ing with each other and processing results, and when you go at any node in the system, you have part of the answer. Depending on how long you give it, every node will have the final answer in the end. You can naturally optimize by assigning a “special” result node which periodically gets partial results from all boxes in the swarm and thereby create a weak master.

The good part is, if you have multiple problems to solve you just go on injecting them into any node you have accessible. Over time, you go to any node you can find and get results. All without any fancy coding to maintain consistency, accuracy, integrity, etc.

Naturally, there may have been things I’ve overlooked, but for a 10-minute discussion, doesn’t seem like a bad idea for some college students to try out as a few-weeks project. In case you’re interested, let me know. All debates, arguments, flames, criticisms are appreciated.

June 6, 2008

The earnest wait to release…. of movies or software or books!

Filed under: Uncategorized — Tags: , , , , — archisgore @ 9:25 am

We all know how eagerly we all waited for the final Harry Potter book. And my friends should tell you my almost maniacal obsession with “The Dark Knight“. And we always wonder why these people project dates that are a couple of years ahead or at least eighteen months. I used to get frustrated and angry at such dates and I also frequently comment on blogs/forums about this issue.

And yet, something different happened a few minutes ago. I was reading about J J Abrams’ upcoming Star Trek movie and it said “eleven months to go”, and for once, I didn’t feel as if this was an irrational timeline. Perhaps because we’re having meetings at work daily planning for the next 18 months.

You really come to realize things when you’re on the other side of the barrier. People keep asking so many queries about Live Mesh, and where it will go, and whether it will do a special feature that they want. And I used to get frustrated – after all, we are working as hard as possible, but it takes time to write code, test it, validate it, approve it, sign it off, push it to a distribution channel, and provide that seamless install experience for users. Suddenly, eleven months doesn’t sound all that much.

And I also got a glimpse of what the producer/director/actors are feeling like. I’m sure they want to get it out as soon as humanly possible, but there are things that just take physical chronological time and there’s nothing you can do about it.

Now that’s something to think about……

June 1, 2008

Ubuntu still not ready to replace my desktop

Filed under: Uncategorized — Tags: , , , , — archisgore @ 9:45 am

Sorry people, no flames please. :-) I’ve been using Ubuntu for the last couple of months (after a 1.5 year haitus I’m back on Linux on my home desktop – partly because I used to be a hardcore developer a couple of years ago, and mostly because I like to know what happens in the world).

I remember the Fedora Core 6 days which was blazing fast back then and pretty impressive. When it came to using Linux again, I used pendrivelinux as a bootstrap operating system to get my network working, attempted to get a full-fledged mandriva out of it, but was disappointed at the lack of some packages or the lack of some GUI config tools I’m used to on Windows.

Having my network running, I downloaded Ubuntu since I’d never used it before and since it’s making so much news lately. Ubuntu certainly is a lot cleaner and professional-feeling compared to Mandriva (the update mechanism, gstreamer plugins, etc.)

It still doesn’t match the speed-of-use I get from my XP (I keep forgetting to buy a Vista licence) desktop. I dogfooded it myself and whenever my friends come over, they use it also. Speed-of-use isn’t always about raw speed of the software. Many times, there are simple UX pieces missing in the media players or GUI config tools which make you go to the command-line occasionally.

1. The default movie player (Totem using a gstreamer backend) is lackluster at best. Before you go ranting about proprietary file-formats, I’m talking about the UX. The “Preferences” doesn’t have enough configurable settings, and I miss most of my settings from Windows Media Player. The progress-bar-thingy (that allows you to control playback position) doesn’t work quite as smoothly as I’d like. And there’s a whole lot of polish, options, configurations, etc. that it needs.

2. Ubuntu hangs quite frequently – more so than my XP box. I have to reboot it at least twice a day. Don’t go telling me how I must have screwed up the configuration – to be fair, I only used the GUI utilities which are supposed to, by definition, guard me against incorrect configurations. Firefox is the biggest culprit here. It times it hangs, and then fails to restart unless I reboot. The desktop shell hangs too in which case I have to “ctrl+alt+backspace” to even be able to reboot the box.

3. The default bittorrent client, Transmission, is nowhere close to uTorrent in terms of features and functionality.

4. I miss some kind of hardware detection/diagnostics utilities, especially for USB. I don’t know the command-line, and I don’t want to know it. I want a Device Manager kinda thingy that shows me all ports, cards, disks, and anything else connected on my system. I want to know the status of that hardware and exactly what my system is interpreting it. I connected an external USB harddrive enclosure which shows up as /dev/sdb, but there is no /dev/sdb[n] partition when I know that the disk contains five partitions. Sure, the system may be confused, maybe the disk has a bad partition table. But beyond /dev/sdb, there’s just no information on what is happening. I presume the USB mass storage protocols are perfectly standardised and available for implementation.

5. I can’t rename my partitions for some reason. They show up as “127.8 GB media”. I don’t remember my disks by their capacity; I prefer to remember them by name. It’s the 21st century – you can give me that much! Come on!

And these have just been my personal experiences. My friends are totally baffled by the system. They can find their way through the Applications menu but, at points they do lack some serious functionality we take for granted daily on XP.

The one thing I loved about the system is when I begin to play an unknown format, it can make the files “just play” in a couple of clicks. And gstreamer does have an impressive set of plugins.

Blog at WordPress.com.