Archis's Blog

August 27, 2007

Project guidance

Filed under: Uncategorized — Tags: — archisgore @ 5:02 pm

I hope this mitigates those “I wanna do a project, do you have a project?” kind of questions and explains all my views in excruciating detail. For any questions, you’re always welcome to mail me and I’ll reply with another entry. I am neither an “industry expert” (I’ve exhausted all industry-expert humor for now – but I promise to come up with new insults soon) nor an “alumni” (if you’re new to Pune culture, this isn’t a word which means past-student, but a very special honour – it’s a bit hard to explain here unless you’ve faced some “alumni” yourselves), so there’s little scope of me ever being able to speak on these topics directly to students.

Let’s get down to business. I’ll post suggestions here and justifications later, so that any of you who want a quick overview won’t waste time:

1. Projects should always be focussed on a “problem”. They must begin with a problem, and end only in solving that problem. If you’re beginning with, “I want to do something in Java/.Net/”, you’re in trouble. Now, this could be a valid assertion, assuming you’re talking about extending the language/technology in question. If you want to extend the “Java platform” itself, this makes sense. But if you’re talking about using Java to do X (where X is as yet undefined), you’d better rethink your choice of becoming a computer scientist. I’m not sure you even know how a “problem” is defined.

2. A problem must be something real and substantial. It must be something people can relate to. Something that, once “solved”, no longer exists. Let’s say you pick “Base64 encoding a string” as a problem; then after your solution, base64 encoding of a string should no longer bother your target audience.

A problem must always have a target audience. If nobody needs a base64 encoding of a string, don’t do it. If someone needs a base64 encoding of a string, and won’t use your program for it, don’t do it. Do something that someone real uses. This doesn’t have to be a “live” project. Do something at home. I don’t care. But do something that works! Even if only your mom or dad or grandpa uses it, it’s okay.

In my opinion, a great project is one that achieves something – something that at least someone cares about – even if it’s one person on earth (including you). Let’s take an example later.

3. Write “REAL” code: Go to your own college. I’m sure everyone’s complaining about some missing tool, or “I wish I had a tool to do this”. Now think to yourself. In your own college, one that is admittedly “academic” and “non-industry-oriented”, there isn’t a single tool written by any of you that solves any of the problems people crib about. Now if your so-called “academic” college won’t use your code (which is why you bring in “industry experts” to deliver lectures), why should a multi-billion-dollar industry trust you?

Your own HoD won’t trust the code that you write, but overnight, you expect a multi-billion-dollar company to just trust you? And let me make this very clear – the industry, once they hire you, implicitly trusts you. I had a disagreement about this issue with a senior many years ago, and he had said I would “learn that trust isn’t everything” when I joined the industry. As I have apparently yet to join the “industry”, I maintain my original opinion, and actively promote it. When I write code, I am questioned by nobody. My manager doesn’t ask me why I did something a certain way. If I say something can’t be done, he takes it at face value and defends my decision higher up. He’ll never ask for a second opinion. My mails are not monitored and I could burn all my team’s code on a CD and walk out with it without anyone checking. It’s that simple. You can now imagine just how much trust Microsoft has in all its 70,000+ employees worldwide (vendors and interns over and above this number). That’s how the industry works. And this trust has to be earned.

“How do I write real code?” It’s simple. Write code. Distribute it. If one person uses your code for their problem, it’s “real” code. Now it is very important that they really do use it. Some friend shouldn’t just fire up your program, run it, and then fire up something else later. Your program needs to be the only program they use for the specific problem it attempts to solve.

4. “Impressive projects” (IPs for short) are mostly for cowards. Let me explain. In my day, I’ve seen (and continue to see) extremely fancy-sounding names like “kernel-module-janky-panky-thingy”, or “hyper-threaded-multi-headed-monster”, to name the remotely sane ones. Now while those of you doing “small” projects like GUIs for a database might get intimidated by these names, let me attempt to give you some confidence.

If the comment, “My project is 90% done”, can be made about any project, that project is cowardly. How many of those kernel modules being made is the college using in their labs? How many of those jhanky-panky things are being used by your own teachers? So far as I’ve noticed, ZERO! The cool thing about an IP (impressive project) is that the minute you begin it, you’ve already defined an exit-route. Regardless of what you end up doing, you just say, “Hey it’s 90% done”, and you got 100% marks for doing 90% of a very complex project.

The really brave are those who take up those silly-sounding database GUI projects. Due to their silly-sounding nature, everyone is already critical of them before the evaluation begins. Being a silly-sounding project, they really need to “deliver”. If it doesn’t work, they’re screwed. Forget 99%, they need it to be done 100% or even more sometimes. Think about this for a moment.

One of the reasons I have never been, nor ever will be, asked to get involved in industry-student activities (where industry jerks go to the college and tell students how stupid they are), is that I keep asking the wrong questions. I distinctly remember at least 3 IPs where I asked the teacher praising it, “So ma’am, does your machine use this file-system driver?”, and she didn’t reply, and I didn’t need to ask anything further. That’s why “defining the problem” is the most important step. Define it and solve it, goddammit!

If you’re still not convinced, let me put this in another way. Most of you know I love making my own toothpaste just for fun. Now I recently made a toothpaste that’s 90% done. It can kill off absolutely any bacteria, virus, fungus and any other pathogens known to man! Wow! It’s super-awesome! Just one minor issue – we can’t use it on our teeth yet. But wait! It’s 90% done! We’re almost there! We solved the major issue that no other toothpaste in the world solves! We killed off all germs! Take that you idiotic dentists studying for 1000 years! I, a mere computer programmer, invented a better toothpaste than you! And now, I want you all to invest 100 million dollars in my new invention so I can make it safe for teeth, and we’ll all be rich!

Just tell me how many of you will invest in my super-awesome toothpaste? The moral of the story is, just go do a project you enjoy and have fun with, and don’t be intimidated by these IPs. If you’re doing a small GUI project, you’re braver than many of them, and you’ve got a lot more to be afraid of – because you will be scrutinized, you’ll be interrogated, you’ll be broken down into bits before you get a single point for your project, which needs to do everything you said it would do. Your project needs to be safe for teeth, _and_ kill as many germs as possible. So take heart and trust me – the industry will value you. Have you ever heard the standard comment at college that goes something like…. “sometimes low-scorers are hired by the industry and high-scorers remain jobless”? Believe me – the industry isn’t stupid. They never hired low-scorers. They hired those who were scrutinized the most and didn’t have the escape routes that the others had.

As an ending note: if you do manage an IP and only deliver the solution – well, you’re beyond all praise. I honestly mean it. This blog entry was not meant for you.

5. Commitment: I know most of you have never met people who actually commit to things, but hey, they do exist you know! Now, I don’t mean this in the context of management-style blazer-wearing people. I mean it in the plain human definition of it. If you want to be brave, learn to write down the problem you’ll commit to solving, and only go to the examiner with, “It’s solved”, or “It’s not solved” (unless you’d rather back me with a million dollars for my toothpaste).

Define the problem in use-case terms ONLY! That’s the only way to really commit. If you’re making a Linux installer, and if I were an examiner, I’d ask it to install Linux on my PC without asking you a single question otherwise. If it didn’t install Linux, but instead did some hyper-threaded-multi-headed thingy, I wouldn’t give a damn! Use-case scenarios are frightening to commit to, and that’s why you need to learn this while you’re students. Now is the time to make those mistakes.

And believe me, there’s a difference when I say this. I work for a company that produces products that _you_ personally use (well, I hope so anyway; and if you don’t use our products, I’ve proven my point even better). I don’t come to you and say, “I’m some bigshot industry fellow who knows everything and you’re stupid.” The code I write physically reaches you. You are, while you’re reading this blog, my direct judges. You’re my evaluators. You’re directly responsible for my bread-and-butter. How many of you use Live Search? What if I told you Live Search uses some jhanky-panky-super-cool technology that Google doesn’t have, would you use it? If I said Live uses C#, would you use it? Think of orkut. What technology does it use? What OS does it run on? What AJAX engine does it employ? Have you even once considered these questions? All you care about is the quality of search results, and the ability to _communicate_ using orkut. Orkut’s problem is a use-case scenario: “To enable Person A to talk to Person B”. Even if it’s 90% done, it’s worthless. Even if it’s 99% done, it’s worthless. Only if it enables Person A to talk to Person B, do you – the judges, the evaluators, the jury – use orkut! Then why is it that, having expressed a desire to work for this industry, you hate these “academic questions” asked by allegedly stupid university examiners? In my opinion, the university examiners ask the perfect questions – the very questions that you would ask me when I come and ask you to use my product. In a way, calling them stupid is calling yourself stupid.

I admit, they don’t care what technology you’ve used, or how hard you’ve worked, or that cool kernel-thingy you did. But then again, neither do you care for all these things. Learn to live with it – better now than later. That’s what life in the industry-without-double-quotes is. That’s how we live daily. If Person A cannot talk to Person B, I’m worthless to the world regardless of what transport protocol I may have used and how many layers of encryption I may have put on it.
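
To make the use-case idea concrete, here is a minimal sketch in Python of an acceptance check for “Person A can talk to Person B”. Everything in it is made up for illustration: `MessageService` is a hypothetical in-memory stand-in for whatever you actually built, and `use_case_solved` is my own name, not any real API. The only point is that the check has exactly two outcomes, with no room for “90% done”.

```python
class MessageService:
    """Hypothetical in-memory stand-in for the system being evaluated."""

    def __init__(self):
        # Maps each recipient to the list of (sender, text) pairs they received.
        self._inboxes = {}

    def send(self, sender, recipient, text):
        self._inboxes.setdefault(recipient, []).append((sender, text))

    def inbox(self, user):
        return list(self._inboxes.get(user, []))


def use_case_solved(service):
    """Return True only if a message from Person A actually reaches Person B."""
    service.send("A", "B", "hello")
    return ("A", "hello") in service.inbox("B")


if __name__ == "__main__":
    # The examiner's verdict: solved or not solved, nothing in between.
    verdict = use_case_solved(MessageService())
    print("It's solved" if verdict else "It's not solved")
```

Notice that the check never asks which transport protocol or how many layers of encryption were used. Swap in the real system behind the same two methods and the verdict is still just yes or no.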

The one final suggestion I would make to you – learn to make commitments and learn to live with them. Never have excuses based on technology, hard-work, or impressive-sounding names – that’s for cowards.

6. Ensure you’re thinking of deployment: How many projects in our labs are run directly from IDEs, and how many are binary executables? Even more so, how many of those executables are packaged? On Linux, there’s deb and rpm, on Windows there’s MSI. How many projects can you “install->double-click->run”? Again, would you use my products if I gave you a large source tree? This is a tough call, but it’s an important one. Being able to write something that’ll run on even one machine that you don’t have control over provides immense pleasure.

7. Use all the tools at your disposal: Use a source versioning system. Store your code in a CVS/SVN/VSS repository. Keep incremental changes as diffs. Compare those diffs. Look at how you wrote the code. If something goes wrong, you can revert to a working build instantly. It’s important to be able to use these tools regardless of where you work. If you’re handling any content, a versioning system is critical – even if you’re only working with office documents and not source code. Your company/clients/stakeholders are going to want your SharePoint to hold all incremental changes that are made to documents. Use source-analysis tools to find memory leaks. Even knowing that a certain tool exists for a certain job can be valuable.
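
As a rough illustration of what “keeping incremental changes as diffs” means, here is a small sketch using Python’s standard `difflib` module. A real versioning system does this for you on every commit; the file name `notepad.py` and the two code snippets are invented for the example.

```python
import difflib


def unified_diff(old_text, new_text, name="notepad.py"):
    """Return a unified diff between two versions of a file as one string."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=f"a/{name}", tofile=f"b/{name}"
        )
    )


# Two hypothetical revisions of the same file: the second fixes a typo.
old = "def greet():\n    print('helo')\n"
new = "def greet():\n    print('hello')\n"

patch = unified_diff(old, new)

if __name__ == "__main__":
    # Lines starting with '-' were removed, lines starting with '+' were added.
    print(patch)
```

Reading a diff like this tells you exactly what changed between two builds, which is what makes reverting to a working one possible.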

8. “If I do all this, when will I do my project?” The answer is simple – do a simple project, but one in which you can focus on all aspects of development. I had attempted to make this suggestion in the syllabus, but the university is still living in the 1980s so we’ve got to give them some time before they can catch up on almost 22 years. In the meantime, I personally recommend building just Notepad if you have to. Build sudoku. Build minesweeper. But build something that you yourself would use for hours on your own. Not for testing, but for actual “using”. Build something that allows you to explore various challenges – how to version source code being edited by 20 people at a time, how to write comments, how to build binaries, how to build packages, how to ensure and test that packages will install and run and take care of dependencies for unknown configurations. These are all fun and interesting aspects, but most importantly, they’re just the plain old bread-and-butter of a programmer.

When I say real code, go out to a public CVS. Contribute just 100 lines to Apache or MySQL, or Postgres, or Linux. Instead of a 10000-line kernel-thingy, submit 100 lines to _the_ kernel. 100 lines that will be used by hundreds of thousands of users all over the world! Now that’s “real” code! It’s not the quantity but the quality that counts.

Naturally, I don’t discourage building a kernel driver, but then have the guts to commit to it. It shouldn’t be done for “40 marks”. The only acceptance criterion is that it be merged into the kernel tree. There should be no other criterion. If you’re solving the problem of “non-existence of a driver”, after your solution, this problem must go away (as I mentioned above). After your solution, there should “exist a driver” – not on your PC or in your lab, but in the kernel production tree. Being able to make this commitment is what the thrill is all about. Make a commitment on Notepad, but then ship it! It should go into production. If you can’t commit on even Notepad, then you’re seriously in the wrong place and reading the wrong blog. It’s up to you – whether you want to be a coward and be non-committal, or be courageous and commit to a small deliverable!

August 15, 2007

Ubuntu Servers Hacked

Filed under: Uncategorized — Tags: — archisgore @ 11:42 am

Disclaimer: All content in this blog represents my opinions and only my opinions, and does not even remotely, incidentally, implicitly, accidentally, or in any other way represent the opinion of my current, past or future employers or any institutions I may have been affiliated or associated with.

http://it.slashdot.org/article.pl?sid=07/08/15/1341224&from=rss

Now before we misinterpret it, let me make this clear. I don’t mean servers of other companies which are running Ubuntu. I mean Ubuntu’s own servers.

I’m sure Microsoft must have had something to do with this! But until we can somehow figure out how Microsoft was involved, let’s just assume it wasn’t them and look at alternatives:

I once had a major mail-debate with a prominent freedom-fighter regarding system administrators. His argument was “Linux requires admins with a high IQ” whereas “Windows doesn’t”. Now call me stupid, but I thought this was a good thing for Windows. After all, I was under the impression that software was meant to simplify our lives (I keep making wrong assumptions on so many fronts lately). Then he went on to say that because Linux requires administrators with high IQ, they are always competent – by virtue of being Linux admins – whereas a Windows admin is not of high IQ because it’s Microsoft’s fault that Windows is easy to use. Yes, Microsoft is responsible for the fact that companies who use Windows hire admins with low IQ. (Disclaimer – I do not subscribe to this opinion personally.)

Anyways, a whole community’s eyes (thousands of eyes, to quote said mighty freedom-fighter) did not notice a (probably low-IQ) practice that I, as a college student, was smart enough to know – use everything over SSL. I quote: “as using unencrypted FTP transfers with accounts”. I’m pretty sure there was something called SSL and secure FTP and SSH back when I was in college. We’re talking about servers running GPL’d programs, which, according to some, by virtue of applying the GPL, got thousands of eyes going over them, combined with the fact that they’re being run by the community (add a thousand or so more eyes to the mix), and add to that the fact that we’re dealing with the most popular Linux distro in the world – one that’s going to be the flagship for the entry of Linux on preinstalled desktops by Dell (add yet a few hundred more eyes). Hmm…. maybe I missed something…..

The community has its share of blame to throw at Canonical too. The kernel has backward compatibility issues with Canonical-provided hardware, which I presume was bought with all kinds of freedom-fighter-retaliation-threats to whoever sold it to them, which means that there’s no hardware lock-in (besides, the servers were working till now – I mean something was throwing that webpage at me!). Now I found out from Slashdot that Vista’s just stupid because it’s not backward compatible with lots of hardware. However, the Linux kernel gets patches “overnight” at the snap of your fingers (especially on hardware that was bought with that purpose in mind, and which worked with the previous version). However, it’s Canonical’s fault that they sponsored this hardware. Maybe I missed something again……

Wait, there’s more….. there were missing security patches. Damn that Windows Update! Applies patches automatically, huh? Must be for those low-IQ admins. With high-IQ people who can apply patches “overnight”, why bother?

Wait, there’s even more…. I’ve faced “case studies” and “statistics” of how Linux machines never ever fail, and any Windows failure is always Microsoft’s personal fault, and the configurations had nothing to do with it. And let’s not forget the whole hardware compatibility Vista gave up – just how mean and evil of it! I guess the freedom-fighter-threatened vendor who sold that hardware (or Canonical to have bought it) must be so damned evil, that even after all those threats, he sold hardware that chose to be backward compatible with an upgrade in the kernel. Shouldn’t the stupid, arrogant, low-IQ vendor have anticipated future kernel changes and made hardware that could live with them? Since the code is by definition always right (thanks to those thousands + thousands + hundreds of eyes), the evil hardware decided not to allow the kernel to upgrade. Maybe it was that damned DRM again…

After all, it’s always everyone else’s fault, right?
