When you first watched Skyfall, how many of you wondered if James Bond’s universe has a movie industry? Wouldn’t you imagine M giving the order to send the entire NATO NOC list to some embassy in the Middle East, and some lowly freshly-hired intern going, “Uhmm.. frightfully sorry to bother you M, what with it being time for afternoon tea; but I saw this rather fascinating movie from the United States called ‘Mission Impossible’. Wouldn’t it perhaps be prudent to just not send the NOC list to Egypt or whatever?”
As I fly back from BlackHat, I can’t help but be baffled at how much energy we spend protecting these NOC lists with complex measures, after having sent them to an embassy in Egypt in the first place!
Let me back up a minute and begin with the punchline; the description/justification/thesis comes later.
There are two ways people view security (or availability, or resiliency, or uptime, or whatever you call it.)
The first approach is the Cathedral. You begin by establishing who is important and who is not. You create a hierarchy of people. The Manager is more important than the lowly IC. The CEO is more important than the manager. This is an implied proxy for how “trustworthy” you are, or how “reliable” you are. This is why you see the phrase “trust establishment” all over the place in the security industry.
Then you proceed to define security as the power to do more stuff the more trustworthy you are in the social hierarchy. The Domain Administrator can do anything. The field sales person can do little. The office secretary can do nothing – not even browse Facebook. By pivoting around “trustworthiness”, you end up with a very Cathedral-like model.
It’s easy to explain, and easy to swallow. If I’m CEO, I don’t want to be questioned. If I’m emperor, I don’t ask people to do things, do I? Our romanticized Hollywood fictions of kings are that their “will” is done. The Bible doesn’t open with “Make me some light”; it opens with “Let there be light.” The President of the United States has the authority to launch “all nukes”, whereas the general of the Pacific Fleet may be able to launch only nukes from the Pacific Fleet.
Having landed on this model, your only foreseeable problem is that of establishing trust. How do I know you’re the Emperor? How do I know you’re a manager? How do I know you’re the domain controller? and so on…
Once you establish you’re a domain controller, you can generate Kerberos tickets on behalf of any “principal” in the domain – a user, a machine, a whatever. If you’re a “good program” you get more access than a “bad program”. Similarly, if you’re The New York Times, my employees are allowed to browse you; but if you’re webmail, Facebook or Twitter, they aren’t. Given that they can browse a “trusted good website”, the browser can write into c:\system32 whenever it likes – because trust has been established.
The setuid bit of passwd is an example of this: Because we established you’re a “good trusted program”, we will let you touch the entire system and run you as root. /etc? /var? /lib? /bin? /proc? sending SIGKILL to processes? Go for it. It’s yours.
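The blanket nature of that grant is visible right in the file mode. A minimal Python sketch of checking for it – using a throwaway file we own rather than the real passwd binary, since the bit itself is all that matters here:

```python
import os
import stat
import tempfile

def is_setuid(path: str) -> bool:
    """True if the file executes with its *owner's* privileges, not the caller's."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)

# Demonstrate on a throwaway file we own (no root needed to set the bit):
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
os.chmod(path, 0o4755)   # the leading 4 is the setuid bit, as in passwd's 4755
print(is_setuid(path))   # → True
os.unlink(path)
```

On a real system, `ls -l /usr/bin/passwd` shows this as the `s` in `-rwsr-xr-x`: one bit, and the whole process runs as root.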
It sort of models the idealized real-world model of authority… but only a fictionalized idealization, because the real world operates rather differently.
The other approach is the Bazaar. You treat everyone as equals – insiders, outsiders and everyone else. Nobody is inherently trustworthy, but nobody is untrustworthy either. There is no social hierarchy. There is no concept of a “better person” having more access by virtue of being better.
Instead you base security around, “Why do you need access to something to do your job?”
In this model, as in the real world, the President has the authority to order the launch of nukes. He isn’t really going to be allowed into the launch bunker to turn a key, no matter what anyone tells me. There’s a subtle but important difference here. The President’s job is to determine that a launch is necessary, and perhaps what the targets are. He has the tools required to do his job. His importance in the chain has nothing to do with what he is allowed access to. He may have access to reports, documents, laptops, and a telephone. He doesn’t get to carry the plutonium core back to the White House just because he’s the domain administrator.
If you’re an HTTP server, you have every right to control port 80, and a part of the filesystem. No second-guessing. No oversight. However, if you’re an HTTP server, you don’t have rights to do anything else. I don’t care who the heck you are, what your pay grade is, or where in the hierarchy you sit. Never gonna happen. You get port 80 – that’s your prerogative. You get nothing else.
Rather than controlling access to Facebook and Twitter and webmail, you pivot the problem to: “Why in the name of hell will my web browser ever need to write anything to disk outside of the cache folder, let alone c:\system32?” So you never allow it. I don’t care that your web browser can cure cancer – I don’t know of any reason why it will ever need to read /etc, let alone write to it. You could be a trusted website, Facebook, Twitter, a porn website, or a virus website. I don’t see any reason a trusted, verified website would ever want to write to c:\system32 to save my life, so you’re not getting it!
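Enforcing “the cache folder and nothing else” requires knowing nothing about whether the website is trusted. A toy Python sketch of such a write policy – the cache path here is made up for the demo:

```python
import os
import tempfile

# Hypothetical cache root -- the one place the browser may ever write.
CACHE_DIR = os.path.realpath(tempfile.mkdtemp(prefix="browser-cache-"))

def open_cache_file(path: str, mode: str = "wb"):
    """Allow writes under CACHE_DIR; refuse everything else, no exceptions.

    realpath() resolves symlinks and '..', so even a 'trusted' page can't
    escape the cache with a cache/../../etc/passwd trick."""
    real = os.path.realpath(path)
    if os.path.commonpath([real, CACHE_DIR]) != CACHE_DIR:
        raise PermissionError(f"write outside cache denied: {path}")
    return open(real, mode)

# Writing inside the cache works; anywhere else is refused:
with open_cache_file(os.path.join(CACHE_DIR, "thumbnail.png")) as f:
    f.write(b"\x89PNG")
```

Notice there is no trust check anywhere: the policy never asks *who* wants to write, only *where*.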
Security is about actions, not trust
Let me establish some context here – so you trust I know what I’m talking about. I did Amazon.com’s search fleet operations for Q4 three years in a row. Meaning if Amazon.com’s search pages went down anytime in the holiday season, I would have to spend a week getting humiliated in front of a committee where they questioned every decision I made – load balancers, scaling, predictions, performance and so on. It didn’t matter who caused the failure – if I didn’t catch it, I had to answer for it. Moreover, I would run on-call for the Black Friday weekend – a straight-up five-day no-sleep traffic onslaught like you’ve never seen before.
The common man thinks of Security/Availability/Reliability as defence against malicious actions or attacks. The reality is, in all my years, the only Denial of Service attack that I was unable to prevent was when someone on the inside left an expensive computation on one of the pages, and that page ended up being popular. While it makes for great good-vs-evil television hacking movies with dramatic music, handling large-scale incoming traffic from a botnet wouldn’t even wake me up at night – you can block it, throttle it, or just pull the plug. You make a small headline in TechCrunch that Amazon was down, and life moves on. Perhaps people even buy more from you the next day because they think you’re that popular.
My nightmare scenario was when I saw no traffic deviation from normal, nothing in my peer systems showed any signs of alarm, but my servers suddenly ran hotter than expected. With 50 people committing features twice a day, and that code shipping to production twice a week, you can’t single-handedly nail down the issue and respond fast enough.
The classical Cathedral model fails very fast here; it will crash and burn. This is not meant as an insult to anyone. We can’t all do everything, but we can very easily and reasonably set bounds on what some system should reasonably be able to do.
Did you ever read Steve Yegge’s rant about Amazon? Quite an educational read. The crux of it is, everything inside Amazon is built to be internet-facing and internet-capable. The gatekeepers to the internet don’t expose the servers to the internet, but each system is individually built with all the monitoring, analytics, throttling, failover, fallback, recovery, alarming, etc. that you would expect from a public-facing service.
The reason isn’t that people don’t trust each other. I had great relationships with fellow teams who have throttled my services in the past. The cathedral of who is important socially and whom I value socially is detached from the bazaar of what my service should ever be able to do. If my service deviates, the limits kick in immediately and the issue is contained before it can get any worse. Just because my name, my team, or my org is important doesn’t mean I will ever need to log into a customer’s credit card table. Blocking my access isn’t a sign of disrespect. It is merely a fact.
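Those identity-blind limits can be as simple as a token bucket: every caller, however important, gets the same budget, and deviating from it trips the throttle immediately. A toy sketch – the parameters are illustrative, nothing Amazon-specific:

```python
import time

class TokenBucket:
    """Throttle by behavior, not by who's calling: everyone gets the same
    budget, and exceeding it trips the limit immediately."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate              # tokens refilled per second
        self.capacity = burst         # maximum burst size
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(rate=1, burst=5)
results = [bucket.allow() for _ in range(8)]  # burst of 5, then throttled
```

The bucket never asks for a caller’s rank; a runaway service owned by the most senior team in the company gets contained exactly as fast as anyone else’s.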
A big reason why I do what I do today at my new company is that I fundamentally believe in the Bazaar. Even in my last days at Amazon, I was fighting heavily to bring in Actor models (Akka, Erlang, etc.) to compartmentalize even our internal developers. Even if they are “internal trusted developers”, compartmentalization isn’t a sign of disrespect. It is a Bazaar.
The purpose of the Bazaar is to compartmentalize Command and Control so heavily that the incentive to attack any one system is very low, because it can’t do much more than what it was designed to do (if you take over a webserver, the best you can do is make it serve up some other content – bad, but not as terrible as taking over the whole system). It imposes a simple governance model. For example, rather than auditing all servers in your Active Directory domain for compliance and being paranoid, you audit what a sales rep’s machine should ever be allowed to do. If it’s never going to control other machines, instruct other machines not to accept WMI commands from it. It could belong to your billion-dollar salesperson who is literally the most valuable person in your company. I just don’t imagine why their machine would ever realistically need to send WMI commands to your domain controller.
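In code, that audit reduces to a deny-by-default allowlist per role, rather than a trust ranking. A toy Python sketch – all role and action names here are made up for illustration:

```python
# What a machine of each role may ever do -- not how important its owner is.
ALLOWED_ACTIONS = {
    "sales-laptop":      {"email", "crm", "web"},
    "build-server":      {"fetch-source", "publish-artifact"},
    "domain-controller": {"issue-ticket", "replicate-directory"},
}

def authorize(role: str, action: str) -> bool:
    """Deny by default: unknown roles and unlisted actions get nothing."""
    return action in ALLOWED_ACTIONS.get(role, set())

# The billion-dollar salesperson's laptop still can't send WMI commands:
assert authorize("sales-laptop", "web")
assert not authorize("sales-laptop", "wmi-command")
```

Auditing this table is trivial compared to auditing every machine for compliance: you only ever review what each role was designed to do.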
I hope that by the next time you think of security, I will have changed some minds: think about what something should be allowed to do, as opposed to how important that something is.