You’re thinking about scale all wrong

Scale isn’t about large numbers

To hear modern architects, system designers, consultants and inexperienced (but forgivable) developers talk about scale, you’d think every product and service was built to be the next Twitter or Facebook.

Ironically, almost everything they create to be scalable would crash and burn if that actually happened. Even Google and Amazon are no exception, at least from time to time. I know this because we run the largest build farm on the planet, and I’m exposed to dirty secrets about pretty much every cloud provider out there.

I want to talk about what scalability really means, why it matters and how to get there. Let’s briefly calibrate on how it’s used today.

Recap of pop-culture scalability

When most tech journalists and architects use the word scale, they use it as a noun. They imagine a very large static system that’s like… really really big in some way or another. Everyone throws out numbers like they’re talking about candy corn — hundreds or thousands of machines, millions of processes, billions of “hits” or transactions per second… you get the idea.

If you can quote a stupidly large number, you’re somehow considered important, impregnable even.

Netflix constitutes 37% of US internet traffic at peak hours. Microsoft famously runs “a million” servers. WhatsApp moves a billion messages a day.

These numbers are impressive, no doubt. And it’s precisely because they’re impressive that we think of scale as a noun. “At a million servers,” “a billion transactions” or “20% of peak traffic” become defining characteristics of scale.

Why it’s all wrong

Calling something “scalable” simply because it is very, very, very large is like calling something realtime only because it is really, really fast.

Did you know that nowhere in the definition of “real-time systems” does it say “really, really fast?” Real-time systems are meant to be time-deterministic, i.e., they perform some operation in a predictable amount of time.

Having a system go uncontrollably fast can quite frequently be undesirable. Ever played one of those old DOS games on a modern PC? You know how they run insanely fast and are almost unplayable? That’s an example of a non-realtime system. Just because it runs incredibly fast doesn’t make it useful. Acting with desirable and predictable time characteristics is what would make it a realtime system.

What makes a system realtime is that it works in time that is “real” — a game character must move in time that matches the real world, the soundtrack of a video must play in sync with the video itself, a rocket’s guidance computer must act in time that matches the real world. Occasionally a “real time” system might even have to execute NO-OPs so that certain actuators are signaled at the “correct time.”

As with much of computing, the definition of scalability hinges on the correctness of a system’s behavior, rather than on its size or speed.

Scale is a verb, not a noun

The biggest misconception about scale is that it is about being “at scale.” There’s no honor, glory, difficulty or challenge in that, trust me. You want to see a 10K node cluster handling 100M hits per second? Pay me the bill, you got it. I’ll even spin it up over a weekend.

The real challenge, if you’ve ever run any service/product for more than a few months, is the verb “to scale.” To scale from 10 nodes to 100 nodes. To scale from 100 transactions to 500 transactions. To scale from 5 shards to 8 shards.

A scalable system isn’t one that launches at some fancy large number and just stupidly sits there. A scalable system is one that scales (the verb), not one that merely runs at some arbitrarily large number (the noun).

What scalability really means

We commonly use Big-O notation to define the correctness of an algorithm’s behavior. If I were to sort n numbers, a quicksort would perform at worst on the order of n-squared operations, and it would use on the order of n memory units. A realtime sort would add the further constraint that it must respond within a known bound of wall-clock time.

Similarly, a scalable system has a predictable Big-O operational complexity to adapt to a certain scale.

Meaning, if you had to build a system to handle n transactions per second, how much complexity do you predict it would take to set it up?

O(n)? O(n-squared)? O(e^n)?

Not an easy answer, is it? Sure, we try our best, we question everything, and we often really worry about our choices at scale.

But are we scale-predictable? Are we scale-deterministic? Can we say that “for 10 million transactions a second, it would take the order of 10 million dollars, and NO MORE, because we are built to scale”?

I run into a dozen or so people who talk about large numbers and huge workloads, but very few who can grow with my workload at an incremental operational cost.

Scalability doesn’t mean a LOT of servers. Anyone can rent a lot of servers and make them work. Scalability doesn’t mean a lot of transactions. Plenty of things will fetch you a lot of transactions.

Scalability is the Big-O measure of cost for getting to that number, and moreover, the predictability of that cost. The cost can be high, but it needs to be known and predictable.

Some popular things that “don’t scale”

Hopefully this explains why we say some things “don’t scale.” Let’s take the easiest punching bag — any SQL server. I can run a SQL server easily. One that handles a trillion transactions? Quite easy. With 20 shards? That’s easy too. With 4 hot-standby failovers? Not difficult. Geographically diverse failovers? Piece of cake.

However, the cost of going from the one SQL instance I run up to those things? The complexity cost is this jagged step function.

A lot of unpredictable jagged edges

And I’m only looking at a single dimension. Will the client need to be changed? I don’t know. Will that connection string need special attention? Perhaps.

You see, the difficulty/complexity isn’t in actually launching any of those scenarios. The challenge is in having a predictable cost of going from one scenario to a different scenario.

Why should this matter?

I’m advocating for predictable growth in complexity.

Let’s talk about my favorite example — rule-based security systems. Does any rule-based system (IPTables, firewalls, SELinux, AuthZ services) handle 10 million rules? You bet. If you have a static defined system that is architected on blueprints with every rule carefully predefined, it’s possible to create the rules and use them.

Can you go from 10 rules to 10,000 rules on a smooth slope, paying complexity only as you need it?


This is hardly ever the case. You might think that I’m advocating for a linear growth in complexity. I’m not. I’m advocating for a predictable growth in complexity. I’d be fine with an exponential curve, if I knew it was exponential.

What makes it unscalable isn’t that the cost is VERY high, or that it is a predictable step function. What makes it truly unscalable is that the complexity jumps both abruptly and, worse, unpredictably. You’ll add 10 rules with no trouble; the 11th causes a conflict that leads to a two-day investigation and debugging session. You might add 100 nodes with ease; add one more node past some IP range and you’ll spend weeks with a network tracer looking for the problem.

Here’s an example a bit closer to home. We’ve been looking for a home for Polyverse’s BigBang system — the world’s largest build farm, which powers all the scrambling you get transparently and easily.

As an aside, you’ll notice that Polymorphic Linux is “scalable.” What cost/complexity does it take for n nodes, whether that n is 1, 100, 10,000 or 10,000,000? The answer is easily O(n). It is sub-linear in practice, but even in the worst case it is linear. There are no emergency consultants, system designers or architects required to rethink or redesign anything. This is an example of what good scalability looks like.

Behind the scenes of that scalability though, is another story. I’ve spoken to nearly every cloud provider on the planet. I may have missed a few here and there, but I bet if you named a vendor, I’ve spoken to them. They all have “scalable systems,” but what they really have are various systems built to different sizes.


Finding clouds/systems/clusters that can just run really, really large loads is easy. Running those loads is also easy. Finding clouds that are predictable in complexity based on a particular load? Even with all the cloud propaganda, that’s a tough one.

Cybersecurity needs more scalable systems, not systems “at scale”

Scalable systems are not about size, numbers or capability. They have a predictable cost in the dimension of size.

Hopefully I’ve explained what scalable really means. In much the same way that you’d measure a system in number of operations, amount of memory, number of transactions, or expected wall-clock time, a scalable system is operationally predictable in terms of size.

It doesn’t have to be cheap or linear. Merely predictable.

Cybersecurity today is desperately in need of solutions that “can scale,” not ones that merely run “at scale.” We need scalable solutions that encourage MORE security by adding MORE money. Not haphazard, arbitrary and surprising step functions.

Threat Models Suck

They’re everything that’s wrong with cybersecurity

The coffee I’m sipping right now could kill me. You think I jest, but I assure you, if you work backwards from “death”, there is a possible precondition for some very deadly coffee. I just brewed another pot. I survived it to the end of this post. I love living on the edge and ignoring threats.

In cybersecurity though, we love our threat models. We think they’re smart and clever. Intuitively they make sense; in much the same way that a dictatorship and police state make sense, or nearly all the dystopian science fiction AIs make sense. If we programmed the AI to “keep us safe”, it is going to reach the optimal annealed solution: Remain under curfew, work out, stay isolated, don’t interact, and eat healthy synthetic nutritional supplements.

I’ve hated Threat Models since the day I had the displeasure of building one a decade ago. The first, and easier, problem with them is that they are product/solution driven; they’re rhetorical. Any credible threat model should have “shrug” as the mitigation for 80% of its threats. When we don’t have a way to react to a threat, we subconsciously consider it non-existent. Nearly all threat models are playing Jeopardy (pun intended).

The second and more subtle problem is they encourage social grandstanding. How do you become a “more serious cybersecurity expert”? By coming up with a crazier threat vector than the last person.

“What if that CIA agent, is in reality, an NSA operative who was placed there by MI6, in order to leak the NOC list to MI6? Have you ever considered that? Now stop isolating that XML deserializer like some kind of pure functional programming evangelist, and let’s do some cybersecurity! Booyah!”

This is why we keep coming up with crazier and crazier tools while overlooking the obvious. I still cringe when someone calls Meltdown and Spectre “timing attacks”. The problem isn’t that the cache is functioning as designed and that you can measure access times. The problem is shared state. But that doesn’t sound sexy, and you can’t sell a 50-year-old proven concept. Linus has perhaps the most profound quote in Cybersecurity history: Security problems are primarily just bugs.

Adding jitter to timers, however, is clever, sexy, complicated, protects jobs, creates new jobs, and gets people promoted. Removing shared state across threads/processes is just a design burden that mitigates any impact (and solves a bunch of other operational problems while at it.)

Impact Model

I propose we build Impact Models. Impact Models help prioritize investments and help us make common-sense decisions, but more than that, they help us course-correct by measuring whether the impact is actually mitigated or reduced.

In one of my talks, aimed at helping startups prioritize security investments correctly, I use this slide.

Why are you investing in Cybersecurity anyway? Is it to run cool technology? Is it to do a lot of “Blockchain”? Or is it to reduce/mitigate impact to business?

Just because something is technically a threat doesn’t imply it has an appreciable impact. You’ll notice in the slide above that if I were to lose an encrypted laptop, it’d be incredibly inconvenient, painful and frustrating. However, Polyverse as an entity would suffer very little. How or why I might lose said laptop becomes less of a concern, since I’ve mitigated the impact of losing it.

This applies to our website too. We try to follow best practices. But beyond a certain point, we don’t consider what happens if AWS were hacked and these legendary AWS-hackers were interested in defacing the website of a scrappy little startup above all else. Would it annoy me? You bet. Would it be inconvenient? Sure. Would it get a snarky little headline in the tech press? Absolutely. But would it leak 150 million people’s PII? Not really.

Another benefit of impact modeling, is that it can present potentially non-“cybersecurity” solutions. I usually present this slide which is similar to Linus’s quote.

Focusing on preventing a threat is important, and you should do it for good hygiene. Reducing the impact of that threat breaking through anyway gives you a deeper sense of comfort.

We don’t live our lives based on threat models. We live them based on impact models. You’ll find that they bring a great deal of clarity to your cybersecurity decision-making; they’ll help you prioritize what comes first and what comes next. They’ll equip you to ask the right questions when purchasing and implementing technology. Most of all — they’ll help you get genuine buy-in from your team. Providing concrete data and justification motivates people far more than mandates and compliance.

“My threats are already sorted by impact!”, you say

I knew this would come up. Indeed every threat model does have three columns: Threat Vector, Impact, Mitigation.

Without impact, you wouldn’t be able to pitch the threat seriously. InfoSec teams are nothing if not good at visualizing world-ending scenarios. Much like my coffee’s purported impact of “death” got you this far, as opposed to “mild dehydration”.

Threat Model

The problem is, read that Mitigation column and ask yourself what it’s mitigating. Is it mitigating the Threat Vector, or is it mitigating the Impact?

This is not a syntactic difference, it’s a semantic one. Multiple threats can have the same impact. Mitigating the impact can remove all of them at once — even if some new threat is announced that would have led to the same impact, you remain unconcerned. Your reaction is, “no change.”

Impact Model

In short, if Equifax had changed the “if” to “when”, they’d have had a much smaller problem to deal with.

Wishing you all a reduced impact.

ASLR simplified!

ASLR explained in one simple picture

ASLR increases difficulty without adding complexity. In Part 1 and Part 2 of this series I demonstrated that crafting attacks can be a pleasant experience without a lot of furious typing. I’ve even shown you how defeating exploits is easy when we really understand how the attack works. Let’s dive deeper into ASLR, your first line of defense.


Let me explain what you’re seeing in this picture. I loaded a CentOS 7.2 libc-2.17, which we crafted an attack against in my previous post. When I loaded the exact same file on the right, I did it with an offset of 16 bytes (10 in hexadecimal).

I’m adding features to the tool when I need them for the story.

I picked 16 (hex 10) because it produces easy-to-interpret, uniform offsets across all addresses.


You’ll notice how the binary on the right is the same binary as on the left, but it’s moved (which is why the lines are all orange.) The gadgets still exist intact but they’re in a different location. Let’s tabulate the first 5 addresses:

1807d1 + 0x10 = 1807e1
1807f1 + 0x10 = 180801
1807b1 + 0x10 = 1807c1
1a44cc + 0x10 = 1a44dc
1770b0 + 0x10 = 1770c0

This is clever because, as you saw in the title image, if we try to execute our ROP chain, c6169 c7466 1b92, it works on the original binary but falls flat on the offset one.


In a nutshell, this is what ASLR does! If we offset the same library differently (and unpredictably) for every program on a machine, the chances that the same attack would work or spread are very low.

Remember, security is not about complexity and two people typing furiously on keyboards, entertaining as that is. Security is about doing what is necessary and sufficient to defeat an attack vector.

How is this movement possible?

Offsets are easy because virtual memory became a thing right around the i386, when we moved away from segmented memory to paged memory. Operating systems, processors and compilers all came together to work on an offset model. This was originally not intended for security, but rather to let programs view a really large memory space when physically they would only ever use a little bit of it. It allowed every program to work from memory address 0 through MAX, and the operating system would map that range to something real.

ASLR makes use of machinery that already existed, which is why any program compiled for a modern operating system automatically benefits from it.
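
If you want to see this on a Linux machine of your own, here’s a minimal sketch (my own illustration, not part of the tool) that prints where libc landed in the current process by reading /proc/self/maps. Run it a few times; with ASLR enabled, the base address changes on every run while the contents of the library stay identical.

/* Minimal sketch: print the libc mappings of this process.
 * With ASLR on, the base address differs from run to run. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *maps = fopen("/proc/self/maps", "r");   /* per-process memory map on Linux */
    char line[512];

    if (!maps)
        return 1;

    while (fgets(line, sizeof(line), maps)) {
        if (strstr(line, "libc"))                 /* the lines describing mapped libc segments */
            fputs(line, stdout);
    }

    fclose(maps);
    return 0;
}

The same bytes of libc get mapped every time; only the starting offset moves, which is exactly what the two columns in the picture above are showing.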

Can we discover more?

I’m particularly proud of this disassembler because you’re not looking at some block diagram I drew in Photoshop or in your favorite visualizer program. You’re looking at a real binary of your choice that you uploaded, and you can now watch these offsets, gadgets and chains at work. This is ASLR on real gadgets, in action!

The cliffhanger for this post is to figure out what techniques you might use to discover the offset… remember, there’s only one piece of information we need to jump to any ROP location in the offset binary. All I would have to do is add 0x10 to each address in my chain, and I’ve broken ASLR. Like so: c6179 c7476 1ba2
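
To make that arithmetic concrete, here’s a minimal sketch of what an attacker does once a single address leaks. The gadget offsets are the ones from this post; the leaked value is a made-up stand-in for whatever infoleak you’d actually use.

/* Minimal sketch: one leaked address rebases the whole chain,
 * because every gadget moved by the same delta. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t chain[] = { 0xc6169, 0xc7466, 0x1b92 }; /* offsets in the original libc */
    uint64_t leaked  = 0xc6179;                      /* hypothetical leak: new home of the first gadget */
    uint64_t delta   = leaked - chain[0];            /* 0x10 in this example */

    for (int i = 0; i < 3; i++)
        printf("%llx ", (unsigned long long)(chain[i] + delta)); /* prints: c6179 c7476 1ba2 */
    printf("\n");

    return 0;
}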


This gave me an idea. You’ll notice that somehow pop rdi ; ret was in the base library even at the offset position! Can we find something common?

I filtered the offset library to show surviving gadgets, and some 2,279 gadgets survived.


I have to admit, I sometimes rig these posts to tell a story, but this caught me off guard. I discovered that an offset alone isn’t enough; a sufficiently LARGE offset is needed when a lot of gadgets occur consecutively. This was crazy!

So the second cliffhanger for today is… given that they are ALL offset by a fixed amount, is it possible to infer the offset trivially? The answer is of course yes, since the video in Part 2 demonstrated it happening. But it’s one thing to read a dry answer and another to intuitively understand it.

Next up I’ll see if I can’t easily figure out an intuitive way to find the offset. I’m basically solving these problems as I write them — this is not some planned series. My team wanted this tool for some other demo, but it ended up being so much fun, I started writing these posts. So I honestly don’t know if I have an answer for intuitive offset-discovery.

Fun with binaries!

ASLR and DEP defeated with three instructions and one offset!

This is Part 2 of my previous post that demonstrated how you craft undetectable attacks against binaries, using our colorful Open Source Entropy Visualization tool. I left you with a cliffhanger… so let’s begin there!

Recap of the cliffhanger

The cliffhanger I left you with was that all we need are three tiny ROP gadgets, and the offset of mprotect, to make any arbitrary part of memory executable. First, I present my proof:

This is a video by Roy Sundahl, one of our most senior engineers, and our resident ROP expert who spends a lot of his time figuring out offensive tools.

Before we proceed, if you’re wondering why we can’t just block calls to mprotect, it turns out there’s some truth to Greenspun’s tenth rule. Let’s forgo the obvious candidates like interpreters and JITers. I learned that even the tiniest of programs that use regular expressions will need to call mprotect — including the innocuous “ls”.
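
For context, here’s what a legitimate mprotect call looks like; a minimal sketch of my own (not Roy’s exploit): map a page read/write, put something in it, then flip it to read/execute. The ROP chain does nothing more exotic than arrange for this same call to happen with attacker-chosen arguments.

/* Minimal sketch of the primitive the exploit reaches for:
 * flip a page from writable to executable with mprotect. */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    void *page = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return 1;

    /* ...write code or data into the page here... */

    if (mprotect(page, pagesize, PROT_READ | PROT_EXEC) != 0) {
        perror("mprotect");
        return 1;
    }

    printf("page at %p is now executable\n", page);
    return 0;
}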

Let’s cast a wider net!

Okay that exploit was cool, and you can do this for yourself by finding gadgets across all the libc’s in the samples.

But can we do more? Can we easily go after a range of machines *without* knowing a target signature? Let’s find out!

Here I’m comparing the same “version” of libc across CentOS 7.1 and 7.2. For a quick reference, on the right, rows with a red background are gadgets that survived perfectly, yellow background are gadgets that exist but at a different location, and no background are gadgets that didn’t exist in the first file.

We found some 2,503 gadgets across them. Notice how little variation there is, even though the code was compiled at two different times from what are probably two slightly different sources. The more gadgets that fall on the same addresses, the easier it is for us to cast a wide net, since it requires that many fewer custom craftings to go after a binary. To determine whether your exploit will work across both, first filter the right side by “Surviving Gadgets,” then search for the gadgets you want.

Let’s try that across CentOS 7.1 and 7.2. First up, pop rdi ; ret? Yep! There it is! The first common address is: c6169.

Second up, pop rsi ; ret? Yep! There it is also! First common address is: c7466.

Finally, pop rdx ; ret? Yep! The first surviving address is: 1b92.

We got our complete ROP chain across both binaries: c6169 c7466 1b92. We can validate this by simulating execution across both binaries.
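
If you’d rather convince yourself these gadgets really are sitting in the bytes, here’s a minimal sketch of what any gadget finder does at its core: scan the file for the two-byte encodings 5f c3 (pop rdi ; ret), 5e c3 (pop rsi ; ret) and 5a c3 (pop rdx ; ret). A real finder disassembles around every ret and restricts itself to executable sections, so the raw file offsets this prints won’t necessarily line up exactly with the addresses the tool shows, but the idea is the same.

/* Minimal sketch: report raw file offsets of the three "pop reg ; ret" gadgets.
 * Usage: ./find_gadgets libc-2.17.so */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    static const unsigned char patterns[3][2] = {
        { 0x5f, 0xc3 },   /* pop rdi ; ret */
        { 0x5e, 0xc3 },   /* pop rsi ; ret */
        { 0x5a, 0xc3 },   /* pop rdx ; ret */
    };
    static const char *names[3] = { "pop rdi ; ret", "pop rsi ; ret", "pop rdx ; ret" };

    if (argc < 2)
        return 1;

    FILE *f = fopen(argv[1], "rb");
    if (!f)
        return 1;

    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    fseek(f, 0, SEEK_SET);

    unsigned char *buf = malloc(len);
    if (!buf || fread(buf, 1, len, f) != (size_t)len)
        return 1;
    fclose(f);

    for (long i = 0; i + 1 < len; i++)
        for (int g = 0; g < 3; g++)
            if (buf[i] == patterns[g][0] && buf[i + 1] == patterns[g][1])
                printf("%s at file offset 0x%lx\n", names[g], i);

    free(buf);
    return 0;
}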

Now you know the complete power of the tool!

This is what the tool is intended to do! You can verify ROP chains across binaries without ever leaving your browser. You can now tell, visually and graphically, whether a particular attack will work against a given binary you run. It can be used to craft attacks, but it can also be used to ensure that a patch really worked.

There’s a bit of emotional comfort when you can execute a chain visually, see how the flow jumps around, and see that it doesn’t work.

Are Overflows/Leaks that common?

All this depends, of course, on being able to manipulate some little bit of stack space. Aren’t overflows so… 2000s? We use bounds-checked modern languages that don’t suffer from these problems.

First of all, if you subscribe to our weekly breach reports, you’ll empirically find that overflows and memory leaks are pretty common. Even the internet’s favorite language, JavaScript, is not immune.

Secondly, my best metric to find truth is to look for back-pressure (the sociological version of proof-by-contradiction). Look out for attempts at locking this down 100%, and then follow the backlash.

However, I also want you to get an intuitive understanding of where they arise and why they happen.

Even I have to admit that certain operations (such as sorting or XML/JSON parsing) are better implemented by manipulating memory buffers directly, despite my well-publicized extremist views favoring immutable data and list comprehensions.

So what does a “real overflow” look like? (Code in the samples directory.)

#include <stdio.h>
#define BUF_LEN 20
int main()
{
    char buf[BUF_LEN];
    int i = 0;
    while (i++ < BUF_LEN) {   /* when i == BUF_LEN - 1 the test passes, then i becomes BUF_LEN... */
        printf("Setting buf[%d] to zero.\n", i);
        buf[i] = 0;           /* ...and this writes buf[BUF_LEN], one byte past the end of the array */
    }
    return 0;
}

I just overwrote a byte past the end of a stack buffer. It’s obvious when I point it out, but if you were working on this code and not looking for overruns, it’s easy to miss. Ever seen the college textbook example of a quicksort using while-loops to avoid using the system stack? They are liberal with while(1)s all over the place.

Personal rant: they are very common, and they are insanely difficult to find. This is why I’m such an extremist about immutability, list comprehensions and symbolic computation. For your business apps, you should NEVER, except in extreme circumstances, listen to that “clever” developer who is doing you the favor of writing efficient code. Pat them on the back. Give them a promotion or whatever. Get them out of the way. Then find a lazy person who’ll use list comprehensions and copy-on-change wherever possible! I’m a big believer in Joe Armstrong’s advice here: First make it work. Then make it beautiful. Finally, if necessary, make it fast.

In our analyses, more than 65% of critical CVEs since June 1st fell under this category. I could be off by a few points on that number since it changes as we compile our reports periodically and tweak how we classify them. But it’s well over 60%.

Putting it all together

In Part 1, I showed you what ROP gadgets are, how to find them, chain them, and exploit them.

In Part 2, I completed the story by demonstrating how to find common gadgets across a wide array of deployed binaries.

The purpose of the Entropy Visualizer is to enable all this decomposition in your browser. In fact this is an easier tool than most ROP finders I know. 🙂

Happy Hunting!

Let’s craft some real attacks!

If you read security briefings, you wake up every morning to “buffer overflow” vulnerabilities, “control flow” exploits, crafted attacks against specific versions of code, and whatnot.

Most of those descriptions are bland and dry. Moreover, much of it makes no intuitive sense, everyone has their fad of the week, and it is easy to feel disillusioned. What’s real, and what’s techno-babble? Didn’t we just pay for the firewalls and deploy the endless stream of patches? What is with all this machine-code nonsense?

A gripe I’ve always had with our industry is that the first solutions we come up with are architectural ivory towers. We try curing cancer on day one, and then in a few years we would sell our soul just to be able to add two numbers reliably. (Yeah, I’m still holding a grudge against UML, CORBA, SOAP, WSDL, and oh for god’s sake — DTDs!)

Let’s skip all that and actually begin by crafting a real attack visually and interactively! No more concepts. No more theory. No more descriptions of instruction set layouts and stacks and heaps! Liberal screenshots to follow! Brace yourself! This is as colorful as binaries will ever get!

Let’s play attacker for a bit

Intro to Tools

Let’s start by visiting this tool I wrote specifically for this blog post, and open a binary.

https://analyze.polyverse.io

(Source code here: https://github.com/Polyverse/binary-entropy-visualizer)

Every time I build a web app, I end up putting a CLI in there.

Now you can drag and drop a file on there to analyze it — yeah, that web page is going to do what advanced geeky nerdy tools are supposed to do on your desktop. For now it only supports Linux 64-bit binaries. Don’t look too hard; there are two samples provided in my GitHub repo: https://github.com/polyverse/binary-entropy-visualizer/tree/master/samples. Simply download either of the files ending in “.so”.

When you throw it on there, it should show you a progress bar with some analysis…

Getting this screenshot was hard — it analyzes quickly.

If you want to know what it’s doing, click on the progress bar to see a complete log of actions taken.

Proof: Despite my best attempts, I hid a CLI in there for myself.

When analysis is complete, you should see a table. This is a table of “ROP gadgets.” You’re witnessing a live analysis in your browser of what people with six screens in dark rooms run with complex command lines and special programs.

But wait.. what about those other two sections?

We won’t go into what ROP gadgets are, what makes them a gadget, and so on. Anyone who’s ever gone through Programming 101 will recognize them as “assembly language code”, another really fun thing that is always presented as dry and irritating. It’s also everywhere.

What is an exploit?

Execution of unwanted instructions

In the fashion of my patron saints, MacGyver (the old one) and the Mythbusters, I am not going to go into how you find a buffer overrun and get to inject stuff onto a stack and so on. Sorry. There are plenty of classes online that teach how to do that, or you might want to visit DEF CON.

Let’s just assume you have a process with a single byte buffer overrun. This isn’t as uncommon as you’d think. Off-by-one errors are plentiful out there. Sure, everyone should use Rust, but didn’t I just rant about how we all want to be “clever” and struggle to plug holes later?

Let’s simply accept that an “exploit” is a set of commands you send to a computer to do what you (the attacker) wants, but something the owner/developer/administrator (the victim) definitely does not want. No matter what name the exploit goes under, at the end of the day it comes down to executing instructions that the attacker wants, and the victim doesn’t. What does stealing/breaking a password do? Allow execution. What does a virus do? Executes instructions. What does SQL-injection do? Executes SQL instructions.

Remember this: execution of unwanted instructions is bad.

Always know what you want

We want a specific set of instructions to run, given below.

Okay let’s craft an exploit now. We’re going to simulate it. All within the browser.

Let’s say, for absolutely arbitrary reasons, that running the following instructions makes something bad happen: WOPR starts playing a game. Trust me: nobody wants that! You don’t have to understand assembly code. In your mind, the following should translate to, “Later. Let’s play Global Thermonuclear War.”

jbe 0x46c18 ; nop ; mov rax, rsi ; pop rbx ;
add byte ptr [rax], al ; add bl, ch ; cmpsb byte ptr [rsi], byte ptr [rdi] ; call rax
and al, 0xf8 ; 
mov ecx, edx ; cmp rdx, rcx ; je 0x12cb78 ; 
jne 0x1668e0 ; add rsp, 8 ; pop rbx ; pop rbp ; 
sub bl, bh ; jmp qword ptr [rax]
add byte ptr [r8 - 0x77], r9b ; fimul dword ptr [rax - 0x77] ; 
or byte ptr [rdi], 0x94 ; 
push r15 ; in eax, dx ; jmp qword ptr [rdx]
jg 0x95257 ; jne 0x95828 ; 
jb 0x146d9a ; movaps xmmword ptr [rdi], xmm4 ; jmp r9
or byte ptr [rdx], al ; add ah, dl ; div dh ; call rsp
jg 0x97acb ; movdqu xmmword ptr [rdi + 0x10], xmm2 ; 
or dword ptr [rax], eax ; add byte ptr [rax], al ; add byte ptr [rax], al ; 
add byte ptr [rax], al ; enter 8, 0 ; 
xor ch, ch ; mov byte ptr [rdi + 0x1a], ch ;

So how do we do it? The most effective ways are social engineering, spearphishing, password-guessing, etc., etc. They are also ways that leave traces. They are effective and blunt, and, with enough data, they will be caught. Also, look at that code. Once someone figures out that this set of instructions causes bad things, it is easy to generate a signature to find any bits of code that match it and prevent it from running.

But I wouldn’t be writing this post if that was the end of it.

Just because you can’t inject this code through those methods doesn’t mean you can’t inject code that will cause this series of instructions to be executed. AI, analytics, machine learning: they all suffer from one big flaw — the Turing Test.

A program isn’t malicious because it “has bad instructions.” There’s no such thing as “bad instructions”. Why would processors, and machines and servers and phones ship with “bad instructions?” No, there are bad sequences of instructions!

A program doesn’t necessarily have to carry the bad sequence within itself. All it has to do is carry friendly good sequences, which, on the target host, lead to bad sequences getting executed. If you haven’t guessed already, this behavior may not necessarily be malicious; it might even be accidental.

How to get what you want

Now go back to the tool if you haven’t closed it. Use the file “libc-2.17.so” from the samples, and load it.

Then enter this sequence of numbers in the little text box below “ROP Chain Execution:”

46c1c 7ac3f 46947 12cb5f 166900 183139 cfdcb 12f7ea 191614 95236 146d8a 1889ad 97abb 4392 17390e 98878

It should look something like this:


Go ahead and execute the chain.


Well guess what? An exact match to my instructions to activate WOPR!

The libc that you just analyzed is a fundamental and foundational library linked into practically any and every program on a Linux host. It is checked and validated and patched. Each of those instructions is a good instruction — approved and validated by the processor-maker, the compiler-maker, the package manager all the way down to your system administrator.

What’s a REAL sequence of bad instructions?

pop rdi, pop rsi, pop rdx, and the offset of mprotect is all it takes!

I made up the sequence above. In a complete break from convention, I made it more complex just so it’d look cool. Real exploits require gadgets so simple, you’ll think I’m making this part up!

A real, known-dangerous exploit we (simulated) in our lab requires only three ROP gadget locations and the offset to mprotect within libc. We can defeat ASLR remotely in seconds, and once we call mprotect, we can make anything we want executable.

You can see how easy it is to “Find Gadget” and create your own chain for:
pop rdi ; ret
pop rsi ; ret
pop rdx ; ret
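
To see why those three gadgets plus mprotect’s offset are enough, here’s a minimal sketch of the payload an attacker would write past an overflowed buffer. On x86-64 Linux the first three arguments go in rdi, rsi and rdx, so each pop-ret gadget loads one argument for mprotect(addr, len, PROT_READ | PROT_WRITE | PROT_EXEC). Every value below is an illustrative placeholder, not something you can paste anywhere as-is.

/* Minimal sketch: the stack the attacker builds. Each "pop ; ret" gadget
 * loads one argument register and returns into the next entry, until
 * mprotect runs with attacker-chosen values. All values are illustrative. */
#include <stdio.h>
#include <stdint.h>

static const uint64_t payload[] = {
    /* ...padding up to the saved return address goes here... */
    0xc6169,    /* pop rdi ; ret                                         */
    0xdead000,  /* rdi = page-aligned address to make executable (fake)  */
    0xc7466,    /* pop rsi ; ret                                         */
    0x1000,     /* rsi = length in bytes                                 */
    0x1b92,     /* pop rdx ; ret                                         */
    0x7,        /* rdx = PROT_READ | PROT_WRITE | PROT_EXEC              */
    0x0,        /* libc base + offset of mprotect goes here              */
};

int main(void)
{
    for (size_t i = 0; i < sizeof(payload) / sizeof(payload[0]); i++)
        printf("stack slot %zu: 0x%llx\n", i, (unsigned long long)payload[i]);
    return 0;
}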

This illustrates how simple exploits hide behind cumbersome tools, giving the illusion of difficulty or complexity.

Crafting your own real, serious payloads

So why is this ROP analyzer such a big deal? If you haven’t put two and two together, an exploit typically works like this:

  1. You figure out what you want (we covered this step above).
  2. You need to figure out a sequence of instruction groups, each ending with some kind of jump/return/call, that you can exploit to get the instructions in between executed.

Turns out that step 2 is not so easy. You need to know what groups of instructions you have to play with, so you can craft chains of them together.

This tool exports these little instruction-groups (called gadgets) from the binaries you feed it. You can then solve for which gadgets, in what sequence, will achieve your goal.

This is a complex computational problem that I won’t solve today.

Look out for Part 2 of my post which will go into what the other “Compare File” dialog is for… stay tuned! It’s dead trivial to figure out, anyway, so go do it if you want.