Sean Cassidy

The company I helped start, DefenseStorm, just celebrated its second year, acquired loads more customers, changed its name, and we're just about to announce another round of fundraising. I have learned more in the past two years than I thought possible. Many of the lessons were not about technology, as I expected, but instead were about people. That surprised me, but in retrospect, it seems obvious. Companies are functions of who they are.

https://www.defensestorm.com/

Obfuscated hack
https://www.seancassidy.me/diagnosis-of-the-openssl-heartbleed-bug.html

sean cassidy: Diagnosis of the OpenSSL Heartbleed Bug

Mon 07 April 2014 in: programming

When I wrote about the GnuTLS bug, I said that it wouldn't be the last severe TLS stack bug we'd see. I didn't expect the next one to be quite this bad, however.

The Heartbleed bug is a particularly nasty bug. It allows an attacker to read up to 64KB of memory, and the security researchers have said:

"Without using any privileged information or credentials we were able steal from ourselves the secret keys used for our X.509 certificates, user names and passwords, instant messages, emails and business critical documents and communication."

How could this happen? Let's read the code and find out.

The bug

The fix starts here, in ssl/d1_both.c:

    int
    dtls1_process_heartbeat(SSL *s)
    {
        unsigned char *p = &s->s3->rrec.data[0], *pl;
        unsigned short hbtype;
        unsigned int payload;
        unsigned int padding = 16; /* Use minimum padding */

So, first we get a pointer to the data within an SSLv3 record. That looks like this:

    typedef struct ssl3_record_st
    {
        int type;                 /* type of record */
        unsigned int length;      /* How many bytes available */
        unsigned int off;         /* read/write offset into 'buf' */
        unsigned char *data;      /* pointer to the record data */
        unsigned char *input;     /* where the decode bytes are */
        unsigned char *comp;      /* only used with decompression - malloced */
        unsigned long epoch;      /* epoch number, needed by DTLS1 */
        unsigned char seq_num[8]; /* sequence number, needed by DTLS1 */
    } SSL3_RECORD;

Records have a type, a length, and data. Back to dtls1_process_heartbeat:

    /* Read type and payload length first */
    hbtype = *p++;
    n2s(p, payload);
    pl = p;

The first byte of the SSLv3 record is the heartbeat type. The macro n2s takes two bytes from p, and puts them in payload. This is actually the length of the payload. Note that the actual length in the SSLv3 record is not checked.
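The three statements above can be sketched as a standalone function. This is only an illustration: `read_u16_be`, `struct heartbeat_header`, and `parse_heartbeat_header` are hypothetical names, not OpenSSL identifiers, and `read_u16_be` is a function-style paraphrase of what the `n2s` macro does. Note that the record length never appears anywhere in it, which is exactly the problem.

```c
/* Hypothetical helper: a function form of what n2s does. It reads a 16-bit
 * big-endian (network order) value and advances the cursor by two bytes. */
static unsigned int read_u16_be(const unsigned char **cursor)
{
    const unsigned char *c = *cursor;
    unsigned int value = ((unsigned int)c[0] << 8) | (unsigned int)c[1];
    *cursor = c + 2;
    return value;
}

/* Hypothetical struct for illustration; not an OpenSSL type. */
struct heartbeat_header {
    unsigned char type;           /* heartbeat message type */
    unsigned int claimed_payload; /* length the sender claims to have sent */
    const unsigned char *payload; /* where the payload is supposed to start */
};

/* Mirrors the three statements above: type byte, two length bytes, payload
 * pointer. The caller must pass at least 3 readable bytes. Like the
 * vulnerable code, nothing here compares claimed_payload with the number
 * of bytes that actually arrived in the record. */
static struct heartbeat_header parse_heartbeat_header(const unsigned char *record)
{
    struct heartbeat_header h;
    const unsigned char *p = record;

    h.type = *p++;
    h.claimed_payload = read_u16_be(&p);
    h.payload = p;
    return h;
}
```

Feeding it the four bytes `01 FF FF 41` yields a claimed payload of 65535 even though only one payload byte was sent.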

The variable pl is then the resulting heartbeat data, supplied by the requester.

Later in the function, it does this:

    unsigned char *buffer, *bp;
    int r;

    /* Allocate memory for the response, size is 1 byte
     * message type, plus 2 bytes payload length, plus
     * payload, plus padding
     */
    buffer = OPENSSL_malloc(1 + 2 + payload + padding);
    bp = buffer;

So we're allocating as much memory as the requester asked for: up to 65535+1+2+16, to be precise. The variable bp is going to be the pointer used for accessing this memory. Then:

    /* Enter response type, length and copy payload */
    *bp++ = TLS1_HB_RESPONSE;
    s2n(payload, bp);
    memcpy(bp, pl, payload);

The macro s2n does the inverse of n2s: it takes a 16-bit value and puts it into two bytes. So it puts the same payload length requested.
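To make the "inverse" relationship concrete, here is a small sketch with both directions written as functions. The names `write_u16_be`, `read_u16_be`, and `round_trip_u16` are hypothetical; they paraphrase what the `s2n` and `n2s` macros do and are not copied from OpenSSL.

```c
/* Hypothetical function form of s2n: store a 16-bit value as two bytes in
 * network (big-endian) order and advance the write cursor by two. */
static void write_u16_be(unsigned int value, unsigned char **cursor)
{
    unsigned char *c = *cursor;
    c[0] = (unsigned char)((value >> 8) & 0xff);
    c[1] = (unsigned char)(value & 0xff);
    *cursor = c + 2;
}

/* Hypothetical function form of n2s: read two bytes in network order into
 * a 16-bit value and advance the read cursor by two. */
static unsigned int read_u16_be(const unsigned char **cursor)
{
    const unsigned char *c = *cursor;
    unsigned int value = ((unsigned int)c[0] << 8) | (unsigned int)c[1];
    *cursor = c + 2;
    return value;
}

/* Writing a length and reading it back returns the same number, which is
 * why the response carries exactly the length the requester claimed. */
static unsigned int round_trip_u16(unsigned int value)
{
    unsigned char wire[2];
    unsigned char *w = wire;
    const unsigned char *r = wire;

    write_u16_be(value, &w);
    return read_u16_be(&r);
}
```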

Then it copies payload bytes from pl, the user supplied data, to the newly allocated bp array. After this, it sends this all back to the user. So where's the bug?

The user controls payload and pl

What if the requester didn't actually supply payload bytes, like she said she did? What if pl really is only one byte? Then the read from memcpy is going to read whatever memory was near the SSLv3 record and within the same process.
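A toy model may help here. The sketch below is not OpenSSL code: `struct toy_process`, `toy_process_init`, and `vulnerable_echo` are hypothetical names, and the "process memory" is a single 64-byte array so that the over-read stays within defined C behavior. The array holds a 4-byte heartbeat record (one real payload byte) followed directly by a made-up secret. The echo function trusts the claimed length the same way the vulnerable code does.

```c
#include <stddef.h>
#include <string.h>

#define ARENA_SIZE 64
#define RECORD_LEN 4 /* type + 2 length bytes + the 1 payload byte really sent */

/* Hypothetical stand-in for process memory: a short heartbeat record with
 * unrelated secret data sitting right after it. */
struct toy_process {
    unsigned char memory[ARENA_SIZE];
};

static void toy_process_init(struct toy_process *proc, unsigned int claimed_payload)
{
    static const char secret[] = "SECRET_KEY";

    memset(proc->memory, 0, sizeof proc->memory);
    proc->memory[0] = 1; /* heartbeat request type */
    proc->memory[1] = (unsigned char)((claimed_payload >> 8) & 0xff);
    proc->memory[2] = (unsigned char)(claimed_payload & 0xff);
    proc->memory[3] = 'A'; /* the only payload byte actually supplied */
    memcpy(proc->memory + RECORD_LEN, secret, sizeof secret - 1);
}

/* Same shape as the vulnerable code: read the claimed length, then copy
 * that many bytes starting at the payload pointer. Returns the number of
 * bytes copied into response. */
static size_t vulnerable_echo(const struct toy_process *proc,
                              unsigned char *response, size_t response_cap)
{
    const unsigned char *p = proc->memory;
    const unsigned char *pl;
    unsigned int payload;

    p++; /* skip the type byte */
    payload = ((unsigned int)p[0] << 8) | (unsigned int)p[1];
    p += 2;
    pl = p;

    /* This guard exists only to keep the toy inside its own array. It is
     * not the fix: nothing compares payload against RECORD_LEN. */
    if (payload > ARENA_SIZE - 3 || payload > response_cap)
        return 0;

    memcpy(response, pl, payload);
    return payload;
}
```

Initializing the toy with a claimed payload of 11 and calling `vulnerable_echo` returns the single real byte `A` followed by the ten bytes of `SECRET_KEY`, none of which the requester ever sent.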

And apparently, there's a lot of stuff nearby.

There are two ways memory is dynamically allocated with malloc (at least on Linux): using sbrk(2) and using mmap(2). If the memory is allocated with sbrk, it follows the old heap-grows-up rules, which limits what can be found this way, although multiple requests (especially simultaneous ones) could still find some fun stuff [1].

The allocations for bp don't matter at all, actually. The allocation for pl, however, matters a great deal. It's almost certainly allocated with sbrk because of the mmap threshold in malloc. However, interesting stuff (like documents or user info) is very likely to be allocated with mmap and might be reachable from pl. Multiple simultaneous requests will also make some interesting data available.

And your secret keys will probably be available.

The fix

The most important part of the fix was this:

    /* Read type and payload length first */
    if (1 + 2 + 16 > s->s3->rrec.length)
        return 0; /* silently discard */
    hbtype = *p++;
    n2s(p, payload);
    if (1 + 2 + payload + 16 > s->s3->rrec.length)
        return 0; /* silently discard per RFC 6520 sec. 4 */
    pl = p;
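The same two comparisons can be pulled out into a standalone predicate, which makes them easy to exercise in isolation. This is a sketch under assumptions, not the OpenSSL patch itself: `heartbeat_length_ok`, `HB_HEADER_LEN`, and `HB_MIN_PADDING` are hypothetical names, and `record_length` plays the role of `s->s3->rrec.length`.

```c
#include <stddef.h>

#define HB_HEADER_LEN 3   /* 1 type byte + 2 payload-length bytes */
#define HB_MIN_PADDING 16 /* minimum padding, as in the code above */

/* Returns 1 if the claimed payload fits inside the record that actually
 * arrived, 0 if the message should be silently discarded. On success the
 * claimed length is stored in *payload_out. */
static int heartbeat_length_ok(const unsigned char *record, size_t record_length,
                               unsigned int *payload_out)
{
    unsigned int payload;

    /* First check: the record must hold at least a header and padding. */
    if ((size_t)HB_HEADER_LEN + HB_MIN_PADDING > record_length)
        return 0;

    /* Safe to read the two length bytes now (network byte order). */
    payload = ((unsigned int)record[1] << 8) | (unsigned int)record[2];

    /* Second check: header + claimed payload + padding must fit too. */
    if ((size_t)HB_HEADER_LEN + payload + HB_MIN_PADDING > record_length)
        return 0;

    *payload_out = payload;
    return 1;
}
```

With a 24-byte record that claims a 5-byte payload the predicate accepts; if the same record claims 65535 bytes, or if only 4 bytes arrived, it rejects.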

The fix does two things: the first check stops zero-length heartbeats. The second check makes sure that the actual record is long enough to hold the payload it claims to carry. That's it.

Lessons

What can we learn from this?

I'm a fan of C. It was my first programming language and it was the first language I felt comfortable using professionally. But I see its limitations more clearly now than I ever have before.

Between this and the GnuTLS bug, I think that we need to do three things:

1. Pay money for security audits of critical security infrastructure like OpenSSL
2. Write lots of unit and integration tests for these libraries
3. Start writing alternatives in safer languages

Given how difficult it is to write safe C, I don't see any other options. I would donate to this effort. Would you?

[1] This section originally contained my skepticism about the feasibility of a PoC due to the nature of how the heap works via sbrk. Neel Mehta has validated some of my concerns, but there are many reports of secret key discovery out there.

tags: security bugs crypto c code

Sean is the CTO and co-founder at DefenseStorm, a cloud cybersecurity startup.
