The Web is World-Wide, or who still needs RC4?

May 19, 2014, 4:00 pm

Two weeks ago we changed our TLS configuration to deprioritize the RC4 encryption method because it is widely thought to be vulnerable to attack. At the time we had an internal debate about turning off RC4 altogether, but statistics showed that we couldn't. Although only a tiny percentage of web browsers hitting CloudFlare's network needed RC4, that's still thousands of people contacting web sites that use our service.

To understand who needs RC4 I delved into our live logging system and extracted the User-Agent and the country the visitor was coming from. In total, roughly 0.000002% of requests to CloudFlare use the RC4 cipher. It's a small number, but it's significant enough that we believe we need to continue to support it for our customers.

Requests to CloudFlare sites that are still using RC4 fall into four main categories: people passing through proxies, older phones (often candy bar), other (more on that below) and a bucket of miscellaneous (like software checking for updates, old email programs and ancient operating systems).

Breakdown

Ciphers

Examining data for a 59 hour period last week showed that 34.4% of RC4-based requests used RC4-SHA and 63.6% used ECDHE-RSA-RC4-SHA. RC4-SHA is the older of the two; ECDHE-RSA-RC4-SHA uses a newer elliptic curve based method of establishing an SSL connection. Either way, they both use the RC4 encryption algorithm to secure data sent across the SSL connection. We'd like to stop supporting RC4 altogether, because it is no longer believed to be secure, but we continue to offer it for the small number of clients who can't connect more securely.

If you ever need to know the details of an SSL cipher you can use the openssl ciphers command:

$ openssl ciphers -tls1 -v RC4-SHA
RC4-SHA SSLv3 Kx=RSA Au=RSA Enc=RC4(128) Mac=SHA1

which shows that RC4-SHA uses RSA for key exchange, RSA for authentication, RC4 (128-bit) for encryption and SHA1 for message authentication.

Similarly,

$ openssl ciphers -tls1 -v ECDHE-RSA-RC4-SHA
ECDHE-RSA-RC4-SHA SSLv3 Kx=ECDH Au=RSA Enc=RC4(128) Mac=SHA1

shows that the same encryption, authentication and message authentication are used as in RC4-SHA, but the key exchange is made using Elliptic Curve Diffie-Hellman.
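To compare suites programmatically, the output lines above can be split into fields. The following is a minimal Python sketch; the function name parse_cipher_line is made up for this illustration, and it assumes the line layout shown in the two examples above.

```python
def parse_cipher_line(line):
    """Split one line of `openssl ciphers -v` output into named fields."""
    name, protocol, *attrs = line.split()
    fields = {"name": name, "protocol": protocol}
    for attr in attrs:
        # Each remaining token looks like "Kx=RSA" or "Enc=RC4(128)"
        key, _, value = attr.partition("=")
        fields[key] = value
    return fields

rc4_sha = parse_cipher_line("RC4-SHA SSLv3 Kx=RSA Au=RSA Enc=RC4(128) Mac=SHA1")
ecdhe = parse_cipher_line("ECDHE-RSA-RC4-SHA SSLv3 Kx=ECDH Au=RSA Enc=RC4(128) Mac=SHA1")

# Which properties differ between the two suites (ignoring the name itself)?
differences = {key for key in rc4_sha if key != "name" and rc4_sha[key] != ecdhe[key]}
print(differences)
```

For these two lines the only differing field is the key exchange (Kx).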

Inside RC4

One of the reasons RC4 is used for encryption is its speed. RC4 is a very fast encryption algorithm and it can be easily implemented on a wide variety of hardware (including phones with slow processors and even on 8-bit systems like the Arduino). After all, RC4 dates back to 1987.

The core of RC4 is the following algorithm:

i := 0
j := 0 
while GeneratingOutput:
    i := (i + 1) mod 256
    j := (j + S[i]) mod 256
    swap values of S[i] and S[j]
    K := S[(S[i] + S[j]) mod 256]
    output K
endwhile

It generates a pseudo-random stream of numbers (all 8-bit numbers in the range 0 to 255) by doing simple lookups in a table, swapping of values and addition modulo 256 (which is very, very fast). The output of that algorithm is usually combined with the message to be encrypted byte-by-byte using some fast scheme like an XOR.
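For readers who want to experiment, here is a short Python sketch of the loop above. The key-scheduling step that fills the table S from the key is not shown in this post; the version here is the standard RC4 key schedule. The function names are chosen for this example, and the code is for illustration only: RC4 should not be used to protect real data.

```python
def rc4_keystream(key):
    """Yield RC4 keystream bytes for `key` (a non-empty bytes object)."""
    # Key scheduling: start from the identity permutation and mix in the key.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # Output loop: the same steps as the pseudocode above.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        yield S[(S[i] + S[j]) % 256]

def rc4_crypt(key, data):
    """XOR `data` with the keystream; the same call encrypts and decrypts."""
    return bytes(byte ^ k for byte, k in zip(data, rc4_keystream(key)))

ciphertext = rc4_crypt(b"Key", b"Plaintext")
print(ciphertext.hex())
print(rc4_crypt(b"Key", ciphertext))
```

Because the message is simply XORed with the keystream, applying rc4_crypt twice with the same key returns the original bytes.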

The following short video shows the RC4 algorithm in action. It's been restricted to 32 values rather than 256 to fit it nicely on the screen but the algorithm is identical. The red stream of numbers at the bottom shows the pseudo-random stream output by the algorithm. (The code for this animation is available here; thanks to @AntoineGrondin for making the animated GIF).

So, RC4 is fast, but who's still using it? To answer that I looked at the HTTP User-Agent reported by the device connecting to CloudFlare. There were 292 unique User-Agents.

Who still uses RC4?

Firstly, lots of people using older "candy bar" style phones. Phones like the Nokia 6120 classic, which was released in 2007 (and is the phone with the greatest number of RC4 requests to CloudFlare sites: 4% of the RC4-based requests in the measurement period), the Lemon T109 or the Sony Ericsson K310, which was released in 2006.

Older phones

And, of course, it's not all older telephones being used to visit CloudFlare-powered web sites. There are old browsers too. For example, we've seen the now ancient iCab 2.9.9 web browser (it was released in 2006) and we've seen it being used running on a 68k based Macintosh (last sold by Apple in 1996).

iCab 2.9.9

Another source of RC4-only connections is older versions of Adobe AIR. AIR is often used for games and if users don't update the AIR runtime they can end up using the older RC4 cipher.

Yet another source is stand-alone software that makes its own SSL connection. We've seen some software checking update servers using RC4-secured connections. The software makes a connection to its own update server using HTTPS but the available ciphers are limited and RC4 is chosen. The command-line program curl was used to generate 1.9% of RC4-based requests to CloudFlare sites (all done with versions dating to 2009).

IE 5.01

There's also quite a bit of older Microsoft Internet Explorer around including Internet Explorer 5.01 (which dates back to 1999!). Here's a breakdown of Internet Explorer versions connecting using RC4:

IE Versions

Looking at Windows tells a similar story of older versions of the operating system (except for the presence of Windows 7, which is explained below) with lots of Windows XP out there:

Windows versions

I sampled connections using RC4 to see which countries they came from. The following mapping shows the distribution of RC4-based connections across the world (the darker the more RC4 was used).

The World

From the map you can see that in Brazil, India, and across central Africa RC4 is still being used quite widely. But you'll also notice that the coloring of the US indicates that a lot of RC4 is in use there. That seems like a surprise, and there's an extra surprise.

Transparent SSL Proxies

Digging into the User-Agent data for the US we see the following web browser being used to access CloudFlare-powered sites using RC4:

Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/34.0.1847.137 Safari/537.36

That's the most recent version of Google Chrome running on Windows 7 (you can see the presence of Windows 7 in the chart above). That should not be using RC4. In fact, most of the connections from Windows machines that we see using RC4 should not be (since we prioritize 3DES over RC4 for older machines).

It was initially unclear why this was happening until we looked at where the connections were coming from. They were concentrated in the US and Brazil and most seemed to be coming from IP addresses used by schools, hospitals and other large institutions.

Although the desktop machines in these locations have recent Windows and up-to-date browsers (which will not use RC4), the networks they are on are using SSL-based VPNs or firewalls that are performing man-in-the-middle monitoring of SSL connections.

This enables them to filter out undesirable sites, even those that are accessed using HTTPS, but it appears that the VPN/firewall software is using older cipher suites. That software likely needs updating to stop it using RC4 for secure connections.

What you can do

You can check the strength of your browser's SSL configuration by visiting How's My SSL. If you get a rating of "Probably Okay" then you're good. If not, make sure you have the latest browser version.

Also be sure to update to the latest version of Adobe Flash (if you use it) and Adobe AIR (if you use it).

If you are using an old phone that uses RC4 consider not using it for secure web browsing. RC4 is widely believed to be broken.

↧

BPF - the forgotten bytecode

May 21, 2014, 4:00 pm

Every once in a while I run into an obscure computer technology that is a hidden gem, which over the years has become mostly forgotten. This is exactly how I feel about the tcpdump tool and its kernel counterpart the packet filter interface.

For example, say you run:

$ tcpdump -ni eth0 ip and udp and port 53

For most of us this command is pure magic; almost nobody understands what happens behind the scenes. This is understandable, as there is little need to know how it works: the tool does its job very well, and it's descriptive and very fast.

In this article I'll try to explain how tcpdump works and how we use its spinoffs to help fight the packet floods that hit us every day.

But first, we need a bit of history.

Historical context

Since workstations became interconnected, network administrators had a need to "see" what is flowing on the wires. The ability to sniff the network traffic is necessary when things go wrong, even for the most basic debugging.

For this reason operating systems developed APIs for packet sniffing. But, as there wasn't any real standard for it, every OS had to invent a different API: Sun’s STREAMS NIT, DEC's Ultrix Packet Filter, SGI’s Snoop, and the CMU/Stanford Packet Filter on the Xerox Alto. This led to many complications. The simpler APIs just copied all the packets to the user space sniffer, which on a busy system resulted in a flood of useless work. The more complex APIs were able to filter packets before passing them to userspace, but it was often cumbersome and slow.

All this changed in 1993 when Steven McCanne and Van Jacobson published a paper introducing a better way of filtering packets in the kernel, which they called "The BSD Packet Filter" (BPF).

Since then BPF has taken the world by storm and, along with libpcap and tcpdump, has become the de facto standard in network debugging.

Tcpdump dissected

Tcpdump is composed of three logical parts:

Expression parser: Tcpdump first parses a readable filter expression like ip and udp and port 53. The result is a short program in a special minimal bytecode, the BPF bytecode.
The BPF bytecode (filter program) is attached to the network tap interface.
Finally, tcpdump pretty prints filtered packets received from the network tap. Pretty printing is far from a simple task: tcpdump needs to understand many network protocols to do it.

Expression parser

Given a packet filtering expression, tcpdump produces a short program in the BPF bytecode. The easiest way to see the parser in action is to pass a -d flag, which will produce a readable assembly-like program:

$ sudo tcpdump -p -ni eth0 -d "ip and udp"
(000) ldh      [12]
(001) jeq      #0x800           jt 2    jf 5
(002) ldb      [23]
(003) jeq      #0x11            jt 4    jf 5
(004) ret      #65535
(005) ret      #0

This program reads like this:

Load a half-word (2 bytes) from the packet at offset 12.
Check if the value is 0x0800, otherwise fail. This checks for an IP packet on top of an Ethernet frame.
Load a byte from the packet at offset 23. That's the "protocol" field, 9 bytes into the IP header.
Check if the value is 0x11, which is the UDP protocol number, otherwise fail.
Return success. The packet matches the rule.
Return failure. This is instruction 5, the target of both failed checks; the packet does not match the rule.
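The same checks can be written out by hand. The following Python sketch mirrors the six instructions on a raw Ethernet frame; the function and helper names are invented for this example, and fake_frame builds a zero-filled frame that only sets the two fields the filter reads.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 0x11

def matches_ip_and_udp(frame):
    """Apply the checks of the 'ip and udp' filter to a raw Ethernet frame."""
    if len(frame) < 24:
        return False                                     # a load past the end rejects the packet
    (ethertype,) = struct.unpack_from("!H", frame, 12)   # ldh [12]
    if ethertype != ETHERTYPE_IPV4:                      # jeq #0x800, jf 5
        return False                                     # ret #0
    protocol = frame[23]                                 # ldb [23]
    return protocol == IPPROTO_UDP                       # jeq #0x11, then ret #65535 or ret #0

def fake_frame(ethertype, protocol):
    """Build a zero-filled 34-byte frame with only the two tested fields set."""
    frame = bytearray(34)
    struct.pack_into("!H", frame, 12, ethertype)
    frame[23] = protocol
    return bytes(frame)

print(matches_ip_and_udp(fake_frame(0x0800, 17)))   # IPv4 carrying UDP
print(matches_ip_and_udp(fake_frame(0x0800, 6)))    # IPv4 carrying TCP
```

The BPF version does exactly this work, but inside the kernel and before the packet is copied to user space.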

Here you can find the full documentation of the assembly syntax.

Less readable compiled bytecode is printed with the -ddd option:

$ sudo tcpdump -p -ni eth0 -ddd "ip and udp"|tr "\n" ","
6,40 0 0 12,21 0 3 2048,48 0 0 23,21 0 1 17,6 0 0 65535,6 0 0 0,

Kernel API

Tcpdump can open a network tap by requesting a SOCK_RAW socket, and after a few magical setsockopt calls a filter can be set with the SO_ATTACH_FILTER option:

sock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL))
...
setsockopt(sock, SOL_SOCKET, SO_ATTACH_FILTER, ...)

From now on the BPF filter will be run against all received packets on a network interface and only packets matching that filter will be passed to that network tap file descriptor.
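As a rough illustration of those two calls, here is a Python sketch that packs the "ip and udp" program from earlier and attaches it to a packet socket. Several things here are assumptions rather than anything from tcpdump itself: the function names are made up, Python spells PF_PACKET as AF_PACKET, and SO_ATTACH_FILTER is hard-coded to 26 (its value in Linux's asm-generic/socket.h) because Python's socket module does not export it. Opening the socket needs Linux and CAP_NET_RAW, so only the packing step runs when the file is executed.

```python
import ctypes
import socket
import struct

SO_ATTACH_FILTER = 26   # assumed value from Linux asm-generic/socket.h
ETH_P_ALL = 0x0003

# The "ip and udp" program as (code, jt, jf, k) tuples
IP_AND_UDP = [(40, 0, 0, 12), (21, 0, 3, 2048), (48, 0, 0, 23),
              (21, 0, 1, 17), (6, 0, 0, 65535), (6, 0, 0, 0)]

def pack_program(instructions):
    """Pack the tuples as an array of struct sock_filter (u16, u8, u8, u32)."""
    return b"".join(struct.pack("HBBI", *ins) for ins in instructions)

def open_filtered_tap(instructions):
    """Open a packet socket and attach the filter (Linux only, needs CAP_NET_RAW)."""
    blob = ctypes.create_string_buffer(pack_program(instructions))
    # struct sock_fprog { unsigned short len; struct sock_filter *filter; }
    fprog = struct.pack("HP", len(instructions), ctypes.addressof(blob))
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
    sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, fprog)
    return sock

print(len(pack_program(IP_AND_UDP)), "bytes of filter program")
```

In practice libpcap hides all of this, which is one reason to prefer it.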

All the gritty details are described in the Documentation/networking/filter.txt file. For the best performance one can use a zero-copy PACKET_MMAP / PACKET_RX_RING interface, though most people should probably stick to the high level libpcap API.

The BPF bytecode

In essence tcpdump asks the kernel to execute a BPF program within the kernel context. This might sound risky, but actually isn't. Before executing the BPF bytecode the kernel ensures that it's safe:

All the jumps are only forward, which guarantees that there aren't any loops in the BPF program. Therefore it must terminate.
All instructions, especially memory reads are valid and within range.
A single BPF program has at most 4096 instructions.

All this guarantees that the BPF programs executed within kernel context will run fast and will never infinitely loop. That means the BPF programs are not Turing complete, but in practice they are expressive enough for the job and deal with packet filtering very well.
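A simplified version of these checks can be sketched in Python. This is not the kernel's code: it covers only the program length, the jump targets and the final return, and it assumes the classic BPF encoding in which the low three bits of the opcode give the instruction class. The name is_safe is invented for this example.

```python
MAX_INSTRUCTIONS = 4096          # the limit quoted in the list above
BPF_JMP, BPF_RET = 0x05, 0x06    # instruction classes (opcode & 0x07)
BPF_JA = 0x00                    # unconditional jump (opcode & 0xF0)

def is_safe(program):
    """Check a list of (code, jt, jf, k) tuples against the simplified rules."""
    n = len(program)
    if n == 0 or n > MAX_INSTRUCTIONS:
        return False
    for pc, (code, jt, jf, k) in enumerate(program):
        if code & 0x07 == BPF_JMP:
            # Offsets are unsigned and relative to the next instruction, so a
            # jump can only go forward; it must also land inside the program.
            if code & 0xF0 == BPF_JA:
                targets = [pc + 1 + k]
            else:
                targets = [pc + 1 + jt, pc + 1 + jf]
            if any(target >= n for target in targets):
                return False
    # The last instruction must be a return, so execution cannot fall off the end.
    return program[-1][0] & 0x07 == BPF_RET

ip_and_udp = [(40, 0, 0, 12), (21, 0, 3, 2048), (48, 0, 0, 23),
              (21, 0, 1, 17), (6, 0, 0, 65535), (6, 0, 0, 0)]
print(is_safe(ip_and_udp))
print(is_safe([(21, 0, 9, 0), (6, 0, 0, 0)]))   # jump lands outside the program
```

Because jump offsets cannot be negative, no extra loop detection is needed: every instruction is executed at most once.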

The original concepts underlying the BPF were described in 1993 and didn't require updates for many years. The Linux implementation on the other hand is steadily evolving: recently a new and shiny just-in-time BPF compiler was introduced, and a few months ago an attempt was made to upgrade the BPF assembly to a 64-bit form.

Not only tcpdump

BPF is an absolutely marvelous and flexible way of filtering packets. Over the years it has been reused in more and more places, and Linux now uses BPF filters for:

tcpdump-style packet filtering
"cls_bpf" classifier for traffic shaping (QoS)
"seccomp-bpf" syscalls filter to sandbox applications
"xt_bpf" iptables module

How we use it

CloudFlare deals with massive packet floods on a daily basis. It's very important for us to be able to drop malicious traffic fast, long before it hits the application.

Unfortunately matching before the application is not easy. Basic iptables filtering, for example looking just at the source IP, doesn't work as floods get more sophisticated. The iptables module closest to our needs is "xt_u32", but it's hard to understand and somewhat limited. It's still generally pretty useful, and people have written rule generators to make it easier to use.

But what works best for us is the "xt_bpf" iptables module by Willem de Bruijn. With it we can match an iptables rule based on any BPF expression.

Unfortunately, our BPF bytecode became pretty complex and it can't be written as a usual tcpdump expression any more. Instead we rely on custom crafted BPF bytecode. For example, this is an "xt_bpf" bytecode that matches a DNS query for "example.com":

    ld #20
    ldx 4*([0]&0xf)
    add x
    tax

lb_0:
    ; Match: 076578616d706c6503636f6d00 '\x07example\x03com\x00'
    ld [x + 0]
    jneq #0x07657861, lb_1
    ld [x + 4]
    jneq #0x6d706c65, lb_1
    ld [x + 8]
    jneq #0x03636f6d, lb_1
    ldb [x + 12]
    jneq #0x00, lb_1
    ret #1

lb_1:
    ret #0

To compile it we use the tools from the tools/net directory:

$ bpf_asm drop_example_com.bpf
14,0 0 0 20,177 0 0 0,12 0 0 0,7 0 0 0,64 0 0 0,21 0 7 124090465,64 0 0 4,21 0 5 1836084325,64 0 0 8,21 0 3 56848237,80 0 0 12,21 0 1 0,6 0 0 1,6 0 0 0
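To see what the compiled rule accepts, it can be run through a small interpreter. The Python sketch below handles only the eight opcodes this program uses and treats any load outside the packet as "no match". The names parse, run and dns_query_packet are invented for this example, and the test packet is a bare-bones IPv4 + UDP + DNS layout with zeroed headers rather than real traffic.

```python
import struct

BYTECODE = ("14,0 0 0 20,177 0 0 0,12 0 0 0,7 0 0 0,64 0 0 0,21 0 7 124090465,"
            "64 0 0 4,21 0 5 1836084325,64 0 0 8,21 0 3 56848237,80 0 0 12,"
            "21 0 1 0,6 0 0 1,6 0 0 0")

def parse(bytecode):
    """Turn 'count,code jt jf k,...' into a list of (code, jt, jf, k) tuples."""
    count, *rest = [part for part in bytecode.split(",") if part]
    program = [tuple(int(field) for field in ins.split()) for ins in rest]
    if int(count) != len(program):
        raise ValueError("instruction count does not match")
    return program

def run(program, packet):
    """Interpret the program against a packet that starts at the IP header."""
    a = x = pc = 0
    while True:
        code, jt, jf, k = program[pc]
        pc += 1
        try:
            if code == 0x00:        # ld #k
                a = k
            elif code == 0xB1:      # ldx 4*([k]&0xf)
                x = 4 * (packet[k] & 0x0F)
            elif code == 0x0C:      # add x
                a = (a + x) & 0xFFFFFFFF
            elif code == 0x07:      # tax
                x = a
            elif code == 0x40:      # ld [x + k]
                (a,) = struct.unpack_from("!I", packet, x + k)
            elif code == 0x50:      # ldb [x + k]
                a = packet[x + k]
            elif code == 0x15:      # jeq #k (bpf_asm emits jneq as jeq with swapped targets)
                pc += jt if a == k else jf
            elif code == 0x06:      # ret #k
                return k
            else:
                raise ValueError(f"opcode {code} not handled by this sketch")
        except (IndexError, struct.error):
            return 0                # a load outside the packet means no match

def dns_query_packet(name):
    """Zeroed IPv4 (20 bytes), UDP (8) and DNS (12) headers plus one question."""
    qname = b"".join(bytes([len(label)]) + label.encode("ascii")
                     for label in name.split(".")) + b"\x00"
    return bytes([0x45]) + bytes(19) + bytes(8) + bytes(12) + qname + b"\x00\x01\x00\x01"

program = parse(BYTECODE)
for name in ("example.com", "www.example.com", "EXAMPLE.COM"):
    print(name, run(program, dns_query_packet(name)))
```

Only the exact name example.com matches. A query for www.example.com does not, because on the wire its name starts with the label \x03www, and the upper-case spelling does not match either.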

Finally you can apply the rule like so:

iptables -A INPUT \
    -p udp --dport 53 \
    -m bpf --bytecode "14,0 0 0 20,177 0 0 0,12 0 0 0,7 0 0 0,64 0 0 0,21 0 7 124090465,64 0 0 4,21 0 5 1836084325,64 0 0 8,21 0 3 56848237,80 0 0 12,21 0 1 0,6 0 0 1,6 0 0 0," \
    -j DROP

This is a fairly simple rule just looking for particular bytes in the packet. The same could be achieved using "u32" or "string" modules. But "xt_bpf" gives us more flexibility. For example we can make the rule case insensitive:

...
lb_0:
    ; Match: 076578616d706c6503636f6d00 '\x07example\x03com\x00'
    ld [x + 0]
    or #0x00202020
    jneq #0x07657861, lb_1
    ld [x + 4]
    or #0x20202020
    jneq #0x6d706c65, lb_1
    ld [x + 8]
    or #0x00202020
    jneq #0x03636f6d, lb_1
    ldb [x + 12]
    jneq #0x00, lb_1
    ret #1
...

Or match all the subdomains of "example.com":

...
lb_0:
    ; Match: *
    ldb [x + 0]
    add x
    add #1
    tax
    ; Match: 076578616d706c6503636f6d00 '\x07example\x03com\x00'
    ld [x + 0]
    jneq #0x07657861, lb_1
    ld [x + 4]
    jneq #0x6d706c65, lb_1
    ld [x + 8]
    jneq #0x03636f6d, lb_1
    ldb [x + 12]
    jneq #0x00, lb_1
    ret #1
...
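Writing these match blocks by hand gets tedious, so they are a natural candidate for generation. The Python sketch below is not our production generator; it is a minimal illustration with invented names (qname_with_mask, match_lines) that encodes a domain name in DNS wire format and emits the load and compare lines in the style shown above. The examples above only use ld and ldb; the use of ldh for a two-byte tail is an addition for names of other lengths.

```python
LOADS = {4: "ld", 2: "ldh", 1: "ldb"}

def qname_with_mask(name):
    """Return the DNS wire encoding of `name` and a per-byte case-folding mask."""
    wire, mask = bytearray(), bytearray()
    for label in name.lower().rstrip(".").split("."):
        raw = label.encode("ascii")
        wire += bytes([len(raw)]) + raw
        # Length bytes must not be altered; letters get bit 0x20 forced on.
        mask += b"\x00" + bytes(0x20 if chr(c).isalpha() else 0 for c in raw)
    return bytes(wire) + b"\x00", bytes(mask) + b"\x00"

def match_lines(name, ignore_case=False, fail_label="lb_1"):
    """Emit BPF assembly lines that compare the bytes at [x + 0...] with `name`."""
    wire, mask = qname_with_mask(name)
    lines = [f"    ; Match: {wire.hex()} {repr(wire)[1:]}"]
    offset = 0
    while offset < len(wire):
        # Use the widest load that still fits in the remaining bytes.
        size = next(s for s in (4, 2, 1) if offset + s <= len(wire))
        value = int.from_bytes(wire[offset:offset + size], "big")
        fold = int.from_bytes(mask[offset:offset + size], "big")
        lines.append(f"    {LOADS[size]} [x + {offset}]")
        if ignore_case and fold:
            lines.append(f"    or #0x{fold:0{2 * size}x}")
        lines.append(f"    jneq #0x{value:0{2 * size}x}, {fail_label}")
        offset += size
    lines.append("    ret #1")
    return lines

print("\n".join(match_lines("example.com", ignore_case=True)))
```

The mask works because ASCII upper and lower case letters differ only in bit 0x20: forcing that bit on and comparing against the lower-case constant accepts either spelling.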

These kinds of rules are very useful: they allow us to pinpoint the malicious traffic and drop it early. Just in the last couple of weeks we dropped 870,213,889,941 packets with a few BPF rules. Recently during a flood we saw 41 billion packets dropped throughout a night due to a single well placed rule.

Summary

Just as intended by Steven McCanne and Van Jacobson, the BPF is still very useful and extremely fast. Even without enabling the BPF JIT we don't see any performance hit of applying complex BPF rules.

I'm sure we'll use more BPF filters in the future to shield ourselves from malicious traffic and to have more CPU to deal with legitimate requests.

Does generating BPF assembly sound like fun? We're hiring talented developers, including for our elite office in London.

↧

CloudFlare Meetups: Set your mind on fire.

May 27, 2014, 12:59 pm

Education, expertise, and community: these themes define Meetups at CloudFlare. Meetups in our office bring together industry leaders, academics, and field experts to examine topics ranging from the Go programming language, to databases, to cryptography, and more.

We’re creating a space for people interested in learning about, and hashing-out, specialized topics together; check out CloudFlare’s upcoming Meetups here.

Our latest series on cryptography covers a broad range of topics including elliptic curves, secure messaging, password-authenticated encryption, lattice-based cryptography, new block chain modes, fully homomorphic cryptography, and memory-hard hashing algorithms.

The next crypto Meetup is on July 16th! We’re still looking for presenters: if interested, email us at cloudflare-team@cloudflare.com.

Join us sometime!

Need an appetizer?

Here’s a taste of our May 8th Meetup:

CloudFlare welcomed Brian Warner, a security engineer from Mozilla, to talk about cryptography. Brian’s work focuses on the security of the Mozilla ecosystem including Firefox Sync and Firefox accounts. During his session, he described the changes Firefox has made in the last few years, three different protocols they’ve used, as well as problems Mozilla faced and the compromises made between usability and security.

Check out Brian’s talk here.

CloudFlare was also excited to have cryptographer and software engineer Michael Hamburg, PhD, from Cryptography Research speaking on elliptic curves. Michael wrote his dissertation under the supervision of Dan Boneh at Stanford. His research indicates that elliptic curves have been the "next big thing" in cryptography for a while; however, they’re tricky to implement securely. Michael explains how Montgomery and Edwards curves give faster and simpler implementations. Now that Dan Bernstein's Curve25519 (aka Ed25519) has caught on, there is increasing interest in a stronger curve that doesn’t sacrifice as much speed or simplicity. In this talk, he discusses his research findings, specifically the "Ed448-Goldilocks" curve.

View Michael’s presentation here.

There's been a surge of interest in end-to-end security for applications like chat, text messaging, and email; in light of this, CloudFlare’s latest Meetup group included independent cryptographic consultant Trevor Perrin. Trevor’s work focuses on designing and reviewing cryptographic systems, making him well placed to inform our community. Moving away from deploying existing protocols like OTR, PGP, and S/MIME, Trevor discusses a number of projects working on "next-generation" protocols to improve usability, security, and protection for new forms of communication.

Hear Trevor’s thoughts on protocol designs using TextSecure, Pond, and more—here.

Remember: The next crypto meetup is on July 16th! If you are interested in presenting, please email us at cloudflare-team@cloudflare.com.

↧

Welcome to Miami: HostingCon 2014

May 28, 2014, 3:15 pm

This year’s HostingCon will be held in Miami Beach, and the CloudFlare team is busy prepping. This is our fourth year at the show and our team is excited to see partners, customers and friends.

You bring the sunscreen, we’ll supply:

Complimentary limousine transfers from Miami International Airport to Loews Miami Beach Hotel on Sunday, June 15. Reserve your spot today.

CloudFlare t-shirts

Live music during breakfast each morning to start your day off right

Our signature Nerf Railguns (quantity is limited: be sure to visit us early at booth #407!)

As it happens, this will be my first HostingCon: I've just joined CloudFlare as a Partner Account Manager. Stop by to say hi to me and the team and learn about our new features and products.

If you’re interested in becoming a partner, please drop by the CloudFlare Booth #407 to learn how CloudFlare can reduce your server load; improve the performance of your network; block spammers, botnets, and other web threats; and provide DDoS protection.

Here’s where the CloudFlare Team will be during the show:

Sunday, June 15

Limo transfers from Miami International Airport to Loews Miami Beach Hotel. Registration is open, reserve your spot now.

Monday, June 16

7:45am-8:45am: Breakfast, sponsored by CloudFlare - Level 2 of the convention center (there will be signs)

5:30pm-8:30pm: Welcome Reception

Tuesday, June 17

7:45am-8:45am: Breakfast, sponsored by CloudFlare - Level 2 of the convention center

12:30pm-6:30pm: Exhibit hall is open. CloudFlare is booth #407

4:00pm-6:30pm: Happy hour! Visit our booth while you sip on your preferred beverage

Wednesday, June 18

7:45am-8:45am: Breakfast, sponsored by CloudFlare - Level 2 of the convention center

12:30pm-4:00pm: Exhibit hall is open. CloudFlare is booth #407

We’ll see you in Miami!

↧

CloudFlare is PCI Certified

June 4, 2014, 10:00 am

Great news for everyone using CloudFlare on an e-commerce site, or a site accepting or processing credit card transactions.

After undergoing a Payment Card Industry (PCI) Data Security Standard (DSS) 2.0 security control assessment, we’ve been certified as a Level 1 service provider. Achieving Level 1 status requires an assessment of our security controls by an independent third party—a Qualified Security Assessor (QSA).

Additionally, CloudFlare’s Web Application Firewall (WAF) helps companies meet PCI requirement 6.6. Our WAF not only helps protect our customers from application layer attacks, but also secures the data of online consumers making purchases on sites within the CloudFlare network.

What’s even better is that we’ve achieved Level 1 PCI compliance while still allowing for expansion of our global data center network. Over the coming weeks, we plan to turn on four new data centers in Madrid, Spain; Milan, Italy; Medellín, Colombia; and São Paulo, Brazil.

Stay tuned for updates on these new locations!

Have questions about CloudFlare’s PCI status? Check out this FAQ section.

↧

Reset the Net with One Click

June 4, 2014, 6:33 pm

Reset the Net has organized a day of awareness on Thursday, June 5th, 2014 about protecting yourself against mass surveillance, and CloudFlare is offering an easy way for its customers to participate.

CloudFlare believes that one of the best ways to improve the net is to increase the amount of encrypted traffic, because encrypting your traffic keeps it safe from prying eyes. Customers on any paid CloudFlare plan can choose from various SSL options to keep their website secure.

Keep in mind that SSL isn’t just for big banks and large corporations: it also helps bloggers protect their readers and keeps your e-mails secure. By default, everything sent over HTTP is open for anyone to look at; using HTTPS (SSL/TLS) ensures that only authorized viewers can see what’s happening on your website.

In line with our mission to help build a better internet, we’re announcing a new app that lets you easily add the Reset the Net splash screen to your website.

The splash screen provides your visitors with a link to join the cause, and information on how to secure your communications against mass surveillance. It also links to resetthenet.org, allowing your visitors to learn how they can protect their computers and Internet-connected devices from mass surveillance. Visitors can also join the Reset the Net effort by pledging to implement suggested protections.

Reset the Net Splash Screen

Organizations such as the Electronic Frontier Foundation (EFF), the American Civil Liberties Union (ACLU), Amnesty International, and others have joined this effort to express their concern about the impact that mass surveillance has on the freedom and openness of the internet. For more information, please visit resetthenet.org.

Installing the Reset the Net App

You can add the Reset the Net app to your site starting today, and it’s free. The splash screen will automatically appear on June 5, 2014, and, like all CloudFlare Apps, installation is just one click with no coding necessary. To get the banner app, please visit our Apps marketplace (https://www.cloudflare.com/apps/reset_the_net). (The Reset the Net splash screen will appear only on June 5th.)

Above is a screenshot of what the splash screen looks like on an example website. The screen appears in front of your regular website content, and a visitor can dismiss it permanently by clicking “Close” in the upper right hand corner.

↧

New OpenSSL vulnerabilities: CloudFlare systems patched

June 5, 2014, 6:00 am

The OpenSSL team announced seven vulnerabilities covering OpenSSL 0.9.8, 1.0.0, 1.0.1 and 1.0.2 (i.e. all versions) earlier today.

The most serious of these is a potential man-in-the-middle attack, CVE-2014-0224, which is being referred to as CCS Injection. Both Google's Adam Langley and the original reporter of the problem have write-ups that give more technical detail.

We have applied the required patch to all CloudFlare servers and customers are protected against CVE-2014-0224 and all the other vulnerabilities announced today.

Everyone who uses OpenSSL in their software or on their server should upgrade as soon as possible; the OpenSSL team has released new versions today.

↧

Three years after World IPv6 Day

June 5, 2014, 4:00 pm

It’s been over 1,000 days since the Internet Society’s World IPv6 Day and a lot of positive things have happened in the world of IPv6 content delivery. At CloudFlare we have not been sitting still either. Providing IPv6 access is core to our services and each and every customer has full IPv4 and IPv6 services available when their website is CloudFlare enabled.

Three months ago CloudFlare made IPv6 enabled by default for all new accounts and for existing accounts that had simply kept the default settings. This resulted in an uptick in the IPv6 traffic stats from our network and was noticed globally. Our support issues have been near zero, which makes sense as the previous three years have been a fantastic proving ground for IPv6 networking.

IPv6 is here today and it works.

But it's still not widely used. It's time for that to change.

20% of the IPv6 web uses CloudFlare IPv6

Looking at the top one million web sites tracked by Alexa we see that around 7% are accessible using IPv6. Of that 7% a full 20% are accessible using IPv6 because they are CloudFlare customers with IPv6 enabled.

We hope and expect that both those percentages will grow over time.

The web logfile problem

However, some complex sites that use the origin server IP address within their codebase have had issues enabling IPv6.

Many sites log the IP address of the visitor when a web page is accessed. This has caused some interesting headaches when migrating to IPv6. Why? Because when software developers hardcode a 4-byte IPv4 address into a database they will have a near-impossible time dealing with a 16-byte IPv6 address. Take a look at this IETF-style graphic showing the overall size of the two addressing schemes.

IPv4 (32 bits)
+----+----+
|####|####|
+----+----+

IPv6 (128 bits)
+----+----+----+----+----+----+----+----+
|####|####|####|####|####|####|####|####|
+----+----+----+----+----+----+----+----+

When a developer uses a string (or character) area to store the IP address there’s a string length issue.

In IPv4, the maximum length could be represented by the IP address 255.255.255.255 (which is 15 bytes long). For IPv6 the address could be FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF (which is 39 bytes long). Some programming languages and databases can be string length agnostic; however most aren’t.
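The sizes quoted above are easy to confirm with Python's standard ipaddress module. The helper name sizes is just for this example.

```python
import ipaddress

def sizes(text):
    """Return (bytes when stored in binary, characters when stored as text)."""
    address = ipaddress.ip_address(text)
    return len(address.packed), len(str(address))

for text in ("255.255.255.255", "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"):
    packed, printed = sizes(text)
    print(f"{text}: {packed} bytes packed, {printed} characters as text")
```

A column or buffer sized for 4 binary bytes or 15 characters of text has no room for the 16 bytes or 39 characters that IPv6 needs.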

Pseudo IPv4 to the (Temporary) Rescue

As a stopgap measure CloudFlare has introduced a pseudo IPv4 address that is mapped from the real IPv6 address and can be used in place of the IPv6. It's not a perfect solution and software and databases are going to have to be upgraded to support IPv6, but in the meantime this pseudo IPv4 address can be used where an IPv4 format address is expected.

Going forward

This workaround is now available for every CloudFlare customer, and we promise to persuade as many as possible of our users who have chosen to disable IPv6 to enable it fully. Those few remaining sites will want to be part of the new Internet.

Happy third anniversary World IPv6 Day!

↧

Eliminating the last reasons to not enable IPv6

June 5, 2014, 4:00 pm

Today is June 6. For the last two years, the date has been celebrated as World IPv6 Day. CloudFlare has offered full IPv6 support as well as our IPv6-to-IPv4 gateway since 2012. In preparation for this year's IPv6 Day, we scanned the world's largest websites in order to figure out how many are available over IPv6. There's good news and bad news.

The good news is that our IPv6 gateway is being widely used in order to enable the IPv6 Web. In fact, of the sites that support IPv6, more than 20% of them do so via CloudFlare. The bad news is that, while CloudFlare, Google, Facebook and other big guys have shown IPv6 can be adopted without a performance penalty, still only 7% of the world's largest websites are available over IPv6. And, disappointingly, in spite of CloudFlare offering it for free, only about half of our customers have turned on our IPv6 gateway.

The silver lining is that, if we can just get all our current customers to enable our IPv6 gateway, we'd nearly reach the milestone of 10% of the world's largest sites being available over IPv6. With that goal in mind, we set out to find and solve the last stumbling blocks for our customers enabling IPv6.

Y U No IPv6?

CloudFlare makes supporting IPv6 on a website ridiculously easy. If your backend supports IPv6 then visitors arriving on an IPv6 connection will be transported via the protocol end-to-end. If your backend only supports IPv4, CloudFlare will accept a visitor over IPv6 and then seamlessly make a request to your server over IPv4. We've been defaulting the gateway on for new customers for several months and any existing customer can enable it with a single click. So, if we've made IPv6 so easy, what's stopping those sites that haven't yet enabled it?

The answer is usually legacy software that assumes an IPv4 world. Often this is software that handles sessions or stops fraud or abuse. For example, the popular, if sometimes controversial, website 4chan is a CloudFlare customer. 4chan has a notoriously mischievous audience. Sometimes, users will try to spam the service. To keep spammers at bay, 4chan uses algorithms which look for suspicious behavior. Unfortunately, these algorithms use a visitor's IP address as one of their parameters and they assume the IP will be in the IPv4 format.

The long term solution, of course, is for platforms like 4chan to rewrite their software to accommodate IPv6. That takes time. We wanted to provide a stopgap solution that would allow CloudFlare customers to enable IPv6 quickly while they worked to upgrade their software to fully support it.

Introducing Pseudo IPv4

To accommodate software that assumes an IPv4 world, today we're enabling a new CloudFlare option: Pseudo IPv4. The option will, whenever a connection is established over IPv6, add an HTTP header to requests with a "pseudo" IPv4 address. We are using Class E IP space (240.0.0.0 - 255.255.255.255) for these addresses. Class E address space is reserved as experimental and no actual traffic should originate from it. If, at some point, some or all of Class E space is designated for a specific use then we'll adjust the schema for the Pseudo IPv4 header.

Class E space gives us 268,435,456 possible unique IPv4 addresses. That's much smaller than the 340 undecillion possible IPv6 addresses, but it's larger than the number of IPs seen by all but the very largest websites. The Pseudo IPv4 service uses a hashing algorithm applied to the top 64 bits of the connecting IPv6 address in order to map the visitor to one of the Class E IPs. Because the hashing algorithm will always produce the same output for the same input, the same IPv6 address will always result in the same Pseudo IPv4 address.

There are two options for the Pseudo IPv4 service: 1) you can have us automatically add a header (Cf-Pseudo-IPv4), which can then be parsed by software as needed; or 2) you can have us overwrite the existing Cf-Connecting-IP and X-Forwarded-For headers with a Pseudo IPv4 address. The advantage of the overwrite option is that, in most cases, it won't require any software changes. If you choose the overwrite option, we'll append a new header (Cf-Connecting-IPv6) in order to ensure you can still find the actual connecting IP address for debugging.
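
Origin software could tell the two modes apart along these lines. This is a minimal Python sketch, not CloudFlare code: the header names are the ones described above, while the `visitor_addresses` and `is_pseudo` helpers and the lower-cased header dictionary are assumptions made for illustration.

```python
import ipaddress

# Class E space, the range the Pseudo IPv4 addresses are drawn from.
CLASS_E = ipaddress.ip_network("240.0.0.0/4")


def visitor_addresses(headers):
    """Return (address_for_legacy_code, real_ipv6_or_None).

    `headers` is assumed to be a dict of request headers whose names
    have already been lower-cased by the web framework (hypothetical).
    """
    pseudo = headers.get("cf-pseudo-ipv4")        # option 1: extra header
    connecting = headers.get("cf-connecting-ip")  # existing header
    real_v6 = headers.get("cf-connecting-ipv6")   # option 2: overwrite mode

    if real_v6 is not None:
        # Overwrite mode: Cf-Connecting-IP already holds the pseudo address.
        return connecting, real_v6
    if pseudo is not None:
        # Add-header mode: Cf-Connecting-IP still holds the real IPv6 address.
        return pseudo, connecting
    # Plain IPv4 visitor: nothing to translate.
    return connecting, None


def is_pseudo(address):
    """True when an IPv4 address string falls inside Class E space."""
    return ipaddress.ip_address(address) in CLASS_E
```

With the overwrite option, legacy code that only reads Cf-Connecting-IP needs no branch at all; a helper like this is only useful for code that also wants the real address for logging or debugging.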

The Pseudo IPv4 service, like our IPv6 gateway, is available to all our customers, even those on the free plan. Our hope is that it will eliminate one of the last reasons for IPv6 holdouts. If you're already a CloudFlare customer, log in to your account and make sure IPv6 is enabled on the CloudFlare Settings page. You can find the toggle for Pseudo IPv4 there as well. If you're not yet a CloudFlare customer, it only takes five minutes to sign up, so for most of the world there's still time to get your site on our network and join the modern web before World IPv6 Day 2014 comes to an end.

And, by the way, as of today, 4chan is now available over IPv6.

Addendum: Nitty Gritty Details

For the technical in the audience, here are some of the nitty gritty details on how we implemented Pseudo IPv4.

We chose to use the MD5 hashing algorithm. While MD5 has risks when it's used in cryptography, it is faster than many alternatives and has acceptable uniformity over the hash space. Note that there is some risk of a collision (i.e., where two IPv6 addresses map to the same IPv4 pseudo address); however, most abuse/fraud systems take this into account already in order to deal with NATs and other instances where multiple people may share a single IP.
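
To get a feel for how often collisions might occur, the usual birthday-bound approximation can be applied to the 2^28 addresses in Class E space. The Python sketch below is a rough estimate that assumes the hash output is perfectly uniform; the function names are illustrative and not part of any CloudFlare tooling.

```python
import math

CLASS_E_SIZE = 2 ** 28  # 240.0.0.0/4 holds 268,435,456 addresses


def collision_probability(visitors, space=CLASS_E_SIZE):
    """Approximate chance that at least two of `visitors` distinct inputs
    share a pseudo address, assuming the hash is uniform over `space`."""
    return 1.0 - math.exp(-visitors * (visitors - 1) / (2.0 * space))


def visitors_for_even_odds(space=CLASS_E_SIZE):
    """Approximate number of distinct inputs at which a collision becomes
    as likely as not (the standard birthday-bound estimate)."""
    return math.sqrt(2.0 * space * math.log(2.0))


print(round(visitors_for_even_odds()))
print(round(collision_probability(10000), 3))
```

Under that assumption, a site seeing a few tens of thousands of distinct IPv6 prefixes should expect at least one pair to share a pseudo address, which is why it matters that abuse systems already tolerate shared IPs.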

At CloudFlare, we use a modified version of NGINX and wrote most of our request processing code in Lua, relying on the ngx_openresty app server. Here's the Lua code for the proof-of-concept prototype of the hashing algorithm:

function pseudo_ipv4(ipv6_top64)
   -- Grab bottom 32 bits from MD5 hash
   -- ngx.md5 does not suppress leading zeros so regexp will always match.       
   local hash = ngx.md5(ipv6_top64)
   local mod = ngx.re.match(hash,
                  "([a-f0-9]{2})([a-f0-9]{2})([a-f0-9]{2})([a-f0-9]{2})$",
                  "joi")

   -- Normalize first byte to fit in class E space. Done using modulo as
   -- Lua doesn't have built-in bitwise operators
   local b1 = tonumber(mod[1], 16) % 0x10

   return string.format("%d.%d.%d.%d",
             b1 + 0xF0, tonumber(mod[2], 16), tonumber(mod[3], 16),
             tonumber(mod[4], 16))
end

Since every request needs to pass through the hashing algorithm, we wanted to make it as fast as possible. We set to work optimizing the Lua prototype for speed. Here's the result:

function pseudo_ipv4(ipv6_top64)
   -- Grab bottom 32 bits from MD5 hash
   local md5 = ngx.md5_bin(ipv6_top64)
   local b1, b2, b3, b4 = md5:byte(13, 16)

   -- Normalize first byte to fit in class E space
   b1 = bit.bor(0xF0, bit.band(b1, 0x0F))

   return string.format('%d.%d.%d.%d', b1, b2, b3, b4)
end
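
For readers who want to experiment outside NGINX, here is a rough Python transcription of the same idea. It is a sketch under one loud assumption: the post does not say how the `ipv6_top64` argument is encoded, so this version hashes the top 64 bits as 8 raw bytes and may not reproduce the addresses CloudFlare actually emits.

```python
import hashlib
import ipaddress


def pseudo_ipv4(ipv6_address):
    """Map an IPv6 address to a Class E pseudo IPv4 address.

    Assumption: the top 64 bits are hashed as 8 raw bytes; the real
    encoding of the Lua function's `ipv6_top64` argument is not stated.
    """
    packed = ipaddress.IPv6Address(ipv6_address).packed  # 16 bytes
    digest = hashlib.md5(packed[:8]).digest()            # 16-byte MD5

    # Last four digest bytes, matching md5:byte(13, 16) in the Lua version.
    b1, b2, b3, b4 = digest[12:16]

    # Force the top nibble to 0xF so the address lands in 240.0.0.0/4.
    b1 = 0xF0 | (b1 & 0x0F)
    return "%d.%d.%d.%d" % (b1, b2, b3, b4)


print(pseudo_ipv4("2001:db8::1"))
```

Because only the top 64 bits are hashed, two addresses in the same /64 (for example `2001:db8::1` and `2001:db8::ffff`) map to the same pseudo address.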

PS - If writing Lua code like the above looks fun, we're always hiring.

↧

Protecting Free Expression Online

June 12, 2014, 3:00 am

Over the last few years, we’ve witnessed a troubling trend: an increasing number of politically or artistically important sites targeted by very large denial of service attacks. Often these attacks appear politically motivated — going after, for instance, citizen journalists reporting on government corruption. The promise of the Internet is that it is a great leveler — that anyone with an idea can reach a global audience. These attacks threaten that promise.

CloudFlare’s mission is to build a better Internet. Fundamental to that is ensuring that bullies cannot use attacks to censor content simply because they disagree with it. We knew we needed to do something to stop this troubling trend. To that end, today we’re announcing Project Galileo.

Project Galileo’s goal is to protect free expression online. Sites can participate in Project Galileo if they meet the following criteria:

They are engaged in news gathering, civil society, or political/artistic speech;
They are subject to online attacks related to their news gathering, civil society, or political/artistic speech;
They are not-for-profit organizations or small commercial entities; and
They act in the public interest, broadly defined.

For sites that meet these criteria, CloudFlare will extend its full, enterprise-class DDoS attack protection at no cost.

Working with Civil Society to Remain Neutral

As we invite sites into Project Galileo, it is important that we remain content neutral. We believe that CloudFlare should never decide, based on their content, what sites deserve protection. We’re very good at technology, but deciding what content is politically or artistically important is above our pay grade. As such, we’ve partnered with a number of civil society organizations to identify at-risk sites that qualify for Project Galileo. Our launch referring partners include:

These organizations now have access to a sort of “bat phone.” If they are aware of a current CloudFlare customer that qualifies for Project Galileo, or a qualified site that is under attack and needs CloudFlare’s services, they have access to a hotline to enable our full DDoS attack protection.

If you have a site you believe qualifies for Project Galileo, you can contact one of our partner organizations to request to be included. If you don’t already have a relationship with one of our partner organizations, contact us and we’ll steer you in the right direction. Over time, we will continue to expand Project Galileo’s list of partner organizations. It’s important to us that the organizations that can refer sites span the political and artistic spectrum. If you’re an organization that would like to be listed as a partner, you can contact us using the form on the Project Galileo website.

Ensuring Freedom of Expression

For the last several weeks we’ve been quietly enrolling sites in Project Galileo. While, for obvious reasons, we will never publish a list of the sites we’re shielding, we’re proud of some of the voices of free expression we’ve helped protect. They include organizations advocating for LGBT rights in the Middle East, tracking political corruption in Sri Lanka, monitoring deforestation in Malaysia, exposing bribery across Africa, and reporting on the civil war in Syria. CloudFlare is helping ensure these voices will never be silenced.

When Michelle, Lee and I started CloudFlare we said that our goal was to bring the resources that were previously available only to the Internet’s giants to everyone online. I’m proud that today we took another step toward that goal.

Learn more: www.projectgalileo.org

↧

Naming Project Galileo

June 12, 2014, 3:30 pm

What’s in a Name

Earlier today, CloudFlare announced Project Galileo to protect free speech on the Web by using its sophisticated anti-DDoS resources. Seventeen (at last count) free speech, public interest, and civil society organizations are helping us identify at-risk, in-need websites for the Project. If one of these websites comes under attack, CloudFlare will make sure that the website stays online.

You can read more about the story in the press in: ArsTechnica; Re/Code; Slate; TechCrunch; and The Verge.

Since we’ve launched, we keep getting asked why we called it “Project Galileo.”

Subversive Moons

In 1610, Galileo Galilei fashioned a homemade telescope and pointed it towards the heavens. He saw sights never witnessed before by human eyes: moons orbiting Jupiter, rings around Saturn, sunspots, craters on the moon, and phases of Venus.

These observations gave evidence for a dangerous truth—we are not the center of the universe, and the Earth revolves around the sun. Galileo published his discoveries in a book modestly entitled Dialogue Concerning the Two Chief World Systems. As reward for his discoveries, Galileo was labeled a heretic and his book was banned until 1718. The Earth was to revolve around the sun 107 more times before Galileo’s discoveries would reach a wider audience.

Like Galileo, websites espousing politically sensitive—even heretical—speech are often victims of suppression. Like Galileo, most of these sites don’t have the resources to protect their discoveries from being suppressed.

“And yet it moves...”

How would history be different if Galileo’s book had been able to stay “online”? Would we have reached the moon in July 1861, not 1969?

If you would like to help Project Galileo as a public interest organization, identifying those websites most in need, please visit: www.projectgalileo.org. If you would like to be a participating website, we suggest you contact one of our partner organizations so they can recommend you.

↧

New .uk domains now supported

June 13, 2014, 7:40 am

On the 10th of June, the UK domain registry Nominet made the second-level domain namespace available to the public, allowing anyone to register a domain ending with simply .uk. This is something the members of the UK CloudFlare team have been looking forward to since the initial announcement at the beginning of this year, and we are happy to announce that CloudFlare now has full support for .uk domains.

Previously, only third-level domains were available to the public, in the form .co.uk and .org.uk. This is in contrast to other countries such as France and Germany, where it has always been possible to register second-level .fr and .de domains.

According to Nominet, 50,000 .uk domains were registered within the first 24 hours of availability. Owners of a .co.uk address are given the option to also register the shorter .uk version of their domain for the next 5 years, allowing businesses and personal users to register the shorter version without fear that the domain would be taken by someone else. The third-level UK domains can be used alongside .uk, and will continue to work as before.

As of now, CloudFlare users can set up .uk domain names with CloudFlare in the usual way through their accounts, from within the My websites section.

↧

CloudFlare's 25th data center turns up to ease collective disappointment of Spanish fútbol fans

June 13, 2014, 4:30 pm

After a drubbing earlier today of La Roja by The Flying Dutchmen, we felt obliged to deliver at least one piece of good news to the 34 million fútbol-loving Internet users in Spain: CloudFlare's 25th data center in Madrid, Spain.

"We figured the only thing that could help the Spanish get over such a stinging defeat was a faster Internet," explained Matthew Prince, Co-founder and CEO of CloudFlare. "We're happy to do our small part in easing the malaise."

With the World Cup now in full swing, it is with fury (or as we refer to it: La Furia Naranja) that we commence our latest round of data center additions in celebration of many of the great fútbol nations. By the time we finish we will have launched four new data centers serving over 200 million fútbol-loving Internet users.

En Nuestro Corazón, La Pasión de un Campeón

Despite the loss, it isn't over for the reigning world (and European) champions. Anyone that follows the Spanish national team knows its tiki-taka style of play characterized by short passing, patience and possession above all else. At CloudFlare, we take the tiki-taka strategy to heart with unrelenting focus on short (network) hops, strong backbone and availability above all else.

With Madrid now live, internet traffic previously routed through Paris and London will stay much closer to Plaza Mayor. For those counting, Madrid brings us to 9 data centers in Europe including (in order of launch) Amsterdam, Paris, Frankfurt, London, Vienna, Prague, Stockholm and Warsaw.

Ready for some more jogo bonito? In homage to our favorite Spanish pop sensation, David Bisbal, start waving your flag...

Traducción Español

Tras de la paliza histórica de “La Roja” por parte de “La Naranja Mecánica”, nos sentimos obligados a ofrecer al menos una buena noticia para los 34 millones de usuarios de Internet en España: Presentamos el centro de datos #25 de CloudFlare en Madrid, España.

"Nos dimos cuenta de que lo único que podría ayudar a los españoles olvidar semejante derrota era un Internet más rápido", explicó Matthew Prince, Co-fundador y CEO de CloudFlare. "Estamos contentos de hacer nuestra pequeña parte para aliviar este malestar."

Con la Copa del Mundo en pleno apogeo, es con furia (o como lo llamamos: La Furia Naranja) que comenzamos nuestra mas reciente ronda de expansión en celebración de las grandes naciones Futbolísticas. Al finalizar esta ronda habremos puesto en marcha cuatro nuevos centros de datos que sirven a más de 200 milliones de usuarios de Internet(y fanáticos del Futbol).

En Nuestro Corazón, La Pasión de un Campeón

Pese a esta derrota, no todo ha terminado para el actual campeón del mundo (y de Europa-UEFA). Cualquier persona que sigue a la selección española conoce su estilo tiki-taka caracterizado por pases cortos, paciencia y por encima de todo tiempo de posesión. En CloudFlare, también adoptamos la estrategia tiki-taka con un enfoque implacable en reducir saltos en la conexiones, fortalecer el backbone de la red y alta disponibilidad por encima de todo.

Con Madrid activo en nuestra red, el tráfico de Internet que previamente enrutaba hacia París y Londres se quedará mucho más cerca a la Plaza Mayor. Para aquellos que estén contando, con Madrid sumamos a 9 centros de datos en Europa, incluyendo (en orden de lanzamiento) de Ámsterdam, París, Frankfurt, Londres, Viena, Praga, Estocolmo and Varsovia.

Y no termina aquí, muy pronto tendremos mas jogo bonito. En homenaje a nuestro ídolo pop favorito, David Bisbal, comiencen a Agitar las Banderas.

↧

Buongiorno, Milano! CloudFlare’s 26th Data Center Now Live

June 17, 2014, 12:00 am

Team Italy exiting the data center after racking and cabling our servers (in style)

Italy is known for many great things. First and foremost, the Azzurri—2006 World Cup champions and (as of yesterday) dashers of England's world cup hopes—but also the Colosseum, the fountains of Florence, the canals of Venice, the Alps, Lambos, pizza, pasta, the good life in Tuscany, Michelangelo, da Vinci, the leaning tower of Pisa, opera, fashion and, of course, colorful gestures. And not forgetting some of the world's largest companies and brands like Luxottica Group, Finmeccanica, Gucci, Prada, Dolce & Gabbana, Bulgari, Barilla, Chicco, Lamborghini, Pirelli and Ferrero.

That's "No more!", eh?

It is with great pride, and song, that we add CloudFlare's latest data center to the list.

(to the tune of “That’s Amore” by Dean Martin)

When you've had quite enough of this latency stuff,
That’s “No more!”, eh?
When it seems deja vu, DoS is comin' for you
That’s “No more!”, eh?
Alarm bells will ring ting-a-ling-a-ling, ting-a-ling-a-ling
And you’ll sleep like a baby...
First things first, the origin is cursed, fear the worst
Only to realize you have CloudFlare...

Fit for a President

La Presidenza della Repubblica that is. With the inauguration of our latest data center in Milan we are proud to welcome the President of Italy as our newest customer. The Italian Presidency joins over 40 other national governments that use CloudFlare to keep their online presences fast and safe. Click here to learn more about CloudFlare services for governments and enterprises.

E adesso?

So, where to from here? The answer isn't song-writing, we promise! We'll stick to what we know best. If you've kept pace with our previous data center launch in Madrid, you already know that we're launching four new data centers in countries with, collectively, over 200 million internet users. Stay tuned for what's next...

↧

CloudFlare Acquires CryptoSeal

June 18, 2014, 6:00 am

We're excited to announce that CloudFlare has acquired the Trusted Computing and virtual private network (VPN) as a service company CryptoSeal. CryptoSeal was founded by Ryan Lackey, a well-known engineer in the security community whom we've admired for some time. The company was funded by Y Combinator and angel investors from the security community.

At CloudFlare, our mission is to build a better Internet. Today, we provide a best-in-class service to ensure that our customers' websites and services are fast, safe, and reliable. We do this through a massive global network that acts as a reverse proxy, filtering traffic from attackers while accelerating traffic for legitimate visitors.

CryptoSeal's service was, effectively, CloudFlare in reverse. Rather than acting as a reverse proxy, the CryptoSeal VPN functions as a forward proxy. Where CloudFlare's customers are websites, CryptoSeal's are web surfers. By joining forces, we believe there are exciting opportunities to help build a better Internet end-to-end.

Beyond the company’s work in VPNs, CryptoSeal applied Trusted Computing technology to commodity servers, protecting them from compromise by outsider attackers or insider subversion, and guaranteeing the integrity of server-side applications to remote end users. This technology has clear application at CloudFlare as we build a secure edge for the Internet.

“I’m really excited to work with CloudFlare to bring secure, easy-to-use networking to a huge number of sites and users,” Ryan told me when we were working on the deal. “CloudFlare really is the perfect company to deploy security technologies like Trusted Computing, robust cryptography, and secure networks to make the Internet a better place.”

We began the process of shutting down CryptoSeal's service last week, shortly after the acquisition was finalized. We've notified current customers and expect the service to be retired completely by June 30, 2014. Ryan Lackey is joining CloudFlare's team as a product manager focused on new security products. In the months to come, expect new product extensions that take advantage of CryptoSeal's technology combined with our rapidly expanding network.

↧

Q&A with Ryan Lackey

June 18, 2014, 6:00 pm

Lackey being hoisted onto Sealand in the North Sea circa 2000

How did you get into computer security?

I started using the Internet when I was young—in the early 1990s, before I was a teenager. I was drawn to security for two main reasons: First, I was interested in how individuals could stand up to large groups, even nation states, using mathematics. Also, learning about computer security meant I was able to subvert security systems, and this gave me access to things I wasn’t supposed to see. I never used my skills to harm anyone, I just thought it was fun to get an account on a supercomputer and things like that.

Do you have any advice for younger people getting into the field?

The best way to learn about computer security, as with most technology, is to get hands-on experience. Once you have some practical experience and have decided that you’re interested, then go back and learn the theory through formal education or certifications. If you use this approach, you have an intuitive understanding of how the different parts of the field are related and it is very powerful.

Probably one of the hardest problems for young people in this field is assessing how much you really know—once you understand the basics, you think you know everything, but there can be huge problems lurking just under the surface. I think it’s important to keep in mind that no matter how much you learn, there’s always something new—it’s a rapidly changing field. This fluidity means it is possible to become a relative expert on very specific things quite early in your career. The most important thing is to learn how to find information, perform experiments, and figure out how things work. Sometimes just being able to identify a problem by name is enough to solve it, since there’s a huge amount of academic, industrial, and hacker literature about most topics.

How do you feel about having your company acquired by CloudFlare?

I’m very happy! My main interest in starting CryptoSeal was to get Trusted Computing technology into commercial use. At CryptoSeal, we were working on using that technology for a general cloud computing solution, key management, and overlay networks. These are all fairly sophisticated, difficult to use applications, and not really directly usable by end users. I wanted to change that.

I think CloudFlare has done an amazing job of bringing high-end anti-DDoS, caching, firewalling, and filtering technology to a huge number of users, and by working with CloudFlare to incorporate Trusted Computing technology, I get to accomplish everything I wanted to do with CryptoSeal.

Also, CloudFlare has a really amazing team—people with cryptographic and protocol expertise, great network engineers, peering specialists, and one of the best support teams in the tech industry—so I’m really excited to be working with them.

What attracted you to CloudFlare in the first place?

I was first attracted to CloudFlare because I was a customer for three years, and I was always impressed with their service. They are a great service for startups, and through my interactions as a customer, I got to know some of the team.

As I looked further into CloudFlare I realized that they are solving some really difficult problems, especially now that they are operating at Internet scale—5% of web requests. Sometimes projects at CloudFlare require actually fixing the underlying infrastructure of the Internet, and the company is willing—and able—to invest the resources to make that happen.

The three founders, Matthew, Lee, and Michelle, are actively involved, and they’ve created CloudFlare to be a flat organization without unneeded bureaucracy and process. As a company, it’s a great place to be -- the hiring bar is really high, so all of your coworkers are brilliant and hard-working. Everyone is focused on doing the right thing for CloudFlare’s users, and for the Internet as a whole. (If you’re interested, we’re hiring. Check out our openings here).

Before CryptoSeal, what are some interesting projects you’ve worked on?

I’ve done a lot of different things. I worked on an early anonymous electronic cash system while living on a Caribbean island in the late 1990s. The electronic cash system didn’t work out, but we ended up building some useful cryptographic tools that we later used in other products.

My neighbors on the island, the guys who ran the “.ai” domain name, introduced me to Sean Hastings. After I left the Caribbean, Sean and I got together to figure out the best place to host content free of government interference, but we couldn’t find a country which would be good enough. At that point, we bought a book: “How to Start Your Own Country”, and soon after that we found this abandoned WW2 anti-aircraft fortress occupied by pirate radio guys called Sealand in the North Sea. It was exciting, but the costs to provide service were really high: diesel fuel, helicopters, boats, etc. So I eventually left and moved back to the US.

For a couple of years, I worked on cryptographic software for payment systems companies and RFID/NFC payments for credit cards. When the Iraq War started, I got in touch with some Iraqi expats who needed help setting up Internet in Iraq after the US intervention. I flew into Iraq on a civilian flight, and spent six months working with them on satellite and wireless networking for a variety of military, government, and commercial customers. Then, as the country got more dangerous, I moved onto a US military base in Iraq and spent the next six years doing defense contracting, primarily building satellite, cellular, and wireless networks for a diverse set of customers in Iraq, Afghanistan, Kuwait, and elsewhere.

After enough close calls with explosions and helicopters, and missing the Bay Area, I moved back to work on a more “conventional” tech startup—CryptoSeal—which was also a great adventure.

What’s different between being at a company versus your own startup?

Getting to focus on the parts of the company I really care about, which are product and technology, and not having to constantly worry about administration, finance, etc. It’s more efficient, less stressful, and produces better results. Since CloudFlare has such a great team, I’m also really enjoying getting to learn from people across the company.

↧

Making code better with reviews

July 2, 2014, 2:14 am

In the past we've written about how CloudFlare isn't afraid to rip out and replace chunks of code that have proved to be hard to maintain or have simply reached end of life. For example, we wrote a brand new DNS server and replaced our old DNS infrastructure with it. Doing so was greatly helped by two things: a large test suite (that keeps growing) and code reviews.

Recently, I was working on that same DNS server when I needed to understand and change the following code:

if raw[2]&0x86 != 0 || raw[3]&0x4f != 0 {
    return false
}

raw is a []byte containing a DNS packet and this code is part of a filter that throws away packets that contain invalid flags. In this case the code is throwing away query packets that contain flags that are only used in responses.

It took me a while to grok which flags 0x86 and 0x4f were masking and having understood them I decided to update the code with the following comment:

// DNS Header looks like this:
//
//                                 1  1  1  1  1  1
//   0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
// +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
// |                     ID                        |
// +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
// |QR|   OPCODE  |AA|TC|RD|RA| Z|AD|CD|   RCODE   |
// +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
//
// So ID is in raw[0] and raw[1], raw[2] contains QR, OPCODE, AA, TC and RD.
// raw[3] contains RA, Z, AD, CD and RCODE.
//
// 0x86 in raw[2] corresponds to masking on the QR, AA and TC
// 0x4f in raw[3] corresponds to masking on the Z and RCODE

if raw[2]&0x86 != 0 || raw[3]&0x4f != 0 {
    return false
}

I figured that I'd save the next person time in dissecting those values since the actual time consuming item was figuring out what they meant; writing a comment dumping my current memory state was quick. And it meant leaving code in better shape than when I first looked at it.

I sent that code (and a bunch of other changes) off for review and my colleague Marek came back with:

> 0x86 in raw[2] corresponds to masking on the QR, AA and TC

Nah, how about making it: flags := raw[2] <<8 | raw[3]
and then checking it against or-ed flags, like (flags & (0x1 | 0x2 | 0x4 ....) 
and letting the compiler to come up with magic 0x86 :)

That made sense to me and I changed the code to:

// DNS Header looks like this:
//
//                                 1  1  1  1  1  1
//   0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
// +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
// |                     ID                        |
// +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
// |QR|   OPCODE  |AA|TC|RD|RA| Z|AD|CD|   RCODE   |
// +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
//
// So ID is in raw[0] and raw[1], raw[2] contains QR, OPCODE, AA, TC and RD.
// raw[3] contains RA, Z, AD, CD and RCODE.

const QR byte = 0x80
const AA byte = 0x04
const TC byte = 0x02

const Z byte = 0x40
const RCODE byte = 0x0f

if raw[2]&(QR|AA|TC) != 0 || raw[3]&(Z|RCODE) != 0 {
    return false
}

That's much clearer and requires no thinking on the part of the next person who reads or maintains the code. Along the way Marek had also pointed me to some information about the AD and CD bits which meant that I'd improved the code and learnt something new.
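
To convince yourself that the named constants reproduce the original magic numbers, the same check can be exercised in isolation. The following is a small Python transcription written for illustration only (the production filter is the Go code above); the `is_plausible_query` name and the length guard are assumptions, not part of the original server.

```python
# Flag masks for byte 2 of the DNS header (QR, OPCODE, AA, TC, RD).
QR = 0x80
AA = 0x04
TC = 0x02

# Flag masks for byte 3 of the DNS header (RA, Z, AD, CD, RCODE).
Z = 0x40
RCODE = 0x0F


def is_plausible_query(raw):
    """Return False for packets carrying flags only used in responses."""
    if len(raw) < 4:
        return False  # too short to hold the ID plus both flag bytes
    if raw[2] & (QR | AA | TC) or raw[3] & (Z | RCODE):
        return False
    return True


# The OR-ed constants are exactly the magic numbers from the first version.
print(hex(QR | AA | TC), hex(Z | RCODE))
```

A query with only the RD bit set (`raw[2] == 0x01`) passes, while anything with QR, AA, TC, Z or a non-zero RCODE is rejected.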

At that point the Approve button was hit on the pull request and the code was available to be merged into the mainline. All our code goes through this type of process to improve its quality and security, and it's an essential part of our PCI certification. Our SRE team follows a similar process for configuration changes.

Even this blog post got peer review.

If you're interested in working on CloudFlare-scale code, and making things better in any of the languages we work with (Lua, Go, C, PHP, Python, JavaScript, ...), then we're hiring in London and San Francisco.

↧

Introducing the BPF Tools

July 3, 2014, 5:00 pm

In a recent article I described the basic concepts behind the use of Berkeley Packet Filter (aka BSD Packet filter or BPF) bytecode for high performance packet filtering, and the xt_bpf iptables module. In this post I'll explain how we use BPF and xt_bpf as one tool to deal with large scale DDoS attacks.

And, today, CloudFlare is open sourcing the tools we've created to generate and deploy BPF rules.

The Code

Our BPF Tools are now available on the CloudFlare Github: https://github.com/cloudflare/bpftools

For installation instructions review the README, but typing make should do most of the work:

$ git clone https://github.com/cloudflare/bpftools.git
$ cd bpftools
$ make

The BPF Tools repository contains a number of simple Python scripts. Some of them focus on analyzing pcap files; others focus more on the generation and use of the BPF bytecode itself:

pcap2hex, hex2pcap
parsedns
bpfgen
filter
iptables_bpf, iptables_bpf_chain

We rely on the BPF assembler from the Linux Kernel /tools/net directory. To make your life easier we ship a copy in linux_tools.

Here at CloudFlare we run a very large number of authoritative DNS servers and we constantly deal with malicious actors flooding our servers with, amongst other things, DNS requests. So no surprise that our current BPF Tools focus on DNS traffic, although they are easily adaptable to any other stateless floods.

The BPF Tools should be usable and in working order, but don't expect too much. These small utilities were written to be easily hackable and they will be in a state of constant flux: this is our toolkit after all. Please expect some degree of code instability.

It all starts with a pcap

Here's a concrete example of using these tools to identify and filter a DNS attack. This is based on a real world attack but the actual details have been changed for anonymization.

To start you need a pcap savefile containing a traffic dump. For example to capture a pcap of DNS requests we run:

$ sudo tcpdump -pni eth0 -s0 -w example.pcap -vv -c 10000 \
       "ip and dst port 53"
listening on eth0, link-type EN10MB (Ethernet)

It's important to record the traffic on the EN10MB (Ethernet) device, as the scripts expect to see packets with a 14-byte Ethernet header. If you forget about that and record on the any interface (LINUX_SLL) you can fix the pcap by using the pcap2hex / hex2pcap tools. They are able to amend the layer 2 header and make it look like Ethernet again:

$ cat sll.pcap | ./pcap2hex | ./hex2pcap > ethernet.pcap
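To make the layer 2 fix-up less abstract, here is a minimal Python sketch of the idea. It is not taken from pcap2hex or hex2pcap; the function name sll_to_ethernet is hypothetical, and the zeroed MAC addresses are an assumption (the cooked header does not carry a full pair of Ethernet addresses). It relies only on the standard LINUX_SLL layout: a 16-byte header whose last two bytes hold the protocol.

```python
import struct

SLL_HEADER_LEN = 16   # Linux "cooked" capture header (LINUX_SLL)
ETH_HEADER_LEN = 14   # destination MAC + source MAC + EtherType


def sll_to_ethernet(frame: bytes) -> bytes:
    """Swap a 16-byte LINUX_SLL header for a synthetic 14-byte Ethernet one."""
    if len(frame) < SLL_HEADER_LEN:
        raise ValueError("frame shorter than a LINUX_SLL header")
    # The last two bytes of the cooked header carry the protocol (EtherType).
    (ethertype,) = struct.unpack_from("!H", frame, SLL_HEADER_LEN - 2)
    # Assumption: the original MACs are not needed, so use all-zero addresses.
    fake_header = b"\x00" * 12 + struct.pack("!H", ethertype)
    return fake_header + frame[SLL_HEADER_LEN:]


# A made-up cooked frame: packet type 0, ARPHRD_ETHER (1), 6-byte address
# padded to 8 bytes, protocol 0x0800 (IPv4), followed by two payload bytes.
cooked = struct.pack("!HHH8sH", 0, 1, 6, b"\x3c\x94\xd5\xcb\x47\xf0", 0x0800)
ethernet = sll_to_ethernet(cooked + b"\x45\x00")
```

After the conversion the payload starts at byte 14, which is what the rest of the scripts expect.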

Here is a sample output of the pcap2hex tool after we captured requests going to our favorite domain www.example.uk (notice the --ascii flag):

$ cat example.pcap | ./pcap2hex --ascii | head
000ffffff6603c94d5cb47f0080045000056817b4000f91147a3cba204c6
adf53a1aa408003500426dd26366000000010000000000010e697471766d
6e737a656c757a6f6a03777777076578616d706c6502756b000001000100
00291000000080000000        ..S..`<...G...E..V.{@...G.......
:....5.Bm.cf...........itqvmnszeluzoj.www.example.uk.......)
........
000ffffff6603c94d5cb47f008004520004fdf234000f41110107b1e341c
adf53a1a84a00035003b4a99e25c00000001000000000001076969766c69
657903777777076578616d706c6502756b00000100010000291000000080
000000      ..S..`<...G...E..O.#@.....{.4...:....5.;J..\....
.......iivliey.www.example.uk.......)........

Taking a look at the traffic, it looks like we captured a flood of requests to <random string>.www.example.uk! We see this kind of flood all the time. I believe the goal of this is to keep our DNS server busy preparing NXDOMAIN responses and not have enough CPU to serve legitimate traffic.

Let's take a closer look at these packets.

Parsing the DNS request

With DNS traffic handy we can take a closer look at the details of the DNS requests. For that pick a hex-encoded packet from the output of pcap2hex and pass it to the parsedns script:

$ ./parsedns 000ffffff6603c94d5...
...
[.] l4: a408003500426dd2
      source port: 41992
 destination port: 53
           length: 66
[.] l5: 6366000000010000000000010e6974717...
               id: 0x6366
            flags: 0x0000 query op=0 rcode=0
        questions: 1
          answers: 0
             auth: 0
            extra: 1
#-46         q[0]: 'itqvmnszeluzoj' 'www' 'example' 'uk' .
                    type=0x0001 class=0x0001
         extra[0]: .
                    type=0x0029 class=0x1000
                    ttl=32768 rrlen=0:
                        bufsize=4096
                        dnssec_ok_flag

This tool pretty prints a DNS packet and presents all the interesting bits. Sometimes the flooding tools have bugs and set a bit somewhere making it easy to distinguish malicious requests from legitimate DNS queries hitting our servers.

Unfortunately the request above looks pretty normal. We could distinguish the traffic on the EDNS DNS extension, but some real recursors set this flag as well, so this strategy would result in false positives.
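If you want to see where those fields come from, the following Python sketch decodes the first hex-encoded packet from the pcap2hex output above by hand. It is only an illustration and not the parsedns implementation; the function name parse_dns_query and the returned dictionary keys are made up for this example. It assumes a 14-byte Ethernet header, IPv4 and UDP.

```python
import struct

# First packet from the pcap2hex output shown earlier (hex part only).
PACKET_HEX = (
    "000ffffff6603c94d5cb47f0080045000056817b4000f91147a3cba204c6"
    "adf53a1aa408003500426dd26366000000010000000000010e697471766d"
    "6e737a656c757a6f6a03777777076578616d706c6502756b000001000100"
    "00291000000080000000"
)


def parse_dns_query(packet_hex: str, l2_len: int = 14) -> dict:
    """Decode the UDP header, DNS header and first question of a query."""
    raw = bytes.fromhex(packet_hex)
    ihl = (raw[l2_len] & 0x0F) * 4            # IPv4 header length in bytes
    udp_off = l2_len + ihl
    sport, dport, length, _checksum = struct.unpack_from("!HHHH", raw, udp_off)
    dns_off = udp_off + 8                     # the UDP header is 8 bytes
    txid, flags, qd, an, ns, ar = struct.unpack_from("!6H", raw, dns_off)
    pos = dns_off + 12                        # the DNS header is 12 bytes
    labels = []
    while raw[pos] != 0:                      # length-prefixed labels, 0 ends
        size = raw[pos]
        labels.append(raw[pos + 1:pos + 1 + size].decode("ascii"))
        pos += 1 + size
    qtype, qclass = struct.unpack_from("!HH", raw, pos + 1)
    return {
        "source_port": sport, "destination_port": dport, "length": length,
        "id": txid, "flags": flags,
        "questions": qd, "answers": an, "auth": ns, "extra": ar,
        "labels": labels, "qtype": qtype, "qclass": qclass,
    }


info = parse_dns_query(PACKET_HEX)
print(info["source_port"], info["destination_port"], ".".join(info["labels"]))
```

The sketch ignores DNS name compression and anything after the first question, which is enough for plain queries like the one above.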

Preparing the BPF

Blocking this flood is, however, simple. We can safely assume that the www.example.uk domain doesn't have any subdomains, so instead of looking at the low-level bits of DNS packets we can drop all the packets asking for *.www.example.uk.

The tool bpfgen generates the BPF bytecode to do that. This is the most important tool in the repo.

Right now it has three "BPF generators": dns, dns_validate and suffix. We'll focus only on the first one which generates BPF rules matching given DNS domains. To match all the requests matching the pattern *.www.example.uk run:

$ ./bpfgen dns -- *.www.example.uk
18,177 0 0 0,0 0 0 20,12 0 0 0,7 0 0 0,80 0 0 0, ...

That does look pretty cryptic. Here's how you can generate an assembly-like BPF syntax:

$ ./bpfgen --assembly dns -- *.www.example.uk
    ldx 4*([0]&0xf)
    ; l3_off(0) + 8 of udp + 12 of dns
    ld #20
    add x
    tax
...

The generated code is way too long to post and explain here. I strongly recommend looking at the bpftools/gen_dns.py file and reviewing the kernel networking/filter.txt documentation.
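To get a feel for what such a filter checks without reading bytecode, here is a rough Python model of the comparison. It is not the algorithm from gen_dns.py: the helper names encode_labels and matches_wildcard are hypothetical, the match is case sensitive, and it assumes that * stands for exactly one arbitrary label (check ./bpfgen dns -- --help for the real wildcard semantics).

```python
def encode_labels(domain: str) -> bytes:
    """Encode 'www.example.uk' as length-prefixed DNS labels plus the root."""
    out = b""
    for label in domain.strip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def matches_wildcard(qname: bytes, pattern: str) -> bool:
    """Check a wire-format query name against 'name' or '*.name'."""
    if not pattern.startswith("*."):
        return qname == encode_labels(pattern)
    suffix = encode_labels(pattern[2:])
    if not qname or qname[0] == 0:
        return False                      # no leading label to skip over
    # Skip the first label (whatever it is) and compare the rest byte by byte.
    return qname[1 + qname[0]:] == suffix


flood = encode_labels("itqvmnszeluzoj.www.example.uk")
legit = encode_labels("www.example.uk")
print(matches_wildcard(flood, "*.www.example.uk"))
print(matches_wildcard(legit, "*.www.example.uk"))
```

The point is that the filter never needs to understand DNS as a whole: once the offset of the query name is known, matching is a fixed byte comparison.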

For more details about the bpfgen tool and its features see the documentation:

$ ./bpfgen --help
$ ./bpfgen dns -- --help
$ ./bpfgen dns_validate -- --help
$ ./bpfgen suffix -- --help

The BPF bytecode generated by bpfgen is somewhat special - it's prepared to be passed to the xt_bpf iptables module and not the usual tcpdump. The bytecode passed to xt_bpf must assume the packet starts from the IP header without any layer 2 header. This is not how it usually works for tcpdump which assumes packets do have a proper layer 2 header. In other words: you can't swap bytecodes between tcpdump and xt_bpf.

To work around that bpfgen has an --offset flag. To create BPF for xt_bpf you can supply the explicit --offset=0 flag:

$ ./bpfgen --offset=0 dns -- *.www.example.uk

To create BPF for tcpdump on Ethernet packets you must supply the --offset=14 flag:

$ ./bpfgen --offset=14 dns -- *.www.example.uk
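The effect of the offset can be checked with a few lines of Python. This sketch mirrors the arithmetic in the assembly comment shown earlier ("l3_off(0) + 8 of udp + 12 of dns") and applies it to the first sample packet from this post. The function name qname_offset is hypothetical and the code assumes IPv4 over UDP.

```python
# First packet from the pcap2hex output shown earlier (hex part only).
PACKET_HEX = (
    "000ffffff6603c94d5cb47f0080045000056817b4000f91147a3cba204c6"
    "adf53a1aa408003500426dd26366000000010000000000010e697471766d"
    "6e737a656c757a6f6a03777777076578616d706c6502756b000001000100"
    "00291000000080000000"
)


def qname_offset(packet: bytes, l3_off: int) -> int:
    """Offset of the DNS query name, given where the IP header starts."""
    ihl = 4 * (packet[l3_off] & 0x0F)     # same idea as: ldx 4*([l3_off]&0xf)
    return l3_off + ihl + 8 + 12          # + UDP header + DNS header


ethernet_frame = bytes.fromhex(PACKET_HEX)   # what tcpdump sees on eth0
ip_packet = ethernet_frame[14:]              # what xt_bpf sees

off_tcpdump = qname_offset(ethernet_frame, 14)
off_xt_bpf = qname_offset(ip_packet, 0)
print(off_tcpdump, off_xt_bpf)
```

Both offsets point at the same byte, the length of the first label, but they differ by the 14 bytes of Ethernet header. That difference is why bytecode built for one consumer reads the wrong bytes in the other.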

Verification

It's always a good idea to test the bytecode before putting it on production servers. For that we have a filter script. It consumes a pcap file, runs it through a tcpdump-like BPF and produces another pcap with only the packets that matched the given bytecode.

To see what traffic will match our BPF:

$ cat example.pcap \
    | ./filter -b "`./bpfgen --offset 14 dns -- *.www.example.uk`" \
    | tcpdump -nr - | wc -l
9997

Hooray, our BPF successfully matches 99.97% of the flood we recorded. Now let's see which packets it will not match:

$ cat example.pcap \
    | ./filter -b "`./bpfgen -o 14 --negate dns *.www.example.uk`" \
    | tcpdump -nr - | wc -l
3

It's often worthwhile to inspect the matched and unmatched packets and make sure the BPF is indeed correct.
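A quick sanity check on the two counts is also worthwhile: the filter and its negation should split the capture exactly. The small Python sketch below uses the numbers from the runs above, assuming tcpdump printed one line per packet and that the capture holds the full 10000 packets requested with -c 10000. The function name match_rate is just for illustration.

```python
def match_rate(matched: int, total: int) -> float:
    """Percentage of captured packets that the filter matched."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * matched / total


CAPTURED = 10000   # tcpdump was run with -c 10000
MATCHED = 9997     # line count from the matching run
UNMATCHED = 3      # line count from the --negate run

# The filter and its negation should partition the capture.
assert MATCHED + UNMATCHED == CAPTURED
print(f"{match_rate(MATCHED, CAPTURED):.2f}% matched")
```

If the two counts do not add up to the size of the capture, either the bytecode or the negation is not doing what you think.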

Note: filter uses the usual libpcap infrastructure, which is why it requires the BPF to consume a layer 2 header. We will likely rewrite that code and change filter to use BPF generated for xt_bpf.

iptables

With the BPF bytecode tested we can safely deploy it to the servers. The simplest way to do it is to apply an iptables rule manually:

iptables -I INPUT 1 \
    --wait -p udp --dport 53 \
    -m bpf --bytecode "14,0 0 0 20,177 0 0 0,12... \
    -j DROP

(You will need a recent iptables with xt_bpf support.)

This can be very cumbersome, especially because the --bytecode parameter contains spaces, which makes it pretty unfriendly for parsing with bash or ssh.
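One way to sidestep the quoting problem, if you script this yourself, is to keep the command as a list of arguments and only quote it when a single string is really needed. The Python sketch below illustrates that; it is not part of the BPF Tools, the function name iptables_drop_argv is hypothetical, and the bytecode is the truncated placeholder from the example above, so the resulting command is for illustration only and is never executed here.

```python
import shlex


def iptables_drop_argv(bytecode: str) -> list:
    """Build the argument list for the manual iptables rule shown above."""
    return [
        "iptables", "-I", "INPUT", "1",
        "--wait", "-p", "udp", "--dport", "53",
        "-m", "bpf", "--bytecode", bytecode,   # one argument, spaces included
        "-j", "DROP",
    ]


# Placeholder: truncated exactly as in the article, not a valid program.
BYTECODE = "14,0 0 0 20,177 0 0 0,12..."

argv = iptables_drop_argv(BYTECODE)
# shlex.join (Python 3.8+) quotes each argument so a shell sees it unchanged.
command_line = shlex.join(argv)
print(command_line)
```

Passing argv directly to something like subprocess.run avoids the shell altogether. The quoted string is only needed when the command has to travel through bash or ssh.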

Generating a bash script

To speed up the process we have another tool, iptables_bpf. It accepts almost the same parameters as bpfgen but, instead of printing raw BPF bytecode, it produces a bash script:

$ ./iptables_bpf dns -- *.example.uk
Generated file 'bpf_dns_ip4_any_example_uk.sh'

The generated script is fairly straightforward and at its core it applies an iptables rule like this:

iptables \
    --wait -I INPUT 1 \
    -i eth0 \
    -p udp --dport 53 \
    -m set --match-set bpf_dns_ip4_any_example_uk dst \
    -m bpf --bytecode "16,177 0 0 0,0 0 0 20,12 ... \
    -m comment --comment "dns -- *.example.uk" \
    -j DROP

As you can see, it depends on an ipset "match-set" named bpf_dns_ip4_any_example_uk. ipsets are a pretty recent addition to the iptables family and they allow us to control which destination IPs the rule will be applied to. We use this for additional safety. When you deploy the generated script, by default it will not match any traffic. Only when you add an IP to the ipset will the BPF rule be executed. To add an IP to the ipset run:

ipset add bpf_dns_ip4_any_example_uk 1.1.1.1/32

Alternatively rerun the script with an IP as a parameter:

$ sudo ./bpf_dns_ip4_any_example_uk.sh 1.1.1.1/32

If things go wrong pass --delete to remove the BPF iptables rule and the ipset:

$ sudo ./bpf_dns_ip4_any_example_uk.sh --delete
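The safety property of the ipset can be modelled in a few lines of Python. This is only a toy model of the rule's logic, not how iptables or ipset are implemented, and the function name rule_drops is made up for the example.

```python
import ipaddress


def rule_drops(dst_ip: str, ipset_entries: list, bpf_matches: bool) -> bool:
    """Model of the generated rule: drop only if both matches succeed."""
    dst = ipaddress.ip_address(dst_ip)
    in_set = any(dst in ipaddress.ip_network(entry) for entry in ipset_entries)
    return in_set and bpf_matches


# Freshly deployed: the set is empty, so nothing is dropped.
print(rule_drops("1.1.1.1", [], bpf_matches=True))
# After "ipset add ... 1.1.1.1/32": matching packets to that IP are dropped.
print(rule_drops("1.1.1.1", ["1.1.1.1/32"], bpf_matches=True))
```

Because both conditions must hold, deploying the script everywhere is harmless until a destination IP is explicitly added to the set.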

Although fairly advanced and, I hope, practical, this generated script is not really intended as a one-size-fits-all deployment tool for all BPF scripts. Feel free to tweak it or fork it for your needs.

Chaining BPF rules

In extreme cases you might want to chain BPF rules. As an example, see the iptables_bpf_chain script. You can run it like this:

$ ./iptables_bpf_chain -w example_uk \
    --accept www.example.uk \
    --accept ns.example.uk \
    --drop any
Generated file 'example_uk_ip4.sh'

The generated file will create the iptables chain example_uk and it will add three rules to it: two BPF rules accepting some packets and one rule dropping everything else. The chain will be referenced from the "INPUT" chain in a similar fashion to the previous example. Before using iptables_bpf_chain please do review it carefully.
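The first-match behaviour of such a chain can be sketched in Python. This is a simplified model, not what iptables_bpf_chain generates: it compares domain names as plain, case-sensitive strings instead of running BPF, and the names evaluate_chain and RULES are hypothetical.

```python
def evaluate_chain(qname: str, rules: list):
    """Return the verdict of the first rule that matches, iptables-style."""
    for verdict, pattern in rules:
        if pattern == "any" or qname == pattern:
            return verdict
    return None   # no rule matched; the packet falls through the chain


# Same order as the command line above: two accepts, then drop the rest.
RULES = [
    ("ACCEPT", "www.example.uk"),
    ("ACCEPT", "ns.example.uk"),
    ("DROP", "any"),
]

for name in ("www.example.uk", "ns.example.uk", "itqvmnszeluzoj.www.example.uk"):
    print(name, evaluate_chain(name, RULES))
```

Order matters here: if the drop rule came first, the two accept rules would never be reached.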

Summary

This article only scratched the surface of our tools. They can do much more, like:

match IPv6 packets
do suffix matching
match domains case insensitively
perform basic DNS request validation

For details read the documentation with --help.

Fighting packet floods is tough, but with tools in place it can be managed efficiently. The xt_bpf iptables module is very effective and with our BPF generation tools it allows us to drop malicious traffic in iptables before it hits the application.

By sharing these tools we hope to help administrators around the world. We know we are not the only ones fighting packet floods!

↧

Parabéns, Brasil! CloudFlare's 27th Data Center Now Live

July 7, 2014, 11:45 am

“Chile may have scored a CloudFlare data center first, but at least we’re still in the Cup”

Brazil is home not only to the most successful national football team in history (Pentacampeão), but also to the most technologically advanced World Cup ever hosted. From vanishing foam and goal-line technology, to the biological passport and the thermally-bonded design of the Adidas Brazuca, FIFA has finally (if begrudgingly!) entered the modern age. In honor of this great football nation, and its advance to the semi-finals, it is only fitting that we “ice the cake” with CloudFlare’s 27th data center in São Paulo.

Our journey to Brazil

As this year’s 32 qualifying teams will attest, the journey to Brazil wasn’t easy. To make our own journey happen we had to work with multiple hardware vendors to achieve certification from Anatel (the telecom regulator of Brazil), coordinate dozens of shipments, and patiently wait for the Receita Federal do Brasil (the Brazilian tax authority) to receive, inspect, inspect again (and again), and then release our equipment to its new home in the NAP do Brasil.

Our prize? Not quite the US$35 million in winnings that will greet the World Cup champions, but we do get the satisfaction of increasing the performance of the 2 million+ sites using CloudFlare for Brazil’s 100 million Internet users. This launch also means that the millions of web sites and mobile apps hosted in Brazil can now use CloudFlare to increase the performance and security of their online presence both in Brazil and around the world. If you are from Brazil, there has never been a better time to sign up for CloudFlare!

For those keeping score

São Paulo is our second data center in Latin America following our deployment in Chile. And even though only one more data center remains in this latest round of expansion, we're just getting started. Curious to know what's next? Spoiler: If you are Colombian (we're sorry for your loss!), you won't be as disappointed after reading the next blog post. Stay tuned!

↧

ReplayLastGoal and CloudFlare: Fanning World Cup Flames

July 8, 2014, 12:00 pm

The World Cup is, at once, exhilarating and agonizing, and not just for those lucky souls in Rio or Manaus, folks working from home, or anyone else sitting happily in front of a TV. I’m talking about the quiet victories and silent pains loyal fans everywhere experience week after week at their desks. The muffled “YES!” breathed through your teeth as you check the FIFA app update; or a soft sigh when a text from a friend informs you that, yep, unfortunately, Belgium managed to score in overtime; and all too many of us know the slow burning resentment that accompanies those howls wafting through an open window from a bar nearby. Sure, from your desk you can see the score, you can read the updates, you know who is winning, but you need to see that goal! And the major sports networks won’t start showing replays for hours!

Replay that last goal! NOW!

So, too, was Xavier Damman, co-founder of Storify, stuck at his desk during the World Cup. He felt our pain, but, unlike the rest of us, he also knew there was a solution. With a vision for innovation and disruption, Xavier spent a few long nights building ReplayLastGoal.com. ReplayLastGoal is a Twitter bot that automatically records and posts, instantly, an animated GIF of the most recent World Cup goal. By the time you get that phone alert, or start hearing those cheers from down the street, you can watch the goal yourself on Twitter via @ReplayLastGoal. I’ll say it for all the World Cup fanatics far from Brazil, out of reach of a TV, but still trying to follow the action from the office: Thank you.

Xavier thinks that automatic and instant generation of sports highlights is the way people will follow sports in the future. So what did he do? He open sourced his code, and kept the project not for profit. “Use my code!” Xavier exclaimed on the phone this afternoon, “FIFA! ESPN! Use my code! Give your fans what they want: instant access to goal replays as they happen!” That’s the idea, and, to pour some sugar on top, Xavier wrote hooks for Hipchat, Slack, and Flowdock.

CloudFlare: changing the game, delivering the passion.

The World Cup is truly an international event, and Xavier created a simple solution to a global problem. ReplayLastGoal.com has changed the way those of us stuck in an office during the big game can participate in the glory.

It only took one server for Xavier to create and host ReplayLastGoal.com’s content, but, given the massive World Cup fan base, it wasn't scalable. At best, it would be great for a few friends to enjoy.

Sites like ReplayLastGoal.com are prone to downtime because, if they are massively popular, they see spikes in traffic, and those spikes can overwhelm a single server. You know this afternoon, as soon as Germany or Brazil score, thousands of people are going to want to see the replay, and ReplayLastGoal.com will get a flood of traffic.

That might be a problem if CloudFlare wasn't changing the game. But, in five minutes or less, Xavier put ReplayLastGoal.com on the CloudFlare network, letting us handle those traffic spikes. At the same time, we are able to distribute the content globally, bringing each goal closer to World Cup fans across the planet.

According to Xavier, "Developers have access to powerful tools that increase their productivity like never before. One developer in a garage can develop, in just a few nights, things that used to take months and months for a team to do. It’s companies like CloudFlare that make this possible. CloudFlare allowed me to focus on the core functionality of the app, which is to automatically take a video clip and turn it into an animated GIF file. CloudFlare takes care of load balancing and distributes the content around the world through their CDN. I can rest easy because my server only has to be able to scale to serve one request."

Keeping ReplayLastGoal.com stable during spikes in traffic, and distributing its content globally, is just one more way CloudFlare is changing the game to build a better web.

↧