A few weeks ago we started noticing a dramatic change in the pattern of network traffic hitting our tracking API servers in Washington DC. From a fairly stable daily pattern, we started seeing spikes of 300-400 Mbps, but our rate of legitimate traffic (events and people updates) was unchanged.
Suddenly our network traffic started spiking like crazy.
Pinning down the source of this spurious traffic was a top priority, as some of these spikes were triggering our upstream routers into a DDoS mitigation mode, where traffic was being throttled.
In my previous post covering OpenVPN, I said that we needed to restrict access to most of our servers – making them accessible only to each other, rather than open to the outside world.
How do we do this? iptables. You can add iptables rules that explicitly whitelist the IP addresses allowed through the firewall, and then drop everything else.
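To make that concrete, here is a hypothetical whitelist in iptables-restore format – the addresses, interface, and open port are all made up for illustration:

```
*filter
:INPUT DROP [0:0]
# always allow loopback and already-established connections
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# explicitly whitelist each server we own (hypothetical addresses)
-A INPUT -s 10.0.0.11 -j ACCEPT
-A INPUT -s 10.0.0.12 -j ACCEPT
# still serve public web traffic; everything else hits the DROP policy
-A INPUT -p tcp --dport 80 -j ACCEPT
COMMIT
```

With the default INPUT policy set to DROP, anything not matched above is silently discarded.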
If our network was static – meaning we would never have to add more machines – then this would be really simple. All you’d need to do is update your iptables file once with the IP of every server you own, and you’re done. No worries.
In the real world, the network isn’t static. We’re adding new machines all the time, and if we don’t update iptables at the same time, the new machines won’t be able to communicate with the old ones. To solve this problem, I dynamically generate iptables files and deploy them with Fabric.
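The generation step can be sketched roughly like this – the server names, addresses, and header rules are hypothetical, and the real code lives in the repo linked below in this post:

```python
# Sketch: build an iptables-restore file from a dynamic list of servers.
# Names and addresses are made up for illustration.

SERVERS = {
    "api1": "10.0.0.11",
    "api2": "10.0.0.12",
    "db1": "10.0.0.21",
}

HEADER = [
    "*filter",
    ":INPUT DROP [0:0]",  # default-deny: anything unmatched is dropped
    "-A INPUT -i lo -j ACCEPT",  # always allow loopback
    "-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT",
]

def generate_rules(servers):
    """Return iptables-restore text whitelisting every known server."""
    lines = list(HEADER)
    for name, ip in sorted(servers.items()):
        # tag each rule with the server name via the comment match module
        lines.append('-A INPUT -s %s -m comment --comment "%s" -j ACCEPT'
                     % (ip, name))
    lines.append("COMMIT")
    return "\n".join(lines) + "\n"
```

Adding a machine then means adding one entry to the server list and redeploying the generated file to every host – which is exactly the step Fabric automates.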
Note: all code mentioned in this post can be found on GitHub here: http://github.com/ttrefren/firewall
When I want to deploy code to http://mixpanel.com, I open a new terminal window and type fab deploy. Even though we have quite a few servers these days, our deploy process is really streamlined – we push code multiple times per day.
When you’re first starting a new web project, deployment is easy. All you have to do is log in to your server, do a git pull, and probably restart your web server. No problem.
If you grow beyond a single machine, though, this technique is rife with problems: the time it takes to deploy code grows linearly with the number of servers you have, it’s difficult to synchronize deployment, and it’s simply error-prone. Any point in your deployment process that requires you to log in to a server and type multiple commands is just asking for trouble.
We use a tool called Fabric to automate this process. Fabric makes it really easy to run commands across sets of machines. It’s similar to Capistrano (Ruby), but it’s written in Python, so it was an easy choice for us.
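A fabfile sketch gives the flavor of what fab deploy does. In a real fabfile you would import env, run, and cd from fabric.api; the tiny stand-ins below exist only so the sketch is self-contained, and the hostnames and paths are made up:

```python
# Sketch of a fabfile.py. Real Fabric code would use:
#     from fabric.api import env, run, cd
# The stand-ins below only record commands instead of running them over SSH.
from contextlib import contextmanager

executed = []

def run(cmd):
    # Fabric's run() executes cmd over SSH on every host in env.hosts;
    # this stand-in just records it for illustration.
    executed.append(cmd)

@contextmanager
def cd(path):
    # Fabric's cd() prefixes subsequent run() calls with `cd path && ...`
    yield

class env:
    # Fabric runs each task once per host in this list.
    hosts = ["web1.example.com", "web2.example.com"]

def deploy():
    """Invoked as `fab deploy`: pull the latest code and reload everywhere."""
    with cd("/srv/app"):
        run("git pull")
        run("sudo /etc/init.d/nginx reload")
```

The point is that the whole multi-server deploy collapses into one named task: Fabric handles the SSH connections and per-host iteration, so nobody has to log in anywhere by hand.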
Imagine this scenario: your company is growing rapidly and you’re hiring tons of engineers. The passwords to many of your servers are stored in plaintext in configuration files — everyone has access to them. Your main database – the one with all the user data – is available to anyone who has that file.
You fire an engineer. Now what? He knows the passwords to everything. Do you trust him? What if you’re firing him because he’s just a bad egg? What can you do?
Well, you could change the passwords on every single server… but that’s a huge pain in the ass for everyone involved. Luckily, there’s a solution for this problem, one that comes with quite a few other benefits as well: you set up a VPN.