Airbrakers, thank you for your patience with our major May migration. You’ve asked us to share details of our new infrastructure, and this will be the first of many posts where we peel off one aspect of our deployment and share our approach.
Today we’re going to talk about our Frontend.
From 2008 through 2011, the entire Airbrake codebase consisted of a monolithic Ruby on Rails application. Over time this application became optimized to handle an ever-increasing volume of customers.
[ Graph showing the increasing number of exceptions processed by Airbrake. ]
Earlier this year we began testing our new backend with a subset of our users. At the same time, we looked for opportunities to make faster progress on our frontend, giving us the ability to ship things like WebHooks (in beta), GitHub Integration, Commenting, Better Debugging (Simulate Error) and more. (These all came to Airbrake in today’s release!)
Originally, we intended to host our new frontend (still Ruby on Rails) on Amazon EC2. We made this decision because our previous hosting facility was unable to keep up with our high-throughput needs, and because our experience with Amazon in our other businesses (Exceptional and RightSignature) had been positive: they provisioned reliable service.
During our migration planning, we also brought up a staging server on Heroku (a PaaS that operates on Amazon). We had long thought Airbrake felt sluggish on our dedicated hosting provider’s servers; amazingly, the Heroku app was significantly faster. It turns out that much of our performance was being wasted on high-latency network calls to our support services (Redis, Memcache, Legacy Mongo and our MySQL database). On Heroku (and therefore Amazon), these became low-latency, high-speed calls.
We spoke to the Heroku team and heard their commitment to making Heroku a platform that could handle Airbrake (at least our frontend) and we decided to use Heroku as one component of a robust deployment environment.
We have a commitment to demonstrating Heroku’s world class hosting environment by rocking Airbrake for their team and their most excellent users.
Oren Teich, Heroku COO.
With this support, we continued to focus on the components of our migration. We’ve added a pair of NGINX + HAProxy gateways on Amazon AWS XL instances (XL for network throughput). These serve Airbrake.io’s root domain. Requests to /notifier_api/v2/notices are routed to our new backend and the rest of the requests head onward to our frontend servers.
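To make the split concrete, here is a minimal Python sketch of the routing decision described above. It is an illustration only, not our actual NGINX or HAProxy configuration: the function name choose_pool and the pool labels "backend" and "frontend" are made up for this example, and matching on an exact path or path prefix is an assumption about how the rule behaves.

```python
# Hypothetical sketch of the gateway's path-based routing decision.
# The real gateways are NGINX + HAProxy; this only models the rule.

# The one path from the post that is routed to the new backend.
BACKEND_PREFIX = "/notifier_api/v2/notices"


def choose_pool(path: str) -> str:
    """Return the (made-up) name of the pool that should serve this path."""
    # Ignore any query string when matching the path.
    route = path.split("?", 1)[0]
    # Match the notices endpoint itself or anything nested under it.
    if route == BACKEND_PREFIX or route.startswith(BACKEND_PREFIX + "/"):
        return "backend"
    # Everything else heads onward to the frontend servers.
    return "frontend"


if __name__ == "__main__":
    for p in ("/notifier_api/v2/notices", "/errors.xml", "/"):
        print(p, "->", choose_pool(p))
```

Keeping the rule this narrow means only exception submissions bypass the Rails frontend; every other request takes the frontend path.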
We launched two weeks ago with our own EC2 application pool as well as Heroku serving the application tier for our frontend. Unfortunately, our HAProxy configuration did not correctly account for session handling, and our ad-hoc attempts to remedy this were causing more pain than relief. During one fix attempt, we sent all frontend traffic to our Heroku application servers. We started to bump our Heroku dynos to match our usage and met demand at just over 75 dynos. We have found that a large number of our users have scripted access to /errors.xml and other popular endpoints, and this volume of frequent requests necessitates a high dyno count. Dyno count aside, Heroku handled our application load stably (besides a few of these that occurred as our initial 20 dynos were saturated).
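For a rough feel of why frequent scripted requests push the dyno count up, here is a back-of-the-envelope sketch in Python based on Little’s law (requests in flight = arrival rate × mean response time). It assumes each dyno serves one request at a time, and the traffic numbers in the example are invented for illustration; they are not our measured figures, and dynos_needed is a hypothetical helper, not a Heroku API.

```python
import math


def dynos_needed(requests_per_second: float,
                 mean_response_seconds: float,
                 headroom: float = 1.0) -> int:
    """Estimate dynos required, assuming one request per dyno at a time.

    headroom is a multiplier (e.g. 1.5) to leave spare capacity for bursts.
    """
    # Little's law: average number of requests being served concurrently.
    in_flight = requests_per_second * mean_response_seconds
    # Round up, and never go below a single dyno.
    return max(1, math.ceil(in_flight * headroom))


if __name__ == "__main__":
    # Invented numbers: 40 requests/s at 0.25 s each, with 50% headroom.
    print(dynos_needed(40, 0.25, headroom=1.5))
```

The estimate scales linearly with request rate, so a steady stream of polling scripts raises the required dyno count even when each individual request is cheap.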
We let this architecture settle in with our usage patterns, and for the past two weeks we have seen consistently faster access times for our users than before. We’ve been monitoring our support requests and we do have a few issues, mainly due to our new backend (next week!), but our frontend has been blissfully consistent.