Case study: Moving an app to IaaS/PaaS hosting

Singlebrook recently moved one of its long-term customers, Debt Compliance Services, from traditional VMs to IaaS/PaaS. This case study describes that project from the perspective of a full-stack engineer, and is written for IT decision makers who are considering a similar move. It focuses on the steps we took and the hurdles we encountered. It is not a think-piece about the general benefits of cloud computing.

Background

Debt Compliance Services (DCS) owns a Rails application with a MariaDB database. Since 2010, DCS had been hosted on traditional VMs. In late 2017, DCS decided to change hosting companies. The primary concern was security. Secondary concerns were lower maintenance cost and faster disaster recovery.

Since its founding in 2006, Singlebrook has operated approximately 175 applications, of which ~55 were on IaaS/PaaS and ~120 were hosted on traditional VMs or, in rare cases, physical servers. Moving applications to new hosting is a routine project at Singlebrook, and almost all of our moves in the past few years have been from traditional hosting to IaaS/PaaS.

The way things were

At the time DCS decided to move, it was hosted on two VMs: staging and production. Each deployment level had its own Rails app and database, a simple and common arrangement. On these VMs, we were responsible for everything but the hardware. Maintenance and security were moderately expensive, ongoing burdens. Horizontal scaling was impossible. Disaster recovery time (complete server rebuild) was estimated to be ~6 hours.

In all, each server ran Apache, Rails, MediaWiki (MW), MariaDB, and Elasticsearch. These components already talked to each other over network connections, with one exception: Rails and MW communicated in part via a shared filesystem. Most data was in MariaDB, but the Rails app also stored some uploaded files on disk.

So, our two main hurdles for moving to IaaS/PaaS would be:

Rails and MW would have to communicate over the network
Uploaded files would have to be stored in an IaaS object storage service like AWS S3
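
The first hurdle amounts to replacing reads from a shared disk with HTTP calls. The Ruby sketch below shows roughly what the Rails side of such a call might look like. It is an illustration only: the `WikiClient` class, the `/api/pages` path, and the JSON response shape are all hypothetical, since the actual API is not described here.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical client the Rails app could use to fetch wiki content over
# HTTP instead of reading it from a filesystem shared with MediaWiki.
class WikiClient
  # base_url:  root URL of the (assumed) wiki-side HTTP API.
  # transport: callable that takes a URI and returns the response body as a
  #            String. Defaults to a real HTTP GET; tests can inject a fake.
  def initialize(base_url, transport: ->(uri) { Net::HTTP.get(uri) })
    @base_url = base_url
    @transport = transport
  end

  # Builds the request URI for a page title, escaping it for a query string.
  def page_uri(title)
    URI("#{@base_url}/api/pages?#{URI.encode_www_form(title: title)}")
  end

  # Fetches a page and returns its parsed JSON body as a Hash.
  def fetch_page(title)
    JSON.parse(@transport.call(page_uri(title)))
  end
end

# Usage with a fake transport, so the sketch runs without a network.
fake = ->(uri) { JSON.generate("title" => "Bond Terms", "requested" => uri.to_s) }
client = WikiClient.new("https://wiki.example.com", transport: fake)
page = client.fetch_page("Bond Terms")
puts page["requested"]
```

Injecting the transport keeps the network boundary in one place, which makes it easier to test the Rails code without a running wiki.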

Shopping around and planning the move

We compared AWS, Google Cloud Platform, and Microsoft Azure, and found that AWS had the most mature offerings and the lowest prices. Just as important, Singlebrook has the most experience with AWS.

On Jan. 17th, Singlebrook presented the following plan to DCS.

The entire system would be hosted on AWS, primarily on the following four services:

Elasticsearch – MW will use Amazon Elasticsearch.
S3 – Rails will store uploaded files in S3.
RDS – Two MariaDB instances per deployment level: one for Rails and one for MW.
Heroku – Rails and MW will run on Heroku.

Heroku is a PaaS built on top of Amazon Elastic Compute Cloud (Amazon EC2). Heroku sells processes, not servers. Each process is called a dyno, and has a certain amount of memory. Dynos are not accessible via SSH. Code is deployed via git.

Our decision to use Heroku introduced a third hurdle. HTTP requests to Heroku applications pass through a routing layer that imposes a hard limit of 30 seconds per request. Many parts of DCS could not respond that quickly and would have to be optimized or moved to background processes.
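
Moving slow work to a background process usually means the web request only records the job and returns, while a separate worker does the slow part. The Ruby sketch below illustrates that pattern with an in-memory queue. It is not the DCS implementation: the `ReportJobs` class and its method names are hypothetical, and a real application would use a persistent queue and a separate worker process rather than a thread.

```ruby
require "securerandom"

# In-memory stand-in for a job queue and result store (illustration only).
class ReportJobs
  def initialize
    @queue = Queue.new
    @results = {}
    @lock = Mutex.new
  end

  # Called from the web request: records the job and returns right away,
  # so the HTTP response is sent well within the router's time limit.
  def enqueue(params)
    job_id = SecureRandom.uuid
    @lock.synchronize { @results[job_id] = { status: :queued } }
    @queue << [job_id, params]
    job_id
  end

  # Called by a later request, for example a status poll from the browser.
  def status(job_id)
    @lock.synchronize { @results.fetch(job_id).dup }
  end

  # Called by the worker: takes one job, runs the slow block, stores the result.
  def work_one
    job_id, params = @queue.pop
    output = yield(params) # the slow part, free of any request time limit
    @lock.synchronize { @results[job_id] = { status: :done, output: output } }
  end
end

jobs = ReportJobs.new
id = jobs.enqueue(quarter: "2018-Q1")
puts jobs.status(id)[:status]

# A thread stands in for the separate worker process.
worker = Thread.new { jobs.work_one { |p| "report for #{p[:quarter]}" } }
worker.join
puts jobs.status(id)[:status]
```

The trade-off is that the user no longer gets the result in the same response; the page has to poll for it or be notified when the job finishes.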

Finally, the presented plan included a thorough analysis of service options and pricing.

Timeline and order of operations

The project schedule was constrained by the start of the second fiscal quarter of 2018 (April 1). We were planning during the first quarter. Around the start of the second quarter, people would be busy in the system, finishing up the previous quarter and preparing for the next. We did not want to disrupt them during that time, so we planned either to complete the move by early March or to wait until afterward, perhaps until June.

As we worked, we deployed certain important changes to production as soon as possible, instead of waiting until moving day. First, we developed an HTTP API for the Rails app to talk to MW, and deployed that to production. Then we moved uploaded files to S3, and rewrote that part of the Rails app to use S3 for new files. That change was also deployed to production well before moving day.
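
A one-time copy of uploaded files into object storage can be sketched as a loop that uploads each file and verifies it by checksum. The Ruby below is a minimal illustration under stated assumptions, not the script used on this project: `FakeBucket` is an in-memory stand-in for an S3 bucket, and `migrate_uploads` and the sample file path are hypothetical names.

```ruby
require "digest"
require "fileutils"
require "tmpdir"

# In-memory stand-in for an object storage bucket. Assumption: a real
# migration would call an S3 client here instead.
class FakeBucket
  def initialize
    @objects = {}
  end

  def put(key, bytes)
    @objects[key] = bytes.dup
  end

  def get(key)
    @objects.fetch(key)
  end
end

# Copies every regular file under upload_dir into the bucket, keyed by its
# path relative to upload_dir, and verifies each copy by comparing SHA-256
# digests. Returns the keys that were copied and verified.
def migrate_uploads(upload_dir, bucket)
  migrated = []
  Dir.glob("**/*", base: upload_dir).sort.each do |relative_path|
    full_path = File.join(upload_dir, relative_path)
    next unless File.file?(full_path) # skip directories

    bytes = File.binread(full_path)
    bucket.put(relative_path, bytes)
    stored_digest = Digest::SHA256.hexdigest(bucket.get(relative_path))
    raise "checksum mismatch for #{relative_path}" unless stored_digest == Digest::SHA256.hexdigest(bytes)

    migrated << relative_path
  end
  migrated
end

# Usage against a temporary directory holding one sample upload.
upload_dir = Dir.mktmpdir
FileUtils.mkdir_p(File.join(upload_dir, "2018"))
File.binwrite(File.join(upload_dir, "2018", "filing.pdf"), "example bytes")

bucket = FakeBucket.new
migrated = migrate_uploads(upload_dir, bucket)
puts migrated.inspect
FileUtils.remove_entry(upload_dir)
```

Because the copy does not delete anything from disk, it can be run ahead of moving day and repeated safely.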

Anything that can be done ahead of time to make moving day easier is worth doing.

We had hoped to complete the move by March. To our great satisfaction, the entire project was complete and in production by Feb. 17th, only one month after our plan was proposed, and a full month ahead of schedule.

Post-move optimizations

Over the next month, we identified additional features of DCS that could not meet Heroku’s 30-second limit. As before, Singlebrook optimized them where possible and moved them to background processes where not. Almost all such features were administrative, or were reports used by only a few people, and we were able to fix each within a day or two of it being identified. The impact on people’s work was acceptable, but finding these features before the move stands out as one thing we could have done better.
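
One way such features might be found earlier is to scan request logs for anything approaching the limit. The Ruby sketch below assumes log lines with `path="..."` and `service=...ms` fields, modeled on Heroku's router log format. The sample lines, the `slow_requests` helper, and the 20-second threshold are illustrative assumptions, not data from this project.

```ruby
# Returns [path, service_ms] pairs for requests whose service time is at or
# above threshold_ms, slowest first. Lines without both fields are ignored.
def slow_requests(log_lines, threshold_ms:)
  hits = log_lines.map do |line|
    path = line[/\bpath="([^"]*)"/, 1]
    service_ms = line[/\bservice=(\d+)ms/, 1]
    next if path.nil? || service_ms.nil?

    [path, service_ms.to_i]
  end
  hits.compact.select { |_, ms| ms >= threshold_ms }.sort_by { |_, ms| -ms }
end

# Made-up sample lines in the assumed format.
sample_log = [
  'at=info method=GET path="/dashboard" dyno=web.1 connect=1ms service=18ms status=200',
  'at=info method=GET path="/reports/annual" dyno=web.1 connect=1ms service=27450ms status=200',
  'at=error code=H12 desc="Request timeout" method=GET path="/admin/export" dyno=web.1 connect=0ms service=30000ms status=503'
]

slow = slow_requests(sample_log, threshold_ms: 20_000)
slow.each { |path, ms| puts "#{path} took #{ms}ms" }
```

Setting the threshold below the hard limit also surfaces requests that are close to timing out, not only those that already have.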

Since then, DCS has been operating smoothly in its new hosting environment. RDS and S3 have near-perfect uptime, and Heroku’s uptime has also exceeded our requirements. We’ve since taken advantage of the new platform features available to us, like Heroku’s ability to scale horizontally and AWS’ excellent metrics (CloudWatch). We have also started to use some of the peripheral services now available to us from the AWS/Heroku service ecosystem.

If this case study was of interest to you, please let us know. We’ve done dozens of projects like this since we opened our doors in 2006. If you’re considering a similar project, please contact us for a free 30-minute consultation.

Improving security and reducing monthly operating costs of our client websites are core DCS business objectives. We very much appreciate Singlebrook’s recommendations and implementations in helping us achieve these objectives.

Jeff Wallace, Debt Compliance Services