Here you will find ideas and code straight from the Software Development Team at SportsEngine. Our focus is on building great software products for the world of youth and amateur sports. We are fortunate to be able to combine our love of sports with our passion for writing code.
The SportsEngine application originated in 2006 as a single Ruby on Rails 1.2 application. Today the SportsEngine Platform is composed of more than 20 applications built on Rails and Node.js, forming a service oriented architecture that is poised to scale for the future.
Amazon Web Services (AWS) is a cloud infrastructure provider designed to make cloud computing streamlined, integrated, and uniform. AWS offers many products, including virtual servers, storage, and several database options. This post focuses on one area: AWS instances, what they cost, and how reserved instances (RIs) can reduce that cost.
Because SportsEngine uses AWS, we get flexible cloud hosting with a wide choice of server sizes, memory sizes, CPU options, and so on. Our instances are one of three types: EC2 instances, RDS instances, or ElastiCache instances. The large array of instance choices allows us to pick exactly what we need for each purpose. Some of our bigger applications need larger instances, while a few only need a medium-sized server. It's important to figure out what size each service requires because the bigger the instance, the more it costs.
AWS lays out the basic costs of running various EC2 instances here. However, for those who don't want to read the details, I'll summarize. A t2.micro instance costs about $0.012 per hour, or $8.06 a month. A t2.micro is almost the smallest and cheapest type of instance available in EC2. However, there's a significant leap to the large, powerful instances: a c4.8xlarge costs $1.591 per hour, or $1,069.15 per month. On its own, one instance doesn't sound like much money. But many instances can really add up to a hefty bill at the end of each month.
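To make the "it adds up" point concrete, here is a minimal Python sketch that turns the hourly rates above into a monthly bill. It assumes a 672-hour (28-day) month, which is the figure that reproduces the monthly prices quoted above. The fleet sizes are invented purely for illustration and are not SportsEngine's actual usage.

```python
# Hourly on-demand rates quoted above (USD per hour).
HOURLY_RATE = {"t2.micro": 0.012, "c4.8xlarge": 1.591}

# Assumption: a 672-hour (28-day) month, which matches the monthly prices above.
HOURS_PER_MONTH = 24 * 28


def monthly_cost(instance_type, count=1):
    """Return the on-demand cost in USD of running `count` instances for one month."""
    return HOURLY_RATE[instance_type] * HOURS_PER_MONTH * count


# Hypothetical fleet, for illustration only.
fleet = {"t2.micro": 10, "c4.8xlarge": 4}

if __name__ == "__main__":
    for instance_type, count in fleet.items():
        cost = monthly_cost(instance_type, count)
        print(f"{count} x {instance_type}: ${cost:,.2f} per month")
    total = sum(monthly_cost(t, n) for t, n in fleet.items())
    print(f"Total: ${total:,.2f} per month")
```

Swapping in your own instance counts gives a quick estimate of what an on-demand fleet costs before any reservations are applied.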
According to AWS, the use of reserved instances (RIs) can save users a significant amount: up to 75%, depending on the size of the RI and the payment method. The idea behind an RI is that a user can "reserve" their usage of an instance type for a specific region or a single availability zone within a region. The user then chooses a payment option: All Upfront (pay the entire amount right then and there and save the most money), Partial Upfront (pay a small upfront fee and then get charged hourly), or No Upfront (pay nothing upfront, but get charged a larger hourly fee). This RI is then yours for exactly one year or exactly three years. Let's go back to the t2.micro example above. Running a t2.micro on demand for one year costs $104.83. A one-year RI of type t2.micro costs $69 if a user pays All Upfront. That's a discount of roughly 34%! If a user pays No Upfront, then AWS will charge $6.83 monthly, or about $81.96 over a year. This comes out to a discount of about 22%. Whichever payment method and instance size you choose, a reserved instance costs less than paying on demand for the same usage.
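The discount arithmetic can be checked with a short Python sketch. It uses only the dollar figures quoted above for a t2.micro; the function name is a hypothetical helper, and the percentages it prints are computed directly from those figures rather than taken from AWS's pricing page.

```python
# Dollar figures quoted above for a t2.micro over one year (USD).
ON_DEMAND_PER_YEAR = 104.83    # on-demand usage for one year
ALL_UPFRONT_PRICE = 69.00      # one-year RI, All Upfront
NO_UPFRONT_PER_MONTH = 6.83    # one-year RI, No Upfront, billed monthly


def discount(reserved_cost, on_demand_cost):
    """Return the fraction saved by paying reserved_cost instead of on_demand_cost."""
    return 1 - reserved_cost / on_demand_cost


if __name__ == "__main__":
    options = {
        "All Upfront": ALL_UPFRONT_PRICE,
        "No Upfront": NO_UPFRONT_PER_MONTH * 12,
    }
    for name, yearly_cost in options.items():
        saved = discount(yearly_cost, ON_DEMAND_PER_YEAR)
        print(f"{name}: ${yearly_cost:.2f} per year, {saved:.0%} cheaper than on demand")
```

The same function works for any instance type once you plug in its on-demand and reserved prices.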
But, like most "saving money" plans, there are a couple of challenges to the AWS plan. With AWS it's super easy to say "We're currently using a t2.large. But that's too big. Let's switch that to a t2.medium." So the user makes a new t2.medium and shuts off their t2.large. Easy peasy. Here at SportsEngine, we do this type of conversion pretty often. But what if they had purchased an RI of type t2.large? Does that automatically convert? No. What if the users exchange two t2.mediums for one t2.large? Do the RIs realize this and switch themselves? No. Therefore, it's up to the users to continually manage their RI usage, both to maximize the discounts they receive and to avoid paying for a reservation on a server they're not even using.
To help us manage our reserved instances, we've created the AWS Auditor. The Auditor pulls all of the instances from a single account through the API, then pulls all of that account's RIs the same way. Finally, it goes through the two lists and matches them up. It's a math game. For every single running instance (a positive number), there should be a matching RI (a negative number) of the same zone, type, and size. At the end of the audit, every instance should be matched (every number should equal 0). Any running instances that are not matched show up in yellow, and any RIs that are unmatched show up in red. Everything that matches up correctly shows up in green.
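The matching game described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the AWS Auditor's actual code: the function names and the sample data are hypothetical, and each instance or RI is reduced to a (zone, instance type) pair, where the type string such as t2.large carries the size.

```python
from collections import Counter


def audit(running_instances, reserved_instances):
    """Tally +1 per running instance and -1 per RI, keyed by (zone, instance type)."""
    tally = Counter()
    for key in running_instances:
        tally[key] += 1
    for key in reserved_instances:
        tally[key] -= 1
    return dict(tally)


def status(net):
    """Map a net count to the colour described above."""
    if net > 0:
        return "yellow"  # running instances with no matching RI
    if net < 0:
        return "red"     # RIs with no matching running instance
    return "green"       # fully matched


# Hypothetical sample data, for illustration only.
running = [("us-east-1a", "t2.large"), ("us-east-1a", "t2.large"),
           ("us-east-1b", "t2.medium")]
reserved = [("us-east-1a", "t2.large"), ("us-east-1b", "t2.medium"),
            ("us-east-1b", "c4.large")]

if __name__ == "__main__":
    for (zone, instance_type), net in sorted(audit(running, reserved).items()):
        print(f"{zone} {instance_type}: {net:+d} ({status(net)})")
```

In a real audit the two input lists would come from the AWS API rather than being written by hand. The tally itself stays the same.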
There are a few other little details about our auditor. Sometimes, we intentionally decide not to purchase an RI. This could be because we're testing something, or because the instance is temporary. So, we use EC2 tags to indicate that we don't want to purchase an RI for that instance. Then, we read the tags during the audit and print those instances separately, to remind us that we are intentionally not covering them. If there's an expiration date on the tag, then we'll print that, too, so that we can plan for the future. And if any tags have recently expired, then we'll flag those so we know to reevaluate the instance and decide whether we should be purchasing an RI for it.
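As a rough illustration of the tag check described above, here is a Python sketch. The tag key, the date format, and the three labels are assumptions made for this example; the real tag name and format used by the AWS Auditor are documented in its README, not here.

```python
from datetime import date

# Hypothetical tag key; its value is either empty or an ISO expiration date.
NO_RI_TAG = "no-reserved-instance"


def classify(tags, today):
    """Decide how an instance should be treated, based on its EC2 tags.

    Returns "audit" (no tag, so match it against RIs), "ignored"
    (tagged and still valid), or "expired" (tagged, but the date has passed).
    """
    if NO_RI_TAG not in tags:
        return "audit"
    value = tags[NO_RI_TAG].strip()
    if not value:
        return "ignored"  # tagged with no expiration date
    expires = date.fromisoformat(value)
    return "expired" if expires < today else "ignored"


if __name__ == "__main__":
    today = date(2016, 6, 1)  # fixed date so the example is reproducible
    examples = {
        "web-1": {},
        "load-test": {NO_RI_TAG: "2016-08-01"},
        "old-experiment": {NO_RI_TAG: "2016-05-15"},
    }
    for name, tags in examples.items():
        print(f"{name}: {classify(tags, today)}")
```

Instances classified as expired are the ones worth a second look, since nobody has confirmed that they should still run without an RI.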
In other instances (no pun intended), an instance is auto scaled based on load, meaning that it automatically turns on or off depending on whether we need extra CPU or memory at a specific time. Because these instances turn on and off sporadically, we don't want to cover them with RIs. So, we filter those instances out of the audit matching based on the name of the instance.
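Filtering by name could look something like the Python sketch below. The naming pattern and the instance names are hypothetical; how names are actually matched is an assumption here, and shell-style wildcards are used only because they are easy to read.

```python
from fnmatch import fnmatchcase

# Hypothetical naming convention for auto scaled instances.
IGNORED_NAME_PATTERNS = ["*-autoscale-*"]


def should_audit(name, patterns=IGNORED_NAME_PATTERNS):
    """Return False for instances whose name matches an ignored pattern."""
    return not any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical instance names, for illustration only.
    names = ["web-autoscale-03", "db-primary", "worker-autoscale-11"]
    audited = [name for name in names if should_audit(name)]
    print("Included in the audit:", audited)
```

Only the instances that pass this filter would then be matched against RIs.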
SportsEngine doesn't just have one AWS account. We have many accounts, and each one has multiple instances, reserved instances, and regions (a single instance can be in one of many regions). Therefore, we need to be able to audit all of the regions in an account at once. Moreover, it's nice to be able to run the audit for each account easily. So, we created a Slack integration. By using Slack, we're able to type one line and get back the full audit for one of our accounts.
With this workflow, we can do the auditing process mentioned above for each of our accounts. We do a complete audit of all of our accounts about 6-8 times a month. This way, we're ensuring we're maintaining matches for our running instances to save money, properly matching our reserved instances (since we already spent money on them), and keeping track of how we can make continual improvements to our infrastructure.
SportsEngine is not the only organization to discover that using RIs has its challenges. There are a variety of third-party tools and services that can be bought and used to help analyze AWS spending and try to save money. This blog outlines a couple of them. CloudCheckr is the first one that popped up on my Google search. However, CloudCheckr only does analysis once a month. AWS's Trusted Advisor and Cost Explorer work similarly to CloudCheckr, except they offer additional AWS-friendly services. What we've made is different: it allows us to make our own rules for how we want to audit, what we want to consider, and what we want to leave alone. In addition, we've found that by analyzing more frequently, each analysis goes quicker, we stay better informed, and we save the most money. Moreover, we have added a decent number of custom features, such as the Slack integration, tags, and ignoring abilities. These custom-made features give us a perspective on our AWS accounts that other services don't necessarily offer.
SportsEngine is always looking for ways we can improve our AWS auditing system. Since our code for the AWS Auditor is open source, we always welcome ideas and pull requests from others in the community. The README.md has more information on actually running the auditor and on all of the different options and flags. If you feel like you have some meaningful contributions, or if you notice anything that can help others save even more money, then feel free to fork the AWS Auditor, make some awesome changes, and open a pull request.