by Alejandro Jacobo, Director of Sales - Vorbi, Inc.
In this article, you will learn how AWS charges its customers for data transfer as it travels across different regions in the U.S., out to the internet, and within its ecosystem.
Whether you are an avid AWS user or considering using AWS, you are probably aware of its unpredictable pricing. The goal of this post is to help you understand how the AWS pricing structure works and why AWS users might be thinking, "Is this a death by a thousand cuts?"
Before we dive in, I want to make one thing clear: Amazon is an outstanding company. I use their e-commerce platform more than I care to admit. According to my credit card annual account summary, Amazon.com amounted to more than 5% of my total purchases in 2016! Clearly, I am an Amazon patron.
That being said, let's dive in and learn more about Amazon Web Services.
How Does AWS Pricing Structure Work?
AWS offers a broad set of cloud services. You can pick and choose from global compute, storage, database, analytics, application, and deployment services. Because the goal of this post is to help you understand how the AWS pricing structure works, we will focus on its most popular service offering: the Amazon Elastic Compute Cloud (EC2).
There are four ways to pay for Amazon EC2 instances: On-Demand, Reserved Instances, Spot Instances, and Dedicated Hosts. Most users start with the On-Demand option because there are no long-term commitments or upfront payments; users just pay the specified "on-demand" hourly rate.
Here is where AWS pricing gets very interesting...and complicated.
AWS has different pricing for each of its four main U.S. regions:
US East (N. Virginia), US East (Ohio), US West (Northern California), US West (Oregon)
It also has different pricing for each operating system available for your EC2 instance, such as Linux, RHEL, SLES, Windows, Windows with SQL Standard, Windows with SQL Web, and Windows with SQL Enterprise.
US East (N. Virginia) is Amazon's primary standard region, so we will set up a memory-optimized r3.large EC2 instance with a Windows operating system in this region. This instance would cost $0.291 per hour. Thus, the first cost is the hourly rate to "own" an EC2 instance in the chosen region, plus the cost of the selected operating system.
The next cost to account for is data transfer "in" and "out" between an EC2 instance and other AWS regions; that is, there is a data transfer cost for every gigabyte (GB) going "in" and "out" of the EC2 instance to another AWS region.
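To put that hourly rate in monthly terms, here is a minimal Python sketch. The $0.291 rate is the one quoted above; the 30-day month and the function name are assumptions made purely for illustration.

```python
HOURLY_RATE_USD = 0.291  # r3.large, Windows, US East (N. Virginia), as quoted above

def monthly_instance_cost(hourly_rate, days=30, hours_per_day=24):
    """Cost of running one instance nonstop; the 30-day month is an assumption."""
    return hourly_rate * hours_per_day * days

print(f"${monthly_instance_cost(HOURLY_RATE_USD):.2f} per month")
```

Under those assumptions, an instance left running around the clock works out to roughly $209.52 a month before any data transfer is counted.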
To illustrate, the drawing below shows the cost of a GB traveling from the N. Virginia region to another AWS region.
As you can see, it costs $0.01 or $0.02 for every GB that goes "in" and "out" of an EC2 instance hosted in N. Virginia to another AWS region.
Although it looks like AWS is "double dipping" because data is charged as it travels in and out of its network infrastructure, it works just like a tollway: the longer the distance traveled, the higher the cost, which seems reasonable.
Next, we have the cost of transferring data OUT to the Internet from our EC2 instance. The image below illustrates the different price tiers per the number of GB transferred OUT to the internet in one month.
On average, the on-demand cost for transferring data out to the internet using a Windows SQL Enterprise EC2 (r3.large) instance is $0.04 per GB.
Before we go any further, let's do a quick recap. Thus far, we have covered 3 different costs:
Hourly cost of an EC2 instance
Data transfer among AWS U.S. regions
Data transfer OUT to the internet
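The three costs in this recap can be added up in a rough Python sketch. The rates are the ones quoted in this article; the workload figures (720 hours, 500 GB, 1,000 GB) and the function name are hypothetical, and the internet rate is the flat average rather than the actual tiered pricing.

```python
# Rates quoted in this article (US East, N. Virginia); real bills vary by tier and date.
HOURLY_RATE_USD = 0.291          # r3.large with Windows
INTER_REGION_USD_PER_GB = 0.02   # upper end of the $0.01-$0.02 range
INTERNET_OUT_USD_PER_GB = 0.04   # average figure quoted above, not the tiered rates

def estimate_monthly_bill(hours, inter_region_gb, internet_out_gb):
    """Return a per-item breakdown plus total for the three costs covered so far."""
    breakdown = {
        "instance": hours * HOURLY_RATE_USD,
        "inter_region": inter_region_gb * INTER_REGION_USD_PER_GB,
        "internet_out": internet_out_gb * INTERNET_OUT_USD_PER_GB,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown

# Hypothetical workload: 720 hours, 500 GB between regions, 1,000 GB out to the internet.
for item, cost in estimate_monthly_bill(720, 500, 1000).items():
    print(f"{item:>12}: ${cost:,.2f}")
```

Keeping each cost as its own line item makes it easier to see which "cut" is growing fastest as usage changes.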
As I stated at the beginning of this pricing structure section, AWS offers a broad set of cloud services e.g. global compute, storage, database, analytics, application, and deployment services.
Tools like Amazon Redshift (data warehouse) and Amazon Relational Database Service (RDS) provide a cost-efficient way to analyze your data and scale up databases as needed, respectively.
Every time you transfer data "in" and "out" of an EC2 instance to Amazon Redshift and Amazon RDS, there is an extra cost of $0.01 per GB. This is a potential case of "double dipping." Take a look at the illustration below...
Let's use the tollway analogy again. Imagine you are getting ready to go on a road trip. You get in your car and start driving from home towards the closest tollway entrance; let's call this entry point "A."
As soon as you enter the tollway at point "A," you remember you forgot your tennis shoes at home, so you take the closest exit (point "B"). The good news is, you already paid for the A ==> B tolls. The bad news is, you now have to pay for B ==> A to get back home.
Once you get home and grab your tennis shoes, you head back to the tollway and enter at point A, again. Consequently, you must pay for A ==> B again even though you were just there.
Similarly, every time you access and/or retrieve data from an EC2 instance to Amazon Redshift or RDS, you pay for A ==> B AND for B ==> A. In other words,
You are charged from EC2 ==> RDS and from RDS ==> EC2
However, if you pay close attention, you will notice that the charges applied can be more than just A ==> B and B ==> A. Your charges can turn into a "triple dipping" experience and look something like this...
If you are running an analytics service on EC2 (analytics are paramount to improving the user experience or recommendations), you would need to move data in bulk from your RDS database to your Redshift (data warehouse) cluster to run those "heavy" queries. The query results would then need to be exported or "unloaded" from Redshift and "looped back" into the RDS database to be used by your EC2 instance to improve user experience or recommendations.
To simplify this scenario a bit (if I haven't lost you yet), this is what 1 GB of data transfer looks like as it cycles through an EC2 instance into RDS and into Redshift, and vice versa:
EC2 ==> RDS $0.01
RDS ==> Redshift $0.01
Redshift ==> RDS ==> EC2 $0.02
Hence, it costs $0.04 for every 1 GB of data transferred from an EC2 instance to the RDS and Redshift tools at the end of its cycle or "travel experience." This does not sound like much, does it?
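To see how that $0.04 scales, here is a small Python sketch that sums the per-hop charges listed above and applies them to a few volumes. The per-GB rates come from this article; the monthly volumes and names are hypothetical.

```python
# Per-GB charges for each hop, as listed above.
HOPS = [
    ("EC2 -> RDS", 0.01),
    ("RDS -> Redshift", 0.01),
    ("Redshift -> RDS -> EC2", 0.02),
]

def cycle_cost(gb):
    """Total cost of moving `gb` gigabytes through the full cycle."""
    return gb * sum(rate for _, rate in HOPS)

for gb in (1, 1_000, 100_000):  # hypothetical monthly volumes
    print(f"{gb:>7,} GB -> ${cycle_cost(gb):,.2f}")
```

At 1 GB the charge is pocket change; at 100,000 GB a month the same cycle comes to about $4,000.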
Well, have you ever heard the "boiling frog" parable? The premise is that if a frog is suddenly put into boiling water, it will jump out. However, if the frog is put into tepid water, which is then brought to a boil slowly, it will not perceive the danger and will be cooked to death.
Which begs the question:
Are AWS users experiencing "Death by a Thousand Cuts"?
The answer to this question is twofold. First, AWS makes it uber attractive for first-time users to start using them. AWS Free* Tier includes:
750 hours per month of Linux, RHEL, or SLES t2.micro instance usage (Amazon EC2)
750 hours per month of db.t2.micro database usage (Amazon RDS)
20 GB of General Purpose (SSD) database storage (RDS)
20 GB of storage for database backups and DB Snapshots (RDS)
5 GB of Standard Storage (Amazon S3)
1 M free requests per month (AWS Lambda)
*Expires 12 months after sign-up
This is quite tempting, isn't it? If someone with a reputable and trustworthy name offered me a pool filled with tepid water and all of these extra amenities, I would probably swim in it. There is no perceived danger; after all, it is the industry's standard bearer, right?
While this looks all fine and dandy, the longer you stay and the more you "play," the higher the probability that you will exceed the aforementioned "free" parameters. If you do exceed the number of laps that you are allowed to swim, then you are liable for those extra laps you swam. But wait, shouldn't someone be there to tell you:
"Hey! You are about to exceed the number of laps you are allowed to swim!"
Unfortunately, there is no "lifeguard" to warn you. The water is too hot now and you must pay to "cool it down" if you want to stay, or pay and leave.
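A do-it-yourself "lifeguard" could be as simple as the Python sketch below. The limits are the Free Tier figures listed in this article; the usage numbers, the 80% threshold, and all names are hypothetical, and nothing here calls an AWS API.

```python
# Monthly Free Tier limits as listed in this article.
FREE_TIER_LIMITS = {
    "ec2_hours": 750,
    "rds_hours": 750,
    "rds_storage_gb": 20,
    "rds_backup_gb": 20,
    "s3_storage_gb": 5,
    "lambda_requests": 1_000_000,
}

def free_tier_warnings(usage, threshold=0.8):
    """Return a warning for each item at or above `threshold` of its limit."""
    warnings = []
    for item, limit in FREE_TIER_LIMITS.items():
        used = usage.get(item, 0)
        if used >= threshold * limit:
            warnings.append(f"{item}: {used:,} of {limit:,} ({used / limit:.0%})")
    return warnings

# Hypothetical usage figures; in practice these would come from your own records.
usage = {"ec2_hours": 700, "s3_storage_gb": 2, "lambda_requests": 990_000}
for line in free_tier_warnings(usage):
    print("Hey! You are about to exceed:", line)
```

The threshold is deliberately below 100% so the warning arrives while there are still a few "laps" left to swim.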
The second part of the "death by a thousand cuts" question has already been answered. AWS charges users in many different ways, albeit at a seemingly low cost.
The first cost is the hourly rate to own a virtual computing environment or "instance." Second, the cost of the operating system required. Third, the data transfer cost among AWS regions. Fourth, the cost of transferring data OUT to the internet. And fifth, the cost of data transfer "in" from an instance to other Amazon Web Services such as RDS and Redshift, PLUS the cost of data transfer back "out" to your instance from those services.
To recap, you learned how AWS charges data as it travels across different regions in the U.S., out to the internet, and within its ecosystem. You also learned about the Amazon Elastic Compute Cloud (EC2) and how the EC2 on-demand payment option works. Hopefully, this helped you to understand how it all works and prevent the "head-scratching" next time an AWS EC2 bill comes your way.
FINAL NOTE: We have yet to talk about security. We did not cover the costs involved in making an EC2 instance HIPAA compliant, nor the steps required to protect your data. In AWS's own words:
“All Communication Between Regions is Across the Public Internet. Therefore, You Should Use the Appropriate Encryption Methods to Protect Your Data.” - AWS
So beware: while the previously shown swimming pool picture looks alluring, your pool is located in a multi-tenant environment by default wherein you share resources with other tenants unless you indicate otherwise. Consequently, here is a more accurate representation of your public swimming pool...
For practical solutions and strategies on how to avoid "Death by a Thousand Cuts," send me an email at firstname.lastname@example.org. I will provide you with real-world alternatives to simplify your cloud experience.