When do you use AWS? When do you use Azure? Is an on-premises solution still cost-effective in 2020 or 2021? This article offers advice on choosing between Azure, AWS, and on-premises solutions in the serverless age.
There is nothing worse than picking a solution and finding that the time, effort, and money spent were in vain. Based on nearly 10 years of experience, we develop a filter to help you decide on the stack that maximizes your return on investment.
How much data will I use?
The first consideration is the amount of data and I/O you use. These two dimensions often become your most costly. AWS and Azure continue to lower the cost of accessing and storing data, but the fractions of pennies add up to tens or even hundreds of thousands of dollars over time.
Develop an understanding of:
- The amount of data you need to process immediately
- Storage requirements
- The number of requests for data
- Whether a full table scan will be necessary
Serverless architecture promises to alleviate the management burden of small projects and offers terrific cost savings for large ones. Everything in between is a matter of analysis. Performing table scans adds an entirely new cost dimension, especially in AWS.
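As a back-of-the-envelope illustration, a short script shows how those fractions of pennies scale with data volume. The per-GB and per-request rates below are hypothetical placeholders, not current AWS or Azure prices — always check the provider's price sheet.

```python
# Rough monthly cost sketch for object storage plus request volume.
# Both rates are hypothetical placeholders, not quoted prices.

STORAGE_RATE_PER_GB = 0.023    # assumed $/GB-month
REQUEST_RATE_PER_1K = 0.0004   # assumed $/1,000 read requests

def monthly_storage_cost(gb_stored: float, requests: int) -> float:
    """Estimate one month of storage plus request charges."""
    storage = gb_stored * STORAGE_RATE_PER_GB
    reads = (requests / 1_000) * REQUEST_RATE_PER_1K
    return round(storage + reads, 2)

# 50 TB stored and 2 billion reads in a month:
print(monthly_storage_cost(50_000, 2_000_000_000))  # 1950.0
```

Even at a fraction of a penny per thousand requests, two billion reads contribute hundreds of dollars a month on their own.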
How complex is my project?
Project scope and complexity matter as well. A project requiring dozens of AWS Lambda or Azure Functions will become costly quite quickly.
Generate business and technical requirements and then consider:
- Whether you are applying AI or machine learning models to your data
- Whether you would benefit from splitting work into smaller compute units
- The number of pipelines that will run
- How frequently your pipelines run
Time, frequency, and scale cost money. Model training and deployment are expensive as well, with AWS SageMaker among the costliest available services. Dedicated on-premises hardware or EC2 instances may actually cut costs when compared with fully managed solutions.
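To see why frequency and scale dominate, the sketch below estimates serverless compute charges the way Lambda-style billing works: memory times duration (GB-seconds) plus a per-invocation fee. The rates are assumptions for illustration only.

```python
# Serverless compute cost sketch: charges scale with GB-seconds
# (memory x duration) plus a small per-invocation fee.
# Rates below are assumed for illustration, not quoted prices.

GB_SECOND_RATE = 0.0000166667   # assumed $/GB-second
PER_REQUEST_RATE = 0.0000002    # assumed $/invocation

def monthly_compute_cost(invocations: int, avg_ms: float, memory_mb: int) -> float:
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    return round(gb_seconds * GB_SECOND_RATE + invocations * PER_REQUEST_RATE, 2)

# 100 million invocations per month, 500 ms each, at 1024 MB:
print(monthly_compute_cost(100_000_000, 500, 1024))
```

Doubling either the invocation count or the average duration doubles the bill, which is why a pipeline that runs every minute costs an order of magnitude more than one that runs hourly.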
AWS vs Azure Pricing
If you determine that a serverless compute cluster or other architecture is perfect for your needs, you then need to perform a cost analysis. Gather as much information as possible from clients, benchmarks, and other sources regarding your data storage, usage, and compute times.
With this information, analyze Azure vs AWS pricing. You should find that:
- AWS offers lower cost storage but higher cost I/O making it better for big data
- EC2 typically starts to cut costs after your MVP if your project solves a small problem or the initial feature set is relatively light
- Amplify can cut down costs for small web projects but may be more difficult to manage if you are building online platforms with non-standard backends
- Azure SQL works well for smaller data marts, but Cosmos DB is quite expensive
- AWS Lambda is relatively inexpensive for a small number of functions but can grow to hundreds of thousands of dollars if used inappropriately
- Both offer comparable billing and management dashboards but logging to Azure costs slightly more
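The first bullet above — cheaper storage but pricier I/O — implies a breakeven point you can compute directly. The two pricing profiles below are invented for illustration; plug in real quotes from each provider's pricing calculator before deciding.

```python
# Compare two hypothetical pricing profiles: provider A with cheap
# storage but expensive I/O, provider B with the reverse.
# All four rates are invented for illustration.

def monthly_cost(gb: float, io_millions: float,
                 storage_rate: float, io_rate: float) -> float:
    return gb * storage_rate + io_millions * io_rate

def cheaper_provider(gb: float, io_millions: float) -> str:
    a = monthly_cost(gb, io_millions, storage_rate=0.020, io_rate=0.60)  # assumed
    b = monthly_cost(gb, io_millions, storage_rate=0.030, io_rate=0.40)  # assumed
    return "A" if a < b else "B"

# Big data with light access favors cheap storage; heavy I/O flips it:
print(cheaper_provider(100_000, 10))    # A: large archive, few reads
print(cheaper_provider(1_000, 5_000))   # B: small store, heavy I/O
```

The takeaway is that "which cloud is cheaper" depends entirely on your storage-to-I/O ratio, not on a universal ranking.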
Not every project is equal. Your needs are not the same as your peer’s requirements. A hybrid solution may actually work best. You can handle hundreds of users on thousands of dollars of on-premises hardware in the right circumstances. Consider every angle.
What do Azure and AWS charge for?
Diving more deeply into how these insights were gathered: online services charge money for everything. AWS is notorious for hidden fees on data ingress, data egress, IP addresses, storage, compute time, scaling, and more.
Take your expected costs and add 25 percent (50 percent if this is your first venture into the cloud) to account for scale and the unknown. Failing to do so could lead to a complete shutdown of mission-critical services.
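That buffer rule is trivial to encode, which makes it easy to bake into a budgeting spreadsheet or script:

```python
# Pad a cost estimate for scale and the unknown: +25% normally,
# +50% for a first cloud project, per the rule of thumb above.

def budget_with_buffer(estimate: float, first_cloud_project: bool = False) -> float:
    buffer = 0.50 if first_cloud_project else 0.25
    return round(estimate * (1 + buffer), 2)

print(budget_with_buffer(10_000))                            # 12500.0
print(budget_with_buffer(10_000, first_cloud_project=True))  # 15000.0
```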
Auto-scaling is terrific but induces anxiety in new developers. Luckily, services such as AWS Cognito and AWS Lambda allow you to throttle usage.
You can easily control the amount of information coming in. In the AWS API Gateway, attach limits with Cognito or by using a Lambda function and an API key. Just be aware that AWS limits the number of available API keys to 10,000.
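If you roll your own limits with a Lambda function and API keys, the core logic is a per-key token bucket. The sketch below is a minimal in-memory version with hypothetical limits; a real deployment would persist the counters outside the function, for example in DynamoDB or ElastiCache.

```python
import time
from collections import defaultdict

# Minimal per-API-key token bucket. RATE and BURST are hypothetical
# limits; a real deployment would persist this state externally.

RATE = 10    # assumed: tokens refilled per second
BURST = 20   # assumed: maximum bucket size

_buckets = defaultdict(lambda: {"tokens": float(BURST), "last": time.monotonic()})

def allow_request(api_key: str) -> bool:
    """Return True if the caller still has budget, False to throttle."""
    bucket = _buckets[api_key]
    now = time.monotonic()
    bucket["tokens"] = min(BURST, bucket["tokens"] + (now - bucket["last"]) * RATE)
    bucket["last"] = now
    if bucket["tokens"] >= 1:
        bucket["tokens"] -= 1
        return True
    return False

# The first BURST rapid calls for a key succeed; the next is throttled:
results = [allow_request("client-1") for _ in range(BURST + 1)]
print(results.count(True))  # 20
```

The same shape works as a pre-check at the top of a Lambda handler, with the API key taken from the incoming request context.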
Overall, try to:
- Create user related throttles
- Use compute functions to limit requests while charging based on usage
- Make use of secondary indices to avoid table scans in your databases
- Mix EC2 instances or virtual machines in the same data center with your fully managed serverless components
- Use Azure VMs if you plan on scaling for the long-term and Microsoft is a good fit
Each company wants your business. Their documentation is extensive and a good place to look for tips.
Azure and AWS Use Cases
Many of these insights were gained from actual experience. In my own work, I found that:
- A small data-intensive project still benefited from traditional pipelines making use of RabbitMQ
- A website requesting form input from users benefited from DynamoDB but serving clean and structured information to WebFlow was best done in Azure SQL
- ETL tasks that exceeded a single Lambda or Azure Function for a job posting website became incredibly costly even before considering the use of SageMaker
- Cosmos DB is costlier than an Elasticsearch instance, while DynamoDB can be cheaper than either for ad hoc queries and NoSQL storage
- Data in the cloud can conform to HIPAA, FERPA, and the GDPR, and cloud API gateways are easier to manage while following these legal requirements
While not exhaustive, these use cases cover everything from web development to backend processing. The allure of easy management and fast turnaround can overtake common sense if you let it.
How do I know if I can benefit from serverless computing?
You can begin to develop an idea for which stack to use with proper requirements gathering. Some projects shine in the cloud while others do not.
Take every requirement into account when choosing between Azure, AWS, and on-premises solutions. Feel free to comment below and I will try to answer any questions.