Load Balancing in AWS Cloud: Options & Considerations
This is a core technical blog, intended for architects (especially platform architects), platform owners, CTOs, and platform engineers.
About Us: We do Research & Development in Software space. Connect to know more. Let’s go on…
This article focuses specifically on the AWS load balancing services, but the concepts and design carry over if you want to extrapolate them to Azure, GCP, or any other cloud.
For someone coming from conventional platform architecture, AWS's load balancing offerings tend to deliver a “wow” experience (unfortunately AWS doesn't pay me to say this), given the speed at which an LB can be configured, modified, and managed.
When weighing the right physical architecture against the business requirements and cost considerations, irrespective of the platform, it is important to identify the different options and from them pick the best one.
AWS's Arsenal of Load Balancers
This article considers two “types” of AWS load balancers:
- NLB or Network Load Balancer: a Layer 4 load balancer, and
- ALB or Application Load Balancer: a Layer 7 load balancer
As mentioned above, the purpose of each of them is specific to the network layer that it handles.
NOTE: This article will not go into the details of networking or of the AWS load balancer service offering, as that is covered by numerous AWS-specific YouTube videos and documentation on the internet.
It is interesting to note that, for a full-fledged load balancer experience, you need the capabilities offered by both of them.
Since AWS offers each of them as a separate service, each with its own cost, this opens up different architectures, including Hybrid Cloud and Multi-Cloud, or, if just using AWS, a choice to use only one of the services for different reasons, some of which we discuss below.
Let's dive into the different options, for a Single Cloud / AWS-only scenario.
NOTE: If you wish to talk about other options viz. Multi Cloud or Hybrid Cloud — do connect with me / us.
Requirements
I will go with a list of basic, must-have requirements (on top of the LB functionality) for any HTTP traffic (web application, mobile backend APIs, or system APIs). The requirements are:
- HTTPS Offloading: the ability to terminate HTTPS at the load balancing layer and pass plain HTTP traffic on to the server, so that the request headers become accessible
- Sticky Sessions: the ability to set up sticky sessions, which ensure that clients reach the “right server” every time, i.e. the one where they originally logged in
- HTTP to HTTPS routing: ensuring that HTTP requests are routed to HTTPS for the same URL, covering ground for all end users
- Static IPs for load balancer endpoints: allowing these IPs to be whitelisted for security reasons
Assumption: For the purpose of this article, we do not focus on the architecture behind the LB and assume just one server in each data center (aka availability zone).
No Comment: Given that an LB should not, in theory, add any meaningful latency to a request journey, we will not comment on the performance impact of the LB under a load test or similar scenarios.
Option 1: Use both AWS LB Services — ALB & NLB
A typical architecture for the fulfillment of simple LB requirements as above would look as in the diagram below — and this uses both NLB and ALB.
Given that this is the recommended approach by AWS, this is loaded with all the typical advantages of a Cloud service — automated updates, availability, resilience, maintainability, management, etc.
The cost would be the key negative to mention here as it would keep creeping up with every passing year, resulting in a higher infra cost.
Cross-checking the requirements as a checklist —
- HTTPS Offloading — [Y] [@NLB]
- Sticky Sessions — [Y] [@ALB — better than NLB]
- HTTP to HTTPS routing — [Y] [@ALB]
- Static IPs — [Y] [@NLB]
Use this option when (1) there is a sufficient budget for infra, and (2) the organization prefers to rely on a fully managed service rather than build its own solution.
Option 2: Using only ALB (and not NLB)
A typical architecture that uses only ALB will look as in the diagram below
The key benefit of this approach is some reduction in cost (as there is no NLB). An AWS ALB holds great promise, as it can do pretty much everything that is needed from an LB in a conventional scenario, except that ALB doesn't allow for “Static IPs” on the LB, which might be a key requirement for customers who want to limit or whitelist these IPs (e.g. on their proxies).
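To illustrate why static IPs matter to such customers, here is a minimal sketch of the kind of allowlist check a client-side proxy might perform. The addresses and the `is_allowed` helper are hypothetical (documentation-range placeholders, not real AWS endpoints); the point is only that the check works when the LB addresses stay fixed.

```python
import ipaddress

# Hypothetical allowlist held by a client-side proxy. The entries are
# documentation-range placeholders standing in for the LB's static IPs.
ALLOWED_LB_ADDRESSES = [
    ipaddress.ip_network("203.0.113.10/32"),
    ipaddress.ip_network("203.0.113.11/32"),
]


def is_allowed(destination_ip: str) -> bool:
    """Return True if the destination IP is covered by the allowlist."""
    address = ipaddress.ip_address(destination_ip)
    return any(address in network for network in ALLOWED_LB_ADDRESSES)


if __name__ == "__main__":
    print(is_allowed("203.0.113.10"))  # an address on the allowlist
    print(is_allowed("198.51.100.7"))  # an address the proxy has never seen
```

If the load balancer's addresses can change over time, a list like this goes stale, which is why the static IP requirement drives the choice between the options.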
Cross-checking the requirements as a checklist —
- HTTPS Offloading — [Y] [@ALB]
- Sticky Sessions — [Y] [@ALB — better than NLB]
- HTTP to HTTPS routing — [Y] [@ALB]
- Static IPs — [N] — Feature Not available
Use this option when (1) there is a mandate to reduce operating costs and (2) there is no need for a static IP for load balancers.
Option 3: Using only NLB (and not ALB)
With this option, the plot thickens…
This architecture can be addressed in two sub-scenarios to meet the requirements (mentioned above).
At a high level, cross-checking the requirements as a checklist —
- HTTPS Offloading — [Y] [@NLB]
- Sticky Sessions — [Y] [@NLB] — this is not a great feature at NLB and is also not applicable if HTTPS is not offloaded at NLB as without offloading, it cannot read the headers to enforce stickiness.
- HTTP to HTTPS routing — [N] Feature Not available
- Static IPs — [Y] [@NLB]
In summary, the NLB on its own cannot route HTTP traffic to HTTPS, hence the two options below.
Use this option when a static IP is mandatory because the calling client “may” want to whitelist the IPs (usually for internal or corporate applications). This mode of course saves some cost by not using ALB. Read further to narrow down to one of the specific architectures.
Mandatory HTTP to HTTPS routing
Let's understand this requirement for a second before proceeding with the options. This is when a user enters an HTTP URL in the browser and still expects the app to work. For example, if someone enters www.google.com in their browser, this is by default an HTTP request, and it can be observed that the browser is redirected to the HTTPS URL, https://www.google.com. This redirection is not automatic and has to be configured somewhere.
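The redirect described above can be sketched with Python's standard library. This is an illustrative stand-in for whatever component performs the redirect, not how AWS or Nginx implements it; the handler name, the `https_location` helper, and port 8080 are assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def https_location(host_header: str, path: str) -> str:
    """Build the HTTPS URL for the same host and path.

    Simplified: drops any ":port" suffix and does not handle IPv6 literals.
    """
    hostname = host_header.split(":", 1)[0]
    return f"https://{hostname}{path}"


class RedirectToHttps(BaseHTTPRequestHandler):
    """Answers every plain-HTTP GET with a permanent redirect to HTTPS."""

    def do_GET(self):
        host = self.headers.get("Host", "localhost")
        self.send_response(301)
        self.send_header("Location", https_location(host, self.path))
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the redirector (hypothetical port; call this explicitly to start)."""
    HTTPServer(("0.0.0.0", port), RedirectToHttps).serve_forever()
```

Calling `serve()` and requesting `http://localhost:8080/login` should return a 301 whose `Location` header points at the HTTPS URL for the same path.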
This requirement is key for public-facing (aka internet-facing) software, e.g. a website, a business-only portal, a multi-tenant web application, or even an API-only layer. For internal (corporate) systems this requirement can be worked around by educating employees and users or by offering them HTTPS-only links; however, the same is not possible for public-facing software.
The resulting architecture is dependent on this requirement when combined with others.
Option 3.1 HTTP to HTTPS routing is Not Mandatory
In NLB, the HTTP routing option is not available (as per the checklist above), and hence once the HTTPS offloading is done, the architecture after this layer is not aware if the incoming call was HTTPS or not.
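As a point of comparison, a Layer 7 load balancer such as ALB adds an `X-Forwarded-Proto` header that tells the backend which scheme the client used, whereas a Layer 4 load balancer does not modify the HTTP request. A minimal sketch of how a backend might inspect this follows; the function name and the "unknown" fallback are assumptions for illustration.

```python
def original_scheme(headers: dict) -> str:
    """Return the client's original scheme if a proxy reported it.

    Header names are matched case-insensitively. When no Layer 7 hop has
    added X-Forwarded-Proto, the backend cannot tell and gets "unknown".
    """
    for name, value in headers.items():
        if name.lower() == "x-forwarded-proto":
            return value.strip().lower()
    return "unknown"


if __name__ == "__main__":
    # Request that passed through a Layer 7 hop which set the header.
    print(original_scheme({"Host": "app.example.com", "X-Forwarded-Proto": "https"}))
    # Request forwarded without any HTTP-level modification.
    print(original_scheme({"Host": "app.example.com"}))
```

In the NLB-only setup, the second case is the one the layers behind the load balancer should expect.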
Once the HTTPS is offloaded, NLB has access to the Request header and hence the server stickiness (if enabled) will take effect. This ensures that the load balancing happens at the NLB and the behaviour is truly cross-site.
I have assumed that the reverse proxy layer (Nginx in the diagram) simply forwards the request within the same data center / AZ, though it could instead be configured to be cross-site. The assumption is there to show that cross-site functionality is not mandatory in this case.
Use this option when (1) HTTP to HTTPS rerouting is not mandatory and (2) a static IP is needed (and hence the choice of NLB). These use cases fit quite well for a corporate or internal application.
Option 3.2 Mandatory HTTP to HTTPS routing
In this architecture, to route all HTTP traffic to HTTPS, we need to add another layer after the NLB. We chose Nginx as the candidate, though any similar product could be chosen. HTTPS now needs to terminate in the Nginx layer, i.e. the NLB simply passes the HTTPS traffic through and just does the load balancing, without any sticky sessions functionality, as the NLB won't be able to read the request headers for sticky sessions to be applicable.
The Nginx layer will need to do the SSL offloading, load balance across both nodes, and also implement the sticky sessions functionality (if required).
It's interesting to note that the traffic, in this case, may arrive at one AZ / data center, and then Nginx might reroute it to the other data center depending on the sticky session needs. Cross-site functionality is therefore mandatory in this case, whereas in 3.1 above it is optional.
NOTE: In such a case of cross-site traffic, traffic on the app port must be allowed from the other nodes. Static IPs on EC2 will help in this case.
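The cross-site behaviour described above can be sketched as a cookie-based routing decision made in the proxy layer. This is an illustrative model, not Nginx's actual implementation; the backend identifiers, addresses, AZ names, and cookie name are all hypothetical.

```python
import itertools

# Hypothetical backends: one app server per availability zone, matching
# the one-server-per-AZ assumption made earlier in the article.
BACKENDS = {
    "app-a": {"az": "az-a", "address": "10.0.1.10:8080"},
    "app-b": {"az": "az-b", "address": "10.0.2.10:8080"},
}
STICKY_COOKIE = "lb_backend"  # hypothetical cookie name

# New sessions are spread across the backends in turn.
_round_robin = itertools.cycle(sorted(BACKENDS))


def choose_backend(cookies: dict):
    """Return (backend_id, is_new_assignment) for a request.

    A valid sticky cookie pins the request to its backend; otherwise the
    next backend in rotation is assigned and the caller should set the cookie.
    """
    pinned = cookies.get(STICKY_COOKIE)
    if pinned in BACKENDS:
        return pinned, False
    return next(_round_robin), True


def is_cross_az(proxy_az: str, backend_id: str) -> bool:
    """True when the proxy must forward to a backend in another AZ."""
    return BACKENDS[backend_id]["az"] != proxy_az
```

For example, a request carrying `lb_backend=app-b` that lands on the proxy in `az-a` has to be forwarded across zones, which is why the app port must be reachable from the other node.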
Use this option when (1) HTTP to HTTPS rerouting is mandatory and (2) a static IP is needed (and hence the choice of NLB). These use cases fit quite well for an internet-facing application.
Summary
There are many ways to skin a cat; it's just important to understand the right set of requirements and drivers to arrive at a target architecture.
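As a rough recap, the "use this option" guidance from each section can be condensed into a small decision helper. This is one possible reading of the article's guidance, not an official AWS selection rule; the function and parameter names are hypothetical, and real decisions would weigh cost and other drivers in more detail.

```python
def recommend_option(
    static_ip_needed: bool,
    http_redirect_mandatory: bool,
    prefer_fully_managed: bool = False,
) -> str:
    """Map the article's requirements to one of its architecture options."""
    if not static_ip_needed:
        # ALB alone covers offloading, stickiness and HTTP to HTTPS routing.
        return "Option 2: ALB only"
    if prefer_fully_managed:
        # Budget allows for both services and no self-built layer is wanted.
        return "Option 1: NLB + ALB"
    if http_redirect_mandatory:
        # HTTPS ends at the Nginx layer, which also handles the redirect.
        return "Option 3.2: NLB with HTTPS passed through to Nginx"
    return "Option 3.1: NLB with HTTPS offloaded at the NLB"


if __name__ == "__main__":
    # An internet-facing app that needs static IPs and a lower infra bill.
    print(recommend_option(static_ip_needed=True, http_redirect_mandatory=True))
```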
Do connect with us in case you are looking for expertise on platform architecture or management. Our team of experienced engineers and architects is available to help.
Author: Dilish Kuruppath, BridgeApps ltd. | UK