Scalability is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth.


Containerization
An operating system feature in which the kernel allows the existence of multiple isolated user-space instances called containers. Containers isolate software from its environment so that it can run uniformly on all machines.

Virtual machine
An emulation of a physical computer that runs a full guest operating system on virtualized hardware. Unlike containers, which share the host kernel, each VM carries its own OS, making VMs heavier to start and run.
A container is a runtime instance of an image: what the image becomes in memory when executed (an image with state). An image is an executable package that includes everything needed to run an application: the code, a runtime, libraries, environment variables, and configuration files. Containers are a key enabling technology for microservices, providing a lightweight encapsulation of each component so that it is easy to maintain and replicate.
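The image/container distinction can be made concrete with a minimal Dockerfile sketch (the base image and file names here are placeholders, not from the original notes):

```dockerfile
# An image is a build-time artifact: each instruction adds a layer.
FROM python:3.12-slim

# Copy application code into the image.
WORKDIR /app
COPY app.py .

# Environment variables baked into the image.
ENV APP_ENV=production

# The command a container executes when started from this image.
CMD ["python", "app.py"]
```

`docker build -t myapp .` produces the image; `docker run myapp` creates a container, i.e. a running instance of that image with state.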
An architectural style that structures an application as a collection of loosely coupled services. The benefit of decomposing an application into smaller services is improved modularity. This makes the application easier to understand, develop, and test, and more resilient to architecture erosion. It parallelizes development by enabling small autonomous teams to develop, deploy, and scale their respective services independently. It also allows the architecture of an individual service to emerge through continuous refactoring.
A platform for developers and sysadmins to develop, deploy, and run applications with containers.

The Docker Client runs commands like docker build or docker run. The Docker Daemon (also called the Host or Engine) makes the system calls to create, operate, and manage containers. Docker images are stored in a Docker Registry, from which the daemon downloads them when needed.
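The client/daemon/registry flow can be sketched with a few CLI commands (illustrative only; they require a running Docker daemon, and the image names are examples):

```shell
# The client sends the build request; the daemon builds the image locally.
docker build -t myapp:1.0 .

# The daemon pulls an image from a registry (Docker Hub by default).
docker pull nginx:latest

# The daemon creates and starts a container from an image.
docker run -d -p 8080:80 nginx:latest
```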

Swarm is a native clustering tool for Docker. Swarm pools together several Docker hosts and exposes them as a single virtual Docker host. In Swarm mode, there are two types of nodes: Managers and workers. Manager nodes maintain cluster state and schedule services. Worker nodes execute containers. Clustering is an important feature for container technology, because it creates a cooperative group of systems that can provide redundancy through failover and scalability through automation.
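A minimal Swarm workflow looks like the following (illustrative commands; the service name and replica count are examples):

```shell
# Initialize a manager node; it maintains cluster state.
docker swarm init

# Workers join using the token printed by `swarm init`:
# docker swarm join --token <token> <manager-ip>:2377

# The manager schedules a replicated service across the nodes.
docker service create --name web --replicas 3 -p 80:80 nginx
docker service ls
```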

AWS ECS and Kubernetes are two alternative services for maintaining and coordinating clusters of containers. Their functionality is similar.
↑ Top


A cloud-computing execution model in which the cloud provider acts as the server, dynamically managing the allocation of machine resources. Pricing is based on the actual amount of resources consumed by an application, rather than on pre-purchased units of capacity. This allows for infrastructure to be provisioned and terminated quickly and cheaply.

Benefits:
  • Pay per use - paying for use rather than capacity can be significantly cheaper, depending on the use case
  • Scalability - horizontal scaling is built into the platform
  • Faster development - functions can be built standalone, independent from the rest of the application
  • Maintenance - the vendor handles server upgrades and maintenance
Drawbacks:
  • 3rd-party API - vendor lock-in
  • Lack of operational tools - debugging and testing are difficult
  • Architectural complexity - as functions get more granular, the wiring between those functions increases exponentially
  • Timeouts & latency - because serverless functions run in ephemeral containers, there is increased latency when a container does not yet exist (a cold start). In addition, a timeout prevents long-running functions from executing to completion
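A serverless function is just a standalone handler invoked once per event. A minimal AWS Lambda-style sketch in Python (the event shape shown is an assumption, modeled on an API Gateway request):

```python
import json

def handler(event, context):
    """Entry point the platform invokes for each event.

    `event` carries the trigger payload (here, an API Gateway-style request);
    `context` carries runtime metadata such as remaining execution time.
    """
    name = event.get("queryStringParameters", {}).get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello {name}"}),
    }
```

Because the handler is self-contained, it can be developed and unit-tested locally by calling it with a fake event, independent of the rest of the application.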
Serverless framework
  • vendor agnostic (AWS, Azure, Google Cloud, etc)
  • language independent (Node, Java, Python, etc)
  • packages functions AND provisions resources
  • large community that builds excellent plugins
Infrastructure as code (IaC) - The process of managing and provisioning computer data centers through machine-readable definition files rather than manual configuration.
  • Cost - reduces not only financial cost but also the people and effort required; by removing the manual component, teams can refocus their efforts on other tasks.
  • Speed - infrastructure automation enables faster execution when configuring your infrastructure, and provides visibility that helps other teams across the enterprise work more quickly and efficiently.
  • Risk - Automation removes the risk associated with human error, like manual misconfiguration; removing this can decrease downtime and increase reliability.
Service - A property of serverless.yml that names the project.
Provider - A property describing the cloud provider and default parameters for the rest of the file. The IAM role and permissions will be set here as well.
Functions - A property of serverless.yml which lists the functions in your service. Each function contains a handler pointing to its code. Environment variables and tags can be set per function. If a function fails, the event can be sent to a dead-letter queue (an SNS topic).
Events - things that trigger your functions to run (an S3 bucket upload, an SNS topic, or an HTTP endpoint created via API Gateway).
Layers - Code that you've used elsewhere (shared libraries or dependencies) that you can import into a serverless function.
Resources - A property of serverless.yml which defines the infrastructure your functions depend on, like AWS DynamoDB or AWS S3. Resources deployed by Serverless follow the naming scheme {Function Name}{CloudFormation Resource Type}{Resource Name}{SequentialID or Random String}. This does not apply to S3 buckets.
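The properties above fit together in a single serverless.yml. A minimal sketch (service, function, and table names are placeholders; the IAM syntax assumes a recent Serverless Framework version):

```yaml
service: notes-api

provider:
  name: aws
  runtime: python3.12
  stage: dev
  iam:
    role:
      statements:
        - Effect: Allow
          Action: [dynamodb:PutItem, dynamodb:GetItem]
          Resource: "*"

functions:
  createNote:
    handler: handler.create        # module.function containing the code
    environment:
      TABLE_NAME: notes
    events:
      - http:                      # API Gateway endpoint trigger
          path: notes
          method: post

resources:
  Resources:
    NotesTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: notes
        BillingMode: PAY_PER_REQUEST
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
```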
Variables - allow users to dynamically replace config values in serverless.yml config with ${} notation.
  • Self reference - ${self:someProperty}
  • Other files - ${file(../myFile.json):someProperty}
  • In JS files - reference JavaScript files to add dynamic data into your variables. ${file(../myFile.js):someModule}
  • Recursively reference properties - ${file(../config.${self:provider.stage}.json):CREDS}
  • Environment Variables - ${env:SOME_VAR}. Keep in mind that sensitive information which is provided through environment variables can be written into less protected or publicly accessible build logs, CloudFormation templates, et cetera.
  • Referencing CLI Options - ${opt:some_option}. A common CLI option is the stage (e.g. --stage dev).
  • CloudFormation Outputs - allow your service to communicate with other services/stacks using ${cf:stackName.outputKey}. Output names are added to the Export field in the resources property and can then be imported into other services with the same notation.
  • S3 Objects - ${s3:myBucket/myKey}-hello
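Several of these variable styles combined in one fragment (file paths, stack names, and variable names are hypothetical examples):

```yaml
custom:
  stage: ${opt:stage, 'dev'}                                  # CLI option with a default
  creds: ${file(./config.${self:provider.stage}.json):CREDS}  # recursive reference

provider:
  name: aws
  stage: ${opt:stage, 'dev'}
  environment:
    DB_PASSWORD: ${env:DB_PASSWORD}                      # from the shell environment
    USERS_TABLE: ${cf:users-service-dev.UsersTableName}  # another stack's output
```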
Full documentation
Reference .yml
↑ Top


Load balancing and Autoscaling
A load balancer is a device that acts as a reverse proxy and distributes network or application traffic across a number of servers. Autoscaling is when computational resources in a server farm, typically measured as the number of active servers, scale automatically based on the load on the farm. When using containers, orchestration services such as Kubernetes and Amazon ECS can handle load balancing and autoscaling for you.
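The core of a load balancer's distribution policy can be sketched in a few lines. A minimal round-robin sketch, assuming a fixed pool of servers (the server names are placeholders):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Distributes incoming requests evenly across a pool of servers."""

    def __init__(self, servers):
        self._pool = cycle(servers)

    def route(self, request):
        # Pick the next server in rotation and hand it the request.
        server = next(self._pool)
        return server, request

balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
targets = [balancer.route(f"req-{i}")[0] for i in range(6)]
# Each server receives an equal share of the six requests.
```

Real load balancers layer health checks, weighting, and session affinity on top of a policy like this.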

A content delivery network (CDN) refers to a geographically distributed group of servers which work together to provide fast delivery of Internet content. A CDN allows for the quick transfer of static assets needed for loading Internet content including HTML pages, JavaScript files, stylesheets, images, and videos. Using a CDN improves website load times, reduces bandwidth costs, increases content availability and redundancy, and improves website security (e.g. DDoS mitigation).

A cache is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere. A CDN is an example of a cache when a user retrieves assets from an edge location rather than a distant origin server. Web browsers also use caches that store static assets rather than re-retrieving them; how long these items stay cached can be set by a cache header or a service worker. For databases, Redis can be used to intercept requests and serve cached data rather than querying the database again for the same data.
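The expiry behavior described above (cache headers, Redis TTLs) can be sketched with a minimal time-to-live cache in Python; this is an illustration of the idea, not any particular product's API:

```python
import time

class TTLCache:
    """A minimal cache: serve stored results until they expire."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # stale: evict, caller re-fetches from origin
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

cache = TTLCache(ttl_seconds=60)
cache.set("user:42", {"name": "Ada"})
hit = cache.get("user:42")   # served from cache, no recomputation
miss = cache.get("user:99")  # not stored: caller must hit the origin
```

A cache header or Redis `EXPIRE` plays the same role as `ttl_seconds` here: it bounds how long stale data can be served.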

Data transfer
QUIC - A transport layer network protocol built on top of UDP (originally experimental, since standardized as RFC 9000). It provides reliable delivery while avoiding the connection-setup handshaking that gives TCP its higher latency.
Compression - compress all static assets. Also make sure your server uses gzip, a file format and software application used for file compression and decompression.
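Why gzip helps is easy to see with Python's standard library: text assets like HTML are highly repetitive, so the round trip is lossless and the payload shrinks dramatically (the sample markup is an arbitrary example):

```python
import gzip

# Text assets compress well because they are highly repetitive.
html = b"<div class='item'></div>" * 500

compressed = gzip.compress(html)
restored = gzip.decompress(compressed)

ratio = len(compressed) / len(html)
# `ratio` is a small fraction: far fewer bytes cross the network.
```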
Lazy loading - transfer data to the client as it is needed, like a stream, rather than making the client wait for the entire bundle to finish downloading.
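The streaming idea maps naturally onto a generator: the consumer starts processing the first chunk before later chunks exist. A minimal sketch (chunk sizes and the data source are stand-ins for a network fetch or paginated DB read):

```python
def lazy_records(total, chunk_size):
    """Yield records in chunks so the consumer can start work immediately,
    instead of waiting for the whole data set to be produced."""
    for start in range(0, total, chunk_size):
        # In a real app this would be a network fetch or a DB page read.
        yield list(range(start, min(start + chunk_size, total)))

consumed = []
for chunk in lazy_records(total=10, chunk_size=4):
    consumed.append(chunk)  # process each chunk as it arrives
```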

Language selection
Java - For some tasks a different language might be necessary for performance gains. Java is an excellent language for computationally intensive operations. IO operations, on the other hand, might be better served using Node.js.
WebAssembly - a web standard that defines a binary format and a corresponding assembly-like text format for executable code in Web pages. It is meant to enable executing code nearly as fast as native machine code. WebAssembly is designed to run alongside JavaScript, taking over performance-critical work, though some believe it could displace JavaScript more broadly in the future.
↑ Top