Chapter 7 - DevOps and Deployment
By John Lenz.
DevOps is one of the new buzzwords and therefore brings with it a lot of fluff and hype, but DevOps has a kernel of great ideas: infrastructure as code, immutable infrastructure, and "Treat servers like cattle, not pets". Infrastructure as code is the idea that you should never manually run commands directly on servers or configure them individually; everything should be described and controlled by files checked into your source repository. Immutable infrastructure is the extension where you move away from shell scripts which edit existing infrastructure. The core idea is that in your source repository you describe what you want the final infrastructure to look like and then say: make it so!
There is a huge range of options for infrastructure: everything from running Kubernetes on your own hardware, to cloud providers, to a fully managed system like Heroku. Once I started using infrastructure as code I quickly realized it is shockingly easy to deploy infrastructure, and even writing the deployment code is quite easy once you learn it. Cloud providers are a natural fit since, with the infrastructure automated, you can create and destroy resources such as compute instances, storage, and databases as needed. I chose AWS. I think of AWS as the Swiss Army knife of cloud providers; AWS provides a huge range of tools and resources and lets you combine them however you want, creating or destroying compute, storage, and services on demand. This lets AWS live in a sweet spot of large flexibility and infrastructure automation, making it much simpler than running or renting your own hardware and almost as straightforward as a service like Heroku. Google's cloud was a close second, but at the moment they don't have a great way of running Haskell (App Engine has a beta version using Docker which might work in the future). I plan on keeping Google's cloud in mind.
The next step is automation. All of the cloud providers allow you to create resources by clicking around their website. Want a Redis cluster? Log in to the AWS ElastiCache developer console website, enter a few text fields like how many servers and how much memory, and click the start button. This goes against infrastructure as code, makes it difficult to replicate a development environment and production environment, and makes it hard to manage changes and updates. Thankfully, AWS (and other cloud providers) allow everything to be done via APIs. There are a large number of tools which take advantage of these APIs to automate infrastructure, and this is currently a space experiencing rapid growth with a new tool or service coming out what seems like every other week (as of late 2016 at least).
Containers are another buzzword, but at their core containers provide immutable infrastructure. You package the app or service into a container, the container itself is never edited, and the container can be deployed automatically on a cluster of servers. I think of a container as the ultimate output of the build, test, and integrate stages. There are many kinds of containers:
Source code in a zip/tar file is a common container format. Services like Heroku, AWS ElasticBeanstalk, AWS Lambda, or Google's App Engine take your bundle of source code and deploy it to servers based on some configuration. The source code is typically required to use a specific language or framework, although sometimes you can include a statically compiled binary in the "source" bundle. Since you don't want to manually create these source zips (you want your source in source control), these source bundles are created as part of the build process.
Docker. Docker allows arbitrary languages and frameworks by packaging the application together with its dependencies. Docker images are then deployed by a service which executes the Docker containers on a fleet of servers. These can be hosted options like AWS ElasticBeanstalk for Docker, AWS ECS, Google's Container Engine, or one of the many Docker management frameworks.
Virtual Machine Images such as Amazon's AMIs. Here, rather than using a generic virtual machine image such as an Ubuntu image, you bake a machine image which contains your app as part of your build process (using a tool such as Packer). Each build/application version results in a new image which, when deployed, automatically starts your application. In this way, a machine image is never configured or modified during production, keeping the infrastructure immutable.
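To give a flavor of baking an image, a Packer template needs only a builder and a couple of provisioners. This is a hedged sketch: the region, source AMI id, binary name, and the assumption that an init script already launches the app are all placeholders, not from the original text:

```json
{
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami": "ami-xxxxxxxx",
    "instance_type": "t2.micro",
    "ssh_username": "ec2-user",
    "ami_name": "myapp-{{timestamp}}"
  }],
  "provisioners": [
    { "type": "file", "source": "myapp", "destination": "/tmp/myapp" },
    { "type": "shell", "inline": [
        "sudo mv /tmp/myapp /usr/local/bin/myapp"
      ] }
  ]
}
```

Each `packer build` run produces a fresh AMI named with the build timestamp; the running instances are never touched.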
Once you have your container, you want to automatically deploy it. If we think of a container as the output of the build of a specific version, then deployment automation will automatically take the container and run it on some servers. This could be uploading a zip file to AWS Lambda, deploying the docker container on a fleet of servers using Kubernetes or ElasticBeanstalk, or starting a new Auto Scaling Group from the AMI.
The oldest tools are orchestration tools such as Puppet, Chef, Ansible, SaltStack, AWS OpsWorks, AWS CodeDeploy, and many others. These tools help run shell scripts and update configuration on a large number of servers at once, and can be used to deploy any of the above kinds of containers. You could provision a machine image, copy the application as a zip file into the image, and perform other configuration at runtime. Or the scripts can start a docker container on a number of servers. The shell scripts you write can become quite complicated and they focus on modifying, provisioning, and editing existing servers instead of immutable infrastructure, so I have avoided these tools.
The alternative is to use dedicated tools to provision infrastructure. The idea is that in a configuration file (typically YAML) in your source repository you describe what you want the end result to look like, and then run a tool which compares the configuration file to the existing infrastructure and either creates new resources or edits existing ones. A good example of this is stack for building Haskell code. In stack.yaml and the cabal file, you describe what the dependencies are, what files to build, what options to use, etc., and then say stack build or stack test, which handles all the details of how to actually perform the build or test. Similarly, you can describe in a configuration file that you want these DNS records, a Redis cluster of this size, a storage bucket with this access policy, this bundle of code running on Lambda, etc., and the tool makes it so. It can also keep the infrastructure alive across failures: for example, when an instance fails health checks, the instance is destroyed and a new instance is created.
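For example, a minimal CloudFormation template for a small Redis cluster looks something like the following sketch (the resource name and node type are illustrative):

```yaml
Resources:
  Redis:
    Type: AWS::ElastiCache::CacheCluster
    Properties:
      Engine: redis
      CacheNodeType: cache.t2.micro
      NumCacheNodes: 1
```

Running the tool against this template creates the cluster if it doesn't exist, or adjusts the existing cluster if the template changed.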
There are several tools for this, including AWS CloudFormation, Terraform, AWS SAM, and stratosphere for a Haskell EDSL. For Docker, Kubernetes is quickly becoming the biggest Docker management tool. With these tools, you specify something such as "I want 5 copies of this container running behind a load balancer" or "I want a database table with these columns" in a configuration file, and then a call to Kubernetes' kubectl, CloudFormation's CLI tool, or Terraform makes it happen. There are many other tools, such as Apex and Serverless for Lambda, and OpenShift and Deis for containers. I use CloudFormation, although I might use stratosphere to get a Haskell EDSL.
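The "5 copies behind a load balancer" example translates into a short Kubernetes config; a sketch with made-up names and ports (using the Deployment API as it existed around 2016):

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 5
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: registry.example.com/myapp:v1
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 8080
```

A single `kubectl apply -f` on this file brings the cluster to the described state.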
The source code plus configuration files such as stack.yaml and CloudFormation YAML files are in your source repository. To get the application deployed, the code flows through a sequence or pipeline of stages. Each stage is a call to one of these provisioning tools, which takes the configuration file from the source and performs the actions. Typically, there will be a build stage, a test stage, an integrate stage (which creates the container), an acceptance test stage, and various deployment stages, possibly including manual approval stages. Pipeline stages can even do fancy things such as watch the logs for 2 minutes checking for 500 error codes and, if a large number are found, automatically roll back to a previous version (called canary testing). There might even be separate pipelines: one pipeline for the application and a separate pipeline for the database.
At least for me, pipelines were quite a different way of thinking about infrastructure. Say I want to add a new DNS record: instead of adding it directly or even scripting the addition, I add the DNS record to the CloudFormation config file, push the commit, and the pipeline automatically updates the infrastructure to match the config file. The goal is that no one ever executes a command manually; everything is triggered by pushing a changeset (perhaps to a specific branch).
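Adding that DNS record is then just a few lines in the CloudFormation template rather than a console action; a sketch, with the zone and record names made up:

```yaml
Resources:
  ApiDns:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: example.com.
      Name: api.example.com.
      Type: CNAME
      TTL: '300'
      ResourceRecords:
        - myapp.us-east-1.elasticbeanstalk.com
```

Push the commit containing this change, and the pipeline's CloudFormation stage creates the record.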
There are a variety of tools to control pipelines. The simplest is actually a Makefile on the developer machine, which is where I started. The Makefile does the build with a few calls to docker and handles the infrastructure with calls to CloudFormation using the AWS command line interface or to Kubernetes using kubectl. The Makefile is only a handful of lines long since all the hard work is in the stack or CloudFormation configuration files. Other options for pipelines are hosted solutions such as AWS CodePipeline, CodeShip, and Bitbucket Pipelines. Finally, Spinnaker is a pipeline and deployment tool which you run yourself. You can also extend continuous integration tools such as Jenkins, TravisCI, or others: the CI tool creates and pushes the container (Docker image, AMI, bundle of code) to a storage service (S3, a container registry), and then uses a call to Kubernetes or CloudFormation to deploy. The disadvantage of extending CI tools in this way is that typically there isn't automatic rollback to a previous version on failure, although CI tools are always improving and I wouldn't be surprised to see them become more involved with deployment and monitoring.
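Such a Makefile really can be a handful of lines. A sketch, where the target names, image name, and file names are hypothetical (and `aws cloudformation deploy` requires a reasonably recent AWS CLI):

```makefile
IMAGE = myregistry/myapp:$(shell git rev-parse --short HEAD)

build:
	stack build
	docker build -t $(IMAGE) .

push: build
	docker push $(IMAGE)

deploy: push
	aws cloudformation deploy \
		--template-file infrastructure.yaml \
		--stack-name myapp \
		--parameter-overrides Image=$(IMAGE)
```

All the interesting detail lives in stack.yaml, the Dockerfile, and the CloudFormation template; the Makefile just sequences the calls.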
As I mentioned above, I am using AWS although I am keeping an eye on Google's Cloud. I have two separate AWS accounts with shared billing: one for development and one for production. By using the same pipelines for both, it is easy to keep the development and production environments the same. To manage the pipelines, I started out with Makefiles and eventually started using AWS's CodePipeline and CodeBuild. I have three pipelines.
The first pipeline is a short pipeline for "static" infrastructure which typically doesn't change that frequently. The pipeline creates and manages the IAM roles and policies, the VPC, the Redis ElastiCache cluster, the database, sets up CloudTrail, creates the Route53 domains, the CloudWatch log groups and metrics, the KMS keys, and so on. All of these resources are described by several CloudFormation templates in YAML. I don't use nested stacks, but instead use the new stack exports feature so these resources can be configured across several different files. To control this pipeline via Makefiles, the Makefile has a single call to CloudFormation using the AWS CLI, passing in the configuration files. Alternatively, CodePipeline has a built-in deployment stage for CloudFormation which can be used.
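Stack exports let one template publish a value and another template consume it by name; a sketch, where the export name and resources are illustrative:

```yaml
# network.yaml exports the VPC id:
Outputs:
  VpcId:
    Value: !Ref Vpc
    Export:
      Name: myapp-VpcId

# a separate template imports it, e.g. for a security group:
Resources:
  AppSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: App servers
      VpcId: !ImportValue myapp-VpcId
```

CloudFormation refuses to delete a stack whose exports are still imported elsewhere, which is a useful safety net when the infrastructure is split across files.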
The second pipeline is for the client. In a sequence of stages, it builds the client with GHCJS, builds the metalsmith static site, uses CloudFormation to create the S3 bucket and CloudFront distribution, runs the acceptance tests, and finally uploads the static files to the S3 bucket. All these stages are controlled by stack.yaml, the metalsmith configuration, or the CloudFormation template for the S3 bucket and CloudFront distribution. The various build and test stages, plus the final sync to S3, use CodeBuild.
The third pipeline is for the server. It builds the server using stack, runs the unit tests, creates a Docker container with the server, runs the acceptance tests, and deploys. For running the Haskell server, AWS provides a large number of options.
Elastic Beanstalk. The Haskell server is packaged as a docker container, the container is pushed to Amazon's Container Registry, and Beanstalk is used to create the load balancer, create the auto scaling group, and manage the versions and deployment rollout. This is the solution I currently use and it works well, mainly because I have only a single service and don't need any strange configuration.
Elastic Load Balancer + Auto Scaling Group. This is similar to Beanstalk, except you create the ELB and ASG yourself using CloudFormation (it ends up not being that much more work than ElasticBeanstalk). Instead of Docker, you bake AMIs as part of the build, using Packer to combine the server binary into, say, the Amazon Linux AMI. The AMI is then never edited or configured; it automatically boots into the running Haskell server. CloudFormation can then be used to update the ASG to point at the new AMI, and additional CodePipeline stages can monitor the CloudWatch metrics to determine if an automatic rollback to a previous AMI is needed. I didn't select this because it is essentially the same as Beanstalk, except it provides a little more control if you need to customize some aspect.
ApiGateway+Lambda. Despite not having Haskell as a language, you can run Haskell on Lambda. Lambda code runs in a Linux container, so the technique is to compile the Haskell binary statically, use the nodejs runtime, and write a small node function which just executes the Haskell binary and returns the response to Lambda (this is similar to how Apex supports Go). I am not using this currently, but may start in the future. The tooling around Lambda and ApiGateway is still in flux and there isn't any direct Haskell support yet (I would have to hack it together myself). Lambda is enticing to run Haskell especially because the overall design of the application has a stateless server which just processes requests. Perhaps once AWS's Serverless Application Model becomes a little more mature, I could look into creating a tool to automatically create a Lambda bundle from the stack build output.
Amazon EC2 Container Service. I only have one service, so currently EC2 Container Service is overkill, but it could still be used to manage the Haskell server in a Docker container. Also, ECS currently lacks features compared to Kubernetes or Spinnaker for container management, such as service discovery, although some recently released features like the Application Load Balancer and Blox are closing the gap.
Kubernetes can be run on top of AWS infrastructure. For a single service like mine, Kubernetes doesn't offer much over Beanstalk.
Spinnaker. I really like how Spinnaker works since it handles the execution and the pipelines and the monitoring, but it is a little bit much for a single application since you have to host Spinnaker yourself.
OpsWorks on EC2 or Lightsail, or AWS CodeDeploy. I suppose I should mention these because they are options, but I didn't consider them at all. OpsWorks is hosted Chef, based around running scripts and configuring individual servers. CodeDeploy is a simpler version of this, where CodeDeploy takes your application bundle and copies it to a fleet of servers, editing them similarly to how OpsWorks would.
As mentioned above, I am using Beanstalk but keeping an eye on Lambda. The pipeline first uses stack to build and run the unit tests (either locally using the Makefile or using CodeBuild as a CodePipeline stage). Next, it creates a Docker image which is pushed to Amazon's Container Registry. Finally, the deployment happens by running some AWS CLI commands to create a new version in Beanstalk.
For the Docker image, there is the stack-run image which you can use as a base, but instead I took an approach inspired by the Haskell Web Server in a 5MB Docker Image. Since I am not concerned with getting it as small as absolutely possible, I start with the statically linked busybox+uclibc image, which is only 650KiB or so. I then add the binary produced by stack build, all the libraries found by running ldd on the binary, plus the gconv libraries. I stole some code from the haskell-scratch repository, but since haskell-scratch was missing a few libraries I needed, like libcrypto and libssl, I couldn't use it directly. In any case, it was easy enough to copy and modify the Makefile from haskell-scratch to start from the busybox+uclibc image and include the shared libraries that I need.
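The heart of that Makefile is collecting the libraries ldd reports and building an image from them; a sketch, with illustrative names and paths (the gconv libraries and any extras like libcrypto/libssl still have to be added by hand, since ldd won't report libraries loaded at runtime):

```makefile
BIN  = $(shell stack path --local-install-root)/bin/myserver
LIBS = $(shell ldd $(BIN) | awk '/=> \// {print $$3}')

image:
	mkdir -p root/lib root/usr/bin
	cp $(BIN) root/usr/bin/
	cp $(LIBS) root/lib/
	docker build -t myserver .
```

The Dockerfile then just starts FROM the busybox+uclibc image and copies the root/ directory in, yielding an image barely larger than the server binary itself.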