Static website hosting using S3 and CloudFront
A note on dates: this post was written in 2020 using Terraform 0.12 and the nodejs12.x Lambda runtime. The architecture still holds up well, but if you’re following along today, check the current AWS provider docs and use a supported Lambda runtime and Terraform version. In particular, the aws_s3_bucket resource has since been split into smaller resources (aws_s3_bucket_acl, aws_s3_bucket_logging, aws_s3_bucket_website_configuration and so on).
Recently I had the chance to create a static website using S3 and CloudFront. I used Hugo as the framework for content management and Terraform to provision the infrastructure. There were a few interesting challenges, but in the end it worked out well.
If you are creating a brand new AWS account for this purpose, there are a few things you need to take care of for security. First, of course, is to create a new AWS user with admin privileges rather than using the root account. Second is to enable MFA on the admin user, and finally to create a billing alarm that emails you if the bill exceeds a certain value. Once you have this in place, you are good to go.
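The billing alarm can also be managed in Terraform. The following is a minimal sketch rather than what I actually ran: the topic name, email address and threshold are placeholders made up for illustration. It assumes the provider targets us-east-1, where AWS publishes billing metrics, and that billing alerts have been enabled in the account's billing preferences.

```hcl
# Hypothetical names and values throughout; adjust to taste.
resource "aws_sns_topic" "billing_alerts" {
  name = "billing-alerts"
}

# Email subscriptions stay pending until the link in the
# confirmation email is clicked.
resource "aws_sns_topic_subscription" "billing_email" {
  topic_arn = aws_sns_topic.billing_alerts.arn
  protocol  = "email"
  endpoint  = "you@example.com" # placeholder address
}

resource "aws_cloudwatch_metric_alarm" "billing" {
  alarm_name          = "estimated-charges-over-threshold"
  namespace           = "AWS/Billing"
  metric_name         = "EstimatedCharges"
  statistic           = "Maximum"
  period              = 21600 # six hours, in seconds
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 10 # USD, placeholder

  dimensions = {
    Currency = "USD"
  }

  alarm_actions = [aws_sns_topic.billing_alerts.arn]
}
```

Older AWS provider releases did not support the email protocol for SNS subscriptions, so on those versions the subscription may need to be created in the console instead.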
Terraform
I used Terraform 0.12 for this project. If you are used to an older version of Terraform, please note that variables are now referenced directly (var.name rather than "${var.name}") and that wrapping list values in extra brackets is no longer needed. I also found a neat little way to let Terraform use AWS credentials from your credentials file.
First, set your permanent AWS credentials in the ~/.aws/credentials file. You then get Terraform to use these credentials via the shared_credentials_file and profile fields in the provider, as shown below. Note that profile-long-term is the profile under which the credentials have been set in the AWS credentials file.
provider "aws" {
  region                  = "us-east-1"
  shared_credentials_file = "~/.aws/credentials"
  profile                 = "profile-long-term"
}
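The snippets in the rest of this post reference var.bucket_name and var.site_name without showing their declarations. A minimal sketch of what those declarations might look like is below; the descriptions are my own wording, and I am assuming site_name holds the bare domain, since the configuration prefixes it with www. elsewhere.

```hcl
# Assumed declarations for the two input variables used throughout.
variable "bucket_name" {
  type        = string
  description = "Name of the S3 bucket that holds the static content"
}

variable "site_name" {
  type        = string
  description = "Bare domain of the site, without the www prefix"
}
```

Values can then be supplied in a terraform.tfvars file or on the command line with -var, for example terraform plan -var="bucket_name=my-site-content" -var="site_name=example.co.uk" (both values here are placeholders).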
S3
Create two S3 buckets: one for your static content, and another to store S3 access logs. The logs are useful for debugging issues and I recommend enabling them - but expect to see a lot of log files. I have added a lifecycle rule that expires (deletes) the logs after one day.
resource "aws_s3_bucket" "mybucket" {
  bucket = var.bucket_name
  acl    = "private"

  tags = {
    App = "myapp"
  }

  logging {
    target_bucket = aws_s3_bucket.logs.bucket
    target_prefix = "${var.bucket_name}/"
  }

  website {
    index_document = "index.html"
    error_document = "error.html"
  }
}

resource "aws_s3_bucket" "logs" {
  bucket = "${var.bucket_name}-site-logs"
  acl    = "log-delivery-write"

  lifecycle_rule {
    id      = "archive"
    enabled = true

    expiration {
      days = 1
    }
  }
}
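On AWS provider versions where aws_s3_bucket has been split up, the same two buckets might be expressed roughly as follows. This is a sketch I have not applied myself. The ACL settings are deliberately left out, because newer buckets have ACLs disabled by default and the log bucket then needs its delivery permission granted another way; check the provider's upgrade guide before relying on it.

```hcl
# Sketch of the split-resource equivalent; replaces the two blocks above.
resource "aws_s3_bucket" "mybucket" {
  bucket = var.bucket_name

  tags = {
    App = "myapp"
  }
}

resource "aws_s3_bucket_logging" "mybucket" {
  bucket        = aws_s3_bucket.mybucket.id
  target_bucket = aws_s3_bucket.logs.id
  target_prefix = "${var.bucket_name}/"
}

resource "aws_s3_bucket_website_configuration" "mybucket" {
  bucket = aws_s3_bucket.mybucket.id

  index_document {
    suffix = "index.html"
  }

  error_document {
    key = "error.html"
  }
}

resource "aws_s3_bucket" "logs" {
  bucket = "${var.bucket_name}-site-logs"
}

resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    id     = "archive"
    status = "Enabled"

    # An empty filter applies the rule to every object in the bucket.
    filter {}

    expiration {
      days = 1
    }
  }
}
```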
We now need to create a policy and attach it to the bucket. Note that the Principal is the CloudFront origin access identity. The idea is to allow read access to the bucket only from CloudFront, which minimises the security risk.
resource "aws_s3_bucket_policy" "b" {
  bucket = aws_s3_bucket.mybucket.bucket
  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicRead",
      "Effect": "Allow",
      "Principal": {
        "AWS": "${aws_cloudfront_origin_access_identity.origin_access_identity.iam_arn}"
      },
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::${var.bucket_name}/*"
    }
  ]
}
POLICY
}
Note the ${...} around the origin access identity ARN - the policy is a heredoc string, so the Terraform reference has to be interpolated, not dropped in raw.
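If you would rather avoid hand-written JSON, the same statement can be built with the aws_iam_policy_document data source, which lets Terraform references be used directly. This is an alternative sketch, not what I deployed; the data source name cloudfront_read and the Sid are hypothetical, and the bucket policy resource below would replace the heredoc version above rather than sit alongside it.

```hcl
# Alternative to the heredoc: Terraform renders the JSON for us.
data "aws_iam_policy_document" "cloudfront_read" {
  statement {
    sid       = "CloudFrontRead"
    effect    = "Allow"
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.mybucket.arn}/*"]

    # Only the CloudFront origin access identity may read objects.
    principals {
      type        = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.origin_access_identity.iam_arn]
    }
  }
}

resource "aws_s3_bucket_policy" "b" {
  bucket = aws_s3_bucket.mybucket.id
  policy = data.aws_iam_policy_document.cloudfront_read.json
}
```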
CloudFront
Here we create a CloudFront distribution and an origin access identity. We point the CloudFront origin at the S3 bucket. A couple of things to note. First, the default_root_object is the index page in the root folder of the S3 bucket - it does not work for any subdirectory. To handle that, you need this neat little solution using Lambda@Edge running on the CloudFront edge. The Lambda function inspects the request as it comes in from the client and rewrites it so that CloudFront requests a default index object for any request URI that ends in /.
Second, make sure to include your www site in the aliases field. I spent quite a bit of time troubleshooting this because my site was not working on www initially.
The viewer_protocol_policy = "redirect-to-https" redirects all HTTP requests to HTTPS.
The lambda_function_association associates the CloudFront distribution with the Lambda function. Note that event_type has to be set to origin-request.
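I created custom_error_response blocks that serve the custom 404.html page in the root folder of my S3 bucket for 400, 403 and 404 errors. Note that with response_code set to 200, as in my configuration, the client receives that page with a 200 status rather than the original error code.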
The viewer_certificate associates the SSL certificate with the distribution. I’ll explain domain registration and certificate generation next - these were the only two things I had to set up manually.
resource "aws_cloudfront_origin_access_identity" "origin_access_identity" {
  comment = "cloudfront origin access identity"
}

resource "aws_cloudfront_distribution" "website_cdn" {
  enabled      = true
  price_class  = "PriceClass_200"
  http_version = "http1.1"

  origin {
    origin_id   = "origin-bucket-${aws_s3_bucket.mybucket.id}"
    domain_name = aws_s3_bucket.mybucket.bucket_regional_domain_name

    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.origin_access_identity.cloudfront_access_identity_path
    }
  }

  default_root_object = "index.html"
  aliases             = ["www.${var.site_name}", var.site_name]

  default_cache_behavior {
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "origin-bucket-${aws_s3_bucket.mybucket.id}"
    min_ttl                = "0"
    default_ttl            = "300"
    max_ttl                = "1200"
    viewer_protocol_policy = "redirect-to-https"
    compress               = true

    forwarded_values {
      query_string = false

      cookies {
        forward = "none"
      }
    }

    lambda_function_association {
      event_type   = "origin-request"
      lambda_arn   = aws_lambda_function.cf_lambda.qualified_arn
      include_body = false
    }
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    acm_certificate_arn = "arn_of_certificate"
    ssl_support_method  = "sni-only"
  }

  custom_error_response {
    error_caching_min_ttl = "0"
    error_code            = "403"
    response_code         = "200"
    response_page_path    = "/404.html"
  }

  custom_error_response {
    error_caching_min_ttl = "0"
    error_code            = "404"
    response_code         = "200"
    response_page_path    = "/404.html"
  }

  custom_error_response {
    error_caching_min_ttl = "0"
    error_code            = "400"
    response_code         = "200"
    response_page_path    = "/404.html"
  }

  depends_on = [aws_lambda_function.cf_lambda]
}
Register a domain and generate a certificate using ACM
Registering a domain is very easy with Route 53. Navigate to Route 53, click on Registered domains, and choose a domain name. I went for a co.uk domain that cost me around $10. Once you have a domain successfully registered, note down the Route 53 zone ID and head over to Certificate Manager, making sure you are in the us-east-1 region, since CloudFront only accepts ACM certificates from there. Here you request a public certificate. For my domain I went with a wildcard like *.domain.co.uk. A wildcard does not cover the bare domain, so add domain.co.uk as an additional name if you also serve the site from it, as the aliases above do. For the validation method I selected DNS validation, since I already had my domain registered, and then chose the option to create the record in Route 53. For me the process took a couple of hours to complete, and once finished the status of the certificate changes to Issued.
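I did the certificate step by hand, but it can also be expressed in Terraform. The sketch below is an assumption-laden alternative, not what I ran: the resource names site, cert_validation and the 60-second TTL are made up, it relies on the aws_route53_zone.myzone resource from the Route 53 section, and it needs a Terraform and AWS provider version recent enough to support for_each over domain_validation_options.

```hcl
# Certificate covering the bare domain and every subdomain (including www).
resource "aws_acm_certificate" "site" {
  domain_name               = var.site_name
  subject_alternative_names = ["*.${var.site_name}"]
  validation_method         = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

# One DNS validation record per name on the certificate. The bare domain and
# the wildcard typically share a record, hence allow_overwrite.
resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.site.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  zone_id         = aws_route53_zone.myzone.zone_id
  name            = each.value.name
  type            = each.value.type
  ttl             = 60
  records         = [each.value.record]
}

# Blocks until ACM reports the certificate as issued.
resource "aws_acm_certificate_validation" "site" {
  certificate_arn         = aws_acm_certificate.site.arn
  validation_record_fqdns = [for r in aws_route53_record.cert_validation : r.fqdn]
}
```

With this in place, the viewer_certificate block could reference aws_acm_certificate_validation.site.certificate_arn instead of a hard-coded ARN.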
Route 53
Since the hosted zone had already been created when I registered the domain, I had to import it into my Terraform state using the command below, where zoneid is the Route 53 zone ID noted earlier.
terraform import aws_route53_zone.myzone zoneid
I created two Route 53 records. The first is my main site without www - an A record aliased to the CloudFront distribution. The second is a wildcard CNAME pointing to my main site, which covers www along with any other subdomain.
resource "aws_route53_zone" "myzone" {
  name = var.site_name
}

resource "aws_route53_record" "record" {
  zone_id = aws_route53_zone.myzone.zone_id
  name    = var.site_name
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.website_cdn.domain_name
    zone_id                = aws_cloudfront_distribution.website_cdn.hosted_zone_id
    evaluate_target_health = false
  }
}

resource "aws_route53_record" "www-record" {
  zone_id = aws_route53_zone.myzone.zone_id
  name    = "*.${var.site_name}"
  type    = "CNAME"
  ttl     = "1"
  records = [var.site_name]
}
Lambda
Create an IAM role that can be assumed by the service principals lambda.amazonaws.com and edgelambda.amazonaws.com. Whenever a CloudFront event triggers a Lambda function, data is written to CloudWatch Logs, so the execution role needs permission to write to CloudWatch Logs. More information can be found here.
resource "aws_iam_role" "iam_for_lambda" {
  name               = "iam_for_lambda"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": [
          "lambda.amazonaws.com",
          "edgelambda.amazonaws.com"
        ]
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}

resource "aws_iam_policy" "lambda_logging" {
  name        = "lambda_logging"
  path        = "/"
  description = "IAM policy for logging from a lambda"
  policy      = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*",
      "Effect": "Allow"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy_attachment" "lambda_logs" {
  role       = aws_iam_role.iam_for_lambda.name
  policy_arn = aws_iam_policy.lambda_logging.arn
}
The data resource below creates an archive file containing the Lambda execution code mentioned here.
data "archive_file" "lambda_zip_file_int" {
  type        = "zip"
  output_path = "lambda_zip_file_int.zip"

  source {
    content  = file("index.js")
    filename = "index.js"
  }
}
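The handler code itself is only linked, not shown. As a rough idea of what index.js needs to do, here is a minimal sketch with the source inlined as a heredoc. The data source name lambda_zip_inline is hypothetical, and the handler is my own reduction of the approach described in the CloudFront section (append index.html to any URI ending in /), not a copy of the linked code.

```hcl
# Hypothetical alternative to the archive_file block above: the handler
# source is inlined instead of being read from index.js on disk.
data "archive_file" "lambda_zip_inline" {
  type        = "zip"
  output_path = "lambda_zip_inline.zip"

  source {
    filename = "index.js"
    content  = <<JS
'use strict';

// Origin-request handler: rewrite "/some/dir/" to "/some/dir/index.html"
// so that CloudFront asks S3 for the directory's index object.
exports.handler = async (event) => {
  const request = event.Records[0].cf.request;
  if (request.uri.endsWith('/')) {
    request.uri += 'index.html';
  }
  return request;
};
JS
  }
}
```

To use it, the Lambda function's filename and source_code_hash would point at data.archive_file.lambda_zip_inline instead. Note that this sketch leaves URIs without a trailing slash untouched, so a request for /blog is passed through as-is.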
Finally, we create the Lambda function.
resource "aws_lambda_function" "cf_lambda" {
  filename         = data.archive_file.lambda_zip_file_int.output_path
  source_code_hash = data.archive_file.lambda_zip_file_int.output_base64sha256
  function_name    = "index_function"
  role             = aws_iam_role.iam_for_lambda.arn
  handler          = "index.handler"
  runtime          = "nodejs12.x"
  publish          = true
  depends_on       = [aws_iam_role.iam_for_lambda, aws_iam_role_policy_attachment.lambda_logs]
}
A short note on Hugo
Hugo is very easy to set up and use - you can find the installation steps here. The Hugo Themes page has a list of themes you can use; just clone the theme into the themes folder of your Hugo project and reference its name in the config.toml configuration file.
Once you have created your content, run HUGO_ENV="production" hugo to publish your site (add -D only if you want to include draft pages). This populates the public folder in your Hugo project. I found this script very useful for syncing my local changes to the S3 bucket.
This article pointed me to the Lambda@Edge solution when I was struggling to work out why my S3 subfolders were not being served by CloudFront. This one is also excellent reference material.
Good luck!