AWS S3 Static Website Hosting. It is cheap, scalable, and performant, especially when it tag-teams with CloudFront.

This is documentation of how to host a Single Page Application (React in this case) on AWS S3 with SSL over CloudFront, using this pet project of mine as an example.

1) The project

This is a simple static site, so no redux is used; the setup would also work with redux. It is mainly react and react router. Here are the specifics:

The bundler I am using is webpack: ^3.5.5.

2) AWS S3

Apart from plain storage, S3 can also host static websites.

Note that each bucket is meant for only 1 website; you cannot have a bucket called my-static-websites with each directory hosting 1 website. No. It is going to be one bucket per website.

Set up the static website hosting configuration as such for the bucket. Take note of the Endpoint.
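As a sketch, the equivalent configuration can also be applied from the CLI via `aws s3api put-bucket-website --bucket your-bucket --website-configuration file://website.json`, where website.json might look like the following (the bucket name is a placeholder, and I am assuming both documents point at index.html, which is typical for a Single Page Application; match it to your own console setup):

```json
{
  "IndexDocument": { "Suffix": "index.html" },
  "ErrorDocument": { "Key": "index.html" }
}
```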

This setup is saying:

So when we upload the react project into the bucket:

What is happening? The /something path is looking for a file called something.html in the S3 bucket, but it is not there. Since this is a Single Page Application, there is only 1 html file, 1 GOD html file.

So here is the challenge.

We need to map all paths to the index.html file.

Since this is a react project, we do not need to map each path to a specific html page like a typical website; the index.html will load the javascript bundle, and react router will get to work showing users the correct page based on the path.

Hygiene pages

Not sure if this is the correct term for the sitemap.xml and robots.txt files, but you will need them for SEO. These files go into the root directory of your bucket as siblings to the index.html file, and their urls will eventually be https://www.yourdomain.com/robots.txt and https://www.yourdomain.com/sitemap.xml respectively.
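As a sketch, a minimal robots.txt for a site like this might look like the following (the sitemap url is a placeholder for your own domain):

```
User-agent: *
Allow: /

Sitemap: https://www.yourdomain.com/sitemap.xml
```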

3a) AWS CloudFront — Distribution

CloudFront is the CDN of AWS; it can handle the mapping of the routes, on top of caching the site.

Start off by creating a web distribution. The key configurations I would like to mention are:

Create the CloudFront distribution and wait for it to get deployed. Take note of the distribution’s Domain Name.
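Without the screenshots, those key settings translate roughly to this fragment of a CloudFront distribution config; all values here are placeholders, and note that the origin domain is the S3 website Endpoint noted earlier, not the bucket's REST endpoint (website endpoints only speak plain http, hence http-only):

```json
{
  "Origins": [
    {
      "Id": "s3-website-origin",
      "DomainName": "your-bucket.s3-website-us-west-2.amazonaws.com",
      "CustomOriginConfig": { "OriginProtocolPolicy": "http-only" }
    }
  ],
  "DefaultCacheBehavior": {
    "TargetOriginId": "s3-website-origin",
    "ViewerProtocolPolicy": "redirect-to-https"
  },
  "DefaultRootObject": "index.html",
  "Aliases": ["www.yourdomain.com"]
}
```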

3b) AWS CloudFront — Error Pages

After creating the CloudFront distribution, while its status is In Progress, proceed to the Error Pages tab. Handle response codes 404 and 403 with Customize Error Response.

Google recommends 1 week or 604800 seconds of caching.

What we are doing here is to set up CloudFront to handle missing html pages, which typically occurs when a user enters an invalid path or, in particular, when they refresh a path other than the root path.

When that happens:

  1. CloudFront will look for a file that does not exist in the S3 bucket; for a Single Page Application like this example project, there is only 1 html file in the bucket, and that is the index.html
  2. A 404 response will be returned, and our custom error response setup will hijack it. We will return a 200 response code and the index.html page instead.
  3. React router, which is loaded along with the index.html file, will look at the url and render the correct page instead of the root page. This response will be cached for the duration of the TTL for all requests to the queried path.

Why do we need to handle 403 as well? Because Amazon S3 returns this response code, instead of 404, for assets that are not present. For instance, the url https://yourdomain.com/somewhere will look for a file called somewhere (without extension) that does not exist.

PS. It used to return 404, but it seems to return 403 now; either way, it is best to handle both response codes.
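Put together, the two custom error responses correspond to something like this fragment of the distribution config, with 604800 being the one-week error caching TTL mentioned above:

```json
{
  "CustomErrorResponses": {
    "Quantity": 2,
    "Items": [
      {
        "ErrorCode": 404,
        "ResponsePagePath": "/index.html",
        "ResponseCode": "200",
        "ErrorCachingMinTTL": 604800
      },
      {
        "ErrorCode": 403,
        "ResponsePagePath": "/index.html",
        "ResponseCode": "200",
        "ErrorCachingMinTTL": 604800
      }
    ]
  }
}
```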

4) DNS

I intend to use the www version of the domain.

Go to the DNS zone file and set up as such.

This setup indicates:

  1. domain.com will be redirected to www.domain.com
  2. requests will be rewritten, if valid, from http to https

I am using namecheap.com as my DNS service provider, and they come with an option to redirect https or http non-www to https www at the DNS level.

However.

If your DNS service provider does not provide this function, you can use AWS S3 to do the redirect instead. Create another bucket with these settings.
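The redirect bucket does not hold any files; its static website hosting configuration simply forwards every request, along these lines (the host name and protocol are placeholders for your own setup):

```json
{
  "RedirectAllRequestsTo": {
    "HostName": "www.yourdomain.com",
    "Protocol": "https"
  }
}
```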

Set the value of the DNS A record of the root domain to the endpoint of this bucket.

What will be achieved is that all non-www requests will be directed to this bucket. This bucket will in turn redirect the request to the www domain, which points to the bucket where the files are. And yes, it will be a 301 redirect. In case you are wondering, this is the significance of a 301 redirect.

Conversion of http to https will be handled by the CloudFront configuration (Viewer Protocol Policy) that was set up previously.

At this point of time, you should be able to access your site like a normal website. Refreshing at a path other than the root path should also work.

All non-https requests will be redirected under the https protocol.

All non-www requests will be redirected to the www domain, under the https protocol as well.

Bots and crawlers should be able to access your robots.txt and sitemap.xml files as usual.

5) Conclusion

Pros

Cons

Side quest

In this section of the article, I will document how to automate the deployment process of a site in this setup from just the command line.

1) AWS IAM

To start off, you will need to create an IAM user and give it the necessary S3 permissions.

Note the access key id and the secret access key, as well as the User ARN.

IAM users are access control configurations in your AWS account, principally answering the question of who can do what to which of the services under your account.

Let’s call this user iam_user.

2) AWS S3

Change the bucket policy to allow this iam_user to make changes to the bucket.

{
  "Version": "2012-10-17",
  "Id": "someID",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789:user/iam_user"
      },
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::bucket-name",
        "arn:aws:s3:::bucket-name/*"
      ]
    }
  ]
}

3) Deployment

As this is a simple, mostly static, website, there are no testing scripts or CI server set up for the deployment procedure. It is just a simple task of uploading new files to the correct bucket in S3 using the AWS CLI.

Cleanup

But before uploading, make sure you clean up the distribution folder where you build your files for the production environment. Since I use webpack as my bundler, I utilise the clean-webpack-plugin to dispose of old files before building new ones. This prevents uploading the same old assets to the bucket again.

# webpack.config

const CleanWebpackPlugin = require('clean-webpack-plugin')
const HtmlWebpackPlugin = require('html-webpack-plugin')

const pathsToClean = ["dist"]
const cleanOptions = {}

...

output: {
  path: path.resolve(__dirname, "dist", "assets"), // all files are bundled into the dist/assets sub-directory
  publicPath: '/assets/',
  filename: 'bundle.js'
},

...

plugins: [
  ...,
  new CleanWebpackPlugin(pathsToClean, cleanOptions), // cleanup the whole "dist" folder
  new HtmlWebpackPlugin({
    template: "./src/index.production.html",
    filename: "../index.html" // index.html is placed 1 directory up, in the dist directory itself
  }),

...]

Uploading

Now to upload the files to S3.

To prevent any Tom, Dick, and Harry from being able to do so, authentication is required. This is where all the work on IAM comes into play.

We will use a script to do the uploading, with custom configuration to authenticate the request.

You can use the --dryrun flag to test your script before actually doing the upload. This is the final version of my script (with --dryrun still in place):

aws s3 cp ./dist s3://better-cover-letter --recursive --exclude "*.DS_Store" --acl public-read --cache-control public,max-age=604800 --dryrun --profile iam_user

The --exclude flag prevents the upload of the irritating, ever-present .DS_Store files from macOS.

The --acl flag sets the access control level of the files. Make them publicly readable so people can access your site; otherwise they will be slapped with a 403 Forbidden message.

The --cache-control flag adds the cache-control header to the S3 objects, which CloudFront passes along when it serves them. These cache control headers let the browser leverage browser caching, thereby increasing page speed. 604800 is 1 week in seconds, so this max-age value will cache these assets for a week.
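Concretely, the browser should then see a response header along these lines when fetching the assets:

```
Cache-Control: public,max-age=604800
```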

[Google] recommend[s] a minimum cache time of one week and preferably up to one year for static assets, or assets that change infrequently

The --profile flag is used to set the specific IAM user credential to authenticate this operation. As I am using this same MacBook Pro for my work and my personal projects, I have multiple AWS accounts to handle, hence the need for this flag to differentiate the IAM users. Check out AWS CLI named profiles for more information. These are my config and credentials files for your reference.

# ~/.aws/config
[default]
region=us-west-2
output=text

# ~/.aws/credentials
[iam_user]
aws_access_key_id=something
aws_secret_access_key=something

[company_user]
aws_access_key_id=something_else
aws_secret_access_key=something_else

The aws_access_key_id and aws_secret_access_key are specific to the iam_user that was created.

Once you are ready, you can remove the --dryrun flag and do a test run to ensure that your files are indeed uploaded to the correct bucket. Yes, a test run. It is not the end of the deployment step. We can go further to completely automate the whole process.

NOTE: AWS S3 does not charge for data transfer into the bucket, only out of it. So feel free to spam deployments. (In fact, S3 does not charge for data transfer out to CloudFront either.)

Combine the Steps

As it stands now, we have to build our site first using webpack -p --config webpack.config.js to generate the files, then upload them using the aws s3 cp command.

To make our lives easier, we can create a new script command to run these commands one after another, without having to wait for the first command to finish and then manually execute the other.

# package.json

..."scripts": {
  ...
  "deploy": "webpack -p --config webpack.config.prod.js && aws s3 cp ./dist s3://better-cover-letter --recursive --exclude '*.DS_Store' --acl public-read --cache-control public,max-age=604800 --profile iam_user"
  ...
}

So just run npm run deploy and these commands will execute one after another, in order.

There it is, the fully automated process for uploading the static website.

More Housekeeping (Optional)

If you are bundling your javascript files with a hash like me, you will find your S3 bucket accumulating old js files instead of having them replaced by new ones, since they are different files by virtue of the hash in their file names, e.g. bundle-0af19d01880334b789.js. This is not the case if you are uploading just bundle.js, which will replace any bundle.js already present in the bucket.

Since storing files in S3 is not free, albeit not that expensive either, it is still wise to remove files that you will never use again.

So we can use the AWS CLI again to remove these old js files before uploading (note: I am leaving the files in the root directory of the bucket untouched, just cleaning up the assets folder).

aws s3 rm s3://better-cover-letter/assets --recursive --profile iam_user --dryrun

Once again, combine them in the deploy script.
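As a sketch, the combined deploy script might look like the following, with the removal step run before the build and upload (bucket name and profile as before; add --dryrun back to either aws command while testing):

```json
"scripts": {
  "deploy": "aws s3 rm s3://better-cover-letter/assets --recursive --profile iam_user && webpack -p --config webpack.config.prod.js && aws s3 cp ./dist s3://better-cover-letter --recursive --exclude '*.DS_Store' --acl public-read --cache-control public,max-age=604800 --profile iam_user"
}
```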