scalable hexo blog with s3 and cloudfront

There are many, many ways to deploy a site nowadays. This write-up goes in-depth on the development, build, and deployment system of this blog. I think I might have over-engineered it a bit, but at least I don’t need to worry about scaling.

This is not a new process by any means, but I would like to expand on what other people have written.

This process will work with any site that is built into static files.

in-depth stack components

The setup is quite simple.

For the technical folks: Hexo deploys the generated files to an S3 bucket, which is served by a CloudFront distribution. A CNAME record then points my domain at the CloudFront distribution.

static site

Having a static site is what enables the use of S3. Hexo takes all of your posts, pages, scripts, settings, and so on and “compiles” them down to a set of static assets. There is no need for an application server to serve dynamic content or a database to store the content. Each page and each post has its own HTML file. It’s quite straightforward.

The alternative would be something like WordPress. WordPress serves pages dynamically, using PHP templates and database queries to fetch content. There are no hard-coded HTML files; each page is “created” when a user visits it.

To some, it might seem like a regression to have a static site. The headers and footers are duplicated in every file, and there’s little flexibility in content distribution and content changes. In many cases a dynamically served site is the way to go, but for a simple blog with some text and photos, serving content dynamically is most definitely overkill.

The few instances where one might need dynamic data would be for something like comments. I envision solving that with something like an AWS Lambda function, but for the time being, a static site fits my needs perfectly.

s3

AWS S3 is Amazon’s file storage service. It’s a fast and easy way to store any type of file, and it takes care of storing your files across multiple systems to maximize availability. According to the S3 documentation, S3 is designed for 99.999999999% (eleven nines) durability. It’s a pretty foolproof and efficient way to store files.

Netflix uses S3 to host their movies and TV shows, so if it’s good enough for Netflix, it’s good enough for me.

cloudfront

AWS CloudFront is Amazon’s CDN (content delivery network). Like any CDN, CloudFront distributes your files across a network of physical data centers, and each file is served from the data center closest to the user, which decreases latency.

porkbun.com

I purchased my domain on porkbun.com because they had a promotion on .com domains and I like promotions. I also like their name.

AWS Certificate Manager

I wanted SSL for the domain, so AWS Certificate Manager was the way to go, since I could use the certificate with the CloudFront distribution. I initially created an SSL certificate on porkbun.com and imported it into AWS Certificate Manager, but I was unable to use it to sign the CloudFront distribution.

setup process

The setup process is quite simple, but the AWS console is super confusing.

s3 bucket creation

Creating the S3 bucket is quite simple. Configuring it is a pain.

The bucket will need to be public since we are serving the files publicly, and it’s also required for S3 static hosting to work properly.

A bucket policy will need to be created for the bucket. AWS has a policy generator here.

Here are the settings I used to generate my bucket policy.

WARNING! Make sure to select only GetObject in the list of Actions. This ensures the S3 bucket only allows GET requests for files.

The Principal field specifies the user, account, service, or other entity that is allowed or denied access to a resource. In our case, we want to allow anyone to view an S3 file, so the Principal is *.

The Amazon Resource Name (ARN) should be formatted like arn:aws:s3:::<bucket_name>/* — the trailing /* matters because GetObject applies to the objects inside the bucket, not the bucket itself. In my case it was arn:aws:s3:::posixprojects.com/* since my bucket name is posixprojects.com.

After generating the policy, paste it into the Permissions -> Bucket Policy tab of your S3 bucket.
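For reference, a public-read policy generated with these settings looks roughly like the following (the bucket name posixprojects.com is from my setup; substitute your own):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::posixprojects.com/*"
    }
  ]
}
```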

Next, we’ll need to enable static website hosting. Under Properties, click on Static website hosting. The only configuration here is to set the Index document to index.html, since that’s what Hexo generates as the index page.
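If you prefer the command line, the same thing can be done with the AWS CLI (a sketch, assuming your credentials are already configured and the bucket exists):

```
# Enable static website hosting on the bucket, serving index.html as the index page.
aws s3 website s3://posixprojects.com/ --index-document index.html
```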

Now the S3 bucket is ready to be deployed to. YAY!

hexo s3 deployment

There is a Hexo plugin that deploys to S3 called hexo-deployer-s3.

To install, run:

$ npm install hexo-deployer-s3 --save

After installing hexo-deployer-s3, the _config.yml will need to be updated to use hexo-deployer-s3 as the deployment method.

From the documentation (with the comments switched to # so the snippet is valid YAML if copied):

deploy:
  type: s3
  bucket: <S3 bucket>
  aws_key: <AWS id key> # Optional, if the environment variable `AWS_ACCESS_KEY_ID` is set
  aws_secret: <AWS secret key> # Optional, if the environment variable `AWS_SECRET_ACCESS_KEY` is set
  aws_cli_profile: <an AWS CLI profile name, e.g. 'default'> # Optional
  concurrency: <number of connections> # Optional
  region: <region> # Optional, see https://github.com/LearnBoost/knox#region
  headers: <headers in JSON format> # pass any headers to S3, useful for metadata cache settings of Hexo assets
  prefix: <prefix> # Optional, prefix ending in /
  delete_removed: <true|false> # if true, removed files will be deleted from S3. Default: true

WARNING! I do not recommend pasting your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY into the config. Use environment variables!
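For example, set the credentials in your shell before deploying (the key values below are the fake placeholders from AWS’s own documentation — use your real IAM keys):

```shell
# Placeholder credentials - replace with your own IAM access keys.
export AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
```

This way the secrets stay out of _config.yml and out of version control.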

After this configuration, do a test deployment using the Hexo CLI:

$ hexo generate --deploy

If this is successful, you should see the files in the S3 bucket, and you can access your site using the S3 bucket hosting URL, which is formatted like this:

http://<bucket-name>.s3-website-<region>.amazonaws.com

AWS Certificate Manager Setup (SSL)

SSL is pretty necessary these days, so I would recommend setting this up.

To generate a certificate, navigate to the AWS Certificate Manager.

Provision a public certificate. The process is pretty self-explanatory, so I won’t go through every step.

You’ll need to verify the domain through either a DNS record or email verification.
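If you pick DNS validation, ACM gives you a CNAME record to add at your registrar. The name and value below are made-up placeholders just to show the shape of the record:

```
_3a8f2b1c9d.posixprojects.com.    CNAME    _9e4d7f2a1b.acm-validations.aws.
```

Once the record propagates, ACM marks the certificate as issued.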

Cloudfront setup

CloudFront setup is also very easy. To set up a distribution, navigate to CloudFront.

There is one tricky part that took me a while to figure out.

For the Origin Domain Name, make sure to use the S3 static hosting URL that we got in the S3 setup process. DO NOT USE THE AUTOFILLED S3 BUCKET!

I would select the Redirect HTTP to HTTPS option if SSL is set up.

Finally, for the SSL Certificate section, select the Custom SSL Certificate option and choose your provisioned certificate. Sometimes it takes a while for the certificate to provision; while it does, the Custom SSL Certificate selection will be greyed out. If that happens, continue with the CloudFront setup and come back to edit the distribution in an hour or so.

After creating the Cloudfront distribution, make note of the distribution URL. It should look something like:

xxxxxxxxx.cloudfront.net

You can find this in the main cloudfront dashboard under the domain name column.

NOTE: After each deployment, you will need to invalidate the CloudFront cache. You can do it manually in the CloudFront dashboard or programmatically. I’ll probably have a post on this sometime soon.
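For the programmatic route, the AWS CLI can do it in one command (the distribution ID below is a placeholder — yours is shown in the CloudFront dashboard):

```
# Invalidate every cached path on the distribution after a deploy.
aws cloudfront create-invalidation --distribution-id E1EXAMPLE12345 --paths "/*"
```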

Domain DNS Setup

This is the last step: pointing your domain at the CloudFront distribution.

Create a CNAME record that points your domain (or subdomain) to the CloudFront distribution URL from the previous step.
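In your registrar’s DNS settings, the record would look something like this — the CloudFront hostname here is a placeholder:

```
www.posixprojects.com.    CNAME    d1234abcdefgh.cloudfront.net.
```

Note that a bare apex domain typically can’t take a CNAME; some registrars offer an ALIAS/ANAME record for that case.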

After this is done, clear your DNS cache.

If everything went ok, you should be able to access your site by using your domain name!!!!

final thoughts

This process took me around an hour to complete; it shouldn’t take you much more than that. I like the current stack a lot, and I’m pretty sure I won’t change much in the future. I do want to automate the deployment process with something like Travis CI. I’ll probably create releases and use Travis to deploy to S3 and invalidate the CloudFront cache.