Converting my Jekyll website to Hugo
2020 Oct 09 - Brian Kloppenborg
Every decade or so some new technology comes along that vastly simplifies the maintenance and operation of my personal website. In the late 1990s I wrote pure HTML. In the early 2000s I wrote HTML with server-side includes. Later I started using custom Perl or PHP scripts. I tried out a few content management systems, like WordPress and Drupal, but found them too complex for my needs. I hopped on the static site generator bandwagon in 2012 when I started using Jekyll and GitHub Pages.
Jekyll served me well for the past eight years, but I’ve never been fond of several of its implementation details. In particular, the following issues became annoying when writing blog posts:
- Forced segmentation of static assets and data from related content (big complaint). For static files (e.g. images, PDFs, and other non-rendered content), Jekyll prefers that they are located within a directory specified in the global _config.yml file. This can be overridden on a per-page basis by inserting additional information into the front matter, but that’s just annoying. Unlike static content, data files must be placed in the _data directory or a sub-directory thereof.
- No segmentation between configuration data and content (minor complaint). Although blog posts are separated into the _posts directory in Jekyll, regular pages are mixed into the overall directory structure of the site. Thus you end up with content in directories with and without an underscore prefix.
- The use of Ruby (minor complaint). Ruby has been a powerhouse for web development since the mid 2000s, peaking in 2012. Despite its popularity, I’ve never used it and am not fond of the idea of installing another interpreter on my machine just to build a static website.
None of these complaints were deal breakers for me because Jekyll was absolutely a step in the right direction and it was so easy to use with GitHub Pages; however, I kept looking for something better. When Hugo was released in 2013 it didn’t offer many compelling reasons to switch. But with the introduction of page bundles in version 0.32 (late 2017) I became quite intrigued. After using it for my business website for the last two years, I finally decided to convert my website from Jekyll to Hugo.
Converting from Jekyll to Hugo
Although there are a few tools to convert Jekyll to Hugo (see migrations), I elected to do it by hand this time. Here are the major steps:
- Move all posts, pages, images, and data to an old-content directory. Most of my content was under _posts, but I did have a few additional pages sitting around.
- Delete all Jekyll-specific content and directories. This includes the _data, _includes, _layouts, _plugins, _posts, and _site directories as well as a few random pages.
- Instantiate a new Hugo website and populate templates. This is done using the hugo new site . command plus a little bit of reading on the Hugo website. The Hugo lookup order is particularly important to read and understand.
- Re-package and restore content, fixing broken pages / images along the way. This will be discussed below.
- Fix a few issues with my Git repository. This will also be discussed below.
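For anyone who would rather script the first two steps, here is a minimal sketch of how they might look. I did these by hand, so treat this as an illustration only: the function name prepare_for_hugo and the posts/ and data/ sub-directory names inside old-content are assumptions, and it only stashes _posts and _data (loose pages and images still need to be moved manually).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the exact commands I ran): stash Jekyll content,
# then delete the directories that only Jekyll uses.
# Run it from the repository root, ideally on a scratch branch.
prepare_for_hugo() {
  local stash="old-content" dir
  mkdir -p "$stash"

  # Step 1: move posts and data files somewhere safe.
  # "${dir#_}" drops the leading underscore, so _posts becomes old-content/posts.
  for dir in _posts _data; do
    if [ -d "$dir" ]; then
      mv "$dir" "$stash/${dir#_}"
    fi
  done

  # Step 2: remove the remaining Jekyll-specific directories.
  rm -rf _includes _layouts _plugins _site
}
```

After calling prepare_for_hugo, the third step (hugo new site .) can be run in the cleaned-up directory.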
Re-packaging content
Updating and standardizing the front matter
With Hugo I’m using the TOML format for my front matter. For blog posts I implemented the following archetype and updated the pages accordingly:
+++
title = "{{ replace .Name "-" " " | title }}"
date = {{ .Date }}
draft = true
author = "Brian Kloppenborg"
categories = [""]
tags = [""]
+++
Likewise, for regular pages, I’m using the following:
+++
title = "{{ replace .Name "-" " " | title }}"
date = "{{ .Date }}"
draft = true
author = "Brian Kloppenborg"
+++
After fixing this I re-packaged pages and their data into page bundles which let me keep the content neatly organized. Within the blog, I applied this to the following posts:
- Posts about my first startup attempt, Hastings Wireless
- Instructions for setting up Restic on Windows and Linux
- My 2017 review of Linux media center software
- and seven other pages.
This really simplified the directory structure of my blog as now the images and content are hosted in the same place. Although I have plenty of pages with tables, I didn’t use Jekyll’s data feature to implement them because of how separated the content and data would be. Perhaps, in the future, I’ll try out Hugo’s data-driven content feature which lets you use JSON or CSV files directly using the getCSV and getJSON functions.
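To make the re-packaging concrete, here is a rough sketch of turning one Jekyll post into a Hugo leaf bundle, which is a directory containing an index.md plus whatever images or PDFs belong to it. The helper name bundle_post and the example paths are hypothetical. Note that the date prefix is dropped from the file name, so the date has to live in the front matter.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert YYYY-MM-DD-slug.md into <dest_root>/<slug>/index.md.
# Prints the path of the new bundle directory so assets can be moved in next.
bundle_post() {
  local src="$1" dest_root="$2"
  local name slug

  name="$(basename "$src" .md)"
  # Strip the leading date that Jekyll requires in post file names.
  slug="$(printf '%s\n' "$name" | sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2}-//')"

  mkdir -p "$dest_root/$slug"
  mv "$src" "$dest_root/$slug/index.md"
  printf '%s\n' "$dest_root/$slug"
}
```

For example, calling bundle_post old-content/posts/2017-05-01-media-center.md content/blog would create content/blog/media-center/index.md; the post's images can then be moved into that same directory and referenced with relative paths.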
Fixing broken links
As with any major website migration, links are going to be broken. The first thing I repaired was Jekyll internal page references. I found them using grep:
grep -I -r 'post_url' content
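I fixed these references by hand, but the rewrite is mechanical enough that it could probably be scripted. The sketch below assumes every reference has the simple form {% post_url YYYY-MM-DD-slug %} and replaces it with Hugo's ref shortcode. The function name convert_post_url is made up, and whether a bare slug resolves depends on how the content is laid out; a fuller path such as /blog/slug may be needed.

```shell
#!/usr/bin/env bash
# Hypothetical filter: reads Markdown on stdin and writes it to stdout with
# Jekyll post_url tags rewritten as Hugo ref shortcodes.
# Assumption: tags look like {% post_url 2017-05-01-some-slug %} (no sub-directories).
convert_post_url() {
  sed -E 's/\{%[[:space:]]*post_url[[:space:]]+[0-9]{4}-[0-9]{2}-[0-9]{2}-([^[:space:]%]+)[[:space:]]*%\}/{{< ref "\1" >}}/g'
}
```

It can be applied to one file at a time, for example by piping a page through convert_post_url into a new file and diffing the result before replacing the original.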
Next I fired up Hugo and ran Linkchecker to find any additional broken references:
hugo server
linkchecker http://localhost:1313/
(Note that linkchecker depends on Python 2 and, as of this date, has not been updated for Python 3.)
Fixing git repository issues
While doing the conversion I noticed a few issues with my git repository:
- Some of my commits used a different email address.
- The repository was a direct pull of Jekyll from around 2015 which I never synced with upstream, so the history of Jekyll was superfluous.
- The repository had a bunch of binary data stored in it (mostly PDF files of my posters and papers).
These issues were relatively easy to fix, but took a little time.
To fix the author email issue, I used the instructions found on Stack Overflow for how to change the author and committer name information. This was quite straightforward.
Next, to remove the old Jekyll commits, I truncated my git history starting from my first commit to this repository in 2015. The only thing I had to do was find my first commit, which was easily found by running
git log --author="Brian Kloppenborg"
and going to the bottom.
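As an alternative to scrolling, git can print the oldest matching commit directly. This is a small sketch rather than what I actually ran; the function name first_commit_by is made up, while --reverse, --author, and --format are standard git log options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the hash of the oldest commit whose author
# matches the given name (or pattern) in the current branch's history.
first_commit_by() {
  # --reverse lists the oldest commit first, so the first line is the one we want.
  git log --reverse --author="$1" --format='%H' | head -n 1
}
```

Running first_commit_by "Brian Kloppenborg" inside the repository prints a single commit hash, which can then be used as the starting point for truncating the history.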
Lastly, I decided to move the PDF files into Git Large File Storage. To do so, I applied the BFG Repo-Cleaner to remove the PDFs from the git history as follows:
java -jar bfg-1.13.0.jar --convert-to-git-lfs "*.pdf" --no-blob-protection .
Then I cleared up the reflog as follows:
git reflog expire --expire=now --all && git gc --prune=now --aggressive
And instantiated git-lfs:
sudo apt install git-lfs
cd REPOSITORY
git lfs install
git lfs track "*.pdf"
git add .gitattributes
git add content/*.pdf
git commit -m "..."
git push
After this the repository was squeaky clean.
Deployment
One of the most awesome things about static site generators is that they build static HTML pages which means you can host your website anywhere. In the case of kloppenborg.net I just host things locally.
So my deployment process is as easy as running Hugo to produce the pages in the public directory and then rsync-ing them to my hosting provider:
hugo && rsync -avz --delete public/ ${USER}@${HOST}:${DIR}
Where USER and HOST are the username and hostname of the server where I place my content and DIR is the directory where the content goes.
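If you wrap the one-liner in a script, it may be worth guarding against unset variables: with --delete, an empty DIR would make rsync target the remote home directory. The sketch below is one way to do that. The DEPLOY_USER, DEPLOY_HOST, and DEPLOY_DIR names are my own invention here, chosen partly because most shells already set USER to the local login name.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the deploy one-liner above.
# Assumes DEPLOY_USER, DEPLOY_HOST and DEPLOY_DIR are exported beforehand.
deploy() {
  # Refuse to run if any target setting is missing, since rsync --delete
  # against the wrong remote directory would remove files there.
  if [ -z "${DEPLOY_USER:-}" ] || [ -z "${DEPLOY_HOST:-}" ] || [ -z "${DEPLOY_DIR:-}" ]; then
    echo "set DEPLOY_USER, DEPLOY_HOST and DEPLOY_DIR first" >&2
    return 1
  fi

  # Build the site, and only sync if the build succeeded.
  hugo && rsync -avz --delete public/ "${DEPLOY_USER}@${DEPLOY_HOST}:${DEPLOY_DIR}"
}
```

With the three variables exported, calling deploy from the site root does the same thing as the command above.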