Migrating from Jekyll to org-mode and Github Actions

Sep 03, 2019

Introduction

Until now, my website was generated by Github Pages using a static site generator called Jekyll and the Lagom theme.

Github Pages

Github Pages lets you host static sites in Github repositories for free. It is very simple: you just put the website in a branch (e.g. gh-pages) and Github serves it as username.github.io/repo, or via a custom domain if you add a simple CNAME text file to the repository.

Static Site Generators

If you did not want to write HTML directly, you could use a static site generator to keep a set of templates and content in the git repository and put the output of the generator (the HTML files) into the gh-pages branch to be served by Github.

In addition to letting you write the content in Markdown, site generators made it much easier to maintain content like a blog, with features like different templates for posts, code syntax highlighting, drafts and support for themes. Themes let you change the look and feel of the site by changing one configuration option and re-generating it.

If you wanted this process to be automatic, you could run the generation as part of some CI job (eg. with TravisCI), so that the site is re-generated when its sources are updated.

Jekyll support in Github Pages

Jekyll is one of these site generators, and it was the most popular for a while. The nice part was that if you used Jekyll, Github Pages would generate your website automatically, without you having to set up CI. You just pushed your changes and a minute later your site was published.

On the other hand, you were limited to the Jekyll version that was installed at Github, and you could not just install any add-on that you wanted. You did not control the environment to the level you did in a typical CI.

Moving away from Jekyll

Jekyll just worked. I can’t complain about it. However, I felt too tied to the Github Pages environment. You had to use a Jekyll version that was close to a year behind and live with the plugins that the environment supported, and nothing more.

If I were to set up my own CI workflow to overcome this limitation, why keep using Jekyll? Hugo started to feel faster, easier to deploy locally and mostly compatible.

I have been using Emacs for more than 10 years. Three years ago, I switched mail clients from Thunderbird to mu4e on top of Emacs. Then I discovered org-mode as a plain-text personal organization system and gradually started to spend more time inside Emacs. Microsoft did such a good job with Visual Studio Code that for a moment I thought I would not resist switching. However, Microsoft also created an ecosystem by standardizing the interaction with programming languages via the Language Server Protocol, and emacs-lsp just made my programming experience with Emacs better.

I knew that org-mode was quite good at exporting. After I saw a couple of websites generated from org, I started to toy with the idea of using org too. It could also be a good chance to learn Emacs Lisp for real. So I started learning about org and websites.

Inspiration

As I did not know where to start, I started by reading a lot of solutions by other people, documentation, posts, etc.

Most of the structure of the final solution, its ideas, conventions, configuration and some snippets were taken from the following projects:

Implementation

Principles

Once I had a clear picture of the domain, I made up my mind about how I wanted it to work and settled on my own requirements:

  • Use as many standard packages as possible, e.g. org-publish. Avoid using “frameworks” on top of emacs/org
  • Links to old posts should still work (I had configured Jekyll to use /year/month/day/post-name.html )
  • Initially, I thought about the ability to migrate content gradually, eg. supporting Markdown posts for a while
  • Self-contained. Everything should be in a single git repo, not interfering with my emacs.d.
  • Ability to run it from the command line, so that CI could be used to automatically generate the site from git

Emacs concepts to be used

There are a bunch of emacs concepts that help put all the pieces together:

Emacs batch mode

While I could have most of the configuration in my ~/.emacs.d/init.el, I wanted a self-contained solution, not depending on my personal emacs configuration being available.

There are a bunch of emacs options that help achieve this:

$ emacs --help
...
--batch                     do not do interactive display; implies -q
...
--no-init-file, -q          load neither ~/.emacs nor default.el
...
--load, -l FILE         load Emacs Lisp FILE using the load function
...
--funcall, -f FUNC      call Emacs Lisp function FUNC with no arguments
...

With these options, we can put all our configuration and helper functions in a lisp file, call emacs as a script engine, skip our personal configuration, have emacs load the file with the configuration, and call a function to run everything.
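As a rough illustration of that idea, here is a minimal sketch of what such a lisp file could look like. The names my/site-root and my/publish-all, and the single "pages" component, are made up for this example and are not taken from my actual publish.el.

```elisp
;;; Sketch of a self-contained batch entry point.  -*- lexical-binding: t; -*-
;; Assumed invocation, using the options listed above:
;;   emacs --batch --no-init-file --load publish.el --funcall my/publish-all

(require 'ox-publish)
(require 'ox-html)

(defvar my/site-root
  (file-name-directory (or load-file-name buffer-file-name default-directory))
  "Directory that contains this file (hypothetical helper variable).")

(defun my/publish-all ()
  "Publish every component of a hypothetical one-component project."
  ;; Binding the project list locally keeps the configuration in this
  ;; file and independent of any personal Emacs configuration.
  (let ((org-publish-project-alist
         `(("pages"
            :base-directory ,my/site-root
            :base-extension "org"
            :publishing-directory ,(expand-file-name "public" my/site-root)
            :publishing-function org-html-publish-to-html))))
    ;; A non-nil first argument forces republishing of all files.
    (org-publish-all t)))
```

Because nothing here reads ~/.emacs.d, the same command should behave the same way on a laptop and inside a CI container, as long as the Emacs there ships with Org.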

org-mode Export (ox)

org-mode includes an Export subsystem with several target formats (ASCII, beamer, HTML, etc). Every backend/converter is a set of functions that take already parsed org-mode structures (e.g. a list, a timestamp, a paragraph) and convert them to the target format. Worg, a section of the Org-mode web site that is written by a volunteer community of Org-mode fans, provides documentation on how to define an export backend (org-export-define-backend). From there, it is important to understand the filter system and org-export-define-derived-backend, which lets you define a backend by overriding parts of an existing one. This is what I ended up using to tweak, for example, how timestamps are exported.
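To make the derived backend idea concrete, here is a small sketch that overrides only the timestamp transcoder of the HTML backend. The backend name my-site-html, the function my/html-timestamp and the date format are assumptions for this example, not the code my site uses.

```elisp
(require 'ox-html)

(defun my/html-timestamp (timestamp _contents _info)
  "Export TIMESTAMP as a short date inside a span element."
  ;; `org-timestamp-format' formats the timestamp object with a
  ;; `format-time-string' style template.
  (format "<span class=\"timestamp\">%s</span>"
          (org-timestamp-format timestamp "%b %d, %Y")))

;; Everything not listed in :translate-alist is inherited from `html'.
(org-export-define-derived-backend 'my-site-html 'html
  :translate-alist '((timestamp . my/html-timestamp)))
```

A publishing function can then export with the derived backend by calling (org-publish-org-to 'my-site-html filename ".html" plist pub-dir) instead of relying on the stock HTML one.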

org-publish

org-mode includes a publishing management system that helps with exporting an interlinked set of org files. There is also a nice tutorial available.

Using org-publish boils down to defining a list of components (blog posts, assets, RSS) and their options (base directory, target directory, includes/excludes, publishing function). The publishing function is one of the most interesting options. Org comes with a few predefined ones, e.g. org-html-publish-to-html to publish HTML files and org-publish-attachment to publish static assets. The most important thing to learn here is that you can wrap those in your own functions to do additional work and customize publishing very easily. I use this, for example, to skip draft posts or to write redirect files for certain posts in addition to the post itself.
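A minimal sketch of such a setup could look like the following. The #+DRAFT keyword, the function names and the directory layout are assumptions made for the example. My real configuration differs in the details.

```elisp
(require 'ox-publish)
(require 'ox-html)

(defun my/draft-p (filename)
  "Return non-nil when FILENAME has a line `#+DRAFT: t' (assumed convention)."
  (with-temp-buffer
    (insert-file-contents filename)
    (let ((case-fold-search t))
      (re-search-forward "^#\\+DRAFT:[ \t]*t[ \t]*$" nil t))))

(defun my/publish-post (plist filename pub-dir)
  "Wrap `org-html-publish-to-html': publish FILENAME unless it is a draft."
  (unless (my/draft-p filename)
    (org-html-publish-to-html plist filename pub-dir)))

(setq org-publish-project-alist
      '(("posts"                        ; org files exported to HTML
         :base-directory "./posts"
         :base-extension "org"
         :recursive t
         :publishing-directory "./public/posts"
         :publishing-function my/publish-post)
        ("assets"                       ; files copied verbatim
         :base-directory "./css"
         :base-extension "css"
         :publishing-directory "./public/css"
         :publishing-function org-publish-attachment)
        ;; Meta component that groups the other two.
        ("site" :components ("posts" "assets"))))
```

The wrapper has the same three-argument signature as the predefined publishing functions, which is what makes it a drop-in replacement in the component definition.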

The solution

Directory Structure

├── CNAME
├── css
│   ├── index.css
│   └── site.css
├── index.org
├── Makefile
├── posts
│   ├── 2019-10-31-some-post
│   │   └── index.org
│   ├── 2014-06-11-other-post
│   │   ├── images
│   │   │   ├── someimage.png
│   │   │   └── another-image.png
│   │   └── index.org
│   ├── archive.org
│   └── posts.org
├── public
├── publish.el
├── README.org
├── snippets
│   ├── analytics.js
│   ├── postamble.html
│   └── preamble.html
└── tutorials
    └── how-to-something
        └── index.org

Inside the directory tree, you can find:

  • a publish.el file with the org-publish project description and all the support code and helper functions
  • a CNAME file for telling Github Pages my domain name
  • a css folder with one stylesheet for the whole site and another one that is included only on the index page
  • a Makefile that just calls emacs with the parameters we described above and calls the function duncan-publish-all
  • a subdirectory for each post, and another one for tutorials
  • a public directory where the output files are generated and the static assets copied
  • a snippets directory with the preamble, postamble and Google analytics snippets

publish.el

The main file containing code and configuration includes a few custom publishing functions that are used as hooks for publishing and creating sitemaps.

org-publish project

The org-publish project (org-publish-project-alist) is defined in the variable duncan--publish-project-alist, and defines the following components:

  • blog

    This component reads all org files in the ./posts/ directory, and exports them to HTML using duncan/org-html-publish-post-to-html as the publishing function.

    This function injects the date as the page subtitle in the property list before delegating to the original function. This is a common pattern that you can use to override the publishing function. Note that subtitle is a recognized configuration property of the HTML export backend.

(defun duncan/org-html-publish-post-to-html (plist filename pub-dir)
  "Wrap the HTML publishing function, adding the post date as subtitle to PLIST.
FILENAME and PUB-DIR are passed through unchanged."
  (let ((project (cons 'blog plist)))
    ;; `plist-put' may return a new list, so keep its return value.
    (setq plist (plist-put plist :subtitle
                           (format-time-string
                            "%b %d, %Y"
                            (org-publish-find-date filename project))))
    (duncan/org-html-publish-to-html plist filename pub-dir)))

The function also checks the #+REDIRECT_TO property and generates redirect pages accordingly, by spawning another export to a different path in the same pub-dir.

This component is configured with a sitemap function which, even though it goes through all posts, is programmed to take only a few of them and write an org file with links to them. This file (posts.org) is then included in index.org and used as the list of recent posts.

  • archive-rss

    This component also operates on ./posts/, but instead of generating HTML, it uses the RSS backend to generate the full site archive as RSS.

    The sitemap function is also configured, like in the blog component, but this function generates an org file with all posts, not just the latest ones. The sitemap-format-entry function is shared between the sitemap functions, as the list of posts looks the same.

[Image: sitemap-function.png]

  • site

    The rest of the content of the site, including the index.org page and the generated org files for the archive (archive.org) and latest posts (posts.org).

  • assets

    All files that are just copied over

  • tutorials

    It works just like posts, but I set up each one (I don’t expect to have many) to use the ReadTheOrg theme. It uses the default HTML publishing function.

[Image: readtheorg.png]
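The sitemap functions mentioned for the blog and archive-rss components can be sketched roughly as follows. This is an illustration rather than the exact code in publish.el: the names, the limit of five posts and the entry layout are assumptions, and it assumes the component uses the flat list sitemap style.

```elisp
(require 'ox-publish)
(require 'seq)

(defvar my/recent-posts-count 5
  "How many posts the recent-posts sitemap keeps (hypothetical setting).")

(defun my/sitemap-format-entry (entry _style project)
  "Format ENTRY of PROJECT as a date followed by a link to the post."
  (format "%s [[file:%s][%s]]"
          (format-time-string "%Y-%m-%d"
                              (org-publish-find-date entry project))
          entry
          (org-publish-find-title entry project)))

(defun my/recent-posts-sitemap (title list)
  "Return org text titled TITLE with only the first entries of LIST.
LIST has the form (unordered ITEM...), as passed by org-publish."
  (concat "#+TITLE: " title "\n\n"
          (org-list-to-org
           ;; Keep the list type and truncate the items.
           (cons (car list)
                 (seq-take (cdr list) my/recent-posts-count)))))
```

In the component definition these would be wired in with :auto-sitemap t, :sitemap-filename "posts.org", :sitemap-style list, :sitemap-sort-files anti-chronologically, :sitemap-format-entry and :sitemap-function. A sitemap function for the full archive would simply skip the seq-take step.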

Export workflow

[Image: publishing-function.png]

Look & Feel

I managed to port most of the Lagom look and feel by writing the CSS from scratch (learning CSS Grid in the process) and manually fixing each difference. I was quite satisfied with the final result. I had to use some extra page-specific CSS to hide the title on the front page and to display FontAwesome icons.

Before

[Image: jekyll.png]

After

[Image: org.png]

Publishing

While locally you can test by running emacs via the Makefile, I wanted a new way to run the generation on every git push:

  • I started with Travis, but I could not find container jobs anymore. When I realized that Ubuntu had an older Emacs version, I just lost interest.
  • Gitlab was easy to get working, not only because it is very simple and elegant, but also because some of the examples I took inspiration from were already using it, along with the Emacs Alpine container image. However, I did not want to have everything in Github except my site.
  • Then I realized Github Actions was in beta, but I was not yet in. Until:

Setting up a workflow to build the site

Although Github Actions run on Ubuntu when you choose Linux, they let you execute a step inside a container.

name: Build and publish to pages
on:
  push:
    branches:
    - master

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@master
      with:
        fetch-depth: 1
    - name: build
      uses: docker://iquiw/alpine-emacs
      if: github.event.deleted == false
      with:
        args: ./build.sh
    - name: deploy
      uses: peaceiris/actions-gh-pages@v1.1.0
      if: success()
      env:
        GITHUB_TOKEN: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
        PUBLISH_BRANCH: gh-pages
        PUBLISH_DIR: ./public

Combining actions/checkout, our own build script running in the docker://iquiw/alpine-emacs container inside the VM, and peaceiris/actions-gh-pages, we get the desired result.

As a caveat, you need to set up a personal access token for the action. With the default token, what gets pushed to the gh-pages branch will not show up on your website.

The experience with Github Actions has been very positive. I will definitely replace most of my TravisCI usage in my repositories. Kudos to the Github team.

Conclusions

Not only do I have a website powered by the tool I use daily, but it is also packed with awesome features. For example, the diagrams in this post are inlined as PlantUML code in the org file and exported via Org Babel.

It also gave me a project for learning Emacs Lisp. I do plan to add some minor features, like blog tags or categories and perhaps commenting. What I learn will also help me personalize my editor and mail client.

The source of this site is available on Github.