Max Schmitt

July 15 2019

Writing a static site generator with MDX & Webpack

While making an effort to publish more content on my blog over the last few weeks, I became increasingly frustrated with the lack of flexibility of my previous setup, which was about 4 years old.

My previous website and blog were built with a custom gulp-based static site generator. It had no solution for including JavaScript on pages and my CSS was quite the mess. So I spent the past week redoing my site and writing a custom static site generator in the process.

Introducing my new setup

Every page on this site is generated as static HTML from either a React component or an MDX file.

Each page also has its own companion JavaScript-bundle that hydrates the static markup and adds some dynamic behaviour if necessary.

The framework code to accomplish this is just about 500 lines long. There are a few gotchas that I discovered along the way which I'm going to discuss in this article.

Fullstack MDX with 2 Webpack configs

At the core of the framework that produces this site is a Webpack setup with two separate configs:

  1. Client-side Webpack: A Webpack config to produce client-side JavaScript that is served from a <script> tag for each content page.
  2. Server-side Webpack: A second Webpack config to produce React-components from the MDX content that can be required from the build process which then renders them into static HTML-files.

1. The client-side Webpack config

Producing the bundle for each individual content page is pretty straightforward and only needs babel-loader and @mdx-js/loader:

JS

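const clientConfig = {
  // One generated entry point per content page.
  entry: clientEntryPoints,
  output: {
    filename: '[name].js',
    path: clientOutDir,
  },
  module: {
    rules: [
      {
        // .md and .mdx files: MDX is compiled to JSX first, then Babel runs.
        test: /\.mdx?$/,
        use: ['babel-loader', '@mdx-js/loader'],
      },
      {
        // Plain .js files only need Babel.
        test: /\.js$/,
        use: ['babel-loader'],
      },
    ],
  },
}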

⚠️ Gotcha: Components don't render anything when loaded by a <script> tag

*.mdx files get compiled into React components. Loading such a component from a <script> tag doesn't render anything by itself, so for our browser-side code we need to make sure that we're calling ReactDOM.hydrate(<Component />, domNode).

That's why, before running Webpack, we generate an entry point for each page and write it to a temporary folder:

JS

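// fs.promises so that writeFile returns a promise we can await.
const fs = require('fs').promises

// Writes the generated client-side entry point for one content page.
async function writeClientEntryPoint(mdxContentFile, outFile) {
  // The generated file contains JSX. That's fine because it gets
  // compiled by babel-loader as part of the client-side Webpack build.
  const content = `
    const SiteContext = require('${require.resolve('./SiteContext')}')
    const React = require('react')
    const ReactDOM = require('react-dom')
    const page = require('${mdxContentFile}')
    const Component = page.default || page
    ReactDOM.hydrate(
      <SiteContext.Provider value={window.__siteContext__}>
        <Component />
      </SiteContext.Provider>,
      document.querySelector('#site-root')
    )
  `
  await fs.writeFile(outFile, content)
}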

These are the actual entry points for the client-side Webpack config.
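For illustration, the entry object that points Webpack at these generated files could be assembled roughly like this. The helper names chunkNameFor and buildClientEntryPoints and the folder layout are assumptions made for this sketch, not the framework's actual code.

```javascript
const path = require('path')

// Hypothetical helper: derive a chunk name such as "posts/my-post"
// from a content file by making its path relative to the content
// folder and dropping the file extension.
function chunkNameFor(contentDir, contentFile) {
  const relative = path.relative(contentDir, contentFile)
  const extLength = path.extname(relative).length
  const withoutExt = relative.slice(0, relative.length - extLength)
  // Normalise Windows separators so chunk names work as URL paths.
  return withoutExt.split(path.sep).join('/')
}

// Hypothetical helper: build the `entry` object for the client-side
// config, mapping every chunk name to its generated entry file.
function buildClientEntryPoints(contentDir, entryDir, contentFiles) {
  const entryPoints = {}
  for (const file of contentFiles) {
    const name = chunkNameFor(contentDir, file)
    entryPoints[name] = path.join(entryDir, name + '.js')
  }
  return entryPoints
}
```

Deriving the chunk name from the content path is what later makes it possible to match a rendered page with its client-side bundle.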

2. The server-side Webpack config

For our server-side Webpack config we must take into account a few special considerations:

⚠️ Gotcha 1: Compile content pages as modules, not normal Webpack bundles

When compiling MDX content for static rendering, the resulting Webpack bundles will not be loaded by the browser. Instead they are required by the build process.

That means we need to write them to a temporary folder so they don't end up in the dist folder. And they should be compiled as CommonJS modules so they can be required from the build process.
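To make the "required from the build process" part concrete, here is a small sketch that needs neither Webpack nor React. It fakes a compiled bundle by writing a tiny CommonJS file to a temporary folder and then loads it the way the build process would. The file contents and the helper name loadPageComponent are made up for this example.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Stand-in for a server-side bundle: a CommonJS file whose default
// export would normally be the compiled MDX component.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'server-bundles-'))
const bundlePath = path.join(tmpDir, 'index.js')
fs.writeFileSync(
  bundlePath,
  'exports.default = function Page() { return "page markup" }\n'
)

// Hypothetical helper: load a compiled page from the build process.
function loadPageComponent(file) {
  const page = require(file)
  // Modules compiled from ES module syntax expose the component as
  // `default`, plain CommonJS modules export it directly.
  return page.default || page
}

const Component = loadPageComponent(bundlePath)
```

A regular browser bundle would not export anything that require could pick up, which is why the CommonJS output format matters here.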

⚠️ Gotcha 2: Make sure there is only one copy of React required at build time

Because React is already required by the build process during static rendering, we need to make sure that the bundles don't already contain a bundled version of React, otherwise React would throw an error.

So we configure Webpack externals to exclude anything in node_modules from the compiled output. This also speeds up the build and reduces file size of our temporary files.
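As a rough idea of what such an externals rule does, here is a deliberately simplified stand-in. It is not how webpack-node-externals is implemented, it only shows the principle, and it assumes the (context, request, callback) signature that externals functions have in Webpack 4. The function names are invented for this sketch.

```javascript
const path = require('path')

// Simplified illustration only: treat every request that is neither
// relative nor absolute as a package that lives in node_modules.
function isBareModuleRequest(request) {
  return !request.startsWith('.') && !path.isAbsolute(request)
}

// Externals function in the Webpack 4 style.
function externalizeBareModules(context, request, callback) {
  if (isBareModuleRequest(request)) {
    // Keep a plain require(...) call in the output instead of
    // inlining the package into the bundle.
    return callback(null, 'commonjs ' + request)
  }
  // Calling back without a result lets Webpack bundle the module.
  callback()
}
```

With a rule like this, a bundle that imports react still contains a require call for it, so the build process and the bundle end up sharing the one copy installed in node_modules.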

⚠️ Gotcha 3: Exclude React components that are shared between framework and content pages

Every component gets access to certain variables like currentPageTitle and pagesList (a list of all pages on this site, used on the Posts overview to link to every blog post).

To expose this data, any component can require a special React context called SiteContext.

This SiteContext is used by the actual site content (as a SiteContext.Consumer) that gets bundled by Webpack, but it's also used by the build process, which acts as the SiteContext.Provider:

JS

// At build time:
const React = require('react')
const ReactDOMServer = require('react-dom/server')
const SiteContext = require('./SiteContext')

const siteContext = {
  contentPages,
  currentPageTitle: page.title,
}
const componentHTML = ReactDOMServer.renderToString(
  <SiteContext.Provider value={siteContext}>
    <Component />
  </SiteContext.Provider>
)

JS

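// Somewhere in the compiled site content
// (PageTitle is just an example component name):
const { useContext } = require('react')
const SiteContext = require('framework/SiteContext')

function PageTitle() {
  const { currentPageTitle } = useContext(SiteContext)
  return <h1>{currentPageTitle}</h1>
}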

If we don't exclude this SiteContext from the bundles, each bundle ships its own copy of the module. React then can't identify the context required by the build process and the one required by the site content as one and the same (the two context objects are not referentially equal), so any SiteContext.Consumer would never see the Provider's value and would always receive null for the context's value.
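The identity problem can be reproduced without React or Webpack. The sketch below evaluates the same module file twice (clearing the require cache mimics a second, bundled copy) and looks up a value by object identity. The file contents, the Map-based lookup and the readContext helper are stand-ins invented for this example, they only imitate how a context value is matched to a context object.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Stand-in for SiteContext.js: every evaluation of the module creates
// a new object, just like every call to React.createContext does.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'site-context-'))
const contextFile = path.join(dir, 'SiteContext.js')
fs.writeFileSync(contextFile, 'module.exports = { name: "SiteContext" }\n')

// The copy used by the build process (the Provider side).
const providerContext = require(contextFile)

// Clearing the require cache mimics a second copy that was bundled
// into the page: the same source gets evaluated again.
delete require.cache[require.resolve(contextFile)]
const bundledContext = require(contextFile)

// Values are registered per context object, so identity matters.
const providedValues = new Map([
  [providerContext, { currentPageTitle: 'Home' }],
])

// Hypothetical lookup that falls back to a default value of null.
function readContext(context) {
  return providedValues.has(context) ? providedValues.get(context) : null
}

console.log(readContext(providerContext)) // the provided value
console.log(readContext(bundledContext)) // null, the copies don't match
```

Both objects look identical when printed, but only the shared copy finds the provided value, which is exactly why SiteContext has to stay out of the bundles.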

To address all these gotchas, the server-side Webpack config looks like this:

JS

const path = require('path')

// Assumption: the absolute path of the shared SiteContext module.
const siteContextPath = require.resolve('./SiteContext')

const serverConfig = {
  target: 'node',
  // Because we want components as output, we can directly use our
  // content files as the entry points for our server-side bundles.
  entry: mdxContentFiles,
  output: {
    filename: '[name].js',
    // We don't want our server-side bundles to end up in the dist
    // folder so we write them to a temporary folder.
    // ✅ Addressing gotcha 1: Compile content pages as modules,
    // not normal Webpack bundles
    path: tmpDir,
    libraryTarget: 'commonjs',
  },
  // Server- and client-side Webpack configs use the same loaders.
  module: {
    rules: [
      {
        test: /\.mdx?$/,
        use: ['babel-loader', '@mdx-js/loader'],
      },
      {
        test: /\.js$/,
        use: ['babel-loader'],
      },
    ],
  },
  // Externals allows us to exclude certain modules from
  // being included in the single-file output bundle.
  externals: [
    // ✅ Addressing gotcha 2: Make sure there is only
    // one copy of React required at build time:
    require('webpack-node-externals')(),
    // ✅ Addressing gotcha 3: Exclude React components
    // that are shared between framework and content pages:
    function excludeSiteContext(context, request, cb) {
      const absPath = require.resolve(path.resolve(context, request))
      if (absPath === siteContextPath) {
        // Keep a require(...) call to the shared module in the output.
        return cb(null, 'commonjs ' + siteContextPath)
      }
      cb()
    },
  ],
}

Rendering MDX pages in the build process

Of course, Webpack by itself doesn't come with all the tools needed to actually render the static site, so here is roughly what else happens during the build.

First, it's important to note that we're not using the Webpack CLI. Instead we're requiring Webpack as a module to gain easy access to the stats object.
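A minimal sketch of what calling Webpack as a module can look like is shown below. The helper name runWebpack is an assumption, and webpack is passed in as a parameter so the snippet has no hard dependency; in a real build it would be require('webpack').

```javascript
// Hypothetical helper: run Webpack through its Node API and resolve
// with the stats object once the compilation has finished.
function runWebpack(webpack, configs) {
  return new Promise((resolve, reject) => {
    webpack(configs, (err, stats) => {
      // `err` covers fatal problems such as an invalid configuration.
      if (err) return reject(err)
      // Compilation errors (e.g. syntax errors) live on the stats object.
      if (stats.hasErrors()) {
        return reject(new Error(stats.toString('errors-only')))
      }
      resolve(stats)
    })
  })
}
```

Usage would be along the lines of runWebpack(require('webpack'), [clientConfig, serverConfig]). Passing an array of configs runs both compilations and yields one combined stats object.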

After Webpack produces its bundles, we go through the list of all server-side bundles (the compiled MDX pages) that were created. We require them, render them with ReactDOMServer.renderToString and inject the resulting component HTML into an HTML template that looks something like this:

master-page.html

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta http-equiv="X-UA-Compatible" content="ie=edge" />
    ${meta}
    <title>${title}</title>
    ${styles}
  </head>
  <body>
    <!-- htmlmin:ignore -->
    <div id="site-root">${renderOutput}</div>
    <!-- htmlmin:ignore -->
    ${scripts}
  </body>
</html>
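Filling in a template like this takes very little code. Here is one way it could be done, assuming the placeholders are plain ${name} markers in the HTML file; the helper name fillTemplate is made up for this sketch.

```javascript
// Hypothetical helper: replace ${name} placeholders in the master
// page with the values collected for the current page.
function fillTemplate(template, values) {
  // A replacer function is used so that "$" sequences inside the
  // rendered HTML are inserted literally.
  return template.replace(/\$\{(\w+)\}/g, (match, name) => {
    // Leave unknown placeholders untouched so typos stay visible.
    return Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : match
  })
}

// Single quotes keep JavaScript from interpolating the placeholders.
const html = fillTemplate(
  '<title>${title}</title><div id="site-root">${renderOutput}</div>',
  { title: 'Home', renderOutput: '<h1>Home</h1>' }
)
```

The rendered component HTML goes into renderOutput as-is, so the markup that React hydrates in the browser matches what was rendered at build time.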

At this point we also have to collect the client-side JS from our Webpack output that needs to be included on this page. This is easy because server-side and client-side entry points share the same chunk names:

JS

const clientEntryPoints = {
  index: '~/code/maxschmitt.me/.tmp/client-entry-points/index.js',
  'posts/tutorial-rest-api-integration-testing-node-js':
    '~/code/maxschmitt.me/.tmp/client-entry-points/posts/tutorial-rest-api-integration-testing-node-js.js',
  'posts/css-100vh-mobile-browsers/index':
    '~/code/maxschmitt.me/.tmp/client-entry-points/posts/css-100vh-mobile-browsers/index.js',
  // ...
}
const serverEntryPoints = {
  index: '~/code/maxschmitt.me/content/index.js',
  'posts/tutorial-rest-api-integration-testing-node-js':
    '~/code/maxschmitt.me/content/posts/tutorial-rest-api-integration-testing-node-js.mdx',
  'posts/css-100vh-mobile-browsers/index':
    '~/code/maxschmitt.me/content/posts/css-100vh-mobile-browsers/index.mdx',
  // ...
}
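Since the client-side config writes every bundle as [name].js, the script for a page follows directly from its chunk name. A sketch of that lookup is shown below; the helper names and the root-relative public path "/" are assumptions.

```javascript
// Hypothetical helper: with `filename: '[name].js'` in the client
// config, a page's bundle URL follows from its chunk name.
function scriptTagFor(chunkName, publicPath = '/') {
  return `<script src="${publicPath}${chunkName}.js"></script>`
}

// Hypothetical helper: pair each server entry with its client script.
function scriptsByPage(serverEntryPoints, clientEntryPoints) {
  const scripts = {}
  for (const name of Object.keys(serverEntryPoints)) {
    // Defensive: a page without a client entry gets no script tag.
    scripts[name] = name in clientEntryPoints ? scriptTagFor(name) : ''
  }
  return scripts
}
```

The resulting string is what would end up in the ${scripts} slot of the master page.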

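That's pretty much all there is to it! In summary, this is what the process looks like at a high level:

  1. Generate a client-side entry point for every content page and write it to a temporary folder.
  2. Run Webpack with the client-side and the server-side config.
  3. Require each server-side bundle and render it to HTML with ReactDOMServer.renderToString.
  4. Inject the component HTML and the matching client-side script into the master page template to produce the static HTML file.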

By the way, this site uses Emotion for CSS, which inlines all the styles during static rendering, so no special setup was required for the CSS. 🙃

During development I'm using the handy live-server package to serve up my work in progress with live reload functionality from the dist folder.

All in all, I had a great time writing my own slightly more advanced static site generator. I learned a lot about server-side Webpack, which I had mostly avoided up until this point. I hope that sharing some of what I learned was helpful to you. As always, feel free to contact me with questions or feedback. :)