Fight Link Rot with Server- and Client-side Redirects (Netlify and Gatsby)
Links break. Here’s how to fight link rot on your site with redirects.
Why you need server-side redirects
URL architectures change all the time as the needs of a site grow. What starts out as /my-post can become @swyx/my-post or posts/my-post or news/2019/my-post in the future.
Consider the user experience. Imagine you’re doing some deep research and after hours of scrounging through the back pages of your Google results, you find a link to something which could solve all your problems. You click it, and the site is still up, but all you see is a 404 page! If the content is still up, hopefully the site has a search function to find it, or Google indexes it. A minor annoyance, sure, but one you as a responsible webmaster could avoid for your users.
The principle of “Don’t Break the Web” becomes even more pressing when you consider that automated workflows like social media unfurls and email/RSS/site scrapers will simply break.
In its most basic form, a redirect sends visitors from URL A to URL B:
/my-broken-url -> /posts/my-new-url
The way this is typically done is by setting up a .htaccess file or server redirect.
One-to-one redirection is very customizable, but it may fail to scale when you need to redirect large groups of posts.
Netlify offers more powerful redirect configuration with Netlify Redirects. You can use placeholders to declaratively rearrange URLs. You can proxy serverless functions. You can use cookie-based language redirects. My favorite, partly because it is fun to say, is the splat feature:
/posts/* /news/:splat
Which is kind of like the spread operator of URLs.
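To make the splat behavior concrete, here is a toy TypeScript model of how a rule like the one above rewrites a path. It only illustrates the matching idea and is not how Netlify implements redirects; the function name applySplat is hypothetical.

```typescript
// Toy model of a splat redirect rule such as `/posts/* /news/:splat`.
// Illustrative only: this is NOT Netlify's actual implementation.
function applySplat(from: string, to: string, path: string): string | null {
  // This sketch only handles rules whose source ends in a trailing `/*`.
  if (!from.endsWith("/*")) return null;
  const prefix = from.slice(0, -1); // "/posts/*" -> "/posts/"
  if (!path.startsWith(prefix)) return null;
  // Whatever the `*` matched is substituted for `:splat` in the target.
  const rest = path.slice(prefix.length);
  return to.replace(":splat", () => rest);
}

console.log(applySplat("/posts/*", "/news/:splat", "/posts/2019/my-post"));
// logs "/news/2019/my-post"
console.log(applySplat("/posts/*", "/news/:splat", "/about"));
// logs null (the rule does not apply)
```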
Search engine indexes will update accordingly as your redirects get visited.
However, in the age of modern Single Page Apps, this isn’t the full story.
Why you need client-side redirects
Server-side redirects take care of inter-site linking: the case of other sites navigating into your site on a broken link.
Client-side redirects address the case of intra-site linking: when your own site links to other pages in your site, rendered via JavaScript so it doesn’t refresh via the server, and the link breaks.
Single Page Apps use client-side routing in order to avoid a full page refresh (which I recently learned isn’t always faster! TIL). This has two primary implications for link rot.
The first and simplest case is that a basic Single Page App, like one set up by create-react-app, may not need a long list of complex server-side redirects at all. Setting up a simple Single Page App catchall and letting client-side routing handle everything means we can also set up redirects only on the client side. The major routing libraries support this, from React Router to Vue Router. This allows us to manage our redirects (old routes) in the same codebase/location as our routing (current routes).
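As a rough sketch of what keeping old routes next to current routes can look like, here is a framework-agnostic lookup in TypeScript. In practice you would use the redirect primitives of React Router or Vue Router; the names currentRoutes, redirectTable, and resolveRoute, and all the paths, are made up for illustration.

```typescript
// Current routes and old routes live side by side in one codebase.
// All paths here are hypothetical examples.
const currentRoutes = new Set<string>(["/", "/posts/my-new-url"]);
const redirectTable = new Map<string, string>([
  ["/my-old-url", "/posts/my-new-url"],
]);

type Resolution =
  | { kind: "render"; path: string }
  | { kind: "redirect"; from: string; to: string }
  | { kind: "not-found"; path: string };

// Decide what the client-side router should do with a requested path.
function resolveRoute(path: string): Resolution {
  if (currentRoutes.has(path)) return { kind: "render", path };
  const target = redirectTable.get(path);
  if (target !== undefined) return { kind: "redirect", from: path, to: target };
  return { kind: "not-found", path };
}

console.log(resolveRoute("/my-old-url"));
```

Because the lookup runs in the browser, it only kicks in once the bundle has loaded.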
One drawback of this approach is that resolving the page on a full page refresh is rather roundabout. First you hit /my-old-url, for which the server serves the client bundle for /, which then parses and renders on the client side, which then reads /my-old-url, which then redirects to /posts/my-new-url.
The second implication of client-side routing for link rot is that modern JavaScript static site generators like Gatsby face a hybrid problem. The bundles for each statically generated page must do client-side routing, but we do not want to configure server-side catchalls (in the simple Single Page App way), or we will lose the whole benefit of using a static site generator in the first place.
It is extraordinarily easy to set up client-side redirects with Gatsby, as createRedirect is a first-class API. Several plugins, like gatsby-plugin-client-side-redirect or gatsby-redirect-from, make the redirects slightly easier to write. There are even hacky redirect plugins that use meta tags, and ones for serving Gatsby on an Express server.
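For a sense of what this looks like, here is a minimal sketch of calling createRedirect from a createPages hook. To keep it self-contained, the Actions and RedirectOptions types below are local stand-ins for the ones that ship with the gatsby package, and the legacyRedirects list is a made-up example; check the option names against the Gatsby docs for your version.

```typescript
// Local stand-ins for the slice of Gatsby's API used here (assumption:
// in a real gatsby-node file these types come from the `gatsby` package).
interface RedirectOptions {
  fromPath: string;
  toPath: string;
  isPermanent?: boolean;
  redirectInBrowser?: boolean;
}
interface Actions {
  createRedirect(options: RedirectOptions): void;
}

// Hypothetical old -> new mapping for illustration.
const legacyRedirects: Array<[string, string]> = [
  ["/my-old-url", "/posts/my-new-url"],
];

// In gatsby-node this function would be exported as `createPages`.
const createPages = ({ actions }: { actions: Actions }): void => {
  for (const [fromPath, toPath] of legacyRedirects) {
    actions.createRedirect({
      fromPath,
      toPath,
      isPermanent: true,
      redirectInBrowser: true, // ask Gatsby to handle the redirect on the client too
    });
  }
};
```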
Flash of NotFound Content
The drawback of using SSG with client-side redirects and no server-side redirects is that the not-found page renders first, until the JavaScript loads and takes over. This leads to the dreaded Flash of NotFound Content (kidding, I made it up).
Using both Server-side and Client-side redirects
Ostensibly, the solution here is to set up parallel server-side and client-side redirects. You may wish to autogenerate the .htaccess file if you are using an Apache server, or use Netlify Redirects to set up a parallel implementation of redirects.
You might even write a tool to output these redirects automatically… 🤔
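Such a tool could start out very small. The sketch below takes one shared list of rules and prints it both as lines for a Netlify _redirects file and as Apache Redirect directives for .htaccess. It assumes simple one-to-one rules only (splats and placeholders would need extra handling on the Apache side), and the names RedirectRule, toNetlifyRedirects, and toHtaccess are hypothetical.

```typescript
// One shared source of truth for redirects (hypothetical example data).
interface RedirectRule {
  from: string;
  to: string;
}

const sharedRules: RedirectRule[] = [
  { from: "/my-old-url", to: "/posts/my-new-url" },
];

// One line per rule in Netlify's `_redirects` format: "<from> <to> <status>".
function toNetlifyRedirects(rules: RedirectRule[]): string {
  return rules.map((r) => `${r.from} ${r.to} 301`).join("\n") + "\n";
}

// One mod_alias `Redirect` directive per rule for an Apache `.htaccess` file.
function toHtaccess(rules: RedirectRule[]): string {
  return rules.map((r) => `Redirect 301 ${r.from} ${r.to}`).join("\n") + "\n";
}

console.log(toNetlifyRedirects(sharedRules));
console.log(toHtaccess(sharedRules));
```

The same list could also feed the client-side redirects, so the server and the client never drift apart.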