70 %
Chris Biscardi

Introspecting Gatsby Data for URL Redirects

I recently needed to switch out the slugger I was using from slugify, which requires a bunch of configuration to get the behavior I want, to @sindresorhus/slugify which requires none. To do this I needed to get all of the URLs generated on my Gatsby site and figure out which ones would change in the transition.

My first step was to open the GraphQL explorer that comes with any Gatsby site and input a query that would give me the path of every page on the site.

GraphQL
{
allSitePage {
edges {
node {
path
}
}
}
}

Which yields a result that is structured like this:

JS
{
"data": {
"allSitePage": {
"edges": [
{
"node": {
"path": "/dev-404-page/"
}
},
{
"node": {
"path": "/post/100-days-of-french"
}
},
...
]
}
}
}

Step 2 was to take this JSON output and trim it down into a list of URLs using jiq, an interactive version of jq. The following command takes the JSON I copied out of the GraphQL explorer, filters it through jiq, allowing me to modify the query if I want to, and then "copies" it back into my "clipboard".

sh
pbpaste | jiq '.data.allSitePage.edges[].node.path' -rc | pbcopy

I then installed the new slugger and re-ran the same steps. I put both outputs one over the other in an Emacs buffer and ran delete-duplicate-lines. Doing this with the first output on top of the second results in all of the first URLs and only the differing second URLs. I did the same with the orders flipped, which gives me all of the original URLs that will break at the bottom:

/
/2012/12/6/a-python-flask-crud/
/2013/1/12/riak-core-unique-identifiers/
/2013/1/12/riak-core-up-and-running/
/2013/1/13/riak-core-myappping/
/2013/1/13/riak-core-quickstart/
/2013/1/16/consistant-hash-routing-in-riak-core/
/2013/2/17/clojure-compojure-jetty-integration/
/2013/2/26/installing-yokozuna-on-osx-lion-with-homebrew/
/2013/2/8/quick-tip-haskell-list-comprehensions/
/2013/2/8/quick-tip-javascript-partially-applied-functions/
/2013/7/12/vagrant-can-not-ssh-into-the-box/
/2013/7/25/installing-montage-a-riak-sibling-resolution-library-written-in-haskell-on-osx/
/2014/1/10/common-lisp-on-android-running-the-mocl-android-example/
/2014/1/11/getting-started-with-snap-and-user-authentication-part-3/
/2014/1/15/scaffolding-a-clojurecompojure-webapp-for-heroku/
/2014/1/16/mocl-for-android-part-2-drakma/
/2014/1/27/android-robospice-with-googlehttpclient/
/2014/1/28/android-listfragment-populated-by-robospice/
/2014/1/5/flattening-nested-arrays-in-javascript/
/2014/1/6/getting-started-with-snap-and-user-authentication-part-1/
/2014/1/7/books-on-my-bookshelf-for-2014/
/2014/1/9/getting-started-with-snap-and-user-authentication-part-2/
/2014/1/9/using-databases-inside-of-snaplet-auth-restricted-routes/
/2014/10/15/deploying-snap-with-docker/
/2014/10/17/emacs-in-docker/
/2014/10/5/working-with-snap-1-0/
/2014/11/23/using-the-docker-haskell-official-image/
/2014/12/21/docker-machine/
/2014/12/27/deploying-a-minecraft-server-with-docker-machine/
/2014/2/1/in-search-of-a-riak-solr-client-for-haskell/
/2014/2/12/riak-haproxy-and-haskell-multimachine-vagrant-on-osx/
/2014/2/2/deploy-haskells-snap-on-heroku/
/2014/2/4/writing-a-first-emacs-elisp-function/
/2014/2/6/environment-variables-in-haskell/
/2014/2/7/geospatial-indexing-with-riak-search-2-0-yokozunasolr/
/2014/3/5/ghc-ddump-minimal-imports-and-cpp-error-missing-binary-operator-before-token/
/2014/3/8/getting-started-with-robolectric-headless-android-testing-with-vagrant/
/2014/4/8/snap-postgres-and-heist-displaying-data-from-queries/
/2014/4/9/handling-taps-on-listgrid-items-with-viewpagers-gesture/
/2014/6/22/getting-started-with-purescript/
/2014/6/29/stm-containers-benchmarks/
/2014/7/4/a-foray-into-haxl-postgresql-simple/
/dev-404-page/
/media/
/post/100-days-of-french
/post/accessing-props-in-mdx
/post/autocompleting-yarn-commands-in-zsh
/post/building-a-basic-gatsby-site
/post/building-a-docker-registry
/post/building-gatsby-plugin-webmentions
/post/chaining-emacs-commands-without-elisp
/post/codeblocks-mdx-and-mdx-utils
/post/component-shadowing-in-gatsby-child-themes
/post/components-vs-ast-manipulation-in-mdx
/post/composing-yarn-workspaces
/post/composition-of-styles-strings-vs-objects
/post/confluent-kafka-gke
/post/contorting-webpack-to-render-html-for-gatsby-mdx
/post/controlling-the-gatsby-application-root
/post/cool-stuff-you-can-do-with-fzf
/post/css-in-js-theming-on-a-component-level
/post/design-systems-people-processes-and-emergent-behavior
/post/do-expressions-and-optional-chaining
/post/drawing-a-blank
/post/emacs-command-frequency-2017
/post/emotion-configurable-imports
/post/emotion-patterns-button-group
/post/episode-1-a-new-blog
/post/formatting-markdown-and-codeblocks-with-prettier-and-hugo
/post/gatsby-plugin-mdx-components-api-design
/post/gatsby-themes-core-algorithm
/post/gatsby-themes-webpack-loaders-and-npm-packages
/post/getting-gatsby-images-from-generated-fields
/post/getting-started-with-emotion-and-gatsby
/post/hard-scheduling
/post/how-and-why-static-queries-in-gatsby-themes
/post/how-mdx-transforms-into-jsx
/post/instrumenting-servant-with-prometheus
/post/kafka-kube-and-statefulsets-on-gke
/post/listeners-without-renders-in-react
/post/mdx-custom-elements
/post/multi-package-repos-with-lerna
/post/multi-package-repos-with-yarn
/post/notes-on-parsing-the-graph-ql-sdl
/post/react-communicating-with-children
/post/react-state-with-class-properties
/post/react-waypoint-revealable
/post/removing-gatsby-mdx-wrappers
/post/reproducible-tutorials-with-stack
/post/response-headers-with-servant
/post/seo-images-with-unsplash
/post/servant-custom-content-types
/post/styled-system-on-react-hooks
/post/towards-shortcodes-for-gatsby-sites
/post/work-vs-home-equipment
/post/working-faster
/posts/
/post/codeblocks,-mdx,-and-mdx-utils
/post/design-systems-people,-processes,-and-emergent-behavior
/post/gatsby-themes,-webpack-loaders,-and-npm-packages
/post/kafka,-kube,-and-statefulsets-on-gke
/post/notes-on-parsing-the-graphql-sdl

I then took both outputs and combined the "extra" URLs into a netlify _redirects file in my static folder, which gets directly copied into my final site.

/post/codeblocks,-mdx,-and-mdx-utils /post/codeblocks-mdx-and-mdx-utils
/post/design-systems-people,-processes,-and-emergent-behavior /post/design-systems-people-processes-and-emergent-behavior
/post/gatsby-themes,-webpack-loaders,-and-npm-packages /post/gatsby-themes-webpack-loaders-and-npm-packages
/post/kafka,-kube,-and-statefulsets-on-gke /post/kafka-kube-and-statefulsets-on-gke
/post/notes-on-parsing-the-graphql-sdl /post/notes-on-parsing-the-graph-ql-sdl

Fin

I used Gatsby's introspection capabilities to get a list of all URLs on my site, filtered it through jq and then de-duplicated those lists to get the final list I needed fairly quickly. Getting this list of URLs can be annoying and difficult in other ecosystems, but Gatsby's GraphQL support makes it trivial.