MDX is at the core of gatsby-mdx and unified is at the core of MDX... but if you're not immersed in the tech it can be hard to understand what MDX is, how it transforms markdown into JSX, and thus also why it's powerful.
If we import @mdx-js/mdx, the core package for MDX, we can transform arbitrary markdown strings into JSX. MDX does this through a multi-step, multi-AST process. During this process we'll touch mdast, hast, and jsx. mdast (the markdown AST) is usually referred to by the name of its processor, remark, while hast (the HTML AST) is usually referred to as rehype.
Now that we've gotten our vocab out of the way, let's take a look at some markdown content. This markdown is fairly standard. It has a header, some paragraph content and a code block.

    # a title

    some paragraph content

    ```js some-prop
    const some = 'code'
    ```
If we run this markdown through mdx using mdx.sync (mdx is async by default but we're toning it down a bit for this blog post) we end up with the following JSX output.
    export default class MDXContent extends React.Component {
      constructor(props) {
        super(props);
        this.layout = null;
      }
      render() {
        const { components, ...props } = this.props;
        return (
          <MDXTag name="wrapper" components={components}>
            <MDXTag name="h1" components={components}>{`a title`}</MDXTag>
            <MDXTag name="p" components={components}>{`some paragraph content`}</MDXTag>
            <MDXTag name="pre" components={components}>
              <MDXTag
                name="code"
                components={components}
                parentName="pre"
                props={{
                  className: "language-js",
                  metastring: "some-prop",
                  "some-prop": true
                }}
              >{`const some = 'code'`}</MDXTag>
            </MDXTag>
          </MDXTag>
        );
      }
    }
MDX outputs a React component authored as a class. The layout in the constructor is an MDX layout, which we'll go over another time. For now, know that it's in the constructor because we need a stable reference to the layout component if there is one.
MDXTag is a special component that handles rendering any DOM-level components. These are the ones that you typically write in lowercase when using React regularly. MDXTag is what lets us replace the rendering of any of these elements with our own React components. There are also a couple of names with special handling that we can replace: wrapper and inlineCode.
Each MDXTag takes the components prop as an argument. This is the same components prop you might have used to define your own React components to handle rendering a pre, a, or p tag.
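To make the replacement idea concrete, here is a minimal sketch of the kind of lookup an MDXTag-like component could perform. This is not MDX's actual implementation: resolveTag is a hypothetical helper, and FancyPre is a plain function standing in for a React component so the snippet runs without React.

```javascript
// Hypothetical helper, not part of MDX: decide what should render a tag.
// `components` maps tag names to replacements, like the components prop.
const resolveTag = (name, components = {}) =>
  // Use the custom component when one is registered, else the tag name.
  Object.prototype.hasOwnProperty.call(components, name)
    ? components[name]
    : name;

// Stand-in for a custom React component that renders <pre> elements.
const FancyPre = props => `<pre class="fancy">${props.children}</pre>`;

const components = { pre: FancyPre };

console.log(resolveTag("pre", components) === FancyPre); // true
console.log(resolveTag("p", components)); // p
```

Anything without an entry in components falls back to its plain tag name, which is why only the elements you override change how they render.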
The props prop holds the props that get passed to the underlying element when rendered. In this example, the code element gets these props. Notice how one of the props was parsed from the meta string of the code block. This is a standard markdown feature.

    {
      className: "language-js",
      metastring: "some-prop",
      "some-prop": true
    }
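As a rough illustration of how a meta string could turn into props like these, here is a small sketch. The codeProps helper is hypothetical and is not taken from the MDX source, and the key=value handling is an assumption added for illustration; only the bare-word case appears in the output above.

```javascript
// Hypothetical helper: build props for a <code> element from a code block's
// language and meta string (the text that follows the language).
const codeProps = (lang, meta) => {
  const props = {};
  if (lang) props.className = `language-${lang}`;
  if (meta) {
    props.metastring = meta;
    for (const part of meta.split(/\s+/).filter(Boolean)) {
      const eq = part.indexOf("=");
      if (eq === -1) {
        // A bare word becomes a boolean prop.
        props[part] = true;
      } else {
        // Assumption: key=value pairs become string props.
        props[part.slice(0, eq)] = part.slice(eq + 1);
      }
    }
  }
  return props;
};

console.log(JSON.stringify(codeProps("js", "some-prop")));
// {"className":"language-js","metastring":"some-prop","some-prop":true}
```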
That doesn't tell us much about how we got here though. So let's create two plugins: one for mdast (remark) and one for hast. These plugins do nothing but print out the AST they receive and return it. This allows us to see the mdast and hast output before it gets transformed into JSX.
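    const mdx = require("@mdx-js/mdx");

    // console.log returns undefined, so `|| ast` hands the tree back unchanged.
    const mdPlugin = () => ast => console.log(ast) || ast;
    const hastPlugin = () => ast => console.log(ast) || ast;

    // mdxContent is the markdown string shown earlier.
    console.log(
      mdx.sync(mdxContent, {
        mdPlugins: [mdPlugin],
        hastPlugins: [hastPlugin]
      })
    );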
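Printing the whole tree gets noisy quickly, so a plugin that prints only an outline can be handy. The sketch below keeps the same plugin shape as the two plugins above (a function returning a transformer that returns the AST). The summarize and describe helpers and the hand-written sampleAst are illustrative stand-ins, so the snippet runs without MDX installed.

```javascript
// Describe one node: its type, plus depth or tag name where that is useful.
const describe = node => {
  if (node.type === "heading") return `heading depth=${node.depth}`;
  if (node.type === "element") return `element <${node.tagName}>`;
  return node.type;
};

// Walk the tree depth-first and collect one indented line per node.
const summarize = (node, level = 0, lines = []) => {
  lines.push(`${"  ".repeat(level)}${describe(node)}`);
  for (const child of node.children || []) summarize(child, level + 1, lines);
  return lines;
};

// Same shape as the logging plugins: print, then return the tree untouched.
const summaryPlugin = () => ast => {
  console.log(summarize(ast).join("\n"));
  return ast;
};

// Hand-written tree standing in for what the markdown parser would pass in.
const sampleAst = {
  type: "root",
  children: [
    { type: "heading", depth: 1, children: [{ type: "text", value: "a title" }] },
    { type: "paragraph", children: [{ type: "text", value: "some paragraph content" }] },
    { type: "code", lang: "js", meta: "some-prop", value: "const some = 'code'" }
  ]
};

summaryPlugin()(sampleAst);
```

Because the transformer returns the tree it was given, a plugin like this can sit anywhere in the plugin list without changing the final output.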
First the mdast. Since mdast is a markdown AST, we'll see very high level objects here such as heading, text, and paragraph. There are no HTML tags yet. There's also positional data that tells us where each node starts and ends. Nodes like heading can have additional metadata on them like depth, which indicates what level of heading it is. The mdast then gets transformed into hast using a set of custom handlers for code blocks, ES2015 imports, and some other nodes.
    {
      type: "root",
      children: [
        {
          type: "heading",
          depth: 1,
          children: [
            {
              type: "text",
              value: "a title",
              position: {
                start: { line: 1, column: 3, offset: 2 },
                end: { line: 1, column: 10, offset: 9 },
                indent: []
              }
            }
          ],
          position: {
            start: { line: 1, column: 1, offset: 0 },
            end: { line: 1, column: 10, offset: 9 },
            indent: []
          }
        },
        {
          type: "paragraph",
          children: [
            {
              type: "text",
              value: "some paragraph content",
              position: {
                start: { line: 3, column: 1, offset: 11 },
                end: { line: 3, column: 23, offset: 33 },
                indent: []
              }
            }
          ],
          position: {
            start: { line: 3, column: 1, offset: 11 },
            end: { line: 3, column: 23, offset: 33 },
            indent: []
          }
        },
        {
          type: "code",
          lang: "js",
          meta: "some-prop",
          value: "const some = 'code'",
          position: {
            start: { line: 5, column: 1, offset: 35 },
            end: { line: 7, column: 4, offset: 74 },
            indent: [1, 1]
          }
        }
      ],
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 8, column: 1, offset: 75 }
      }
    }
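The handler idea mentioned above can be sketched as a lookup table keyed by mdast node type. This is a toy version under stated assumptions, not the handlers MDX or remark-rehype actually ship: the handlers table and toHast function are hypothetical, positions are dropped, and the "\n" text nodes that appear between elements in the real hast are omitted.

```javascript
// Hypothetical handlers keyed by mdast node type. Each returns a hast node.
const handlers = {
  root: (node, all) => ({ type: "root", children: all(node) }),
  heading: (node, all) => ({
    type: "element",
    tagName: `h${node.depth}`, // depth 1 becomes h1, depth 2 becomes h2, ...
    properties: {},
    children: all(node)
  }),
  paragraph: (node, all) => ({
    type: "element",
    tagName: "p",
    properties: {},
    children: all(node)
  }),
  text: node => ({ type: "text", value: node.value })
};

// Convert one mdast node (and, through `all`, its children) to hast.
const toHast = node => {
  const handler = handlers[node.type];
  if (!handler) throw new Error(`no handler for mdast node type "${node.type}"`);
  const all = parent => (parent.children || []).map(toHast);
  return handler(node, all);
};

const mdast = {
  type: "root",
  children: [
    { type: "heading", depth: 1, children: [{ type: "text", value: "a title" }] },
    { type: "paragraph", children: [{ type: "text", value: "some paragraph content" }] }
  ]
};

const hast = toHast(mdast);
console.log(hast.children.map(child => child.tagName).join(", ")); // h1, p
```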
Next the hast. Since hast is an HTML AST, we'll see how the original heading in mdast translated to an h1 tag in hast. Notice how the depth property was used to translate the heading into an h1 specifically. The hast is then transformed in turn, much like the mdast was earlier, this time into JSX using a toJSX function.
    {
      type: "root",
      children: [
        {
          type: "element",
          tagName: "h1",
          properties: {},
          children: [
            {
              type: "text",
              value: "a title",
              position: {
                start: { line: 1, column: 3, offset: 2 },
                end: { line: 1, column: 10, offset: 9 }
              }
            }
          ],
          position: {
            start: { line: 1, column: 1, offset: 0 },
            end: { line: 1, column: 10, offset: 9 }
          }
        },
        { type: "text", value: "\n" },
        {
          type: "element",
          tagName: "p",
          properties: {},
          children: [
            {
              type: "text",
              value: "some paragraph content",
              position: {
                start: { line: 3, column: 1, offset: 11 },
                end: { line: 3, column: 23, offset: 33 }
              }
            }
          ],
          position: {
            start: { line: 3, column: 1, offset: 11 },
            end: { line: 3, column: 23, offset: 33 }
          }
        },
        { type: "text", value: "\n" },
        {
          type: "element",
          tagName: "pre",
          properties: {},
          children: [
            {
              type: "element",
              tagName: "code",
              properties: {
                className: ["language-js"],
                metastring: "some-prop",
                "some-prop": true
              },
              children: [{ type: "text", value: "const some = 'code'\n" }],
              position: {
                start: { line: 5, column: 1, offset: 35 },
                end: { line: 7, column: 4, offset: 74 }
              }
            }
          ],
          position: {
            start: { line: 5, column: 1, offset: 35 },
            end: { line: 7, column: 4, offset: 74 }
          }
        }
      ],
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 8, column: 1, offset: 75 }
      }
    }
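To give a feel for the final step, here is a toy serializer that turns a small hast tree into an MDXTag string. It is only a sketch and not MDX's real toJSX: the function body is an assumption, it only handles root, element, and text nodes, and it does not escape backticks or ${ sequences inside text.

```javascript
// Toy hast-to-JSX serializer (assumed behavior, not the MDX implementation).
const toJSX = (node, parentName) => {
  if (node.type === "root") {
    const body = node.children.map(child => toJSX(child)).join("");
    return `<MDXTag name="wrapper" components={components}>${body}</MDXTag>`;
  }
  if (node.type === "text") {
    // Drop whitespace-only separators such as "\n"; wrap real text in a
    // template literal expression.
    return node.value.trim() === "" ? "" : `{\`${node.value}\`}`;
  }
  if (node.type === "element") {
    const parent = parentName ? ` parentName="${parentName}"` : "";
    const hasProps = Object.keys(node.properties || {}).length > 0;
    const props = hasProps ? ` props={${JSON.stringify(node.properties)}}` : "";
    const children = node.children
      .map(child => toJSX(child, node.tagName))
      .join("");
    return `<MDXTag name="${node.tagName}" components={components}${parent}${props}>${children}</MDXTag>`;
  }
  throw new Error(`unsupported hast node type "${node.type}"`);
};

// Trimmed-down version of the hast above, without positions.
const sampleHast = {
  type: "root",
  children: [
    {
      type: "element",
      tagName: "h1",
      properties: {},
      children: [{ type: "text", value: "a title" }]
    },
    { type: "text", value: "\n" }
  ]
};

console.log(toJSX(sampleHast));
```

The output is a string of JSX source, not rendered elements, which is why the result of mdx.sync still has to be compiled (for example by Babel) before React can use it.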
It's true that we can import and use React components directly in MDX too, so what does that do to the JSX output? Let's take a look at the following MDX as it goes through the process.
    # a title

    <SketchPicker />
Notice how the mdast keeps the React component as an HTML
node, untouched.
    {
      type: "root",
      children: [
        {
          type: "heading",
          depth: 1,
          children: [
            {
              type: "text",
              value: "a title",
              position: {
                start: { line: 1, column: 3, offset: 2 },
                end: { line: 1, column: 10, offset: 9 },
                indent: []
              }
            }
          ],
          position: {
            start: { line: 1, column: 1, offset: 0 },
            end: { line: 1, column: 10, offset: 9 },
            indent: []
          }
        },
        {
          type: "html",
          value: "<SketchPicker />",
          position: {
            start: { line: 3, column: 1, offset: 11 },
            end: { line: 3, column: 17, offset: 27 },
            indent: []
          }
        }
      ],
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 4, column: 1, offset: 28 }
      }
    };
Check out how the html block was transformed into a jsx block.
    {
      type: "root",
      children: [
        {
          type: "element",
          tagName: "h1",
          properties: {},
          children: [
            {
              type: "text",
              value: "a title",
              position: {
                start: { line: 1, column: 3, offset: 2 },
                end: { line: 1, column: 10, offset: 9 }
              }
            }
          ],
          position: {
            start: { line: 1, column: 1, offset: 0 },
            end: { line: 1, column: 10, offset: 9 }
          }
        },
        { type: "text", value: "\n" },
        {
          type: "jsx",
          value: "<SketchPicker />",
          position: {
            start: { line: 3, column: 1, offset: 11 },
            end: { line: 3, column: 17, offset: 27 },
            indent: []
          }
        }
      ],
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 4, column: 1, offset: 28 }
      }
    };
Notice how the React component was passed through unaltered.
    export default class MDXContent extends React.Component {
      constructor(props) {
        super(props);
        this.layout = null;
      }
      render() {
        const { components, ...props } = this.props;
        return (
          <MDXTag name="wrapper" components={components}>
            <MDXTag name="h1" components={components}>{`a title`}</MDXTag>
            <SketchPicker />
          </MDXTag>
        );
      }
    }
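The pass-through can be pictured as two very small steps. The functions below are hypothetical stand-ins for what the real handlers do, written only to show that the component's source string is carried along without being parsed or wrapped in an MDXTag.

```javascript
// mdast -> hast: an "html" node is relabeled as "jsx"; its value is kept as is.
const htmlToJsx = node => ({ type: "jsx", value: node.value });

// hast -> JSX: a "jsx" node is emitted as its raw source string.
const serializeJsx = node => node.value;

const mdastNode = { type: "html", value: "<SketchPicker />" };
const hastNode = htmlToJsx(mdastNode);

console.log(hastNode.type); // jsx
console.log(serializeJsx(hastNode)); // <SketchPicker />
```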
Hopefully you have a bit more of an understanding of how MDX content gets transformed into JSX and the steps involved. In a future post we'll cover how the MDXProvider and MDXTag components work together to enable replacing DOM element rendering with custom React components.
In the meantime, since you made it this far, have a codesandbox link you can play around with and check the logs for how different content affects the mdast, hast, and JSX. Maybe even try writing your own remark plugins!