Gatsby SEO Component
This is a four-part series about building an SEO-optimized Gatsby blog.
In Part 3, we will create an SEO component that renders Twitter, Open Graph, and Schema.org meta tags.
Then we will cover the tools needed to validate the markup.
At any point, feel free to check out the source code on GitHub or the live blog.
Let's take a few moments to familiarize ourselves with the project structure.
- articles
  - golden-retriver
    - cover.jpg
    - index.mdx
  - pug
    - cover.jpg
    - index.mdx
  - siberian-husky
    - cover.jpg
    - index.mdx
- src
  - images
    - about.jpg
    - blog.jpg
    - contact.jpg
    - home.jpg
  - components
    - Layout
      - index.jsx
    - Nav
      - index.jsx
    - SEO
      - DefaultMeta.jsx
      - OpenGraph.jsx
      - SchemaOrg.jsx
      - Twitter.jsx
      - getActivePages.js
      - getCurrentUrl.js
      - getImageUrls.js
      - index.jsx
  - fragments
    - FrontmatterFields.js
    - ImageUrlFields.js
  - helpers
    - slashify.js
  - hooks
    - useSiteMetadata.js
  - pages
    - about.jsx
    - blog.jsx
    - contact.jsx
    - index.jsx
  - templates
    - article.jsx
- static
  - logo.jpg
- .nvmrc
- .env.development
- .env.production
- site-metadata.js
- gatsby-config.js
- gatsby-node.js
- package.json
The SEO Component
The SEO component receives data from every page and acts as a wrapper for other, smaller components. Based on the received data, it constructs the imageUrls and activePages objects and a url string, which are passed down the component tree to DefaultMeta, Twitter, OpenGraph, and SchemaOrg.
import React from "react"
import DefaultMeta from "./DefaultMeta"
import OpenGraph from "./OpenGraph"
import Twitter from "./Twitter"
import SchemaOrg from "./SchemaOrg"
import useSiteMetadata from "../../hooks/useSiteMetadata"
import getActivePages from "./getActivePages"
import getImageUrls from "./getImageUrls"
import getCurrentUrl from "./getCurrentUrl"

const SEO = ({
  pathName,
  slug,
  title,
  description,
  images,
  imageAlt,
  pageId,
  type,
  breadcrumb,
  published,
  modified,
}) => {
  // Site-wide values defined in site-metadata.js
  const {
    siteUrl,
    siteName,
    firstName,
    lastName,
    language,
    socialMedia,
    logo,
    address,
    speakableSelector,
    pages,
  } = useSiteMetadata()

  // Values derived from the page data
  const imageUrls = getImageUrls({ images, siteUrl })
  const activePages = getActivePages({ pages, pageId })
  const url = getCurrentUrl({
    siteUrl,
    pathName,
    slug,
    pages,
    activePages,
  })

  // Props for each child component
  const defaultMeta = {
    title,
    description,
    language,
    url,
  }

  const twitter = {
    title,
    description,
    imageUrls,
    imageAlt,
    socialMedia,
  }

  const openGraph = {
    siteName,
    firstName,
    lastName,
    title,
    description,
    imageUrls,
    imageAlt,
    modified,
    published,
    language,
    activePages,
    url,
  }

  const schemaOrg = {
    siteUrl,
    siteName,
    firstName,
    lastName,
    logo,
    language,
    socialMedia,
    address,
    speakableSelector,
    pathName,
    title,
    description,
    imageUrls,
    breadcrumb,
    type,
    modified,
    published,
    slug,
    pages,
    activePages,
    url,
  }

  return (
    <>
      <DefaultMeta {...defaultMeta} />
      <Twitter {...twitter} />
      <OpenGraph {...openGraph} />
      <SchemaOrg {...schemaOrg} />
    </>
  )
}

export default SEO
The getImageUrls Helper Function
After applying the ImageUrlFields fragment to a File node, we get back a deeply nested object with multiple relative paths. The purpose of the getImageUrls function is to generate an absolute URL for every path and pack the results into a concise object.
const getImageUrls = ({ images, siteUrl }) =>
  Object.entries(images).reduce((acc, image) => {
    // Pull the relative path out of the nested image data
    const [
      key,
      {
        images: {
          fallback: { src: path },
        },
      },
    ] = image
    const url = `${siteUrl}${path}`
    acc[`${key}ImageUrl`] = url
    return acc
  }, {})

export default getImageUrls
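To make the shape of the result concrete, here is a minimal, self-contained sketch of the same reduction. The input object is hypothetical: the keys (openGraph, twitter) are inferred from the property names the child components destructure, and the paths are made up for illustration.

```javascript
// Same logic as getImageUrls, inlined (without the export) so the snippet runs on its own.
const getImageUrls = ({ images, siteUrl }) =>
  Object.entries(images).reduce((acc, image) => {
    const [key, { images: { fallback: { src: path } } }] = image
    acc[`${key}ImageUrl`] = `${siteUrl}${path}`
    return acc
  }, {})

// Hypothetical input: keys and paths are illustrative assumptions,
// not the real output of the ImageUrlFields fragment.
const images = {
  openGraph: { images: { fallback: { src: "/static/abc/111/home.jpg" } } },
  twitter: { images: { fallback: { src: "/static/abc/222/home.jpg" } } },
}

const imageUrls = getImageUrls({ images, siteUrl: "https://example.com" })
console.log(imageUrls)
// {
//   openGraphImageUrl: "https://example.com/static/abc/111/home.jpg",
//   twitterImageUrl: "https://example.com/static/abc/222/home.jpg"
// }
```

Each key gets an ImageUrl suffix, which is why the Twitter component can later destructure twitterImageUrl directly.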
The getActivePages Helper Function
Each page type has its own markup template. For example, the /contact page corresponds to the ContactPage schema, and the /blog/pug/ page corresponds to the Article schema. However, the SEO component has no built-in way to distinguish between these two pages, and therefore no way of knowing which template to use.
The getActivePages helper identifies the page type of the received data. It does this by checking the pageId prop against the list of all pages defined in site-metadata.js. Later on, we will rely on its output to map pages to their corresponding templates.
const getActivePages = ({ pages, pageId }) =>
  Object.entries(pages).reduce((acc, page) => {
    const [name, { id }] = page
    // Turn a page name such as "blog" into a flag name such as "isBlog"
    const firstLetter = name[0].toUpperCase()
    const remainingLetters = name.slice(1)
    const key = `is${firstLetter}${remainingLetters}`
    acc[key] = id === pageId
    return acc
  }, {})

export default getActivePages
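A short usage sketch may help here. The pages object below is a hypothetical, trimmed-down stand-in for the one in site-metadata.js (the real ids are not shown in this part), but the flag names it produces match the ones the components destructure.

```javascript
// Same logic as getActivePages, inlined so the snippet runs on its own.
const getActivePages = ({ pages, pageId }) =>
  Object.entries(pages).reduce((acc, [name, { id }]) => {
    const key = `is${name[0].toUpperCase()}${name.slice(1)}`
    acc[key] = id === pageId
    return acc
  }, {})

// Hypothetical page list: the ids are assumptions made for this example.
const pages = {
  home: { id: "home" },
  blog: { id: "blog" },
  article: { id: "article" },
}

const activePages = getActivePages({ pages, pageId: "article" })
console.log(activePages) // { isHome: false, isBlog: false, isArticle: true }
```

Exactly one flag is true for a known pageId, so child components can branch on isArticle or isHome without knowing anything about ids.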
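The getCurrentUrl Helper Function
getCurrentUrl returns the page URL based on the output of the getActivePages helper: the site root for the home page, the blog path plus the slug for an article, and the page's own path for everything else.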
import slashify from "../../helpers/slashify"

const getCurrentUrl = ({
  siteUrl,
  pathName,
  slug,
  pages: {
    blog: { pathName: blogPathName },
  },
  activePages: { isHome, isArticle },
}) =>
  isHome
    ? siteUrl
    : isArticle
    ? slashify(siteUrl, blogPathName, slug)
    : slashify(siteUrl, pathName)

export default getCurrentUrl
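The slashify helper is not shown in this part, so the sketch below uses a hypothetical stand-in. It assumes slashify joins its arguments with single slashes and ends the result with a trailing slash (in line with URLs such as /blog/pug/); the real helper may behave differently.

```javascript
// Hypothetical stand-in for helpers/slashify.js (an assumption, not the author's code):
// trims leading and trailing slashes from each segment, joins with "/", appends "/".
const slashify = (...segments) =>
  `${segments.map(s => String(s).replace(/^\/+|\/+$/g, "")).join("/")}/`

// Same branching as getCurrentUrl, inlined so the snippet runs on its own.
const getCurrentUrl = ({
  siteUrl,
  pathName,
  slug,
  pages: { blog: { pathName: blogPathName } },
  activePages: { isHome, isArticle },
}) =>
  isHome
    ? siteUrl
    : isArticle
    ? slashify(siteUrl, blogPathName, slug)
    : slashify(siteUrl, pathName)

const siteUrl = "https://example.com"
const pages = { blog: { pathName: "/blog" } } // illustrative value

const homeUrl = getCurrentUrl({ siteUrl, pages, activePages: { isHome: true } })
const articleUrl = getCurrentUrl({ siteUrl, slug: "pug", pages, activePages: { isArticle: true } })
const contactUrl = getCurrentUrl({ siteUrl, pathName: "/contact", pages, activePages: {} })

console.log(homeUrl) // "https://example.com"
console.log(articleUrl) // "https://example.com/blog/pug/"
console.log(contactUrl) // "https://example.com/contact/"
```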
The DefaultMeta Component
Before adding the DefaultMeta component, we need to install react-helmet and its corresponding Gatsby plugin, gatsby-plugin-react-helmet:
yarn add react-helmet gatsby-plugin-react-helmet
Update gatsby-config.js:
// ✂️
module.exports = {
  // ✂️
  plugins: [
    `gatsby-plugin-react-helmet`,
    // ✂️
  ],
}
react-helmet is a document head manager for React, while gatsby-plugin-react-helmet provides drop-in support for server-rendering the data added with React Helmet.
import React from "react"
import Helmet from "react-helmet"

const DefaultMeta = ({ title, description, language, url }) => {
  // Convert the locale (e.g. en_US) to a BCP 47 language tag (e.g. en-US)
  const lang = language.replace(`_`, `-`)

  return (
    <Helmet>
      <html lang={lang} />
      <title>{title}</title>
      <meta name="description" content={description} />
      <link rel="canonical" href={url} />
    </Helmet>
  )
}

export default DefaultMeta
The Helmet component takes your HTML tags and renders them inside the head tag.
For the lang attribute, we replace _ with - to follow BCP 47.
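As a quick illustration of that replacement, assuming the language value in site metadata is stored in the Open Graph style (en_US, as in the rendered og:locale tag). The function name toHtmlLang is hypothetical and used only for this example.

```javascript
// Converts an Open Graph style locale into a BCP 47 style language tag.
// With a string pattern, replace() only swaps the first underscore,
// which is enough for values shaped like "en_US".
const toHtmlLang = language => language.replace(`_`, `-`)

console.log(toHtmlLang("en_US")) // "en-US"
```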
The Twitter Component
The Twitter component contains Twitter-specific meta tags. When a URL from your blog is shared on Twitter, it will be expanded into an attractive summary card with a large image.
import React from "react"
import { Helmet } from "react-helmet"

const Twitter = ({
  title,
  description,
  imageUrls: { twitterImageUrl: image },
  imageAlt,
  socialMedia: { twitter: creator },
}) => (
  <Helmet>
    <meta name="twitter:card" content="summary_large_image" />
    <meta name="twitter:title" content={title} />
    <meta name="twitter:description" content={description} />
    <meta name="twitter:image" content={image} />
    <meta name="twitter:image:alt" content={imageAlt} />
    <meta
      name="twitter:creator"
      content={new URL(creator).pathname.replace(`/`, `@`)}
    />
  </Helmet>
)

export default Twitter
- twitter:card is the card type. We use summary_large_image to show a prominent, full-width image alongside the tweet.
- twitter:creator is the person who wrote the content, so its value has to be a Twitter handle starting with the @ symbol.
To learn more about Twitter cards, check out their documentation.
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:title" content="Jane Doe" />
<meta name="twitter:description" content="Jane Doe's place on the web" />
<meta name="twitter:image" content="https://gatsby-seo.netlify.app/static/0cf5c621d3f5b6ef8b857e489c39e2e5/c3xs4/home.jpg" />
<meta name="twitter:image:alt" content="Two corgis sitting next to each other" />
<meta name="twitter:creator" content="@gatsbyjs" />
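The twitter:creator value in the component is derived from a profile URL rather than stored as a handle. A minimal sketch of that conversion, assuming socialMedia.twitter holds a full profile URL such as https://twitter.com/gatsbyjs (the function name toTwitterHandle is hypothetical):

```javascript
// Takes the path of the profile URL ("/gatsbyjs") and swaps the leading slash for "@".
const toTwitterHandle = profileUrl =>
  new URL(profileUrl).pathname.replace(`/`, `@`)

console.log(toTwitterHandle("https://twitter.com/gatsbyjs")) // "@gatsbyjs"

// Caveat: only the first slash is replaced, so a trailing slash survives.
console.log(toTwitterHandle("https://twitter.com/gatsbyjs/")) // "@gatsbyjs/"
```

Because of that caveat, it is worth storing the profile URL without a trailing slash.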
The OpenGraph Component
The Open Graph protocol is supported by a wide range of social media platforms, including Facebook (which created the protocol in 2010), LinkedIn, Pinterest, and Twitter (which falls back to Open Graph when it can't find Twitter meta tags). To learn more, have a look at the Open Graph protocol documentation.
The OpenGraph component renders markup based on the provided page type. This is where getActivePages comes in handy.
import React from "react"
import { Helmet } from "react-helmet"

const OpenGraph = ({
  siteName,
  firstName,
  lastName,
  title,
  description,
  imageUrls: { openGraphImageUrl: image },
  imageAlt,
  modified,
  published,
  language,
  activePages: { isArticle },
  url,
}) => (
  <Helmet>
    <meta property="og:type" content={isArticle ? `article` : `website`} />
    <meta property="og:url" content={url} />
    <meta property="og:site_name" content={siteName} />
    <meta property="og:locale" content={language} />
    <meta property="og:title" content={title} />
    <meta property="og:description" content={description} />
    <meta property="og:image" content={image} />
    <meta property="og:image:alt" content={imageAlt} />
    {/* Article-only tags */}
    {isArticle && published && (
      <meta property="article:published_time" content={published} />
    )}
    {isArticle && modified && (
      <meta property="article:modified_time" content={modified} />
    )}
    {isArticle && (
      <meta property="article:author" content={`${firstName} ${lastName}`} />
    )}
  </Helmet>
)

export default OpenGraph
- og:type is the type of object represented in Open Graph (e.g. music, video, website, profile, article). In our case, this will be article when rendered inside the Article page component, and website for the rest of the pages.
- og:url is the canonical URL of the object.
- og:locale is the language of the content (when not specified, this defaults to en_US).
The following tags will be rendered only in the Article template:
- article:author
- article:published_time
- article:modified_time
<meta property="og:type" content="website" />
<meta property="og:url" content="https://gatsby-seo.netlify.app" />
<meta property="og:site_name" content="Gatsby SEO" />
<meta property="og:locale" content="en_US" />
<meta property="og:title" content="Jane Doe" />
<meta property="og:description" content="Jane Doe's place on the web" />
<meta property="og:image" content="https://gatsby-seo.netlify.app/static/0cf5c621d3f5b6ef8b857e489c39e2e5/c3xs4/home.jpg" />
<meta property="og:image:alt" content="Two corgis sitting next to each other" />
What is Schema.org?
Schema.org is a structured data vocabulary that helps search engines interpret the content of web pages. It was launched in 2011 by Google, Yahoo, and Bing. Once added to a webpage, it conveys contextual meaning to search engines through a hierarchical set of schemas.
HTML alone cannot convey the meaning of a text string. For example, a webpage can include a header of <h1>Apple</h1>, yet it remains unclear whether it refers to the fruit or the company. This is where Schema.org comes into play, providing semantic meaning to search engines.
The Schema.org vocabulary is vast, so for practical reasons, we will focus only on the schemas that Google officially supports. This way, we help the search engines and benefit from rich results in the process. Rich results (also known as rich snippets) are normal Google search results with additional data displayed. They stand out from the other search results, look more appealing, and have a higher click-through rate.
Before jumping into the code, we should clarify what a node identifier is. A node identifier is a unique @id property that is attached to a schema and can be referenced from inside another schema. This is extremely useful for keeping the code DRY and readable.
Take a look at this example:
<script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "PostalAddress",
    "@id": "https://example.com/#address",
    "addressCountry": "US",
    "addressLocality": "Los Angeles",
    "addressRegion": "CA"
  }
</script>
<script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Person",
    "@id": "https://example.com/#person",
    "name": "Jane Doe",
    "address": {
      "@id": "https://example.com/#address"
    }
  }
</script>
<script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@id": "https://example.com/#organization",
    "@type": "Organization",
    "url": "https://example.com",
    "name": "Gatsby SEO",
    "logo": {
      "@type": "ImageObject",
      "url": "https://example.com/logo.jpg",
      "height": 640,
      "width": 640
    },
    "address": {
      "@id": "https://example.com/#address"
    }
  }
</script>
In the above example, we created a PostalAddress schema and then referenced it inside the Person and Organization schemas by passing its node identifier. Without node identifiers, we would have to duplicate the whole PostalAddress object inside both the Person and Organization schemas.
The node identifier value can be pretty much anything, but the standard practice is to use the domain name followed by the number sign and the schema type name, as follows: https://example.com/#schema.
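The convention is easy to capture in a small helper. The sketch below mirrors the schemaId function used in the SchemaOrg component and shows one schema referencing another by its identifier; the site URL and field values are illustrative.

```javascript
const siteUrl = "https://example.com" // illustrative value

// Builds a node identifier in the "<domain>/#<name>" form.
const schemaId = id => `${siteUrl}/#${id}`

const address = {
  "@type": "PostalAddress",
  "@id": schemaId("address"),
  addressCountry: "US",
}

// The person points at the address node instead of embedding a copy of it.
const person = {
  "@type": "Person",
  "@id": schemaId("person"),
  name: "Jane Doe",
  address: { "@id": schemaId("address") },
}

console.log(person.address["@id"]) // "https://example.com/#address"
```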
The SchemaOrg Component
In the SchemaOrg component, we generate address, person, organization, breadcrumb, and page (WebPage, Blog, ContactPage, AboutPage, or Article) schemas, link them together using node identifiers, and compose them into a single JSON object.
import React from "react"
import Helmet from "react-helmet"
import slashify from "../../helpers/slashify"

const SchemaOrg = ({
  siteUrl,
  siteName,
  firstName,
  lastName,
  logo: { pathName: logoPathName, width: logoWidth, height: logoHeight },
  language,
  socialMedia: { twitter, github },
  address,
  speakableSelector,
  pathName,
  title,
  description,
  imageUrls: {
    schemaOrg1x1ImageUrl,
    schemaOrg4x3ImageUrl,
    schemaOrg16x9ImageUrl,
  },
  breadcrumb,
  type,
  modified,
  published,
  slug,
  pages: {
    home: { breadcrumb: homeBreadcrumb },
    blog: { breadcrumb: blogBreadcrumb, pathName: blogPathName },
  },
  activePages: { isHome, isBlog, isAbout, isContact, isArticle },
  url,
}) => {
  const schemaId = id => `${siteUrl}/#${id}`
  const fullName = `${firstName} ${lastName}`
  const pageUrl = slashify(siteUrl, pathName)
  const blogPageUrl = slashify(siteUrl, blogPathName)
  const articlePageUrl = slashify(siteUrl, blogPathName, slug)

  const schemaOrgAddress = {
    "@type": `PostalAddress`,
    "@id": schemaId(`address`),
    ...address,
  }

  const schemaOrgPerson = {
    "@type": `Person`,
    "@id": schemaId(`person`),
    url: siteUrl,
    name: fullName,
    sameAs: [twitter, github],
    address: {
      "@id": schemaId(`address`),
    },
  }

  const schemaOrgOrganization = {
    "@id": schemaId(`organization`),
    "@type": `Organization`,
    url: siteUrl,
    name: siteName,
    logo: {
      "@type": `ImageObject`,
      url: slashify(siteUrl, logoPathName),
      height: logoHeight,
      width: logoWidth,
    },
    address: {
      "@id": schemaId(`address`),
    },
  }

  // Shared page properties first, then the conditional ones
  const schemaOrgPage = Object.assign(
    {
      "@type": type,
      author: { "@id": schemaId(`person`) },
      publisher: { "@id": schemaId(`organization`) },
      description,
      headline: title,
      inLanguage: language,
      name: title,
      url,
      mainEntityOfPage: url,
      image: [
        schemaOrg1x1ImageUrl,
        schemaOrg4x3ImageUrl,
        schemaOrg16x9ImageUrl,
      ],
    },
    isArticle &&
      published && {
        datePublished: published,
      },
    isArticle && modified
      ? { dateModified: modified }
      : published
      ? { dateModified: published }
      : null,
    !isBlog &&
      speakableSelector && {
        speakable: {
          "@type": `SpeakableSpecification`,
          cssSelector: speakableSelector,
        },
      }
  )

  // The home page is always the first breadcrumb
  const breadcrumbList = [
    {
      id: siteUrl,
      name: homeBreadcrumb,
    },
  ]

  if (isBlog || isContact || isAbout) {
    breadcrumbList.push({
      id: pageUrl,
      name: breadcrumb,
    })
  } else if (isArticle) {
    breadcrumbList.push(
      {
        id: blogPageUrl,
        name: blogBreadcrumb,
      },
      {
        id: articlePageUrl,
        name: title,
      }
    )
  }

  const schemaOrgBreadcrumbs = {
    "@type": `BreadcrumbList`,
    "@id": schemaId(`breadcrumbs`),
    name: `Breadcrumbs`,
    itemListElement: breadcrumbList.map(({ id, name }, index) => ({
      "@type": `ListItem`,
      position: index + 1,
      name,
      item: {
        "@type": `WebPage`,
        "@id": id,
      },
    })),
  }

  const schemaOrg = {
    "@context": "http://schema.org",
    "@graph": [
      schemaOrgAddress,
      schemaOrgPerson,
      schemaOrgOrganization,
      schemaOrgPage,
    ],
  }

  // Breadcrumbs are skipped on the home page
  if (!isHome) {
    schemaOrg["@graph"].push(schemaOrgBreadcrumbs)
  }

  return (
    <Helmet>
      <script type="application/ld+json">{JSON.stringify(schemaOrg)}</script>
    </Helmet>
  )
}

export default SchemaOrg
- PostalAddress consists of country, locality, and region fields.
- Person consists of name, url (root URL), sameAs (list of social media profiles), and address fields.
- Organization consists of url, name, address (reference to the PostalAddress schema), and logo fields.
Now let's add the logo.jpg image to the ./static directory. Any file you put into this directory will be copied to the public directory. For instance, if you add a file named dog.jpg to the static directory, it will be copied to public/dog.jpg and become available at the https://<SITE_URL>/dog.jpg URL.
Once we are done with the smaller pieces, we can move on to the generic page schema. This will generate the WebPage, Blog, ContactPage, AboutPage, and Article schemas. Here is the list of properties that are shared by all of these schemas:
- @type
- name
- description
- author
- publisher
- headline
- inLanguage
- url
- mainEntityOfPage
- image
The rest of the properties are added conditionally depending on the page type. The article schema adds datePublished and dateModified, and every page except the blog page gets a speakable field when a speakableSelector is configured.
To conditionally add properties to the object, we use the Object.assign method, which ignores sources that evaluate to false or null. To avoid having to set a modified date by hand for every new blog post, we check whether modified is provided and use it as dateModified if so. Otherwise, we reuse the published value as dateModified.
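The pattern is easier to see in isolation. Below is a minimal sketch that keeps only the date logic from the component; buildPageSchema is a hypothetical name used for this example, and the dates are made up.

```javascript
// Object.assign copies own enumerable properties from each source.
// Sources that are false or null contribute nothing, so a failed
// condition simply leaves the property out.
const buildPageSchema = ({ type, isArticle, published, modified }) =>
  Object.assign(
    { "@type": type },
    isArticle && published && { datePublished: published },
    isArticle && modified
      ? { dateModified: modified }
      : published
      ? { dateModified: published }
      : null
  )

const newPost = buildPageSchema({
  type: "Article",
  isArticle: true,
  published: "2021-01-01",
})
console.log(newPost)
// { "@type": "Article", datePublished: "2021-01-01", dateModified: "2021-01-01" }

const aboutPage = buildPageSchema({ type: "AboutPage", isArticle: false })
console.log(aboutPage) // { "@type": "AboutPage" }
```

A post without a modified date falls back to its published date, while a page with neither date gets no date properties at all.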
The BreadcrumbList schema is not nested inside any other schema. It consists of a chain of linked web pages. The position property orders the items and starts at 1. The home page is always the first item, which is why we initialize breadcrumbList with siteUrl right away. The second level is a blog, contact, or about page.
With our URL structure, the third level is always an article and always comes after the blog page. That's why we push both of these values to the breadcrumbList array at once.
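For an article page, the chain therefore has three levels. The sketch below applies the same mapping as the component to a hypothetical breadcrumb list; the URLs and names are illustrative.

```javascript
// Hypothetical breadcrumb chain for an article: home, then blog, then the article itself.
const breadcrumbList = [
  { id: "https://example.com", name: "Home" },
  { id: "https://example.com/blog/", name: "Blog" },
  { id: "https://example.com/blog/pug/", name: "Pug" },
]

// The array index gives the order, so position is simply index + 1.
const itemListElement = breadcrumbList.map(({ id, name }, index) => ({
  "@type": "ListItem",
  position: index + 1,
  name,
  item: { "@type": "WebPage", "@id": id },
}))

console.log(itemListElement.map(({ position, name }) => `${position}. ${name}`))
// [ "1. Home", "2. Blog", "3. Pug" ]
```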
Once all the schemas are constructed, we create a node array using @graph and fill it with the schemas. The only edge case we need to handle is the breadcrumb list on the home page. There is no reason to have it there, so we exclude it by checking the isHome variable.
{
  "@context": "http://schema.org",
  "@graph": [
    {
      "@type": "WebPage",
      "name": "Jane Doe",
      "headline": "Jane Doe",
      "description": "Jane Doe's place on the web",
      "url": "https://gatsby-seo.netlify.app",
      "mainEntityOfPage": "https://gatsby-seo.netlify.app",
      "image": [
        "https://gatsby-seo.netlify.app/static/0cf5c621d3f5b6ef8b857e489c39e2e5/5bb6f/home.jpg",
        "https://gatsby-seo.netlify.app/static/0cf5c621d3f5b6ef8b857e489c39e2e5/34erv/home.jpg",
        "https://gatsby-seo.netlify.app/static/0cf5c621d3f5b6ef8b857e489c39e2e5/c3xs4/home.jpg"
      ],
      "inLanguage": "en_US",
      "author": {
        "@id": "https://gatsby-seo.netlify.app/#person"
      },
      "publisher": {
        "@id": "https://gatsby-seo.netlify.app/#organization"
      },
      "speakable": {
        "@type": "SpeakableSpecification",
        "cssSelector": ["data-speakable=\"true\""]
      }
    }
  ]
}
SEO Component Validation
To validate Twitter meta tags, use the Twitter Card Validator tool. Make sure that you are testing production URLs, because this tool respects robots.txt settings.
For Open Graph validation, use the Facebook Sharing Debugger. The debugger caches results, so if you don't see the latest changes, click the "Scrape Again" button to fetch the latest version of the page. After clicking the debug button, you will most likely see a warning: "The following required properties are missing: fb:app_id". Don't panic! The fb:app_id meta tag is not required, and it doesn't do anything nowadays anyway.
For Schema.org validation, I recommend using Google's Rich Results Test in conjunction with the Schema.org Validator. In my experience, each tool sometimes catches issues that the other misses.