Passing A Search Query Without Using The SearchBox Component In Algolia

You probably haven't noticed (and you'd be forgiven if this is the case!) that my site now has the ability to search through posts. This is a strange turn of events for me as I decided to remove search capability from my site many years ago as I didn't feel it added any benefits for the user. This became evident from Google Analytics stats where searches never hit high enough numbers to warrant having it. The numbers don't lie!

So what caused this turnaround?

I've noticed that I'm regularly referring back through posts to refresh myself on things I've done in the past and to find solutions to issues I know I've previously written about. Having a search would make trawling through my few hundred posts a lot easier. So this is more of a personal requirement than a commercial one. But there is an exciting aspect to this as well - experimenting with Algolia. Using Algolia search and integrating it with GatsbyJS is something I've been meaning to look into for a long time.

The thought of having the good ol' magnifying glass back in the navigation makes me nostalgic!

Note: In this post, I won't be covering the basic Algolia setup or the plugins you need to install, as there is already a great wealth of information online. Check out my "Useful Links" section at the end of the post.

Basic Setup

Integrating Algolia into GatsbyJS was relatively straight-forward due to the wealth of information that others have already written and also the plugins themselves. The plugins make light work of rendering search results quickly, while allowing enough customisation of the HTML markup for easy implementation within any site. By default, the plugins contain the following components:

  • InstantSearch
  • SearchBox
  • Hits

import algoliasearch from 'algoliasearch/lite';
import PropTypes from 'prop-types';
import { Link } from 'gatsby';
import { InstantSearch, Hits, Highlight, SearchBox } from 'react-instantsearch-dom';
import React from 'react';

// Get API keys from the environment file.
const appId = process.env.GATSBY_ALGOLIA_APP_ID;
const searchKey = process.env.GATSBY_ALGOLIA_SEARCH_KEY;
const searchClient = algoliasearch(appId, searchKey);

const SearchPage = () => (
  <InstantSearch
    searchClient={searchClient}
    indexName={process.env.GATSBY_ALGOLIA_INDEX_NAME}
  >
    <SearchBox />
    <Hits hitComponent={Hit} />
  </InstantSearch>
);

function Hit(props) {
  return (
    <article className="hentry post">
      <h3 className="entry-title">
        <Link to={props.hit.fields.slug}>
          <Highlight attribute="title" hit={props.hit} tagName="mark" />
        </Link>
      </h3>
      <div className="entry-meta">
        <span className="read-time">{props.hit.fields.readingTime.text}</span>
      </div>
      <p className="entry-content">
        <Highlight hit={props.hit} attribute="summary" tagName="mark" />
      </p>
    </article>
  );
}

Hit.propTypes = {
  hit: PropTypes.object.isRequired,
};

export default SearchPage;

InstantSearch is the core component that interacts directly with Algolia's API and takes in two properties: "searchClient", which is created from the Application ID and Search Key acquired from the Algolia account setup, and "indexName", the name of the index to search. This component contains two child components: SearchBox, which is the search textbox, and Hits, which displays the results of the search query.

It is the Hits component where we can customise the HTML with our own markup by using its "hitComponent" attribute. In my case, I created a function to generate HTML where I access the properties from the search index. What's really cool here is that we can also highlight our search term wherever it occurs in the results by using the Highlight component (also provided by the Algolia plugin) and adding a "tagName" attribute.

Removing The SearchBox Component

The standard implementation may not suit all scenarios as you may want a search term to be sent to the InstantSearch component differently. For example, it could be from a custom search textbox or (as in my case) read from a query-string parameter. It wasn't until I started delving further into the standard setup that I realised you cannot just remove the SearchBox component and pass a value directly, but there is a workaround.

I have expanded upon the code-snippet, above, to demonstrate how my search page works...

import algoliasearch from 'algoliasearch/lite';
import PropTypes from 'prop-types';
import { Link } from 'gatsby';
import { InstantSearch, Hits, Highlight, connectSearchBox } from 'react-instantsearch-dom';
import Layout from "../components/global/layout";
import React, { Component } from "react";

// Get API keys from the environment file.
const appId = process.env.GATSBY_ALGOLIA_APP_ID;
const searchKey = process.env.GATSBY_ALGOLIA_SEARCH_KEY;
const searchClient = algoliasearch(appId, searchKey);
const VirtualSearchBox = connectSearchBox(() => <span />);

class SearchPage extends Component { 
  state = {
    searchState: {
      query: '',
    },
  };

  componentDidMount() {   
    // Get "term" query string parameter value.
    let search = window.location.search;
    let params = new URLSearchParams(search);
    let searchTerm = params.get("term");

    // Send the query string value to a "searchState" object used by Algolia.
    this.setState(state => ({
      searchState: {
        ...state.searchState,
        query: searchTerm,
      },
    }));
 }

  render() {
      // Default "instantSearch" HTML to prompt user to enter a search term.
      var instantSearch = null;
      
      // If there is a search term, utilise Algolia's instant search.
      if (this.state.searchState.query) {
        instantSearch = <div className="entry-content">
                          <h2>You've searched for "{this.state.searchState.query}".</h2>
                          <div className="post-list archives-list">
                          <InstantSearch
                              searchClient={searchClient}
                              indexName={process.env.GATSBY_ALGOLIA_INDEX_NAME}
                              searchState={this.state.searchState}
                            >
                              <VirtualSearchBox />
                              <Hits hitComponent={Hit} />
                            </InstantSearch>  
                          </div>
                        </div>
      }
      else {
        instantSearch = <div className="entry-content">
                          <h2>You haven't entered a search term.</h2>
                          <p>Carry out a search by clicking the <em>magnifying glass</em> in the navigation.</p>
                        </div>
      }

      return (
        <Layout>
          <header className="page-header">
            <h1>Search</h1>
            <p>Search the knowledge-base...</p>
          </header>
          <div id="primary" className="content-area">
            <div id="content" className="site-content" role="main">
                <div className="layout-fixed">
                    <article className="page hentry">
                      {instantSearch}
                    </article>
                </div>
            </div>
          </div>
      </Layout>
    )
  }
}

function Hit(props) {
  return (
    <article className="hentry post">
      <h3 className="entry-title">
        <Link to={props.hit.fields.slug}>
          <Highlight attribute="title" hit={props.hit} tagName="mark" />
        </Link>
      </h3>
      <div className="entry-meta">
        <span className="read-time">{props.hit.fields.readingTime.text}</span>
      </div>
      <p className="entry-content">
        <Highlight hit={props.hit} attribute="summary" tagName="mark" />
      </p>
    </article>
  );
}

Hit.propTypes = {
  hit: PropTypes.object.isRequired,
};

export default SearchPage

My code is reading from a query-string value and passing that to a "searchState". The searchState object is created by React InstantSearch internally. Every widget inside the library has its own way of updating it. It contains parameters on the type of search that should be performed, such as query, sorting and pagination, to name a few. All we're interested in doing is updating the query parameter of this object with our search term.

If the query parameter from the "searchState" object is not empty, the search results are rendered; otherwise, a message is displayed stating a search term is required.

One thing to notice is the SearchBox has been replaced with a VirtualSearchBox, which uses the connector of the search box to create a virtual widget - in our case an empty span tag. This will link the InstantSearch component with the query. Having some form of search box component is compulsory.
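The query-string step can be looked at in isolation. The following is a minimal sketch, not the exact code from my search page: "buildSearchState" is a hypothetical helper name, and trimming the value is an extra assumption I've added for illustration. It only relies on the standard URLSearchParams API.

```javascript
// Hypothetical helper: turn a "location.search" string into the
// "searchState" shape that is passed to InstantSearch.
function buildSearchState(search) {
  const params = new URLSearchParams(search);

  // params.get() returns null when "term" is missing, so fall back to "".
  // Trimming is an assumption added for this sketch.
  const query = (params.get("term") || "").trim();

  return { query };
}

console.log(buildSearchState("?term=algolia")); // { query: 'algolia' }
console.log(buildSearchState(""));              // { query: '' }
```

An empty query is falsy, which is what the render() method checks before deciding whether to show results or the prompt.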

Conclusion

I prefer not to use the out-of-the-box search box component as I can potentially save requests to Algolia's API, as searches aren't being made on the fly as a user enters a search term. This is the plugin's default behaviour.

Passing a search term through a query-string may come across as a little backwards, especially when it's rather nice to see search results change before your eyes as you type letter-by-letter. However, the search-as-you-type approach misses one key element: tracking in Google Analytics. Even though I will primarily be the person making the most use of my site search, it'll be interesting to see who else uses it and what search keywords are used.
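For completeness, here is a rough sketch of the other half of the query-string approach: building the URL that a custom search textbox would navigate to. The "buildSearchUrl" name and the "/search" path are assumptions for illustration; only the "term" parameter name comes from the search page code.

```javascript
// Hypothetical helper: build the search page URL for a given term.
// "/search" is an assumed page path; "term" is the parameter the
// search page reads in componentDidMount().
function buildSearchUrl(term, basePath = "/search") {
  const params = new URLSearchParams({ term: term.trim() });
  return `${basePath}?${params.toString()}`;
}

console.log(buildSearchUrl("gatsby plugin")); // /search?term=gatsby+plugin
```

URLSearchParams takes care of encoding, so the term survives the round trip when the search page reads it back.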

Useful Links

GatsbyJS Markdown Plugin: Automatically Open External Links In A New Tab

I’ve been doing a little research into how I can make posts written in markdown more suited for my needs and decided to use this opportunity to develop my own Gatsby Markdown plugin. Ever since I moved to Gatsby, making my own Markdown plugin has been on my todo list as I wanted to see how I could render slightly different HTML markup based on the requirements of my blog post content.

As this is my first markdown plugin, I thought it best to start small and tackle a bugbear of mine - making external links automatically open in a new window. From what I have seen online, some have suggested just adding an HTML anchor tag to the markdown, as you will then have the ability to apply all the attributes you’d want - including target. I’ll quote my previous post about aligning images in markdown and why I’m not keen on mixing HTML with markdown:

HTML can be mingled alongside the markdown syntax... I wouldn't recommend this from a maintainability perspective. Markdown is platform-agnostic so your content is not tied to a specific platform. By adding HTML to markdown, you're instantly sacrificing the portability of your content.

Setup

We will need to create a local Gatsby plugin, which I’ve named gatsby-remark-auto-link-new-window. Ugly name... maybe you can come up with something more imaginative. :-)

To register this with your Gatsby project, you will need to start off with the following:

  • Create a plugins folder at the root of your project (if one hasn’t been created already).
  • Add a new folder based on the name of our plugin, in this case - /plugins/gatsby-remark-auto-link-new-window.
  • Every plugin consists of two files:

    • index.js - where the plugin code to carry out our functionality will reside.
    • package.json - contains the details of our plugin, such as name, description, dependencies etc. For the moment this can just contain an empty JSON object {}. If we were to publish our plugin, this will need to be completed in its entirety.

Now that we have our bare-bones structure, we need to register our local plugin by adding a reference to the gatsby-config.js file. Since this is a plugin to do with transforming markdown, the reference will be added inside the gatsby-transformer-remark options:

{
  // ...
  resolve: 'gatsby-transformer-remark',
  options: {
    plugins: [
      {
        resolve: 'gatsby-remark-embed-gist',
        options: {
          username: 'SurinderBhomra',
        },
      },
      {
        resolve: 'gatsby-remark-auto-link-new-window',
        options: {},
      },
      'gatsby-remark-prismjs',
    ],
  },
  // ...
}

For the moment, I’ve left the options field empty as we currently have no settings to pass to our plugin. This is something I will show in another blog post.

To make sure we have registered our plugin with no errors, we need to run our build using the gatsby clean && gatsby develop command. This command will need to be run after every change made to the plugin. By adding gatsby clean, we ensure the build clears out all the previously built files prior to the next build process.

Rewriting Links In Markdown

As the plugin is relatively straight-forward, let’s go straight into the code that will be added to our index.js file.

const visit = require("unist-util-visit")

module.exports = ({ markdownAST }, pluginOptions) => {
  visit(markdownAST, "link", node => {
    // Check if link is external by checking if the "url" attribute starts with http.
    if (node.url.startsWith("http")) {
      if (!node.data) {
        // hProperties refers to the HTML attributes of the node in question.
        // Ensure this object is added to the node.
        node.data = { hProperties: {} };
      }
      
      // Assign the 'target' attribute.
      node.data.hProperties = Object.assign({}, node.data.hProperties, {
        target: "_blank",
      });
    }
  })

  return markdownAST
}

As you can see, I want to target all markdown link nodes and depending on the contents of the url property we will perform a custom transformation. If the url property starts with an "http" we will then add a new attribute, "target" using hProperties. hProperties refers to the HTML attributes of the targeted node.
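To see what the transformation does without running a Gatsby build, the same logic can be tried against a small hand-built tree. This is only a sketch: "visitNodes" is a simplified stand-in I've written for unist-util-visit, and the sample tree and URLs are made up for illustration.

```javascript
// Simplified stand-in for unist-util-visit (an assumption for this sketch):
// walks the tree and calls fn for every node of the given type.
function visitNodes(tree, type, fn) {
  if (tree.type === type) fn(tree);
  (tree.children || []).forEach(child => visitNodes(child, type, fn));
}

// Same rule as the plugin: links starting with "http" get target="_blank".
function openExternalLinksInNewTab(markdownAST) {
  visitNodes(markdownAST, "link", node => {
    if (node.url.startsWith("http")) {
      node.data = node.data || {};
      node.data.hProperties = Object.assign({}, node.data.hProperties, {
        target: "_blank",
      });
    }
  });
  return markdownAST;
}

// Hand-built sample tree: one external link and one internal link.
const ast = {
  type: "root",
  children: [
    {
      type: "paragraph",
      children: [
        { type: "link", url: "https://www.gatsbyjs.com", children: [] },
        { type: "link", url: "/Blog/2020/01/01/My-Blog-Post", children: [] },
      ],
    },
  ],
};

openExternalLinksInNewTab(ast);
console.log(JSON.stringify(ast.children[0].children, null, 2));
```

Only the first link ends up with a data.hProperties.target value; the internal link is left untouched.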

To see the changes take effect, we will need to re-run gatsby clean && gatsby develop.

Now that we have understood the basics, we can beef up our plugin by adding some more functionality, such as plugin options. But that's for another post.

Aligning Images In Markdown

Every post on this site has been written in markdown since successfully moving over to GatsbyJS. Overall, the transition has been painless and I have found that writing blog posts using the markdown syntax is a lot more efficient than using a conventional WYSIWYG editor. I never noticed until making the move to markdown how fiddly those editors were, as you sometimes needed to clean the generated markup at HTML level.

Of course, all the efficiency of markdown does come at a minor cost in terms of flexibility. Out of the minor limitations, there was one I couldn't let pass. I needed to find a way to position images left, right and centre, as the majority of my previous posts have been formatted in this way. When going through the conversion process from HTML to markdown, all my posts were somewhat messed up and images were rendered at 100% width.

HTML can be mingled alongside the markdown syntax, so I do have an option to use the image tag and append styling. I wouldn't recommend this from a maintainability perspective. Markdown is platform-agnostic so your content is not tied to a specific platform. By adding HTML to markdown, you're instantly sacrificing the portability of your content.

I found a more suitable approach would be to handle the image positioning by appending a hashed value to the end of the image URL. For example, #left, #right, or #center, so an image would be written as ![Beach](/images/beach.jpg#left) (an illustrative path). At CSS level, we can then target the src attribute of the image and apply positioning, along with any additional styling, based on the hashed value. Very neat!

img[src*='#left'] {
  float: left;
  margin: 10px 10px 10px 0;
}

img[src*='#center'] {
  display: block;
  margin: 0 auto;
}

img[src*='#right'] {
  float: right;
  margin: 10px 0 10px 10px;
}

Being someone who doesn’t delve into front-end coding techniques as much as I used to, I am amazed at the type of things you can do within CSS. I’ve obviously come late to the more advanced CSS selectors party.

Journey To GatsbyJS: We Are Live

If you’re seeing this post, then this means I have fully made the transition to a static-generated website architecture using GatsbyJS. I started this process late December last year but then started taking it seriously into the new year. It’s been a learning process getting to grips with a new framework as well as a big jump for me and my site.

Why has it been a big jump?

Everything is static. I have downsized my website footprint considerably. All 250+ blog posts have been migrated into markdown files, so from now on, I will be writing in markdown and (with the help of Netlify) pushing new content by a simple git commit. Until now, I have always had a website that used server-side frameworks that stored all my posts in a database. It’s quite scary moving to a framework that feels quite unnatural to how I would normally build sites and the word “static” when used in relation to a website reminds me of a bygone era.

Process of Moving To Netlify

I was pleasantly surprised by how easy the transition to Netlify was. There is a vast amount of resources available that makes for good reading before making the switch to live. After linking my website Bitbucket repository to a site, the only things left to do to make it live were the following:

  • Upload a _redirects file, listing out any redirects you require Netlify to handle. For GatsbyJS sites, this will need to be added to the /static directory.
  • Set up environment variables to allow the application to easily switch between development and production states. For example, my robots.txt is set to be indexable only when in production mode.
  • Add CNAME records to your existing domain that point to your Netlify domain. For example, surindersite.netlify.com.
  • Issue a free Let’s Encrypt SSL certificate, which is easily done within the account Domain settings.

Post live, the only thing that stumped me was the Netlify domain didn’t automatically redirect to my custom domain. This is something I thought Netlify would automatically handle once the domain records were updated. To get around this, an explicit domain 301 redirect needs to be added to your _redirects file.

# Domain Redirect
https://surinderbhomra.netlify.com/*     https://www.surinderbhomra.com/:splat    301!

New Publishing Process

Before making the switchover, I had to carry out some practice runs on how I would be updating my website just to be sure I could live with the new way of adding content. The process is now the following:

  1. Use “content/posts” branch to add a new blog post.
  2. Create a new .md file that consists of the date and slug. In my case, all my markdown files are named "2010-04-02---My-New-Post.md".
  3. Ensure all categories and tags in the markdown frontmatter are named correctly. This is an important step to ensure no unnecessary new categories or tags are created.
  4. Add any images used in the post to the site. The images should reference Imagekit.io.
  5. Check over the post locally.
  6. Push to master branch and let Netlify carry out the rest.

Out of all the steps, I have only found steps 3 and 4 to require a little effort when compared to using a CMS platform, as previously, I could select from a predefined list of categories and upload images directly. Not a deal-breaker.
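The file naming convention from step 2 is simple enough to script. The following is a small sketch of how a filename could be generated; "postFileName" is a hypothetical helper, and the rule for turning a title into a slug (replacing anything that isn't a letter or number with a hyphen) is my assumption rather than a fixed requirement.

```javascript
// Hypothetical helper: build a markdown filename in the
// "yyyy-MM-dd---Post-Title.md" format used for my posts.
function postFileName(date, title) {
  const slug = title
    .trim()
    .replace(/[^A-Za-z0-9]+/g, "-") // Assumed rule: non-alphanumerics become hyphens.
    .replace(/^-+|-+$/g, "");       // Remove any leading or trailing hyphens.

  return `${date}---${slug}.md`;
}

console.log(postFileName("2010-04-02", "My New Post")); // 2010-04-02---My-New-Post.md
```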

Next Steps

I had a tight deadline to ensure I made the move to Netlify before my current hosting renews for another year and I still have quite a bit of improvement to make. Have you seen my Google Lighthouse score!?! It’s shockingly bad due to using the same HTML markup and CSS from my old site. I focused my efforts on cramming in all the functionality to mimic how my site used to work and on keeping build times to Netlify low.

First thing on the list - rebuild website templates from the ground up.

Lazyload and Responsively Serve External Images In GatsbyJs

For the Gatsby version of my website, currently in development, I am serving all my images from Imagekit.io - a global image CDN. The reason for doing this is so I will have the ultimate flexibility in how images are used within my site, which didn’t necessarily fit with what Gatsby has to offer, especially when it came to how I wanted to position images within blog post content served from markdown files.

As I understand it, Gatsby Image has two methods of responsively resizing images:

  1. Fixed: Images that have a fixed width and height.
  2. Fluid: Images that stretch across a fluid container.

In my blog posts, I like to align my images (just take a look at my post about my time in the Maldives) as it helps break the post up a bit. I won’t be able to achieve that look with the options provided in Gatsby. It’ll all look a little bit too stacked. The only option is to serve my images from Imagekit.io, which in the grand scheme isn’t a bad idea. I get the benefit of being able to transform images on the fly, optimisation (that can be customised through the Imagekit.io dashboard) and fast delivery through its content-delivery network.

To meet my image requirements, I decided to develop a custom responsive image component that will perform the following:

  • Lazyload image when visible in viewport.
  • Ability to parse an array of "srcset" sizes.
  • Set a default image width.
  • Render the image on page load in low resolution.

React Visibility Sensor

The component requires the use of the "react-visibility-sensor" plugin to mimic the lazy loading functionality. The plugin notifies you when a component enters and exits the viewport. In our case, we only want the sensor to run once an image enters the viewport. By default, the sensor is fired every time a block enters and exits the viewport, causing our image to constantly alternate between the small and large versions - something we don't want.

Thanks to a useful post by Mark Oskon, there is a solution that extends the react-visibility-sensor plugin and allows us to turn off the sensor after the first reveal. I ported the code from Mark's solution into a newly created component housed in "/core/visibility-sensor.js", which I then reference in my LazyloadImage component:

import React, { Component } from "react";
import PropTypes from "prop-types";
import VSensor from "react-visibility-sensor";

class VisibilitySensor extends Component {
  state = {
    active: true
  };

  render() {
    const { active } = this.state;
    const { once, children, ...theRest } = this.props;
    return (
      <VSensor
        active={active}
        onChange={isVisible =>
          once &&
          isVisible &&
          this.setState({ active: false })
        }
        {...theRest}
      >
        {({ isVisible }) => children({ isVisible })}
      </VSensor>
    );
  }
}

VisibilitySensor.propTypes = {
  once: PropTypes.bool,
  children: PropTypes.func.isRequired
};

VisibilitySensor.defaultProps = {
  once: false
};

export default VisibilitySensor;

LazyloadImage Component

import PropTypes from "prop-types";
import React, { Component } from "react";
import VisibilitySensor from "../core/visibility-sensor"

class LazyloadImage extends Component {
    render() {
      let srcSetAttributeValue = "";
      // Encode every space in the URL (a string pattern would only replace the first one).
      let sanitiseImageSrc = this.props.src.replace(/ /g, "%20");

      // Iterate through the array of values from the "srcsetSizes" array property.
      if (this.props.srcsetSizes !== undefined && this.props.srcsetSizes.length > 0) {
        for (let i = 0; i < this.props.srcsetSizes.length; i++) {
          srcSetAttributeValue += `${sanitiseImageSrc}?tr=w-${this.props.srcsetSizes[i].imageWidth} ${this.props.srcsetSizes[i].viewPortWidth}w`;

          if (this.props.srcsetSizes.length - 1 !== i) {
            srcSetAttributeValue += ", ";
          }
        }
      }

      return (
          <VisibilitySensor key={sanitiseImageSrc} delayedCall={true} partialVisibility={true} once>
            {({isVisible}) =>
            <>
              {isVisible ? 
                <img src={`${sanitiseImageSrc}?tr=w-${this.props.widthPx}`} 
                      alt={this.props.alt}
                      sizes={this.props.sizes}
                      srcSet={srcSetAttributeValue} /> : 
                <img src={`${sanitiseImageSrc}?tr=w-${this.props.defaultWidthPx}`} 
                      alt={this.props.alt} />}
              </>
            }
          </VisibilitySensor>
      )
    }
}

LazyloadImage.propTypes = {
  alt: PropTypes.string,
  defaultWidthPx: PropTypes.number,
  sizes: PropTypes.string,
  src: PropTypes.string,
  srcsetSizes: PropTypes.arrayOf(
    PropTypes.shape({
      imageWidth: PropTypes.number,
      viewPortWidth: PropTypes.number
    })
  ),
  widthPx: PropTypes.number
}

LazyloadImage.defaultProps = {
  alt: ``,
  defaultWidthPx: 50,
  sizes: `50vw`,
  src: ``,
  widthPx: 50
}

export default LazyloadImage

Component In Use

The example below shows the LazyloadImage component used to serve a logo at different sizes, with image widths of 400, 300 and 200.

<LazyloadImage 
                src="https://ik.imagekit.io/surinderbhomra/Pages/logo-me.jpg" 
                widthPx={400} 
                srcsetSizes={[{ imageWidth: 400, viewPortWidth: 992 }, { imageWidth: 300, viewPortWidth: 768 }, { imageWidth: 200, viewPortWidth: 500 }]}
                alt="Surinder Logo" />
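To make it clearer what ends up in the markup, the "srcSet" string the component builds for the props above can be reproduced with a plain function. This is a sketch that mirrors the loop in the component; "buildSrcSet" is a hypothetical name, and it replaces every space in the URL.

```javascript
// Hypothetical standalone version of the srcset-building logic.
function buildSrcSet(src, srcsetSizes = []) {
  const safeSrc = src.replace(/ /g, "%20"); // Encode all spaces in the URL.

  return srcsetSizes
    .map(({ imageWidth, viewPortWidth }) => `${safeSrc}?tr=w-${imageWidth} ${viewPortWidth}w`)
    .join(", ");
}

const srcSet = buildSrcSet("https://ik.imagekit.io/surinderbhomra/Pages/logo-me.jpg", [
  { imageWidth: 400, viewPortWidth: 992 },
  { imageWidth: 300, viewPortWidth: 768 },
  { imageWidth: 200, viewPortWidth: 500 },
]);

console.log(srcSet);
// https://ik.imagekit.io/surinderbhomra/Pages/logo-me.jpg?tr=w-400 992w, https://ik.imagekit.io/surinderbhomra/Pages/logo-me.jpg?tr=w-300 768w, https://ik.imagekit.io/surinderbhomra/Pages/logo-me.jpg?tr=w-200 500w
```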

Useful Links

  • https://alligator.io/react/components-viewport-react-visibility-sensor/
  • https://imagekit.io/blog/lazy-loading-images-complete-guide/
  • https://www.sitepoint.com/how-to-build-responsive-images-with-srcset/

Journey To GatsbyJS: Beta Site Release v2

It’s taken me a little longer to make more progress as I’ve been stumped on how I would go about listing blog posts filtered by year and/or month. I’ve put extra effort in ensuring the full date is included in the URL for all my blog posts. In the process of doing this, I had to review and refactor the functions used within gatsby-node.js.

Refactoring

I noticed that I was carrying out build operations inefficiently and, in some cases, where they didn’t need to happen. For example, I was building individual blog post pages all over the place, thinking I was required to do this in areas where I was listing blog posts. Reviewing my build operations had a positive impact and I managed to reduce build times to Netlify from 2 minutes 17 seconds to 2 minutes 3 seconds. Where you are able to make build time savings, why wouldn’t you want to do this? By being efficient, you could squeeze in more builds within Netlify’s 300-minute monthly limit (based on free-tier).
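To put a rough number on that saving, here is a quick back-of-the-envelope calculation. It assumes every build takes exactly the same time and that build minutes are counted to the second, neither of which will be true in practice.

```javascript
// Free-tier allowance quoted above, converted to seconds.
const MONTHLY_LIMIT_SECONDS = 300 * 60;

// Number of whole builds that fit into the monthly allowance.
function buildsPerMonth(minutes, seconds) {
  return Math.floor(MONTHLY_LIMIT_SECONDS / (minutes * 60 + seconds));
}

console.log(buildsPerMonth(2, 17)); // 131
console.log(buildsPerMonth(2, 3));  // 146
```

So a 14 second saving per build works out at roughly 15 extra builds a month.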

Page Speed Tests

The GatsbyJS build is at a point where I can start carrying out some performance tests using Google PageSpeed Insights and Lighthouse. Overall, the tests have proved more favourable when compared against my current site. The Lighthouse analysis still proves there is work to be done, however, the static-site generator architecture sets you off to a good start with minimal effort.

[Image: Google Lighthouse Stats - Current Site]

[Image: Google Lighthouse Stats - Gatsby Site]

Current HTML/CSS Quality

I can see the main area of failure is the HTML and CSS build... not my strong suit. The template has inherited performance-lag remnants from my current site and even though I have cleaned it up as well as I can, it’s not ideal. At this moment, I have to focus on function over form.

Site Release Details

This version contains the following:

  • Blog post-filtering by year and/or month. For example:
    • /Blog/2019
    • /Blog/2019/12
  • Refactored build functions.
  • Removed unneeded CSS from the old template (still got more to do).
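
As a rough illustration of the year and month filtering, the archive paths can be derived from the post dates alone. This is a sketch rather than the code from my gatsby-node.js; "archivePaths" is a hypothetical helper and the dates are made up. In a real build, each path would then be handed to Gatsby's createPage action.

```javascript
// Hypothetical helper: derive unique year and year/month archive paths
// from a list of "yyyy-MM-dd" post dates.
function archivePaths(dates) {
  const paths = new Set();

  dates.forEach(date => {
    const [year, month] = date.split("-");
    paths.add(`/Blog/${year}`);
    paths.add(`/Blog/${year}/${month}`);
  });

  return [...paths].sort();
}

console.log(archivePaths(["2019-12-05", "2019-12-20", "2019-11-02"]));
// [ '/Blog/2019', '/Blog/2019/11', '/Blog/2019/12' ]
```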

GatsbyJS Beta Site: http://surinderbhomra.netlify.com

GatsbyJS: Automatically Include Date In Blog Post Slug

There will be times when you will want to customise the slug based on fields from your markdown file. In my case, I wanted all my blog post URLs in the following format: /Blog/yyyy/MM/dd/Blog-Post-Title. There are two ways of doing this:

  1. Enter the full slug using a “slug” field within your markdown file.
  2. Use the onCreateNode() function found in the gatsby-node.js file to dynamically generate the slug.

My preference would be option 2 as it gives us the flexibility to modify the slug structure in one place when required. If for some reason we had to update our slug structure at a later date, it would be very time consuming (depending on how many markdown files you have) to update the slug field within each markdown file if we went ahead with option 1.

This post is suited for those who are storing their content using markdown files. I don’t think you will get much benefit if your Gatsby site is linked to a headless CMS, as the slugs are automatically generated within the platform.

The onCreateNode() Function

This function is called whenever a node is created or updated, which makes it the ideal place to add the functionality we want to perform. It is found in the gatsby-node.js file.

What we need to do is retrieve the fields we would like to form part of our slug by accessing the node's frontmatter. In our case, all we require is two fields:

  1. Post Date
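  2. Slug

const moment = require("moment");
const { createFilePath } = require("gatsby-source-filesystem");

exports.onCreateNode = ({ node, actions, getNode }) => {
  const { createNodeField } = actions;

  if (node.internal.type === `MarkdownRemark`) {
    // Not used for the slug below, but available if the file path is needed.
    const relativeFilePath = createFilePath({ node, getNode, trailingSlash: false });
    const postDate = moment(node.frontmatter.date); // Use moment.js to easily change date format.
    // The frontmatter slug is expected to start with a forward slash.
    const url = `/Blog/${postDate.format("YYYY/MM/DD")}${node.frontmatter.slug}`;

    createNodeField({
      name: `slug`,
      node,
      value: url,
    });
  }
};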

After making this change, you will need to re-run the gatsby develop command.
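
If you want to check the URL format without running a build, the same string can be put together with plain JavaScript. This is a sketch under a couple of assumptions: "buildPostUrl" is a hypothetical helper, the date is in "yyyy-MM-dd" format and read as UTC (moment.js works in local time by default), and the slug starts with a forward slash.

```javascript
// Hypothetical helper: build "/Blog/yyyy/MM/dd/Slug" from frontmatter values.
function buildPostUrl(date, slug) {
  const d = new Date(date); // Date-only ISO strings are parsed as UTC.
  const yyyy = d.getUTCFullYear();
  const mm = String(d.getUTCMonth() + 1).padStart(2, "0");
  const dd = String(d.getUTCDate()).padStart(2, "0");

  return `/Blog/${yyyy}/${mm}/${dd}${slug}`;
}

console.log(buildPostUrl("2020-01-01", "/My-Blog-Post")); // /Blog/2020/01/01/My-Blog-Post
```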

Journey To GatsbyJS: Beta Site Release v1

I am surprised at just how much progress I have made in replicating my site using the GatsbyJS framework. I have spent roughly 10-12 days (not full days) getting up to speed on everything GatsbyJS and transitioning what I have learnt over to the GatsbyJS version of my site.

Initially, my progress was slow as I had to get my head around GraphQL, the process of how static pages are generated in the hierarchy I require and export my existing blog content to markdown. Having previous experience in React has definitely helped in making relatively swift progress.

What I would say to new GatsbyJS developers is to use the Gatsby Starter Default package - if you really want to understand Gatsby in its entirety. The package gives you enough functionality to understand what’s going on so you can easily make your own customisations. Using other fully functional starter packages can cause confusion and led me to ask more questions when attempting to make changes. Trust me, it’s not wise to get too ahead of yourself (as admirable as that might be) in the early stages. Start simple and work your way up!

The interesting thing I noticed whilst working with GatsbyJS is that when I think I am stumped from a functionality point-of-view, I find there is a plugin that does exactly what I require. GatsbyJS offers an array of quality plugins. For example, I had issues in ordering my "preconnect" declarations within the <head> block so they resided before any styles or scripts. It seemed GatsbyJS has its own way of ordering the <head> elements. Thankfully, like always, there’s a plugin on hand to cure my woes.

Site Release Details

As of today, I have released the first version of my GatsbyJS site to Netlify. It’s by no means perfect and will be a work-in-progress for many iterations to come.

This version contains the following:

  • Implemented styling from the current site. Still rough around the edges and in no way efficiently done.
  • All images are hosted via Imagekit.io to be served efficiently via CDN with responsive capability.
  • Added custom routing for blog posts to include the date. For example, “/Blog/2020/01/01/My-Blog-Post”.
  • Posts can be filtered by Category (unstyled).
  • Posts Archive page (unstyled)
  • Implemented pagination for blog listing.
  • Added the following plugins:

My first publish to Netlify completed in 2 minutes 17 seconds. From an efficiency standpoint, I don’t know if this is good or bad. For me, 2 minutes seems like a long time. I wonder if it could be due to the 250+ markdown files I’m using for my blog posts and the multiple filtering routes. It’s also worth noting that I’m going completely static by not relying on any content management platform.

GatsbyJS Beta Site: http://surinderbhomra.netlify.com

Journey To GatsbyJS: Exporting Kentico Blog Posts To Markdown Files

The first thing that came into my head when testing the waters to start the process of moving over to Gatsby was my blog post content. If I could get my content in a form a Gatsby site accepts then that's half the battle won right there, the theory being it will simplify the build process.

I opted to go down the local storage route where Gatsby would serve markdown files for my blog post content. Everything else such as the homepage, archive, about and contact pages can be static. I am hoping this isn’t something I will live to regret, but I like the idea of my content being nicely preserved in source control where I have full ownership without relying on a third-party platform.

My site is currently built on the .NET framework using Kentico CMS. Exporting data is relatively straight-forward, but as I transition to an approach without a content management system, I need to ensure all fields used within my blog posts are transformed appropriately into the core building blocks of my markdown files.

A markdown file can carry additional field information about my post that can be declared at the start of the file, wrapped by triple dashes at the start and end of the block. This is called frontmatter.

Here is a snippet of one of my blog posts exported to a markdown file:

---
title: "Maldives and Vilamendhoo Island Resort"
summary: "At Vilamendhoo Island Resort you are surrounded by serene beauty wherever you look. Judging by the serendipitous chain of events where the stars aligned, going to the Maldives has been a long time in the coming - I just didn’t know it."
date: "2019-09-21T14:51:37Z"
draft: false
slug: "/Maldives-and-Vilamendhoo-Island-Resort"
disqusId: "b08afeae-a825-446f-b448-8a9cae16f37a"
teaserImage: "/media/Blog/Travel/VilamendhooSunset.jpg"
socialImage: "/media/Blog/Travel/VilamendhooShoreline.jpg"
categories: ["Surinder's Log"]
tags: ["holiday", "maldives"]
---

Writing about my holiday has started to become a bit of a tradition (for those that are worthy of such time and effort!) which seem to start when I went to [Bali last year](/Blog/2018/07/06/My-Time-At-Melia-Bali-Hotel). 
I find it's a way to pass the time in airports and flights when making the return journey home. So here's another one...

Everything looks well structured and, from the way I have formatted the date, category and tags fields, it should accommodate the needs of future posts. I made the decision to keep the slug value free of any directory structure to give me the flexibility to create a URL structure dynamically.
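To illustrate why a directory-free slug is flexible, here is a minimal sketch that reads the frontmatter shown above and prefixes the slug with the post date. The function names parseFrontmatter and buildPostPath are hypothetical and this is not how Gatsby itself reads frontmatter (a plugin such as gatsby-transformer-remark does that for you). The parser is an assumption-laden shortcut: it only copes with flat "key: value" lines whose values happen to be valid JSON, which is the case for the fields in my export.

```javascript
// Minimal frontmatter reader. NOT a full YAML parser: it only handles flat
// `key: value` lines whose values are valid JSON (quoted strings, booleans,
// arrays of quoted strings), and JSON.parse will throw on anything else.
function parseFrontmatter(markdown) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(markdown);
  if (!match) return { fields: {}, body: markdown };

  const fields = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(':');
    if (separator === -1) continue; // skip lines that are not key/value pairs
    const key = line.slice(0, separator).trim();
    const rawValue = line.slice(separator + 1).trim();
    fields[key] = JSON.parse(rawValue);
  }

  return { fields, body: markdown.slice(match[0].length).trim() };
}

// Hypothetical helper: build a dated URL from the date and slug fields.
// The date parts are read in UTC, matching the "Z" suffix in the frontmatter.
function buildPostPath(fields) {
  const date = new Date(fields.date);
  const yyyy = date.getUTCFullYear();
  const mm = String(date.getUTCMonth() + 1).padStart(2, '0');
  const dd = String(date.getUTCDate()).padStart(2, '0');
  return `/Blog/${yyyy}/${mm}/${dd}${fields.slug}`;
}

const sample = [
  '---',
  'title: "Maldives and Vilamendhoo Island Resort"',
  'date: "2019-09-21T14:51:37Z"',
  'draft: false',
  'slug: "/Maldives-and-Vilamendhoo-Island-Resort"',
  'tags: ["holiday", "maldives"]',
  '---',
  '',
  'Writing about my holiday has started to become a bit of a tradition...',
].join('\n');

const post = parseFrontmatter(sample);
console.log(buildPostPath(post.fields));
// prints "/Blog/2019/09/21/Maldives-and-Vilamendhoo-Island-Resort"
```

Because the slug carries no directories of its own, changing the URL scheme later only means changing buildPostPath, not every markdown file.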

Kentico Blog Posts to Markdown Exporter

The quickest way to get the content out was to create a console app to carry out the following:

  1. Loop through all blog posts in post date descending.
  2. Update all image paths used as a teaser and within the content.
  3. Convert rich text into markdown.
  4. Construct frontmatter key-value fields.
  5. Output to a text file in the following naming convention: “yyyy-MM-dd---Post-Title.md”.
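As a rough sketch of the naming convention in step 5, the two helpers below build a file name from a post date and slug, and take one apart again. Both function names are hypothetical, and I am assuming the date is formatted in UTC, which may differ from a local-time export by a day for posts published around midnight.

```javascript
// Hypothetical helper: builds "yyyy-MM-dd---Post-Title.md" from a Date and a slug.
function toMarkdownFileName(postDate, slug) {
  const day = postDate.toISOString().slice(0, 10); // yyyy-MM-dd, in UTC
  return `${day}---${slug}.md`;
}

// Hypothetical helper: recovers the date and slug from a file name,
// or returns null when the name does not follow the convention.
function fromMarkdownFileName(fileName) {
  const match = /^(\d{4}-\d{2}-\d{2})---(.+)\.md$/.exec(fileName);
  if (!match) return null;
  return { date: match[1], slug: match[2] };
}

const fileName = toMarkdownFileName(
  new Date('2019-09-21T14:51:37Z'),
  'Maldives-and-Vilamendhoo-Island-Resort'
);
console.log(fileName);
console.log(fromMarkdownFileName(fileName));
```

The triple dash keeps the date and title easy to tell apart, since a post title converted to a slug will already contain single dashes.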

Tasks 2 and 3 will require the most effort…

When I first started using Kentico, all references to images were made directly via the file path and, as I got more familiar with Kentico, this was changed to use permanent URLs. Using permanent URLs caused the link to an image to change from "/Surinder/media/Surinder/myimage.jpg" to “/getmedia/27b68146-9f25-49c4-aced-ba378f33b4df/myimage.jpg?width=500”. I need to create additional checks to find these URLs and transform them into a new path.
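The check for permanent URLs comes down to a pattern match on the "/getmedia/{guid}/{file}" shape. Here is a minimal sketch of the idea, assuming the GUID may or may not be wrapped in braces and the width query string is optional. The function names and the lookupPath callback are hypothetical; the callback stands in for whatever resolves a GUID to a real media path.

```javascript
// Matches /getmedia/{guid}/{file} with an optional ?width={n} query string.
const permanentUrlPattern =
  /\/getmedia\/\{?([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\}?\/([\w,\s-]+\.[a-z]{3,4})(?:\?width=(\d+))?/gi;

// Lists every permanent URL found in a block of HTML.
function findPermanentUrls(html) {
  return Array.from(html.matchAll(permanentUrlPattern), (m) => ({
    url: m[0],
    guid: m[1].toLowerCase(),
    fileName: m[2],
    width: m[3] ? Number(m[3]) : null,
  }));
}

// Replaces each permanent URL with the path returned by lookupPath(guid).
// URLs whose GUID cannot be resolved are left untouched.
function rewritePermanentUrls(html, lookupPath) {
  return html.replace(
    permanentUrlPattern,
    (url, guid) => lookupPath(guid.toLowerCase()) || url
  );
}

const html =
  '<img src="/getmedia/27b68146-9f25-49c4-aced-ba378f33b4df/myimage.jpg?width=500" />';

// Hypothetical in-memory lookup used purely for this example.
const knownFiles = {
  '27b68146-9f25-49c4-aced-ba378f33b4df': '/media/Blog/myimage.jpg',
};

console.log(findPermanentUrls(html));
console.log(rewritePermanentUrls(html, (guid) => knownFiles[guid]));
```

Capturing the width separately means it can be carried over to the new image URL rather than being lost in the rewrite.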

Finding a good .NET markdown converter is imperative. Without this, there is a high chance the rich text content would not be translated to a satisfactory standard, resulting in some form of manual intervention to carry out corrections. Combing through 250 posts manually isn’t my idea of fun! :-)

I found the ReverseMarkdown .NET library offered enough options to deal with Rich Text to Markdown conversion. I could configure the conversion process to pass through any HTML that couldn’t be transformed, thus preserving content.

Code

using CMS.DataEngine;
using CMS.DocumentEngine;
using CMS.Helpers;
using CMS.MediaLibrary;
using Export.BlogPosts.Models;
using ReverseMarkdown;
using System;
using System.Collections.Generic;
using System.Configuration;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace Export.BlogPosts
{
    class Program
    {
        public const string SiteName = "SurinderBhomra";
        public const string MarkdownFilesOutputPath = @"C:\Temp\BlogPosts\";
        public const string NewMediaBaseFolder = "/media";
        public const string CloudImageServiceUrl = "https://xxxx.cloudimg.io";

        static void Main(string[] args)
        {
            CMSApplication.Init();

            List<BlogPost> blogPosts = GetBlogPosts();

            if (blogPosts.Any())
            {
                foreach (BlogPost bp in blogPosts)
                {
                    bool isMDFileGenerated = CreateMDFile(bp);

                    Console.WriteLine($"{bp.PostDate:yyyy-MM-dd} - {bp.Title} - {(isMDFileGenerated ? "EXPORTED" : "FAILED")}");
                }

                Console.ReadLine();
            }
        }

        /// <summary>
        /// Retrieve all blog posts from Kentico.
        /// </summary>
        /// <returns></returns>
        private static List<BlogPost> GetBlogPosts()
        {
            List<BlogPost> posts = new List<BlogPost>();

            InfoDataSet<TreeNode> query = DocumentHelper.GetDocuments()
                                               .OnSite(SiteName)
                                               .Types("SurinderBhomra.BlogPost")
                                               .Path("/Blog", PathTypeEnum.Children)
                                               .Culture("en-GB")
                                               .CombineWithDefaultCulture()
                                               .NestingLevel(-1)
                                               .Published()
                                               .OrderBy("BlogPostDate DESC")
                                               .TypedResult;

            if (!DataHelper.DataSourceIsEmpty(query))
            {
                foreach (TreeNode blogPost in query)
                {
                    posts.Add(new BlogPost
                    {
                        Guid = blogPost.NodeGUID.ToString(),
                        Title = blogPost.GetStringValue("BlogPostTitle", string.Empty),
                        Summary = blogPost.GetStringValue("BlogPostSummary", string.Empty),
                        Body = RichTextToMarkdown(blogPost.GetStringValue("BlogPostBody", string.Empty)),
                        PostDate = blogPost.GetDateTimeValue("BlogPostDate", DateTime.MinValue),
                        Slug = blogPost.NodeAlias,
                        DisqusId = blogPost.NodeGUID.ToString(),
                        Categories = blogPost.Categories.DisplayNames.Select(c => c.Value.ToString()).ToList(),
                        Tags = blogPost.DocumentTags.Replace("\"", string.Empty).Split(',').Select(t => t.Trim(' ')).Where(t => !string.IsNullOrEmpty(t)).ToList(),
                        SocialImage = GetMediaFilePath(blogPost.GetStringValue("ShareImageUrl", string.Empty)),
                        TeaserImage = GetMediaFilePath(blogPost.GetStringValue("BlogPostTeaser", string.Empty))
                    });
                }
            }

            return posts;
        }

        /// <summary>
        /// Creates the markdown content based on Blog Post data.
        /// </summary>
        /// <param name="bp"></param>
        /// <returns></returns>
        private static string GenerateMDContent(BlogPost bp)
        {
            StringBuilder mdBuilder = new StringBuilder();

            #region Post Attributes

            mdBuilder.Append($"---{Environment.NewLine}");
            mdBuilder.Append($"title: \"{bp.Title.Replace("\"", "\\\"")}\"{Environment.NewLine}");
            mdBuilder.Append($"summary: \"{HTMLHelper.HTMLDecode(bp.Summary).Replace("\"", "\\\"")}\"{Environment.NewLine}");
            mdBuilder.Append($"date: \"{bp.PostDate.ToString("yyyy-MM-ddTHH:mm:ssZ")}\"{Environment.NewLine}");
            mdBuilder.Append($"draft: {bp.IsDraft.ToString().ToLower()}{Environment.NewLine}");
            mdBuilder.Append($"slug: \"/{bp.Slug}\"{Environment.NewLine}");
            mdBuilder.Append($"disqusId: \"{bp.DisqusId}\"{Environment.NewLine}");
            mdBuilder.Append($"teaserImage: \"{bp.TeaserImage}\"{Environment.NewLine}");
            mdBuilder.Append($"socialImage: \"{bp.SocialImage}\"{Environment.NewLine}");

            #region Categories

            if (bp.Categories?.Count > 0)
            {
                CommaDelimitedStringCollection categoriesCommaDelimited = new CommaDelimitedStringCollection();

                foreach (string categoryName in bp.Categories)
                    categoriesCommaDelimited.Add($"\"{categoryName}\"");

                mdBuilder.Append($"categories: [{categoriesCommaDelimited.ToString()}]{Environment.NewLine}");
            }

            #endregion

            #region Tags

            if (bp.Tags?.Count > 0)
            {
                CommaDelimitedStringCollection tagsCommaDelimited = new CommaDelimitedStringCollection();

                foreach (string tagName in bp.Tags)
                    tagsCommaDelimited.Add($"\"{tagName}\"");

                mdBuilder.Append($"tags: [{tagsCommaDelimited.ToString()}]{Environment.NewLine}");
            }

            #endregion

            mdBuilder.Append($"---{Environment.NewLine}{Environment.NewLine}");

            #endregion

            // Add blog post body content.
            mdBuilder.Append(bp.Body);

            return mdBuilder.ToString();
        }

        /// <summary>
        /// Creates files with a .md extension.
        /// </summary>
        /// <param name="bp"></param>
        /// <returns></returns>
        private static bool CreateMDFile(BlogPost bp)
        {
            string markdownContents = GenerateMDContent(bp);

            if (string.IsNullOrEmpty(markdownContents))
                return false;

            string fileName = $"{bp.PostDate:yyyy-MM-dd}---{bp.Slug}.md";
            File.WriteAllText($@"{MarkdownFilesOutputPath}{fileName}", markdownContents);

            if (File.Exists($@"{MarkdownFilesOutputPath}{fileName}"))
                return true;

            return false;
        }

        /// <summary>
        /// Gets the full relative path of a file based on its Permanent URL ID.
        /// </summary>
        /// <param name="filePath"></param>
        /// <returns></returns>
        private static string GetMediaFilePath(string filePath)
        {
            if (filePath.Contains("getmedia"))
            {
                // Get GUID from file path.
                Match regexFileMatch = Regex.Match(filePath, @"(\{){0,1}[0-9a-fA-F]{8}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{12}(\}){0,1}");

                if (regexFileMatch.Success)
                {
                    MediaFileInfo mediaFile = MediaFileInfoProvider.GetMediaFileInfo(Guid.Parse(regexFileMatch.Value), SiteName);

                    if (mediaFile != null)
                        return $"{NewMediaBaseFolder}/{mediaFile.FilePath}";
                }
            }

            // Return the file path and remove the base file path.
            return filePath.Replace("/SurinderBhomra/media/Surinder", NewMediaBaseFolder);
        }

        /// <summary>
        /// Convert parsed rich text value to markdown.
        /// </summary>
        /// <param name="richText"></param>
        /// <returns></returns>
        public static string RichTextToMarkdown(string richText)
        {
            if (!string.IsNullOrEmpty(richText))
            {
                #region Loop through all images and correct the path

                // Clean up tildes.
                richText = richText.Replace("~/", "/");

                #region Transform Image Url's Using Width Parameter

                Regex regexFileUrlWidth = new Regex(@"\/getmedia\/(\{{0,1}[0-9a-fA-F]{8}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{12}\}{0,1})\/([\w,\s-]+\.[A-Za-z]{3})(\?width=([0-9]*))", RegexOptions.Multiline | RegexOptions.IgnoreCase);

                foreach (Match fileUrl in regexFileUrlWidth.Matches(richText))
                {
                    string width = fileUrl.Groups[4] != null ? fileUrl.Groups[4].Value : string.Empty;
                    string newMediaUrl = $"{CloudImageServiceUrl}/width/{width}/n/https://www.surinderbhomra.com{GetMediaFilePath(ClearQueryStrings(fileUrl.Value))}";

                    if (newMediaUrl != string.Empty)
                        richText = richText.Replace(fileUrl.Value, newMediaUrl);
                }

                #endregion

                #region Transform Generic File Url's

                Regex regexGenericFileUrl = new Regex(@"\/getmedia\/(\{{0,1}[0-9a-fA-F]{8}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{12}\}{0,1})\/([\w,\s-]+\.[A-Za-z]{3})", RegexOptions.Multiline | RegexOptions.IgnoreCase);

                foreach (Match fileUrl in regexGenericFileUrl.Matches(richText))
                {
                    // Construct media URL required by image hosting company - CloudImage. 
                    string newMediaUrl = $"{CloudImageServiceUrl}/cdno/n/n/https://www.surinderbhomra.com{GetMediaFilePath(ClearQueryStrings(fileUrl.Value))}";

                    if (newMediaUrl != string.Empty)
                        richText = richText.Replace(fileUrl.Value, newMediaUrl);
                }

                #endregion

                #endregion

                Config config = new Config
                {
                    UnknownTags = Config.UnknownTagsOption.PassThrough, // Include the unknown tag completely in the result (default as well)
                    GithubFlavored = true, // generate GitHub flavoured markdown, supported for BR, PRE and table tags
                    RemoveComments = true, // will ignore all comments
                    SmartHrefHandling = true // remove markdown output for links where appropriate
                };

                Converter markdownConverter = new Converter(config);

                return markdownConverter.Convert(richText).Replace(@"[!\", @"[!").Replace(@"\]", @"]");
            }

            return string.Empty;
        }

        /// <summary>
        /// Returns media url without query string values.
        /// </summary>
        /// <param name="mediaUrl"></param>
        /// <returns></returns>
        private static string ClearQueryStrings(string mediaUrl)
        {
            if (mediaUrl == null)
                return string.Empty;

            if (mediaUrl.Contains("?"))
                mediaUrl = mediaUrl.Split('?').ToList()[0];

            return mediaUrl.Replace("~", string.Empty);
        }
    }
}

There is a lot going on here, so let's do a quick breakdown:

  1. GetBlogPosts(): Gets all blog posts from Kentico and maps them to a “BlogPost” class object containing all the fields we want to export.
  2. GetMediaFilePath(): Takes the image path and carries out all the transformations required to change it to a new file path. This method is used in the GetBlogPosts() and RichTextToMarkdown() methods.
  3. RichTextToMarkdown(): Takes rich text and goes through a transformation process to relink images in a format that will be accepted by my image hosting provider, CloudImage. In addition, this is where ReverseMarkdown is used to finally convert to markdown.
  4. CreateMDFile(): Creates the .md file based on the blog posts found in Kentico.

Delving Into The World of Gatsby and Static Site Generators

My interest in static-site generator architecture has been growing ever since I read Paul Stamatiou’s enlightening post about how he built his website. I am always intrigued to know what goes on behind the scenes of someone's website, especially bloggers and the technology stack they use.

Paul built his website using Jekyll. In his post, he explains his reasoning for going down this particular avenue, which, to my great surprise, resonated with me. In the past, I always felt the static-site generator architecture was too restrictive and, coming from a .NET background, I felt comfortable knowing my website was built using some form of server-side code connected to a database, allowing me infinite possibilities. Building a static site just seemed like a backwards approach to me. Paul’s opening few paragraphs changed my perception:

..having my website use a static site generator for a few reasons...I did not like dealing with a dynamic website that relied on a typical LAMP stack. Having a database meant that MySQL database backups was mission critical.. and testing them too. Losing an entire blog because of a corrupt database is no fun...

...I plan to keep my site online for decades to come. Keeping my articles in static files makes that easy. And if I ever want to move to another static site generator, porting the files over to another templating system won't be as much of a headache as dealing with a database migration.

And then it hit me. It all made perfect sense!

Enter The Static Site Generator Platform

I’ll admit, I’ve come late to the static site party and never gave it enough thought, so I decided to pick up the slack and researched different static-site generator frameworks, including:

  • Jekyll
  • Hugo
  • Gatsby

Jekyll runs on the Ruby language, Hugo on Go (a language created at Google) and Gatsby on React. After some tinkering with each, I opted to invest my time in learning Gatsby. I was very tempted by Hugo (even if it meant learning Go), as it is more stable and requires less build time, which is important to consider for larger websites, but it fundamentally lacks an extensive plugin ecosystem.

Static Generator of Choice: Gatsby

Gatsby comes across as a mature platform offering a wide variety of useful plugins and tools to enhance the application build. I’m already familiar with coding in React from some React Native work I did in the past, which I haven’t had much chance to use again. Being built on React, it gave me an opportunity to dust the cobwebs off and improve both my React and (in the process) JavaScript skillset.


I was surprised by just how quickly I managed to get up and running. There is nothing you have to configure unlike when working with content-management platforms. In fact, I decided to create a Gatsby version of this very site. Within a matter of days, I was able to replicate the following website functionality:

  • Listing blog posts.
  • Pagination.
  • Filtering by category and tag.
  • SEO - managing page titles, description, open-graph tags, etc.

There is such a wealth of information and support online to help you along.

I am very tempted to move over to Gatsby.

When to use Static or Dynamic?

A static site generator isn’t suited to every web application scenario. It’s more suited to small/medium-sized sites where there isn't a requirement for complex integrations. It works best with static content that doesn’t require changes to occur based on user interaction.

The only thing that comes into question is the build time where you have pages of content in their thousands. Take Gatsby, for example...

I read about one site containing around 6000 posts, resulting in a build time of 3 minutes. The build time can vary based on the environment Gatsby is running on and the quality of the build. I personally try to ensure the best-case build time by:

  • Using sufficiently spec'd hardware - laptop and hosting environment.
  • Keeping the application lean by utilising minimal plugins.
  • Writing efficient JavaScript.
  • Reusing similar GraphQL queries where the same data is being requested more than once in different components, pages and views.

We have to accept the more pages a website has, the slower the build time will be. Hugo should get an honourable mention here as the build speed beats its competition hands down.

Static sites have their place in any project as long as you stay within the confines of the framework. If you have a feeling that your next project will at some point (or immediately) require some form of fanciful integration, dynamic is the way to go. Dynamic gives you unlimited possibilities and will always be the safer option, something static will never measure up to.

The main strengths of static sites are that they’re secure and perform well in Lighthouse scoring, which can potentially lead to more favourable search engine rankings.

Avenues for Adding Content

The very cool thing is you have the ability to hook up to your content via two options:

  1. Markdown files
  2. Headless CMS

Markdown is such a pleasant and efficient way to write content. It’s all just plain text written with the help of a simplified notation that is then transformed into HTML. The crucial benefit of writing in markdown is its portability and clean output. If in the future I choose to jump to a different static framework, it’s just a copy and paste job.

A more client-friendly solution is to integrate with a Headless CMS, where more familiar Rich Text content editing and media storage are close to hand.

You can also create custom-built pages without having to worry about the data layer, for example, landing pages.

Final Thoughts

I love Gatsby and it’s been a very long time since I have been excited by a different approach to developing websites. I am very tempted to make the move as this framework is made for sites like mine, providing I can get solutions to areas in Gatsby where I currently lack knowledge, such as:

  • Making URLs case-insensitive.
  • 301 redirects.
  • Serving different responsive images within the post content. I understand Gatsby does this at templating-level but cannot currently see a suitable approach for media housed inside content.

I’m sure the above points are achievable and, as I have made quite swift progress on replicating my site in Gatsby, if all goes to plan, I could go the whole hog. That would mean no longer serving content from any form of content-management system and cementing myself in Gatsby.

At one point I was planning on moving over to a headless CMS, such as Kontent or Prismic. That plan was swiftly scrapped when there didn’t seem to be an avenue for migrating my existing content without purchasing a Business or Professional plan, which came at a high cost.

I will be documenting my progress in follow up posts. So watch this space!