Azure Function: 404 Page Checker

Sometimes the simplest piece of development can be the most rewarding, and I think my Azure Function that checks for broken links on a nightly basis is one of those things. The Azure Function reads a list of links from a database table and checks whether each one returns a 200 response. If not, the link is logged and sent to a user by email using the SendGrid API.

Scenario

I was working on a project that takes a list of products from an API and stores them in a Hubspot HubDB table. This table contained all product information and the expected URL to a page. All the CMS pages had to be created manually and assigned the URL as stored in the table, which in turn would allow the page to be populated with product data.

As you can expect, the disadvantage of manually created pages is that a URL change in the HubDB table will result in a broken page. Not ideal! In this case, a URL change is a rare occurrence. All I needed was a checker to make me aware of the odd occasion where a link to a product page is broken.

I won't go into any further detail but rest assured, there was an entirely legitimate reason for this approach in the grand scheme of the project.

Azure Function

I have modified my original code purely for simplification.

using System;
using System.Collections.Generic;
using System.Net;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using SendGrid;
using SendGrid.Helpers.Mail;

namespace ProductsSyncApp
{
  public static class ProductLinkChecker
  {
    [FunctionName("ProductLinkChecker")]
    public static void Run([TimerTrigger("%ProductLinkCheckerCronTime%"
      #if DEBUG
      , RunOnStartup=true
      #endif
      )]TimerInfo myTimer, ILogger log)
    {
      log.LogInformation($"Product Link Checker started at: {DateTime.Now:G}");

      #region Iterate through all product links and collect the ones that do not return a 200.

      List<string> brokenProductLinks = new List<string>();

      foreach (string link in GetProductLinks())
      {
        if (!IsEndpointAvailable(link))
          brokenProductLinks.Add(link);
      }

      #endregion

      #region Send Email

      if (brokenProductLinks.Count > 0)
        SendEmail(Environment.GetEnvironmentVariable("Sendgrid.FromEmailAddress"), Environment.GetEnvironmentVariable("Sendgrid.ToAddress"), "www.contoso.com - Broken Link Report", EmailBody(brokenProductLinks));

      #endregion

      log.LogInformation($"Product Link Checker ended at: {DateTime.Now:G}");
    }

    /// <summary>
    /// Get a list of product links.
    /// This would come from a datasource somewhere containing a list of the expected URLs.
    /// </summary>
    /// <returns></returns>
    private static List<string> GetProductLinks()
    {
      return new List<string>
      {
        "https://www.contoso.com/product/brokenlink1",
        "https://www.contoso.com/product/brokenlink2",
        "https://www.contoso.com/product/brokenlink3",
      };
    }

    /// <summary>
    /// Checks if a URL endpoint is available.
    /// </summary>
    /// <param name="url"></param>
    /// <returns></returns>
    private static bool IsEndpointAvailable(string url)
    {
      try
      {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        using HttpWebResponse response = (HttpWebResponse)request.GetResponse();

        if (response.StatusCode == HttpStatusCode.OK)
          return true;

        return false;
      }
      catch
      {
        return false;
      }
    }

    /// <summary>
    /// Create the email body.
    /// </summary>
    /// <param name="brokenLinks"></param>
    /// <returns></returns>
    private static string EmailBody(List<string> brokenLinks)
    {
      StringBuilder body = new StringBuilder();

      body.Append("<p>To whom it may concern,</p>");
      body.Append("<p>The following product URLs are broken:</p>");

      body.Append("<ul>");

      foreach (string link in brokenLinks)
        body.Append($"<li>{link}</li>");

      body.Append("</ul>");

      body.Append("<p>Many thanks.</p>");

      return body.ToString();
    }

    /// <summary>
    /// Send email through SendGrid.
    /// </summary>
    /// <param name="fromAddress"></param>
    /// <param name="toAddress"></param>
    /// <param name="subject"></param>
    /// <param name="body"></param>
    /// <returns></returns>
    private static Response SendEmail(string fromAddress, string toAddress, string subject, string body)
    {
      SendGridClient client = new SendGridClient(Environment.GetEnvironmentVariable("SendGrid.ApiKey"));

      SendGridMessage sendGridMessage = new SendGridMessage
      {
        From = new EmailAddress(fromAddress, "Product Link Report"),
      };

      sendGridMessage.AddTo(toAddress);
      sendGridMessage.SetSubject(subject);
      sendGridMessage.AddContent("text/html", body);

      return Task.Run(() => client.SendEmailAsync(sendGridMessage)).Result;
    }
  }
}

Here's a rundown on what is happening:

  1. A list of links is returned from the GetProductLinks() method. This will contain a list of correct links that should be accessible on the website.
  2. Loop through all the links and carry out a check against the IsEndpointAvailable() method. This method carries out a simple check to see if the link returns a 200 response. If not, it'll be marked as broken.
  3. Add any link marked as broken to the brokenProductLinks collection.
  4. If there are broken links, send an email handled by SendGrid.

As you can see, the code itself is very simple and the only thing that needs to be customised for your use is the GetProductLinks method, which will need to output a list of expected links that a site should contain for cross-referencing.
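
The four steps above are not tied to .NET. As a minimal sketch, assuming a runtime with a global fetch (such as Node 18 or later), the same flow could look like this in JavaScript. The names fetchStatuses, findBrokenLinks and buildEmailBody are hypothetical and chosen purely for illustration; they are not part of the original function, and the SendGrid call is left out.

```javascript
// Treat anything other than a 200 as broken, mirroring IsEndpointAvailable().
function isAvailable(status) {
  return status === 200;
}

// Steps 2 and 3: keep only the URLs whose status marks them as broken.
function findBrokenLinks(results) {
  return results
    .filter((result) => !isAvailable(result.status))
    .map((result) => result.url);
}

// Step 4 (body only): build the HTML list that would go into the email.
function buildEmailBody(brokenLinks) {
  const items = brokenLinks.map((link) => `<li>${link}</li>`).join("");
  return `<p>The following product URLs are broken:</p><ul>${items}</ul>`;
}

// Network part: request every link and record its status code.
// A status of 0 is an assumed convention here for "request failed".
async function fetchStatuses(links, fetchFn = fetch) {
  return Promise.all(
    links.map(async (url) => {
      try {
        const response = await fetchFn(url);
        return { url, status: response.status };
      } catch {
        return { url, status: 0 };
      }
    })
  );
}
```

Keeping the network call separate from the filtering means the report logic can be exercised with canned status codes, without touching a live site.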

Email Send Out

When using Azure Functions, you can't use the standard .NET approach to send emails. Microsoft recommends using an authenticated SMTP relay service, which reduces the likelihood of email providers rejecting the message. More insight into this can be found in the following StackOverflow post - Not able to connect to smtp from Azure Cloud Service.

When it comes to SMTP relay services, SendGrid comes up favourably and being someone who uses it in their current workplace, it was my natural inclination to make use of it in my Azure Function. Plus, they've made things easy by providing a NuGet package to allow direct access to their Web API v3 endpoints.

Hubspot CMS for Marketers Certified

Since around September last year, I've been involved in a lot of Hubspot projects at my place of work - Syndicut. It's the latest addition to the numerous other platforms that are offered to clients.

The approach to developing websites in Hubspot is not something I'm used to coming from a programming background where you build everything custom using some form of server-side language. But I was surprised by what you can achieve within the platform.

Having spent months building sites using the Hubspot Markup Language (HubL), utilising a lot of the powerful marketing features and using the API to build a custom .NET Hubspot Connector, I thought it was time to attempt a certification focusing on the CMS aspect of Hubspot.

There are two CMS certifications:

  1. Hubspot CMS for Marketers
  2. Hubspot CMS for Developers

I decided to tackle the "CMS for Marketers" certification first as this mostly covers the theory behind using Hubspot to create a user-friendly, high-performing website and leveraging that with Hubspot CRM. These are the areas you can get quite shielded from if you're purely developing pages and modules. I thought it would be beneficial to expose myself to the marketing standpoint to get an insight into how my development forms part of the bigger picture.

I'm happy to report I am now Hubspot CMS for Marketers certified.

Hubspot CMS for Marketers Certification

UniFi: Restrict Network Device Access On A Guest Network

On my UniFi Dream Machine, I have set up a guest wireless network for those who come to my house and need to use the Internet. I've done this across all routers I've ever purchased, as I prefer to keep the main non-guest wireless access point (WAP) just for me as I have a very secure password that I'd rather not share with anyone.

It only occurred to me a few days ago that my reason for having a guest WAP is flawed. After all, the only difference between the personal and guest WAPs is a throw-away password I change regularly. There is no beneficial security in that. It is time to make good use of UniFi’s Guest Control settings and prevent access to internal network devices. I have a very simple network setup and the only two network devices I want to block access to are my Synology NAS and IP Security Camera.

UniFi’s Guest Control settings do a lot of the grunt work out of the box and are pretty effortless to set up. Within the UniFi controller (based on my own UniFi Dream Machine), the following options are available to you:

  1. Guest Network: Create a new wireless network with its own SSID and password.
  2. Guest User Group: Set download/upload bandwidth limitations that can be attached to the Guest Network.
  3. Guest Portal: A custom interface can be created where a guest will be served a webpage to enter a password to access the wireless network - much like what you'd experience when using the internet at an airport or hotel. UniFi gives you enough creative control to make the portal interface look very professional. You can expire the connection after a set number of hours.
  4. Guest Control: Limit access to devices within the local network via IP address.

I don't see the need to enable all guest features the UniFi controller offers and the only two that are of interest to me are setting up a guest network and restricting access (options 1 and 4). This is a straightforward process that will only take a few minutes.

Guest Network

A new wireless network will need to be created and be marked as a guest network. To do this, we need to set the following:

  • Name/SSID: MyGuestNetwork
  • Enable this wireless network: Yes
  • Security: WPA Personal. Add a password
  • Guest Policy: Yes

All other Advanced Options can be left as they are.

UniFi Controller - Guest Network Access Point

Guest Control

To make devices unavailable over your newly created guest network, you can simply add an IPv4 hostname or subnet within the "Post Authorisation Restrictions" section. I've added the IP of my Synology NAS - 172.16.1.101.
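
To get a feel for the difference between restricting a single address and restricting a whole subnet, here is a minimal sketch of IPv4 subnet matching. It only illustrates the general idea of CIDR matching and is not how the UniFi controller is implemented; the function names and the 172.16.1.0/24 subnet are assumptions made for the example.

```javascript
// Convert a dotted IPv4 address into an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split(".").reduce((acc, octet) => (acc << 8) + Number(octet), 0) >>> 0;
}

// Check whether an address falls inside a restriction entry.
// An entry without a "/prefix" is treated as a single address (/32).
function isInSubnet(ip, cidr) {
  const [network, prefixText = "32"] = cidr.split("/");
  const prefix = Number(prefixText);
  // A shift by 32 is a no-op in JavaScript, so a /0 prefix is handled explicitly.
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(network) & mask) >>> 0);
}

// A single-address entry only covers the NAS itself.
console.log(isInSubnet("172.16.1.101", "172.16.1.101"));
// A subnet entry covers every device in that range.
console.log(isInSubnet("172.16.1.50", "172.16.1.0/24"));
```

Restricting a single IP keeps other devices reachable, whereas a subnet entry blocks everything in that range, so it is worth deciding which behaviour you actually want.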

UniFi Controller - Guest Control

If all has gone to plan, you will not be able to access any network-connected devices when connected to the guest WAP.

UniFi: Unable To Access Synology On Local Network

Investing in a UniFi Dream Machine was one of the wisest things I did last year when it comes to relatively expensive purchases. It truly has been worth every penny for its reliability, security and rock-solid connection - something that is very much needed when working from home full-time.

The Dream Machine has been very low maintenance and I just leave it to do its thing apart from carrying out some minor configuration tweaks to aid my network. The only area that I did encounter problems was accessing the Synology Disk Station Manager (DSM) web interface. I could access Synology if I used the local IP address instead of the "myusername.synology.me" domain. Generally, this would be an ok solution, but not the right one for two reasons:

  1. Using a local IP address would restrict connection to my Synology if I was working from another location. This was quite the deal-breaker as I do have a bunch of Synology apps installed on my Mac, such as Synology Drive that carries out backups and folder synchronisation.
  2. I kept on getting a security warning in my browser when accessing DSM regarding the validity of my SSL certificate, which is to be expected as I force all connections to be carried out over SSL.

To my befuddlement, I had no issue accessing the data in my Synology by mapping them as network drives from my computer.

The issue had to lie with my local network, as I was able to access the Synology DSM web interface externally. From perusing the UniFi community forum, there have been quite a few cases where users have reported the same thing and the common phrase that kept popping up in all the posts was: Broken Hairpin NAT. What is a Hairpin NAT?

A Hairpin NAT allows you to run a server (in this case a NAS) inside your network but connect to it as if you were outside your network, for example via a web address such as "myusername.synology.me" that will resolve to the internal IP of the server.

What I needed to do was run an internal DNS server with a local entry for "myusername.synology.me" that points to the internal IP address of the NAS. What was probably happening is that my computer/device was trying to make a connection past the firewall and then back in again to access the NAS. Not the most efficient way to make a connection for obvious reasons and in some cases it may not work. A loopback would resolve this.
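
To illustrate what a local DNS entry achieves, here is a minimal sketch of the lookup order. It is a simplified model and not how the Dream Machine or any real DNS server is implemented; the localEntries map, the resolveHost function and the internal IP used for the NAS are assumptions made for the example.

```javascript
// Hypothetical local DNS overrides: hostname -> internal IP address.
const localEntries = new Map([
  ["myusername.synology.me", "172.16.1.101"], // assumed internal IP of the NAS
]);

// Resolve a hostname, preferring a local entry over the public lookup.
// "publicLookup" stands in for whatever resolver would normally be used.
function resolveHost(hostname, publicLookup) {
  const key = hostname.toLowerCase();
  return localEntries.has(key) ? localEntries.get(key) : publicLookup(key);
}
```

With an override in place, devices inside the network are handed the internal address straight away, so the connection never has to leave through the firewall and come back in.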

A clever user posted a solution to the issue on the UniFi forum that is very easy to follow and worked like a charm - Loopback/DNS Synology DiskStation.

I have also saved a screenshot of the solution for posterity.

Time Is The School In Which We Learn

I've always considered time an enemy as I always had a disdain for how fast the hours and days would just fly by. The speed dial seems to turn a little further with every year that passes and then one day you wake up and you're the big 3-5!

Ever since the pandemic hit, time has become an enemy once again but for a different reason entirely... It just goes sooo slow! On top of that, the ways one would normally progress (for business and pleasure) before the pandemic are no longer within our reach. With the combination of living on my own and a relative lack of social interaction, you can easily find yourself just letting time pass doing a whole lot of nothing.

The following quote by Delmore Schwartz, an American poet, resonates with me:

  Time is the school in which we learn,
  Time is the fire in which we burn.

The worst thing I can do is let time pass and have nothing to show for it. There is a need for something tangible to prove my worth over this period to look back on. For me, writing about what I've learnt is something I can use to quantify progress and this very post just adds to that. I am hoping this will be the fuel to focus on cranking out more posts throughout the year.

I decided to write about the areas in my life that give me the ability to hone my skill set and the process involved.

The Workaholic

Most of my learning happens in a work environment as I am constantly allowed to work on upcoming technologies and platforms. This is probably the reason why I’ve become quite the workaholic. I’m lucky to be in a job that is of great interest to me where I can flex my technical muscle. I am constantly learning new things based on challenging client requirements and that in itself plants the seeds on what I need to learn next.

In the UK, the average working hours per week is 42.5 - above the European average of 41.2. I generally work 45-50 hours a week and that’s not to brag. It’s fun and I genuinely enjoy it. Maybe working from home has also contributed to this. After all, there is nothing else to do in the current climate we find ourselves in.

So far, this year alone, I've learnt the following within a working environment:

  • Azure Functions
  • Azure DevOps
  • Hubspot
  • Hubspot API Development
  • Ucommerce

The Daily 30-Minute Cram Session

Mastering something is just a matter of investing some time no matter how short a learning session is. As minutes become hours and hours become days, it all adds up.

I have a regimen where my day starts with a quick 30-minute learning session on a subject of interest to me. It’s quite surprising how effective a 30-minute cram session can be. I have progressed through my career and adapted to learning new subjects quicker by doing just this. This has also benefitted me in another area: preparing for meetings.

There have been numerous times within my job where I have to be in client meetings to talk about platforms that may be a little foreign to me and provide solutions. I now feel relatively confident that I'm prepared for such a meeting within a short period of 30 minutes.

At the time of writing, my current 30-minute cram sessions are focused on Hubspot development to push the boundaries on what the platform can do and keeping up with Azure’s vast offerings.

Focus Time

I have my "30 Minute Cram Session" but when is the best time to do them? I find the most ideal time is the start of a working day where I get to my desk an hour before the working day starts. Normally, this would have been impossible in pre-Covid times, as this time would be spent getting my things together and making my way to work. Throughout the pandemic, I have continued to get up at my normal time so I can get to my desk by 8 am.

I find it amazing what this one hour of solitude can give me. I either use it to extend a "30 Minute Cram Session" for reading and research or to just get through some tasks before the working day starts. After the pandemic is over and normal life resumes, I hope this can continue.

Creating A Knowledgebase Through Blogging

Being the forgetful person I am (just ask my mum!), I find I remember things more when I write about them - one of the main reasons I started this blog. It allows my brain to process big subjects into more digestible chunks. To aid this further, I added Algolia search to my site at the start of the year, as there have been several times where it's taken me too much time to find something I've previously written.

I have quite a backlog of stuff that I want to write and sometimes I find it difficult to put some technical subjects into words. Believe it or not, I generally find writing a little difficult even after 10+ years of blogging. But I like this challenge.

My approach to writing blog posts is a little unconventional. I work on a handful at a time. Each post starts in my note-taking application of choice, Evernote, where I can start things off simple with a subject title and a skeletal structure to then flesh out. I then write in small chunks across various posts.

Twitter

I may not post much to Twitter, but I follow people who either work in the same industry as me or share similar interests. The conversations that are had on the platform open my eyes to other areas I should be looking into. As a result, this breaks the monotony of approaching something the same way I have for so long and encourages me to try a different approach. It was tweets that got me into seeing the power of Azure Functions, which provided an alternative way of running a piece of code on a schedule effortlessly.

Ongoing List of Ideas

Along with my pile of "in progress" blog posts to write, I also have a to-do list of potential things I want to work on. It could be random things of interest based on what I see day-to-day.

For example, I am currently looking into creating my own Twitter bot (not the spamming kind) that carries out some form of automation. I see quite a few of these bots when checking Twitter and I'm interested to see how I could create my own.

I don't plan on developing anything fancy, such as the very impressive colorize_bot, where black and white images are made colour by simply mentioning the Colorize Bot Twitter handle. But maybe something a little more reserved, such as some textual response based on a hashtag or phrase.

Putting such ideas into practice is the prime environment for learning as I'm developing something that is of interest to me personally on a subject that excites me.

MacBook Pro Charge Limiting for Battery Health

Since working from home, my laptop is constantly left plugged into the mains as there isn’t much of a reason to ever disconnect, especially when you have a nice office to work in. I’ve been told leaving your laptop on charge has a negative impact on the longevity of your battery.

I’ve learnt this the hard way. The battery from my previous laptop, a MacBook Pro 2015, died a slow death until it got to the point where the laptop became a glorified workstation. This seemed to happen quicker than I would have liked - within 3 years from purchase. Not something I’d expect given the build quality of an Apple product.

I was brave enough to replace the battery myself, giving it a new lease of life! The post teaser image is proof of my efforts. That picture was taken when I managed to carefully pry the first cells of the old battery away from the existing adhesive. This was the hardest part of the whole process!

My old laptop has now been replaced with the most recent iteration of the MacBook Pro, as I needed a little more power and most importantly 32GB of RAM to run intensive virtual environments. I made the conscious decision to actively take care of the battery and not repeat the mistakes I made in how I used my previous laptop. This is easier said than done, especially when my laptop is connected via Thunderbolt to my monitor, which both powers my laptop and gives me dual-screen capability. It’s impossible to disconnect!

My only option was to find a “battery charge limiter” application that would set a maximum battery charge. Now, there is a great debate across forums whether going to such lengths does have any positive impact on battery health. Apparently, macOS’s battery health management should suffice for the majority of scenarios when it comes to general usage. Going by experience, this didn’t help the lifespan of my previous MacBook’s battery. Hence my scepticism.

One indirect benefit of setting a charge limit is there will be fewer charge cycles counted, resulting in increased resale value should you decide to sell your laptop. Also, according to Battery University, setting a charging threshold to 80% might get you around 1500 charge cycles.

If the likes of Lenovo, Samsung and Sony (all running on Windows) provide support software to limit the charge threshold, there has to be some substance to this approach. Unfortunately, options are very limited when it comes to finding a similar official application for macOS. But all is not lost. Two open-source variants carry out the job satisfactorily:

  • AlDente
  • Charge Limiter

Both these apps modify the “Battery Charge Level Max” (BCLM) parameter in the SMC, which, when set, limits the charge level. The only thing to be aware of when using these applications is that sometimes the set charge limit can be wiped after a shutdown or restart. This is a minor annoyance I can live with. Out of the two, my preference was AlDente as I noticed the set charge limit didn’t get wiped as often when compared with Charge Limiter.

I’ll end this post with one final link from The Battery University on the best conditions to charge any battery - How To Charge and When To Charge.

My Work from Home Setup

It'll soon be coming up to a year working from home full-time due to the pandemic and I thought I'd write a post about my current setup as it has evolved over the months. What started as a bare, empty room with just a desk and chair has now become a fitting place to ensure maximum productivity and comfort.

I believe investing in a good home office setup is what can make working from home that little bit easier. Not everyone will be fortunate enough to have a single room dedicated to an office space, or afford all the niceties you see other bloggers write about or showcase on Instagram.

The most important part of any office is investing in a good desk and chair. Everything else is secondary. I can't stress how important this is. Working on something like a dining table can get uncomfortable very easily and this can be a big distraction in itself. Start small with the basics and, over time, work your way up and make improvements when you can. This is the approach I’ve taken.

In general, working from home over long periods can be a real chore and a good setup will help you stay healthier and focussed whilst working. Interestingly enough, The Atlantic wrote an article detailing why so many people are now experiencing medical problems after making the switch to working from home. A combination of long working hours, fewer breaks, stress and isolation is having a negative impact on all of us.

Desk

I’m quite particular about desks and prefer ones that are a little industrial looking and made from real material. None of that MDF or veneered manufactured stuff. I went for a desk made from Indian reclaimed mango wood, constructed on a sturdy steel frame. It certainly adds a bit of character to the office.

I’ve been told I should have opted for a standup desk for further health benefits, but I’m doing just fine as both my desk and chair are at the right height suitable for my posture.

Chair

I went for an Ikea Alefjall office chair that provides great support in a relatively small form factor. The seat and backrest are height adjustable. You also get support for your thighs and back through its depth adjustment along with tilt capability.

Monitor

Samsung Ultrawide 34 inch monitor

I managed to snap up a real bargain on an ultra-wide curved monitor in last year's Amazon Black Friday deals and am now the proud owner of a Samsung 34 inch ultra-wide beauty! This is a major upgrade over my Dell Ultrasharp, which is by no means a bad monitor, but I just felt I needed more screen real-estate.

Being Thunderbolt-compatible is a bonus as my MacBook Pro can charge and transmit data simultaneously over a single cable. Makes cable management that little bit easier.

Mouse

I have a Logitech MX Master and it’s the most comfortable mouse I’ve ever used. Fits very comfortably in the palm of your hand and is very customisable. I don’t generally like wireless mice as they can be fiddly to connect and I always question the usage time in between charges.

This mouse works for weeks and that's with me leaving it switched on all the time. When it comes to charging, just connect the cable and carry on using it.

Keyboard

I've been a big fan of mechanical keyboards and prefer them over Apple’s over-priced ones. You just can’t beat the nice responsive “clickity-clack” from every keypress. I’m still using the Ducky DK9008 Shine 2 my Dad got me in 2013. It’s still going strong, unlike the many Apple keyboards that have failed previously.

Just be careful whilst using it when on a Zoom call. You will notice how noisy it can come across. The amount of noise emitted by a mechanical keyboard depends on the type of switches used. You can get some really good mechanical keyboards across a variety of price points. If I didn’t already have one, I’d choose one from the range offered by Keychron.

Speaker

I have a Google Home Max smart speaker sat in the corner of the room that packs a real punch. Even though the speaker itself isn’t in close proximity to where my desk is, I can issue commands without having to raise my voice.

Google Home Max speaker

Plants

An office space can quite quickly look very sterile and I like a little bit of greenery, which is thought to improve productivity and relieve stress. I’m not sure if that’s true. All I know is it makes my working space that little bit nicer to be in. The plants I went for are very low maintenance and consist of:

  • Sansevieria: Known as “Mother-in-Law's Tongue” due to its sharp upright leaves. It emits oxygen and filters toxins from the air.
  • Succulents: Really cheap and small enough to fit into any space.
  • Orchid: Not so low maintenance. Looks very cool when alive though! Mine is currently making its way back from the dead.

What's Next?

I think I'm done for the moment. It'll be nice to get some LED strips to fix to my desk and behind my monitor for subtle accent lighting.

Passing A Search Query Without Using The SearchBox Component In Algolia

You probably haven't noticed (and you'd be forgiven if this is the case!) that my site now has the ability to search through posts. This is a strange turn of events for me as I decided to remove search capability from my site many years ago as I didn't feel it added any benefits for the user. This became evident from Google Analytics stats where searches never hit high enough numbers to warrant having it. The numbers don't lie!

So what caused this turnaround?

I've noticed that I'm regularly referring back through posts to refresh myself on things I've done in the past and to find solutions to issues I know I've previously written about. Having a search would make trawling through my few hundred posts a lot easier. So this is more of a personal requirement than commercial. But there is an exciting aspect to this as well - experimenting with Algolia. Using Algolia search, and integrating it with GatsbyJS, is something I've been meaning to look into for a long time.

The thought of having the good ol' magnifying glass back in the navigation makes me nostalgic!

Note: In this post, I won't be covering the basic Algolia setup or the plugins you need to install as there is already a great wealth of information online. Check out my "Useful Links" section at the end of the post.

Basic Setup

Integrating Algolia into GatsbyJS was relatively straightforward due to the wealth of information that others have already written and also the plugins themselves. The plugins make light work of rendering search results quickly, allowing enough customisation of the HTML markup for easy implementation within any site. By default, the plugins contain the following components:

  • InstantSearch
  • SearchBox
  • Hits

import algoliasearch from 'algoliasearch/lite';
import PropTypes from 'prop-types';
import { Link } from 'gatsby';
import { InstantSearch, Hits, Highlight, SearchBox } from 'react-instantsearch-dom';
import React from 'react';

// Get API keys from the environment file.
const appId = process.env.GATSBY_ALGOLIA_APP_ID;
const searchKey = process.env.GATSBY_ALGOLIA_SEARCH_KEY;
const searchClient = algoliasearch(appId, searchKey);

const SearchPage = () => (
  <InstantSearch
    searchClient={searchClient}
    indexName={process.env.GATSBY_ALGOLIA_INDEX_NAME}
  >
    <SearchBox />
    <Hits hitComponent={Hit} />
  </InstantSearch>
);

function Hit(props) {
  return (
    <article className="hentry post">
      <h3 className="entry-title">
        <Link to={props.hit.fields.slug}>
          <Highlight attribute="title" hit={props.hit} tagName="mark" />
        </Link>
      </h3>
      <div className="entry-meta">
        <span className="read-time">{props.hit.fields.readingTime.text}</span>
      </div>
      <p className="entry-content">
        <Highlight hit={props.hit} attribute="summary" tagName="mark" />
      </p>
    </article>
  );
}

Hit.propTypes = {
  hit: PropTypes.object.isRequired,
};

export default SearchPage;

InstantSearch is the core component that directly interacts with Algolia's API and takes in two properties: "searchClient", which is created from the Application ID and Search Key acquired from the Algolia account setup, and "indexName", the name of the index to query. This component contains two child components: SearchBox, the search textbox, and Hits, which displays results from the search query.

It is the Hits component where we can customise the HTML with our own markup by using its "hitComponent" attribute. In my case, I created a function to generate HTML where I access the properties from the search index. What's really cool here is we also have the ability to highlight our search term wherever it occurs in the results by using the Highlight component (also provided by the Algolia plugin) and adding a "tagName" attribute.
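
To picture what the highlighted output looks like, here is a minimal sketch that wraps each occurrence of a search term in a tag. This is only an illustration of the end result and is not how Algolia or the Highlight component work internally; the highlight and escapeRegExp functions are hypothetical helpers, and HTML escaping of the text is deliberately left out.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap every case-insensitive occurrence of "term" in the given tag.
function highlight(text, term, tagName = "mark") {
  if (!term) {
    return text;
  }
  const pattern = new RegExp(`(${escapeRegExp(term)})`, "gi");
  return text.replace(pattern, `<${tagName}>$1</${tagName}>`);
}

console.log(highlight("Azure Functions on Azure", "azure"));
// <mark>Azure</mark> Functions on <mark>Azure</mark>
```

The "tagName" attribute plays the same role as the tagName argument here: it decides which element surrounds the matched text, so the styling can be handled in CSS.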

Removing The SearchBox Component

The standard implementation may not suit all scenarios as you may want a search term to be sent to the InstantSearch component differently. For example, it could be from a custom search textbox or (as in my case) read from a query-string parameter. It wasn't until I started delving further into the standard setup I realised you cannot just remove the SearchBox component and pass a value directly, but there is a workaround.

I have expanded upon the code-snippet, above, to demonstrate how my search page works...

import algoliasearch from 'algoliasearch/lite';
import PropTypes from 'prop-types';
import { Link } from 'gatsby';
import { InstantSearch, Hits, Highlight, connectSearchBox } from 'react-instantsearch-dom';
import Layout from "../components/global/layout";
import React, { Component } from "react";

// Get API keys from the environment file.
const appId = process.env.GATSBY_ALGOLIA_APP_ID;
const searchKey = process.env.GATSBY_ALGOLIA_SEARCH_KEY;
const searchClient = algoliasearch(appId, searchKey);
const VirtualSearchBox = connectSearchBox(() => <span />);

class SearchPage extends Component { 
  state = {
    searchState: {
      query: '',
    },
  };

  componentDidMount() {   
    // Get "term" query string parameter value.
    let search = window.location.search;
    let params = new URLSearchParams(search);
    let searchTerm = params.get("term");

    // Send the query string value to a "searchState" object used by Algolia.
    this.setState(state => ({
      searchState: {
        ...state.searchState,
        query: searchTerm,
      },
    }));
  }

  render() {
      // Default "instantSearch" HTML to prompt user to enter a search term.
      var instantSearch = null;
      
      // If there is a search term, utilise Algolia's instant search.
      if (this.state.searchState.query) {
        instantSearch = <div className="entry-content">
                          <h2>You've searched for "{this.state.searchState.query}".</h2>
                          <div className="post-list archives-list">
                          <InstantSearch
                              searchClient={searchClient}
                              indexName={process.env.GATSBY_ALGOLIA_INDEX_NAME}
                              searchState={this.state.searchState}
                            >
                              <VirtualSearchBox />
                              <Hits hitComponent={Hit} />
                            </InstantSearch>  
                          </div>
                        </div>
      }
      else {
        instantSearch = <div className="entry-content">
                          <h2>You haven't entered a search term.</h2>
                          <p>Carry out a search by clicking the <em>magnifying glass</em> in the navigation.</p>
                        </div>
      }

      return (
        <Layout>
          <header className="page-header">
            <h1>Search</h1>
            <p>Search the knowledge-base...</p>
          </header>
          <div id="primary" className="content-area">
            <div id="content" className="site-content" role="main">
                <div className="layout-fixed">
                    <article className="page hentry">
                      {instantSearch}
                    </article>
                </div>
            </div>
          </div>
        </Layout>
    )
  }
}

function Hit(props) {
  return (
    <article className="hentry post">
      <h3 className="entry-title">
        <Link to={props.hit.fields.slug}>
          <Highlight attribute="title" hit={props.hit} tagName="mark" />
        </Link>
      </h3>
      <div className="entry-meta">
        <span className="read-time">{props.hit.fields.readingTime.text}</span>
      </div>
      <p className="entry-content">
        <Highlight hit={props.hit} attribute="summary" tagName="mark" />
      </p>
    </article>
  );
}

Hit.propTypes = {
  hit: PropTypes.object.isRequired,
};

export default SearchPage

My code is reading from a query-string value and passing that to a "searchState". The searchState object is created by React InstantSearch internally. Every widget inside the library has its own way of updating it. It contains parameters on the type of search that should be performed, such as query, sorting and pagination, to name a few. All we're interested in doing is updating the query parameter of this object with our search term.
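The query-string handling can be isolated into a small function, which makes it easier to see what ends up in the "searchState" object. This is only a sketch: buildSearchState is a hypothetical helper name that does not appear in my search page, and it assumes the parameter is called "term" as in the code above.

```javascript
// Hypothetical helper: turns a query string such as "?term=azure" into the
// object passed to <InstantSearch searchState={...} />.
function buildSearchState(queryString) {
  const params = new URLSearchParams(queryString);

  // params.get() returns null when "term" is absent, so normalise to "".
  const term = (params.get("term") || "").trim();

  return { query: term };
}

console.log(buildSearchState("?term=azure%20functions")); // { query: 'azure functions' }
console.log(buildSearchState("?page=2"));                 // { query: '' }
```

An empty query is what the render method checks for when deciding whether to show results or the prompt to enter a search term.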

If the query parameter from the "searchState" object contains a value, the search results are rendered. Otherwise, a message is displayed stating that a search term is required.

One thing to notice is the SearchBox has been replaced with a VirtualSearchBox, which uses the connector of the search box to create a virtual widget - in our case an empty span tag. This will link the InstantSearch component with the query. Having some form of search box component is compulsory.

Conclusion

I prefer not to use the out-of-the-box search box component because I can potentially save requests to Algolia's API: searches aren't made on the fly as a user types a search term, which is the plugin's default behaviour.

Passing a search term through a query-string may come across as a little backwards, especially when it's rather nice to see search results change before your eyes as you type letter-by-letter. However, search-as-you-type misses one key element: tracking in Google Analytics. Even though I will primarily be the person making the most use of my site search, it'll be interesting to see who else uses it and what search keywords are used.
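For completeness, here is a rough sketch of how a custom search box could build the URL that the search page reads. Both the buildSearchUrl name and the "/search" path are assumptions made for illustration, as they are not taken from my site's code. Only the "term" parameter matches the page above.

```javascript
// Hypothetical helper: "/search" is an assumed page path.
function buildSearchUrl(term, basePath = "/search") {
  // URLSearchParams takes care of encoding spaces and special characters.
  const params = new URLSearchParams({ term: term.trim() });

  return `${basePath}?${params.toString()}`;
}

console.log(buildSearchUrl("azure functions")); // /search?term=azure+functions
```

Because the search term is part of the page URL, every search shows up as a normal page view with the keyword attached.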

Useful Links

ASP.NET Core: Using Assembly Build Date For Cache Busting

ASP.NET Core contains a variety of useful Tag Helpers that enable server-side code to participate in creating and rendering HTML elements in our Views. One Tag Helper, in particular, can cache bust links to static resources such as images, CSS and JavaScript files when an asp-append-version="true" attribute is added.

The asp-append-version attribute automatically appends a version value to the file URL as a query-string parameter. The value is a SHA256 hash of the file, so whenever the file is updated, the server generates a new unique version. For a deeper understanding of how ASP.NET Core performs this piece of functionality, give the following StackOverflow post a read: How does javascript version (asp-append-version) work in ASP.NET Core MVC?.
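To illustrate the idea of a content-based version, here is a minimal Node.js sketch. It is not ASP.NET Core's implementation: appendVersion is a hypothetical function and the file paths are made up. It only shows that hashing the file content gives a value that changes when the file changes.

```javascript
const { createHash } = require("crypto");

// Hypothetical helper, not ASP.NET Core's code: derives a version token from
// the file content and appends it as a "v" query-string parameter.
function appendVersion(path, content) {
  const token = createHash("sha256")
    .update(content)
    .digest("base64")
    // Make the Base64 output safe to use inside a URL.
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");

  return `${path}?v=${token}`;
}

const before = appendVersion("/js/site.js", "console.log('v1');");
const after = appendVersion("/js/site.js", "console.log('v2');");

console.log(before === after); // false
```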

This approach works perfectly if you're linking to your static resources using the relevant HTML tag, for example img, script or link. In my scenario, I'm using a JavaScript library called LabJS - a dynamic script loader that gives the ability to control the loading and execution of different plugins. For example:

<script>
  $LAB
  .script("http://remote.tld/jquery.js").wait()
  .script("/local/plugin1.jquery.js")
  .script("/local/plugin2.jquery.js").wait()
  .script("/local/init.js").wait(function(){
      initMyPage();
  });
</script>

I need to be able to append a query string parameter to one of the JavaScript file references. One thing that came to mind was to use the application's last build time as the cache busting value. Whenever the application is updated, this value will automatically be updated, so no manual intervention is required.
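In plain JavaScript terms, the goal boils down to something like the sketch below. withCacheBust is a hypothetical helper used purely for illustration, and the parameter name "v" is my own choice.

```javascript
// Hypothetical helper: appends a cache busting value to a script URL.
function withCacheBust(url, version) {
  // Use "&" if the URL already has a query string, otherwise start one.
  const separator = url.includes("?") ? "&" : "?";

  return `${url}${separator}v=${encodeURIComponent(version)}`;
}

console.log(withCacheBust("/local/init.js", 1609610821));
// /local/init.js?v=1609610821
console.log(withCacheBust("/local/init.js?debug=1", 1609610821));
// /local/init.js?debug=1&v=1609610821
```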

I found code examples on meziantou.net that demonstrated various approaches to acquiring an application's build date. I modified the "Linker timestamp" example to return a Unix timestamp in a newly created class called AssemblyUtils.

using System;
using System.IO;
using System.Reflection;

public class AssemblyUtils
{
    #region Properties

    public int UnixTimestamp { get; set; }

    #endregion

    /// <summary>
    /// Get timestamp in Unix seconds for the last build.
    /// </summary>
    /// <returns></returns>
    public static int GetBuildTimestamp()
    {
        const int peHeaderOffset = 60;
        const int timestampOffset = 8;

        byte[] bytes = new byte[2048];

        using (FileStream file = new FileStream(Assembly.GetExecutingAssembly().Location, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
            file.Read(bytes, 0, bytes.Length);

        int headerPos = BitConverter.ToInt32(bytes, peHeaderOffset);
        int unixTime = BitConverter.ToInt32(bytes, headerPos + timestampOffset);

        return unixTime;
    }
}
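If the two offsets look like magic numbers, the following Node.js sketch performs the same read against a synthetic buffer. It is an illustration only: readPeTimestamp is a hypothetical function, and the buffer is hand-built rather than a real assembly. Offset 60 holds the position of the PE header, and the timestamp sits 8 bytes into that header (after the 4-byte signature and two 2-byte fields).

```javascript
// Hypothetical helper mirroring the offsets used in GetBuildTimestamp().
function readPeTimestamp(bytes) {
  const peHeaderOffset = 60; // Where the file stores the PE header position.
  const timestampOffset = 8; // Signature (4) + Machine (2) + NumberOfSections (2).

  const headerPos = bytes.readInt32LE(peHeaderOffset);

  return bytes.readUInt32LE(headerPos + timestampOffset);
}

// Synthetic 2 KB buffer standing in for the start of an assembly file.
const fakeAssembly = Buffer.alloc(2048);
fakeAssembly.writeInt32LE(128, 60);               // PE header starts at byte 128.
fakeAssembly.writeUInt32LE(1609610821, 128 + 8);  // Timestamp inside the header.

console.log(readPeTimestamp(fakeAssembly)); // 1609610821
```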

The code will only return the actual build time if your .csproj file (Visual Studio 15.4 onwards) includes the following setting within the <PropertyGroup> settings. With deterministic builds enabled, the value stored in the header is not the real build time.

<Deterministic>False</Deterministic>

It would be a waste to call the GetBuildTimestamp() method directly within the page View on every request, when a better approach is to make this call once on application startup.

public void ConfigureServices(IServiceCollection services)
{
    #region Assembly Utils - Build Time

    Action<AssemblyUtils> assemblyBuildOptions = (opt =>
    {
        opt.UnixTimestamp = AssemblyUtils.GetBuildTimestamp();
    });

    services.Configure(assemblyBuildOptions);
    services.AddSingleton(resolver => resolver.GetRequiredService<IOptions<AssemblyUtils>>().Value);

    #endregion
}

We can access the build timestamp value using Dependency Injection within a base controller that gets inherited by all controllers.

public class BaseController : Controller
{
    private readonly int _buildTimestamp;

    public BaseController(AssemblyUtils assemblyUtils)
    {
        _buildTimestamp = assemblyUtils.UnixTimestamp;
    }

    public override void OnActionExecuting(ActionExecutingContext context)
    {
        base.OnActionExecuting(context);

        // Assign build timestamp to a View Bag.
        ViewBag.CacheBustingValue = _buildTimestamp;
    }
}

The timestamp is assigned to a ViewBag property that can then be accessed at View level.

<script>
  $LAB
  .script("http://remote.tld/jquery.js").wait()
  .script("/local/plugin1.jquery.js")
  .script("/local/plugin2.jquery.js").wait()
  .script("/local/init.js?v=@ViewBag.CacheBustingValue").wait(function(){
      initMyPage();
  });
</script>

This will result in the following output:

<script>
  $LAB
  .script("http://remote.tld/jquery.js").wait()
  .script("/local/plugin1.jquery.js")
  .script("/local/plugin2.jquery.js").wait()
  .script("/local/init.js?v=1609610821").wait(function(){
      initMyPage();
  });
</script>
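The value in the query string is a Unix timestamp in seconds, so it can be decoded to check when the application was last built. A small sketch (dateFromUnixSeconds is a hypothetical helper name):

```javascript
// Hypothetical helper: converts Unix seconds into a JavaScript Date.
function dateFromUnixSeconds(seconds) {
  // Date expects milliseconds, so multiply by 1000.
  return new Date(seconds * 1000);
}

console.log(dateFromUnixSeconds(1609610821).toISOString());
// 2021-01-02T18:07:01.000Z
```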