Azure Function: 404 Page Checker
Sometimes the simplest piece of development can be the most rewarding, and I think my Azure Function that checks for broken links on a nightly basis is one of those things. The Azure Function reads a list of links from a database table and checks whether each one returns a 200 response. If not, the link is logged and sent to a user by email using the SendGrid API.
Scenario
I was working on a project that takes a list of products from an API and stores them in a HubSpot HubDB table. This table contained all product information and the expected URL to a page. All the CMS pages had to be created manually and assigned the URL as stored in the table, which in turn would allow the page to be populated with product data.
As you might expect, the disadvantage of manually created pages is that a URL change in the HubDB table will result in a broken page. Not ideal! In this case, a URL change is unlikely. All I needed was a checker to make me aware of the odd occasion when a link to a product page is broken.
I won't go into any further detail but rest assured, there was an entirely legitimate reason for this approach in the grand scheme of the project.
Azure Function
I have modified my original code purely for simplification.
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using SendGrid;
using SendGrid.Helpers.Mail;

namespace ProductsSyncApp
{
    public static class ProductLinkChecker
    {
        [FunctionName("ProductLinkChecker")]
        public static async Task Run([TimerTrigger("%ProductLinkCheckerCronTime%"
#if DEBUG
            , RunOnStartup=true
#endif
            )]TimerInfo myTimer, ILogger log)
        {
            log.LogInformation($"Product Link Checker started at: {DateTime.Now:G}");

            #region Iterate through all product links and collect the ones that do not return 200 OK.
            List<string> brokenProductLinks = new List<string>();

            foreach (string link in GetProductLinks())
            {
                if (!IsEndpointAvailable(link))
                    brokenProductLinks.Add(link);
            }
            #endregion

            #region Send Email
            if (brokenProductLinks.Count > 0)
            {
                // Await the send instead of blocking on Task.Run(...).Result.
                Response response = await SendEmailAsync(
                    Environment.GetEnvironmentVariable("Sendgrid.FromEmailAddress"),
                    Environment.GetEnvironmentVariable("Sendgrid.ToAddress"),
                    "www.contoso.com - Broken Link Report",
                    EmailBody(brokenProductLinks));

                log.LogInformation($"Broken link report sent. SendGrid responded with: {response.StatusCode}");
            }
            #endregion

            log.LogInformation($"Product Link Checker ended at: {DateTime.Now:G}");
        }

        /// <summary>
        /// Gets the list of product links.
        /// This would come from a datasource somewhere containing a list of correctly expected URLs.
        /// </summary>
        /// <returns>The URLs that are expected to exist on the site.</returns>
        private static List<string> GetProductLinks()
        {
            return new List<string>
            {
                "https://www.contoso.com/product/brokenlink1",
                "https://www.contoso.com/product/brokenlink2",
                "https://www.contoso.com/product/brokenlink3",
            };
        }

        /// <summary>
        /// Checks if a URL endpoint is available.
        /// </summary>
        /// <param name="url">The absolute URL to request.</param>
        /// <returns>True if the URL returns 200 OK, otherwise false.</returns>
        private static bool IsEndpointAvailable(string url)
        {
            try
            {
                HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

                using HttpWebResponse response = (HttpWebResponse)request.GetResponse();

                return response.StatusCode == HttpStatusCode.OK;
            }
            catch
            {
                // GetResponse() throws for 4xx/5xx responses and network failures,
                // so any exception is treated as a broken link.
                return false;
            }
        }

        /// <summary>
        /// Creates the email body.
        /// </summary>
        /// <param name="brokenLinks">The links that failed the check.</param>
        /// <returns>The HTML body of the report.</returns>
        private static string EmailBody(List<string> brokenLinks)
        {
            StringBuilder body = new StringBuilder();

            body.Append("<p>To whom it may concern,</p>");
            body.Append("<p>The following product URLs are broken:</p>");
            body.Append("<ul>");

            foreach (string link in brokenLinks)
                body.Append($"<li>{WebUtility.HtmlEncode(link)}</li>");

            body.Append("</ul>");
            body.Append("<p>Many thanks.</p>");

            return body.ToString();
        }

        /// <summary>
        /// Sends an email through SendGrid.
        /// </summary>
        /// <param name="fromAddress">The sender's email address.</param>
        /// <param name="toAddress">The recipient's email address.</param>
        /// <param name="subject">The email subject.</param>
        /// <param name="body">The HTML body of the email.</param>
        /// <returns>The response returned by the SendGrid API.</returns>
        private static Task<Response> SendEmailAsync(string fromAddress, string toAddress, string subject, string body)
        {
            SendGridClient client = new SendGridClient(Environment.GetEnvironmentVariable("SendGrid.ApiKey"));

            SendGridMessage sendGridMessage = new SendGridMessage
            {
                From = new EmailAddress(fromAddress, "Product Link Report"),
            };
            sendGridMessage.AddTo(toAddress);
            sendGridMessage.SetSubject(subject);
            sendGridMessage.AddContent("text/html", body);

            return client.SendEmailAsync(sendGridMessage);
        }
    }
}
Here's a rundown on what is happening:
- A list of links is returned from the GetProductLinks() method. This will contain a list of correct links that should be accessible on the website.
- Loop through all the links and carry out a check against the IsEndpointAvailable() method. This method carries out a simple check to see if the link returns a 200 response. If not, it'll be marked as broken.
- Add any link marked as broken to the brokenProductLinks collection.
- If there are broken links, send an email handled by SendGrid.
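As an aside, the same 200-response check can also be written against HttpClient, which is the API Microsoft now recommends over HttpWebRequest. The following is only a minimal sketch of that alternative, not the code used in the function above: LinkChecker, FindBrokenLinksAsync and StubHandler are hypothetical names, and StubHandler is a stand-in for the network so the example can run without touching a real site.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class LinkChecker
{
    // Returns true only when the URL answers with 200 OK.
    // Request failures, timeouts and malformed URLs all count as broken.
    public static async Task<bool> IsEndpointAvailableAsync(HttpClient client, string url)
    {
        try
        {
            // ResponseHeadersRead avoids downloading the whole page body.
            using HttpResponseMessage response =
                await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);

            return response.StatusCode == HttpStatusCode.OK;
        }
        catch (Exception ex) when (ex is HttpRequestException
                                   || ex is TaskCanceledException
                                   || ex is InvalidOperationException
                                   || ex is UriFormatException)
        {
            return false;
        }
    }

    // Checks each link in turn and returns the ones that failed.
    public static async Task<List<string>> FindBrokenLinksAsync(HttpClient client, IEnumerable<string> links)
    {
        var broken = new List<string>();

        foreach (string link in links)
        {
            if (!await IsEndpointAvailableAsync(client, link))
                broken.Add(link);
        }

        return broken;
    }
}

// Hypothetical test double: answers 404 for any path containing "brokenlink"
// and 200 for everything else, without making a network call.
public sealed class StubHandler : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        string path = request.RequestUri?.AbsolutePath ?? string.Empty;
        HttpStatusCode status = path.Contains("brokenlink")
            ? HttpStatusCode.NotFound
            : HttpStatusCode.OK;

        return Task.FromResult(new HttpResponseMessage(status));
    }
}
```

In a real function you would pass in a single shared HttpClient (a static field, for example) rather than one built on StubHandler, since creating a new client per request is generally discouraged.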
As you can see, the Azure Function itself is very simple, and the only thing that needs to be customised for your use is the GetProductLinks() method, which will need to output a list of expected links that a site should contain for cross-referencing.
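To give a rough idea of what that customisation might look like, here is a small sketch that turns rows from a data source into the list of absolute URLs to check. It makes a few assumptions: ProductRow and its UrlPath property are hypothetical stand-ins for whatever your data source returns (they are not part of the HubDB API), and the base URL is the same contoso.com placeholder used above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a row in the product table.
public sealed class ProductRow
{
    public string Name { get; set; } = string.Empty;
    public string UrlPath { get; set; } = string.Empty;
}

public static class ProductLinks
{
    // Builds the distinct absolute URLs for all rows that have a URL path.
    public static List<string> FromRows(IEnumerable<ProductRow> rows, string baseUrl)
    {
        var baseUri = new Uri(baseUrl);

        return rows
            .Where(row => !string.IsNullOrWhiteSpace(row.UrlPath))
            // new Uri(base, relative) resolves the path against the site root.
            .Select(row => new Uri(baseUri, row.UrlPath.Trim()).AbsoluteUri)
            .Distinct()
            .ToList();
    }
}
```

GetProductLinks() could then fetch the rows however suits your project and return ProductLinks.FromRows(rows, "https://www.contoso.com").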
Email Send Out
When using Azure Functions, you can't use the standard .NET approach to send emails, and Microsoft recommends using an authenticated SMTP relay service, which reduces the likelihood of email providers rejecting the message. More insight into this can be found in the following Stack Overflow post: Not able to connect to smtp from Azure Cloud Service.
When it comes to SMTP relay services, SendGrid comes up favourably, and as someone who uses it in their current workplace, I was naturally inclined to use it in my Azure Function. Plus, they've made things easy by providing a NuGet package that allows direct access to their Web API v3 endpoints.