Azure Function: 404 Page Checker

Sometimes the simplest piece of development can be the most rewarding, and I think my Azure Function that checks for broken links on a nightly basis is one of those things. The Azure Function reads a list of links from a database table and checks whether each one returns a 200 response. If not, the link is logged and sent to a user by email using the SendGrid API.

Scenario

I was working on a project that takes a list of products from an API and stores them in a HubSpot HubDB table. This table contained all product information and the expected URL to a page. All the CMS pages had to be created manually and assigned the URL as stored in the table, which in turn would allow the page to be populated with product data.

As you can expect, the disadvantage of manually created pages is that a URL change in the HubDB table will result in a broken page. Not ideal! In this case, a URL change is rare. All I needed was a checker to make me aware of the odd occasion where a link to a product page is broken.

I won't go into any further detail but rest assured, there was an entirely legitimate reason for this approach in the grand scheme of the project.

Azure Function

I have modified my original code purely for simplification.

using System;
using System.Collections.Generic;
using System.Net;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using SendGrid;
using SendGrid.Helpers.Mail;

namespace ProductsSyncApp
{
  public static class ProductLinkChecker
  {
    [FunctionName("ProductLinkChecker")]
    public static void Run([TimerTrigger("%ProductLinkCheckerCronTime%"
      #if DEBUG
      , RunOnStartup=true
      #endif
      )]TimerInfo myTimer, ILogger log)
    {
      log.LogInformation($"Product Link Checker started at: {DateTime.Now:G}");

      #region Iterate through all product links and output the ones that return 404.

      List<string> brokenProductLinks = new List<string>();

      foreach (string link in GetProductLinks())
      {
        if (!IsEndpointAvailable(link))
          brokenProductLinks.Add(link);
      }

      #endregion

      #region Send Email

      if (brokenProductLinks.Count > 0)
        SendEmail(Environment.GetEnvironmentVariable("Sendgrid.FromEmailAddress"), Environment.GetEnvironmentVariable("Sendgrid.ToAddress"), "www.contoso.com - Broken Link Report", EmailBody(brokenProductLinks));

      #endregion

      log.LogInformation($"Product Link Checker ended at: {DateTime.Now:G}");
    }

    /// <summary>
    /// Gets a list of product links.
    /// This would come from a data source containing the list of expected URLs.
    /// </summary>
    /// <returns></returns>
    private static List<string> GetProductLinks()
    {
      return new List<string>
      {
        "https://www.contoso.com/product/brokenlink1",
        "https://www.contoso.com/product/brokenlink2",
        "https://www.contoso.com/product/brokenlink3",
      };
    }

    /// <summary>
    /// Checks if a URL endpoint is available.
    /// </summary>
    /// <param name="url"></param>
    /// <returns></returns>
    private static bool IsEndpointAvailable(string url)
    {
      try
      {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        using HttpWebResponse response = (HttpWebResponse)request.GetResponse();

        if (response.StatusCode == HttpStatusCode.OK)
          return true;

        return false;
      }
      catch
      {
        return false;
      }
    }

    /// <summary>
    /// Create the email body.
    /// </summary>
    /// <param name="brokenLinks"></param>
    /// <returns></returns>
    private static string EmailBody(List<string> brokenLinks)
    {
      StringBuilder body = new StringBuilder();

      body.Append("<p>To whom it may concern,</p>");
      body.Append("<p>The following product URLs are broken:</p>");

      body.Append("<ul>");

      foreach (string link in brokenLinks)
        body.Append($"<li>{link}</li>");

      body.Append("</ul>");

      body.Append("<p>Many thanks.</p>");

      return body.ToString();
    }

    /// <summary>
    /// Send email through SendGrid.
    /// </summary>
    /// <param name="fromAddress"></param>
    /// <param name="toAddress"></param>
    /// <param name="subject"></param>
    /// <param name="body"></param>
    /// <returns></returns>
    private static Response SendEmail(string fromAddress, string toAddress, string subject, string body)
    {
      SendGridClient client = new SendGridClient(Environment.GetEnvironmentVariable("SendGrid.ApiKey"));

      SendGridMessage sendGridMessage = new SendGridMessage
      {
        From = new EmailAddress(fromAddress, "Product Link Report"),
      };

      sendGridMessage.AddTo(toAddress);
      sendGridMessage.SetSubject(subject);
      sendGridMessage.AddContent("text/html", body);

      return Task.Run(() => client.SendEmailAsync(sendGridMessage)).Result;
    }
  }
}

Here's a rundown on what is happening:

  1. A list of links is returned from the GetProductLinks() method. This will contain a list of correct links that should be accessible on the website.
  2. Loop through all the links and carry out a check against the IsEndpointAvailable() method. This method carries out a simple check to see if the link returns a 200 response. If not, it'll be marked as broken.
  3. Add any link marked as broken to the brokenProductLinks collection.
  4. If there are broken links, send an email handled by SendGrid.

As you can see, the code itself is very simple and the only thing that needs to be customised for your use is the GetProductLinks method, which will need to output a list of expected links that a site should contain for cross-referencing.
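HttpWebRequest still works, but it is marked obsolete in recent .NET versions, so the availability check could also be written against HttpClient. The following is a minimal sketch rather than my original implementation: the LinkCheck class and the IsEndpointAvailableAsync name are hypothetical, and the HttpClient is passed in so that one instance can be shared across all the links.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class LinkCheck
{
    // Hypothetical async counterpart to IsEndpointAvailable():
    // true only for a 200 OK, false for any other status or a failed request.
    public static async Task<bool> IsEndpointAvailableAsync(HttpClient client, string url)
    {
        try
        {
            // ResponseHeadersRead avoids downloading the body; only the status is needed.
            using HttpResponseMessage response =
                await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);

            return response.StatusCode == HttpStatusCode.OK;
        }
        catch (Exception ex) when (ex is HttpRequestException        // DNS or connection failures
                                || ex is TaskCanceledException       // timeouts
                                || ex is UriFormatException          // malformed URL
                                || ex is InvalidOperationException)  // relative URL without a base address
        {
            return false;
        }
    }
}
```

With this in place, the Run method would need to become async (returning Task) so each check can be awaited instead of blocked on.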

Email Send Out

When using Azure Functions, you can't use the standard .NET approach to send emails. Microsoft recommends using an authenticated SMTP relay service, which reduces the likelihood of email providers rejecting the message. More insight into this can be found in the following Stack Overflow post: Not able to connect to smtp from Azure Cloud Service.

When it comes to SMTP relay services, SendGrid comes up favourably, and as someone who uses it in their current workplace, it was my natural inclination to make use of it in my Azure Function. Plus, they've made things easy by providing a NuGet package that gives direct access to their Web API v3 endpoints.

ASP.NET Core: System.FormatException: Could not parse the JSON file

Another day, another ASP.NET Core error... This time relating to JSON not being parsable. Like the error I posted yesterday, this was another strange one as it only occurred within an Azure environment.

Let me start by showing the error that was logged at startup:

Application '/LM/W3SVC/144182150/ROOT' with physical root 'D:\home\site\wwwroot\' hit unexpected managed exception, exception code = '0xe0434352'. First 30KB characters of captured stdout and stderr logs:
Unhandled exception. System.FormatException: Could not parse the JSON file.
 ---> System.Text.Json.JsonReaderException: '0x00' is an invalid start of a value. LineNumber: 0 | BytePositionInLine: 0.
   at System.Text.Json.ThrowHelper.ThrowJsonReaderException(Utf8JsonReader& json, ExceptionResource resource, Byte nextByte, ReadOnlySpan`1 bytes)
   at System.Text.Json.Utf8JsonReader.ConsumeValue(Byte marker)
   at System.Text.Json.Utf8JsonReader.ReadFirstToken(Byte first)
   at System.Text.Json.Utf8JsonReader.ReadSingleSegment()
   at System.Text.Json.Utf8JsonReader.Read()
   at System.Text.Json.JsonDocument.Parse(ReadOnlySpan`1 utf8JsonSpan, Utf8JsonReader reader, MetadataDb& database, StackRowStack& stack)
   at System.Text.Json.JsonDocument.Parse(ReadOnlyMemory`1 utf8Json, JsonReaderOptions readerOptions, Byte[] extraRentedBytes)
   at System.Text.Json.JsonDocument.Parse(ReadOnlyMemory`1 json, JsonDocumentOptions options)
   at System.Text.Json.JsonDocument.Parse(String json, JsonDocumentOptions options)
   at Microsoft.Extensions.Configuration.Json.JsonConfigurationFileParser.ParseStream(Stream input)
   at Microsoft.Extensions.Configuration.Json.JsonConfigurationFileParser.Parse(Stream input)
   at Microsoft.Extensions.Configuration.Json.JsonConfigurationProvider.Load(Stream stream)
   --- End of inner exception stack trace ---
   at Microsoft.Extensions.Configuration.Json.JsonConfigurationProvider.Load(Stream stream)
   at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load(Boolean reload)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Extensions.Configuration.FileConfigurationProvider.HandleException(ExceptionDispatchInfo info)
   at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load(Boolean reload)
   at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load()
   at Microsoft.Extensions.Configuration.ConfigurationRoot..ctor(IList`1 providers)
   at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
   at Microsoft.Extensions.Logging.AzureAppServices.SiteConfigurationProvider.GetAzureLoggingConfiguration(IWebAppContext context)
   at Microsoft.Extensions.Logging.AzureAppServicesLoggerFactoryExtensions.AddAzureWebAppDiagnostics(ILoggingBuilder builder, IWebAppContext context)
   at Microsoft.Extensions.Logging.AzureAppServicesLoggerFactoryExtensions.AddAzureWebAppDiagnostics(ILoggingBuilder builder)
   at Microsoft.AspNetCore.Hosting.AppServicesWebHostBuilderExtensions.<>c.<UseAzureAppServices>b__0_0(ILoggingBuilder builder)
   at Microsoft.Extensions.DependencyInjection.LoggingServiceCollectionExtensions.AddLogging(IServiceCollection services, Action`1 configure)
   at Microsoft.AspNetCore.Hosting.WebHostBuilderExtensions.<>c__DisplayClass8_0.<ConfigureLogging>b__0(IServiceCollection collection)
   at Microsoft.AspNetCore.Hosting.HostingStartupWebHostBuilder.<>c__DisplayClass6_0.<ConfigureServices>b__0(WebHostBuilderContext context, IServiceCollection services)
   at Microsoft.AspNetCore.Hosting.HostingStartupWebHostBuilder.ConfigureServices(WebHostBuilderContext context, IServiceCollection services)
   at Microsoft.AspNetCore.Hosting.GenericWebHostBuilder.<.ctor>b__5_2(HostBuilderContext context, IServiceCollection services)
   at Microsoft.Extensions.Hosting.HostBuilder.CreateServiceProvider()
   at Microsoft.Extensions.Hosting.HostBuilder.Build()
   at Site.Web.Program.Main(String[] args) in C:\Development\surinder-main-website\Site.Web\Program.cs:line 11

Process Id: 2588.
File Version: 13.1.20169.6. Description: IIS ASP.NET Core Module V2 Request Handler. Commit: 62c098bc170f50feca15916e81cb7f321ffc52ff

The application was not consuming any form of JSON as part of its main functionality. The only JSON files in use were three variations of appsettings.json, one each for development, staging and production. So this had to be the source of the issue. The stack trace also confirmed this, as it references Program.cs, which is where the application startup code is run.

My first thought was that I must have forgotten a comma or missed a closing quote for one of my values. But after running the JSON through a validator, it passed with flying colours.

Solution

After some investigation, the issue turned out to be the encoding of the file. All the appsettings.json files were set to "UTF-8", which possibly caused some metadata to be added that stopped the application from reading the files. Once this was changed to "UTF-8-BOM" through Notepad++, everything worked fine.
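Since the exception reports '0x00' at byte position 0, it can help to look at the raw bytes at the start of the deployed file rather than at its text. The following is a minimal sketch for doing that; EncodingProbe and DescribeLeadingBytes are hypothetical names, and the check only recognises the UTF-8 and UTF-16 byte order marks.

```csharp
using System;

public static class EncodingProbe
{
    // Describes the first bytes of a file so a byte order mark (BOM)
    // or a stray null byte can be spotted. UTF-32 is not distinguished.
    public static string DescribeLeadingBytes(byte[] bytes)
    {
        if (bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF)
            return "UTF-8 with BOM";

        if (bytes.Length >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
            return "UTF-16 LE BOM";

        if (bytes.Length >= 2 && bytes[0] == 0xFE && bytes[1] == 0xFF)
            return "UTF-16 BE BOM";

        if (bytes.Length >= 1 && bytes[0] == 0x00)
            return "starts with 0x00";

        return "no BOM";
    }
}
```

It could be called with the contents of the deployed file, for example EncodingProbe.DescribeLeadingBytes(File.ReadAllBytes("appsettings.json")), to compare what is on the server against the local copy.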

ASP.NET Core: Failed to start application '/LM/W3SVC/####/ROOT', ErrorCode '0x8007023e'.

You gotta love .NET Core startup errors! They provide the most ambiguous error messages known to man. I have noticed the error message and accompanying error code could be caused by a multitude of factors. This error is no different, so I’ll make my contribution, hoping this may help someone else.

The error in question occurred out of nowhere whilst deploying a minor HTML update to a .NET Core site I was hosting within an Azure Web App. It couldn’t have been a simpler release - a change to some markup in a View. When the site loaded, I was greeted with the following error:

Failed to start application '/LM/W3SVC/####/ROOT', ErrorCode '0x8007023e'.

I was able to get some further information about the error from the Event Log:

Application 'D:\home\site\wwwroot\' failed to start. Exception message:
Executable was not found at 'D:\home\site\wwwroot\%LAUNCHER_PATH%.exe'
Process Id: 10848.
File Version: 13.1.19331.0. Description: IIS ASP.NET Core Module V2.

The error could only be reproduced on Azure and not within my local development and staging environments. I created a new deployment slot to check if somehow my existing slot got corrupted. Unfortunately, this made no difference. The strange thing is, the application was working completely fine up until this release. It's still unknown to me what could have happened for this error to occur all of a sudden.

Solution

Googling the error message and error code suggested that no one else on the planet had experienced this issue. After a lot of fumbling around, the fix ended up being relatively straightforward. The detail provided by the Event Log pointed me in the right direction and the clue was in the %LAUNCHER_PATH% placeholder. The %LAUNCHER_PATH% placeholder is set in the web.config and this is normally replaced when the application is run in Visual Studio or IIS.

In Azure, both %LAUNCHER_PATH% and %LAUNCHER_ARGS% variables need to be explicitly set. The following line in the web.config needs to be changed from:

<aspNetCore processPath="%LAUNCHER_PATH%" arguments="%LAUNCHER_ARGS%" stdoutLogEnabled="false" stdoutLogFile=".\logs\stdout" forwardWindowsAuthToken="false" startupTimeLimit="3600" requestTimeout="23:00:00" hostingModel="InProcess">

To:

<aspNetCore processPath=".\Site.Web.exe" arguments="" stdoutLogEnabled="false" stdoutLogFile=".\logs\stdout" forwardWindowsAuthToken="false" startupTimeLimit="3600" requestTimeout="23:00:00" hostingModel="InProcess">

The processPath is now pointing to the executable generated by the project. In this case, "Site.Web.exe". Also, since no arguments are being passed in my build, the arguments attribute is left empty. When you push up your next release, the error should be rectified.

As a side note, there was one thing recommended to me by Azure support regarding my publish settings in Visual Studio. It was recommended that I should set the deployment mode from "Framework-Dependent" to "Self-Contained". This will ensure the application will always run in its current framework version on the off-chance framework changes happen at an Azure level.
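A small check along these lines could be run against the published web.config before a release to catch placeholders that were never replaced. This is a sketch and not part of the original fix: WebConfigCheck and FindUnreplacedPlaceholders are hypothetical names, and it only inspects the attributes of the aspNetCore element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class WebConfigCheck
{
    // Returns "attribute=value" for every aspNetCore attribute
    // whose value still contains a %PLACEHOLDER% token.
    public static List<string> FindUnreplacedPlaceholders(string webConfigXml)
    {
        XDocument doc = XDocument.Parse(webConfigXml);

        return doc.Descendants("aspNetCore")
                  .Attributes()
                  .Where(a => Regex.IsMatch(a.Value, "%[A-Z_]+%"))
                  .Select(a => $"{a.Name.LocalName}={a.Value}")
                  .ToList();
    }
}
```

An empty list means the processPath and arguments attributes hold real values; anything else points at the attribute that still needs setting explicitly.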

Diagnosing SQL71562 Error When Deploying An Azure SQL Database

When using the “Deploy to Azure Database” option in Microsoft SQL Server Management Studio to move a database to Azure, you may sometimes come across the following error:

Error SQL71562: Error validating element [dbo].[f]: Synonym: [dbo].[f] has an unresolved reference to object [server].[database_name].[table_name]. External references are not supported when creating a package from this platform.

These types of errors are generated because you cannot set up a linked server in Azure, and queries using four-part [server].[database].[schema].[table] references are not supported. I’ve come across a SQL71562 error in the past, but this one was different. Generally, the error details are a lot more helpful and relate to stored procedures or views where a table path contains the database name:

Error SQL71562: Procedure: [dbo].[store_procedure_name] has an unresolved reference to object [database_name].[dbo].[table_name]

Easy enough to resolve. The error I was getting this time threw me, as it didn’t point me to the database object where the conflict resides, so I would have to look through all possible database objects. This would be easy enough to do manually on a small database, but not on a large database consisting of over 50 stored procedures and 30 views. Thankfully, SQL to the rescue...

To search across all stored procedures and views, you can use the LIKE operator to search the definitions of the database’s system objects, based on the details you can gather from the error message:

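-- Replace database_name with the name taken from the error message.
-- In a LIKE pattern, [ starts a character set, so a literal opening bracket is written as [[].

-- Stored Procedures
SELECT OBJECT_NAME(object_id),
       OBJECT_DEFINITION(object_id)
FROM sys.procedures
WHERE OBJECT_DEFINITION(object_id) LIKE '%[[]database_name]%'

-- Views
SELECT OBJECT_NAME(object_id),
       OBJECT_DEFINITION(object_id)
FROM sys.views
WHERE OBJECT_DEFINITION(object_id) LIKE '%[[]database_name]%'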
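One thing to watch when searching for a bracketed name: in a T-SQL LIKE pattern, square brackets denote a character set and an underscore matches any single character, so a bracketed name used as-is matches far more than the literal text. If the search term is being built in code, a helper along these lines can escape it first. This is only a sketch, and SqlLike and EscapeLiteral are hypothetical names.

```csharp
using System;
using System.Text;

public static class SqlLike
{
    // Wraps each T-SQL LIKE wildcard ([, % and _) in square brackets
    // so the text is matched literally. A closing ] needs no escaping.
    public static string EscapeLiteral(string text)
    {
        var escaped = new StringBuilder();

        foreach (char c in text)
        {
            if (c == '[' || c == '%' || c == '_')
                escaped.Append('[').Append(c).Append(']');
            else
                escaped.Append(c);
        }

        return escaped.ToString();
    }
}
```

The escaped value would then be wrapped in % on both sides and passed to the query as a parameter rather than concatenated into the SQL text.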

Azure WebJob To Delete Old Files from A Blob Container

I like to keep my blob containers quite tidy and delete any files that would unnecessarily increase their size. For a project I was working on, I had a blob container that was being used to temporarily store images a user uploaded for manipulation at a later time. I saw no reason to keep these files for longer than 24 hours. An Azure WebJob seemed an ideal solution for this.

I could've left the blob container to stagnate and fester over time, and the reasoning behind creating a cleanup task wasn't cost. A blob container is very reasonably priced for the amount of storage and requests I would be making. I was more concerned about performance at times when I would be trawling through many thousands of files to get back the image a user had uploaded for temporary use by my web application.

Creating an Azure WebJob is very easy and versatile. You have the flexibility to develop a WebJob by creating the following scripts or programs:

  • .cmd, .bat, .exe (using windows cmd)
  • .ps1 (using powershell)
  • .sh (using bash)
  • .php (using php)
  • .py (using python)
  • .js (using node)
  • .jar (using java)

In this post, I will be developing my WebJob using a Console Application that will generate an executable. In Visual Studio 2017, there are two ways you can go about creating a project for your WebJob:

  1. Console Application project
  2. Selecting Azure WebJob project - which you will find under the "Cloud" category.

If you create your WebJob using a Console Application, you will still have the option later on to "Publish as an Azure WebJob..." when right-clicking on the project. In the code below I happened to be using a Console Application only because I didn't even know an Azure WebJob project existed until after I completed development on my project. Doh!

Program.cs

I have created a new project called "Site.AzureWebJob.Cleanup". The project uses the following two Azure NuGet packages:

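// Assumes the WindowsAzure.Storage NuGet package; adjust the storage namespaces if a different SDK is installed.
using System;
using System.Collections.Generic;
using System.Configuration;
using System.Linq;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

namespace Site.AzureWebJob.Cleanup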
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                CloudStorageAccount storageAccount = CloudStorageAccount.Parse("<Insert Storage Connection String Here>");

                CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
                CloudBlobContainer dataContainer = blobClient.GetContainerReference("<Blob container name>");

                Console.WriteLine("Hourly threshold to remove records: {0}", ConfigurationManager.AppSettings["Azure.CleanupHours"]);

                #region Retrieve all data items greater than 24 hours and delete them

                Console.WriteLine("Retrieving old data files...");

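                // Get files where the "Last Modified Date" is older than the configured threshold (24 hours).
                // Full UTC timestamps are compared; truncating to .Date would also catch newer files.
                int cleanupHours = int.Parse(ConfigurationManager.AppSettings["Azure.CleanupHours"]);
                DateTimeOffset cutoff = DateTimeOffset.UtcNow.AddHours(-cleanupHours);

                IEnumerable<CloudBlob> oldData = dataContainer.ListBlobs()
                                .OfType<CloudBlob>()
                                .Where(b => b.Properties.LastModified.HasValue && b.Properties.LastModified.Value < cutoff);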

                IList<CloudBlob> dataBlobs = oldData as IList<CloudBlob> ?? oldData.ToList();

                Console.WriteLine("Data records retrieved: {0}.", dataBlobs.Count);
                Console.WriteLine("Removing old data files...");

                // Loop through the files and delete if they exist.
                foreach (CloudBlob dataBlob in dataBlobs)
                {
                    bool isDeleted = dataBlob.DeleteIfExists();

                    if (isDeleted)
                        Console.WriteLine("Deleted: {0}.", dataBlob.Name);
                }

                #endregion

                Console.WriteLine("Removing old data complete.");
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error cleaning container files: {0}", ex.Message);
            }

            Console.WriteLine("Clean Containers WebJob complete.");
        }
    }
}

There isn't really much to it. All I am doing is retrieving all files that are older than 24 hours (value set within App.config app setting called: "Azure.CleanupHours") and then carrying out the delete process by looping through any records returned.

The safest way to delete a file is to use the CloudBlob.DeleteIfExists() call. As the method name suggests, it will only delete a file if it exists. Using CloudBlob.Delete() will cause an exception if for some reason the file isn't there and will require additional error handling.
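When filtering blobs by age, it is worth comparing the full timestamp rather than only the date part, since truncating to midnight makes a blob look up to a day older than it really is. A minimal sketch of that comparison, kept separate from the storage SDK so it is easy to test; BlobAge and IsOlderThan are hypothetical names.

```csharp
using System;

public static class BlobAge
{
    // True when lastModified is more than `hours` before `nowUtc`.
    // Both values are compared as full timestamps, not as dates.
    public static bool IsOlderThan(DateTimeOffset lastModified, DateTimeOffset nowUtc, int hours)
    {
        return lastModified < nowUtc.AddHours(-hours);
    }
}
```

For example, with a 24-hour threshold at 01:00 UTC on 2 January, a blob modified at 23:00 UTC on 1 January is only two hours old and is kept. Comparing only its date (midnight on 1 January) against the same cutoff would have flagged it for deletion.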

Final Steps

Now that we have our Azure WebJob ready to go, the only thing left is to publish to your Azure Web App by simply right-clicking on your project and selecting: "Publish as an Azure WebJob...". Here you will connect to your Azure instance and have the option to choose how your WebJob should run:

  • Continuously
  • On Demand
  • On Schedule

Redirect Non-WWW to WWW Domain In Azure Websites

If you require your website URL to always be prefixed with a "www" at the start of the domain, then you will need to modify the web.config (preferably in the Web.Release.Config) with the following addition:

<system.webServer>
    <rewrite xdt:Transform="Insert">
      <rules>
        <rule name="Redirect to WWW site">
          <match url=".*" />
          <conditions logicalGrouping="MatchAny">
            <add input="{HTTP_HOST}" pattern="^(www\.)(.*)$" negate="true" />
          </conditions>
          <action type="Redirect" url="http://www.{HTTP_HOST}/{R:0}" redirectType="Permanent" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
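To make the rule's behaviour easier to reason about, here is a rough model of its logic in code. It is only an illustration of the condition and action above, not how IIS evaluates rules internally; WwwRedirect and GetRedirectUrl are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WwwRedirect
{
    // Returns the redirect target for a request, or null when the host
    // already starts with "www." and no redirect is needed.
    public static string GetRedirectUrl(string host, string path)
    {
        // Same pattern as the rule's condition. negate="true" means the rule
        // fires when the host does NOT match. URL Rewrite ignores case by default.
        bool hasWww = Regex.IsMatch(host, @"^(www\.)(.*)$", RegexOptions.IgnoreCase);

        if (hasWww)
            return null;

        // {R:0} is the whole URL path matched by <match url=".*" />, without a leading slash.
        return $"http://www.{host}/{path.TrimStart('/')}";
    }
}
```

So a request for contoso.com/products/1 is sent to http://www.contoso.com/products/1, while www.contoso.com is left alone. Note that the target is hard-coded to http://, so a site served over HTTPS would need the action URL adjusted.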

In addition to the web.config file changes, ensure the Azure Website instance contains the correct domain bindings within the "Manage Domains" area. For example:

Azure Manage Custom Domains

Tools Every Azure Developer Should Be Using

Having developed quite a few websites in Azure, there are some key tools I found that made my life easier when accessing all areas of my Azure cloud instance. The great thing about the tools I have listed below is that they give me access to all the features I need, wrapped in a nice interface.

So let's get to it!

Azure Storage Explorer

Azure Storage Explorer is a useful tool for inspecting and altering the data in your Azure storage projects, including the logs of your cloud-hosted applications. This includes:

  • Blobs
  • Queues
  • Tables

Unlike some of the storage explorer software I've used in the past, Azure Storage Explorer allows you to preview a blob directly through its interface, such as images, video or text files. So you don't have to waste time downloading a blob just to check if it's been generated correctly. Amazing time saver!

Once you have your storage set up within your Azure account, you can use this application to manage everything: create, view, copy, rename and delete all three storage types (listed above).

An application as full featured as this shouldn't be free. But luckily for us, it is.

Download: https://azurestorageexplorer.codeplex.com/

Azure User Management Console

Azure User Management Console manages the users and logins of an Azure SQL database. The tool simply converts your actions into T-SQL commands and executes them against an Azure database of your choice.

What some beginner Azure developers do is use the same master credentials that are assigned to the database on creation within their web application too. Of course, this master user has full "db_owner" privileges against the database. Not a good idea! This application allows you to create a new user with restricted access levels really easily.

Download: https://aumc.codeplex.com/

Redgate SQL Azure Backup

One thing I found lacking in Azure SQL databases is the ease of creating a regular backup. There doesn't seem to be an automated way to do this directly through the Azure account.

I've been toying around with Redgate's Azure backup service and that seems to do the job quite nicely. But it does come at a price. A daily backup of one database will cost around £7 per month.

Full range of backup plans: http://cloudservices.red-gate.com/

CloudXplorer

Whenever I needed to take a quick look at any of my blob containers, Azure Storage Explorer would suffice for the majority of cases. However, the one thing I've started noticing with Azure Storage Explorer is that it can't easily export a batch of files from a blob container to local storage.

CloudXplorer by ClumsyLeaf Software made browsing files within my blob container a breeze. All files were organised and displayed in a folder structure, allowing me to download specific directories. The slick UI alone makes CloudXplorer a pleasure to use, especially if you have a blob container that is large in volume.

I have downloaded around 200MB worth of files from one of my blobs to a local drive without any issue.