Export Changes Between Two Git Commits In SourceTree

I should start off by saying how much I love TortoiseGit. It has always been my reliable source control tool of choice, even though it's a bit of a nightmare to set up initially to work alongside Bitbucket. But due to a new development environment for an external project, I am kinda forced to use the preinstalled Git programs:

  • SourceTree
  • Git Bash

I am more inclined to use a GUI when interacting with my repositories and use the command line when necessary.

One thing that has been missing from SourceTree ever since it was released is the ability to export changes over multiple commits. I was hoping after many years this feature would be incorporated. Alas, no. After Googling around, I came across a StackOverflow post that showed the only way to export changes in SourceTree based on multiple commits is by using a combination of the git archive and git diff commands:

git archive --output=archived_changes.zip HEAD $(git diff --diff-filter=ACMRTUXB --name-only hash1 hash2)

This can be run directly using the Terminal window for a repository in SourceTree. The "hash1" and "hash2" values are the commit IDs - either the full 40-character hashes or their abbreviated forms.
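If you're unsure where to get the hash values from, they can be copied from the commit list in SourceTree, or listed from the same Terminal window. A minimal sketch using git log:

git log --oneline -10

The abbreviated hashes this outputs can be plugged straight into the command above.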

The StackOverflow post helped me achieve what I needed, but as a learning exercise I want to take things a step further in this post and understand what the command is actually doing. So let's dissect the command into manageable chunks.

Part 1

git archive --output=archived_changes.zip HEAD

On its own, this archives the whole repository at HEAD into a zip file. The next part narrows things down to just the files changed between the commits we're interested in.

Part 2

git diff --diff-filter=ACMRTUXB

The git diff command shows the changes between commits. The filter option gives us more flexibility to select only the files that are (an example follows the list below):

  • A: Added
  • C: Copied
  • D: Deleted
  • M: Modified
  • R: Renamed
  • T: Type (mode) changed
  • U: Unmerged
  • X: Unknown
  • B: Pairing broken
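It's worth noting that the filter in the original command (ACMRTUXB) includes every letter except D (Deleted). That's deliberate - a deleted file no longer exists at HEAD, so there is nothing for git archive to add. The filter can also be trimmed down if you only care about certain changes. A quick sketch, assuming you only want added and modified files (hash1 and hash2 are placeholders as before):

git diff --diff-filter=AM --name-only hash1 hash2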

Part 3

--name-only hash1 hash2

The second part of the git diff command uses the --name-only option, which outputs just the paths of the files that have changed between the two hash values entered.
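Running this part of the command on its own is a handy way to preview exactly which files will end up in the archive. The output is simply a list of file paths, something along these lines (the paths here are purely illustrative):

styles/site.css
scripts/app.js
pages/about.aspx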

Part 4

The git diff command is wrapped in $( ), the shell's command substitution syntax, so its output (the list of changed file paths) is passed as arguments to the git archive command.
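As a final sanity check, it's worth listing the contents of the generated zip to confirm only the expected files were included. A quick sketch, assuming the unzip utility is available in your shell (it isn't always bundled with Git Bash):

unzip -l archived_changes.zip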

Duplicate Content: The Impact of Canonical URLs

Being a web developer, I am trying to become savvier when it comes to factoring in additional SEO practices, which (in my view) are pretty much compulsory.

Ever since Google updated its Search Console (formerly known as Webmaster Tools), it has opened my eyes to how my site is performing in greater detail, especially the pages Google deems not worthy of indexing. I started becoming more aware of this last August, when I wrote a post about attempting to reduce the number of "Crawled - Currently not indexed" pages on my site. Through trial and error, I managed to find a way to reduce the number of excluded page links.

The area I have now become fixated on is the sheer number of pages being classed as "Duplicate without user-selected canonical". Google describes these pages as:

This page has duplicates, none of which is marked canonical. We think this page is not the canonical one. You should explicitly mark the canonical for this page. Inspecting this URL should show the Google-selected canonical URL.

In simple terms, Google has detected pages that can be accessed by different URLs with either the same or similar content. In my case, this is the result of many years of unintentional neglect whilst migrating my site through different platforms and URL structures during the infancy of my online presence.

Google Search Console has marked around 240 links as duplicates due to the following two reasons:

  1. Pages can be accessed with or without a ".aspx" extension.
  2. Paginated content.

I was surprised to see paginated content was classed as duplicate content, as I was always under the impression that this would never be the case. After all, the listed content is different and I have ensured that the page titles are different for when content is filtered by either category or tag. However, if a site consists of duplicate or similar content, it is considered a negative in the eyes of a search engine. 

Two weeks ago I added canonical tagging across my site, as I was intrigued to see if there would be any considerable change in how Google crawls my site. Would it make my site easier to crawl and aid Google in understanding the page structure?
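For anyone unfamiliar with the markup itself, a canonical tag is just a link element in the head of a page pointing at the URL you want search engines to treat as the original source. A minimal sketch using a placeholder address - in my ".aspx" scenario, both the extension and extensionless versions of a page would declare the same canonical URL:

<link rel="canonical" href="https://www.example.com/blog/my-post" />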

Surprising Outcome

I think I was quite naive about how my Search Console Coverage statistics would shift post-canonicalisation. I was just expecting the number of pages classed as "Duplicate without user-selected canonical" to decrease, which was the case. I wasn't expecting anything more. On further investigation, it was interesting to see an overall positive change across all other coverage areas.

Here's the full breakdown:

  • Duplicate without user-selected canonical: Reduced by 10 pages
  • Crawled - Currently not indexed: Reduced by 65 pages
  • Crawl anomaly: Reduced by 20 pages
  • Valid: Increased by 60 pages

The change in figures may not look that impressive, but we have to remember this report is based on only two weeks of data after implementing the canonical tags. All positives so far and I'm expecting to see further improvements over the coming weeks.

Conclusion

Canonical markup can often be overlooked, both in its implementation and its importance when it comes to SEO. After all, I still see sites that don't use it, as the emphasis is placed on other areas that require more effort to meet Google's search criteria, such as building for mobile, structured data and performance. So it's understandable why canonical tags could be missed.

If you are in a similar position to me, where you are adding canonical markup to an existing site, it's really important to spend the time setting the original source page URL correctly the first time round, as an incorrect implementation can lead to issues.

Even though my Search Console stats have improved, the jury's still out on whether this translates to better site visibility across search engines. But anything that helps search engines and visitors understand your content source can only be beneficial.

Switch Branches In TortoiseGit

My day-to-day source control platform is Bitbucket. I never got on with their own Git GUI offering - Sourcetree. I always found TortoiseGit much more intuitive and a more flexible way to interact with my git repository. If anyone can change my opinion on this, I am all ears!

I work with large projects that are around a couple of hundred megabytes in size, and if I were to clone the same project for different branches, it would use up quite a bit of hard disk space. I like to quickly switch to my master branch after carrying out a merge for testing before carrying out a release.

Luckily TortoiseGit makes switching branches a cinch in just a few clicks:

  • Right-click in your repository
  • Go to TortoiseGit context menu
  • Click Switch/Checkout
  • Select the branch you wish to switch to and select "Overwrite working tree changes (force)"

TortoiseGit Switch Branches

Ticking the "Overwrite working tree changes (force)" box is important to ensure all files in your working directory are overwritten with the files directly from your branch. We do not want remnants of files from the branch we had previously switched from kicking around.
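For reference, the same switch can be carried out from the command line. A rough equivalent of the steps above, assuming you're switching to master:

git checkout --force master

On newer versions of Git, git switch --force master does the same job.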

Reducing The Number of 'Crawled - Currently not indexed' Pages

Every few weeks, I check over the health of my site through Google Search Console (aka Webmaster Tools) and Analytics to see how Google is indexing my site and look into potential issues that could affect the click-through rate.

Over the years the content of my site has grown steadily and, as it stands, it consists of 250 published blog posts. When you take into consideration the other potential pages Google indexes - filter URLs based on grouping posts by tag or category - the number of links on my site increases considerably. It's at the discretion of Google's search algorithm whether it includes these links for indexing.

Last month, I decided to scrutinise the Search Console Index Coverage report in great detail just to see if there are any improvements I can make to alleviate some minor issues. What I wasn't expecting to see is the large volume of links marked as "Crawled - Currently not indexed".

Crawled Currently Not Indexed - 225 Pages

Wow! 225 affected pages! What does "Crawled - Currently not indexed" mean? According to Google:

The page was crawled by Google, but not indexed. It may or may not be indexed in the future; no need to resubmit this URL for crawling.

Pretty self-explanatory, but not much guidance on how to lessen the number of links that aren't indexed. From my experience, the best place to start is to look at the list of links that are being excluded and form a judgement based on the page content of those links. Unfortunately, there isn't an exact science. It's a process of trial and error.

Let's take a look at the links from my own 225 excluded pages:

Crawled Currently Not Indexed - Non Indexed Links

On initial look, I could see that the majority of the URLs consisted of links where users can filter posts by either category or tag. I could see nothing content-wise when inspecting these pages that gave a conclusive reason for their exclusion. However, what I did notice is that these links were automatically found by Google when the site gets spidered. The sitemap I submitted in the Search Console only lists blog posts and content pages.

This led me to believe a possible solution would be to create a separate sitemap that consisted purely of links for these categories and tags. I called it metasitemap.xml. Whenever I added a post, the sitemap's "lastmod" date would get updated, just like the pages listed in the default sitemap.
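To give an idea of what this supplementary sitemap contains, here is a cut-down sketch following the standard sitemap protocol - the URLs and dates are placeholders rather than my real entries:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/category/development</loc>
    <lastmod>2019-07-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/tag/seo</loc>
    <lastmod>2019-07-15</lastmod>
  </url>
</urlset>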

I created and submitted this new sitemap around mid-July and it wasn't until four days ago that the improvement was reported from within the Search Console. The number of non-indexed pages was reduced to 58. That's a 74% reduction!

Crawled Currently Not Indexed - 58 Pages

Conclusion

As I stated above, there isn't an exact science for reducing the number of non-indexed pages as every site is different. Supplementing my site with an additional sitemap just happened to alleviate my issue. But that is not to say copying this approach won't help you. Just ensure you look into the list of excluded links for any patterns.

I still have some work to do and the next thing on my list is to implement canonical tags across all my pages, since I have become aware I have duplicate content on different URLs - remnants from when I moved blogging platforms.

If anyone has any other suggestions or solutions that worked for them, please leave a comment.

My First Butter CMS JavaScript Implementation

For one of my side projects, I was asked to use Butter CMS to allow for basic blog integration using JavaScript. I had never heard of or used Butter CMS before and was intrigued to know more about the platform.

Butter CMS is another headless CMS variant that allows a developer to utilise API endpoints to push content to an application via a range of approaches. So nothing new here. Just like any headless CMS, the proof is in the pudding when it comes to the following factors:

  • Quality of features
  • Ease of integration
  • Price points
  • Quality of documentation

I haven't had a chance to properly look into everything Butter CMS has to offer, but from what I have seen while working on the requirements for this side project, I was pleasantly surprised. I found it really easy to get set up with a minimal amount of fuss! For this project I used Butter CMS's Blog Engine package, which does exactly what it says on the tin. All the fields you need for writing blog posts are already provided.

JavaScript Code

My JavaScript implementation is pretty basic and provides the following functionality:

  • Outputs a list of posts consisting of title, date and summary text
  • Pagination
  • Output a single blog post

All key functionality is derived from the "ButterCMS" JavaScript file:

/************************************************/
/*                  Butter CMS                  */
/************************************************/
var BEButterCMS =
{
    ButterCmsObj: null,

    "Init": function () {
        // Initiate Butter CMS.
        this.ButterCmsObj = new ButterCmsBlogData();
        this.ButterCmsObj.Init();
    },
    "GetBlogPosts": function () {
        BEButterCMS.ButterCmsObj.GetBlogPosts(1);
    },
    "GetSinglePost": function (slug) {
        BEButterCMS.ButterCmsObj.GetSinglePost(slug);
    }
};

/************************************************/
/*               Butter CMS Data                */
/************************************************/
function ButterCmsBlogData() {
    var apiKey = "<Enter API Key>",
        baseUrl = "/",
        butterInstance = null,
        $blogListingContainer = $("#posts"),
        $blogPostContainer = $("#post-individual"),
        pageSize = 10;

    // Initialise of the ButterCMSData object get the data.
    this.Init = function () {
        getCMSInstance();
    };

    // Returns a list of blog posts.
    this.GetBlogPosts = function (pageNo) {
        // The blog listing container needs to be cleared before any new markup is pushed.
        // For example when the next page of data is requested.
        $blogListingContainer.empty();

        // Request blog posts.
        butterInstance.post.list({ page: pageNo, page_size: pageSize }).then(function (resp) {
            var body = resp.data,
                blogPostData = {
                    posts: body.data,
                    next_page: body.meta.next_page,
                    previous_page: body.meta.previous_page
                };

            for (var i = 0; i < blogPostData.posts.length; i++) {
                $blogListingContainer.append(blogPostListItem(blogPostData.posts[i]));
            }

            //----------BEGIN: Pagination--------------//

            // Build the pagination links as a single block of markup so the
            // containing div and its links are appended together.
            var paginationHtml = "<div>";

            if (blogPostData.previous_page) {
                paginationHtml += "<a class=\"page-nav\" href=\"#\" data-pageno=\"" + blogPostData.previous_page + "\">Previous Page</a>";
            }

            if (blogPostData.next_page) {
                paginationHtml += "<a class=\"page-nav\" href=\"#\" data-pageno=\"" + blogPostData.next_page + "\">Next Page</a>";
            }

            paginationHtml += "</div>";

            $blogListingContainer.append(paginationHtml);

            paginationOnClick();

            //----------END: Pagination--------------//
        });
    };

    // Retrieves a single blog post based on the current URL of the page if a slug has not been provided.
    this.GetSinglePost = function (slug) {
        var currentPath = location.pathname,
            blogSlug = slug === null ? currentPath.match(/([^\/]*)\/*$/)[1] : slug;

        butterInstance.post.retrieve(blogSlug).then(function (resp) {
            var post = resp.data.data;

            $blogPostContainer.append(blogPost(post));
        });
    };

    // Renders the HTML markup and fields for a single post.
    function blogPost(post) {
        var html = "";

        html = "<article>";

        html += "<h1>" + post.title + "</h1>";
        html += "<div>" + blogPostDateFormat(post.created) + "</div>";
        html += "<div>" + post.body + "</div>";
        
        html += "</article>";

        return html;
    }

    // Renders the HTML markup and fields when listing out blog posts.
    function blogPostListItem(post) {
        var html = "";

        html = "<h2><a href=" + baseUrl + post.url + ">" + post.title + "</a></h2>";
        html += "<div>" + blogPostDateFormat(post.created) + "</div>";
        html += "<p>" + post.summary + "</p>";

        if (post.featured_image) {
            html += "<img src=" + post.featured_image + " />";
        }

        return html;
    }

    // Set click event for previous/next pagination buttons and reload the current data.
    function paginationOnClick() {
        $(".page-nav").on("click", function (e) {
            e.preventDefault();
            var pageNo = $(this).data("pageno"),
                butterCmsObj = new ButterCmsBlogData();

            butterCmsObj.Init();
            butterCmsObj.GetBlogPosts(pageNo);
        });
    }

    // Format the blog post date to dd/MM/yyyy HH:mm
    function blogPostDateFormat(date) {
        var dateObj = new Date(date);

        return [dateObj.getDate().padLeft(), (dateObj.getMonth() + 1).padLeft(), dateObj.getFullYear()].join('/') + ' ' + [dateObj.getHours().padLeft(), dateObj.getMinutes().padLeft()].join(':');
    }

    // Get instance of Butter CMS on initialise to make one call.
    function getCMSInstance() {
        butterInstance = new Butter(apiKey);
    }
}

// Set a prototype for padding numerical values.
Number.prototype.padLeft = function (base, chr) {
    var len = (String(base || 10).length - String(this).length) + 1;

    return len > 0 ? new Array(len).join(chr || '0') + this : this;
};

To get a list of blog posts:

// Initiate Butter CMS.
BEButterCMS.Init();

// Get all blog posts.
BEButterCMS.GetBlogPosts();

To get a single blog post, you will need to pass in the slug of the blog post via your own approach:

// Initiate Butter CMS.
BEButterCMS.Init();

// Get single blog post.
BEButterCMS.GetSinglePost(postSlug);

Check Your Website Headers People!

A website can tell the public a lot about you, from the things you want people to see to other things you probably would not. HTTP headers can divulge things about your website that you wouldn't necessarily want to make public, and it's up to the individual to decide what headers they're willing to expose. What I would recommend is to at least analyse any site prior to moving it to a production environment.

Why all of a sudden am I talking about questioning your website HTTP Headers?

It was only by chance, when perusing StackOverflow, that I came across a question about securing HTTP headers and was directed to a site called securityheaders.io. I immediately submitted this very site for scanning, thinking it would fare quite well. But boy oh boy, was I wrong!

Security Headers (Before)

Based on this result, does this make my website vulnerable? To a certain extent, yes. By default you're exposing some key information to potential hackers about how your website is built. For example, here is a simple list of things the HTTP headers returned from the server can reveal:

  • Web server
  • Framework version
  • Cache handling
  • Cross-site scripting access
  • Referrer policies

Now, based on that list alone, what HTTP headers would you hide? Having had my eyes opened by the report generated by securityheaders.io, as a minimum I would hide anything that shows what technology, framework and server platform I am using. If there happens to be an exploit for the very server or technology you are using, you don't want the whole world to know that, especially if you happen to be hosting a high traffic website.
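Before deciding what to hide, it's worth seeing exactly what your server currently returns. A quick check using curl (the domain is a placeholder); for a .NET site, headers such as Server, X-Powered-By and X-AspNet-Version are the usual giveaways to look out for:

curl -I https://www.example.com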

I decided to correct all the issues highlighted by securityheaders.io and spent some extra time obfuscating additional headers. Now I can proudly say I've passed. There is just one blemish against the report, to do with the "Content-Security-Policy" header, which defines approved sources of content that the browser may load.

Security Headers (After)

I've been tweaking the rules for this header and I'll be honest when I say it shafted the administration dashboard of the content management system I use for my site - Kentico CMS. So before I reinstate the header, I need a little more time tweaking.
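When I do reinstate it, the plan is to start with a deliberately loose policy and tighten it gradually. A hypothetical starting point (the analytics domain is just an example of an allowed script source):

Content-Security-Policy: default-src 'self'; script-src 'self' https://www.google-analytics.com; img-src 'self' data: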

Another great site to use to analyse the security of your site (.NET sites only) is ASafaWeb, which scans for common configuration vulnerabilities.

Google Seems To Have An Issue With My Server Response Time...

...and I think I know why...

Out of all the issues Google PageSpeed Insights seems to have when analysing my site, there are two specific things that crop up and annoy me:

  1. Reduce server response time
  2. Leverage browser caching (due to Google Analytics JavaScript file)

The Google Analytics issue is something I will have to live with since (as far as I'm aware) there's nothing I can do about it. It would be nice if Google didn't penalise you for using a product they have developed. However, the "Reduce server response time" issue was something that perplexed me. My site is relatively simple and not doing anything over-the-top.

Due to the nature of my hosting setup (shared), I didn't have many options to make my website respond any better. The only ways I could think of to improve server response time were to move my hosting to another region or to purchase a VPS to get more control.

Now, I think I have resolved the server response time issue...It has something to do with a Web Statistics service called AWStats that was enabled by default as an "addon" service on my hosting. Once disabled through my Plesk Management Portal, Google PageSpeed didn't seem to have any issue with my server response.

I cannot 100% confirm whether disabling the Web Statistics service is a permanent solution that will work for everyone else. But there might be some truth behind this. Web statistics services like AWStats store all analytical data in log files directly on the server, so this must have some effect on the time a request takes. I could be talking complete nonsense.

If you have experienced the same problem as me, check your own hosting setup and its "addon" services. You never know, it may give you that extra Google PageSpeed point. :-)

Microsoft Virtual Academy...Something every Microsoft Developer Should Take A Look At!

There are many roads and avenues a tech-head can take to either get a grasp on new technology or prepare for certification. Unfortunately, some methods to get the knowledge on a subject can come at a great cost...especially when it comes to anything Microsoft.

Generally, Microsoft has always had some great forum and blogging communities that enable developers to get the expertise they require, but I've always found them to be somewhat divided and rough around the edges. Now Microsoft has reworked its community and provided learners with a wide variety of courses freely available to anyone!

While MVA courses are not specifically meant to focus on exam preparation, they should be used as an addition to paid courses, books and online test exams when preparing for a certification. But it definitely helps. It takes more than just learning theory to pass an exam.

So if you require some extra exam training or just want to brush up your skills, give a few topics a go. I myself decided to test my skills by starting right from the beginning and covering courses that relate to my industry. In this case, to name a few:

  • Database Fundamentals
  • Building Web Apps with ASP.NET Jump Start
  • Developing ASP.NET MVC 4 Web Applications Jump Start
  • Programming In C# Jump Start
  • Twenty C# Questions Explained

I can guarantee you'll be stumped by some of the exam questions after covering each topic. Some questions can be quite challenging!

I've been a .NET developer for around 7 years and even I had to go through the learning content more than once. No matter how long you've been in the technical industry, we are all susceptible to forgetting things or being unaware of different coding techniques.

One of the great motivations of using MVA is the ranking system that places you against a leaderboard of other avid learners and lets you see yourself progress as you complete each exam. All I can advise is: don't let the ranking system be your sole motivation to just "show off" your knowledge. The important part is learning. What's the point in making a random attempt at each exam without a deep understanding of why you got the answer correct or incorrect?

You can see how far I have progressed by viewing my MVA profile here: http://www.microsoftvirtualacademy.com/Profile.aspx?alias=2181504

All in all: Fantastic resource and fair play to Microsoft for offering some free training!

C# In Depth by Jon Skeet Review

C# In Depth Third Edition

When working as a programmer, it's really easy to continue coding in the same manner you have done since you picked up a language and made your first program.

The saying: "Why fix it if it ain't broken?" comes to mind...

I for one sometimes fail (unknowingly) to move with the times and find new and better ways of coding. It's only on the off chance that I get introduced to different approaches through my work colleagues or whilst Googling for an answer to one of my coding queries.

After reading some rave reviews of C# In Depth, written by the one and only StackOverflow god Jon Skeet, I decided to part with my hard-earned money and make a purchase.

C# In Depth is different from other programming books I've read on C#. In fact, it's really good - don't let the title of the book deter you. The content is ideal for novice and semi-experienced programmers.

Firstly, you start off by being shown code samples on how C# has evolved through its iterations (v1 - v4). In most cases I gave myself a gratifying pat on the back when I noticed the approaches I've taken in my own projects utilised practices and features of the current language. ;-)

Secondly, unlike some programming books I've read in the past, it's not intimidating to read at all. Jon Skeet really has a great way of talking about concepts I find difficult to comprehend in a clear and meaningful way, so I could utilise these concepts within my current applications.

The only minor niggle I have is that there were a few places where I would have liked specific chapters to go into more detail. On the other hand, it gave me the opportunity to research the nitty-gritty details for myself.

Since I purchased this book, I have found myself referencing it many times and appreciating what C# has to offer, along with its misconstrued and underused features.

All in all, the author truly has a gift for clearly demonstrating his understanding of the subject with finesse, and if I am able to comprehend even one-tenth of his knowledge, I will be a happy man.

The Easy Way To Run a PHP Site In A Windows Environment

Even though my programming weapon of choice is .NET C#, there are times (unfortunate times!) where I need to dabble in a bit of PHP. Being a .NET developer means I do not have the setup required to run PHP-based sites, such as Apache and MySQL.

In the past I have tried to set up an Apache-configured server, but I could never get it running 100% - possibly because I didn't have the patience, or couldn't justify the additional setup time when I only work on a PHP site once in a blue moon...

Last year, I came across a program called EasyPHP that allowed me to install a local instance of Apache and MySQL together in just one installation. It made it really easy to get up and running without all the setup and configuration hassle.

Once installed, you can create numerous websites and MySQL instances in a version of your own choosing. Wicked! I've never been so excited about PHP in my life!

I have only scratched the surface of the features EasyPHP provides, and whenever I have needed to use it, there have always been great improvements. Take a look at their site for more information: www.easyphp.org.

So if you're a Windows man who needs to carry out PHP odds and ends, I can't recommend EasyPHP enough.