Blog

Tagged by 'lucene.net'

Published on
Jan 14, 2011 - 2 min read
At Last! Created My Own eBay Style Search Using SolrNet
#lucenenet #facet #search #solr
Over the last few months I have been carrying out endless amounts of research and development to find a way to create my own eCommerce-style search, similar to what eBay and Amazon use. This is otherwise known as “Faceted Search”, whereby the search results are filtered through a series of facets belonging to your search criteria. Each facet typically corresponds to the possible values of a property common to a set of objects, for example the brand or price range of a product.

Sounds very difficult and complex, doesn’t it? Even to this very day, I am sure eBay and Amazon must use some kind of “magic” to get their search to work so seamlessly and efficiently.

There are numerous search solutions out there that could help you build this type of search. From my experience, though, I couldn’t find any low-cost, out-of-the-box solution that would help me build my own. The majority of search vendors were not only very expensive but also required a quote for a tailor-made solution.

In the early stages I tried expanding my Lucene.NET knowledge, but I couldn’t find a flexible way to introduce facets into my search. I must admit I am not exactly an expert in Lucene, and this could well have played a part in my failing miserably.

When I thought all was lost and there was no chance in hell of figuring this thing out, I luckily came across a few blog and StackOverflow posts by a guy called Mauricio Scheffer. Mauricio seems to be the brains behind SolrNet, a client library for the Solr search platform built for the .NET Framework. This is one of the strengths of Solr: it can also be consumed from other development platforms such as Python and Ruby.

SolrNet just happened to be the ideal solution for what I was looking for, and with just over a week’s development I was able to build my own basic search, which looks something like this:

As you can see from my screenshots, you can carry out a search by report type and/or global text search. In addition, whether each facet is shown or hidden depends purely on the search results returned.
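To illustrate the idea of facets that depend on the results returned, here is a minimal in-memory sketch. It is not how Solr or SolrNet computes facets (Solr does the counting on the server); the `Report` class, its properties and the `FacetSketch` helper are all hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical search result; the property names are illustrative only.
public class Report
{
    public string Title { get; set; }
    public string ReportType { get; set; }
}

public static class FacetSketch
{
    // Counts how many results fall under each value of one facet.
    // A value with no matching results never appears in the dictionary,
    // so there is nothing for the page to show for it.
    public static Dictionary<string, int> CountFacet(
        IEnumerable<Report> results, Func<Report, string> facetSelector)
    {
        return results
            .Where(r => facetSelector(r) != null)   // skip results without a value
            .GroupBy(facetSelector)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    // Narrows the results to those matching the facet value the user picked.
    public static List<Report> ApplyFacet(
        IEnumerable<Report> results, Func<Report, string> facetSelector, string value)
    {
        return results.Where(r => facetSelector(r) == value).ToList();
    }
}
```

Calling `FacetSketch.CountFacet(results, r => r.ReportType)` gives one entry per report type present in the results, and `ApplyFacet` filters the list down once a value is chosen. Recounting after each filter is what makes facets appear and disappear as the result set changes.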

SolrNet is a very flexible package and I know just enough to implement the basics. Even so, I was really surprised at how well the searches performed with the most basic implementation. I am looking forward to adding features over the next few months and perfecting both my Solr search index and code.

I won’t be posting the code that I used to create my search since it’s quite a big project and tailor-made to my database architecture. But here are a few links that I found useful to get me started in the world of SolrNet:

Dec 13, 2010 - 4 min read

Multi Query Search Using Lucene.NET

#lucenenet #multi-query-search #MultiFieldQueryParser

Over the last few days I have been doing some research on the best way to implement search functionality for a site I am currently building. The site will consist mainly of news articles. The client wanted a search that would allow a user to search across all fields that relate to a news article.

Originally, I envisaged writing my own SQL to query a few tables within my database to return some search results. But as I delved further into designing the database architecture in the early planning stages, I found that my original (somewhat closed-minded) approach wouldn't be flexible or scalable enough to search and extract all the information I required.

From what I have researched, the general consensus is to use either SQL Full Text Search or Lucene.NET. Many have favoured Lucene for its richer query language and because it is generally more flexible, since you can write a search index tailored to your project. From what I gather, Lucene can work with any type of text data. For example, you can index not only rows in your database but also physical files in your application. Neat!

I have written some basic code (below) with a couple of methods to get you started in creating a search index and carrying out a multi-query search across your whole index. As written, it rebuilds the full index on every run, so you would want to enhance it to carry out a full index only once all required records have been added. Most implementations of Lucene use incremental indexing, where documents already in the index are updated individually rather than deleting the whole index and building a new one every time. I plan to hook up and optimise my Lucene code into a service scheduled to carry out an incremental index every night at midnight.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Lucene;
using Lucene.Net;
using Lucene.Net.Store;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Documents;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using System.Configuration; 

namespace MES.DataManager.Search
{
    public class Lucene
    {
        public static void IndexSite()
        {           
                //The file location of the index
                string indexLocation = ConfigurationManager.AppSettings["SearchIndexPath"];

                //Open the index directory, creating it if it does not exist yet
                Directory searchDirectory = FSDirectory.GetDirectory(indexLocation, !System.IO.Directory.Exists(indexLocation));

                //Create an analyzer to process the text
                Analyzer searchAnalyser = new StandardAnalyzer();

                //Create the index writer with the directory and analyzer.
                //Passing true recreates the index from scratch on every run.
                IndexWriter indexWriter = new IndexWriter(searchDirectory, searchAnalyser, true);

                //Iterate through Article table and populate the index
                foreach (Article a in ArticleBLL.GetArticleDetails())
                {
                    Document doc = new Document();

                    doc.Add(new Field("id", a.ID.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.YES));
                    doc.Add(new Field("title", a.Title, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
                    doc.Add(new Field("articletype", a.Type.TypeName, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES)); 

                    if (!String.IsNullOrEmpty(a.Summary))
                        doc.Add(new Field("summary", a.Summary, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));                

                    if (!String.IsNullOrEmpty(a.ByLineShort))
                        doc.Add(new Field("bylineshort", a.ByLineShort, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));                    

                    if (!String.IsNullOrEmpty(a.ByLineLong))
                        doc.Add(new Field("bylinelong", a.ByLineLong, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));                   

                    if (!String.IsNullOrEmpty(a.BasicWords))
                        doc.Add(new Field("basicwords", a.BasicWords, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));                   

                    if (!String.IsNullOrEmpty(a.MediumWords))
                        doc.Add(new Field("mediumwords", a.MediumWords, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));                   

                    if (!String.IsNullOrEmpty(a.LongWords))
                        doc.Add(new Field("longwords", a.LongWords, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));  

                    //Write the document to the index
                    indexWriter.AddDocument(doc);
                }
                          

                //Optimize and close the writer
                indexWriter.Optimize();
                indexWriter.Close();         
        }

        public static List<CoreArticleDetail> SearchArticles(string searchTerm)
        {
            Analyzer analyzer = new StandardAnalyzer(); 

            //Search by multiple fields
            MultiFieldQueryParser parser = new MultiFieldQueryParser(
                                                                new string[]
                                                                {
                                                                    "title",
                                                                    "summary",
                                                                    "bylineshort",
                                                                    "bylinelong",
                                                                    "basicwords",
                                                                    "mediumwords",
                                                                    "longwords"
                                                                },
                                                                analyzer); 

            Query query = parser.Parse(searchTerm); 

            //Create an index searcher that will perform the search
            IndexSearcher searcher = new IndexSearcher(ConfigurationManager.AppSettings["SearchIndexPath"]);

            List<int> articleIDs = new List<int>();

            try
            {
                //Execute the query
                Hits hits = searcher.Search(query);

                //Iterate through the hits and collect the article IDs.
                //Hits loads documents lazily, so read them before closing the searcher.
                for (int i = 0; i < hits.Length(); i++)
                {
                    Document doc = hits.Doc(i);

                    articleIDs.Add(int.Parse(doc.Get("id")));
                }
            }
            finally
            {
                //Release the index files held by the searcher
                searcher.Close();
            }

            return ArticleBLL.GetArticleSearchInformation(articleIDs);
        }

    }
}

As you can see, my example allows you to carry out a search across as many of your fields as you require, which I am sure you will find useful. It took a lot of research to find out how to carry out a multi-query search. The majority of the examples I found on the internet showed how to search only one field.

The main advantage I can see straight away from using Lucene is that since the search data is held on disk, there is hardly any need to query the database. The only downside I can see is the possibility of problems caused by a corrupt index.
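On the incremental indexing mentioned earlier, here is a minimal sketch of updating one article in place rather than rebuilding everything. It assumes a Lucene.NET 2.x release that includes `IndexWriter.UpdateDocument` (2.1 or later, as far as I know) and an index that already exists. The `IncrementalIndexSketch` class and `UpsertArticle` method are hypothetical names, and only two fields are indexed to keep it short.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

public static class IncrementalIndexSketch
{
    // Adds or replaces a single article without rebuilding the index.
    // Assumes "id" is indexed UN_TOKENIZED, so the term matches the raw ID exactly.
    public static void UpsertArticle(Directory indexDirectory, string id, string title)
    {
        // Passing false opens the existing index instead of wiping it
        IndexWriter writer = new IndexWriter(indexDirectory, new StandardAnalyzer(), false);
        try
        {
            Document doc = new Document();
            doc.Add(new Field("id", id, Field.Store.YES, Field.Index.UN_TOKENIZED));
            doc.Add(new Field("title", title, Field.Store.YES, Field.Index.TOKENIZED));

            // Deletes any document whose "id" term matches, then adds the new one
            writer.UpdateDocument(new Term("id", id), doc);
        }
        finally
        {
            // Always close the writer so the index lock is released
            writer.Close();
        }
    }
}
```

A scheduled service could call something like this for each article changed since the last run, and leave the full rebuild for the occasional clean start.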

For more information on using Lucene, here are a couple of links that you may find useful to get started (I know I did):

http://www.codeproject.com/KB/library/IntroducingLucene.aspx
http://ifdefined.com/blog/post/Full-Text-Search-in-ASPNET-using-LuceneNET.aspx
