How to Index and Search with Lucene.NET (.NET)

Users often ask for search capabilities in applications. Sometimes filters on lists are not enough. Suppose you have a repository of 1,000 documents and want to find the files that contain a specific word. In that situation the Lucene search engine is very useful.

What is Lucene.NET?

Lucene.NET is an indexing and search library ported from the well-known Lucene, which was developed for the Java platform. The Lucene.NET project page lists the following goals:

1) Maintain the existing line-by-line port from Java to C#, fully automating and commoditizing the process such that the project can easily synchronize with the Java Lucene release schedule.
2) Maintaining the high-performance requirements expected of a first class C# search engine library.
3) Maximize usability and power when used within the .NET runtime. To that end, it will present a highly idiomatic, carefully tailored API that takes advantage of many of the special features of the .NET runtime.

There are two main steps:
1) Index the documents
2) Search the documents for a specific word

Step 1: Indexing the document

Here’s how to add a document to the index.

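// Namespaces used by the listings in this post
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
// Alias avoids the clash between System.Version and Lucene.Net.Util.Version
using Version = Lucene.Net.Util.Version;

private static void IndexDocument()
{
    // @"C:\LuceneIndex" is the folder where the index files will be created
    Lucene.Net.Store.Directory directory = FSDirectory.Open(new DirectoryInfo(@"C:\LuceneIndex"));
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_29);

    // true = create a new index, overwriting any existing one at this location
    var writer = new IndexWriter(directory, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);

    var doc = new Document();
    // Read the file text and pass it to the indexer (a fixed string is used here)
    string documentContent = "Document Content";
    doc.Add(new Field("text", documentContent, Field.Store.YES, Field.Index.ANALYZED));
    doc.Add(new Field("FileName", "Testdoc.doc", Field.Store.YES, Field.Index.ANALYZED));

    writer.AddDocument(doc);

    // Merge index segments, then flush the changes to disk and release the writer
    writer.Optimize();
    writer.Commit();
    writer.Close();
}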

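After calling this method we have a new document in the Lucene.NET index. Both the “text” and “FileName” fields are stored and analyzed (Field.Store.YES, Field.Index.ANALYZED), so a search can match either the document text or the file name, and both values can be read back from the search results. You can find the Lucene index files in “C:\LuceneIndex”.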
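
The listing above indexes a fixed string. As a rough sketch of how the same calls might be applied to real files, the method below indexes every plain-text file in a folder. The names FolderIndexer, IndexFolder, sourceFolder and indexPath are made up for this illustration, and it assumes the files are plain text (*.txt). Binary formats such as .doc would need their text extracted first, which is not covered here.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

public static class FolderIndexer   // hypothetical helper, not part of Lucene.NET
{
    // Indexes every *.txt file in sourceFolder into the index at indexPath
    // and returns the number of documents added.
    public static int IndexFolder(string sourceFolder, string indexPath)
    {
        Lucene.Net.Store.Directory directory = FSDirectory.Open(new DirectoryInfo(indexPath));
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_29);

        // true = create a new index, replacing any existing one at indexPath
        var writer = new IndexWriter(directory, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);
        int count = 0;
        try
        {
            foreach (string path in System.IO.Directory.GetFiles(sourceFolder, "*.txt"))
            {
                var doc = new Document();
                // Same two fields as in the listing above
                doc.Add(new Field("text", File.ReadAllText(path), Field.Store.YES, Field.Index.ANALYZED));
                doc.Add(new Field("FileName", Path.GetFileName(path), Field.Store.YES, Field.Index.ANALYZED));
                writer.AddDocument(doc);
                count++;
            }
            writer.Commit();
        }
        finally
        {
            // Release the writer and the directory even if reading a file fails
            writer.Close();
            directory.Close();
        }
        return count;
    }
}
```

A call such as FolderIndexer.IndexFolder(@"C:\Docs", @"C:\LuceneIndex") would rebuild the index from scratch each time, because the writer is opened with create set to true. Here C:\Docs is only a placeholder path.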

Step 2: Searching documents
Now we can write a method that searches the documents for a given word or phrase in the document body.

private static void SearchText()
{
    // Open the same index folder and use the same analyzer as when indexing
    Lucene.Net.Store.Directory directory = FSDirectory.Open(new DirectoryInfo(@"C:\LuceneIndex"));
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_29);

    // "text" is the default field searched when the query names no field
    var parser = new QueryParser(Version.LUCENE_29, "text", analyzer);
    Query query = parser.Parse("Document");

    var searcher = new IndexSearcher(directory);

    // Ask for the 10 best-scoring matches
    TopDocs topDocs = searcher.Search(query, 10);

    int results = topDocs.ScoreDocs.Length;
    Console.WriteLine("Found {0} results", results);

    for (int i = 0; i < results; i++)
    {
        ScoreDoc scoreDoc = topDocs.ScoreDocs[i];
        float score = scoreDoc.Score;
        int docId = scoreDoc.Doc;
        // Load the stored fields of the matching document
        Document doc = searcher.Doc(docId);

        Console.WriteLine("Result num {0}, score {1}", i + 1, score);
        Console.WriteLine("Text: {0}", doc.Get("text"));
        Console.WriteLine("Text found in file: {0}\r\n", doc.Get("FileName"));
    }

    searcher.Close();
    directory.Close();
}
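
The method above prints to the console and uses a hard-coded query. One possible way to make it reusable is sketched below: it takes the query text as a parameter, returns the matching file names, and closes the searcher and directory even if something fails. IndexSearchHelper and FindFileNames are hypothetical names chosen for this sketch, and returning an empty list on a malformed query is just one design choice among several.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

public static class IndexSearchHelper   // hypothetical helper, not part of Lucene.NET
{
    // Returns the stored "FileName" value of up to maxResults documents
    // whose "text" field matches queryText.
    public static List<string> FindFileNames(string indexPath, string queryText, int maxResults)
    {
        var fileNames = new List<string>();
        Lucene.Net.Store.Directory directory = FSDirectory.Open(new DirectoryInfo(indexPath));
        try
        {
            Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_29);
            var parser = new QueryParser(Version.LUCENE_29, "text", analyzer);

            Query query;
            try
            {
                query = parser.Parse(queryText);
            }
            catch (ParseException)
            {
                // Malformed query text: treat it as "no results" in this sketch
                return fileNames;
            }

            var searcher = new IndexSearcher(directory);
            try
            {
                TopDocs topDocs = searcher.Search(query, maxResults);
                foreach (ScoreDoc scoreDoc in topDocs.ScoreDocs)
                {
                    Document doc = searcher.Doc(scoreDoc.Doc);
                    fileNames.Add(doc.Get("FileName"));
                }
            }
            finally
            {
                searcher.Close();
            }
        }
        finally
        {
            directory.Close();
        }
        return fileNames;
    }
}
```

With the index built in Step 1, IndexSearchHelper.FindFileNames(@"C:\LuceneIndex", "Document", 10) should return the file names of the matching documents, best score first.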

Conclusion

Lucene.NET is a good solution for applications that need broad and powerful search capabilities. It is a small library and very easy to use. The Lucene.NET API lets you fully manage the search index and run queries against it. Although Lucene.NET is in the Apache Incubator right now, it is a promising project and I think it is worth trying out.

You can find more information in the documentation for the Lucene query language.
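
As a small taste of that query language, the sketch below parses a few common query forms with the same QueryParser used in Step 2 and prints how each one is interpreted. QuerySyntaxDemo and the sample terms are invented for illustration; only the syntax forms themselves (single term, quoted phrase, AND/OR/NOT, field:term, trailing wildcard) come from the Lucene query syntax.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Version = Lucene.Net.Util.Version;

public static class QuerySyntaxDemo   // hypothetical name, for illustration only
{
    public static readonly string[] SampleQueries =
    {
        "document",                 // single term in the default field ("text")
        "\"document content\"",     // exact phrase
        "document AND content",     // both terms required
        "document OR report",       // either term
        "document NOT draft",       // first term, excluding documents with the second
        "FileName:testdoc.doc",     // search a specific field instead of the default
        "doc*"                      // prefix wildcard
    };

    // Parses each sample and prints the query object Lucene builds from it.
    public static void PrintParsedQueries()
    {
        var analyzer = new StandardAnalyzer(Version.LUCENE_29);
        var parser = new QueryParser(Version.LUCENE_29, "text", analyzer);

        foreach (string text in SampleQueries)
        {
            Query query = parser.Parse(text);
            Console.WriteLine("{0}  =>  {1}", text, query.ToString());
        }
    }
}
```

Printing the parsed query is a handy way to check how the analyzer and parser actually interpret a search string before running it against an index.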


One Response to How to Indexing and searching in Lucene.net (.NET)

  1. Samantha Laurence says:

    Hi, Thanks for sharing code with the blog, your blog is pretty useful to me. Can you do me a favour? I am looking for a solution where I could have authorization in search, say, If I do not have access to any file, I should not be able to search that. Please share if you have.

    Thanks,
    Samantha
