Filter in lucene based on field value

This article shows how to restrict a Lucene search based on the value of an indexed field. When we index a document, we add a number of fields to it. For example, I am indexing documents with the fields “text” and “ID” as below.


    var doc = new Document();
    // In a real application this text would be read from a file; it is hardcoded here
    string document_Content = "microsoft office";
    doc.Add(new Field("text", document_Content, Field.Store.YES, Field.Index.ANALYZED));
    doc.Add(new Field("ID", "1", Field.Store.YES, Field.Index.ANALYZED));
    writer.AddDocument(doc);

    var doc1 = new Document();
    string document_Content1 = "microsoft outlook";
    doc1.Add(new Field("text", document_Content1, Field.Store.YES, Field.Index.ANALYZED));
    doc1.Add(new Field("ID", "2", Field.Store.YES, Field.Index.ANALYZED));

    writer.AddDocument(doc1);

    var doc2 = new Document();
    string document_Content2 = "microsoft powerpoint";
    doc2.Add(new Field("text", document_Content2, Field.Store.YES, Field.Index.ANALYZED));
    doc2.Add(new Field("ID", "3", Field.Store.YES, Field.Index.ANALYZED));

    writer.AddDocument(doc2);
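
The indexing code above uses a `writer` that is never declared. A minimal sketch of how it might be created, assuming Lucene.Net 2.9 and the same `C:\LuceneIndex` path and `StandardAnalyzer` that the search code in this post uses. The `IndexSetup` class and `CreateWriter` method are hypothetical names introduced only for illustration.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

internal static class IndexSetup
{
    // Hypothetical helper, not from the original post.
    internal static IndexWriter CreateWriter(string indexPath)
    {
        // Fully qualified because System.IO also defines a Directory type.
        Lucene.Net.Store.Directory directory = FSDirectory.Open(new DirectoryInfo(indexPath));
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_29);

        // create: true starts a fresh index and overwrites any existing one.
        return new IndexWriter(directory, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);
    }
}
```

A call such as `var writer = IndexSetup.CreateWriter(@"C:\LuceneIndex");` would go before the first `AddDocument`, and `writer.Close();` after the last one so that the documents are flushed to disk before searching.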

We added three documents with three different IDs. Now I want to search the “text” field, but only within documents whose “ID” is in the range 1 to 2. For this we have to define a filter using “FieldCacheRangeFilter” as below:


    var parser = new QueryParser(Version.LUCENE_29, "text", analyzer);
    Query query = parser.Parse("microsoft");

    var filter = FieldCacheRangeFilter.NewIntRange("ID",
        lowerVal: 1, includeLower: true,
        upperVal: 2, includeUpper: true);

    TopDocs topDocs = searcher.Search(query, filter, 10);
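
The same call can express other kinds of ranges. The sketch below shows a half-open range (a `null` bound) and an exclusive range. It assumes the `NewIntRange` parameter names used above, and that each document has exactly one “ID” term that parses as an integer, which is what this filter expects. The `IdFilters` class and its method names are hypothetical.

```csharp
using Lucene.Net.Search;

internal static class IdFilters
{
    // Hypothetical helpers, not part of the original post or of Lucene.Net.

    // Matches documents whose "ID" value is greater than or equal to min.
    // A null upper bound leaves the range open at the top.
    internal static Filter AtLeast(int min)
    {
        return FieldCacheRangeFilter.NewIntRange("ID",
            lowerVal: min, includeLower: true,
            upperVal: null, includeUpper: false);
    }

    // Matches documents whose "ID" value is strictly between low and high.
    internal static Filter Between(int low, int high)
    {
        return FieldCacheRangeFilter.NewIntRange("ID",
            lowerVal: low, includeLower: false,
            upperVal: high, includeUpper: false);
    }
}
```

With the three documents indexed earlier, `searcher.Search(query, IdFilters.AtLeast(2), 10)` would be expected to return the documents with IDs 2 and 3.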

Complete Code


    private static void SearchText()
    {
        Lucene.Net.Store.Directory directory = FSDirectory.Open(new DirectoryInfo(@"C:\LuceneIndex"));
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_29);
        var searcher = new IndexSearcher(directory);

        var parser = new QueryParser(Version.LUCENE_29, "text", analyzer);
        Query query = parser.Parse("microsoft");

        var filter = FieldCacheRangeFilter.NewIntRange("ID",
        lowerVal: 1, includeLower: true,
        upperVal: 2, includeUpper: true);

        TopDocs topDocs = searcher.Search(query, filter, 10);

        int results = topDocs.ScoreDocs.Length;
        Console.WriteLine("Found {0} results", results);

        for (int i = 0; i < results; i++)
        {
            ScoreDoc scoreDoc = topDocs.ScoreDocs[i];
            float score = scoreDoc.Score;
            int docId = scoreDoc.Doc;
            Document doc = searcher.Doc(docId);

            Console.WriteLine("Result num {0}, score {1}", i + 1, score);
            Console.WriteLine("ID: {0}", doc.Get("ID"));
            Console.WriteLine("Text: {0}\r\n", doc.Get("text"));
        }

        searcher.Close();
        directory.Close();
    }

If you run the application you will see only two results, even though the word “microsoft” appears in all three indexed documents, because the filter excludes the document with ID 3.
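
An alternative that this post does not use: Lucene 2.9 also provides `NumericField` and `NumericRangeFilter`, which index the number in a dedicated numeric form rather than as a text term. The sketch below assumes the Lucene.Net 2.9 API, and exact signatures may differ between Lucene.Net releases. The `NumericIdExample` class and its method names are hypothetical.

```csharp
using Lucene.Net.Documents;
using Lucene.Net.Search;

internal static class NumericIdExample
{
    // Hypothetical helper: builds a document whose ID is indexed as a numeric field.
    internal static Document BuildDocument(string text, int id)
    {
        var doc = new Document();
        doc.Add(new Field("text", text, Field.Store.YES, Field.Index.ANALYZED));
        // Store.YES keeps the value retrievable; the final 'true' makes it searchable.
        doc.Add(new NumericField("ID", Field.Store.YES, true).SetIntValue(id));
        return doc;
    }

    // Inclusive range filter over the numeric "ID" field.
    internal static Filter IdRange(int min, int max)
    {
        return NumericRangeFilter.NewIntRange("ID", min, max, true, true);
    }
}
```

Documents built this way would be added with `writer.AddDocument(NumericIdExample.BuildDocument("microsoft office", 1));` and searched with `searcher.Search(query, NumericIdExample.IdRange(1, 2), 10);`.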
Lucene.Net assembly runtime version: v4.0.30319 (the code targets Version.LUCENE_29)
