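PhraseQuery: a phrase query checks whether a document contains one or more specified Terms. When there are several Terms, the allowed gap between them can be set through the slop parameter. The official API explanation is shown in the figure:
Example code: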
package com.yida.framework.lucene5.query;

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class PhraseQueryTest {
    public static void main(String[] args) throws IOException {
        // Build a small in-memory index with three documents
        Directory dir = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        iwc.setOpenMode(OpenMode.CREATE);
        IndexWriter writer = new IndexWriter(dir, iwc);

        Document doc = new Document();
        doc.add(new TextField("text", "quick brown fox", Field.Store.YES));
        writer.addDocument(doc);

        doc = new Document();
        doc.add(new TextField("text", "jumps over lazy broun dog", Field.Store.YES));
        writer.addDocument(doc);

        doc = new Document();
        doc.add(new TextField("text", "jumps over extremely very lazy broxn dog", Field.Store.YES));
        writer.addDocument(doc);
        writer.close();

        IndexReader reader = DirectoryReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);

        // Query phrase "dog jumps": each add() appends the Term at the next position
        String term1 = "dog";
        String term2 = "jumps";
        PhraseQuery phraseQuery = new PhraseQuery();
        phraseQuery.add(new Term("text", term1));
        phraseQuery.add(new Term("text", term2));
        // slop 5, as discussed in the text: the second document needs 5 moves, the third needs 7
        phraseQuery.setSlop(5);

        TopDocs results = searcher.search(phraseQuery, null, 100);
        ScoreDoc[] scoreDocs = results.scoreDocs;
        for (int i = 0; i < scoreDocs.length; ++i) {
            // System.out.println(searcher.explain(phraseQuery, scoreDocs[i].doc));
            int docID = scoreDocs[i].doc;
            Document document = searcher.doc(docID);
            String text = document.get("text");
            System.out.println("text:" + text);
        }
        reader.close();
    }
}
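phraseQuery.add(term) always appends the Term at the end; you can also call add(term, position) to state explicitly which position the Term occupies. The example adds two Terms, so the query phrase is dog jumps, with no gap between the two Terms (they sit at consecutive positions), and then we set the slop value to 5.
In the second indexed document, moving the word jumps 5 positions to the right yields exactly our query phrase dog jumps, so that document satisfies the query and is returned. The first indexed document does not contain the word dog at all, so it does not match, and the third indexed document would need 7 moves to produce dog jumps. As a result, only the second indexed document is returned.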
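The move counts above (5 for the second document, 7 for the third) can be reproduced with a small calculation. The following is only a sketch of a simplified model, not Lucene's implementation: it assumes every query term occurs at most once in the document, subtracts each term's position in the query from its position in the document, and takes the spread (max minus min) of those shifted values as the slop needed. The class name SlopSketch and the method requiredSlop are hypothetical names chosen for illustration, and the token lists are written out by hand rather than produced by an Analyzer.

```java
import java.util.Arrays;
import java.util.List;

public class SlopSketch {

    /**
     * Smallest slop at which the phrase would match under the simplified model,
     * or -1 if any query term is missing from the document.
     * Assumption: each term occurs at most once, so only the first occurrence is used.
     */
    public static int requiredSlop(List<String> docTokens, String[] terms, int[] queryPositions) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < terms.length; i++) {
            int docPos = docTokens.indexOf(terms[i]);
            if (docPos < 0) {
                return -1; // a required term is absent, so no slop value can help
            }
            // Shift the document position by the term's position inside the query phrase
            int shifted = docPos - queryPositions[i];
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    public static void main(String[] args) {
        String[] terms = {"dog", "jumps"};
        int[] positions = {0, 1}; // "dog jumps": consecutive positions

        List<String> doc1 = Arrays.asList("quick", "brown", "fox");
        List<String> doc2 = Arrays.asList("jumps", "over", "lazy", "broun", "dog");
        List<String> doc3 = Arrays.asList("jumps", "over", "extremely", "very", "lazy", "broxn", "dog");

        System.out.println("doc1: " + requiredSlop(doc1, terms, positions)); // doc1: -1
        System.out.println("doc2: " + requiredSlop(doc2, terms, positions)); // doc2: 5
        System.out.println("doc3: " + requiredSlop(doc3, terms, positions)); // doc3: 7
    }
}
```

Under this model a document matches when the value it needs is not negative and does not exceed the configured slop, which is why a slop of 5 lets the second document through and leaves the third one out.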
If I change the code a little, like this:
String term1 = "dog";
String term2 = "jumps";
PhraseQuery phraseQuery = new PhraseQuery();
// Explicit positions: dog at 0, jumps at 2, leaving position 1 open
phraseQuery.add(new Term("text", term1), 0);
phraseQuery.add(new Term("text", term2), 2);
phraseQuery.setSlop(6);
TopDocs results = searcher.search(phraseQuery, null, 100);
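Now the query phrase is dog xxx jumps. That is, we are looking for documents that contain both dog and jumps, with a gap of one term position between them (stop words excluded). The slop therefore has to grow by 1, because one more move is needed, so this time the slop value should be 6.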
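The effect of the explicit positions 0 and 2 can be checked with a small stand-alone sketch. It uses a simplified model and is not Lucene's scorer: it assumes each query term occurs at most once in the document, and the class name PositionedPhraseSketch and the method matches are hypothetical names used only for illustration. Each term's document position is shifted by its query position, and the document counts as a match when the spread of the shifted values is within the slop.

```java
import java.util.Arrays;
import java.util.List;

public class PositionedPhraseSketch {

    /**
     * Returns true if, under the simplified model, the phrase with the given
     * explicit query positions matches the document within the given slop.
     * Assumption: each term occurs at most once in the document.
     */
    public static boolean matches(List<String> docTokens, String[] terms, int[] positions, int slop) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < terms.length; i++) {
            int docPos = docTokens.indexOf(terms[i]);
            if (docPos < 0) {
                return false; // a missing term can never match
            }
            int shifted = docPos - positions[i];
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min <= slop;
    }

    public static void main(String[] args) {
        String[] terms = {"dog", "jumps"};
        int[] positions = {0, 2}; // "dog xxx jumps": one open position in between

        List<String> doc2 = Arrays.asList("jumps", "over", "lazy", "broun", "dog");
        List<String> doc3 = Arrays.asList("jumps", "over", "extremely", "very", "lazy", "broxn", "dog");

        System.out.println(matches(doc2, terms, positions, 5)); // false: one move short
        System.out.println(matches(doc2, terms, positions, 6)); // true
        System.out.println(matches(doc3, terms, positions, 6)); // false: this one needs 8
    }
}
```

Compared with the consecutive-position phrase, the open position pushes jumps one step further from where it sits in the document, which is where the extra move comes from.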
PhraseQuery also has a subclass, NGramPhraseQuery. It is built on the N-Gram model; I will skip the algorithm details here.
If you have any other questions, add me on QQ: 7-3-6-0-3-1-3-0-5,
or join the group
to discuss and learn together!