Full-Text Search with Lucene (1)

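Lucene is an open-source, Java-based full-text search library that supports efficient retrieval over large volumes of text. This article walks through a HelloWorld-level example of using Lucene for site search. The steps are: create an index, add article data to it, write a query, and read the search results back from the index.
The dependencies include lucene-analyzers-common-6.1.0.jar, lucene-core-6.1.0.jar, and a few other core components. The example defines an article class with id, title, and content fields, adds mock articles to the index, and then parses a query with the standard analyzer to search it.
The final query results show two matching records and print the matching articles. The whole process is a simple demonstration of how Lucene can be applied to site search.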

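Lucene is an open-source full-text search framework from Apache, and it is well known for being efficient and practical. Today's post shares a HelloWorld-level example of how to use it.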

Workflow
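
In outline, the workflow has two halves: text is analyzed into terms and written to an index, and a query is analyzed the same way and looked up in that index. The following is a minimal sketch of that idea in plain Java. It is not Lucene's actual implementation, and `ToyIndex` and its methods are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Toy model of the index-then-search workflow. This is NOT how Lucene stores
// its index; ToyIndex and its methods are hypothetical names for illustration.
class ToyIndex {
    // term -> ids of the documents that contain it (the "inverted" index)
    private final Map<String, TreeSet<Integer>> postings = new LinkedHashMap<>();

    // Analysis: lower-case the text and split it on anything that is not a letter or digit.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Indexing: record the document id under every term the text contains.
    void add(int docId, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Searching: analyze the query, then keep the ids that contain all of its terms.
    List<Integer> search(String query) {
        TreeSet<Integer> result = null;
        for (String term : analyze(query)) {
            TreeSet<Integer> ids = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add(1, "Full-text search with Lucene");
        index.add(2, "Site search indexes the resources inside a system");
        System.out.println(index.search("search"));      // [1, 2]
        System.out.println(index.search("site search")); // [2]
    }
}
```

The point of the sketch is only the shape of the process: the same analysis step runs at index time and at query time, so the terms line up.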

Dependencies

To use Lucene, the project first has to reference its jar files. The jars I actually used in this project are listed below.

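lucene-analyzers-common-6.1.0.jar: analyzers, including the `StandardAnalyzer` used below
lucene-core-6.1.0.jar: the core indexing and search functionality
lucene-highlighter-6.1.0.jar: highlights the matched terms in search results
lucene-memory-6.1.0.jar: an in-memory index (`MemoryIndex`), which the highlighter relies on
lucene-queries-6.1.0.jar: additional query types
lucene-queryparser-6.1.0.jar: parsers that turn a query string into a `Query` object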
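
A quick way to confirm that the jars are really on the classpath is to probe one class from each of them. The sketch below does that with plain reflection; `LuceneClasspathCheck` is a hypothetical helper, not part of Lucene, and the class-to-jar mapping is my assumption for the 6.1.0 layout listed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Lucene): reports which of the jars listed
// above are visible on the classpath by probing one class per jar.
class LuceneClasspathCheck {

    // Returns true if the named class can be located, without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, LuceneClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // jar name -> one class assumed to live in that jar (Lucene 6.1.0 layout).
    static Map<String, String> probes() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("lucene-core", "org.apache.lucene.index.IndexWriter");
        p.put("lucene-analyzers-common", "org.apache.lucene.analysis.standard.StandardAnalyzer");
        p.put("lucene-highlighter", "org.apache.lucene.search.highlight.Highlighter");
        p.put("lucene-memory", "org.apache.lucene.index.memory.MemoryIndex");
        p.put("lucene-queries", "org.apache.lucene.queries.function.FunctionQuery");
        p.put("lucene-queryparser", "org.apache.lucene.queryparser.classic.QueryParser");
        return p;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : probes().entrySet()) {
            String status = isPresent(e.getValue()) ? "found" : "MISSING";
            System.out.println(e.getKey() + ": " + status);
        }
    }
}
```

Running `main` inside the project prints one line per jar, so a missing dependency shows up before the first test is run.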

Lucene `HelloWorld`

Let's start with a small HelloWorld module: a search feature over article content.

`Article.java`

    /**
     * @Date 2016-08-01
     * @author Administrator
     */
    package domain;

    /**
     * A simple article entity with an id, a title, and a body.
     *
     * @author 郭瑞彪
     */
    public class Article {

        private Integer id;
        private String title;
        private String content;

        public Integer getId() {
            return id;
        }

        public void setId(Integer id) {
            this.id = id;
        }

        public String getTitle() {
            return title;
        }

        public void setTitle(String title) {
            this.title = title;
        }

        public String getContent() {
            return content;
        }

        public void setContent(String content) {
            this.content = content;
        }

        @Override
        public String toString() {
            return "Article [id=" + id + ", title=" + title + ", content=" + content + "]";
        }

    }

`Creating the index`

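    // Imports needed by this test method (it lives inside a JUnit test class).
    import java.nio.file.Paths;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.StringField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.junit.Test;
    import domain.Article;

    @Test
    public void createIndex() throws Exception {

        // Mock one article. The test data stays in Chinese because the query below searches for "资源".
        Article a = new Article();
        a.setId(1);
        a.setTitle("全文检索");
        a.setContent("我们主要是做站内搜索（或叫系统内搜索），即对系统内的资源进行搜索");

        // Open (or create) the index directory and a writer that uses the standard analyzer.
        Directory dir = FSDirectory.open(Paths.get("./indexDir/"));
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(new StandardAnalyzer());
        IndexWriter indexWriter = new IndexWriter(dir, indexWriterConfig);

        // StringField is indexed as a single untokenized value; TextField is run through the analyzer.
        Document doc = new Document();
        doc.add(new StringField("id", a.getId().toString(), Field.Store.YES));
        doc.add(new TextField("title", a.getTitle(), Field.Store.YES));
        doc.add(new TextField("content", a.getContent(), Field.Store.YES));

        // Note: every run of this test appends another document to the same index.
        indexWriter.addDocument(doc);
        indexWriter.close();
        dir.close();
    }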

`Getting query results from the index`

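    // Imports needed by this test method (it lives inside a JUnit test class).
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;
    import org.junit.Test;
    import domain.Article;

    @Test
    public void search() throws Exception {

        // Parse the query string against the "content" field, using the same analyzer as at index time.
        String queryString = "资源";
        Analyzer analyzer = new StandardAnalyzer();
        analyzer.setVersion(Version.LUCENE_6_1_0);
        QueryParser queryParser = new QueryParser("content", analyzer);
        Query query = queryParser.parse(queryString);

        // Open the index and fetch at most the top 10 hits.
        IndexReader indexReader = DirectoryReader.open(FSDirectory.open(Paths.get("./indexDir/")));
        IndexSearcher indexSearcher = new IndexSearcher(indexReader);
        TopDocs topDocs = indexSearcher.search(query, 10);
        ScoreDoc[] scoreDocs = topDocs.scoreDocs;

        // Turn each hit's stored fields back into an Article.
        List<Article> articles = new ArrayList<Article>();
        for (int i = 0; i < scoreDocs.length; i++) {
            ScoreDoc scoreDoc = scoreDocs[i];
            Document doc = indexSearcher.doc(scoreDoc.doc);
            Article a = new Article();
            a.setId(Integer.parseInt(doc.get("id")));
            a.setTitle(doc.get("title"));
            a.setContent(doc.get("content"));
            System.out.println(a.toString());
            articles.add(a);
        }

        // Print the results. The labels read "total number of records" and "search results follow".
        System.out.println("总的记录数为： " + topDocs.totalHits);
        System.out.println(articles.toString());
        for (Article a : articles) {
            System.out.println("-----------搜索结果如下-----------------");
            System.out.println(">>>id: " + a.getId());
            System.out.println(">>>title:" + a.getTitle());
            System.out.println(">>>content:" + a.getContent());
        }
        indexReader.close();
        analyzer.close();

    }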
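
`StandardAnalyzer` has no Chinese dictionary, so it breaks Chinese text into single-character tokens. The two-character query "资源" is analyzed the same way, which is why its tokens line up with the ones stored for the article content. The sketch below roughly imitates that tokenization in plain Java; it is an approximation for illustration only, and `UnigramSketch` is a hypothetical name, not a Lucene class.

```java
import java.util.ArrayList;
import java.util.List;

// Rough imitation of how StandardAnalyzer tokenizes mixed Chinese/Latin text.
// Hypothetical sketch: this is not Lucene code and ignores many Unicode rules.
class UnigramSketch {

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN) {
                // Each Han character becomes a token of its own.
                flush(word, tokens);
                tokens.add(new String(Character.toChars(cp)));
            } else if (Character.isLetterOrDigit(cp)) {
                // Other letters and digits accumulate into a lower-cased word.
                word.appendCodePoint(Character.toLowerCase(cp));
            } else {
                // Punctuation and whitespace end the current word.
                flush(word, tokens);
            }
        }
        flush(word, tokens);
        return tokens;
    }

    private static void flush(StringBuilder word, List<String> tokens) {
        if (word.length() > 0) {
            tokens.add(word.toString());
            word.setLength(0);
        }
    }

    public static void main(String[] args) {
        List<String> content = tokenize("即对系统内的资源进行搜索");
        List<String> query = tokenize("资源");
        System.out.println(query);                      // [资, 源]
        System.out.println(content.containsAll(query)); // true
    }
}
```

Single-character tokens are enough for a demo like this one, but they also match the characters in unrelated words, which is one reason dedicated Chinese analyzers exist.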

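`Query results`

In the output below, "总的记录数为" means "total number of records" and "搜索结果如下" means "search results follow". The reported total (4) is larger than the two records listed, so the listing appears to be abridged. Because `createIndex` adds a new document each time it runs, repeated runs leave several copies in the index.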

    总的记录数为： 4
    
    -----------搜索结果如下-----------------
    >>>id: 1
    >>>title:全文检索
    >>>content:我们主要是做站内搜索（或叫系统内搜索），即对系统内的资源进行搜索
    -----------搜索结果如下-----------------
    >>>id: 2
    >>>title:全文检索2
    >>>content:我们主要是做站内搜索（或叫系统内搜索），即对系统内的资源进行搜索,hahahahahhaha

Summary

As the example shows, basic full-text search takes very little code with Lucene, and there is much more to it that is worth exploring.
