Building a custom tokenizer for Elasticsearch

恩特雷维夫

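I'm building a custom tokenizer in response to this question: Performance of doc_values field vs analysed field

None of this API appears to be documented (?), so I'm working from code samples in other plugins/tokenizers. However, when I restart Elasticsearch with my tokenizer deployed, this error appears constantly in the logs: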

[2017-09-20 08:45:37,412][WARN ][indices.cluster          ] [Samuel Silke] [[storm-crawler-2017-09-11][3]] marking and sending shard failed due to [failed to create index]
[storm-crawler-2017-09-11] IndexCreationException[failed to create index]; nested: CreationException[Guice creation errors:

1) Could not find a suitable constructor in com.cameraforensics.elasticsearch.plugins.UrlTokenizerFactory. Classes must have either one (and only one) constructor annotated with @Inject or a zero-argument constructor that is not private.
  at com.cameraforensics.elasticsearch.plugins.UrlTokenizerFactory.class(Unknown Source)
  at org.elasticsearch.index.analysis.TokenizerFactoryFactory.create(Unknown Source)
  at org.elasticsearch.common.inject.assistedinject.FactoryProvider2.initialize(Unknown Source)
  at _unknown_

1 error];
    at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:360)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewIndices(IndicesClusterStateService.java:294)
    at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:163)
    at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:610)
    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:772)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.common.inject.CreationException: Guice creation errors:

1) Could not find a suitable constructor in com.cameraforensics.elasticsearch.plugins.UrlTokenizerFactory. Classes must have either one (and only one) constructor annotated with @Inject or a zero-argument constructor that is not private.
  at com.cameraforensics.elasticsearch.plugins.UrlTokenizerFactory.class(Unknown Source)
  at org.elasticsearch.index.analysis.TokenizerFactoryFactory.create(Unknown Source)
  at org.elasticsearch.common.inject.assistedinject.FactoryProvider2.initialize(Unknown Source)
  at _unknown_

1 error
    at org.elasticsearch.common.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:360)
    at org.elasticsearch.common.inject.InjectorBuilder.injectDynamically(InjectorBuilder.java:172)
    at org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:110)
    at org.elasticsearch.common.inject.InjectorImpl.createChildInjector(InjectorImpl.java:157)
    at org.elasticsearch.common.inject.ModulesBuilder.createChildInjector(ModulesBuilder.java:55)
    at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:358)
    ... 9 more

My tokenizer is built for v2.3.4, and the tokenizer factory looks like this:

public class UrlTokenizerFactory extends AbstractTokenizerFactory {

    @Inject
    public UrlTokenizerFactory(Index index, IndexSettingsService indexSettings, @Assisted String name, @Assisted Settings settings){
        super(index, indexSettings.getSettings(), name, settings);
    }

    @Override
    public Tokenizer create() {
        return new UrlTokenizer();
    }
}

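I really don't know what I'm doing wrong. Have I deployed it incorrectly? According to the logs, it does seem to be picking up my class...

I've only deployed it to one of my ES nodes (4-node cluster). The /_cat/plugins?v endpoint returns this: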

name         component          version type url 
Samuel Silke urltokenizer       2.3.4.0 j        

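Because there is little or no documentation for this process, I've got as far as I have by copying constructs that other people have used in their plugins.

The error I'm seeing makes no sense to me. My TokenizerFactory looks like everyone else's for this version of Elasticsearch. What am I doing wrong, or what have I failed to do that I should have done to make this work?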

恩特雷维夫

It turns out I was missing an Environment parameter in the constructor. It should have looked like this:

public UrlTokenizerFactory(Index index, IndexSettingsService indexSettings, Environment env, @Assisted String name, @Assisted Settings settings){ ...

I eventually found a similar one here: https://github.com/codelibs/elasticsearch-analysis-kuromoji-neologd/blob/2.3.x/src/main/java/org/codelibs/elasticsearch/kuromoji/neologd/index/analysis/KuromojiTokenizerFactory.java
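
For anyone hitting the same message: the rule quoted in the error ("one (and only one) constructor annotated with @Inject or a zero-argument constructor that is not private") can be checked by hand with plain reflection, which is handy when the log only reports "Unknown Source". The following is a minimal sketch of that rule only. It assumes a local stand-in `Inject` annotation (hypothetical; in ES 2.x the real one is `org.elasticsearch.common.inject.Inject`), and the classes `ConstructorCheck`, `AnnotatedFactory` and `UnannotatedFactory` are invented for illustration. It does not reproduce Guice's assisted-inject resolution, so it will not by itself explain why adding the Environment parameter resolved the failure above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

final class ConstructorCheck {

    // Hypothetical stand-in for org.elasticsearch.common.inject.Inject.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.CONSTRUCTOR)
    @interface Inject {}

    // Hypothetical factory with exactly one annotated constructor.
    static class AnnotatedFactory {
        @Inject
        AnnotatedFactory(String name) {}
    }

    // Hypothetical factory whose only constructor takes an argument
    // and carries no annotation.
    static class UnannotatedFactory {
        UnannotatedFactory(String name) {}
    }

    // Applies the rule from the error message: exactly one annotated
    // constructor, or (if none is annotated) a non-private no-arg constructor.
    static boolean hasSuitableConstructor(Class<?> type) {
        int annotated = 0;
        boolean usableNoArg = false;
        for (Constructor<?> c : type.getDeclaredConstructors()) {
            if (c.isAnnotationPresent(Inject.class)) {
                annotated++;
            }
            if (c.getParameterCount() == 0 && !Modifier.isPrivate(c.getModifiers())) {
                usableNoArg = true;
            }
        }
        if (annotated > 0) {
            return annotated == 1;
        }
        return usableNoArg;
    }

    public static void main(String[] args) {
        System.out.println(hasSuitableConstructor(AnnotatedFactory.class));   // true
        System.out.println(hasSuitableConstructor(UnannotatedFactory.class)); // false
    }
}
```

Note that the check compares against one specific annotation type, so an annotation with the same simple name from a different package would not count. Whether that played any part in the original failure is not something this post confirms.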
