Error "cluster status not ready" when enabling an uploaded IK dictionary file

Symptom

After uploading an IK dictionary file, the user gets the following error when enabling it:

elasticsearch cluster status not ready, no update or restart will be executed. If you want to update or restart this resource anyway, please FORCE to do it.

Troubleshooting steps

Based on the error message, check whether the cloud search cluster is restarting or is in the Red or Yellow state.

  1. Check the cluster health status. The output below is what GET _cluster/health?pretty returns:
{
  "cluster_name" : "nkxzzdr1xxxxx",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 6,
  "active_shards" : 16,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 3,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 84.21052631578947
}
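
The check above can be sketched in Python. This is a minimal sketch, assuming that only a green cluster counts as "ready" for a dictionary update; the platform's exact readiness rule is not stated in this article. The function names are hypothetical, and the sample is a subset of the health output shown above.

```python
import json

# Subset of the _cluster/health response shown above.
SAMPLE_HEALTH = json.loads("""
{
  "cluster_name": "nkxzzdr1xxxxx",
  "status": "yellow",
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 3,
  "active_shards_percent_as_number": 84.21052631578947
}
""")


def is_ready_for_update(health: dict) -> bool:
    # Assumption: only "green" is treated as ready; yellow and red are not.
    return health.get("status") == "green"


def summarize_health(health: dict) -> str:
    # Build a one-line summary from the fields relevant to this error.
    status = health.get("status", "unknown")
    unassigned = health.get("unassigned_shards", 0)
    percent = health.get("active_shards_percent_as_number", 0.0)
    return f"{status}: {unassigned} unassigned shard(s), {percent:.1f}% active"


if __name__ == "__main__":
    print(is_ready_for_update(SAMPLE_HEALTH))
    print(summarize_health(SAMPLE_HEALTH))
```

For the sample above, the cluster is reported as not ready because three shards are unassigned.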

The instance is in the Yellow state. To find out why, run the following command. Its output shows that three replica shards in the cluster are unassigned.

GET _cat/shards?h=index,shard,prirep,state,unassigned.reason&v
index                      shard prirep state      unassigned.reason
mytest                     2     p      STARTED    
mytest                     2     r      STARTED    
mytest                     2     r      STARTED    
mytest                     2     r      UNASSIGNED INDEX_CREATED
mytest                     1     r      STARTED    
mytest                     1     r      STARTED    
mytest                     1     p      STARTED    
mytest                     1     r      UNASSIGNED INDEX_CREATED
mytest                     0     r      STARTED    
mytest                     0     p      STARTED    
mytest                     0     r      STARTED    
mytest                     0     r      UNASSIGNED INDEX_CREATED
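
On a cluster with many indices, the same listing can be summarized programmatically. The following is a minimal sketch that parses the text output of the command above and counts unassigned shards per index, shard type, and reason. The function name is hypothetical, and it assumes the column layout requested with the h= parameter above.

```python
from collections import Counter

# Output of the _cat/shards request above, copied from this article.
SAMPLE_CAT_SHARDS = """\
index                      shard prirep state      unassigned.reason
mytest                     2     p      STARTED
mytest                     2     r      STARTED
mytest                     2     r      STARTED
mytest                     2     r      UNASSIGNED INDEX_CREATED
mytest                     1     r      STARTED
mytest                     1     r      STARTED
mytest                     1     p      STARTED
mytest                     1     r      UNASSIGNED INDEX_CREATED
mytest                     0     r      STARTED
mytest                     0     p      STARTED
mytest                     0     r      STARTED
mytest                     0     r      UNASSIGNED INDEX_CREATED
"""


def count_unassigned(cat_output: str) -> Counter:
    # The first line is the header produced by the v parameter.
    lines = cat_output.strip().splitlines()
    header = lines[0].split()
    counts = Counter()
    for line in lines[1:]:
        # STARTED rows have no reason column, so zip() simply stops early.
        row = dict(zip(header, line.split()))
        if row.get("state") == "UNASSIGNED":
            key = (row["index"], row["prirep"], row.get("unassigned.reason", ""))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    for (index, prirep, reason), n in count_unassigned(SAMPLE_CAT_SHARDS).items():
        print(index, prirep, reason, n)
```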

View the detailed explanation for the unassigned shards:

GET _cluster/allocation/explain?pretty

// Partial output:
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "a copy of this shard is already allocated to this node [[mytest][1], node[aXDggTa2REqsvQxOEmhBUg], [R], s[STARTED], a[id=-NcCEWV0Swaj8fu_K0eXFA]]"
        }
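
The full explain response lists one entry per node, each with its own deciders. The following is a minimal sketch that collects every decider that answered NO, assuming the usual response layout with a node_allocation_decisions list. The function name, the node name node-1, and the shortened explanation text in the sample are hypothetical.

```python
def blocking_deciders(explain: dict) -> list:
    # Collect (node name, decider name) pairs for every NO decision.
    blocked = []
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                blocked.append((node.get("node_name"), decider.get("decider")))
    return blocked


# Hypothetical, shortened sample modeled on the partial output above.
SAMPLE_EXPLAIN = {
    "index": "mytest",
    "shard": 1,
    "primary": False,
    "node_allocation_decisions": [
        {
            "node_name": "node-1",  # hypothetical node name
            "deciders": [
                {
                    "decider": "same_shard",
                    "decision": "NO",
                    "explanation": "a copy of this shard is already allocated to this node",
                }
            ],
        }
    ],
}

if __name__ == "__main__":
    for node_name, decider in blocking_deciders(SAMPLE_EXPLAIN):
        print(node_name, decider)
```

If every node reports same_shard, each node already holds a copy of the shard, which is the situation in this case.
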

Solution

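The troubleshooting output points to the fix. The mytest index has 3 replicas per primary shard, so each shard needs 4 copies, but the cluster has only 3 data nodes and the same_shard decider does not allow two copies of the same shard on one node. One replica of each shard therefore stays unassigned. In this case, the problem can be solved either by reducing the number of replica shards or by adding a node. Here we change the index's number of replicas so that the cluster returns to the Green state: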

PUT mytest/_settings
{
 "number_of_replicas": 2
}
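
The value 2 follows from the node count. The following is a minimal sketch of that arithmetic, assuming the only constraint is that two copies of the same shard cannot share a node, and ignoring other allocation rules such as disk watermarks or allocation awareness. The function names are hypothetical.

```python
def max_assignable_replicas(data_nodes: int) -> int:
    # One node holds the primary, so at most data_nodes - 1 replicas fit.
    return max(data_nodes - 1, 0)


def unassigned_replicas(data_nodes: int, primaries: int, replicas: int) -> int:
    # Replicas beyond the assignable maximum stay unassigned for every primary.
    excess = max(replicas - max_assignable_replicas(data_nodes), 0)
    return primaries * excess


if __name__ == "__main__":
    # Values from this case: 3 data nodes, 3 primary shards in mytest.
    print(max_assignable_replicas(3))    # 2
    print(unassigned_replicas(3, 3, 3))  # 3, matching the three UNASSIGNED rows
    print(unassigned_replicas(3, 3, 2))  # 0 after the settings change
```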

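Once the cluster is back to Green, enabling the dictionary file again succeeds. Note that enabling a dictionary file restarts the cluster, so we recommend doing this during a maintenance window.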
