Wildcard and @metadata errors when Logstash writes to Elasticsearch

Note

The problems and fixes described here also apply to Tencent Cloud Elasticsearch Service (ES).

Tencent Cloud Logstash (Logstash, LS) is used as well.

  • System environment

Linux: CentOS Linux release 7.2 (Final)

Elasticsearch:7.10.1

Logstash:7.10.2

Java:1.8.0_181

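Background

An Elasticsearch data migration usually involves a large volume of data spread across many indices, so the index names in the Logstash configuration are mostly written as wildcard patterns. In practice this leads to several problems.

Problems and solutions

Problem 1: wildcards are not allowed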

[2021-09-15T13:36:34,723][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"security_exception", "reason"=>"action [indices:admin/create] is unauthorized for user [elastic]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"There are no external requests known to support wildcards that don't support replacing their indices"}})
[2021-09-15T13:36:34,723][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"security_exception", "reason"=>"action [indices:admin/create] is unauthorized for user [elastic]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"There are no external requests known to support wildcards that don't support replacing their indices"}})
[2021-09-15T13:36:34,723][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"security_exception", "reason"=>"action [indices:admin/create] is unauthorized for user [elastic]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"There are no external requests known to support wildcards that don't support replacing their indices"}})
[2021-09-15T13:36:34,724][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"security_exception", "reason"=>"action [indices:admin/create] is unauthorized for user [elastic]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"There are no external requests known to support wildcards that don't support replacing their indices"}})
[2021-09-15T13:36:34,724][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"security_exception", "reason"=>"action [indices:admin/create] is unauthorized for user [elastic]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"There are no external requests known to support wildcards that don't support replacing their indices"}})
[2021-09-15T13:36:34,724][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"security_exception", "reason"=>"action [indices:admin/create] is unauthorized for user [elastic]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"There are no external requests known to support wildcards that don't support replacing their indices"}})
[2021-09-15T13:36:34,724][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>125}

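Solution

This error occurs because index in the output section is set to *. A wildcard is valid when reading, but a write must target one concrete index name, so Elasticsearch rejects the index-creation request. Change index in the output section to: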

index => "%{[@metadata][_index]}"

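Problem 2: an index named @metadata is created automatically on the target cluster

This happens when the input section does not explicitly set docinfo to true. The docinfo option attaches document information to each event, including the index name, type and document id. It defaults to false, and in that case the output section has no metadata values to read, so references such as %{[@metadata][_index]} cannot be resolved.

(Source: the docinfo option in the Logstash elasticsearch input plugin documentation.)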

Solution

A working example:

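input {
   elasticsearch {
       hosts => "1.1.1.1:9200"
       # a wildcard is fine here: it only selects which indices to read
       index => "*"
       # copy each document's _index, _type and _id into [@metadata]
       docinfo => true
       # documents per scroll batch, and how long each scroll context stays alive
       size => 5000
       scroll => "5m"
     }
}
output {
   elasticsearch {
       hosts => ["http://2.2.2.2:9200"]
       user => "elastic"
       password => "your_password"
       # reuse the source index name, type and id captured by docinfo
       index => "%{[@metadata][_index]}"
       document_type => "%{[@metadata][_type]}"
       document_id => "%{[@metadata][_id]}"
   }
}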
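
To confirm that docinfo is actually populating @metadata before a full migration run, a debugging output along the following lines may help. This is only a sketch that reuses the placeholder host and credentials from the example above. It relies on the metadata option of the rubydebug codec, which is off by default, and it adds a guard so that events without an index name never reach the elasticsearch output; whether skipping such events is acceptable depends on your migration.

```
output {
   # Print every event together with its [@metadata] fields,
   # which rubydebug hides unless metadata => true is set.
   stdout { codec => rubydebug { metadata => true } }

   # Only write events whose source index name was captured by docinfo.
   if [@metadata][_index] {
       elasticsearch {
           hosts => ["http://2.2.2.2:9200"]
           user => "elastic"
           password => "your_password"
           index => "%{[@metadata][_index]}"
           document_id => "%{[@metadata][_id]}"
       }
   }
}
```

If the printed events show no _index entry under @metadata, the input section is most likely still running with docinfo left at its default.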

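Problem 3: unexpected index names cause Logstash writes to fail with the wildcard error

In another scenario the docinfo metadata is not used at all, yet the same wildcard error still appears: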

[2021-01-04T16:09:46,517][INFO ][logstash.outputs.elasticsearch][main][15029ec90b014722fb1e21a3d9bea5122d0776b75db4d73c2b75c774c6d36eef] retrying failed action with response code: 403 ({"type"=>"security_exception", "reason"=>"action [indices:admin/create] is unauthorized for user [elastic]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"There are no external requests known to support wildcards that don't support replacing their indices"}})
[2021-01-04T16:09:46,518][INFO ][logstash.outputs.elasticsearch][main][15029ec90b014722fb1e21a3d9bea5122d0776b75db4d73c2b75c774c6d36eef] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>3}
[2021-01-04T16:10:47,124][INFO ][logstash.outputs.elasticsearch][main][15029ec90b014722fb1e21a3d9bea5122d0776b75db4d73c2b75c774c6d36eef] retrying failed action with response code: 403 ({"type"=>"security_exception", "reason"=>"action [indices:admin/create] is unauthorized for user [elastic]", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"There are no external requests known to support wildcards that don't support replacing their indices"}})
[2021-01-04T16:10:47,125][INFO ][logstash.outputs.elasticsearch][main][15029ec90b014722fb1e21a3d9bea5122d0776b75db4d73c2b75c774c6d36eef] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}

Solution

The error looked strange at first. After some checking, the cause turned up in the output section of the Logstash configuration:

output {
    stdout { codec => rubydebug }
    if [rtype] {
        elasticsearch {
            #index => "%{rtype}_%{+YYYY.MM.dd}"
            index => "%{rtype}_%{date_index}"
            hosts => ["http://2.2.2.2:9200"]
            user => "elastic"
            password => "your_password"
        }
    }
}

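The if condition here is clearly too loose: any event in which the rtype field exists is written, and the value of that same field is then used as part of the index name. This is not reasonable. The index list of such a cluster will almost certainly contain many names that do not match expectations, and field values containing wildcard characters are what trigger the error above. The configuration therefore needs to be tightened.

Recommendations:

  • Avoid building the index name from the field used in the condition whenever possible; it easily leads to unforeseen problems.
  • If the index name must come from that field, validate the field content strictly so the index name stays within the expected set, and handle unknown values in an else branch. Otherwise the consequences can be hard to accept.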
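
As one possible shape for the second recommendation, the output section could accept only known rtype values and send everything else to a fixed index. This is a sketch under stated assumptions, not the configuration used in the original incident: the values "nginx" and "app", the date pattern expected in date_index, and the fallback index name unmatched_events are all hypothetical and should be replaced with your own.

```
output {
    # Assumption: "nginx" and "app" stand in for the rtype values you expect,
    # and date_index is expected to look like 2021.01.04.
    if [rtype] in ["nginx", "app"] and [date_index] =~ /^[0-9]{4}\.[0-9]{2}\.[0-9]{2}$/ {
        elasticsearch {
            hosts => ["http://2.2.2.2:9200"]
            user => "elastic"
            password => "your_password"
            index => "%{rtype}_%{date_index}"
        }
    } else {
        # Unknown or malformed values go to one literal index name,
        # so no field content can end up in an index name.
        elasticsearch {
            hosts => ["http://2.2.2.2:9200"]
            user => "elastic"
            password => "your_password"
            index => "unmatched_events"
        }
    }
}
```

Routing unknown events to a single index keeps them available for inspection; dropping them instead is also an option if they carry no value.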