Elasticsearch 7.10.1 Cluster Benchmark Report (32 cores / 64 GB × 3, AMD)

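Notes

The problems and solutions described in this article also apply to Tencent Cloud Elasticsearch Service (ES).

Also used: Tencent Cloud Cloud Virtual Machine (CVM).

This article follows on from the previous one, the esrally deployment guide for benchmarking Elasticsearch.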

Environment

esrally client environment

  • Versions

Linux: CentOS 7.9

Python: 3.8.7

pip: pip 20.2.3 from pip (python 3.8)

Java: openjdk version 1.8.0_302 (build 1.8.0_302-b08)

Git: 2.7.5

esrally: 2.3.0

  • Specifications

Memory: 32 GB

Disk: SSD cloud disk, 100 GB

Number of CPUs: 1

Number of CPU cores: 16

Elasticsearch server environment

  • Versions

Linux: CentOS 7.2

Java: openjdk version 11.0.9.1-ga (build 11.0.9.1-ga+1, mixed mode)

Elasticsearch version: 7.10.1 (Tencent Cloud Elasticsearch Service, Platinum edition)

  • Specifications

Number of nodes: 3

Memory: 64 GB

Disk: SSD cloud disk, 1 TB

Number of CPUs: 1

Number of CPU cores: 32

CPU model: AMD EPYC 7K62 48-Core Processor

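Background

In today's big-data era, business volumes keep growing, and hundreds of gigabytes or even terabytes of data are routinely generated every day, so a high-performance Elasticsearch cluster is essential. Before a service goes live, we need to simulate heavy read and write traffic of network logs and user-behavior logs, measure the performance metrics, and locate the cluster's bottlenecks, so that we can determine what hardware configuration and what application-level optimizations are needed to handle the current workload.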

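Benchmark

esrally terms and parameters

Rally refers to rally car racing, so its terminology is borrowed from that sport.

  • track: the race track; here it means the sample data and the benchmarking strategy. Use esrally list tracks to list the available tracks. The tracks that ship with Rally can be browsed at https://github.com/elastic/rally-tracks; each track's directory contains a README.md that describes the data set and the parameters in detail. If no track is specified, the geonames track is used by default;
  • target-hosts: the IP address and port of the remote Elasticsearch cluster, given as ip:port;
  • pipeline: a benchmarking workflow; list them with esrally list pipelines. One of them, benchmark-only, leaves the management of Elasticsearch to the user and uses Rally purely as a load generator; use this pipeline to benchmark an existing cluster;
  • track-params: overrides the track's default parameters;
  • user-tag: a tag that labels this benchmark run;
  • client-options: client connection options, such as the user name and password.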

Benchmark command

esrally race \
  --track=geonames \
  --target-hosts=10.0.10.4:9200 \
  --pipeline=benchmark-only \
  --track-params="number_of_shards:3, number_of_replicas:1" \
  --user-tag="version:AMD_32C64G_1T*3" \
  --client-options="basic_auth_user:'elastic', basic_auth_password:'your_password'"
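
The same invocation can be assembled programmatically, which keeps the password out of shell history. The sketch below is illustrative only: the build_race_command and format_params helpers and the ES_PASSWORD environment variable are hypothetical names and not part of esrally; the flags and values are the ones used in the command above.

```python
import os
import shlex


def format_params(params):
    """Render a dict as the comma-separated key:value string used by
    --track-params in the command above."""
    return ",".join(f"{key}:{value}" for key, value in params.items())


def build_race_command(target_hosts, password, user="elastic"):
    """Return the esrally invocation from this article as an argument list.

    Hypothetical helper: it only builds the list and does not run esrally.
    """
    client_options = (
        f"basic_auth_user:'{user}',basic_auth_password:'{password}'"
    )
    return [
        "esrally", "race",
        "--track=geonames",
        f"--target-hosts={target_hosts}",
        "--pipeline=benchmark-only",
        "--track-params=" + format_params(
            {"number_of_shards": 3, "number_of_replicas": 1}),
        "--user-tag=version:AMD_32C64G_1T*3",
        f"--client-options={client_options}",
    ]


if __name__ == "__main__":
    # ES_PASSWORD is an assumed variable name; fall back to the placeholder.
    password = os.environ.get("ES_PASSWORD", "your_password")
    command = build_race_command("10.0.10.4:9200", password)
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(command))
```

Passing the resulting list to subprocess.run would start the race without going through a shell; printing it first makes it easy to review what will be executed.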

Benchmark report

| Metric | Task | Value | Unit |
|---|---|---|---|
| Cumulative indexing time of primary shards | | 12.0018 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 0.00191667 | min |
| Max cumulative indexing time across primary shards | | 4.07257 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 2.89252 | min |
| Cumulative merge count of primary shards | | 103 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0.00151667 | min |
| Max cumulative merge time across primary shards | | 1.03677 | min |
| Cumulative merge throttle time of primary shards | | 1.01787 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0.389717 | min |
| Cumulative refresh time of primary shards | | 0.627867 | min |
| Cumulative refresh count of primary shards | | 882 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0.0103833 | min |
| Max cumulative refresh time across primary shards | | 0.197033 | min |
| Cumulative flush time of primary shards | | 0.31325 | min |
| Cumulative flush count of primary shards | | 16 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0.000266667 | min |
| Max cumulative flush time across primary shards | | 0.133983 | min |
| Total Young Gen GC time | | 4.998 | s |
| Total Young Gen GC count | | 438 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 6.12128 | GB |
| Translog size | | 0.0143655 | GB |
| Heap used for segments | | 1.28726 | MB |
| Heap used for doc values | | 0.214363 | MB |
| Heap used for terms | | 0.930458 | MB |
| Heap used for norms | | 0.0603638 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0.082077 | MB |
| Segment count | | 126 | |
| error rate | index-append | 0 | % |
| Min Throughput | index-stats | 90.01 | ops/s |
| Mean Throughput | index-stats | 90.02 | ops/s |
| Median Throughput | index-stats | 90.02 | ops/s |
| Max Throughput | index-stats | 90.04 | ops/s |
| 50th percentile latency | index-stats | 3.32502 | ms |
| 90th percentile latency | index-stats | 3.79391 | ms |
| 99th percentile latency | index-stats | 4.44376 | ms |
| 99.9th percentile latency | index-stats | 11.1876 | ms |
| 100th percentile latency | index-stats | 18.235 | ms |
| 50th percentile service time | index-stats | 2.59334 | ms |
| 90th percentile service time | index-stats | 2.89264 | ms |
| 99th percentile service time | index-stats | 3.40719 | ms |
| 99.9th percentile service time | index-stats | 4.69541 | ms |
| 100th percentile service time | index-stats | 17.4649 | ms |
| error rate | index-stats | 0 | % |
| Min Throughput | node-stats | 89.96 | ops/s |
| Mean Throughput | node-stats | 89.98 | ops/s |
| Median Throughput | node-stats | 89.98 | ops/s |
| Max Throughput | node-stats | 89.99 | ops/s |
| 50th percentile latency | node-stats | 3.69659 | ms |
| 90th percentile latency | node-stats | 4.27771 | ms |
| 99th percentile latency | node-stats | 5.6903 | ms |
| 99.9th percentile latency | node-stats | 13.8993 | ms |
| 100th percentile latency | node-stats | 20.7916 | ms |
| 50th percentile service time | node-stats | 2.94895 | ms |
| 90th percentile service time | node-stats | 3.40596 | ms |
| 99th percentile service time | node-stats | 4.90547 | ms |
| 99.9th percentile service time | node-stats | 10.2721 | ms |
| 100th percentile service time | node-stats | 20.3025 | ms |
| error rate | node-stats | 0 | % |
| Min Throughput | default | 50.01 | ops/s |
| Mean Throughput | default | 50.01 | ops/s |
| Median Throughput | default | 50.01 | ops/s |
| Max Throughput | default | 50.03 | ops/s |
| 50th percentile latency | default | 3.58861 | ms |
| 90th percentile latency | default | 4.04014 | ms |
| 99th percentile latency | default | 4.86193 | ms |
| 99.9th percentile latency | default | 6.20941 | ms |
| 100th percentile latency | default | 9.99651 | ms |
| 50th percentile service time | default | 2.86091 | ms |
| 90th percentile service time | default | 3.08905 | ms |
| 99th percentile service time | default | 3.60439 | ms |
| 99.9th percentile service time | default | 5.87795 | ms |
| 100th percentile service time | default | 9.52015 | ms |
| error rate | default | 0 | % |
| Min Throughput | term | 99.96 | ops/s |
| Mean Throughput | term | 99.97 | ops/s |
| Median Throughput | term | 99.97 | ops/s |
| Max Throughput | term | 99.98 | ops/s |
| 50th percentile latency | term | 2.7826 | ms |
| 90th percentile latency | term | 3.2312 | ms |
| 99th percentile latency | term | 3.61357 | ms |
| 99.9th percentile latency | term | 6.03389 | ms |
| 100th percentile latency | term | 7.04569 | ms |
| 50th percentile service time | term | 2.06093 | ms |
| 90th percentile service time | term | 2.2888 | ms |
| 99th percentile service time | term | 2.75873 | ms |
| 99.9th percentile service time | term | 5.32203 | ms |
| 100th percentile service time | term | 6.37927 | ms |
| error rate | term | 0 | % |
| Min Throughput | phrase | 109.78 | ops/s |
| Mean Throughput | phrase | 109.86 | ops/s |
| Median Throughput | phrase | 109.88 | ops/s |
| Max Throughput | phrase | 109.91 | ops/s |
| 50th percentile latency | phrase | 3.43511 | ms |
| 90th percentile latency | phrase | 3.97437 | ms |
| 99th percentile latency | phrase | 5.24641 | ms |
| 99.9th percentile latency | phrase | 14.8293 | ms |
| 100th percentile latency | phrase | 20.7531 | ms |
| 50th percentile service time | phrase | 2.74598 | ms |
| 90th percentile service time | phrase | 3.11355 | ms |
| 99th percentile service time | phrase | 4.0228 | ms |
| 99.9th percentile service time | phrase | 12.8595 | ms |
| 100th percentile service time | phrase | 20.0747 | ms |
| error rate | phrase | 0 | % |
| Min Throughput | country_agg_uncached | 3 | ops/s |
| Mean Throughput | country_agg_uncached | 3 | ops/s |
| Median Throughput | country_agg_uncached | 3 | ops/s |
| Max Throughput | country_agg_uncached | 3 | ops/s |
| 50th percentile latency | country_agg_uncached | 227.008 | ms |
| 90th percentile latency | country_agg_uncached | 228.92 | ms |
| 99th percentile latency | country_agg_uncached | 244.17 | ms |
| 100th percentile latency | country_agg_uncached | 247.198 | ms |
| 50th percentile service time | country_agg_uncached | 226.148 | ms |
| 90th percentile service time | country_agg_uncached | 227.994 | ms |
| 99th percentile service time | country_agg_uncached | 243.33 | ms |
| 100th percentile service time | country_agg_uncached | 246.318 | ms |
| error rate | country_agg_uncached | 0 | % |
| Min Throughput | country_agg_cached | 98 | ops/s |
| Mean Throughput | country_agg_cached | 98.53 | ops/s |
| Median Throughput | country_agg_cached | 98.58 | ops/s |
| Max Throughput | country_agg_cached | 98.9 | ops/s |
| 50th percentile latency | country_agg_cached | 2.46718 | ms |
| 90th percentile latency | country_agg_cached | 3.0093 | ms |
| 99th percentile latency | country_agg_cached | 3.94664 | ms |
| 99.9th percentile latency | country_agg_cached | 6.4442 | ms |
| 100th percentile latency | country_agg_cached | 7.78294 | ms |
| 50th percentile service time | country_agg_cached | 1.66946 | ms |
| 90th percentile service time | country_agg_cached | 1.93246 | ms |
| 99th percentile service time | country_agg_cached | 2.26278 | ms |
| 99.9th percentile service time | country_agg_cached | 5.31934 | ms |
| 100th percentile service time | country_agg_cached | 7.03626 | ms |
| error rate | country_agg_cached | 0 | % |
| Min Throughput | scroll | 20.03 | pages/s |
| Mean Throughput | scroll | 20.04 | pages/s |
| Median Throughput | scroll | 20.04 | pages/s |
| Max Throughput | scroll | 20.04 | pages/s |
| 50th percentile latency | scroll | 617.586 | ms |
| 90th percentile latency | scroll | 632.626 | ms |
| 99th percentile latency | scroll | 637.991 | ms |
| 100th percentile latency | scroll | 639.222 | ms |
| 50th percentile service time | scroll | 615.78 | ms |
| 90th percentile service time | scroll | 631.178 | ms |
| 99th percentile service time | scroll | 636.894 | ms |
| 100th percentile service time | scroll | 638.208 | ms |
| error rate | scroll | 0 | % |
| Min Throughput | expression | 1.5 | ops/s |
| Mean Throughput | expression | 1.5 | ops/s |
| Median Throughput | expression | 1.5 | ops/s |
| Max Throughput | expression | 1.5 | ops/s |
| 50th percentile latency | expression | 382.843 | ms |
| 90th percentile latency | expression | 386.208 | ms |
| 99th percentile latency | expression | 407.571 | ms |
| 100th percentile latency | expression | 408.252 | ms |
| 50th percentile service time | expression | 382.038 | ms |
| 90th percentile service time | expression | 385.511 | ms |
| 99th percentile service time | expression | 406.388 | ms |
| 100th percentile service time | expression | 406.605 | ms |
| error rate | expression | 0 | % |
| Min Throughput | painless_static | 1.4 | ops/s |
| Mean Throughput | painless_static | 1.4 | ops/s |
| Median Throughput | painless_static | 1.4 | ops/s |
| Max Throughput | painless_static | 1.4 | ops/s |
| 50th percentile latency | painless_static | 545.018 | ms |
| 90th percentile latency | painless_static | 552.211 | ms |
| 99th percentile latency | painless_static | 563.011 | ms |
| 100th percentile latency | painless_static | 567.372 | ms |
| 50th percentile service time | painless_static | 544.375 | ms |
| 90th percentile service time | painless_static | 551.622 | ms |
| 99th percentile service time | painless_static | 561.831 | ms |
| 100th percentile service time | painless_static | 566.231 | ms |
| error rate | painless_static | 0 | % |
| Min Throughput | painless_dynamic | 1.4 | ops/s |
| Mean Throughput | painless_dynamic | 1.4 | ops/s |
| Median Throughput | painless_dynamic | 1.4 | ops/s |
| Max Throughput | painless_dynamic | 1.4 | ops/s |
| 50th percentile latency | painless_dynamic | 519.255 | ms |
| 90th percentile latency | painless_dynamic | 528.058 | ms |
| 99th percentile latency | painless_dynamic | 534.018 | ms |
| 100th percentile latency | painless_dynamic | 534.093 | ms |
| 50th percentile service time | painless_dynamic | 518.442 | ms |
| 90th percentile service time | painless_dynamic | 527.547 | ms |
| 99th percentile service time | painless_dynamic | 532.945 | ms |
| 100th percentile service time | painless_dynamic | 533.011 | ms |
| error rate | painless_dynamic | 0 | % |
| Min Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Mean Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Median Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Max Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| 50th percentile latency | decay_geo_gauss_function_score | 521.981 | ms |
| 90th percentile latency | decay_geo_gauss_function_score | 523.898 | ms |
| 99th percentile latency | decay_geo_gauss_function_score | 526.734 | ms |
| 100th percentile latency | decay_geo_gauss_function_score | 527.336 | ms |
| 50th percentile service time | decay_geo_gauss_function_score | 519.966 | ms |
| 90th percentile service time | decay_geo_gauss_function_score | 521.63 | ms |
| 99th percentile service time | decay_geo_gauss_function_score | 524.976 | ms |
| 100th percentile service time | decay_geo_gauss_function_score | 525.479 | ms |
| error rate | decay_geo_gauss_function_score | 0 | % |
| Min Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Mean Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Median Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Max Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| 50th percentile latency | decay_geo_gauss_script_score | 538.191 | ms |
| 90th percentile latency | decay_geo_gauss_script_score | 545.069 | ms |
| 99th percentile latency | decay_geo_gauss_script_score | 586.967 | ms |
| 100th percentile latency | decay_geo_gauss_script_score | 586.985 | ms |
| 50th percentile service time | decay_geo_gauss_script_score | 536.905 | ms |
| 90th percentile service time | decay_geo_gauss_script_score | 543.779 | ms |
| 99th percentile service time | decay_geo_gauss_script_score | 585.324 | ms |
| 100th percentile service time | decay_geo_gauss_script_score | 585.823 | ms |
| error rate | decay_geo_gauss_script_score | 0 | % |
| Min Throughput | field_value_function_score | 1.5 | ops/s |
| Mean Throughput | field_value_function_score | 1.5 | ops/s |
| Median Throughput | field_value_function_score | 1.5 | ops/s |
| Max Throughput | field_value_function_score | 1.51 | ops/s |
| 50th percentile latency | field_value_function_score | 194.825 | ms |
| 90th percentile latency | field_value_function_score | 196.333 | ms |
| 99th percentile latency | field_value_function_score | 197.708 | ms |
| 100th percentile latency | field_value_function_score | 198.15 | ms |
| 50th percentile service time | field_value_function_score | 193.185 | ms |
| 90th percentile service time | field_value_function_score | 194.363 | ms |
| 99th percentile service time | field_value_function_score | 195.43 | ms |
| 100th percentile service time | field_value_function_score | 196.297 | ms |
| error rate | field_value_function_score | 0 | % |
| Min Throughput | field_value_script_score | 1.5 | ops/s |
| Mean Throughput | field_value_script_score | 1.5 | ops/s |
| Median Throughput | field_value_script_score | 1.5 | ops/s |
| Max Throughput | field_value_script_score | 1.5 | ops/s |
| 50th percentile latency | field_value_script_score | 254.118 | ms |
| 90th percentile latency | field_value_script_score | 257.491 | ms |
| 99th percentile latency | field_value_script_score | 290.216 | ms |
| 100th percentile latency | field_value_script_score | 291.284 | ms |
| 50th percentile service time | field_value_script_score | 252.402 | ms |
| 90th percentile service time | field_value_script_score | 255.398 | ms |
| 99th percentile service time | field_value_script_score | 288.065 | ms |
| 100th percentile service time | field_value_script_score | 290.369 | ms |
| error rate | field_value_script_score | 0 | % |
| Min Throughput | large_terms | 1.1 | ops/s |
| Mean Throughput | large_terms | 1.1 | ops/s |
| Median Throughput | large_terms | 1.1 | ops/s |
| Max Throughput | large_terms | 1.1 | ops/s |
| 50th percentile latency | large_terms | 837.035 | ms |
| 90th percentile latency | large_terms | 844.836 | ms |
| 99th percentile latency | large_terms | 953.793 | ms |
| 100th percentile latency | large_terms | 983.886 | ms |
| 50th percentile service time | large_terms | 829.683 | ms |
| 90th percentile service time | large_terms | 837.255 | ms |
| 99th percentile service time | large_terms | 906.481 | ms |
| 100th percentile service time | large_terms | 976.468 | ms |
| error rate | large_terms | 0 | % |
| Min Throughput | large_filtered_terms | 1.1 | ops/s |
| Mean Throughput | large_filtered_terms | 1.1 | ops/s |
| Median Throughput | large_filtered_terms | 1.1 | ops/s |
| Max Throughput | large_filtered_terms | 1.1 | ops/s |
| 50th percentile latency | large_filtered_terms | 741.092 | ms |
| 90th percentile latency | large_filtered_terms | 745.989 | ms |
| 99th percentile latency | large_filtered_terms | 748.138 | ms |
| 100th percentile latency | large_filtered_terms | 751.272 | ms |
| 50th percentile service time | large_filtered_terms | 733.844 | ms |
| 90th percentile service time | large_filtered_terms | 738.678 | ms |
| 99th percentile service time | large_filtered_terms | 741.33 | ms |
| 100th percentile service time | large_filtered_terms | 743.794 | ms |
| error rate | large_filtered_terms | 0 | % |
| Min Throughput | large_prohibited_terms | 1.1 | ops/s |
| Mean Throughput | large_prohibited_terms | 1.1 | ops/s |
| Median Throughput | large_prohibited_terms | 1.1 | ops/s |
| Max Throughput | large_prohibited_terms | 1.1 | ops/s |
| 50th percentile latency | large_prohibited_terms | 731.11 | ms |
| 90th percentile latency | large_prohibited_terms | 779.105 | ms |
| 99th percentile latency | large_prohibited_terms | 787.458 | ms |
| 100th percentile latency | large_prohibited_terms | 789.288 | ms |
| 50th percentile service time | large_prohibited_terms | 723.897 | ms |
| 90th percentile service time | large_prohibited_terms | 771.871 | ms |
| 99th percentile service time | large_prohibited_terms | 780.304 | ms |
| 100th percentile service time | large_prohibited_terms | 781.836 | ms |
| error rate | large_prohibited_terms | 0 | % |
| Min Throughput | desc_sort_population | 1.5 | ops/s |
| Mean Throughput | desc_sort_population | 1.5 | ops/s |
| Median Throughput | desc_sort_population | 1.5 | ops/s |
| Max Throughput | desc_sort_population | 1.51 | ops/s |
| 50th percentile latency | desc_sort_population | 80.1856 | ms |
| 90th percentile latency | desc_sort_population | 83.461 | ms |
| 99th percentile latency | desc_sort_population | 84.7192 | ms |
| 100th percentile latency | desc_sort_population | 84.8816 | ms |
| 50th percentile service time | desc_sort_population | 79.4289 | ms |
| 90th percentile service time | desc_sort_population | 82.2305 | ms |
| 99th percentile service time | desc_sort_population | 83.3647 | ms |
| 100th percentile service time | desc_sort_population | 84.0873 | ms |
| error rate | desc_sort_population | 0 | % |
| Min Throughput | asc_sort_population | 1.5 | ops/s |
| Mean Throughput | asc_sort_population | 1.51 | ops/s |
| Median Throughput | asc_sort_population | 1.51 | ops/s |
| Max Throughput | asc_sort_population | 1.51 | ops/s |
| 50th percentile latency | asc_sort_population | 83.61 | ms |
| 90th percentile latency | asc_sort_population | 84.9397 | ms |
| 99th percentile latency | asc_sort_population | 85.7584 | ms |
| 100th percentile latency | asc_sort_population | 86.0759 | ms |
| 50th percentile service time | asc_sort_population | 82.6323 | ms |
| 90th percentile service time | asc_sort_population | 83.4548 | ms |
| 99th percentile service time | asc_sort_population | 84.8149 | ms |
| 100th percentile service time | asc_sort_population | 85.2134 | ms |
| error rate | asc_sort_population | 0 | % |
| Min Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Mean Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Median Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Max Throughput | asc_sort_with_after_population | 1.51 | ops/s |
| 50th percentile latency | asc_sort_with_after_population | 129.041 | ms |
| 90th percentile latency | asc_sort_with_after_population | 136.526 | ms |
| 99th percentile latency | asc_sort_with_after_population | 139.951 | ms |
| 100th percentile latency | asc_sort_with_after_population | 140.097 | ms |
| 50th percentile service time | asc_sort_with_after_population | 127.7 | ms |
| 90th percentile service time | asc_sort_with_after_population | 135.069 | ms |
| 99th percentile service time | asc_sort_with_after_population | 138.689 | ms |
| 100th percentile service time | asc_sort_with_after_population | 139.159 | ms |
| error rate | asc_sort_with_after_population | 0 | % |
| Min Throughput | desc_sort_geonameid | 6.02 | ops/s |
| Mean Throughput | desc_sort_geonameid | 6.02 | ops/s |
| Median Throughput | desc_sort_geonameid | 6.02 | ops/s |
| Max Throughput | desc_sort_geonameid | 6.02 | ops/s |
| 50th percentile latency | desc_sort_geonameid | 6.0802 | ms |
| 90th percentile latency | desc_sort_geonameid | 6.64085 | ms |
| 99th percentile latency | desc_sort_geonameid | 7.08974 | ms |
| 100th percentile latency | desc_sort_geonameid | 7.19372 | ms |
| 50th percentile service time | desc_sort_geonameid | 5.1818 | ms |
| 90th percentile service time | desc_sort_geonameid | 5.68349 | ms |
| 99th percentile service time | desc_sort_geonameid | 5.99611 | ms |
| 100th percentile service time | desc_sort_geonameid | 6.12978 | ms |
| error rate | desc_sort_geonameid | 0 | % |
| Min Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| Mean Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| Median Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| Max Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| 50th percentile latency | desc_sort_with_after_geonameid | 132.899 | ms |
| 90th percentile latency | desc_sort_with_after_geonameid | 139.958 | ms |
| 99th percentile latency | desc_sort_with_after_geonameid | 142.393 | ms |
| 100th percentile latency | desc_sort_with_after_geonameid | 147.022 | ms |
| 50th percentile service time | desc_sort_with_after_geonameid | 131.937 | ms |
| 90th percentile service time | desc_sort_with_after_geonameid | 139.392 | ms |
| 99th percentile service time | desc_sort_with_after_geonameid | 141.311 | ms |
| 100th percentile service time | desc_sort_with_after_geonameid | 145.89 | ms |
| error rate | desc_sort_with_after_geonameid | 0 | % |
| Min Throughput | asc_sort_geonameid | 6.01 | ops/s |
| Mean Throughput | asc_sort_geonameid | 6.01 | ops/s |
| Median Throughput | asc_sort_geonameid | 6.01 | ops/s |
| Max Throughput | asc_sort_geonameid | 6.01 | ops/s |
| 50th percentile latency | asc_sort_geonameid | 5.9837 | ms |
| 90th percentile latency | asc_sort_geonameid | 6.61111 | ms |
| 99th percentile latency | asc_sort_geonameid | 6.84505 | ms |
| 100th percentile latency | asc_sort_geonameid | 7.04724 | ms |
| 50th percentile service time | asc_sort_geonameid | 5.16965 | ms |
| 90th percentile service time | asc_sort_geonameid | 5.52956 | ms |
| 99th percentile service time | asc_sort_geonameid | 5.73313 | ms |
| 100th percentile service time | asc_sort_geonameid | 5.77093 | ms |
| error rate | asc_sort_geonameid | 0 | % |
| Min Throughput | asc_sort_with_after_geonameid | 6 | ops/s |
| Mean Throughput | asc_sort_with_after_geonameid | 6.01 | ops/s |
| Median Throughput | asc_sort_with_after_geonameid | 6.01 | ops/s |
| Max Throughput | asc_sort_with_after_geonameid | 6.01 | ops/s |
| 50th percentile latency | asc_sort_with_after_geonameid | 124.876 | ms |
| 90th percentile latency | asc_sort_with_after_geonameid | 126.708 | ms |
| 99th percentile latency | asc_sort_with_after_geonameid | 128.544 | ms |
| 100th percentile latency | asc_sort_with_after_geonameid | 129.14 | ms |
| 50th percentile service time | asc_sort_with_after_geonameid | 124.197 | ms |
| 90th percentile service time | asc_sort_with_after_geonameid | 125.81 | ms |
| 99th percentile service time | asc_sort_with_after_geonameid | 127.88 | ms |
| 100th percentile service time | asc_sort_with_after_geonameid | 128.285 | ms |
| error rate | asc_sort_with_after_geonameid | 0 | % |
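
In Rally's reports, latency includes the time a request waits before it is sent, whereas service time covers only the span between sending the request and receiving the response. The gap between the two is therefore a rough indicator of client-side waiting. The sketch below works this out for three tasks, using 50th percentile figures copied from the report; the function and variable names are hypothetical, and subtracting percentiles only approximates the true wait-time distribution.

```python
# 50th percentile (latency, service time) in ms, copied from the report.
P50_MS = {
    "term": (2.7826, 2.06093),
    "scroll": (617.586, 615.78),
    "large_terms": (837.035, 829.683),
}


def wait_time_ms(latency_ms, service_time_ms):
    """Approximate wait time: the part of latency not spent on the request."""
    return latency_ms - service_time_ms


def wait_share(latency_ms, service_time_ms):
    """Wait time as a fraction of latency."""
    return wait_time_ms(latency_ms, service_time_ms) / latency_ms


if __name__ == "__main__":
    for task, (latency, service) in P50_MS.items():
        print(
            f"{task}: wait ~{wait_time_ms(latency, service):.3f} ms "
            f"({wait_share(latency, service):.1%} of latency)"
        )
```

For the slow queries the gap is a small fraction of total latency, which suggests that requests were not queuing on the client at the throughput the track targets.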
