Daily Summary – Elasticsearch 7.10.1 Cluster Benchmark Report (3 nodes, 8 cores / 32 GB each, AMD)

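Notes

The problems and solutions described in this article also apply to Tencent Cloud Elasticsearch Service (ES).

Also used: Tencent Cloud Cloud Virtual Machine (CVM).

This article follows on from the previous one, "Deployment Guide for esrally, the Elasticsearch Benchmarking Tool".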

Environment configuration

Esrally client environment

  • Versions

Linux: CentOS 7.9

Python: 3.8.7

Pip: pip 20.2.3 from pip (python 3.8)

Java: openjdk version 1.8.0_302 (build 1.8.0_302-b08)

Git: 2.7.5

Esrally: 2.3.0

  • Hardware

Memory: 32 GB

Disk: SSD cloud disk, 100 GB

Number of CPUs: 1

CPU cores: 16

Elasticsearch server environment

  • Versions

Linux: CentOS 7.2

Java: openjdk version 11.0.9.1-ga (build 11.0.9.1-ga+1, mixed mode)

Elasticsearch version: 7.10.1 (Tencent Cloud Elasticsearch Service, Platinum edition)

  • Hardware

Number of nodes: 3

Memory: 32 GB

Disk: SSD cloud disk, 1 TB

Number of CPUs: 1

CPU cores: 8

CPU model: AMD EPYC 7K62 48-Core Processor

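Background

In today's big-data era, business volume keeps growing, and producing hundreds of GB or even several TB of data per day is common, so a high-performance Elasticsearch cluster is essential. Before a service goes live, we need to simulate the reads and writes of large volumes of network logs and user-behavior logs, measure the relevant performance metrics, and locate the cluster's bottlenecks. Only then can we determine what hardware configuration and which business-side optimizations are needed to handle the current workload.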

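Benchmarking

esrally terminology and parameters

"Rally" refers to rally car racing, so the tool borrows its terminology from that sport.

  • track: the race track; here, the sample data and benchmark strategy used for the test. List the available tracks with esrally list tracks. The tracks bundled with rally can be browsed at https://github.com/elastic/rally-tracks; each track's directory contains a README.md that documents the data set and the track parameters in detail. If no track is specified, the geonames track is used by default;
  • target-hosts: the IP address and port of the remote Elasticsearch cluster, given as ip:port;
  • pipeline: a benchmark workflow. List the available pipelines with esrally list pipelines. The benchmark-only pipeline leaves the management of Elasticsearch to the user, and rally only runs the benchmark; use it when benchmarking an existing cluster;
  • track-params: overrides the track's default benchmark parameters;
  • user-tag: a tag that labels this benchmark run;
  • client-options: client connection options, such as the user name and password.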
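
Both track-params and client-options take a comma-separated list of key:value pairs. As a minimal sketch (the helper name format_options is made up for illustration and is not part of esrally), such a string could be generated from a Python dict like this, assuming string values are wrapped in single quotes and booleans are written in lowercase:

```python
def format_options(options: dict) -> str:
    """Render a dict as a comma-separated key:value string for the command line."""
    parts = []
    for key, value in options.items():
        if isinstance(value, bool):
            # Check bool before any numeric handling: True is also an int in Python.
            rendered = "true" if value else "false"
        elif isinstance(value, str):
            # Assumption: string values are wrapped in single quotes.
            rendered = f"'{value}'"
        else:
            rendered = str(value)
        parts.append(f"{key}:{rendered}")
    return ",".join(parts)


print("--track-params=" + format_options({"number_of_shards": 3, "number_of_replicas": 1}))
print("--client-options=" + format_options({"basic_auth_user": "elastic",
                                            "basic_auth_password": "your_password"}))
```

Keeping the options in a dict makes it easier to vary a single parameter, for example the replica count, between runs.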

Benchmark command

esrally race \
  --track=geonames \
  --target-hosts=10.0.10.4:9200 \
  --pipeline=benchmark-only \
  --track-params="number_of_shards:3, number_of_replicas:1" \
  --user-tag="version:Intel_8C32G_1T*3" \
  --client-options="basic_auth_user:'elastic', basic_auth_password:'your_password'"
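
For repeated runs it can be handy to assemble this command in a script instead of editing it by hand. The following is only a sketch under assumptions: the function build_race_argv and the environment variable ES_PASSWORD are hypothetical names introduced here, and the flags are exactly those of the command above.

```python
import os
import shlex
from typing import List


def build_race_argv(target_host: str, user: str, password: str,
                    shards: int = 3, replicas: int = 1,
                    user_tag: str = "version:Intel_8C32G_1T*3") -> List[str]:
    """Return the argument list for the esrally race call shown above."""
    return [
        "esrally", "race",
        "--track=geonames",
        f"--target-hosts={target_host}",
        "--pipeline=benchmark-only",
        f"--track-params=number_of_shards:{shards},number_of_replicas:{replicas}",
        f"--user-tag={user_tag}",
        f"--client-options=basic_auth_user:'{user}',basic_auth_password:'{password}'",
    ]


# ES_PASSWORD is a hypothetical variable name; it keeps the password out of the script.
password = os.environ.get("ES_PASSWORD", "your_password")
argv = build_race_argv("10.0.10.4:9200", "elastic", password)

# Print a shell-safe form for copy and paste (shlex.join needs Python 3.8+).
print(shlex.join(argv))
# To actually start the race: subprocess.run(argv, check=True)
```

Passing the arguments as a list avoids shell quoting problems with characters such as * in the user tag.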

Benchmark report

| Metric | Task | Result | Unit |
|---|---|---|---|
| Cumulative indexing time of primary shards | | 15.4204 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 0.00133333 | min |
| Max cumulative indexing time across primary shards | | 5.60557 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 4.10468 | min |
| Cumulative merge count of primary shards | | 127 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0.0014 | min |
| Max cumulative merge time across primary shards | | 1.40863 | min |
| Cumulative merge throttle time of primary shards | | 0.87585 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0.311133 | min |
| Cumulative refresh time of primary shards | | 1.08202 | min |
| Cumulative refresh count of primary shards | | 1053 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0.00851667 | min |
| Max cumulative refresh time across primary shards | | 0.34495 | min |
| Cumulative flush time of primary shards | | 0.288333 | min |
| Cumulative flush count of primary shards | | 21 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0.000325 | min |
| Max cumulative flush time across primary shards | | 0.110533 | min |
| Total Young Gen GC time | | 9.41 | s |
| Total Young Gen GC count | | 1119 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 6.05857 | GB |
| Translog size | | 0.0153261 | GB |
| Heap used for segments | | 0.888233 | MB |
| Heap used for doc values | | 0.187038 | MB |
| Heap used for terms | | 0.59523 | MB |
| Heap used for norms | | 0.0531616 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0.052803 | MB |
| Segment count | | 109 | |
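
Some of the cluster-level figures are easier to judge as ratios. The sketch below is not part of the original test run; it only does arithmetic on values copied from the report (Young Gen GC time and count, merge time and merge throttle time), and the function names are invented for illustration.

```python
# Values copied from the cluster-level section of the report.
YOUNG_GC_TIME_S = 9.41
YOUNG_GC_COUNT = 1119
MERGE_TIME_MIN = 4.10468
MERGE_THROTTLE_MIN = 0.87585


def mean_gc_pause_ms(total_time_s: float, count: int) -> float:
    """Average time per collection in milliseconds (0.0 if nothing was collected)."""
    return total_time_s * 1000.0 / count if count else 0.0


def throttle_ratio(throttle_min: float, total_min: float) -> float:
    """Ratio of throttle time to total time (0.0 if the total is zero)."""
    return throttle_min / total_min if total_min else 0.0


print(f"mean young GC pause: {mean_gc_pause_ms(YOUNG_GC_TIME_S, YOUNG_GC_COUNT):.2f} ms")
print(f"merge throttle ratio: {throttle_ratio(MERGE_THROTTLE_MIN, MERGE_TIME_MIN):.1%}")
```

With the numbers above this gives roughly 8.4 ms per young-generation collection and a merge throttle ratio of about 21%.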

| Metric | Task | Result | Unit |
|---|---|---|---|
| Min Throughput | index-append | 82021.7 | docs/s |
| Mean Throughput | index-append | 82462.8 | docs/s |
| Median Throughput | index-append | 82386.2 | docs/s |
| Max Throughput | index-append | 83125.2 | docs/s |
| 50th percentile latency | index-append | 301.696 | ms |
| 90th percentile latency | index-append | 444.837 | ms |
| 99th percentile latency | index-append | 2023.37 | ms |
| 100th percentile latency | index-append | 2783.09 | ms |
| 50th percentile service time | index-append | 301.696 | ms |
| 90th percentile service time | index-append | 444.837 | ms |
| 99th percentile service time | index-append | 2023.37 | ms |
| 100th percentile service time | index-append | 2783.09 | ms |
| error rate | index-append | 0 | % |
| Min Throughput | index-stats | 90 | ops/s |
| Mean Throughput | index-stats | 90 | ops/s |
| Median Throughput | index-stats | 90 | ops/s |
| Max Throughput | index-stats | 90 | ops/s |
| 50th percentile latency | index-stats | 3.44108 | ms |
| 90th percentile latency | index-stats | 3.98452 | ms |
| 99th percentile latency | index-stats | 5.299 | ms |
| 99.9th percentile latency | index-stats | 38.0072 | ms |
| 100th percentile latency | index-stats | 44.7222 | ms |
| 50th percentile service time | index-stats | 2.69155 | ms |
| 90th percentile service time | index-stats | 3.11186 | ms |
| 99th percentile service time | index-stats | 3.9878 | ms |
| 99.9th percentile service time | index-stats | 10.1688 | ms |
| 100th percentile service time | index-stats | 43.6778 | ms |
| error rate | index-stats | 0 | % |
| Min Throughput | node-stats | 89.97 | ops/s |
| Mean Throughput | node-stats | 89.99 | ops/s |
| Median Throughput | node-stats | 89.99 | ops/s |
| Max Throughput | node-stats | 90 | ops/s |
| 50th percentile latency | node-stats | 3.86662 | ms |
| 90th percentile latency | node-stats | 4.6441 | ms |
| 99th percentile latency | node-stats | 6.03199 | ms |
| 99.9th percentile latency | node-stats | 7.77375 | ms |
| 100th percentile latency | node-stats | 11.1361 | ms |
| 50th percentile service time | node-stats | 3.09424 | ms |
| 90th percentile service time | node-stats | 3.87135 | ms |
| 99th percentile service time | node-stats | 5.18555 | ms |
| 99.9th percentile service time | node-stats | 6.79361 | ms |
| 100th percentile service time | node-stats | 10.4609 | ms |
| error rate | node-stats | 0 | % |
| Min Throughput | default | 50.01 | ops/s |
| Mean Throughput | default | 50.02 | ops/s |
| Median Throughput | default | 50.01 | ops/s |
| Max Throughput | default | 50.03 | ops/s |
| 50th percentile latency | default | 3.79126 | ms |
| 90th percentile latency | default | 4.40135 | ms |
| 99th percentile latency | default | 5.30919 | ms |
| 99.9th percentile latency | default | 9.95044 | ms |
| 100th percentile latency | default | 16.0406 | ms |
| 50th percentile service time | default | 3.11513 | ms |
| 90th percentile service time | default | 3.55932 | ms |
| 99th percentile service time | default | 4.45552 | ms |
| 99.9th percentile service time | default | 9.09098 | ms |
| 100th percentile service time | default | 15.3868 | ms |
| error rate | default | 0 | % |
| Min Throughput | term | 99.87 | ops/s |
| Mean Throughput | term | 99.92 | ops/s |
| Median Throughput | term | 99.93 | ops/s |
| Max Throughput | term | 99.95 | ops/s |
| 50th percentile latency | term | 2.95951 | ms |
| 90th percentile latency | term | 3.47878 | ms |
| 99th percentile latency | term | 5.96863 | ms |
| 99.9th percentile latency | term | 42.9865 | ms |
| 100th percentile latency | term | 45.2784 | ms |
| 50th percentile service time | term | 2.21038 | ms |
| 90th percentile service time | term | 2.64551 | ms |
| 99th percentile service time | term | 3.26334 | ms |
| 99.9th percentile service time | term | 22.3663 | ms |
| 100th percentile service time | term | 31.8719 | ms |
| error rate | term | 0 | % |
| Min Throughput | phrase | 109.79 | ops/s |
| Mean Throughput | phrase | 109.87 | ops/s |
| Median Throughput | phrase | 109.88 | ops/s |
| Max Throughput | phrase | 109.91 | ops/s |
| 50th percentile latency | phrase | 3.46583 | ms |
| 90th percentile latency | phrase | 4.0292 | ms |
| 99th percentile latency | phrase | 4.79164 | ms |
| 99.9th percentile latency | phrase | 22.6104 | ms |
| 100th percentile latency | phrase | 26.7837 | ms |
| 50th percentile service time | phrase | 2.74638 | ms |
| 90th percentile service time | phrase | 3.18039 | ms |
| 99th percentile service time | phrase | 3.84184 | ms |
| 99.9th percentile service time | phrase | 6.36719 | ms |
| 100th percentile service time | phrase | 26.4556 | ms |
| error rate | phrase | 0 | % |
| Min Throughput | country_agg_uncached | 2.99 | ops/s |
| Mean Throughput | country_agg_uncached | 3 | ops/s |
| Median Throughput | country_agg_uncached | 3 | ops/s |
| Max Throughput | country_agg_uncached | 3 | ops/s |
| 50th percentile latency | country_agg_uncached | 269.149 | ms |
| 90th percentile latency | country_agg_uncached | 273.057 | ms |
| 99th percentile latency | country_agg_uncached | 278.817 | ms |
| 100th percentile latency | country_agg_uncached | 286.744 | ms |
| 50th percentile service time | country_agg_uncached | 268.065 | ms |
| 90th percentile service time | country_agg_uncached | 272.192 | ms |
| 99th percentile service time | country_agg_uncached | 278.273 | ms |
| 100th percentile service time | country_agg_uncached | 285.921 | ms |
| error rate | country_agg_uncached | 0 | % |
| Min Throughput | country_agg_cached | 97.54 | ops/s |
| Mean Throughput | country_agg_cached | 98.19 | ops/s |
| Median Throughput | country_agg_cached | 98.25 | ops/s |
| Max Throughput | country_agg_cached | 98.65 | ops/s |
| 50th percentile latency | country_agg_cached | 2.44149 | ms |
| 90th percentile latency | country_agg_cached | 3.57161 | ms |
| 99th percentile latency | country_agg_cached | 4.06758 | ms |
| 99.9th percentile latency | country_agg_cached | 4.75733 | ms |
| 100th percentile latency | country_agg_cached | 8.72978 | ms |
| 50th percentile service time | country_agg_cached | 1.62785 | ms |
| 90th percentile service time | country_agg_cached | 1.94765 | ms |
| 99th percentile service time | country_agg_cached | 2.40375 | ms |
| 99.9th percentile service time | country_agg_cached | 3.82282 | ms |
| 100th percentile service time | country_agg_cached | 8.15348 | ms |
| error rate | country_agg_cached | 0 | % |
| Min Throughput | scroll | 20.03 | pages/s |
| Mean Throughput | scroll | 20.04 | pages/s |
| Median Throughput | scroll | 20.04 | pages/s |
| Max Throughput | scroll | 20.05 | pages/s |
| 50th percentile latency | scroll | 608.43 | ms |
| 90th percentile latency | scroll | 618.341 | ms |
| 99th percentile latency | scroll | 630.375 | ms |
| 100th percentile latency | scroll | 643.218 | ms |
| 50th percentile service time | scroll | 606.853 | ms |
| 90th percentile service time | scroll | 616.445 | ms |
| 99th percentile service time | scroll | 629.186 | ms |
| 100th percentile service time | scroll | 641.773 | ms |
| error rate | scroll | 0 | % |
| Min Throughput | expression | 1.5 | ops/s |
| Mean Throughput | expression | 1.5 | ops/s |
| Median Throughput | expression | 1.5 | ops/s |
| Max Throughput | expression | 1.5 | ops/s |
| 50th percentile latency | expression | 468.457 | ms |
| 90th percentile latency | expression | 471.827 | ms |
| 99th percentile latency | expression | 490.783 | ms |
| 100th percentile latency | expression | 494.895 | ms |
| 50th percentile service time | expression | 467.494 | ms |
| 90th percentile service time | expression | 470.928 | ms |
| 99th percentile service time | expression | 490.078 | ms |
| 100th percentile service time | expression | 494.534 | ms |
| error rate | expression | 0 | % |
| Min Throughput | painless_static | 1.4 | ops/s |
| Mean Throughput | painless_static | 1.4 | ops/s |
| Median Throughput | painless_static | 1.4 | ops/s |
| Max Throughput | painless_static | 1.4 | ops/s |
| 50th percentile latency | painless_static | 621.158 | ms |
| 90th percentile latency | painless_static | 629.028 | ms |
| 99th percentile latency | painless_static | 641.496 | ms |
| 100th percentile latency | painless_static | 649.315 | ms |
| 50th percentile service time | painless_static | 620.261 | ms |
| 90th percentile service time | painless_static | 628.394 | ms |
| 99th percentile service time | painless_static | 641.063 | ms |
| 100th percentile service time | painless_static | 648.63 | ms |
| error rate | painless_static | 0 | % |
| Min Throughput | painless_dynamic | 1.4 | ops/s |
| Mean Throughput | painless_dynamic | 1.4 | ops/s |
| Median Throughput | painless_dynamic | 1.4 | ops/s |
| Max Throughput | painless_dynamic | 1.4 | ops/s |
| 50th percentile latency | painless_dynamic | 606.079 | ms |
| 90th percentile latency | painless_dynamic | 612.458 | ms |
| 99th percentile latency | painless_dynamic | 621.385 | ms |
| 100th percentile latency | painless_dynamic | 636.442 | ms |
| 50th percentile service time | painless_dynamic | 604.965 | ms |
| 90th percentile service time | painless_dynamic | 611.485 | ms |
| 99th percentile service time | painless_dynamic | 620.609 | ms |
| 100th percentile service time | painless_dynamic | 635.157 | ms |
| error rate | painless_dynamic | 0 | % |
| Min Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Mean Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Median Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Max Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| 50th percentile latency | decay_geo_gauss_function_score | 553.953 | ms |
| 90th percentile latency | decay_geo_gauss_function_score | 563.907 | ms |
| 99th percentile latency | decay_geo_gauss_function_score | 580.905 | ms |
| 100th percentile latency | decay_geo_gauss_function_score | 584.343 | ms |
| 50th percentile service time | decay_geo_gauss_function_score | 552.879 | ms |
| 90th percentile service time | decay_geo_gauss_function_score | 562.371 | ms |
| 99th percentile service time | decay_geo_gauss_function_score | 579.423 | ms |
| 100th percentile service time | decay_geo_gauss_function_score | 583.145 | ms |
| error rate | decay_geo_gauss_function_score | 0 | % |
| Min Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Mean Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Median Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Max Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| 50th percentile latency | decay_geo_gauss_script_score | 570.908 | ms |
| 90th percentile latency | decay_geo_gauss_script_score | 581.122 | ms |
| 99th percentile latency | decay_geo_gauss_script_score | 596.53 | ms |
| 100th percentile latency | decay_geo_gauss_script_score | 597.903 | ms |
| 50th percentile service time | decay_geo_gauss_script_score | 570.098 | ms |
| 90th percentile service time | decay_geo_gauss_script_score | 579.971 | ms |
| 99th percentile service time | decay_geo_gauss_script_score | 594.926 | ms |
| 100th percentile service time | decay_geo_gauss_script_score | 597.184 | ms |
| error rate | decay_geo_gauss_script_score | 0 | % |
| Min Throughput | field_value_function_score | 1.5 | ops/s |
| Mean Throughput | field_value_function_score | 1.5 | ops/s |
| Median Throughput | field_value_function_score | 1.5 | ops/s |
| Max Throughput | field_value_function_score | 1.5 | ops/s |
| 50th percentile latency | field_value_function_score | 221.406 | ms |
| 90th percentile latency | field_value_function_score | 230.417 | ms |
| 99th percentile latency | field_value_function_score | 237.445 | ms |
| 100th percentile latency | field_value_function_score | 247.695 | ms |
| 50th percentile service time | field_value_function_score | 220.246 | ms |
| 90th percentile service time | field_value_function_score | 229.26 | ms |
| 99th percentile service time | field_value_function_score | 236.631 | ms |
| 100th percentile service time | field_value_function_score | 246.138 | ms |
| error rate | field_value_function_score | 0 | % |
| Min Throughput | field_value_script_score | 1.5 | ops/s |
| Mean Throughput | field_value_script_score | 1.5 | ops/s |
| Median Throughput | field_value_script_score | 1.5 | ops/s |
| Max Throughput | field_value_script_score | 1.5 | ops/s |
| 50th percentile latency | field_value_script_score | 292.884 | ms |
| 90th percentile latency | field_value_script_score | 295.211 | ms |
| 99th percentile latency | field_value_script_score | 318.695 | ms |
| 100th percentile latency | field_value_script_score | 371.378 | ms |
| 50th percentile service time | field_value_script_score | 291.864 | ms |
| 90th percentile service time | field_value_script_score | 294.001 | ms |
| 99th percentile service time | field_value_script_score | 317.536 | ms |
| 100th percentile service time | field_value_script_score | 370.16 | ms |
| error rate | field_value_script_score | 0 | % |
| Min Throughput | large_terms | 1.1 | ops/s |
| Mean Throughput | large_terms | 1.1 | ops/s |
| Median Throughput | large_terms | 1.1 | ops/s |
| Max Throughput | large_terms | 1.1 | ops/s |
| 50th percentile latency | large_terms | 874.286 | ms |
| 90th percentile latency | large_terms | 887.648 | ms |
| 99th percentile latency | large_terms | 896.463 | ms |
| 100th percentile latency | large_terms | 897.591 | ms |
| 50th percentile service time | large_terms | 867.613 | ms |
| 90th percentile service time | large_terms | 880.374 | ms |
| 99th percentile service time | large_terms | 889.655 | ms |
| 100th percentile service time | large_terms | 889.989 | ms |
| error rate | large_terms | 0 | % |
| Min Throughput | large_filtered_terms | 1.1 | ops/s |
| Mean Throughput | large_filtered_terms | 1.1 | ops/s |
| Median Throughput | large_filtered_terms | 1.1 | ops/s |
| Max Throughput | large_filtered_terms | 1.1 | ops/s |
| 50th percentile latency | large_filtered_terms | 872.465 | ms |
| 90th percentile latency | large_filtered_terms | 884.033 | ms |
| 99th percentile latency | large_filtered_terms | 901.338 | ms |
| 100th percentile latency | large_filtered_terms | 911.676 | ms |
| 50th percentile service time | large_filtered_terms | 865.622 | ms |
| 90th percentile service time | large_filtered_terms | 876.737 | ms |
| 99th percentile service time | large_filtered_terms | 893.794 | ms |
| 100th percentile service time | large_filtered_terms | 903.582 | ms |
| error rate | large_filtered_terms | 0 | % |
| Min Throughput | large_prohibited_terms | 1.1 | ops/s |
| Mean Throughput | large_prohibited_terms | 1.1 | ops/s |
| Median Throughput | large_prohibited_terms | 1.1 | ops/s |
| Max Throughput | large_prohibited_terms | 1.1 | ops/s |
| 50th percentile latency | large_prohibited_terms | 873.317 | ms |
| 90th percentile latency | large_prohibited_terms | 890.626 | ms |
| 99th percentile latency | large_prohibited_terms | 914.139 | ms |
| 100th percentile latency | large_prohibited_terms | 921.4 | ms |
| 50th percentile service time | large_prohibited_terms | 865.85 | ms |
| 90th percentile service time | large_prohibited_terms | 883.182 | ms |
| 99th percentile service time | large_prohibited_terms | 906.849 | ms |
| 100th percentile service time | large_prohibited_terms | 914.852 | ms |
| error rate | large_prohibited_terms | 0 | % |
| Min Throughput | desc_sort_population | 1.5 | ops/s |
| Mean Throughput | desc_sort_population | 1.5 | ops/s |
| Median Throughput | desc_sort_population | 1.5 | ops/s |
| Max Throughput | desc_sort_population | 1.51 | ops/s |
| 50th percentile latency | desc_sort_population | 103.368 | ms |
| 90th percentile latency | desc_sort_population | 121.52 | ms |
| 99th percentile latency | desc_sort_population | 147.378 | ms |
| 100th percentile latency | desc_sort_population | 162.058 | ms |
| 50th percentile service time | desc_sort_population | 102.062 | ms |
| 90th percentile service time | desc_sort_population | 120.179 | ms |
| 99th percentile service time | desc_sort_population | 146.429 | ms |
| 100th percentile service time | desc_sort_population | 161.312 | ms |
| error rate | desc_sort_population | 0 | % |
| Min Throughput | asc_sort_population | 1.5 | ops/s |
| Mean Throughput | asc_sort_population | 1.5 | ops/s |
| Median Throughput | asc_sort_population | 1.5 | ops/s |
| Max Throughput | asc_sort_population | 1.51 | ops/s |
| 50th percentile latency | asc_sort_population | 106.443 | ms |
| 90th percentile latency | asc_sort_population | 118.099 | ms |
| 99th percentile latency | asc_sort_population | 124.97 | ms |
| 100th percentile latency | asc_sort_population | 126.89 | ms |
| 50th percentile service time | asc_sort_population | 105.037 | ms |
| 90th percentile service time | asc_sort_population | 116.566 | ms |
| 99th percentile service time | asc_sort_population | 123.97 | ms |
| 100th percentile service time | asc_sort_population | 125.546 | ms |
| error rate | asc_sort_population | 0 | % |
| Min Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Mean Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Median Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Max Throughput | asc_sort_with_after_population | 1.51 | ops/s |
| 50th percentile latency | asc_sort_with_after_population | 162.77 | ms |
| 90th percentile latency | asc_sort_with_after_population | 178.146 | ms |
| 99th percentile latency | asc_sort_with_after_population | 182.551 | ms |
| 100th percentile latency | asc_sort_with_after_population | 184.061 | ms |
| 50th percentile service time | asc_sort_with_after_population | 161.567 | ms |
| 90th percentile service time | asc_sort_with_after_population | 176.876 | ms |
| 99th percentile service time | asc_sort_with_after_population | 181.754 | ms |
| 100th percentile service time | asc_sort_with_after_population | 182.319 | ms |
| error rate | asc_sort_with_after_population | 0 | % |
| Min Throughput | desc_sort_geonameid | 6.01 | ops/s |
| Mean Throughput | desc_sort_geonameid | 6.01 | ops/s |
| Median Throughput | desc_sort_geonameid | 6.01 | ops/s |
| Max Throughput | desc_sort_geonameid | 6.01 | ops/s |
| 50th percentile latency | desc_sort_geonameid | 7.21374 | ms |
| 90th percentile latency | desc_sort_geonameid | 7.75274 | ms |
| 99th percentile latency | desc_sort_geonameid | 8.15132 | ms |
| 100th percentile latency | desc_sort_geonameid | 8.15188 | ms |
| 50th percentile service time | desc_sort_geonameid | 6.31562 | ms |
| 90th percentile service time | desc_sort_geonameid | 7.02085 | ms |
| 99th percentile service time | desc_sort_geonameid | 7.37525 | ms |
| 100th percentile service time | desc_sort_geonameid | 7.41357 | ms |
| error rate | desc_sort_geonameid | 0 | % |
| Min Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| Mean Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| Median Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| Max Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| 50th percentile latency | desc_sort_with_after_geonameid | 131.94 | ms |
| 90th percentile latency | desc_sort_with_after_geonameid | 140.335 | ms |
| 99th percentile latency | desc_sort_with_after_geonameid | 161.993 | ms |
| 100th percentile latency | desc_sort_with_after_geonameid | 162.614 | ms |
| 50th percentile service time | desc_sort_with_after_geonameid | 131.317 | ms |
| 90th percentile service time | desc_sort_with_after_geonameid | 139.324 | ms |
| 99th percentile service time | desc_sort_with_after_geonameid | 161.083 | ms |
| 100th percentile service time | desc_sort_with_after_geonameid | 161.61 | ms |
| error rate | desc_sort_with_after_geonameid | 0 | % |
| Min Throughput | asc_sort_geonameid | 6.02 | ops/s |
| Mean Throughput | asc_sort_geonameid | 6.02 | ops/s |
| Median Throughput | asc_sort_geonameid | 6.02 | ops/s |
| Max Throughput | asc_sort_geonameid | 6.03 | ops/s |
| 50th percentile latency | asc_sort_geonameid | 6.80715 | ms |
| 90th percentile latency | asc_sort_geonameid | 7.84657 | ms |
| 99th percentile latency | asc_sort_geonameid | 8.26933 | ms |
| 100th percentile latency | asc_sort_geonameid | 8.34375 | ms |
| 50th percentile service time | asc_sort_geonameid | 6.18429 | ms |
| 90th percentile service time | asc_sort_geonameid | 6.78306 | ms |
| 99th percentile service time | asc_sort_geonameid | 7.15076 | ms |
| 100th percentile service time | asc_sort_geonameid | 7.174 | ms |
| error rate | asc_sort_geonameid | 0 | % |
| Min Throughput | asc_sort_with_after_geonameid | 6 | ops/s |
| Mean Throughput | asc_sort_with_after_geonameid | 6 | ops/s |
| Median Throughput | asc_sort_with_after_geonameid | 6 | ops/s |
| Max Throughput | asc_sort_with_after_geonameid | 6 | ops/s |
| 50th percentile latency | asc_sort_with_after_geonameid | 120.317 | ms |
| 90th percentile latency | asc_sort_with_after_geonameid | 131.139 | ms |
| 99th percentile latency | asc_sort_with_after_geonameid | 136.769 | ms |
| 100th percentile latency | asc_sort_with_after_geonameid | 146.23 | ms |
| 50th percentile service time | asc_sort_with_after_geonameid | 119.526 | ms |
| 90th percentile service time | asc_sort_with_after_geonameid | 130.616 | ms |
| 99th percentile service time | asc_sort_with_after_geonameid | 135.632 | ms |
| 100th percentile service time | asc_sort_with_after_geonameid | 145.283 | ms |
| error rate | asc_sort_with_after_geonameid | 0 | % |
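
The report lists both latency and service time for every task. As Rally's documentation describes it, service time spans from sending a request to receiving the response, while latency also includes any time the request waits before it is sent, which matters for tasks run at a target throughput. The sketch below uses a few values copied from the report to estimate that waiting time as the difference of the two medians and to compute a p99/p50 tail ratio. The dictionary layout and function names are illustrative assumptions, and a difference of percentiles is only a rough indicator, not an exact percentile of the waiting time.

```python
# 50th and 99th percentile values in ms, copied from the report for a few tasks.
REPORT = {
    "index-append": {"latency_p50": 301.696, "service_p50": 301.696, "latency_p99": 2023.37},
    "index-stats": {"latency_p50": 3.44108, "service_p50": 2.69155, "latency_p99": 5.299},
    "term": {"latency_p50": 2.95951, "service_p50": 2.21038, "latency_p99": 5.96863},
    "large_terms": {"latency_p50": 874.286, "service_p50": 867.613, "latency_p99": 896.463},
}


def approx_wait_ms(row: dict) -> float:
    """Rough waiting time: median latency minus median service time."""
    return row["latency_p50"] - row["service_p50"]


def tail_ratio(row: dict) -> float:
    """How much slower the 99th percentile is than the median."""
    return row["latency_p99"] / row["latency_p50"]


for task, row in REPORT.items():
    print(f"{task:14s} wait ~ {approx_wait_ms(row):6.3f} ms   p99/p50 = {tail_ratio(row):5.2f}")
```

For index-append the two medians are identical, so the estimated wait is zero, while its p99 latency is roughly 6.7 times the median. The query tasks show a small gap of under 1 ms (index-stats, term) or a few milliseconds (large_terms).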
