Elasticsearch 7.10.1 Cluster Benchmark Report (16 cores / 64 GB x 3, local NVMe SSD, Intel)

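Notes

The problems and solutions described in this article also apply to Tencent Cloud Elasticsearch Service (ES).

The tests also use Tencent Cloud Cloud Virtual Machine (CVM).

This article follows on from the previous one, the deployment guide for the Elasticsearch benchmarking tool esrally.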

Environment

esrally client environment

  • Versions

Linux: CentOS 7.9

Python: 3.8.7

Pip: pip 20.2.3 from pip (python 3.8)

Java: openjdk version 1.8.0_302 (build 1.8.0_302-b08)

Git: 2.7.5

esrally: 2.3.0

  • Configuration

Memory: 32 GB

Disk: SSD cloud disk, 100 GB

CPU sockets: 1

CPU cores: 16

Elasticsearch server environment

  • Versions

Linux: CentOS 7.2

Java: openjdk version 11.0.9.1-ga (build 11.0.9.1-ga+1, mixed mode)

Elasticsearch: 7.10.1 (Tencent Cloud Elasticsearch Service, Platinum edition)

  • Configuration

Nodes: 3

Memory: 64 GB

Disk: local NVMe SSD, 3.5 TB

CPU sockets: 1

CPU cores: 16

CPU model: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz

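Background

In today's big-data era, business volumes keep growing, and it is common to generate hundreds of gigabytes or even terabytes of data per day, so a high-performance Elasticsearch cluster is essential. Before a service goes live, we need to simulate heavy read and write workloads such as network logs and user-behavior logs, measure the relevant performance metrics, and locate the cluster's bottlenecks. The results tell us what hardware configuration and which application-level optimizations are needed to handle the current workload.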

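Benchmark

esrally terminology and parameters

"Rally" refers to rally car racing, so the tool's terminology is borrowed from motorsport.

  • track: the "race track", meaning the sample data and benchmarking strategy used for a test. Run esrally list tracks to list them. The tracks that ship with Rally can be browsed at https://github.com/elastic/rally-tracks; each track's directory contains a README.md that describes the data set and parameters in detail. If no track is specified, the geonames track is used by default;
  • target-hosts: the IP address and port of the remote Elasticsearch nodes, given as ip:port;
  • pipeline: a benchmarking workflow; run esrally list pipelines to see the available ones. With the benchmark-only pipeline, you manage the Elasticsearch cluster yourself and Rally only generates the load. Use it to benchmark an existing cluster;
  • track-params: overrides the track's default parameters;
  • user-tag: a tag that labels this benchmark run;
  • client-options: client connection options, such as the username and password.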

Benchmark command

esrally race \
  --track=geonames \
  --target-hosts=10.0.1.10:9200 \
  --pipeline=benchmark-only \
  --track-params="number_of_shards:3, number_of_replicas:1" \
  --user-tag="version:Intel_16C64G_3.5T*3" \
  --client-options="basic_auth_user:'elastic', basic_auth_password:'your_password'"
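When the same race has to be repeated with different shard counts or credentials, it can help to assemble the command from structured values instead of editing the string by hand. The following is a minimal sketch, not part of esrally itself: the helper names rally_kv and build_race_command are hypothetical, the user tag is a placeholder, and the only formatting it relies on is the comma-separated key:value style already used by --track-params and --client-options above.

```python
import shlex


def rally_kv(options, quote_strings=False):
    """Render a dict as a comma-separated key:value list.

    quote_strings wraps string values in single quotes, matching the
    style of the --client-options value in the command above.
    """
    parts = []
    for key, value in options.items():
        if isinstance(value, bool):
            text = "true" if value else "false"
        elif isinstance(value, str) and quote_strings:
            text = f"'{value}'"
        else:
            text = str(value)
        parts.append(f"{key}:{text}")
    return ",".join(parts)


def build_race_command(track, target_hosts, track_params, user_tag, client_options):
    """Return the esrally invocation as an argument list (hypothetical helper)."""
    return [
        "esrally", "race",
        f"--track={track}",
        f"--target-hosts={','.join(target_hosts)}",
        "--pipeline=benchmark-only",
        f"--track-params={rally_kv(track_params)}",
        f"--user-tag={user_tag}",
        f"--client-options={rally_kv(client_options, quote_strings=True)}",
    ]


command = build_race_command(
    track="geonames",
    target_hosts=["10.0.1.10:9200"],
    track_params={"number_of_shards": 3, "number_of_replicas": 1},
    user_tag="version:example",  # placeholder tag, not the one used in this report
    client_options={
        "basic_auth_user": "elastic",
        "basic_auth_password": "your_password",
    },
)

# Print a shell-quoted form that can be pasted into a terminal.
print(shlex.join(command))
```

The argument list can also be passed directly to subprocess.run, which avoids shell quoting issues with the embedded single quotes.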

Benchmark report

| Metric | Task | Value | Unit |
|---|---|---|---|
| Cumulative indexing time of primary shards | | 13.65421667 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 0.003233333 | min |
| Max cumulative indexing time across primary shards | | 4.546866667 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 3.7998 | min |
| Cumulative merge count of primary shards | | 116 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0.00215 | min |
| Max cumulative merge time across primary shards | | 1.368616667 | min |
| Cumulative merge throttle time of primary shards | | 0.854816667 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0.376866667 | min |
| Cumulative refresh time of primary shards | | 0.8684 | min |
| Cumulative refresh count of primary shards | | 824 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0.013216667 | min |
| Max cumulative refresh time across primary shards | | 0.288866667 | min |
| Cumulative flush time of primary shards | | 0.231516667 | min |
| Cumulative flush count of primary shards | | 16 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 3.33E-05 | min |
| Max cumulative flush time across primary shards | | 0.090116667 | min |
| Total Young Gen GC time | | 5.537 | s |
| Total Young Gen GC count | | 428 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 6.086144759 | GB |
| Translog size | | 0.014669072 | GB |
| Heap used for segments | | 1.157173157 | MB |
| Heap used for doc values | | 0.233444214 | MB |
| Heap used for terms | | 0.801841736 | MB |
| Heap used for norms | | 0.05255127 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0.069335938 | MB |
| Segment count | | 106 | |
| error rate | index-append | 0 | % |
| Min Throughput | index-stats | 90.01 | ops/s |
| Mean Throughput | index-stats | 90.02 | ops/s |
| Median Throughput | index-stats | 90.02 | ops/s |
| Max Throughput | index-stats | 90.03 | ops/s |
| 50th percentile latency | index-stats | 3.455013706 | ms |
| 90th percentile latency | index-stats | 3.865102708 | ms |
| 99th percentile latency | index-stats | 4.179690956 | ms |
| 99.9th percentile latency | index-stats | 6.49599925 | ms |
| 100th percentile latency | index-stats | 6.718149525 | ms |
| 50th percentile service time | index-stats | 2.740972996 | ms |
| 90th percentile service time | index-stats | 2.861405979 | ms |
| 99th percentile service time | index-stats | 3.228418953 | ms |
| 99.9th percentile service time | index-stats | 4.324396003 | ms |
| 100th percentile service time | index-stats | 5.646112026 | ms |
| error rate | index-stats | 0 | % |
| Min Throughput | node-stats | 89.84 | ops/s |
| Mean Throughput | node-stats | 89.94 | ops/s |
| Median Throughput | node-stats | 89.95 | ops/s |
| Max Throughput | node-stats | 89.97 | ops/s |
| 50th percentile latency | node-stats | 3.845604981 | ms |
| 90th percentile latency | node-stats | 4.411485343 | ms |
| 99th percentile latency | node-stats | 7.24903738 | ms |
| 99.9th percentile latency | node-stats | 30.31716143 | ms |
| 100th percentile latency | node-stats | 33.79341552 | ms |
| 50th percentile service time | node-stats | 3.116541484 | ms |
| 90th percentile service time | node-stats | 3.478722525 | ms |
| 99th percentile service time | node-stats | 5.306357127 | ms |
| 99.9th percentile service time | node-stats | 29.87361055 | ms |
| 100th percentile service time | node-stats | 33.57647097 | ms |
| error rate | node-stats | 0 | % |
| Min Throughput | default | 50.01 | ops/s |
| Mean Throughput | default | 50.02 | ops/s |
| Median Throughput | default | 50.02 | ops/s |
| Max Throughput | default | 50.04 | ops/s |
| 50th percentile latency | default | 4.064526525 | ms |
| 90th percentile latency | default | 4.501826642 | ms |
| 99th percentile latency | default | 4.822351079 | ms |
| 99.9th percentile latency | default | 7.446992685 | ms |
| 100th percentile latency | default | 10.88421501 | ms |
| 50th percentile service time | default | 3.334970505 | ms |
| 90th percentile service time | default | 3.497533122 | ms |
| 99th percentile service time | default | 3.920079021 | ms |
| 99.9th percentile service time | default | 6.650144298 | ms |
| 100th percentile service time | default | 9.162903996 | ms |
| error rate | default | 0 | % |
| Min Throughput | term | 99.94 | ops/s |
| Mean Throughput | term | 99.96 | ops/s |
| Median Throughput | term | 99.96 | ops/s |
| Max Throughput | term | 99.97 | ops/s |
| 50th percentile latency | term | 3.888126492 | ms |
| 90th percentile latency | term | 4.309532285 | ms |
| 99th percentile latency | term | 4.719402031 | ms |
| 99.9th percentile latency | term | 6.680716229 | ms |
| 100th percentile latency | term | 7.289358997 | ms |
| 50th percentile service time | term | 3.183407011 | ms |
| 90th percentile service time | term | 3.321979992 | ms |
| 99th percentile service time | term | 3.905628119 | ms |
| 99.9th percentile service time | term | 5.603055375 | ms |
| 100th percentile service time | term | 6.446593034 | ms |
| error rate | term | 0 | % |
| Min Throughput | phrase | 109.74 | ops/s |
| Mean Throughput | phrase | 109.84 | ops/s |
| Median Throughput | phrase | 109.85 | ops/s |
| Max Throughput | phrase | 109.89 | ops/s |
| 50th percentile latency | phrase | 3.632957436 | ms |
| 90th percentile latency | phrase | 4.082768376 | ms |
| 99th percentile latency | phrase | 9.706799537 | ms |
| 99.9th percentile latency | phrase | 29.81259928 | ms |
| 100th percentile latency | phrase | 32.19792765 | ms |
| 50th percentile service time | phrase | 2.910208976 | ms |
| 90th percentile service time | phrase | 3.142374702 | ms |
| 99th percentile service time | phrase | 3.984294158 | ms |
| 99.9th percentile service time | phrase | 20.82108271 | ms |
| 100th percentile service time | phrase | 31.94962203 | ms |
| error rate | phrase | 0 | % |
| Min Throughput | country_agg_uncached | 2.99 | ops/s |
| Mean Throughput | country_agg_uncached | 3 | ops/s |
| Median Throughput | country_agg_uncached | 3 | ops/s |
| Max Throughput | country_agg_uncached | 3 | ops/s |
| 50th percentile latency | country_agg_uncached | 251.0548279 | ms |
| 90th percentile latency | country_agg_uncached | 257.6586545 | ms |
| 99th percentile latency | country_agg_uncached | 309.3294735 | ms |
| 100th percentile latency | country_agg_uncached | 325.2049923 | ms |
| 50th percentile service time | country_agg_uncached | 250.289458 | ms |
| 90th percentile service time | country_agg_uncached | 256.7963825 | ms |
| 99th percentile service time | country_agg_uncached | 308.5876537 | ms |
| 100th percentile service time | country_agg_uncached | 324.82491 | ms |
| error rate | country_agg_uncached | 0 | % |
| Min Throughput | country_agg_cached | 97.7 | ops/s |
| Mean Throughput | country_agg_cached | 98.31 | ops/s |
| Median Throughput | country_agg_cached | 98.37 | ops/s |
| Max Throughput | country_agg_cached | 98.73 | ops/s |
| 50th percentile latency | country_agg_cached | 2.292511461 | ms |
| 90th percentile latency | country_agg_cached | 3.606574354 | ms |
| 99th percentile latency | country_agg_cached | 3.905656755 | ms |
| 99.9th percentile latency | country_agg_cached | 12.49224018 | ms |
| 100th percentile latency | country_agg_cached | 20.38649598 | ms |
| 50th percentile service time | country_agg_cached | 1.612633991 | ms |
| 90th percentile service time | country_agg_cached | 1.707840967 | ms |
| 99th percentile service time | country_agg_cached | 1.882032589 | ms |
| 99.9th percentile service time | country_agg_cached | 2.454972915 | ms |
| 100th percentile service time | country_agg_cached | 19.66166002 | ms |
| error rate | country_agg_cached | 0 | % |
| Min Throughput | scroll | 20.03 | pages/s |
| Mean Throughput | scroll | 20.04 | pages/s |
| Median Throughput | scroll | 20.04 | pages/s |
| Max Throughput | scroll | 20.04 | pages/s |
| 50th percentile latency | scroll | 628.220485 | ms |
| 90th percentile latency | scroll | 644.2532087 | ms |
| 99th percentile latency | scroll | 650.1815793 | ms |
| 100th percentile latency | scroll | 651.516327 | ms |
| 50th percentile service time | scroll | 626.7748305 | ms |
| 90th percentile service time | scroll | 642.7276699 | ms |
| 99th percentile service time | scroll | 648.6820131 | ms |
| 100th percentile service time | scroll | 649.740635 | ms |
| error rate | scroll | 0 | % |
| Min Throughput | expression | 1.5 | ops/s |
| Mean Throughput | expression | 1.5 | ops/s |
| Median Throughput | expression | 1.5 | ops/s |
| Max Throughput | expression | 1.5 | ops/s |
| 50th percentile latency | expression | 428.848215 | ms |
| 90th percentile latency | expression | 439.8192967 | ms |
| 99th percentile latency | expression | 459.0066177 | ms |
| 100th percentile latency | expression | 515.3190996 | ms |
| 50th percentile service time | expression | 428.0562435 | ms |
| 90th percentile service time | expression | 438.3463855 | ms |
| 99th percentile service time | expression | 458.5917785 | ms |
| 100th percentile service time | expression | 514.3833789 | ms |
| error rate | expression | 0 | % |
| Min Throughput | painless_static | 1.4 | ops/s |
| Mean Throughput | painless_static | 1.4 | ops/s |
| Median Throughput | painless_static | 1.4 | ops/s |
| Max Throughput | painless_static | 1.4 | ops/s |
| 50th percentile latency | painless_static | 586.3726893 | ms |
| 90th percentile latency | painless_static | 596.5983276 | ms |
| 99th percentile latency | painless_static | 606.40333 | ms |
| 100th percentile latency | painless_static | 643.5932609 | ms |
| 50th percentile service time | painless_static | 585.549683 | ms |
| 90th percentile service time | painless_static | 595.2985338 | ms |
| 99th percentile service time | painless_static | 604.4269893 | ms |
| 100th percentile service time | painless_static | 642.71596 | ms |
| error rate | painless_static | 0 | % |
| Min Throughput | painless_dynamic | 1.4 | ops/s |
| Mean Throughput | painless_dynamic | 1.4 | ops/s |
| Median Throughput | painless_dynamic | 1.4 | ops/s |
| Max Throughput | painless_dynamic | 1.4 | ops/s |
| 50th percentile latency | painless_dynamic | 578.331885 | ms |
| 90th percentile latency | painless_dynamic | 588.6542239 | ms |
| 99th percentile latency | painless_dynamic | 639.5249295 | ms |
| 100th percentile latency | painless_dynamic | 674.0814669 | ms |
| 50th percentile service time | painless_dynamic | 577.148992 | ms |
| 90th percentile service time | painless_dynamic | 586.6313238 | ms |
| 99th percentile service time | painless_dynamic | 637.8022957 | ms |
| 100th percentile service time | painless_dynamic | 673.520562 | ms |
| error rate | painless_dynamic | 0 | % |
| Min Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Mean Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Median Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Max Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| 50th percentile latency | decay_geo_gauss_function_score | 596.2377415 | ms |
| 90th percentile latency | decay_geo_gauss_function_score | 601.2386714 | ms |
| 99th percentile latency | decay_geo_gauss_function_score | 635.365771 | ms |
| 100th percentile latency | decay_geo_gauss_function_score | 638.511686 | ms |
| 50th percentile service time | decay_geo_gauss_function_score | 595.240855 | ms |
| 90th percentile service time | decay_geo_gauss_function_score | 600.2598313 | ms |
| 99th percentile service time | decay_geo_gauss_function_score | 634.613057 | ms |
| 100th percentile service time | decay_geo_gauss_function_score | 637.025386 | ms |
| error rate | decay_geo_gauss_function_score | 0 | % |
| Min Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Mean Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Median Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Max Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| 50th percentile latency | decay_geo_gauss_script_score | 625.090891 | ms |
| 90th percentile latency | decay_geo_gauss_script_score | 630.8023634 | ms |
| 99th percentile latency | decay_geo_gauss_script_score | 683.7971485 | ms |
| 100th percentile latency | decay_geo_gauss_script_score | 707.296924 | ms |
| 50th percentile service time | decay_geo_gauss_script_score | 623.9091625 | ms |
| 90th percentile service time | decay_geo_gauss_script_score | 629.5137551 | ms |
| 99th percentile service time | decay_geo_gauss_script_score | 682.9331893 | ms |
| 100th percentile service time | decay_geo_gauss_script_score | 705.740146 | ms |
| error rate | decay_geo_gauss_script_score | 0 | % |
| Min Throughput | field_value_function_score | 1.5 | ops/s |
| Mean Throughput | field_value_function_score | 1.5 | ops/s |
| Median Throughput | field_value_function_score | 1.5 | ops/s |
| Max Throughput | field_value_function_score | 1.5 | ops/s |
| 50th percentile latency | field_value_function_score | 229.5384041 | ms |
| 90th percentile latency | field_value_function_score | 233.4263508 | ms |
| 99th percentile latency | field_value_function_score | 270.1970014 | ms |
| 100th percentile latency | field_value_function_score | 275.7009011 | ms |
| 50th percentile service time | field_value_function_score | 228.4333875 | ms |
| 90th percentile service time | field_value_function_score | 232.5404038 | ms |
| 99th percentile service time | field_value_function_score | 268.5175606 | ms |
| 100th percentile service time | field_value_function_score | 274.79857 | ms |
| error rate | field_value_function_score | 0 | % |
| Min Throughput | field_value_script_score | 1.5 | ops/s |
| Mean Throughput | field_value_script_score | 1.5 | ops/s |
| Median Throughput | field_value_script_score | 1.5 | ops/s |
| Max Throughput | field_value_script_score | 1.5 | ops/s |
| 50th percentile latency | field_value_script_score | 310.1942869 | ms |
| 90th percentile latency | field_value_script_score | 316.5398964 | ms |
| 99th percentile latency | field_value_script_score | 370.1661842 | ms |
| 100th percentile latency | field_value_script_score | 370.3959546 | ms |
| 50th percentile service time | field_value_script_score | 309.0658145 | ms |
| 90th percentile service time | field_value_script_score | 315.3409874 | ms |
| 99th percentile service time | field_value_script_score | 369.2647413 | ms |
| 100th percentile service time | field_value_script_score | 369.589395 | ms |
| error rate | field_value_script_score | 0 | % |
| Min Throughput | large_terms | 1.09 | ops/s |
| Mean Throughput | large_terms | 1.09 | ops/s |
| Median Throughput | large_terms | 1.09 | ops/s |
| Max Throughput | large_terms | 1.09 | ops/s |
| 50th percentile latency | large_terms | 3019.357218 | ms |
| 90th percentile latency | large_terms | 3148.364278 | ms |
| 99th percentile latency | large_terms | 3197.829196 | ms |
| 100th percentile latency | large_terms | 3199.601323 | ms |
| 50th percentile service time | large_terms | 910.0391099 | ms |
| 90th percentile service time | large_terms | 959.9049714 | ms |
| 99th percentile service time | large_terms | 968.0313652 | ms |
| 100th percentile service time | large_terms | 970.609337 | ms |
| error rate | large_terms | 0 | % |
| Min Throughput | large_filtered_terms | 1.09 | ops/s |
| Mean Throughput | large_filtered_terms | 1.09 | ops/s |
| Median Throughput | large_filtered_terms | 1.09 | ops/s |
| Max Throughput | large_filtered_terms | 1.09 | ops/s |
| 50th percentile latency | large_filtered_terms | 3620.908266 | ms |
| 90th percentile latency | large_filtered_terms | 4020.584281 | ms |
| 99th percentile latency | large_filtered_terms | 4116.249537 | ms |
| 100th percentile latency | large_filtered_terms | 4132.152895 | ms |
| 50th percentile service time | large_filtered_terms | 943.2442301 | ms |
| 90th percentile service time | large_filtered_terms | 967.7331623 | ms |
| 99th percentile service time | large_filtered_terms | 977.4057028 | ms |
| 100th percentile service time | large_filtered_terms | 989.8770991 | ms |
| error rate | large_filtered_terms | 0 | % |
| Min Throughput | large_prohibited_terms | 1.08 | ops/s |
| Mean Throughput | large_prohibited_terms | 1.08 | ops/s |
| Median Throughput | large_prohibited_terms | 1.08 | ops/s |
| Max Throughput | large_prohibited_terms | 1.08 | ops/s |
| 50th percentile latency | large_prohibited_terms | 5511.526279 | ms |
| 90th percentile latency | large_prohibited_terms | 6377.10955 | ms |
| 99th percentile latency | large_prohibited_terms | 6511.666184 | ms |
| 100th percentile latency | large_prohibited_terms | 6529.329016 | ms |
| 50th percentile service time | large_prohibited_terms | 915.282855 | ms |
| 90th percentile service time | large_prohibited_terms | 927.2769783 | ms |
| 99th percentile service time | large_prohibited_terms | 1003.07909 | ms |
| 100th percentile service time | large_prohibited_terms | 1030.781596 | ms |
| error rate | large_prohibited_terms | 0 | % |
| Min Throughput | desc_sort_population | 1.5 | ops/s |
| Mean Throughput | desc_sort_population | 1.5 | ops/s |
| Median Throughput | desc_sort_population | 1.5 | ops/s |
| Max Throughput | desc_sort_population | 1.51 | ops/s |
| 50th percentile latency | desc_sort_population | 94.85391242 | ms |
| 90th percentile latency | desc_sort_population | 98.01778586 | ms |
| 99th percentile latency | desc_sort_population | 172.1419359 | ms |
| 100th percentile latency | desc_sort_population | 172.3068906 | ms |
| 50th percentile service time | desc_sort_population | 92.73930395 | ms |
| 90th percentile service time | desc_sort_population | 95.87958701 | ms |
| 99th percentile service time | desc_sort_population | 169.6528183 | ms |
| 100th percentile service time | desc_sort_population | 169.8743009 | ms |
| error rate | desc_sort_population | 0 | % |
| Min Throughput | asc_sort_population | 1.5 | ops/s |
| Mean Throughput | asc_sort_population | 1.51 | ops/s |
| Median Throughput | asc_sort_population | 1.51 | ops/s |
| Max Throughput | asc_sort_population | 1.51 | ops/s |
| 50th percentile latency | asc_sort_population | 97.17846307 | ms |
| 90th percentile latency | asc_sort_population | 99.45248545 | ms |
| 99th percentile latency | asc_sort_population | 159.9782795 | ms |
| 100th percentile latency | asc_sort_population | 163.9837084 | ms |
| 50th percentile service time | asc_sort_population | 95.09641747 | ms |
| 90th percentile service time | asc_sort_population | 96.82746612 | ms |
| 99th percentile service time | asc_sort_population | 158.4363437 | ms |
| 100th percentile service time | asc_sort_population | 161.421165 | ms |
| error rate | asc_sort_population | 0 | % |
| Min Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Mean Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Median Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Max Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| 50th percentile latency | asc_sort_with_after_population | 148.1033507 | ms |
| 90th percentile latency | asc_sort_with_after_population | 151.5490926 | ms |
| 99th percentile latency | asc_sort_with_after_population | 158.6420806 | ms |
| 100th percentile latency | asc_sort_with_after_population | 229.5572197 | ms |
| 50th percentile service time | asc_sort_with_after_population | 146.967508 | ms |
| 90th percentile service time | asc_sort_with_after_population | 150.2630878 | ms |
| 99th percentile service time | asc_sort_with_after_population | 157.6524033 | ms |
| 100th percentile service time | asc_sort_with_after_population | 228.73829 | ms |
| error rate | asc_sort_with_after_population | 0 | % |
| Min Throughput | desc_sort_geonameid | 6.01 | ops/s |
| Mean Throughput | desc_sort_geonameid | 6.02 | ops/s |
| Median Throughput | desc_sort_geonameid | 6.02 | ops/s |
| Max Throughput | desc_sort_geonameid | 6.02 | ops/s |
| 50th percentile latency | desc_sort_geonameid | 5.803632666 | ms |
| 90th percentile latency | desc_sort_geonameid | 6.381984311 | ms |
| 99th percentile latency | desc_sort_geonameid | 7.265141936 | ms |
| 100th percentile latency | desc_sort_geonameid | 7.281934377 | ms |
| 50th percentile service time | desc_sort_geonameid | 4.946461529 | ms |
| 90th percentile service time | desc_sort_geonameid | 5.292251252 | ms |
| 99th percentile service time | desc_sort_geonameid | 6.245202825 | ms |
| 100th percentile service time | desc_sort_geonameid | 6.498822011 | ms |
| error rate | desc_sort_geonameid | 0 | % |
| Min Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| Mean Throughput | desc_sort_with_after_geonameid | 6.01 | ops/s |
| Median Throughput | desc_sort_with_after_geonameid | 6.01 | ops/s |
| Max Throughput | desc_sort_with_after_geonameid | 6.01 | ops/s |
| 50th percentile latency | desc_sort_with_after_geonameid | 126.3386633 | ms |
| 90th percentile latency | desc_sort_with_after_geonameid | 130.6906092 | ms |
| 99th percentile latency | desc_sort_with_after_geonameid | 187.815266 | ms |
| 100th percentile latency | desc_sort_with_after_geonameid | 212.0108256 | ms |
| 50th percentile service time | desc_sort_with_after_geonameid | 125.629341 | ms |
| 90th percentile service time | desc_sort_with_after_geonameid | 129.3062597 | ms |
| 99th percentile service time | desc_sort_with_after_geonameid | 187.3282619 | ms |
| 100th percentile service time | desc_sort_with_after_geonameid | 211.03667 | ms |
| error rate | desc_sort_with_after_geonameid | 0 | % |
| Min Throughput | asc_sort_geonameid | 6.02 | ops/s |
| Mean Throughput | asc_sort_geonameid | 6.02 | ops/s |
| Median Throughput | asc_sort_geonameid | 6.02 | ops/s |
| Max Throughput | asc_sort_geonameid | 6.03 | ops/s |
| 50th percentile latency | asc_sort_geonameid | 5.402479379 | ms |
| 90th percentile latency | asc_sort_geonameid | 6.277742097 | ms |
| 99th percentile latency | asc_sort_geonameid | 6.940018795 | ms |
| 100th percentile latency | asc_sort_geonameid | 7.008069078 | ms |
| 50th percentile service time | asc_sort_geonameid | 4.530322447 | ms |
| 90th percentile service time | asc_sort_geonameid | 5.615919665 | ms |
| 99th percentile service time | asc_sort_geonameid | 5.752198274 | ms |
| 100th percentile service time | asc_sort_geonameid | 5.764805945 | ms |
| error rate | asc_sort_geonameid | 0 | % |
| Min Throughput | asc_sort_with_after_geonameid | 6.01 | ops/s |
| Mean Throughput | asc_sort_with_after_geonameid | 6.01 | ops/s |
| Median Throughput | asc_sort_with_after_geonameid | 6.01 | ops/s |
| Max Throughput | asc_sort_with_after_geonameid | 6.01 | ops/s |
| 50th percentile latency | asc_sort_with_after_geonameid | 117.4643885 | ms |
| 90th percentile latency | asc_sort_with_after_geonameid | 120.7699069 | ms |
| 99th percentile latency | asc_sort_with_after_geonameid | 122.8115193 | ms |
| 100th percentile latency | asc_sort_with_after_geonameid | 125.1218041 | ms |
| 50th percentile service time | asc_sort_with_after_geonameid | 116.4326565 | ms |
| 90th percentile service time | asc_sort_with_after_geonameid | 119.6759058 | ms |
| 99th percentile service time | asc_sort_with_after_geonameid | 122.3663394 | ms |
| 100th percentile service time | asc_sort_with_after_geonameid | 123.856142 | ms |
| error rate | asc_sort_with_after_geonameid | 0 | % |
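One way to read the latency and service-time rows together: Rally reports latency as service time plus the time a request waits before it is sent, so a large gap between the two suggests that a task could not keep up with its target throughput. The sketch below is only an illustration. It copies the median latency, median service time, and mean throughput of a few tasks from the report; the helper names and the 100 ms threshold are arbitrary assumptions, and the "busy" estimate assumes a single client issuing requests one after another, which the report does not state.

```python
# Median latency / service time (ms) and mean throughput (ops/s),
# copied from the report for a handful of tasks.
REPORT = {
    "term": (3.888126492, 3.183407011, 99.96),
    "country_agg_uncached": (251.0548279, 250.289458, 3.0),
    "large_terms": (3019.357218, 910.0391099, 1.09),
    "large_filtered_terms": (3620.908266, 943.2442301, 1.09),
    "large_prohibited_terms": (5511.526279, 915.282855, 1.08),
}


def waiting_time_ms(latency_ms, service_ms):
    """Time a request spent waiting before it was serviced."""
    return latency_ms - service_ms


def busy_fraction(throughput, service_ms):
    """Rough share of wall-clock time a single sequential client spends
    inside requests (an assumption; values near 1.0 mean no headroom)."""
    return throughput * service_ms / 1000.0


def summarize(report, wait_threshold_ms=100.0):
    """Return (task, wait_ms, busy, flagged) tuples; the threshold is arbitrary."""
    rows = []
    for task, (latency_ms, service_ms, throughput) in report.items():
        wait = waiting_time_ms(latency_ms, service_ms)
        busy = busy_fraction(throughput, service_ms)
        rows.append((task, wait, busy, wait > wait_threshold_ms))
    return rows


for task, wait, busy, flagged in summarize(REPORT):
    marker = "  <- requests are queuing" if flagged else ""
    print(f"{task:<24} wait={wait:>8.1f} ms  busy={busy:.2f}{marker}")
```

For the three large_* tasks the gap is several seconds while the service time stays close to 1 s, which is consistent with requests queuing on the load-generator side rather than with slower execution on the cluster. For the other tasks shown, the waiting time is under 1 ms.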
