Daily Share – Elasticsearch 7.10.1 Cluster Benchmark Report (32 cores / 64 GB × 3 nodes, Intel)

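Notes

The problems and solutions described in this article also apply to Tencent Cloud Elasticsearch Service (ES).

This article also uses Tencent Cloud Cloud Virtual Machine (CVM).

This article follows on from the previous one, "Deployment Guide for esrally, the Elasticsearch Benchmarking Tool".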

Environment

Esrally client environment

  • Versions

Linux: CentOS 7.9

Python: 3.8.7

Pip: pip 20.2.3 from pip (python 3.8)

Java: openjdk version 1.8.0_302 (build 1.8.0_302-b08)

Git: 2.7.5

Esrally: 2.3.0

  • Configuration

Memory: 32 GB

Disk: SSD cloud disk, 100 GB

CPUs: 1

CPU cores: 16

Elasticsearch server environment

  • Versions

Linux: CentOS 7.2

Java: openjdk version 11.0.9.1-ga (build 11.0.9.1-ga+1, mixed mode)

Elasticsearch: 7.10.1 (Tencent Cloud Elasticsearch Service, Platinum edition)

  • Configuration

Nodes: 3

Memory: 64 GB

Disk: SSD cloud disk, 1 TB

CPUs: 1

CPU cores: 32

CPU model: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz

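Background

In today's big-data era, business volumes keep growing, and it is common to generate hundreds of gigabytes or even terabytes of data every day, so a high-performance Elasticsearch cluster is essential. Before a service goes live, we need to simulate heavy read and write traffic of network logs and user-behavior logs, measure the performance metrics, and locate the cluster's bottlenecks. Only then can we determine what hardware configuration and application-level optimizations are needed to handle the current workload.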

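Benchmarking

esrally terminology and parameters

"Rally" refers to rally racing, so the tool's terminology is borrowed from motorsport.

  • track: the race track; here it means the sample data and the benchmark strategy. Use esrally list tracks to list the available tracks. The tracks shipped with Rally can be browsed at https://github.com/elastic/rally-tracks; each track's directory contains a README.md that describes the data set and the parameters in detail. If no track is specified, the geonames track is used by default;
  • target-hosts: the IP address and port of the remote Elasticsearch cluster, given as ip:port;
  • pipeline: a benchmark workflow; use esrally list pipelines to see the options. With the benchmark-only pipeline, the user manages Elasticsearch and Rally only runs the benchmark; use it to benchmark an existing cluster;
  • track-params: overrides the track's default parameters;
  • user-tag: a tag that labels this benchmark run;
  • client-options: client connection options, such as the user name and password.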

Benchmark command

esrally race \
  --track=geonames \
  --target-hosts=10.0.10.4:9200 \
  --pipeline=benchmark-only \
  --track-params="number_of_shards:3, number_of_replicas:1" \
  --user-tag="version:Intel_8C32G_1T*3" \
  --client-options="basic_auth_user:'elastic', basic_auth_password:'your_password'"
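
When the same benchmark has to be repeated against several clusters, it can help to assemble the command from code instead of editing it by hand. The following is a minimal sketch, assuming Python 3.8 or later (for shlex.join); build_esrally_command is a hypothetical helper written for this illustration and is not part of esrally. It only uses the flags shown in the command above.

```python
import shlex


def build_esrally_command(target_hosts, track="geonames", track_params=None,
                          user_tag=None, client_options=None):
    """Return the argument list for an `esrally race` run against an existing cluster.

    Hypothetical helper for illustration; it only emits the flags used in this article.
    """
    args = [
        "esrally", "race",
        f"--track={track}",
        f"--target-hosts={target_hosts}",
        "--pipeline=benchmark-only",  # Rally only benchmarks; the cluster is managed by the user
    ]
    if track_params:
        # track-params are passed as comma-separated key:value pairs
        joined = ",".join(f"{key}:{value}" for key, value in track_params.items())
        args.append(f"--track-params={joined}")
    if user_tag:
        args.append(f"--user-tag={user_tag}")
    if client_options:
        # string values are wrapped in single quotes, as in the command above
        joined = ",".join(f"{key}:'{value}'" for key, value in client_options.items())
        args.append(f"--client-options={joined}")
    return args


command = build_esrally_command(
    target_hosts="10.0.10.4:9200",
    track_params={"number_of_shards": 3, "number_of_replicas": 1},
    user_tag="version:Intel_8C32G_1T*3",
    client_options={"basic_auth_user": "elastic", "basic_auth_password": "your_password"},
)

# Print a shell-safe version of the command for review before running it.
print(shlex.join(command))
```

On a machine where esrally is installed, the argument list could be passed to subprocess.run(command, check=True); passing a list rather than a single string avoids shell-quoting problems with characters such as * and '.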

Benchmark report

| Metric | Task | Value | Unit |
| --- | --- | --- | --- |
| Cumulative indexing time of primary shards | | 17.01225 | min |
| Min cumulative indexing time across primary shards | | 0 | min |
| Median cumulative indexing time across primary shards | | 0.012 | min |
| Max cumulative indexing time across primary shards | | 5.682667 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 4.949667 | min |
| Cumulative merge count of primary shards | | 320 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0.010817 | min |
| Max cumulative merge time across primary shards | | 1.830933 | min |
| Cumulative merge throttle time of primary shards | | 1.62605 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 0.781817 | min |
| Cumulative refresh time of primary shards | | 1.250317 | min |
| Cumulative refresh count of primary shards | | 2778 | |
| Min cumulative refresh time across primary shards | | 0 | min |
| Median cumulative refresh time across primary shards | | 0.067933 | min |
| Max cumulative refresh time across primary shards | | 0.313033 | min |
| Cumulative flush time of primary shards | | 0.269583 | min |
| Cumulative flush count of primary shards | | 15 | |
| Min cumulative flush time across primary shards | | 0 | min |
| Median cumulative flush time across primary shards | | 0 | min |
| Max cumulative flush time across primary shards | | 0.10945 | min |
| Total Young Gen GC time | | 7.068 | s |
| Total Young Gen GC count | | 425 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 6.121098 | GB |
| Translog size | | 0.054619 | GB |
| Heap used for segments | | 0.967323 | MB |
| Heap used for doc values | | 0.141586 | MB |
| Heap used for terms | | 0.71254 | MB |
| Heap used for norms | | 0.047729 | MB |
| Heap used for points | | 0 | MB |
| Heap used for stored fields | | 0.065468 | MB |
| Segment count | | 100 | |

| Min Throughput | index-append | 75632.8 | docs/s |
| Mean Throughput | index-append | 78759.4 | docs/s |
| Median Throughput | index-append | 79140.52 | docs/s |
| Max Throughput | index-append | 79771.77 | docs/s |
| 50th percentile latency | index-append | 355.1897 | ms |
| 90th percentile latency | index-append | 1057.213 | ms |
| 99th percentile latency | index-append | 1783.757 | ms |
| 100th percentile latency | index-append | 1900.323 | ms |
| 50th percentile service time | index-append | 355.1897 | ms |
| 90th percentile service time | index-append | 1057.213 | ms |
| 99th percentile service time | index-append | 1783.757 | ms |
| 100th percentile service time | index-append | 1900.323 | ms |
| error rate | index-append | 0 | % |

| Min Throughput | index-stats | 89.87 | ops/s |
| Mean Throughput | index-stats | 89.92 | ops/s |
| Median Throughput | index-stats | 89.93 | ops/s |
| Max Throughput | index-stats | 89.95 | ops/s |
| 50th percentile latency | index-stats | 8.077405 | ms |
| 90th percentile latency | index-stats | 9.348461 | ms |
| 99th percentile latency | index-stats | 12.69016 | ms |
| 99.9th percentile latency | index-stats | 25.73892 | ms |
| 100th percentile latency | index-stats | 26.45408 | ms |
| 50th percentile service time | index-stats | 7.31302 | ms |
| 90th percentile service time | index-stats | 8.543402 | ms |
| 99th percentile service time | index-stats | 9.795916 | ms |
| 99.9th percentile service time | index-stats | 17.87015 | ms |
| 100th percentile service time | index-stats | 26.06197 | ms |
| error rate | index-stats | 0 | % |

| Min Throughput | node-stats | 89.23 | ops/s |
| Mean Throughput | node-stats | 89.71 | ops/s |
| Median Throughput | node-stats | 89.79 | ops/s |
| Max Throughput | node-stats | 89.88 | ops/s |
| 50th percentile latency | node-stats | 8.682566 | ms |
| 90th percentile latency | node-stats | 9.717911 | ms |
| 99th percentile latency | node-stats | 13.65085 | ms |
| 99.9th percentile latency | node-stats | 17.28893 | ms |
| 100th percentile latency | node-stats | 18.81426 | ms |
| 50th percentile service time | node-stats | 7.959221 | ms |
| 90th percentile service time | node-stats | 8.811547 | ms |
| 99th percentile service time | node-stats | 11.70221 | ms |
| 99.9th percentile service time | node-stats | 16.80108 | ms |
| 100th percentile service time | node-stats | 18.08894 | ms |
| error rate | node-stats | 0 | % |

| Min Throughput | default | 50.01 | ops/s |
| Mean Throughput | default | 50.02 | ops/s |
| Median Throughput | default | 50.02 | ops/s |
| Max Throughput | default | 50.04 | ops/s |
| 50th percentile latency | default | 8.182141 | ms |
| 90th percentile latency | default | 9.24675 | ms |
| 99th percentile latency | default | 10.33466 | ms |
| 99.9th percentile latency | default | 22.50608 | ms |
| 100th percentile latency | default | 26.34883 | ms |
| 50th percentile service time | default | 7.163304 | ms |
| 90th percentile service time | default | 7.841727 | ms |
| 99th percentile service time | default | 8.586275 | ms |
| 99.9th percentile service time | default | 21.70245 | ms |
| 100th percentile service time | default | 24.52273 | ms |
| error rate | default | 0 | % |

| Min Throughput | term | 99.93 | ops/s |
| Mean Throughput | term | 99.96 | ops/s |
| Median Throughput | term | 99.96 | ops/s |
| Max Throughput | term | 99.97 | ops/s |
| 50th percentile latency | term | 7.621245 | ms |
| 90th percentile latency | term | 8.385463 | ms |
| 99th percentile latency | term | 12.36773 | ms |
| 99.9th percentile latency | term | 28.44925 | ms |
| 100th percentile latency | term | 30.47515 | ms |
| 50th percentile service time | term | 6.956592 | ms |
| 90th percentile service time | term | 7.521901 | ms |
| 99th percentile service time | term | 8.439376 | ms |
| 99.9th percentile service time | term | 14.6161 | ms |
| 100th percentile service time | term | 29.85921 | ms |
| error rate | term | 0 | % |

| Min Throughput | phrase | 109.82 | ops/s |
| Mean Throughput | phrase | 109.89 | ops/s |
| Median Throughput | phrase | 109.9 | ops/s |
| Max Throughput | phrase | 109.93 | ops/s |
| 50th percentile latency | phrase | 7.218418 | ms |
| 90th percentile latency | phrase | 8.160093 | ms |
| 99th percentile latency | phrase | 29.5303 | ms |
| 99.9th percentile latency | phrase | 51.73535 | ms |
| 100th percentile latency | phrase | 54.22535 | ms |
| 50th percentile service time | phrase | 6.466333 | ms |
| 90th percentile service time | phrase | 7.252101 | ms |
| 99th percentile service time | phrase | 8.131189 | ms |
| 99.9th percentile service time | phrase | 28.56503 | ms |
| 100th percentile service time | phrase | 53.7504 | ms |
| error rate | phrase | 0 | % |

| Min Throughput | country_agg_uncached | 3 | ops/s |
| Mean Throughput | country_agg_uncached | 3 | ops/s |
| Median Throughput | country_agg_uncached | 3 | ops/s |
| Max Throughput | country_agg_uncached | 3 | ops/s |
| 50th percentile latency | country_agg_uncached | 288.9474 | ms |
| 90th percentile latency | country_agg_uncached | 309.166 | ms |
| 99th percentile latency | country_agg_uncached | 326.2984 | ms |
| 100th percentile latency | country_agg_uncached | 327.1032 | ms |
| 50th percentile service time | country_agg_uncached | 288.1656 | ms |
| 90th percentile service time | country_agg_uncached | 308.6675 | ms |
| 99th percentile service time | country_agg_uncached | 325.8579 | ms |
| 100th percentile service time | country_agg_uncached | 326.667 | ms |
| error rate | country_agg_uncached | 0 | % |

| Min Throughput | country_agg_cached | 97.23 | ops/s |
| Mean Throughput | country_agg_cached | 97.96 | ops/s |
| Median Throughput | country_agg_cached | 98.03 | ops/s |
| Max Throughput | country_agg_cached | 98.48 | ops/s |
| 50th percentile latency | country_agg_cached | 5.794905 | ms |
| 90th percentile latency | country_agg_cached | 6.362851 | ms |
| 99th percentile latency | country_agg_cached | 7.16715 | ms |
| 99.9th percentile latency | country_agg_cached | 10.21684 | ms |
| 100th percentile latency | country_agg_cached | 14.41195 | ms |
| 50th percentile service time | country_agg_cached | 5.052252 | ms |
| 90th percentile service time | country_agg_cached | 5.466061 | ms |
| 99th percentile service time | country_agg_cached | 6.142363 | ms |
| 99.9th percentile service time | country_agg_cached | 7.957455 | ms |
| 100th percentile service time | country_agg_cached | 13.78594 | ms |
| error rate | country_agg_cached | 0 | % |

| Min Throughput | scroll | 20.02 | pages/s |
| Mean Throughput | scroll | 20.02 | pages/s |
| Median Throughput | scroll | 20.02 | pages/s |
| Max Throughput | scroll | 20.03 | pages/s |
| 50th percentile latency | scroll | 843.4661 | ms |
| 90th percentile latency | scroll | 865.0329 | ms |
| 99th percentile latency | scroll | 880.6698 | ms |
| 100th percentile latency | scroll | 882.1531 | ms |
| 50th percentile service time | scroll | 841.9406 | ms |
| 90th percentile service time | scroll | 863.089 | ms |
| 99th percentile service time | scroll | 879.1557 | ms |
| 100th percentile service time | scroll | 880.8891 | ms |
| error rate | scroll | 0 | % |

| Min Throughput | expression | 1.5 | ops/s |
| Mean Throughput | expression | 1.5 | ops/s |
| Median Throughput | expression | 1.5 | ops/s |
| Max Throughput | expression | 1.5 | ops/s |
| 50th percentile latency | expression | 516.8727 | ms |
| 90th percentile latency | expression | 538.0341 | ms |
| 99th percentile latency | expression | 571.611 | ms |
| 100th percentile latency | expression | 591.9596 | ms |
| 50th percentile service time | expression | 516.039 | ms |
| 90th percentile service time | expression | 537.3009 | ms |
| 99th percentile service time | expression | 570.3717 | ms |
| 100th percentile service time | expression | 590.8448 | ms |
| error rate | expression | 0 | % |

| Min Throughput | painless_static | 1.4 | ops/s |
| Mean Throughput | painless_static | 1.4 | ops/s |
| Median Throughput | painless_static | 1.4 | ops/s |
| Max Throughput | painless_static | 1.4 | ops/s |
| 50th percentile latency | painless_static | 694.191 | ms |
| 90th percentile latency | painless_static | 746.8065 | ms |
| 99th percentile latency | painless_static | 789.5176 | ms |
| 100th percentile latency | painless_static | 804.777 | ms |
| 50th percentile service time | painless_static | 684.0766 | ms |
| 90th percentile service time | painless_static | 732.2472 | ms |
| 99th percentile service time | painless_static | 784.8393 | ms |
| 100th percentile service time | painless_static | 789.1287 | ms |
| error rate | painless_static | 0 | % |

| Min Throughput | painless_dynamic | 1.4 | ops/s |
| Mean Throughput | painless_dynamic | 1.4 | ops/s |
| Median Throughput | painless_dynamic | 1.4 | ops/s |
| Max Throughput | painless_dynamic | 1.4 | ops/s |
| 50th percentile latency | painless_dynamic | 661.6189 | ms |
| 90th percentile latency | painless_dynamic | 710.2084 | ms |
| 99th percentile latency | painless_dynamic | 754.6959 | ms |
| 100th percentile latency | painless_dynamic | 779.6387 | ms |
| 50th percentile service time | painless_dynamic | 659.1536 | ms |
| 90th percentile service time | painless_dynamic | 704.4893 | ms |
| 99th percentile service time | painless_dynamic | 753.5435 | ms |
| 100th percentile service time | painless_dynamic | 778.9637 | ms |
| error rate | painless_dynamic | 0 | % |

| Min Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Mean Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Median Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| Max Throughput | decay_geo_gauss_function_score | 1 | ops/s |
| 50th percentile latency | decay_geo_gauss_function_score | 600.7554 | ms |
| 90th percentile latency | decay_geo_gauss_function_score | 628.561 | ms |
| 99th percentile latency | decay_geo_gauss_function_score | 639.8627 | ms |
| 100th percentile latency | decay_geo_gauss_function_score | 646.2472 | ms |
| 50th percentile service time | decay_geo_gauss_function_score | 599.7752 | ms |
| 90th percentile service time | decay_geo_gauss_function_score | 627.4241 | ms |
| 99th percentile service time | decay_geo_gauss_function_score | 638.9019 | ms |
| 100th percentile service time | decay_geo_gauss_function_score | 644.9651 | ms |
| error rate | decay_geo_gauss_function_score | 0 | % |

| Min Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Mean Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Median Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| Max Throughput | decay_geo_gauss_script_score | 1 | ops/s |
| 50th percentile latency | decay_geo_gauss_script_score | 613.6628 | ms |
| 90th percentile latency | decay_geo_gauss_script_score | 645.8648 | ms |
| 99th percentile latency | decay_geo_gauss_script_score | 659.1196 | ms |
| 100th percentile latency | decay_geo_gauss_script_score | 667.759 | ms |
| 50th percentile service time | decay_geo_gauss_script_score | 612.4639 | ms |
| 90th percentile service time | decay_geo_gauss_script_score | 644.815 | ms |
| 99th percentile service time | decay_geo_gauss_script_score | 658.5626 | ms |
| 100th percentile service time | decay_geo_gauss_script_score | 666.5389 | ms |
| error rate | decay_geo_gauss_script_score | 0 | % |

| Min Throughput | field_value_function_score | 1.5 | ops/s |
| Mean Throughput | field_value_function_score | 1.5 | ops/s |
| Median Throughput | field_value_function_score | 1.5 | ops/s |
| Max Throughput | field_value_function_score | 1.5 | ops/s |
| 50th percentile latency | field_value_function_score | 247.8083 | ms |
| 90th percentile latency | field_value_function_score | 260.4863 | ms |
| 99th percentile latency | field_value_function_score | 284.7711 | ms |
| 100th percentile latency | field_value_function_score | 328.2898 | ms |
| 50th percentile service time | field_value_function_score | 246.5339 | ms |
| 90th percentile service time | field_value_function_score | 259.3155 | ms |
| 99th percentile service time | field_value_function_score | 283.1177 | ms |
| 100th percentile service time | field_value_function_score | 327.2172 | ms |
| error rate | field_value_function_score | 0 | % |

| Min Throughput | field_value_script_score | 1.5 | ops/s |
| Mean Throughput | field_value_script_score | 1.5 | ops/s |
| Median Throughput | field_value_script_score | 1.5 | ops/s |
| Max Throughput | field_value_script_score | 1.5 | ops/s |
| 50th percentile latency | field_value_script_score | 343.0819 | ms |
| 90th percentile latency | field_value_script_score | 380.6714 | ms |
| 99th percentile latency | field_value_script_score | 415.3917 | ms |
| 100th percentile latency | field_value_script_score | 458.5758 | ms |
| 50th percentile service time | field_value_script_score | 342.5406 | ms |
| 90th percentile service time | field_value_script_score | 379.462 | ms |
| 99th percentile service time | field_value_script_score | 414.4178 | ms |
| 100th percentile service time | field_value_script_score | 457.3435 | ms |
| error rate | field_value_script_score | 0 | % |

| Min Throughput | large_terms | 0.99 | ops/s |
| Mean Throughput | large_terms | 0.99 | ops/s |
| Median Throughput | large_terms | 0.99 | ops/s |
| Max Throughput | large_terms | 0.99 | ops/s |
| 50th percentile latency | large_terms | 25574.39 | ms |
| 90th percentile latency | large_terms | 29477.65 | ms |
| 99th percentile latency | large_terms | 30244.04 | ms |
| 100th percentile latency | large_terms | 30327.25 | ms |
| 50th percentile service time | large_terms | 988.9636 | ms |
| 90th percentile service time | large_terms | 1037.687 | ms |
| 99th percentile service time | large_terms | 1109.41 | ms |
| 100th percentile service time | large_terms | 1111.429 | ms |
| error rate | large_terms | 0 | % |

| Min Throughput | large_filtered_terms | 0.99 | ops/s |
| Mean Throughput | large_filtered_terms | 0.99 | ops/s |
| Median Throughput | large_filtered_terms | 0.99 | ops/s |
| Max Throughput | large_filtered_terms | 0.99 | ops/s |
| 50th percentile latency | large_filtered_terms | 26601.74 | ms |
| 90th percentile latency | large_filtered_terms | 30839.63 | ms |
| 99th percentile latency | large_filtered_terms | 31705.55 | ms |
| 100th percentile latency | large_filtered_terms | 31807.8 | ms |
| 50th percentile service time | large_filtered_terms | 999.2523 | ms |
| 90th percentile service time | large_filtered_terms | 1063.678 | ms |
| 99th percentile service time | large_filtered_terms | 1090.776 | ms |
| 100th percentile service time | large_filtered_terms | 1129.235 | ms |
| error rate | large_filtered_terms | 0 | % |

| Min Throughput | large_prohibited_terms | 0.99 | ops/s |
| Mean Throughput | large_prohibited_terms | 0.99 | ops/s |
| Median Throughput | large_prohibited_terms | 0.99 | ops/s |
| Max Throughput | large_prohibited_terms | 1 | ops/s |
| 50th percentile latency | large_prohibited_terms | 25155.12 | ms |
| 90th percentile latency | large_prohibited_terms | 28645.59 | ms |
| 99th percentile latency | large_prohibited_terms | 29415.42 | ms |
| 100th percentile latency | large_prohibited_terms | 29648.9 | ms |
| 50th percentile service time | large_prohibited_terms | 999.0508 | ms |
| 90th percentile service time | large_prohibited_terms | 1037.584 | ms |
| 99th percentile service time | large_prohibited_terms | 1113.158 | ms |
| 100th percentile service time | large_prohibited_terms | 1137.651 | ms |
| error rate | large_prohibited_terms | 0 | % |

| Min Throughput | desc_sort_population | 1.5 | ops/s |
| Mean Throughput | desc_sort_population | 1.5 | ops/s |
| Median Throughput | desc_sort_population | 1.5 | ops/s |
| Max Throughput | desc_sort_population | 1.51 | ops/s |
| 50th percentile latency | desc_sort_population | 120.1642 | ms |
| 90th percentile latency | desc_sort_population | 129.939 | ms |
| 99th percentile latency | desc_sort_population | 139.4398 | ms |
| 100th percentile latency | desc_sort_population | 139.817 | ms |
| 50th percentile service time | desc_sort_population | 118.8727 | ms |
| 90th percentile service time | desc_sort_population | 128.5942 | ms |
| 99th percentile service time | desc_sort_population | 138.486 | ms |
| 100th percentile service time | desc_sort_population | 138.5035 | ms |
| error rate | desc_sort_population | 0 | % |

| Min Throughput | asc_sort_population | 1.5 | ops/s |
| Mean Throughput | asc_sort_population | 1.5 | ops/s |
| Median Throughput | asc_sort_population | 1.5 | ops/s |
| Max Throughput | asc_sort_population | 1.51 | ops/s |
| 50th percentile latency | asc_sort_population | 128.1445 | ms |
| 90th percentile latency | asc_sort_population | 137.5476 | ms |
| 99th percentile latency | asc_sort_population | 147.0698 | ms |
| 100th percentile latency | asc_sort_population | 148.544 | ms |
| 50th percentile service time | asc_sort_population | 126.667 | ms |
| 90th percentile service time | asc_sort_population | 136.1103 | ms |
| 99th percentile service time | asc_sort_population | 146.2709 | ms |
| 100th percentile service time | asc_sort_population | 147.0874 | ms |
| error rate | asc_sort_population | 0 | % |

| Min Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Mean Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Median Throughput | asc_sort_with_after_population | 1.5 | ops/s |
| Max Throughput | asc_sort_with_after_population | 1.51 | ops/s |
| 50th percentile latency | asc_sort_with_after_population | 179.7538 | ms |
| 90th percentile latency | asc_sort_with_after_population | 187.8424 | ms |
| 99th percentile latency | asc_sort_with_after_population | 197.217 | ms |
| 100th percentile latency | asc_sort_with_after_population | 199.9815 | ms |
| 50th percentile service time | asc_sort_with_after_population | 178.4708 | ms |
| 90th percentile service time | asc_sort_with_after_population | 187.0946 | ms |
| 99th percentile service time | asc_sort_with_after_population | 195.5255 | ms |
| 100th percentile service time | asc_sort_with_after_population | 199.0645 | ms |
| error rate | asc_sort_with_after_population | 0 | % |

| Min Throughput | desc_sort_geonameid | 6.02 | ops/s |
| Mean Throughput | desc_sort_geonameid | 6.02 | ops/s |
| Median Throughput | desc_sort_geonameid | 6.02 | ops/s |
| Max Throughput | desc_sort_geonameid | 6.03 | ops/s |
| 50th percentile latency | desc_sort_geonameid | 12.53419 | ms |
| 90th percentile latency | desc_sort_geonameid | 13.90111 | ms |
| 99th percentile latency | desc_sort_geonameid | 14.92428 | ms |
| 100th percentile latency | desc_sort_geonameid | 15.03593 | ms |
| 50th percentile service time | desc_sort_geonameid | 11.6426 | ms |
| 90th percentile service time | desc_sort_geonameid | 13.02232 | ms |
| 99th percentile service time | desc_sort_geonameid | 14.38134 | ms |
| 100th percentile service time | desc_sort_geonameid | 14.54603 | ms |
| error rate | desc_sort_geonameid | 0 | % |

| Min Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| Mean Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| Median Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| Max Throughput | desc_sort_with_after_geonameid | 6 | ops/s |
| 50th percentile latency | desc_sort_with_after_geonameid | 160.9827 | ms |
| 90th percentile latency | desc_sort_with_after_geonameid | 169.9296 | ms |
| 99th percentile latency | desc_sort_with_after_geonameid | 176.6884 | ms |
| 100th percentile latency | desc_sort_with_after_geonameid | 179.5116 | ms |
| 50th percentile service time | desc_sort_with_after_geonameid | 159.8333 | ms |
| 90th percentile service time | desc_sort_with_after_geonameid | 168.0821 | ms |
| 99th percentile service time | desc_sort_with_after_geonameid | 175.6854 | ms |
| 100th percentile service time | desc_sort_with_after_geonameid | 179.1033 | ms |
| error rate | desc_sort_with_after_geonameid | 0 | % |

| Min Throughput | asc_sort_geonameid | 6.02 | ops/s |
| Mean Throughput | asc_sort_geonameid | 6.02 | ops/s |
| Median Throughput | asc_sort_geonameid | 6.02 | ops/s |
| Max Throughput | asc_sort_geonameid | 6.03 | ops/s |
| 50th percentile latency | asc_sort_geonameid | 11.36692 | ms |
| 90th percentile latency | asc_sort_geonameid | 12.97426 | ms |
| 99th percentile latency | asc_sort_geonameid | 14.58453 | ms |
| 100th percentile latency | asc_sort_geonameid | 15.33824 | ms |
| 50th percentile service time | asc_sort_geonameid | 10.46318 | ms |
| 90th percentile service time | asc_sort_geonameid | 12.13509 | ms |
| 99th percentile service time | asc_sort_geonameid | 14.05151 | ms |
| 100th percentile service time | asc_sort_geonameid | 14.11172 | ms |
| error rate | asc_sort_geonameid | 0 | % |

| Min Throughput | asc_sort_with_after_geonameid | 6 | ops/s |
| Mean Throughput | asc_sort_with_after_geonameid | 6 | ops/s |
| Median Throughput | asc_sort_with_after_geonameid | 6 | ops/s |
| Max Throughput | asc_sort_with_after_geonameid | 6 | ops/s |
| 50th percentile latency | asc_sort_with_after_geonameid | 153.3129 | ms |
| 90th percentile latency | asc_sort_with_after_geonameid | 164.1402 | ms |
| 99th percentile latency | asc_sort_with_after_geonameid | 169.6966 | ms |
| 100th percentile latency | asc_sort_with_after_geonameid | 174.1272 | ms |
| 50th percentile service time | asc_sort_with_after_geonameid | 152.4712 | ms |
| 90th percentile service time | asc_sort_with_after_geonameid | 163.2446 | ms |
| 99th percentile service time | asc_sort_with_after_geonameid | 169.0365 | ms |
| 100th percentile service time | asc_sort_with_after_geonameid | 173.7628 | ms |
| error rate | asc_sort_with_after_geonameid | 0 | % |
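
When reading the report, it helps to compare latency with service time for each task. In Rally's reporting, service time covers only the span from sending a request to receiving the response, while latency also includes the time the request waited before it could be sent. The sketch below subtracts the two for a few tasks, using 50th percentile values copied from the report; the helper names and the 2x threshold are illustrative assumptions, not something Rally defines.

```python
# (task, 50th percentile latency in ms, 50th percentile service time in ms),
# copied from the benchmark report in this article.
SAMPLES = [
    ("term", 7.621245, 6.956592),
    ("scroll", 843.4661, 841.9406),
    ("large_terms", 25574.39, 988.9636),
    ("large_filtered_terms", 26601.74, 999.2523),
    ("large_prohibited_terms", 25155.12, 999.0508),
]


def waiting_time_ms(latency_ms, service_time_ms):
    """Time a request spent waiting before it was sent (latency minus service time)."""
    return latency_ms - service_time_ms


def queued_tasks(samples, ratio_threshold=2.0):
    """Return tasks whose latency exceeds ratio_threshold times their service time.

    The threshold is an arbitrary cutoff chosen for this illustration.
    """
    return [task for task, latency, service in samples
            if latency > ratio_threshold * service]


for task, latency, service in SAMPLES:
    print(f"{task:<24} wait = {waiting_time_ms(latency, service):10.2f} ms")

print("tasks dominated by waiting time:", queued_tasks(SAMPLES))
```

For large_terms, large_filtered_terms, and large_prohibited_terms the gap is roughly 25 seconds, while service time stays near 1 second. A gap of this size usually suggests that the target throughput configured for these tasks was higher than what the cluster sustained, so requests queued on the client side; for these three tasks, service time is probably the more meaningful figure when judging the cluster itself.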
