Identifying bottlenecks using iostat, vmstat, and mpstat

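I have been running some Apache and MySQL performance tests, then upgrading a single component and rerunning the tests to see where the bottleneck is. A local used-server warehouse (Tams in Lindon, Utah) is lending me about 20 processors tomorrow so that I can run these tests and compare more cores against higher clock speeds. After that I will test NVMe RAID against SSD RAID, or HDD RAID against a single HDD.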

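I wrote a script that collects and combines statistics from iostat, vmstat, mpstat, and various other programs. Looking at the results, I am struggling to understand why there is such a large slowdown at 512 threads. Judging by the CPU, disk, and memory load these tools report, there does not seem to be much difference between that run and the 256-thread run.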

For this I am using ApacheBench with the following command:

ab -kc 500 -n 1000 -l -s 60 http://localhost/stress.php

All measurements were taken 5 seconds into the test, so that the test was fully under way.

Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 12G 18G 4.5M 32G 49G, NVMe RAID, TestType: Apache PHP, Threads: 16, Server Load: 0.94, 0.28, 0.15, Threads: 16,   Time per request: 125.523 [ms] (mean), Concurrency Level: 16, Requests per second 127.47 [#/sec] (mean)Time per request:       125.523 [ms] (mean),  Connect: 0    0   0.1      0       1, Processing: 59  124 177.8    121    2924, CPU Core Load: %usr,1.89,1.94,1.98,1.95,1.93,1.92,1.93,1.94,1.93,1.94,1.93,1.95,1.92,1.87,1.88,1.85,1.83,1.85,1.82,1.84,1.82,1.85,1.84,1.85,1.82,, CPU Idle%: %idle,97.66,97.59,97.60,97.60,97.64,97.63,97.64,97.60,97.64,97.59,97.64,97.60,97.65,97.66,97.70,97.69,97.74,97.69,97.75,97.70,97.75,97.70,97.73,97.62,97.75,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 17, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 12G 17G 4.5M 32G 49G, NVMe RAID, TestType: Apache PHP, Threads: 32, Server Load: 3.98, 0.97, 0.37, Threads: 32,   Time per request: 194.680 [ms] (mean), Concurrency Level: 32, Requests per second 164.37 [#/sec] (mean)Time per request:       194.680 [ms] (mean),  Connect: 0    0   0.5      0       4, Processing: 65  193 380.9    143    4300, CPU Core Load: %usr,1.89,1.94,1.99,1.95,1.93,1.92,1.94,1.95,1.94,1.94,1.93,1.95,1.93,1.88,1.88,1.85,1.84,1.85,1.82,1.85,1.82,1.85,1.84,1.86,1.82,, CPU Idle%: %idle,97.66,97.58,97.60,97.60,97.64,97.63,97.63,97.60,97.64,97.59,97.64,97.60,97.65,97.66,97.70,97.69,97.74,97.68,97.75,97.70,97.75,97.70,97.72,97.62,97.74,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 32, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 12G 17G 4.5M 32G 49G, NVMe RAID, TestType: Apache PHP, Threads: 64, Server Load: 7.13, 1.72, 0.62, Threads: 64,   Time per request: 384.736 [ms] (mean), Concurrency Level: 64, Requests per second 166.35 [#/sec] (mean)Time per request:       384.736 [ms] (mean),  Connect: 0    1   3.0      0      14, Processing: 130  378 547.1    311    5501, CPU Core Load: %usr,1.90,1.95,1.99,1.95,1.93,1.93,1.94,1.95,1.94,1.95,1.94,1.95,1.93,1.88,1.89,1.86,1.84,1.86,1.83,1.85,1.83,1.86,1.84,1.86,1.83,, CPU Idle%: %idle,97.66,97.58,97.59,97.59,97.63,97.62,97.63,97.60,97.63,97.58,97.64,97.59,97.64,97.65,97.69,97.69,97.73,97.68,97.75,97.69,97.75,97.69,97.72,97.61,97.74,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 59, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 12G 17G 4.5M 32G 49G, NVMe RAID, TestType: Apache PHP, Threads: 128, Server Load: 11.64, 2.82, 0.99, Threads: 128,   Time per request: 786.459 [ms] (mean), Concurrency Level: 128, Requests per second 162.75 [#/sec] (mean)Time per request:       786.459 [ms] (mean),  Connect: 0    1   4.0      0      14, Processing: 149  742 1175.8    468    6124, CPU Core Load: %usr,1.90,1.95,1.99,1.96,1.94,1.93,1.94,1.95,1.94,1.95,1.94,1.96,1.93,1.88,1.89,1.86,1.84,1.86,1.83,1.85,1.83,1.86,1.85,1.86,1.83,, CPU Idle%: %idle,97.65,97.58,97.59,97.59,97.63,97.62,97.63,97.59,97.63,97.58,97.63,97.59,97.64,97.65,97.69,97.68,97.73,97.68,97.74,97.69,97.74,97.69,97.72,97.61,97.74,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 101, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 13G 17G 4.5M 32G 48G, NVMe RAID, TestType: Apache PHP, Threads: 256, Server Load: 30.81, 7.33, 2.49, Threads: 256,   Time per request: 2309.182 [ms] (mean), Concurrency Level: 256, Requests per second 110.86 [#/sec] (mean)Time per request:       2309.182 [ms] (mean),  Connect: 0    3   5.2      0      14, Processing: 148 1546 1985.2    830    9003, CPU Core Load: %usr,1.91,1.95,2.00,1.96,1.94,1.93,1.95,1.96,1.95,1.96,1.94,1.96,1.94,1.89,1.90,1.86,1.85,1.86,1.83,1.86,1.83,1.86,1.85,1.87,1.84,, CPU Idle%: %idle,97.65,97.57,97.59,97.59,97.63,97.62,97.62,97.59,97.62,97.58,97.63,97.58,97.64,97.65,97.69,97.68,97.73,97.67,97.74,97.69,97.74,97.69,97.71,97.61,97.73,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 77, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 13G 16G 4.5M 32G 48G, NVMe RAID, TestType: Apache PHP, Threads: 512, Server Load: 46.33, 11.29, 3.82, Threads: 512,   Time per request: 28000.550 [ms] (mean), Concurrency Level: 512, Requests per second 18.29 [#/sec] (mean)Time per request:       28000.550 [ms] (mean),  Connect: 0   11  15.6      0      38, Processing: 179 14189 21674.4   1677   54626, CPU Core Load: %usr,1.91,1.96,2.00,1.96,1.94,1.94,1.95,1.96,1.95,1.96,1.95,1.96,1.94,1.89,1.90,1.87,1.85,1.87,1.84,1.86,1.84,1.87,1.85,1.87,1.84,, CPU Idle%: %idle,97.65,97.57,97.58,97.58,97.62,97.61,97.62,97.59,97.62,97.57,97.63,97.58,97.63,97.64,97.68,97.68,97.72,97.67,97.74,97.68,97.74,97.68,97.71,97.60,97.73,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 0, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 14G 16G 4.5M 32G 48G, NVMe RAID, TestType: Apache PHP, Threads: 750, Server Load: 37.39, 13.46, 4.95, Threads: 750,   Time per request: 40777.912 [ms] (mean), Concurrency Level: 750, Requests per second 18.39 [#/sec] (mean)Time per request:       40777.912 [ms] (mean),  Connect: 0   13  17.8      0      43, Processing: 150 17962 23352.7   1807   54307, CPU Core Load: %usr,1.91,1.96,2.00,1.97,1.95,1.94,1.95,1.96,1.95,1.96,1.95,1.97,1.94,1.89,1.90,1.87,1.85,1.87,1.84,1.86,1.84,1.87,1.86,1.87,1.84,, CPU Idle%: %idle,97.64,97.57,97.58,97.58,97.62,97.61,97.62,97.58,97.62,97.57,97.62,97.58,97.63,97.64,97.68,97.67,97.72,97.67,97.73,97.68,97.73,97.68,97.71,97.60,97.73,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 0, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.36, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00

The PHP script runs json_decode(), sorts several thousand records multiple times, uses preg_replace() on some of the array items, and then generates random numbers with the following:

for($z = 0; $z < 10; $z++) {
    $r = array(0,0,0,0,0,0,0,0,0,0,0);
    for ($i=0;$i<100000;$i++) {
      $n = mt_rand(0,10000);
      if ($n<=10) {
        $r[$n]++;
      }
    }
    print_r($r); 
    print_r("<BR>");
}

I am wondering whether there are other commands whose output I should be looking at to determine this. My current command set:

vmstat -w -S m 
iostat -dxmh /dev/md0
mpstat -P ALL
free -h 
lscpu
top -b -n

#Between each test clear cache
sync; echo 3 > /proc/sys/vm/drop_caches 
swapoff -a && swapon -a
Alan

It is good that you can measure response time, even in this microbenchmark. Host-level utilization metrics do not tell the whole story.
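As a rough cross-check on those response times, Little's law says that mean concurrency equals throughput multiplied by mean response time. The sketch below applies it to the requests-per-second and time-per-request figures reported in the question; the function name implied_concurrency is only illustrative and is not part of ab.

```shell
# Little's law: concurrency = throughput (req/s) * mean response time (s).
# implied_concurrency is a hypothetical helper name, not part of ab.
implied_concurrency() {
    # $1 = requests per second, $2 = mean time per request in milliseconds
    awk -v rps="$1" -v ms="$2" 'BEGIN { printf "%.0f\n", rps * ms / 1000 }'
}

# Figures copied from the question's ab output:
implied_concurrency 127.47 125.523     # 16-thread run
implied_concurrency 166.35 384.736     # 64-thread run
implied_concurrency 18.29  28000.550   # 512-thread run
```

Each result should come out at the configured concurrency level, which is expected because ab derives its mean time per request from the concurrency level and the measured throughput. The practical reading is that once throughput stops growing, every extra client only adds waiting time.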

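Performance degraded well before 256 threads. Throughput stopped improving after 32 threads: it stayed at roughly 163 to 166 requests per second from 32 through 128 threads while the mean time per request about doubled at each step, so the runs were already getting worse from the third one on. There is no corresponding increase in CPU utilization or iowait, so there may be contention for some other resource.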
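One caveat before reading too much into the flat CPU and iowait figures: when vmstat, iostat, and mpstat are run without an interval argument, as in the question's command list, they report averages since boot rather than the load during the test. The following is a minimal sketch of interval-based sampling, assuming the usual Linux /proc/stat layout; cpu_busy_pct is a hypothetical helper name.

```shell
# cpu_busy_pct BEFORE AFTER
# Takes two snapshots of the aggregate "cpu" line from /proc/stat and prints
# the percentage of time that was neither idle nor iowait between them.
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Fields 2-9: user nice system idle iowait irq softirq steal
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6
            if (NR == 1) { t0 = total; i0 = idle; next }
            dt = total - t0
            pct = 0
            if (dt > 0) pct = 100 * (dt - (idle - i0)) / dt
            printf "%.1f\n", pct
        }'
}

# Usage while ab is running (5 seconds matches the delay in the question):
#   before=$(grep '^cpu ' /proc/stat); sleep 5; after=$(grep '^cpu ' /proc/stat)
#   cpu_busy_pct "$before" "$after"
#
# The sysstat tools can do the same thing when given an interval and count:
#   mpstat -P ALL 5 1
#   iostat -y -dxm /dev/md0 5 1    # -y skips the since-boot report
#   vmstat -w -S m 5 2             # read the second line, not the first
```

If the interval figures still show idle CPUs during the 512-thread run, that strengthens the case for contention somewhere other than CPU or disk.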

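Keep an open mind and check everything for both utilization and saturation. A very incomplete list:

  • It does random things: does it exhaust the entropy pool and block on /dev/random?
  • Does it use shared resources such as flock or UNIX IPC concurrently?
  • If it prints variables, where does that output go, and how slow is that I/O?
  • How are the many worker threads managed? In other words, which Apache MPM is in use and how is it configured?
  • Does this benchmark use a database? DBMS tuning is a topic of its own.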
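Two of the items above can be checked quickly from a shell. This is only a sketch: it assumes apachectl is on the PATH (some distributions name it apache2ctl or httpd), and mpm_name is a hypothetical helper that simply picks the MPM out of that command's output.

```shell
# mpm_name: read `apachectl -V` output on stdin and print the MPM name.
mpm_name() {
    awk -F': *' '/^Server MPM/ { print $2 }'
}

# Usage (commented out so the snippet has no side effects):
#   apachectl -V | mpm_name
#   cat /proc/sys/kernel/random/entropy_avail   # kernel entropy estimate
```

With the MPM known, compare its worker limit (MaxRequestWorkers in Apache 2.4) with the concurrency passed to ab. Connections beyond that limit normally wait in the listen queue, which would lengthen response times without showing up as CPU load.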

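This is Linux, which has excellent performance tools, so consider sampling everything on CPU and looking at the call graphs. See Gregg's Linux performance notes for how to create CPU flame graphs. Use eBPF where possible, and perf record on older kernels. Read the graph to see where time is being spent among your code, the runtime, the database, and the operating system. For example, if many of the sampled stacks involve file I/O, you may see file-system-related functions.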
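A minimal sketch of that workflow follows, assuming perf is installed and the FlameGraph scripts have been cloned into ./FlameGraph (the path is an assumption). The helper top_leaf_frames is hypothetical; it gives a quick text summary of the folded stacks for cases where opening the SVG is inconvenient.

```shell
# Typical capture with perf and the FlameGraph scripts (needs root; the
# script paths are assumptions about where the repository was cloned):
#   perf record -F 99 -a -g -- sleep 30
#   perf script | ./FlameGraph/stackcollapse-perf.pl > out.folded
#   ./FlameGraph/flamegraph.pl out.folded > out.svg

# top_leaf_frames [N]: read folded stacks ("frame;frame;frame count") on
# stdin and print the N leaf frames with the most samples (default 10).
top_leaf_frames() {
    awk '{
            n = $NF                      # sample count is the last field
            sub(/ +[0-9]+$/, "")         # drop the count, keep the stack
            k = split($0, frames, ";")
            samples[frames[k]] += n
         }
         END { for (f in samples) printf "%d %s\n", samples[f], f }' |
        sort -rn | head -n "${1:-10}"
}

# Usage:
#   top_leaf_frames 5 < out.folded
```

The leaf frame is where the CPU was actually executing when sampled, so a list dominated by kernel or locking functions instead of PHP functions would point away from the script itself.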

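On the topic of looking at everything, also consider getting netdata. This particular tool collects a large number of data points with minimal configuration. In my opinion, the built-in alert set alone makes it worth installing. netdata can also read Apache httpd mod_status, so you can graph connections and workers alongside host metrics.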
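mod_status also serves a machine-readable page (the ?auto query string) that can be sampled by hand during a run. The sketch below assumes mod_status is enabled and mapped to /server-status on localhost; worker_summary is a hypothetical helper name.

```shell
# worker_summary: read mod_status machine-readable output on stdin and print
# busy and idle worker counts on one line.
worker_summary() {
    awk -F': *' '
        $1 == "BusyWorkers" { busy = $2 }
        $1 == "IdleWorkers" { idle = $2 }
        END { printf "busy=%d idle=%d\n", busy, idle }'
}

# Usage, assuming mod_status is enabled and mapped to /server-status:
#   curl -s 'http://localhost/server-status?auto' | worker_summary
```

If the idle count sits at zero while ab is running at high concurrency, the workers are saturated and additional requests are queuing in front of Apache.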

John Mahowald