
NGINX internal load balancing + PHP-FPM upstream causes random double requests/submissions

We are facing a very serious problem: at seemingly random times, the application processes duplicate requests. Typically, a user submits a form once and the content ends up being saved twice.

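We have ruled out a JS-driven double submission: a network analyzer shows that only one request is made. However, we can also see that the PHP application does run twice in full. After a thorough investigation, we found no logic problem in the application that would cause this double-save behavior.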

Edit: I removed the "keepalive 8;" line from the NGINX conf and no longer get double submissions. Instead, the problematic requests now return a 504.

Please take a look at the following and let me know if anything stands out. Thank you.

The NGINX and PHP-FPM configurations are below.

/etc/nginx/nginx.conf

user nginx;
worker_processes  1;
worker_rlimit_nofile 10240;

error_log  /var/log/nginx/error.log;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
    multi_accept on;
    use epoll;
}

http {

    server_tokens off;
    add_header 'Access-Control-Allow-Origin' http://$host;
    add_header 'Access-Control-Allow-Methods' 'GET, POST';
    add_header 'X-Powered-By' 'smartCMS';

    upstream php_fpm {
        least_conn;
        server 127.0.0.1:9000 max_fails=3 fail_timeout=15s;
        keepalive 8;
    }

    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_not_found off;
    access_log    /var/log/nginx/access.log combined buffer=16k;

    open_file_cache max=200000 inactive=20s;
    open_file_cache_valid 30s;
    open_file_cache_min_uses 2;
    open_file_cache_errors on;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;

    keepalive_requests 200;
    keepalive_timeout  65;

    gzip  on;
    gzip_static  on;
    gzip_http_version 1.0;
    gzip_comp_level 6;
    gzip_proxied any;
    gzip_types application/javascript application/x-javascript application/xhtml+xml application/xml application/xml+rss image/svg+xml text/css text/javascript text/plain text/xml;
    gzip_vary on;
    gzip_disable "MSIE [1-6].(?!.*SV1)";

    client_max_body_size 12m;
    client_body_buffer_size 128k;
    client_body_timeout 60;
    client_header_timeout 10;
    large_client_header_buffers 4 16k;
    send_timeout 60;

    server_names_hash_bucket_size 64;

    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}

/etc/php-fpm.conf

;;;;;;;;;;;;;;;;;;;;;
; FPM Configuration ;
;;;;;;;;;;;;;;;;;;;;;

; All relative paths in this configuration file are relative to PHP's install
; prefix.

; Include one or more files. If glob(3) exists, it is used to include a bunch of
; files from a glob(3) pattern. This directive can be used everywhere in the
; file.
include=/etc/php-fpm.d/pools/*.conf

 ;;;;;;;;;;;;;;;;;;
 ;  PHP INI  ;
 ;;;;;;;;;;;;;;;;;;
 php_admin_value[upload_max_filesize] = 10M;
 php_admin_value[post_max_size] = 12M;
 php_admin_value[max_execution_time] = 60;
 php_admin_value[expose_php] = Off;

 ;;;;;;;;;;;;;;;;;;
 ; Global Options ;
 ;;;;;;;;;;;;;;;;;;

 [global]
 ; Pid file
 ; Default Value: none
 pid = /var/run/php-fpm/php-fpm.pid

 ; Error log file
 ; Default Value: /var/log/php-fpm.log
 error_log = /var/log/php-fpm/error.log

 ; Log level
 ; Possible Values: alert, error, warning, notice, debug
 ; Default Value: notice
 log_level = warning

 ; If this number of child processes exit with SIGSEGV or SIGBUS within the time
 ; interval set by emergency_restart_interval then FPM will restart. A value
 ; of '0' means 'Off'.
 ; Default Value: 0
 emergency_restart_threshold = 1

 ; Interval of time used by emergency_restart_interval to determine when
 ; a graceful restart will be initiated.  This can be useful to work around
 ; accidental corruptions in an accelerator's shared memory.
 ; Available Units: s(econds), m(inutes), h(ours), or d(ays)
 ; Default Unit: seconds
 ; Default Value: 0
 emergency_restart_interval = 1m

 ; Time limit for child processes to wait for a reaction on signals from master.
 ; Available units: s(econds), m(inutes), h(ours), or d(ays)
 ; Default Unit: seconds
 ; Default Value: 0
 process_control_timeout = 60s

 ; Send FPM to background. Set to 'no' to keep FPM in foreground for debugging.
 ; Default Value: yes
 daemonize = yes

 ;;;;;;;;;;;;;;;;;;;;
 ; Pool Definitions ;
 ;;;;;;;;;;;;;;;;;;;;

 ; See /etc/php-fpm.d/pools/*.conf

/etc/php-fpm.d/pools/www0.conf

; Start a new pool named 'www0'.
[www0]

; pool_id0php_fpm_service_namephp-fpmtemplatepool.conf.erbnamewwwenabletrue
; The address on which to accept FastCGI requests.
; Valid syntaxes are:
;   'ip.add.re.ss:port'    - to listen on a TCP socket to a specific address on
;                            a specific port;
;   'port'                 - to listen on a TCP socket to all addresses on a
;                            specific port;
;   '/path/to/unix/socket' - to listen on a unix socket.
; Note: This value is mandatory.
listen = 127.0.0.1:9000

; Set listen(2) backlog. A value of '-1' means unlimited.
; Default Value: -1
listen.backlog = 4096

; List of ipv4 addresses of FastCGI clients which are allowed to connect.
; Equivalent to the FCGI_WEB_SERVER_ADDRS environment variable in the original
; PHP FCGI (5.2.2+). Makes sense only with a tcp listening socket. Each address
; must be separated by a comma. If this value is left blank, connections will be
; accepted from any ip address.
; Default Value: any
listen.allowed_clients = 127.0.0.1

; Set permissions for unix socket, if one is used. In Linux, read/write
; permissions must be set in order to allow connections from a web server. Many
; BSD-derived systems allow connections regardless of permissions.
; Default Values: user and group are set as the running user
;                 mode is set to 0666
;listen.owner = nobody
;listen.group = nobody
;listen.mode = 0666

listen.owner = nginx
listen.group = nginx
listen.mode = 0660

; Unix user/group of processes
; Note: The user is mandatory. If the group is not set, the default user's group
;       will be used.
; RPM: Apache Choosed to be able to access some dir as httpd
user = nginx
; RPM: Keep a group allowed to write in log dir.
group = nginx

; Choose how the process manager will control the number of child processes.
; Possible Values:
;   static  - a fixed number (pm.max_children) of child processes;
;   dynamic - the number of child processes are set dynamically based on the
;             following directives:
;             pm.max_children      - the maximum number of children that can
;                                    be alive at the same time.
;             pm.start_servers     - the number of children created on startup.
;             pm.min_spare_servers - the minimum number of children in 'idle'
;                                    state (waiting to process). If the number
;                                    of 'idle' processes is less than this
;                                    number then some children will be created.
;             pm.max_spare_servers - the maximum number of children in 'idle'
;                                    state (waiting to process). If the number
;                                    of 'idle' processes is greater than this
;                                    number then some children will be killed.
; Note: This value is mandatory.
pm = static

; The number of child processes to be created when pm is set to 'static' and the
; maximum number of child processes to be created when pm is set to 'dynamic'.
; This value sets the limit on the number of simultaneous requests that will be
; served. Equivalent to the ApacheMaxClients directive with mpm_prefork.
; Equivalent to the PHP_FCGI_CHILDREN environment variable in the original PHP
; CGI.
; Note: Used when pm is set to either 'static' or 'dynamic'
; Note: This value is mandatory.
pm.max_children = 48


; The number of requests each child process should execute before respawning.
; This can be useful to work around memory leaks in 3rd party libraries. For
; endless request processing specify '0'. Equivalent to PHP_FCGI_MAX_REQUESTS.
; Default Value: 0
pm.max_requests = 10000

; The URI to view the FPM status page. If this value is not set, no URI will be
; recognized as a status page. By default, the status page shows the following
; information:
;   accepted conn    - the number of request accepted by the pool;
;   pool             - the name of the pool;
;   process manager  - static or dynamic;
;   idle processes   - the number of idle processes;
;   active processes - the number of active processes;
;   total processes  - the number of idle + active processes.
; The values of 'idle processes', 'active processes' and 'total processes' are
; updated each second. The value of 'accepted conn' is updated in real time.
; Example output:
;   accepted conn:   12073
;   pool:             www
;   process manager:  static
;   idle processes:   35
;   active processes: 65
;   total processes:  100
; By default the status page output is formatted as text/plain. Passing either
; 'html' or 'json' as a query string will return the corresponding output
; syntax. Example:
;   http://www.foo.bar/status
;   http://www.foo.bar/status?json
;   http://www.foo.bar/status?html
; Note: The value must start with a leading slash (/). The value can be
;       anything, but it may not be a good idea to use the .php extension or it
;       may conflict with a real PHP file.
; Default Value: not set
;pm.status_path = /status

; The ping URI to call the monitoring page of FPM. If this value is not set, no
; URI will be recognized as a ping page. This could be used to test from outside
; that FPM is alive and responding, or to
; - create a graph of FPM availability (rrd or such);
; - remove a server from a group if it is not responding (load balancing);
; - trigger alerts for the operating team (24/7).
; Note: The value must start with a leading slash (/). The value can be
;       anything, but it may not be a good idea to use the .php extension or it
;       may conflict with a real PHP file.
; Default Value: not set
;ping.path = /ping

; This directive may be used to customize the response of a ping request. The
; response is formatted as text/plain with a 200 response code.
; Default Value: pong
;ping.response = pong

; The timeout for serving a single request after which the worker process will
; be killed. This option should be used when the 'max_execution_time' ini option
; does not stop script execution for some reason. A value of '0' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
request_terminate_timeout = 60s

; The timeout for serving a single request after which a PHP backtrace will be
; dumped to the 'slowlog' file. A value of '0s' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
request_slowlog_timeout = 20s

; The log file for slow requests
; Default Value: not set
; Note: slowlog is mandatory if request_slowlog_timeout is set
slowlog = /var/log/php-fpm/www-slow.log

; Set open file descriptor rlimit.
; Default Value: system defined value
;rlimit_files = 1024

; Set max core size rlimit.
; Possible Values: 'unlimited' or an integer greater or equal to 0
; Default Value: system defined value
;rlimit_core = 0

; Chroot to this directory at the start. This value must be defined as an
; absolute path. When this value is not set, chroot is not used.
; Note: chrooting is a great security feature and should be used whenever
;       possible. However, all PHP paths will be relative to the chroot
;       (error_log, sessions.save_path, ...).
; Default Value: not set
;chroot =

; Chdir to this directory at the start. This value must be an absolute path.
; Default Value: current directory or / when chroot
;chdir = /var/www

; Redirect worker stdout and stderr into main error log. If not set, stdout and
; stderr will be redirected to /dev/null according to FastCGI specs.
; Default Value: no
;catch_workers_output = yes

; Limits the extensions of the main script FPM will allow to parse. This can
; prevent configuration mistakes on the web server side. You should only limit
; FPM to .php extensions to prevent malicious users to use other extensions to
; exectute php code.
; Note: set an empty value to allow all extensions.
; Default Value: .php
;security.limit_extensions = .php .php3 .php4 .php5

; Pass environment variables like LD_LIBRARY_PATH. All $VARIABLEs are taken from
; the current environment.
; Default Value: clean env
;env[HOSTNAME] = $HOSTNAME
;env[PATH] = /usr/local/bin:/usr/bin:/bin
;env[TMP] = /tmp
;env[TMPDIR] = /tmp
;env[TEMP] = /tmp

; Additional php.ini defines, specific to this pool of workers. These settings
; overwrite the values previously defined in the php.ini. The directives are the
; same as the PHP SAPI:
;   php_value/php_flag             - you can set classic ini defines which can
;                                    be overwritten from PHP call 'ini_set'.
;   php_admin_value/php_admin_flag - these directives won't be overwritten by
;                                     PHP call 'ini_set'
; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no.

; Defining 'extension' will load the corresponding shared extension from
; extension_dir. Defining 'disable_functions' or 'disable_classes' will not
; overwrite previously defined php.ini values, but will append the new value
; instead.

; Default Value: nothing is defined by default except the values in php.ini and
;                specified at startup with the -d argument
;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f [email protected]
php_flag[display_errors] = off
php_admin_value[error_log] = /var/log/php-fpm/www-error.log
php_admin_flag[log_errors] = on
php_admin_value[memory_limit] = 256M

; Set session path to a directory owned by process user
;php_value[session.save_handler] = files
;php_value[session.save_path] = /var/lib/php/session
4
sudoyum

You are using the least_conn load-balancing strategy in nginx for the PHP-FPM upstream. This means that a single user at a single IP address may be served by different PHP-FPM processes.

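If those PHP-FPM processes do not share all the state that is needed across requests from the same user, strange things can happen. For example, if user session state is local to a PHP-FPM node, a logged-in user will appear logged out as soon as a request lands on the other server.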

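To avoid this, replace least_conn with ip_hash. That way, all connections from one IP address are sent to the same PHP-FPM node. In theory this makes the load distribution slightly less even, but in practice the difference is negligible.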
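As a minimal sketch of that change, assuming the rest of the upstream block from the question stays as posted, only the balancing directive is swapped:

```nginx
upstream php_fpm {
    # ip_hash replaces least_conn: requests are mapped to a backend
    # based on the client IP address instead of the active connection count.
    ip_hash;
    server 127.0.0.1:9000 max_fails=3 fail_timeout=15s;
    keepalive 8;
}
```

Note that the posted upstream lists only one server entry, so with this exact configuration the choice of balancing method does not change where requests are routed. The difference only shows once a second server line is added.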

That said, this may not be the cause of the problem you are seeing.

2
Tero Kilkanen

I think removing keepalive exposed the underlying problem. It looks like a combination of the timeouts you have configured and the responsiveness of your backend under load.

Specifically, I think this is your problem:

upstream php_fpm {
    least_conn;
    server 127.0.0.1:9000 max_fails=3 fail_timeout=15s;
    keepalive 8;
}

I would try the following:

upstream php_fpm {
    least_conn;
    server 127.0.0.1:9000 max_fails=3 fail_timeout=60s;
    keepalive 8;
}

PHP-FPM is configured to terminate processing after 60 seconds, but I think nginx considers the request to have failed after 15 seconds.

From https://nginx.org/en/docs/http/ngx_http_upstream_module.html#server:

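fail_timeout=time sets the time during which the specified number of unsuccessful attempts to communicate with the server should happen to consider the server unavailable; and the period of time the server will be considered unavailable. By default, the parameter is set to 10 seconds.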
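To make the interaction of the two parameters concrete, here is the suggested upstream block again with the documented meaning spelled out in comments. This is only an annotated sketch of the values proposed above, not a verified fix:

```nginx
upstream php_fpm {
    least_conn;
    # max_fails=3 fail_timeout=60s, read together:
    #   - if 3 attempts to communicate with this server fail
    #     within a 60-second window, the server is marked unavailable;
    #   - it then stays marked unavailable for the next 60 seconds.
    # What counts as a failed attempt for FastCGI backends is
    # controlled by the fastcgi_next_upstream directive.
    server 127.0.0.1:9000 max_fails=3 fail_timeout=60s;
    keepalive 8;
}
```

One caveat from the same documentation page: when an upstream group contains only a single server, max_fails and fail_timeout are ignored and that server is never considered unavailable. So with the single-entry upstream posted in the question, changing this value may have no visible effect, and the other timeouts in the chain are worth checking as well.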

You may also want to check what your peak load looks like and consider scaling your backend to absorb it.
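One way to observe peak load is the FPM status page that the pool file in the question already documents but leaves commented out. The sketch below assumes that pm.status_path = /status has been uncommented in www0.conf, and that this location is placed inside one of your server blocks (those are not shown in the question, so the placement and the access rules here are hypothetical):

```nginx
# Inside a server { ... } block of your choice (assumption).
location = /status {
    # Restrict the status page to local requests only.
    allow 127.0.0.1;
    deny all;

    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;

    # Reuses the upstream name from the posted nginx.conf.
    fastcgi_pass php_fpm;
}
```

If "active processes" on that page regularly approaches the configured pm.max_children value of 48 during the problematic periods, that would suggest the pool is saturated and requests are queuing until they time out.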

1