Match URL path without file extension

Question

What would be the best regular expression for this scenario?

Given this URL:

http://php.net/manual/en/function.preg-match.php

How can I select everything between http://php.net and .php (exclusive):

/manual/en/function.preg-match

This is for an Nginx configuration file.

FailedDev · Accepted Answer

Like this:

if (preg_match('/(?<=net).*(?=\.php)/', $subject, $regs)) {
    $result = $regs[0];
}

Explanation:

(?<=      # Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
   net    # Match the characters "net" literally
)
.         # Match any single character that is not a line break character
   *      # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
(?=       # Assert that the regex below can be matched, starting at this position (positive lookahead)
   \.     # Match the character "." literally
   php    # Match the characters "php" literally
)

user212218 · Answer

A regular expression might not be the most effective tool for this job.

Try using parse_url() combined with pathinfo():

$url = 'http://php.net/manual/en/function.preg-match.php';
$path = parse_url($url, PHP_URL_PATH);
$pathinfo = pathinfo($path);
echo $pathinfo['dirname'], '/', $pathinfo['filename'];

The above code outputs:

/manual/en/function.preg-match
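The two calls above can be wrapped in a small helper. What follows is only a minimal sketch, not part of the original answer: the function name pathWithoutExtension is made up for illustration, and the handling of root-level files and of URLs without a path is an assumption about what a caller would want.

```php
<?php
// Hypothetical helper (the name is not from the answer): returns the URL
// path without its file extension, or null when the URL has no usable path.
function pathWithoutExtension(string $url): ?string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return null; // malformed URL or no path component
    }

    $info = pathinfo($path);
    // For a file directly under the root, dirname is "/" (or "\" on Windows),
    // so trim the separators to avoid producing "//index".
    $dir = rtrim($info['dirname'] ?? '', '/\\');

    return $dir . '/' . $info['filename'];
}

echo pathWithoutExtension('http://php.net/manual/en/function.preg-match.php'), "\n";
```

Because parse_url() separates the query string from the path, a dot inside a query parameter does not affect the result.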

morja · Answer

Try this:

preg_match("/net(.*)\.php$/", "http://php.net/manual/en/function.preg-match.php", $matches);
echo $matches[1]; // prints /manual/en/function.preg-match

Crayon Violent · Answer

You don't need a regular expression to dissect a URL. PHP has built-in functions for this: pathinfo() and parse_url().

Ja͢ck · Answer

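Just for fun, here are two ways that haven't been explored yet:

substr($s, strpos($s, '/', 8), -4)

Or:

substr($s, strpos($s, '/', 8), -strlen($s) + strrpos($s, '.'))

This is based on the idea that the HTTP schemes http:// and https:// are at most 8 characters long, so it is typically enough to find the first slash from the 9th position onwards. If the extension is always .php, the first snippet works; otherwise the second one is required.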
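To make the offsets concrete, here is a short sketch that runs both variants on the URL from the question. It assumes, as the answer does, that the URL starts with http:// or https:// and that the last path segment contains a dot; the variable names $start, $fixed, and $anyExt are chosen for this illustration only.

```php
<?php
$s = 'http://php.net/manual/en/function.preg-match.php';

// Position of the first slash after the scheme; starting the search at
// offset 8 skips the slashes in "http://" and "https://".
$start = strpos($s, '/', 8);

// Variant 1: the extension is known to be ".php" (4 characters).
$fixed = substr($s, $start, -4);

// Variant 2: cut at the last dot instead, so any extension length works.
// strrpos() - strlen() is negative, meaning "stop that many characters
// before the end of the string".
$anyExt = substr($s, $start, strrpos($s, '.') - strlen($s));

echo $fixed, "\n", $anyExt, "\n";
```

Neither variant checks for failure: if there is no slash after the host, strpos() returns false, so real code would need a guard that this sketch leaves out.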

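For a pure regular expression solution, you can break the string down like this:

~^(?:[^:/?#]+:)?(?://[^/?#]*)?([^?#]*)~
                              ^

The path part ends up in the first memory group (i.e. index 1), indicated by the ^ on the line underneath the expression. To remove the extension, use pathinfo():

$parts = pathinfo($matches[1]);
echo $parts['dirname'] . '/' . $parts['filename'];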
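The snippet above uses $matches without showing the call that fills it. A minimal end-to-end sketch, assuming the URL from the question and the expression exactly as given, could look like this ($pattern and $result are names introduced here):

```php
<?php
$url = 'http://php.net/manual/en/function.preg-match.php';

// Optional scheme, optional authority, then the path up to "?" or "#".
$pattern = '~^(?:[^:/?#]+:)?(?://[^/?#]*)?([^?#]*)~';

$result = null;
if (preg_match($pattern, $url, $matches)) {
    // $matches[1] holds the path part of the URL.
    $parts = pathinfo($matches[1]);
    $result = $parts['dirname'] . '/' . $parts['filename'];
}

echo $result, "\n";
```

Using ~ as the delimiter keeps the slashes in the pattern unescaped.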

You can also tweak the expression to this:

([^?#]*?)(?:\.[^?#]*)?(?:\?|$)

This expression is not very optimal, though, because it involves backtracking. In the end, I would go for something less custom:

$parts = pathinfo(parse_url($url, PHP_URL_PATH));
echo $parts['dirname'] . '/' . $parts['filename'];

user1626664 · Answer

Simple:

$url = "http://php.net/manual/en/function.preg-match.php";
preg_match("/http:\/\/php\.net(.+)\.php/", $url, $matches);
echo $matches[1];

$matches[0] is the full URL; $matches[1] is the part you want.

See for yourself: http://codepad.viper-7.com/hHmwI2

nickl- · Answer

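|(?<=\w)/.+(?=\.\w+$)|

Select everything from the first literal '/' that is preceded by
a look-behind for a word (\w) character,
until it is followed by a look-ahead for
- a literal '.' followed by
- one or more word (\w) characters
- up to the end $

  re> |(?<=\w)/.+(?=\.\w+$)|
Compile time 0.0011 milliseconds
Memory allocation (code space): 32
Study time 0.0002 milliseconds
Capturing subpattern count = 0
No options
First char = '/'
No need char
Max lookbehind = 1
Subject length lower bound = 2
No set of starting bytes
data> http://php.net/manual/en/function.preg-match.php
Execute time 0.0007 milliseconds
 0: /manual/en/function.preg-match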

|//[^/]*(.*)\.\w+$|

Find two literal '//' followed by anything but a literal '/',
select everything until
a literal '.' that is followed only by word (\w) characters before the end $

  re> |//[^/]*(.*)\.\w+$|
Compile time 0.0010 milliseconds
Memory allocation (code space): 28
Study time 0.0002 milliseconds
Capturing subpattern count = 1
No options
First char = '/'
Need char = '.'
Subject length lower bound = 4
No set of starting bytes
data> http://php.net/manual/en/function.preg-match.php
Execute time 0.0005 milliseconds
 0: //php.net/manual/en/function.preg-match.php
 1: /manual/en/function.preg-match

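|/[^/]+(.*)\.|

Find a literal '/' followed by one or more characters that are not a literal '/',
then greedily select everything before the last literal '.'

  re> |/[^/]+(.*)\.|
Compile time 0.0008 milliseconds
Memory allocation (code space): 23
Study time 0.0002 milliseconds
Capturing subpattern count = 1
No options
First char = '/'
Need char = '.'
Subject length lower bound = 3
No set of starting bytes
data> http://php.net/manual/en/function.preg-match.php
Execute time 0.0005 milliseconds
 0: /php.net/manual/en/function.preg-match.
 1: /manual/en/function.preg-match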

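|/[^/]+\K.*(?=\.)|

Find a literal '/' followed by one or more characters that are not a literal '/',
reset the start of the selection with \K,
then greedily select everything before
a look-ahead for the last literal '.'

  re> |/[^/]+\K.*(?=\.)|
Compile time 0.0009 milliseconds
Memory allocation (code space): 22
Study time 0.0002 milliseconds
Capturing subpattern count = 0
No options
First char = '/'
No need char
Subject length lower bound = 2
No set of starting bytes
data> http://php.net/manual/en/function.preg-match.php
Execute time 0.0005 milliseconds
 0: /manual/en/function.preg-match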

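|\w+\K/.*(?=\.)|

Find one or more word (\w) characters before a literal '/',
reset the start of the selection with \K,
then match the literal '/'
and anything before
a look-ahead for the last literal '.'

  re> |\w+\K/.*(?=\.)|
Compile time 0.0009 milliseconds
Memory allocation (code space): 22
Study time 0.0003 milliseconds
Capturing subpattern count = 0
No options
No first char
Need char = '/'
Subject length lower bound = 2
Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
data> http://php.net/manual/en/function.preg-match.php
Execute time 0.0011 milliseconds
 0: /manual/en/function.preg-match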
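The pcretest runs above can be reproduced from PHP with preg_match(). The following is a rough sketch rather than a benchmark: it only checks which text each expression selects. The array layout (pattern mapped to the index of the group that holds the path) is an arrangement made up for this example.

```php
<?php
$url = 'http://php.net/manual/en/function.preg-match.php';

// Pattern => index of the match group that holds the wanted path.
// Patterns with a capturing group use index 1; the look-around and \K
// variants return the path as the whole match (index 0).
$patterns = [
    '|(?<=\w)/.+(?=\.\w+$)|' => 0,
    '|//[^/]*(.*)\.\w+$|'    => 1,
    '|/[^/]+(.*)\.|'         => 1,
    '|/[^/]+\K.*(?=\.)|'     => 0,
    '|\w+\K/.*(?=\.)|'       => 0,
];

$results = [];
foreach ($patterns as $pattern => $group) {
    if (preg_match($pattern, $url, $m)) {
        $results[$pattern] = $m[$group];
    }
}

foreach ($results as $pattern => $path) {
    echo str_pad($pattern, 26), $path, "\n";
}
```

All five expressions select the same path for this URL; they differ only in whether the result is the whole match or a capture group.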

Homer6 · Answer

This general URL match lets you select parts of a URL:

// Slashes inside the pattern are escaped because "/" is the delimiter.
if (preg_match('/\b(?P<protocol>https?|ftp):\/\/(?P<domain>[-A-Z0-9.]+)(?P<file>\/[-A-Z0-9+&@#\/%=~_|!:,.;]*)?(?P<parameters>\?[-A-Z0-9+&@#\/%=~_|!:,.;]*)?/i', $subject, $regs)) {
    $result = $regs['file']; // or you can append $regs['parameters'] too
} else {
    $result = "";
}

Firas Dib · Answer

If you ask me, there is a better regex solution than those provided so far: http://regex101.com/r/nQ8rH5

/http:\/\/[^\/]+\K.*(?=\.[^.]+$)/i
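As a quick check, the expression can be tried from PHP. This is a small sketch under stated assumptions: matchPath is a hypothetical helper name used only here, and the second test URL is made up to show that the pattern, as written, covers http:// but not https://.

```php
<?php
$pattern = '/http:\/\/[^\/]+\K.*(?=\.[^.]+$)/i';

// Hypothetical helper for this illustration: returns the whole match,
// or null when the pattern does not match the URL.
function matchPath(string $pattern, string $url): ?string
{
    return preg_match($pattern, $url, $m) === 1 ? $m[0] : null;
}

echo matchPath($pattern, 'http://php.net/manual/en/function.preg-match.php'), "\n";

// The scheme is hard-coded, so an https URL yields no match.
var_dump(matchPath($pattern, 'https://php.net/manual/en/function.preg-match.php'));
```

The \K resets the start of the reported match, so the host is consumed but not returned, and the look-ahead leaves the final extension out of the result.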