Efficient way to search for a string in files using find and grep

Question

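I am searching a filer (an old HP-UX workstation) for all files that contain a specific string.

I do not know where the files are located in the file system (there are many directories, with a huge number of scripts, plain-text files and binary files).

To be precise, the grep -R option does not exist on this system, so I am using find and grep to work out which files contain my string: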

find . -type f -exec grep -i "mystring" {} \;

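I am not satisfied with this command: it is too slow, and it does not print the name and path of the files in which grep matched the string. In addition, any errors are echoed to my console output.

So I thought I could do better: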

find . -type f -exec grep -l -i "mystring" {} 2>/dev/null \;

But it is still very slow.

Is there a more efficient alternative to this command?

Thank you.

terdon · Accepted Answer

The fastest way I can think of is to use xargs to share the load, so that each grep invocation handles many files instead of just one:

find . -type f -print0 | xargs -0 grep -Fil "mypattern"

I ran some benchmarks on a directory containing 3631 files:

$ time find . -type f -exec grep -l -i "mystring" {} 2>/dev/null \;
real    0m15.012s
user    0m4.876s
sys     0m1.876s

$ time find . -type f -exec grep -Fli "mystring" {} 2>/dev/null \;
real    0m13.982s
user    0m4.328s
sys     0m1.592s

$ time find . -type f -print0 | xargs -0 grep -Fil "mystring" >/dev/null
real    0m3.565s
user    0m3.508s
sys     0m0.052s
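The gap in these timings most likely comes from process start-up: `-exec ... \;` launches one grep per file, whereas xargs passes a long list of files to each grep. Below is a minimal sketch on a throwaway directory. The file names and the temporary directory are made up for illustration, and `-print0` / `xargs -0` are extensions found in GNU and BSD tools that an older HP-UX may not provide; in that case the POSIX `-exec ... {} +` form batches files in the same way.

```shell
#!/bin/sh
# Throwaway sandbox; every file name below is hypothetical.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf 'hello MyString\n' > a.txt
printf 'nothing here\n'   > b.txt
printf 'mystring again\n' > "with space.txt"

# Batched via xargs; NUL separators keep "with space.txt" in one piece.
batched=$(find . -type f -print0 | xargs -0 grep -Fil "mystring" | sort)

# POSIX alternative that batches without needing -print0 or -0.
portable=$(find . -type f -exec grep -Fil "mystring" {} + | sort)

printf '%s\n' "$batched"
```

Both variables should hold the same list of matching files, so either form can be used depending on what the system's find and xargs support.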

The other option is to streamline the search by using find to restrict the list of files:

 -executable
     Matches files which are executable and directories which are searchable (in a file name resolution sense).
 -writable
     Matches files which are writable.
 -mtime n
     File's data was last modified n*24 hours ago. See the comments for -atime to understand how rounding affects the interpretation of file modification times.
 -group gname
     File belongs to group gname (numeric group ID allowed).
 -perm /mode
     Any of the permission bits mode are set for the file. Symbolic modes are accepted in this form. You must specify `u', `g' or `o' if you use a symbolic mode.
 -size n[cwbkMG]    <-- you can set a minimum or maximum size
     File uses n units of space.
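As a rough illustration of restricting the file list, the sketch below only passes files smaller than 1000 bytes to grep. The threshold and the file names are arbitrary assumptions; `-size` with the `c` (bytes) suffix is used because it is accepted by both POSIX and GNU find.

```shell
#!/bin/sh
# Throwaway sandbox; file names and the size threshold are hypothetical.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf 'mystring in a small file\n' > small.txt

# A larger file that also contains the string (roughly 2.6 KB of padding).
{
  printf 'mystring in a big file\n'
  awk 'BEGIN { for (i = 0; i < 200; i++) print "padding line" }'
} > big.txt

# Only regular files under 1000 bytes are searched; big.txt is never opened.
small_only=$(find . -type f -size -1000c -exec grep -Fil "mystring" {} +)

printf '%s\n' "$small_only"
```

The same idea applies to `-mtime`, `-group` and the other tests: every file that find rules out is one that grep never has to read.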

Or by tweaking grep itself:

You are already using grep's -l option, which prints the file names and, more importantly, stops at the first match:

 -l, --files-with-matches
     Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning will stop on the first match. (-l is specified by POSIX.)

The only other thing I can think of to speed things up is to use the -F option, so that the pattern is not interpreted as a regular expression (as suggested by @suspectus).
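Besides speed, -F also changes what matches, which is worth keeping in mind when the search string contains characters such as a dot. A small sketch with two made-up files:

```shell
#!/bin/sh
# Throwaway sandbox; file names and contents are hypothetical.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf 'version 1.5 released\n' > literal.txt
printf 'version 1x5 released\n' > lookalike.txt

# As a regular expression, "." matches any character, so both files match.
regex_hits=$(grep -l "1.5" literal.txt lookalike.txt)

# With -F the dot is an ordinary character, so only literal.txt matches.
fixed_hits=$(grep -Fl "1.5" literal.txt lookalike.txt)

printf 'regex: %s\n' $regex_hits
printf 'fixed: %s\n' $fixed_hits
```

For a plain word like "mystring" the two forms match the same lines, so -F is safe to add there.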

suspectus · Answer

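Use grep -F, which tells grep to interpret the pattern as a fixed string rather than a regular expression (which I assume you do not need). Depending on the size of the files being searched, this can be considerably faster than plain grep.

On Ubuntu and RHEL Linux, the -H option displays the file path of each matching file.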

find . -type f -exec grep -FHi "mystring" {} +
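If the system's grep turns out not to support -H (an assumption worth checking on older systems), a common workaround is to pass /dev/null as an extra file argument: grep then always sees at least two files and prefixes each match with its file name. A minimal sketch with one hypothetical file:

```shell
#!/bin/sh
# Throwaway sandbox; the file name is hypothetical.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'contains MyString\n' > only.txt

# With a single file argument, grep omits the file name by default.
bare=$(find . -type f -exec grep -Fi "mystring" {} +)

# -H forces the file name prefix where grep supports it.
with_H=$(find . -type f -exec grep -FHi "mystring" {} +)

# Workaround without -H: the extra /dev/null argument makes grep see two
# files, so it prints file names on its own.
with_devnull=$(find . -type f -exec grep -Fi "mystring" /dev/null {} +)

printf '%s\n' "$bare" "$with_H" "$with_devnull"
```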
