I have a text file containing a list of URLs I want to download, and I'd like wget to wait before fetching the next one. I'm currently running:
cat urls.txt | xargs -n1 -i wget --wait=30 {}
However, there is no wait between the URLs. Can wget's --wait option be made to work here, or is there another way to do this?
That happens because a new wget process is started for each URL, so --wait never gets a chance to apply between downloads. Instead, use the -i option to feed wget the whole list of URLs in a single invocation:
$ wget -i urls.txt --wait=30
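
(If each URL really does need its own wget invocation, for example to vary options per URL, --wait cannot help, since it only pauses between retrievals inside one wget process. In that case the delay has to come from the shell. A minimal sketch, assuming urls.txt holds one URL per line:)

while read -r url; do
    wget "$url"
    sleep 30    # explicit pause between separate wget invocations, mirroring --wait=30
done < urls.txt
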
From the manual:
-i file
--input-file=file
Read URLs from a local or external file. If - is specified as file,
URLs are read from the standard input. (Use ./- to read from a file
literally named -.) If this function is used, no URLs need be present
on the command line. If there are URLs both on the command line and in
an input file, those on the command lines will be the first ones to be
retrieved. If --force-html is not specified, then file should consist
of a series of URLs, one per line.
However, if you specify --force-html, the document will be regarded as
html. In that case you may have problems with relative links, which
you can solve either by adding "<base href="url">" to the documents or
by specifying --base=url on the command line.
If the file is an external one, the document will be automatically
treated as html if the Content-Type matches text/html. Furthermore,
the file's location will be implicitly used as base href if none was
specified.
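
As the quoted text notes, passing - as the file makes wget read URLs from standard input, so if you prefer to keep the pipe, something along these lines should also work (an untested sketch of the same idea):

cat urls.txt | wget --wait=30 -i -
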