Split a string into an array in Bash

Question

In a Bash script, I would like to split a line into pieces and store them in an array.

The line:

Paris, France, Europe

I would like to have it in an array like this:

array[0] = Paris
array[1] = France
array[2] = Europe

I would like to use simple code; the command's speed doesn't matter. How can I do it?

Dennis Williamson · Accepted Answer

IFS=', ' read -r -a array <<< "$string"

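Note that the characters in $IFS are treated individually as separators, so in this case fields may be separated by either a comma or a space rather than by the sequence of the two characters. Interestingly though, empty fields are not created when comma-space appears in the input, because the space is treated specially.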
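A minimal sketch of that behavior, using made-up input strings (the variable names are illustrative, not from the answer):

```shell
#!/usr/bin/env bash
# Each character in IFS separates fields on its own, so a lone comma and a
# lone space both split; "comma + space" still counts as a single separator.
string='Paris, France,Europe Asia'
IFS=', ' read -r -a array <<< "$string"
declare -p array        # four elements: Paris France Europe Asia

# Two adjacent commas (a non-whitespace IFS character) do yield an empty field.
IFS=', ' read -r -a with_empty <<< 'a,,b'
declare -p with_empty   # three elements; the middle one is empty
```

Because the assignment is a prefix to read, IFS is changed only for that one command.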

To access an individual element:

echo "${array[0]}"

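To iterate over the elements:

for element in "${array[@]}"
do
    echo "$element"
done

To get both the index and the value:

for index in "${!array[@]}"
do
    echo "$index ${array[index]}"
done

The last example is useful because Bash arrays are sparse. In other words, you can delete an element or add an element, and then the indices are no longer contiguous.

unset "array[1]"
array[42]=Earth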
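As a small illustration of what "sparse" means here (a sketch that assumes the three-element array from the question):

```shell
#!/usr/bin/env bash
array=(Paris France Europe)

unset "array[1]"        # remove the middle element; index 1 becomes a hole
array[42]=Earth         # assign far past the current end

echo "indices: ${!array[*]}"    # indices: 0 2 42
echo "count:   ${#array[@]}"    # count:   3
```

The remaining elements keep their original indices; nothing is renumbered.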

To get the number of elements in an array:

echo "${#array[@]}"

As mentioned above, arrays can be sparse, so you shouldn't use the length to get the last element. Here's how you can do it in Bash 4.2 and later:

echo "${array[-1]}"

In any version of Bash (from somewhere after 2.05b):

echo "${array[@]: -1:1}"

Larger negative offsets select elements farther from the end of the array. Note the space before the minus sign in the older form. It is required.
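A short sketch of why the length is the wrong tool on a sparse array; the variable names count, by_length and last are made up for this example:

```shell
#!/usr/bin/env bash
array=(Paris France Europe)
array[42]=Earth                     # indices are now 0 1 2 42

count=${#array[@]}                  # 4 elements
by_length="${array[count-1]}"       # index 3 was never set, so this is empty
last="${array[@]: -1:1}"            # slice counted back from the highest index

echo "by length: '$by_length'"
echo "last:      '$last'"
```

Only the slice form is used here so that the sketch does not depend on Bash 4.2.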

Jim Ho · Answer

Here is a way without setting IFS:

string="1:2:3:4:5"
set -f                      # avoid globbing (expansion of *).
array=(${string//:/ })
for i in "${!array[@]}"
do
    echo "$i=>${array[i]}"
done

The idea is to use string replacement:

${string//substring/replacement}

This replaces all matches of $substring with white space; the substituted string is then used to initialize an array:

(element1 element2 ... elementN)

Note: this answer makes use of the split+glob operator. Thus, to prevent expansion of some characters (such as *), it is a good idea to pause globbing for this script.
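A sketch of pausing globbing only around the split, assuming pathname expansion was enabled beforehand (the input string is made up):

```shell
#!/usr/bin/env bash
string="a:*:b"

set -f                      # turn pathname expansion off
array=(${string//:/ })      # unquoted on purpose: word splitting does the work
set +f                      # turn it back on for the rest of the script

declare -p array            # the * survives as a literal element
```

Without set -f, the * would be replaced by the names of the files in the current directory, if there are any.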

Jmoney38 · Answer

t="one,two,three"
a=($(echo "$t" | tr ',' '\n'))
echo "${a[2]}"

Prints three
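Since the unquoted command substitution is split on any whitespace, a field that itself contains a space would be broken up. One possible variant, assuming Bash 4.0 or later for readarray and using a made-up sample string, reads the newline-separated output line by line instead:

```shell
#!/usr/bin/env bash
t="one,two words,three"

# tr turns each comma into a newline; readarray -t stores one line per element.
readarray -t a < <(tr ',' '\n' <<< "$t")

declare -p a
echo "${a[1]}"      # two words
```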

Luca Borrione · Answer

It sometimes happened to me that the method described in the accepted answer didn't work, especially when the separator is a line break.
In those cases I solved it this way:

string='first line
second line
third line'

oldIFS="$IFS"
IFS='
'
IFS=${IFS:0:1} # this is useful to format your code with tabs
lines=( $string )
IFS="$oldIFS"

for line in "${lines[@]}"
do
    echo "--> $line"
done

user2350426 · Answer

The accepted answer works for values on one line.
If the variable has several lines:

string='first line
        second line
        third line'

we need a very different command to get all the lines:

while read -r line; do lines+=("$line"); done <<<"$string"

Or the much simpler Bash readarray:

readarray -t lines <<<"$string"

Printing all the lines is very easy by taking advantage of a printf feature (the format is reused for each argument):

printf ">[%s]\n" "${lines[@]}"

>[first line]
>[        second line]
>[        third line]

ssanch · Answer

This is similar to the approach by Jmoney38, but using sed:

string="1,2,3,4"
array=(`echo $string | sed 's/,/\n/g'`)
echo ${array[0]}

Prints 1

dawg · Answer

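The key to splitting your string into an array is the multi-character delimiter ", ". Any solution that uses IFS for multi-character delimiters is inherently wrong, because IFS is a set of characters, not a string.

If you assign IFS=", " then the string will break on EITHER "," OR " ", or on any combination of them, which is not an accurate representation of the two-character delimiter ", ".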
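To make the difference concrete, here is a sketch with a made-up input whose fields contain spaces:

```shell
#!/usr/bin/env bash
string='New York, United States, North America'

IFS=', ' read -r -a array <<< "$string"

echo "${#array[@]}"     # 6, not the 3 fields a ", " delimiter would give
declare -p array
```

Every space inside a field is treated as a separator too, so "New York" ends up as two elements.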

You can use awk or sed to split the string, with process substitution:

#!/bin/bash

str="Paris, France, Europe"
array=()
while read -r -d $'\0' each; do   # use a NUL terminated field separator
    array+=("$each")
done < <(printf "%s" "$str" | awk '{ gsub(/,[ ]+|$/,"\0"); print }')
declare -p array
# declare -a array=([0]="Paris" [1]="France" [2]="Europe") output

It is more efficient to use a regex directly in Bash:

#!/bin/bash

str="Paris, France, Europe"

array=()
while [[ $str =~ ([^,]+)(,[ ]+|$) ]]; do
    array+=("${BASH_REMATCH[1]}")   # capture the field
    i=${#BASH_REMATCH}              # length of field + delimiter
    str=${str:i}                    # advance the string by that length
done                                # the loop deletes $str, so make a copy if needed

declare -p array
# declare -a array=([0]="Paris" [1]="France" [2]="Europe") output...

With the second form there is no subshell, so it will be inherently faster.
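Since the regex loop consumes the string it works on, one way to keep the original intact is to wrap the loop in a function that operates on a local copy. This is only a sketch: split_regex and the global array result are hypothetical names, not part of the answer.

```shell
#!/usr/bin/env bash
# split_regex is a hypothetical helper; it fills the global array "result".
split_regex() {
    local s=$1                       # work on a copy; the caller's string is untouched
    result=()
    while [[ $s =~ ([^,]+)(,[ ]+|$) ]]; do
        result+=("${BASH_REMATCH[1]}")   # the captured field
        s=${s:${#BASH_REMATCH}}          # drop the field and its delimiter
    done
}

str="Paris, France, Europe"
split_regex "$str"
declare -p result
echo "$str"                          # still the full original string
```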

Edit by bgoldst: Here are some benchmarks comparing my readarray solution to dawg's regex solution, and I also included the read solution (note: I slightly modified the regex solution for greater harmony with my solution). Also see my comments below the post.

## competitors
function c_readarray { readarray -td '' a < <(awk '{ gsub(/, /,"\0"); print; };' <<<"$1, "); unset 'a[-1]'; };
function c_read { a=(); local REPLY=''; while read -r -d ''; do a+=("$REPLY"); done < <(awk '{ gsub(/, /,"\0"); print; };' <<<"$1, "); };
function c_regex { a=(); local s="$1, "; while [[ $s =~ ([^,]+),\  ]]; do a+=("${BASH_REMATCH[1]}"); s=${s:${#BASH_REMATCH}}; done; };

## helper functions
function rep {
    local -i i=-1;
    for ((i = 0; i<$1; ++i)); do
        printf %s "$2";
    done;
}; ## end rep()

function testAll {
    local funcs=();
    local args=();
    local func='';
    local -i rc=-1;
    while [[ "$1" != ':' ]]; do
        func="$1";
        if [[ ! "$func" =~ ^[_a-zA-Z][_a-zA-Z0-9]*$ ]]; then
            echo "bad function name: $func" >&2;
            return 2;
        fi;
        funcs+=("$func");
        shift;
    done;
    shift;
    args=("$@");
    for func in "${funcs[@]}"; do
        echo -n "$func ";
        { time $func "${args[@]}" >/dev/null 2>&1; } 2>&1| tr '\n' '/';
        rc=${PIPESTATUS[0]}; if [[ $rc -ne 0 ]]; then echo "[$rc]"; else echo; fi;
    done| column -ts/;
}; ## end testAll()

function makeStringToSplit {
    local -i n=$1; ## number of fields
    if [[ $n -lt 0 ]]; then echo "bad field count: $n" >&2; return 2; fi;
    if [[ $n -eq 0 ]]; then
        echo;
    elif [[ $n -eq 1 ]]; then
        echo 'first field';
    elif [[ "$n" -eq 2 ]]; then
        echo 'first field, last field';
    else
        echo "first field, $(rep $[$1-2] 'mid field, ')last field";
    fi;
}; ## end makeStringToSplit()

function testAll_splitIntoArray {
    local -i n=$1; ## number of fields in input string
    local s='';
    echo "===== $n field$(if [[ $n -ne 1 ]]; then echo 's'; fi;) =====";
    s="$(makeStringToSplit "$n")";
    testAll c_readarray c_read c_regex : "$s";
}; ## end testAll_splitIntoArray()

## results
testAll_splitIntoArray 1;
## ===== 1 field =====
## c_readarray   real  0m0.067s   user 0m0.000s   sys  0m0.000s
## c_read        real  0m0.064s   user 0m0.000s   sys  0m0.000s
## c_regex       real  0m0.000s   user 0m0.000s   sys  0m0.000s
##
testAll_splitIntoArray 10;
## ===== 10 fields =====
## c_readarray   real  0m0.067s   user 0m0.000s   sys  0m0.000s
## c_read        real  0m0.064s   user 0m0.000s   sys  0m0.000s
## c_regex       real  0m0.001s   user 0m0.000s   sys  0m0.000s
##
testAll_splitIntoArray 100;
## ===== 100 fields =====
## c_readarray   real  0m0.069s   user 0m0.000s   sys  0m0.062s
## c_read        real  0m0.065s   user 0m0.000s   sys  0m0.046s
## c_regex       real  0m0.005s   user 0m0.000s   sys  0m0.000s
##
testAll_splitIntoArray 1000;
## ===== 1000 fields =====
## c_readarray   real  0m0.084s   user 0m0.031s   sys  0m0.077s
## c_read        real  0m0.092s   user 0m0.031s   sys  0m0.046s
## c_regex       real  0m0.125s   user 0m0.125s   sys  0m0.000s
##
testAll_splitIntoArray 10000;
## ===== 10000 fields =====
## c_readarray   real  0m0.209s   user 0m0.093s   sys  0m0.108s
## c_read        real  0m0.333s   user 0m0.234s   sys  0m0.109s
## c_regex       real  0m9.095s   user 0m9.078s   sys  0m0.000s
##
testAll_splitIntoArray 100000;
## ===== 100000 fields =====
## c_readarray   real  0m1.460s   user 0m0.326s   sys  0m1.124s
## c_read        real  0m2.780s   user 0m1.686s   sys  0m1.092s
## c_regex       real  17m38.208s   user 15m16.359s   sys  2m19.375s
##

MrPotatoHead · Answer

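A pure Bash solution for multi-character delimiters.

As others have pointed out in this thread, the OP's question gave an example of a comma-delimited string to be parsed into an array, but did not indicate whether he/she was interested only in comma delimiters, in single-character delimiters, or in multi-character delimiters.

Since Google tends to rank this answer at or near the top of search results, I wanted to provide readers with a strong answer to the question of multi-character delimiters.

If you are looking for a solution to a multi-character delimiter problem, I suggest reviewing Mallikarjun M's post, in particular the response from gniourf_gniourf, who provides this elegant pure Bash solution using parameter expansion:

#!/bin/bash
str="LearnABCtoABCSplitABCaABCString"
delimiter=ABC
s=$str$delimiter
array=();
while [[ $s ]]; do
    array+=( "${s%%"$delimiter"*}" );
    s=${s#*"$delimiter"};
done;
declare -p array

Link to cited comment/referenced post

Link to cited question: How to split a string on a multi-character delimiter in bash?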
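The same parameter-expansion loop can be wrapped in a reusable function. The sketch below is an illustration only: split_on and the global array result are hypothetical names that do not appear in the cited post, and the guard against an empty delimiter is an addition (without it the loop would never terminate).

```shell
#!/usr/bin/env bash
# split_on STRING DELIMITER  (hypothetical helper; fills the global array "result")
split_on() {
    local delimiter=$2
    [[ -n $delimiter ]] || return 1      # an empty delimiter would loop forever
    local s=$1$delimiter                 # append one delimiter so the last field is closed
    result=()
    while [[ $s ]]; do
        result+=("${s%%"$delimiter"*}")  # text before the first delimiter
        s=${s#*"$delimiter"}             # drop that field and its delimiter
    done
}

split_on 'New York, United States, North America' ', '
declare -p result
```

Because the delimiter is matched as a whole string, spaces inside a field are left alone.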

Geoff Lee · Answer

Try this:

IFS=', '; array=(Paris, France, Europe)
for item in ${array[@]}; do echo $item; done

It's simple. If you want, you can also add a declare (and remove the commas):

IFS=' ';declare -a array=(Paris France Europe)

The IFS assignment is added to undo the one above, but it works without it in a fresh Bash instance.

To Kra · Answer

This works for me on OSX:

string="1 2 3 4 5"
declare -a array=($string)

If your string has a different delimiter, first replace it with a space:

string="1,2,3,4,5"
delimiter=","
declare -a array=($(echo $string | tr "$delimiter" " "))

Simple :-)

user1009908 · Answer

UPDATE: Don't do this, due to problems with eval.

With slightly less ceremony:

IFS=', ' eval 'array=($string)'

For example:

string="foo, bar,baz"
IFS=', ' eval 'array=($string)'
echo ${array[1]} # -> bar

balaganAtomi · Answer

I came across this post.

None of the above helped me. I solved it using awk. In case it helps someone:

STRING="value1,value2,value3"
array=`echo $STRING | awk -F ',' '{ s = $1; for (i = 2; i <= NF; i++) s = s "\n"$i; print s; }'`
for word in ${array}
do
    echo "This is the word $word"
done

sel-en-ium · Answer

Another way to do it without modifying IFS:

read -r -a myarray <<< "${string//, /$IFS}"

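Rather than changing IFS to match our desired delimiter, we can replace all occurrences of our desired delimiter ", " with the contents of $IFS via "${string//, /$IFS}".

Maybe this will be slow for very large strings, though?

This is based on Dennis Williamson's answer.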
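One thing to watch, assuming IFS still has its default value (space, tab, newline): the replacement text then contains a newline, and read consumes only one line. The sketch below shows that effect and a possible variant that substitutes only the first IFS character; the names truncated and sep are made up for this example.

```shell
#!/usr/bin/env bash
string='Paris, France, Europe'

# With the default IFS the here-string now spans several lines,
# so read only sees the text before the first newline.
read -r -a truncated <<< "${string//, /$IFS}"
declare -p truncated

# Variant: substitute only the first IFS character (a space by default).
sep=${IFS:0:1}
read -r -a myarray <<< "${string//, /$sep}"
declare -p myarray
```

As with the accepted answer, fields that contain spaces are still split, because a space is itself an IFS character.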

Eduardo Lucio · Answer

Here's my hack!

Splitting strings by strings is a pretty tedious thing to do in Bash. Either we have limited approaches that only work in a few cases (splitting by ";", "/", "." and so on), or we get a variety of side effects in the output.

The approach below required a number of maneuvers, but I believe it will work for most of our needs!

#!/bin/bash

# --------------------------------------
# SPLIT FUNCTION
# ----------------

F_SPLIT_R=()
f_split() {
    : 'It does a "split" into a given string and returns an array.

    Args:
        TARGET_P (str): Target string to "split".
        DELIMITER_P (Optional[str]): Delimiter used to "split". If not
    informed the split will be done by spaces.

    Returns:
        F_SPLIT_R (array): Array with the provided string separated by the
    informed delimiter.
    '

    F_SPLIT_R=()
    TARGET_P=$1
    DELIMITER_P=$2
    if [ -z "$DELIMITER_P" ] ; then
        DELIMITER_P=" "
    fi

    REMOVE_N=1
    if [ "$DELIMITER_P" == "\n" ] ; then
        REMOVE_N=0
    fi

    # NOTE: This was the only parameter that has been a problem so far!
    # By Questor
    # [Ref.: https://unix.stackexchange.com/a/390732/61742]
    if [ "$DELIMITER_P" == "./" ] ; then
        DELIMITER_P="[.]/"
    fi

    if [ ${REMOVE_N} -eq 1 ] ; then

        # NOTE: Due to bash limitations we have some problems getting the
        # output of a split by awk inside an array and so we need to use
        # "line break" (\n) to succeed. Seen this, we remove the line breaks
        # momentarily afterwards we reintegrate them. The problem is that if
        # there is a line break in the "string" informed, this line break will
        # be lost, that is, it is erroneously removed in the output!
        # By Questor
        TARGET_P=$(awk 'BEGIN {RS="dn"} {gsub("\n", "3F2C417D448C46918289218B7337FCAF"); printf $0}' <<< "${TARGET_P}")

    fi

    # NOTE: The replace of "\n" by "3F2C417D448C46918289218B7337FCAF" results
    # in more occurrences of "3F2C417D448C46918289218B7337FCAF" than the
    # amount of "\n" that there was originally in the string (one more
    # occurrence at the end of the string)! We can not explain the reason for
    # this side effect. The line below corrects this problem! By Questor
    TARGET_P=${TARGET_P%????????????????????????????????}

    SPLIT_NOW=$(awk -F"$DELIMITER_P" '{for(i=1; i<=NF; i++){printf "%s\n", $i}}' <<< "${TARGET_P}")

    while IFS= read -r LINE_NOW ; do
        if [ ${REMOVE_N} -eq 1 ] ; then

            # NOTE: We use "'" to prevent blank lines with no other characters
            # in the sequence being erroneously removed! We do not know the
            # reason for this side effect! By Questor
            LN_NOW_WITH_N=$(awk 'BEGIN {RS="dn"} {gsub("3F2C417D448C46918289218B7337FCAF", "\n"); printf $0}' <<< "'${LINE_NOW}'")

            # NOTE: We use the commands below to revert the intervention made
            # immediately above! By Questor
            LN_NOW_WITH_N=${LN_NOW_WITH_N%?}
            LN_NOW_WITH_N=${LN_NOW_WITH_N#?}

            F_SPLIT_R+=("$LN_NOW_WITH_N")
        else
            F_SPLIT_R+=("$LINE_NOW")
        fi
    done <<< "$SPLIT_NOW"
}

# --------------------------------------
# HOW TO USE
# ----------------

STRING_TO_SPLIT="
 * How do I list all databases and tables using psql?

\"
sudo -u postgres /usr/pgsql-9.4/bin/psql -c \"\l\"
sudo -u postgres /usr/pgsql-9.4/bin/psql <DB_NAME> -c \"\dt\"
\"

\"
\list or \l: list all databases
\dt: list all tables in the current database
\"

[Ref.: https://dba.stackexchange.com/questions/1285/how-do-i-list-all-databases-and-tables-using-psql]

"

f_split "$STRING_TO_SPLIT" "bin/psql -c"

# --------------------------------------
# OUTPUT AND TEST
# ----------------

ARR_LENGTH=${#F_SPLIT_R[*]}
for (( i=0; i<=$(( $ARR_LENGTH -1 )); i++ )) ; do
    echo " > -----------------------------------------"
    echo "${F_SPLIT_R[$i]}"
    echo " < -----------------------------------------"
done

if [ "$STRING_TO_SPLIT" == "${F_SPLIT_R[0]}bin/psql -c${F_SPLIT_R[1]}" ] ; then
    echo " > -----------------------------------------"
    echo "The strings are the same!"
    echo " < -----------------------------------------"
fi

Eduardo Cuomo · Answer

Use this:

countries='Paris, France, Europe'
OIFS="$IFS"
IFS=', ' array=($countries)
IFS="$OIFS"

#${array[0]} == Paris
#${array[1]} == France
#${array[2]} == Europe
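An alternative to saving and restoring IFS by hand is to confine the change to a function with local. This is a sketch under stated assumptions: split_fields and the global array result are hypothetical names, and globbing is assumed to be enabled before the call.

```shell
#!/usr/bin/env bash
# split_fields is a hypothetical helper; it fills the global array "result".
split_fields() {
    local IFS=', '      # visible only inside this function
    set -f              # keep * and ? in the input from globbing
    result=($1)         # unquoted on purpose: split on the local IFS
    set +f              # assumes globbing was enabled before the call
}

countries='Paris, France, Europe'
split_fields "$countries"
declare -p result
printf 'IFS after the call: %q\n' "$IFS"
```

When the function returns, IFS has its previous value again, so nothing needs to be restored by the caller.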