How can I join all lines that end with a backslash character?

Question

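Using a common command-line tool such as sed or awk, is it possible to join all lines that end with a given character, such as a backslash?

For example, given this file:

foo bar \
bash \
baz
dude \
happy

I would like to get this output:

foo bar bash baz
dude happy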

neurino · Accepted Answer

A shorter and simpler sed solution:

sed '
: again
/\\$/ {
    N
    s/\\\n//
    t again
}
' textfile

Or, as a one-liner if you are using GNU sed:

sed ':x; /\\$/ { N; s/\\\n//; tx }' textfile
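A quick way to try either command is to recreate the sample input from the question with printf. The sketch below is only an illustration: the file name textfile matches the answer, but the scratch directory and the function name join_lines are assumptions introduced here.

```shell
#!/bin/sh
# Hypothetical helper (not part of the answer): join backslash-continued
# lines from the given files (or stdin) with the multi-line sed script.
join_lines() {
    sed '
: again
/\\$/ {
    N
    s/\\\n//
    t again
}
' "$@"
}

# Recreate the sample input from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'foo bar \' 'bash \' 'baz' 'dude \' 'happy' > "$tmpdir/textfile"

join_lines "$tmpdir/textfile"
# prints:
# foo bar bash baz
# dude happy

rm -r "$tmpdir"
```

The loop works because N appends the next input line to the pattern space, the substitution removes the backslash-newline pair, and t branches back to the label only when that substitution succeeded.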

camh · Answer

It is possibly easiest with Perl (Perl is similar to sed and awk):

perl -p -e 's/\\\n//'

Gilles 'SO- stop being evil' · Answer

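Here is an awk solution. If a line ends with a \, strip the backslash and print the line without a terminating newline; otherwise print the line with a terminating newline.

awk '{if (sub(/\\$/,"")) printf "%s", $0; else print $0}'

It's also not too bad in sed, though awk is obviously more readable.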
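To see the two branches at work, the same awk program can be laid out one branch per line and fed the sample input from the question. This is just a sketch: the wrapper name join_awk is made up for the example.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk program above (same logic, reformatted).
join_awk() {
    awk '{
        if (sub(/\\$/, "")) {
            # A trailing backslash was found and removed:
            # print without a newline so the next line continues this one.
            printf "%s", $0
        } else {
            # Ordinary line: print it followed by a newline.
            print $0
        }
    }' "$@"
}

printf '%s\n' 'foo bar \' 'bash \' 'baz' 'dude \' 'happy' | join_awk
# prints:
# foo bar bash baz
# dude happy
```

sub() returns the number of substitutions made, so it doubles as the test for whether the line ended with a backslash.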

Peter.O · Answer

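This is not an answer as such. It is a side issue about sed.

Specifically, I needed to take Gilles' sed command apart piece by piece to understand it... I started writing some notes on it, and then thought they might be useful to someone here...

So here it is... Gilles' sed script in documented format: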

#!/bin/bash
#######################################
sed_dat="$HOME/ztest.dat"
while IFS= read -r line ;do echo "$line" ;done <<'END_DAT' >"$sed_dat"
foo bar \
bash \
baz
dude \
happy
yabba dabba
doo
END_DAT

#######################################
sedexec="$HOME/ztest.sed"
while IFS= read -r line ;do echo "$line" ;done <<'END-SED' >"$sedexec"; \
sed -nf "$sedexec" "$sed_dat"

  s/\\$//        # If a line has trailing '\', remove the '\'
                 #
  t'Hold-append' # branch: Branch conditionally to the label 'Hold-append'
                 #         The condition is that a replacement was made.
                 #         The current pattern-space had a trailing '\' which
                 #         was replaced, so branch to 'Hold-append' and append
                 #         the now-truncated line to the hold-space
                 #
                 #         This branching occurs for each (successive) such line.
                 #
                 #         PS. The 't' command may be so named because it means 'on true'
                 #             (I'm not sure about this, but the shoe fits)
                 #
                 # Note: Appending to the hold-space introduces a leading '\n'
                 #       delimiter for each appended line
                 #
                 #   eg. compare the hex dump of the following 4 example commands:
                 #       'x' swaps the hold and pattern spaces
                 #
                 #       echo -n "a" |sed -ne         'p' |xxd -p  ## 61
                 #       echo -n "a" |sed -ne     'H;x;p' |xxd -p  ## 0a61
                 #       echo -n "a" |sed -ne   'H;H;x;p' |xxd -p  ## 0a610a61
                 #       echo -n "a" |sed -ne 'H;H;H;x;p' |xxd -p  ## 0a610a610a61

   # No replacement was made above, so the current pattern-space
   #   (input line) has a "normal" ending.

   x             # Swap the pattern-space (the just-read "normal" line)
                 #   with the hold-space. The hold-space holds the accumulation
                 #   of appended "stripped-of-backslash" lines

   G             # The pattern-space now holds zero to many "stripped-of-backslash" lines
                 #   each of which has a preceding '\n'
                 # The 'G' command Gets the Hold-space and appends it to
                 #   the pattern-space. This append action introduces another
                 #   '\n' delimiter to the pattern space.

   s/\n//g       # Remove all '\n' newlines from the pattern-space

   p             # Print the pattern-space

   s/.*//        # Now we need to remove all data from the pattern-space
                 # This is done as a means to remove data from the hold-space
                 #  (there is no way to directly remove data from the hold-space)

   x             # Swap the no-data pattern space with the hold-space
                 # This leaves the hold-space re-initialized to empty...
                 # The current pattern-space will be overwritten by the next line-read

   b             # Everything is ready for the next line-read. It is time to make
                 # an unconditional branch to the end of process for this line
                 #  ie. skip any remaining logic, read the next line and start the process again.

  :'Hold-append' # The ':' (colon) indicates a label..
                 # A label is the target of the 2 branch commands, 'b' and 't'
                 # A label can be a single letter (it is often 'a')
                 # Note: 'b' can be used without a label as seen in the previous command

    H            # Append the pattern to the hold buffer
                 # The pattern is prefixed with a '\n' before it is appended

END-SED
#######
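With the comments stripped, the documented script reduces to a handful of commands. The version below is a condensed sketch, not Gilles' original command line: the label is shortened to hold (an arbitrary name), each command is passed with its own -e so nothing is assumed about how a label is terminated, and the function name join_hold is made up for this example.

```shell
#!/bin/sh
# Condensed form of the documented script (a sketch; names are illustrative).
join_hold() {
    # 1. strip a trailing backslash, and if one was stripped branch to "hold"
    # 2. otherwise swap in the held lines, append the current line,
    #    delete the embedded newlines and print the joined result
    # 3. empty the pattern space, swap it into the hold space, end the cycle
    # 4. at "hold": append the stripped line to the hold space
    sed -n \
        -e 's/\\$//' \
        -e 't hold' \
        -e 'x' \
        -e 'G' \
        -e 's/\n//g' \
        -e 'p' \
        -e 's/.*//' \
        -e 'x' \
        -e 'b' \
        -e ': hold' \
        -e 'H' \
        "$@"
}

printf '%s\n' 'foo bar \' 'bash \' 'baz' 'dude \' 'happy' | join_hold
# prints:
# foo bar bash baz
# dude happy
```

One consequence of this design: if the last line of the input ends with a backslash, it is only appended to the hold space and never printed, because printing happens only when a line without a trailing backslash arrives.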

Kusalananda · Answer

This uses the fact that read in the shell interprets backslashes when it is used without -r:

$ while IFS= read line; do printf '%s\n' "$line"; done <file
foo bar bash baz
dude happy

Note that this will also interpret any other backslashes in the data.
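That caveat is easy to see with a line that has a backslash in the middle. The following is a small sketch with made-up input (the string a\tb and the function name read_join are illustrative only); it assumes a POSIX shell, where read without -r removes the backslash and keeps the character after it.

```shell
#!/bin/sh
# Hypothetical helper: join continued lines by letting read process backslashes.
read_join() {
    while IFS= read line; do
        printf '%s\n' "$line"
    done
}

# The trailing backslash joins the two lines as intended,
# but the backslash inside 'a\tb' is consumed as well.
printf '%s\n' 'a\tb \' 'c' | read_join
# prints: atb c
```

If the data can contain backslashes anywhere other than at the end of a line, one of the sed or awk approaches is the safer choice.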

verdo · Answer

Yet another common command-line tool is ed, which by default modifies files in place and therefore leaves file permissions unchanged (for more information on ed, see Editing files with the ed text editor from scripts).

str='
foo bar \
bash 1 \
bash 2 \
bash 3 \
bash 4 \
baz
dude \
happy
xxx
vvv 1 \
vvv 2 \
CCC
'

# We are using (1,$)g/re/command-list and (.,.+1)j to join lines ending with a '\'
# ?? repeats the last regex search.
# replace ',p' with 'wq' to edit files in-place
# (using Bash and FreeBSD ed on Mac OS X)
cat <<-'EOF' | ed -s <(printf '%s' "$str")
H
,g/\\$/s///\
.,.+1j\
??s///\
.,.+1j
,p
EOF

Isaac · Answer

A simple(r) solution that loads the whole file into memory:

sed -z 's/\\\n//g' file                   # GNU sed 4.2.2+.

Or a still-short one that works on (output) lines (GNU syntax):

sed ':x;/\\$/{N;bx};s/\\\n//g' file

On one line (POSIX syntax):

sed -e :x -e '/\\$/{N;bx' -e '}' -e 's/\\\n//g' file

Or use awk (if the file is too big to fit in memory):

awk '{a=sub(/\\$/,"");printf("%s%s",$0,a?"":RS)}' file

Andy · Answer

A Mac version based on @Giles solution would look like this:

sed ':x
/\\$/{N; s|\\'$'\\n||; tx
}' textfile

The main difference is how newlines are represented; combining it any further into a single line breaks it.