Use a value from the previous row in an R data.table calculation

Question

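I want to create a new column in a data.table, calculated from the current value of one column and the previous value of another. Is it possible to access previous rows?

For example: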

    > DT <- data.table(A=1:5, B=1:5*10, C=1:5*100)
    > DT
       A  B   C
    1: 1 10 100
    2: 2 20 200
    3: 3 30 300
    4: 4 40 400
    5: 5 50 500
    > DT[, D := C + BPreviousRow] # What is the correct code here?

The correct answer should be:

    > DT
       A  B   C   D
    1: 1 10 100  NA
    2: 2 20 200 210
    3: 3 30 300 320
    4: 4 40 400 430
    5: 5 50 500 540

Arun · Accepted Answer

With shift() implemented in v1.9.6, this is quite straightforward.

    DT[ , D := C + shift(B, 1L, type="lag")]
    # or equivalently, in this case,
    DT[ , D := C + shift(B)]

From NEWS:

The new function shift() implements fast lead/lag of vectors, lists, data.frames or data.tables. It takes a type argument, which can be either "lag" (the default) or "lead". It is very convenient to use together with := or set(). For example: DT[, (cols) := shift(.SD, 1L), by=id]. Please have a look at ?shift for more info.
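As a rough illustration of the NEWS entry above, the following sketch applies shift() per group, both as a lag and as a lead, and to several columns at once through .SD. The table, the id column, and the new column names (B_prev, B_next, B_lag, C_lag) are made up for this example and are not part of the question's data.

```r
library(data.table)

# Hypothetical grouped data: two groups, "a" and "b"
DT <- data.table(id = c("a", "a", "a", "b", "b"),
                 B  = c(10, 20, 30, 40, 50),
                 C  = c(100, 200, 300, 400, 500))

# Lag within each group: the first row of every id gets NA
DT[, B_prev := shift(B, 1L, type = "lag"), by = id]

# Lead within each group: the last row of every id gets NA
DT[, B_next := shift(B, 1L, type = "lead"), by = id]

# Lag several columns at once; shift(.SD, ...) returns one vector per column
cols    <- c("B", "C")
newcols <- paste0(cols, "_lag")
DT[, (newcols) := shift(.SD, 1L), by = id, .SDcols = cols]

print(DT)
```

Because the shift happens inside by = id, values never leak from one group into the next, which a plain shift(B) on the whole column would not guarantee.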

See the edit history for previous answers.

Steven Beaupré · Answer

Using dplyr you could do:

    mutate(DT, D = lag(B) + C)

Which gives:

    #   A  B   C   D
    #1: 1 10 100  NA
    #2: 2 20 200 210
    #3: 3 30 300 320
    #4: 4 40 400 430
    #5: 5 50 500 540

dnlbrky · Answer

Several people have answered the specific question. See the code below for a general-purpose function that may be helpful in situations like this. Rather than just getting the previous row, you can go as many rows into the "past" or "future" as you like.

    rowShift <- function(x, shiftLen = 1L) {
      # Index vector moved by shiftLen; positions before the start become NA
      r <- (1L + shiftLen):(length(x) + shiftLen)
      r[r<1] <- NA
      return(x[r])
    }

    # Create column D by adding column C and the value from the previous row of column B:
    DT[, D := C + rowShift(B,-1)]

    # Get the Old Faithful eruption length from two events ago, and three events in the future:
    as.data.table(faithful)[1:5,list(eruptLengthCurrent=eruptions,
                                     eruptLengthTwoPrior=rowShift(eruptions,-2),
                                     eruptLengthThreeFuture=rowShift(eruptions,3))]
    ##    eruptLengthCurrent eruptLengthTwoPrior eruptLengthThreeFuture
    ##1:              3.600                  NA                  2.283
    ##2:              1.800                  NA                  4.533
    ##3:              3.333               3.600                     NA
    ##4:              2.283               1.800                     NA
    ##5:              4.533               3.333                     NA
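To make the index arithmetic in rowShift easier to follow, here is a small base R sketch on a plain vector. The function is repeated so the snippet runs on its own; the input 1:5 is just an illustrative choice.

```r
rowShift <- function(x, shiftLen = 1L) {
  r <- (1L + shiftLen):(length(x) + shiftLen)
  r[r < 1] <- NA
  return(x[r])
}

x <- 1:5

# shiftLen = -1: the index vector is 0:4, and the 0 is turned into NA
back_one <- rowShift(x, -1L)
back_one
# NA 1 2 3 4

# shiftLen = 2: the index vector is 3:7; positions 6 and 7 are past the end,
# and R returns NA for out-of-range positive indices
ahead_two <- rowShift(x, 2L)
ahead_two
# 3 4 5 NA NA
```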

Gary Weissman · Answer

Based on @Steve Lianoglou's comment above, why not just:

    DT[, D:= C + c(NA, B[.I - 1]) ]
    #    A  B   C   D
    # 1: 1 10 100  NA
    # 2: 2 20 200 210
    # 3: 3 30 300 320
    # 4: 4 40 400 430
    # 5: 5 50 500 540

This avoids using seq_len, head, or any other function.
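A short sketch of why the .I - 1 trick works: for a 5-row table with no by, .I is 1:5, so .I - 1 is 0:4, and R silently drops a zero index when subsetting. The snippet assumes no grouping; with by, .I holds row positions in the whole table rather than within the group, so the same expression would not line up.

```r
library(data.table)

DT <- data.table(A = 1:5, B = 1:5 * 10, C = 1:5 * 100)

# What .I - 1 evaluates to for this table
idx <- 0:4

# Index 0 selects nothing, so only the first four values of B remain
first_four <- DT$B[idx]
first_four
# 10 20 30 40

# Prepending NA restores the original length, giving the lagged column
DT[, D := C + c(NA, B[.I - 1])]
print(DT)
```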

Ryogi · Answer

Following Arun's solution, a similar result can be obtained without referring to .N:

    > DT[, D := C + c(NA, head(B, -1))][]
       A  B   C   D
    1: 1 10 100  NA
    2: 2 20 200 210
    3: 3 30 300 320
    4: 4 40 400 430
    5: 5 50 500 540

Abdullah Al Mahmud · Answer

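Here is my intuitive solution.

    #create data frame
    df <- data.frame(A=1:5, B=seq(10,50,10), C=seq(100,500, 100))
    #subtract the shift from num rows
    shift <- 1 #in this case the shift is 1
    invshift <- nrow(df) - shift
    #Now create the new column, padding the front with one NA per shifted row
    df$D <- c(rep(NA, shift), head(df$B, invshift)+tail(df$C, invshift))

Here invshift, the number of rows minus 1, is 4. nrow(df) gives the number of rows of a data frame (for a plain vector, use length() instead). Similarly, to take values from further back, subtract 2, 3, ... from nrow and put the corresponding number of NAs at the beginning.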
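As a worked example of the "further back" case described above, this base R sketch looks two rows back instead of one. The names k and D2 are chosen for this illustration only.

```r
df <- data.frame(A = 1:5, B = seq(10, 50, 10), C = seq(100, 500, 100))

k <- 2          # hypothetical: use the value of B from two rows earlier
n <- nrow(df)

# head() keeps the first n - k values of B, tail() keeps the last n - k values of C,
# so element i of the sum pairs B[i] with C[i + k]; k leading NAs restore the length
df$D2 <- c(rep(NA, k), head(df$B, n - k) + tail(df$C, n - k))
df$D2
# NA NA 310 420 530
```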

geneorama · Answer

I added a padding argument, changed some names, and called it shift. https://github.com/geneorama/geneorama/blob/master/R/shift.R