Meaning of ddply error: 'names' attribute [9] must be the same length as the vector [1]

Question

I am working through Machine Learning for Hackers and I am stuck on this line:

from.weight <- ddply(priority.train, .(From.EMail), summarise, Freq = length(Subject))

It produces the following error:

Error in attributes(out) <- attributes(col) : 'names' attribute [9] must be the same length as the vector [1]

Here is the traceback():

> traceback()
11: FUN(1:5[[1L]], ...)
10: lapply(seq_len(n), extract_col_rows, df = x, i = i)
9: extract_rows(x$data, x$index[[i]])
8: `[[.indexed_df`(pieces, i)
7: pieces[[i]]
6: function (i) {
       piece <- pieces[[i]]
       if (.inform) {
           res <- try(.fun(piece, ...))
           if (inherits(res, "try-error")) {
               piece <- paste(capture.output(print(piece)), collapse = "\n")
               stop("with piece ", i, ": \n", piece, call. = FALSE)
           }
       } else {
           res <- .fun(piece, ...)
       }
       progress$step()
       res
   }(1L)
5: .Call("loop_apply", as.integer(n), f, env)
4: loop_apply(n, do.ply)
3: llply(.data = .data, .fun = .fun, ..., .progress = .progress, .inform = .inform, .parallel = .parallel, .paropts = .paropts)
2: ldply(.data = pieces, .fun = .fun, ..., .progress = .progress, .inform = .inform, .parallel = .parallel, .paropts = .paropts)
1: ddply(priority.train, .(From.EMail), summarise, Freq = length(Subject))

The priority.train object is a data frame. Here are more details:

> mode(priority.train)
[1] "list"
> names(priority.train)
[1] "Date" "From.EMail" "Subject" "Message" "Path"
> sapply(priority.train, mode)
       Date  From.EMail     Subject     Message        Path
     "list" "character" "character" "character" "character"
> sapply(priority.train, class)
$Date
[1] "POSIXlt" "POSIXt"

$From.EMail
[1] "character"

$Subject
[1] "character"

$Message
[1] "character"

$Path
[1] "character"

> length(priority.train)
[1] 5
> nrow(priority.train)
[1] 1250
> ncol(priority.train)
[1] 5
> str(priority.train)
'data.frame': 1250 obs. of 5 variables:
 $ Date : POSIXlt, format: "2002-01-31 22:44:14" "2002-02-01 00:53:41" "2002-02-01 02:01:44" "2002-02-01 10:29:23" ...
 $ From.EMail: chr "removed@removed.ca" "removed@removed.net" "removed@removed.ca" "removed@removed.net" ...
 $ Subject : chr "please help a newbie compile mplayer :-)" "re: please help a newbie compile mplayer :-)" "re: please help a newbie compile mplayer :-)" "re: please help a newbie compile mplayer :-)" ...
 $ Message : chr " \n Hello,\n \n I just installed redhat 7.2 and I think I have everything \nworking properly. Anyway I want to in"| __truncated__ "Make sure you rebuild as root and you're in the directory that you\ndownloaded the file. Also it might complain of a few depen"| __truncated__ "Lance wrote:\n\n>Make sure you rebuild as root and you're in the directory that you\n>downloaded the file. Also it might compl"| __truncated__ "Once upon a time, rob wrote :\n\n> I dl'd gcc3 and libgcc3, but I still get the same error message when I \n> try rpm --rebuil"| __truncated__ ...
 $ Path : chr "../03-Classification/data/easy_ham/01061.6610124afa2a5844d41951439d1c1068" "../03-Classification/data/easy_ham/01062.ef7955b391f9b161f3f2106c8cda5edb" "../03-Classification/data/easy_ham/01063.ad3449bd2890a29828ac3978ca8c02ab" "../03-Classification/data/easy_ham/01064.9f4fc60b4e27bba3561e322c82d5f7ff" ...
Warning messages:
1: In encodeString(object, quote = "\"", na.encode = FALSE) : it is not known that wchar_t is Unicode on this platform
2: In encodeString(object, quote = "\"", na.encode = FALSE) : it is not known that wchar_t is Unicode on this platform

I would post a sample, but the content is rather long, and I don't think the content is relevant here.

The same error also occurs here:

> ddply(priority.train, .(Subject))
Error in attributes(out) <- attributes(col) : 'names' attribute [9] must be the same length as the vector [1]

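Does anyone have a clue what is going on here? The error seems to be generated by an object other than priority.train, since the names attribute in question evidently has 9 elements, whereas priority.train has only 5 columns.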

Any help would be appreciated. Thanks!

Problem solved

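I found the problem thanks to @user1317221_G's hint about using the dput function. The problem is in the Date field, which at this point is a list containing 9 fields (sec, min, hour, mday, mon, year, wday, yday, isdst). To work around it, I converted the dates to a character vector, ran ddply, and then put the original dates back: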

> tmp <- priority.train$Date
> priority.train$Date <- as.character(priority.train$Date)
> from.weight <- ddply(priority.train, .(From.EMail), summarise, Freq = length(Subject))
> priority.train$Date <- tmp
> rm(tmp)
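The nine fields named above can be seen directly by stripping the class from a POSIXlt value. The following is only a minimal sketch in base R, not code from the original post; the names lt, parts, and ct are made up for illustration, and depending on the R version the list may also carry extra components such as zone and gmtoff.

```r
# Build a single POSIXlt value; the timestamp is taken from the str() output above.
lt <- as.POSIXlt("2002-01-31 22:44:14", tz = "UTC")

# Underneath its class, a POSIXlt is a list of broken-down time components.
parts <- unclass(lt)
is.list(parts)
names(parts)   # includes sec, min, hour, mday, mon, year, wday, yday, isdst

# The POSIXct form, by contrast, is a plain number per timestamp
# (seconds since 1970-01-01).
ct <- as.POSIXct(lt)
mode(ct)
length(unclass(ct))
```

This would explain the [9] in the error message: the names attribute belongs to the list stored in the Date column rather than to the data frame itself.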

c.gutierrez · Accepted Answer

I fixed this problem by converting the format from POSIXlt to POSIXct, as Hadley suggests above. It takes one line of code:

mydata$datetime <- strptime(mydata$datetime, "%Y-%m-%d %H:%M:%S")  # original conversion from datetime string
> class(mydata$datetime)
[1] "POSIXlt" "POSIXt"
mydata$datetime <- as.POSIXct(mydata$datetime)  # convert to POSIXct to use in data frames / ddply
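To try the conversion end to end, here is a small self-contained sketch. The data frame mail and its addresses are hypothetical stand-ins for priority.train, and the ddply call is guarded so that the snippet still runs when plyr is not installed.

```r
# Hypothetical stand-in for priority.train: three messages from two senders.
mail <- data.frame(
  From.EMail = c("a@example.com", "b@example.com", "a@example.com"),
  Subject    = c("hello", "re: hello", "re: re: hello"),
  stringsAsFactors = FALSE
)

# Assigning with $<- keeps the POSIXlt class that strptime() returns.
mail$Date <- strptime(
  c("2002-01-31 22:44:14", "2002-02-01 00:53:41", "2002-02-01 02:01:44"),
  format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
)
class_before <- class(mail$Date)
class_before

# The fix from this answer: store the column as POSIXct instead.
mail$Date <- as.POSIXct(mail$Date)
class(mail$Date)

# Run the summary from the question only if plyr happens to be installed.
# The grouping variable is given as a string here instead of .(From.EMail).
if (requireNamespace("plyr", quietly = TRUE)) {
  from.weight <- plyr::ddply(mail, "From.EMail", plyr::summarise,
                             Freq = length(Subject))
  print(from.weight)
}
```

Converting the column once, right after parsing, avoids having to swap it out and back around every ddply call.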

user1317221_G · Answer

You have probably already seen this, and it doesn't help. I guess there is probably no answer yet because other people cannot reproduce your error.

A dput, or a smaller head(dput()), might help with that. But here is an alternative using base:

x <- data.frame(A=c("a","b","c","a"),B=c("e","d","d","d"))
ddply(x,.(A),summarise, Freq = length(B))
  A Freq
1 a    2
2 b    1
3 c    1
tapply(x$B,x$A,length)
a b c
2 1 1

Does it work with tapply like this?

x2 <- data.frame(A=c("removed@removed.ca", "removed@removed.net"), B=c("please help a newbie compile mplayer :-)", "re: please help a newbie compile mplayer :-)"))
tapply(x2$B,x2$A,length)
 removed@removed.ca removed@removed.net
                  1                   1
ddply(x2,.(A),summarise, Freq = length(B))
                    A Freq
1  removed@removed.ca    1
2 removed@removed.net    1

You could also try, more simply:

table(x2$A)
 removed@removed.ca removed@removed.net
                  1                   1
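If a data frame shaped like the ddply result is wanted, the table() output can be converted. This is a small sketch using only base R; the example data x is rebuilt so that the snippet runs on its own, and from.weight is reused merely as a result name.

```r
x <- data.frame(A = c("a", "b", "c", "a"),
                B = c("e", "d", "d", "d"),
                stringsAsFactors = FALSE)

# table() counts the rows for each value of A; as.data.frame() reshapes the
# counts into one row per group, with the count column named Freq by default.
from.weight <- as.data.frame(table(A = x$A), stringsAsFactors = FALSE)
from.weight
```

Because only the grouping column is passed to table(), any other column of the data frame, whatever its type, never enters the computation.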

Yishin Lin · Answer

I had a very similar problem, though I am not sure it is the identical one. I received the error below.

Error in attributes(out) <- attributes(col) : 'names' attribute [20388] must be the same length as the vector [128]

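Because I have no variables of mode list, Mota's solution does not work in my situation. The way I sorted out the problem was to remove plyr 1.8 and install plyr 1.7 manually. The error then went away. I also tried reinstalling plyr 1.8 to reproduce the problem.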

HTH.

Ravi · Answer

I also ran into a similar problem with ddply. The following code gave the error below:

test <- ddply(test, "catColumn", function(df) df[1:min(nrow(df), 3),])
Error: 'names' attribute [11] must be the same length as the vector [2]

The data frame "test" had quite a number of categorical variables.

Converting the categorical variables to character variables as follows made the ddply command work:

 test <- data.frame(lapply(test, as.character), stringsAsFactors=FALSE)
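Note that the line above turns every column into character, numeric ones included. A narrower variant, sketched here with a made-up two-column data frame, converts only the factor columns; the column value and the helper vector is_fac are illustrative names, not part of the original answer.

```r
# Hypothetical example data: one factor column and one numeric column.
test <- data.frame(catColumn = factor(c("x", "y", "x")),
                   value     = c(1.5, 2.5, 3.5))

# Find the factor columns and convert only those to character.
is_fac <- vapply(test, is.factor, logical(1))
test[is_fac] <- lapply(test[is_fac], as.character)

# catColumn is now character; value is still numeric.
sapply(test, class)
```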

Chuck · Answer

Once you understand that it is the date column that is interfering, you could also, rather than converting that column, just leave it out when running the command...

So

from.weight <- ddply(priority.train, .(From.EMail), summarise, Freq = length(Subject))

can become

from.weight <- ddply(priority.train[,c(1:7,9:10)], .(From.EMail), summarise, Freq = length(Subject))

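if, for example, the POSIXlt dates are in column 8 of the data frame. What is odd about the reported error is that it has nothing to do with either what you are grouping by or what you are asking for as output...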
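Instead of hard-coding column positions, the POSIXlt columns can be found by class and dropped. This is a sketch under assumptions: mail is a hypothetical three-row stand-in for priority.train, and is_lt and slim are illustrative names.

```r
# Hypothetical stand-in for priority.train with a POSIXlt Date column.
mail <- data.frame(
  From.EMail = c("a@example.com", "b@example.com", "a@example.com"),
  Subject    = c("s1", "s2", "s3"),
  stringsAsFactors = FALSE
)
mail$Date <- as.POSIXlt(
  c("2002-01-31 22:44:14", "2002-02-01 00:53:41", "2002-02-01 02:01:44"),
  tz = "UTC"
)

# Flag every column that inherits from POSIXlt, then keep the others.
is_lt <- vapply(mail, inherits, logical(1), what = "POSIXlt")
slim  <- mail[!is_lt]
names(slim)
```

The reduced data frame slim could then be passed to ddply in place of the index-based subset.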

Vanessa · Answer

I had the same problem when using ddply and fixed it with doBy:

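library(doBy)
bylength = function(x){length(x)}  # number of values in each group
# summaryBy() from doBy applies FUN to X within each From.EMail / To.EMail group
newdt = summaryBy(X ~ From.EMail + To.EMail, data = dt, FUN = bylength)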

Nts · Answer

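I was facing the same problem too. It was solved simply by keeping only the data that ddply needs and using as.character to convert the filter variable and all the required text variables to character.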

It worked.