Doing a plyr operation on every row of a data frame in R
I like the plyr syntax. Any time I have to use one of the *apply() commands, I end up kicking the dog and going on a 3-day bender. So, for the sake of my dog and my liver, what's a concise syntax for doing a ddply operation on every row of a data frame?
Here's an example that works for a simple case:
x <- rnorm(10)
y <- rnorm(10)
df <- data.frame(x, y)
ddply(df, names(df), function(df) max(df$x, df$y))
That works fine and gives me what I want. But if things get more complex, this causes plyr to get funky (and not in a Bootsy Collins way), because plyr is chewing on making "levels" out of all those floating point values:
x <- rnorm(1000)
y <- rnorm(1000)
z <- rnorm(1000)
myletters <- sample(letters, 1000, replace = TRUE)
df <- data.frame(x, y, z, myletters)
ddply(df, names(df), function(df) max(df$x, df$y))
On my box this chews for a few minutes and then returns:
Error: memory exhausted (limit reached?)
In addition: Warning messages:
1: In paste(rep(l, each = ll), rep(lvs, length(l)), sep = sep) :
  Reached total allocation of 1535Mb: see help(memory.size)
2: In paste(rep(l, each = ll), rep(lvs, length(l)), sep = sep) :
  Reached total allocation of 1535Mb: see help(memory.size)
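To get a feel for why grouping on every column gets out of hand, here is a rough base R sketch that counts the distinct values per column and multiplies them together. It assumes the blow-up comes from building labels for combinations of levels across the grouping columns (the `paste(rep(...))` calls in the warnings hint at that, but this is a guess about plyr's internals, not a confirmed diagnosis). The seed and the variable names `distinct_per_column` and `potential_combinations` are made up for illustration.

```r
# Rebuild the data frame from the question (seed is an assumption, for repeatability)
set.seed(1)
x <- rnorm(1000)
y <- rnorm(1000)
z <- rnorm(1000)
myletters <- sample(letters, 1000, replace = TRUE)
df <- data.frame(x, y, z, myletters)

# Number of distinct values ("levels") each grouping column would contribute
distinct_per_column <- sapply(df, function(col) length(unique(col)))
print(distinct_per_column)

# Upper bound on level combinations if every column is crossed with every other
potential_combinations <- prod(distinct_per_column)
print(potential_combinations)

# Compare with the number of groups that actually exist: one per row
print(nrow(unique(df)))
```

The data frame only has 1000 real groups, but the crossed level space is many orders of magnitude larger, which would be consistent with the memory error above.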
I think I am totally abusing plyr, and I am not saying this is a bug in plyr, but rather abusive behavior by me (liver and dog notwithstanding).
So in short, is there a syntax shortcut for using ddply to operate on every row, as a substitute for apply(X, 1, ...)?
The workaround I've been using is to create a "key" that gives a unique value for every row, and then I can join back to it.
x <- rnorm(1000)
y <- rnorm(1000)
z <- rnorm(1000)
myletters <- sample(letters, 1000, replace = TRUE)
df <- data.frame(x, y, z, myletters)
# make a key
df$mykey <- 1:nrow(df)
myout <- merge(df, ddply(df, "mykey", function(df) max(df$x, df$y)))
# knock out the key
myout$mykey <- NULL
But I keep thinking "there has to be a better way".
Thanks!
Just treat it as an array and work on each row:
adply(df, 1, transform, max = max(x, y))
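For anyone who wants to try this on a small example, here is a minimal sketch using a five-row data frame (the seed and the names `row_wise` and `vectorised` are just for illustration). It also shows a base R alternative for this particular case: since the goal is a row-wise maximum of two columns, `pmax()` should give the same values without splitting the data frame at all. That is a swapped-in shortcut for this specific function, not a general replacement for `adply`.

```r
library(plyr)

set.seed(42)  # assumed seed, only so the run is repeatable
df <- data.frame(x = rnorm(5), y = rnorm(5))

# Row-by-row: adply splits df along margin 1 (rows) and applies transform to each piece
row_wise <- adply(df, 1, transform, max = max(x, y))
print(row_wise)

# Vectorised alternative for this specific case: pmax takes the element-wise maximum
vectorised <- transform(df, max = pmax(x, y))
print(vectorised)

# Both approaches should agree on the new column
print(all.equal(row_wise$max, vectorised$max))
```

The `adply` form stays useful when the per-row function has no vectorised equivalent; when one exists, the vectorised call will usually be much faster on large data frames.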