I had a hard time phrasing the title of this question, so if you can suggest a more accurate title, I'd be grateful.
I have a matrix of ~100 samples, with readings at ~200,000 specific locations. The locations are string identifiers, and the sample readings are numeric. Example data:
    library(data.table)
    str_col = c("str1", "str2", "str3")
    s1 = c(54.4, 19.0, 89.0)
    s2 = c(46.6, 39.5, 85.2)
    s3 = c(12.3, 0.2, 55.8)
    dt = data.table(str_col, s1, s2, s3)
    dt
       str_col   s1   s2   s3
    1:    str1 54.4 46.6 12.3
    2:    str2 19.0 39.5  0.2
    3:    str3 89.0 85.2 55.8
I want to calculate the standard deviation of the readings at each location (rows) across the sample columns (s1 - s3), while leaving out the location identifier str_col.
My attempt at this was:
    dt[, -1, with=FALSE][, stdv := apply(.SD, 1, sd)]
    dt
       str_col   s1   s2   s3
    1:    str1 54.4 46.6 12.3
    2:    str2 19.0 39.5  0.2
    3:    str3 89.0 85.2 55.8
However, this left the original data.table unchanged, as you can see.
I can do the desired operation in steps as follows:
    dt_str_col = dt[, .(str_col)]
    dt2 = dt[, -1, with=FALSE][, stdv := apply(.SD, 1, sd)]
    dt3 = data.table(dt_str_col, dt2)
    dt3
       str_col   s1   s2   s3     stdv
    1:    str1 54.4 46.6 12.3 22.39695
    2:    str2 19.0 39.5  0.2 19.65613
    3:    str3 89.0 85.2 55.8 18.17067
But I was wondering if there is a way to do this by reference in data.table, similar to my first attempt, dt[, -1, with=FALSE][, stdv := apply(.SD, 1, sd)]?
I believe this would solve your problem, wouldn't it?
    dt[, sdd := sd(.SD), by = str_col]
    dt
    #>    str_col   s1   s2   s3  sdd
    #> 1:    str1 54.4 46.6 12.3 22.4
    #> 2:    str2 19.0 39.5  0.2 19.7
    #> 3:    str3 89.0 85.2 55.8 18.2
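For comparison, the apply() idea from the question can also update dt in place if the column selection is done with .SDcols instead of a chained dt[, -1, with=FALSE], which produces a new data.table that := then modifies. The following is a minimal sketch on the example data; the name sample_cols is introduced here for illustration and assumes every column other than str_col is a sample column.

```r
library(data.table)

dt <- data.table(
  str_col = c("str1", "str2", "str3"),
  s1 = c(54.4, 19.0, 89.0),
  s2 = c(46.6, 39.5, 85.2),
  s3 = c(12.3, 0.2, 55.8)
)

# Assumption: every column except the identifier holds sample readings.
sample_cols <- setdiff(names(dt), "str_col")

# := adds stdv to dt itself; .SDcols restricts .SD to the sample columns,
# so apply() computes sd() over each row of s1..s3 only.
dt[, stdv := apply(.SD, 1, sd), .SDcols = sample_cols]
dt
```

Because no grouping is involved, this version should not depend on str_col being unique.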
In case there is more than one row per str_col (i.e. you want to calculate the standard deviation for each row), you can do this:
    # create a column with row positions
    dt[, rowpos := .I]
    dt[, sdd := sd(.SD[, -1, with=FALSE]), by = rowpos]
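If grouping by a helper column feels heavy, the row-wise standard deviation can also be written directly from its definition using rowMeans() and rowSums(). This is only a sketch under assumptions: the sample columns are named explicitly in a hypothetical sample_cols vector, the example data below deliberately repeats an identifier, and no timing was measured.

```r
library(data.table)

dt <- data.table(
  str_col = c("str1", "str1", "str2"),  # identifier repeated on purpose
  s1 = c(54.4, 19.0, 89.0),
  s2 = c(46.6, 39.5, 85.2),
  s3 = c(12.3, 0.2, 55.8)
)

# Assumption: the sample columns are known by name.
sample_cols <- c("s1", "s2", "s3")

dt[, sdd := {
  m <- as.matrix(.SD)                      # rows = locations, cols = samples
  centered <- m - rowMeans(m)              # subtract each row's mean
  sqrt(rowSums(centered^2) / (ncol(m) - 1))  # sample sd with n - 1 denominator
}, .SDcols = sample_cols]
dt
```

The result should match apply(.SD, 1, sd) on the same columns, without needing rowpos or a by clause.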