R - Subset data.table columns for calculation of a new value but keep all old columns + new value


I had a hard time phrasing the title of this question; if you can suggest a more accurate title, I would be grateful.

I have a matrix of ~100 samples, with readings at ~200,000 specific locations. The locations are string identifiers and the sample readings are numeric. Example data:

    library(data.table)
    str_col = c("str1", "str2", "str3")
    s1 = c(54.4, 19.0, 89.0)
    s2 = c(46.6, 39.5, 85.2)
    s3 = c(12.3, 0.2, 55.8)

    dt = data.table(str_col, s1, s2, s3)
    dt
       str_col   s1   s2   s3
    1:    str1 54.4 46.6 12.3
    2:    str2 19.0 39.5  0.2
    3:    str3 89.0 85.2 55.8

I want to calculate the standard deviation of the readings at each location (rows) across the sample columns (s1 - s3) while leaving out the location identifier str_col.

My attempt at this was:

    dt[, -1, with=FALSE][, stdv := apply(.SD, 1, sd)]
    dt
       str_col   s1   s2   s3
    1:    str1 54.4 46.6 12.3
    2:    str2 19.0 39.5  0.2
    3:    str3 89.0 85.2 55.8

However, this just returned the original data.table, as you can see.

I can do the desired operation in steps as follows:

    dt_str_col = dt[, .(str_col)]
    dt2 = dt[, -1, with=FALSE][, stdv := apply(.SD, 1, sd)]
    dt3 = data.table(dt_str_col, dt2)
    dt3
       str_col   s1   s2   s3     stdv
    1:    str1 54.4 46.6 12.3 22.39695
    2:    str2 19.0 39.5  0.2 19.65613
    3:    str3 89.0 85.2 55.8 18.17067

But I was wondering if there is a way to do this by reference in data.table, similar to my first attempt dt[, -1, with=FALSE][, stdv := apply(.SD, 1, sd)]?

I believe this would solve your problem, wouldn't it?

    # group by the identifier so that .SD holds only the sample columns
    dt[, sdd := sd(.SD), by = str_col]

    dt
    #>    str_col   s1   s2   s3  sdd
    #> 1:    str1 54.4 46.6 12.3 22.4
    #> 2:    str2 19.0 39.5  0.2 19.7
    #> 3:    str3 89.0 85.2 55.8 18.2
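The reason grouping helps is that .SD leaves out the grouping column, so with by = str_col only the numeric sample columns reach sd(). Here is a small sketch to check that on the example data. The name cols_seen is just an illustrative helper, and I use sd(unlist(.SD)) so the one-row .SD is flattened to a plain numeric vector explicitly:

```r
library(data.table)

dt <- data.table(
  str_col = c("str1", "str2", "str3"),
  s1 = c(54.4, 19.0, 89.0),
  s2 = c(46.6, 39.5, 85.2),
  s3 = c(12.3, 0.2, 55.8)
)

# Which columns does .SD contain inside each group?
# (cols_seen is an illustrative name, not part of the original answer)
cols_seen <- dt[, .(cols = paste(names(.SD), collapse = ",")), by = str_col]
print(cols_seen)

# Add the per-row standard deviation by reference; unlist() turns the
# one-row .SD into a numeric vector before sd() is applied
dt[, sdd := sd(unlist(.SD)), by = str_col]
print(dt)
```

This relies on every str_col value being unique, so that each group is exactly one row.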

In case there is more than one row per str_col (i.e. you want to calculate the standard deviation by rows), you can do this:

    # create a column with row positions
    dt[, rowpos := .I]

    # group by row position; drop str_col (the first column of .SD) before sd()
    dt[, sdd := sd(.SD[, -1, with=FALSE]), by = rowpos]
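As a possible alternative that avoids the helper column, .SDcols can restrict .SD to the sample columns while := still adds the result to dt itself. This is only a sketch on the example data, and sample_cols is a name I made up for illustration. It also shows why the first attempt in the question left dt unchanged: dt[, -1, with=FALSE] returns a new data.table, so the chained := added stdv to that temporary copy rather than to dt.

```r
library(data.table)

dt <- data.table(
  str_col = c("str1", "str2", "str3"),
  s1 = c(54.4, 19.0, 89.0),
  s2 = c(46.6, 39.5, 85.2),
  s3 = c(12.3, 0.2, 55.8)
)

# Every column except the identifier (sample_cols is an illustrative name)
sample_cols <- setdiff(names(dt), "str_col")

# .SDcols limits .SD to the sample columns; := adds stdv to dt by reference,
# so str_col and all the original columns are kept
dt[, stdv := apply(.SD, 1, sd), .SDcols = sample_cols]
print(dt)
```

Because there is no by here, this does not depend on str_col being unique. apply() converts .SD to a matrix first, which is fine for all-numeric sample columns.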