R - data.table: confusion using the rollends argument of a rolling join


I'm having trouble understanding data.table's rollends argument when doing a rolling join.

The docs say:

A logical vector of length 2 (a single logical is recycled) indicating whether values falling before the first value or after the last value for a group should be rolled as well.

  • If rollends[2]=TRUE, it will roll the last value forward. TRUE by default for LOCF and FALSE for NOCB rolls.

  • If rollends[1]=TRUE, it will roll the first value backward. TRUE by default for NOCB and FALSE for LOCF rolls.


Now for a confusing example. Here, I build a table of commercials and 2 different tables of sales.

    library(data.table)

    # commercials
    commercials <- data.table(commercialid=c("c1","c2","c3","c4"),
                              commercialdate=as.Date(c("2014-1-1","2014-4-1","2014-7-1","2014-9-15")))
    commercials[, rolldate:=commercialdate]  # add a column, rolldate, equal to commercialdate
    setkey(commercials, "rolldate")

    commercials
       commercialid commercialdate   rolldate
    1:           c1     2014-01-01 2014-01-01
    2:           c2     2014-04-01 2014-04-01
    3:           c3     2014-07-01 2014-07-01
    4:           c4     2014-09-15 2014-09-15

    # sales1 (a single sale before the commercials)
    sales1 <- data.table(saleid=c("s0"), saledate=as.Date(c("2010-12-31")))
    sales1[, rolldate:=saledate]
    setkey(sales1, "rolldate")

    sales1
       saleid   saledate   rolldate
    1:     s0 2010-12-31 2010-12-31

    # sales2 (a sale before the commercials and a sale after commercial 1)
    sales2 <- data.table(saleid=c("s0", "s1"), saledate=as.Date(c("2010-12-31", "2014-2-1")))
    sales2[, rolldate:=saledate]
    setkey(sales2, "rolldate")

    sales2
       saleid   saledate   rolldate
    1:     s0 2010-12-31 2010-12-31
    2:     s1 2014-02-01 2014-02-01

Now for the rolling joins:

    sales1[commercials, roll=TRUE, rollends=c(TRUE, FALSE)]
       saleid saledate   rolldate commercialid commercialdate
    1:     NA     <NA> 2014-01-01           c1     2014-01-01
    2:     NA     <NA> 2014-04-01           c2     2014-04-01
    3:     NA     <NA> 2014-07-01           c3     2014-07-01
    4:     NA     <NA> 2014-09-15           c4     2014-09-15

    sales2[commercials, roll=TRUE, rollends=c(TRUE, FALSE)]
       saleid   saledate   rolldate commercialid commercialdate
    1:     s0 2010-12-31 2014-01-01           c1     2014-01-01
    2:     NA       <NA> 2014-04-01           c2     2014-04-01
    3:     NA       <NA> 2014-07-01           c3     2014-07-01
    4:     NA       <NA> 2014-09-15           c4     2014-09-15

Questions

  1. Why is sale s0 mapped to c1 in the second join but not in the first?
  2. Is there a better/different explanation of what rollends is doing?

Oh, and I'm using the development version, 1.9.7.

In the first case,

    sales1[commercials, roll=TRUE, rollends=c(TRUE, FALSE)]

the 2014-01-01 row in commercials falls after 2010-12-31, so the prevailing value has to be carried forward. But it falls on the end, i.e., after the last value in sales1, and you've provided rollends[2] = FALSE. So it doesn't get rolled forward. The same holds for the other three rows of commercials.
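To check that rollends[2] is what blocks the roll here, a minimal sketch can compare the default with the setting from the question. It rebuilds the question's tables; the names with_default and no_forward are mine, and it joins with on = "rolldate" rather than relying on keys, which assumes a data.table version that supports on=.

```r
library(data.table)

commercials <- data.table(
  commercialid   = c("c1", "c2", "c3", "c4"),
  commercialdate = as.Date(c("2014-01-01", "2014-04-01", "2014-07-01", "2014-09-15"))
)
commercials[, rolldate := commercialdate]

sales1 <- data.table(saleid = "s0", saledate = as.Date("2010-12-31"))
sales1[, rolldate := saledate]

# Default for roll = TRUE (LOCF) is rollends = c(FALSE, TRUE):
# the last sale is carried forward past the end of sales1.
with_default <- sales1[commercials, on = "rolldate", roll = TRUE]

# rollends[2] = FALSE: nothing is carried forward past the last sale.
no_forward <- sales1[commercials, on = "rolldate", roll = TRUE,
                     rollends = c(TRUE, FALSE)]

with_default$saleid  # "s0" "s0" "s0" "s0"
no_forward$saleid    # NA NA NA NA
```

Every commercial date lies after the only sale, so every row is an "end" case and the result depends entirely on rollends[2].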

In the second case,

    sales2[commercials, roll=TRUE, rollends=c(TRUE, FALSE)]

the 2014-01-01 row in commercials falls in between 2010-12-31 and 2014-02-01. So rollends has no effect on this row, since it doesn't fall on either end, and the last value (s0) gets rolled forward.

All the other values fall after the last value in sales2, so the rollends argument comes into play, and rollends[2] = FALSE means the prevailing value won't be rolled forward.
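The same comparison for sales2 separates the interior row from the end rows. This is again only a sketch: the tables are rebuilt from the question, interior_only and with_default are illustrative names, and on = "rolldate" is used in place of the keys (assuming a data.table version with on= support).

```r
library(data.table)

commercials <- data.table(
  commercialid   = c("c1", "c2", "c3", "c4"),
  commercialdate = as.Date(c("2014-01-01", "2014-04-01", "2014-07-01", "2014-09-15"))
)
commercials[, rolldate := commercialdate]

sales2 <- data.table(
  saleid   = c("s0", "s1"),
  saledate = as.Date(c("2010-12-31", "2014-02-01"))
)
sales2[, rolldate := saledate]

# rollends[2] = FALSE: only c1, which lies between the two sales, gets a match.
interior_only <- sales2[commercials, on = "rolldate", roll = TRUE,
                        rollends = c(TRUE, FALSE)]

# Default rollends for LOCF, c(FALSE, TRUE): s1 is also carried past the end.
with_default <- sales2[commercials, on = "rolldate", roll = TRUE]

interior_only$saleid  # "s0" NA NA NA
with_default$saleid   # "s0" "s1" "s1" "s1"
```

The c1 row is identical in both results, which shows that rollends only affects rows falling before the first or after the last value of the table being rolled.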