R: dplyr group by date range
I'm trying to group a data frame into 3 date ranges, split at the dates "2016-04-10" and "2016-04-24".
df <- structure(list(
  date = structure(c(16803, 16810, 16817, 16824, 16831, 16838, 16845, 16852, 16859, 16866, 16873, 16880, 16887, 16894, 16901, 16908, 16915, 16922, 16929, 16936, 16943), class = "Date"),
  new = c(1507L, 2851L, 3550L, 5329L, 7557L, 5546L, 6264L, 7160L, 9468L, 5789L, 5928L, 4642L, 8145L, 4867L, 4846L, 5231L, 7137L, 3938L, 3741L, 2937L, 194L),
  resolved = c(21, 27, 15, 16, 56, 2773, 8490, 8748, 9325, 7734, 10264, 6739, 6110, 9613, 10314, 10349, 7200, 9637, 10831, 11170, 5666),
  ost = c(1486, 2824, 3535, 5313, 7501, 2773, -2226, -1588, 143, -1945, -4336, -2097, 2035, -4746, -5468, -5118, -63, -5699, -7090, -8233, -5472)),
  class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -21L), .names = c("date", "new", "resolved", "ost"))
I tried the following:
df1 <- df %>% group_by(dr = cut(date, breaks = as.Date(c("2016-04-10", "2016-04-24")))) %>% summarise(ost = sum(ost))
which gives the wrong result below:
dr          ost
2016-04-10  -10586
NA          -17885
Any help is appreciated!
You can create a grouping variable first:
df %>%
  mutate(group = cumsum(grepl('2016-04-10|2016-04-24', date))) %>%
  group_by(group) %>%
  summarise(ost = sum(ost))
#Source: local data frame [3 x 2]
#
#  group    ost
#  (int)  (dbl)
#1     0   8672
#2     1 -10586
#3     2 -26557
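As a possible alternative (a sketch, not part of the answer above): the cut() call in the question returns NA for every row outside the two break dates, because cut() only labels values that fall between its first and last break. Assuming each range should start on its break date, base R's findInterval() assigns every row to group 0, 1, or 2 without relying on the break dates appearing in the data. The data frame below rebuilds only the date and ost columns from the question, and the names breaks and res are illustrative.

```r
# Weekly dates and ost values copied from the question's data frame
df <- data.frame(
  date = as.Date(seq(16803, 16943, by = 7), origin = "1970-01-01"),
  ost  = c(1486, 2824, 3535, 5313, 7501, 2773, -2226, -1588, 143, -1945,
           -4336, -2097, 2035, -4746, -5468, -5118, -63, -5699, -7090,
           -8233, -5472)
)

# The two cut points that split the data into three ranges
breaks <- as.Date(c("2016-04-10", "2016-04-24"))

# findInterval() counts how many breaks are <= each date:
# 0 = before 2016-04-10, 1 = from 2016-04-10 up to (not including) 2016-04-24,
# 2 = 2016-04-24 onwards
df$group <- findInterval(as.numeric(df$date), as.numeric(breaks))

# Sum ost within each group
res <- tapply(df$ost, df$group, sum)
print(res)
```

The same group column could be created inside mutate() and passed to group_by() in the dplyr pipeline, if that style is preferred.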