| Title: | Time-Based Rolling Functions |
|---|---|
| Description: | Provides rolling statistical functions based on date and time windows instead of n-lagged observations. |
| Authors: | Michael Schramm [aut, cre, cph] (ORCID: <https://orcid.org/0000-0003-1876-6592>), Frank Harrell [ctb], Bob Rudis [ctb] |
| Maintainer: | Michael Schramm <[email protected]> |
| License: | GPL-3 \| file LICENSE |
| Version: | 0.1.7 |
| Built: | 2026-05-29 10:38:36 UTC |
| Source: | https://github.com/mps9506/tbrf |
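The difference between a time-based window and an n-lagged window can be shown with a small base R sketch. The data, the 30-day width, and the rule that a window covers the current date and the 30 days before it are all assumptions made for illustration; this is not the package's own implementation.

```r
# Hypothetical, irregularly spaced observations
obs <- data.frame(
  date  = as.Date(c("2020-01-01", "2020-01-02", "2020-01-20",
                    "2020-03-15", "2020-03-16")),
  value = c(1, 2, 3, 4, 5)
)

# n-lagged window: mean of the current and the two preceding observations,
# regardless of how far apart in time they are
lag_mean <- sapply(seq_len(nrow(obs)), function(i) {
  mean(obs$value[max(1, i - 2):i])
})

# Time-based window: mean of every observation recorded in the 30 days
# up to and including the current date
time_mean <- sapply(seq_len(nrow(obs)), function(i) {
  elapsed <- as.numeric(obs$date[i] - obs$date)  # days since each observation
  mean(obs$value[elapsed >= 0 & elapsed <= 30])
})

cbind(obs, lag_mean, time_mean)
```

The two columns diverge at the March observations: the lagged mean still averages in January values, while the time-based mean only uses values from the preceding 30 days.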
Data from the Texas Commission on Environmental Quality Surface Water Quality Monitoring Information System. The 'Average_DO' field is the mean of dissolved oxygen concentrations (mg/L) measured at a field site on that day. The 'Min_DO' field is the minimum dissolved oxygen concentration measured at that site on that day.
data(Dissolved_Oxygen)
A data frame with 236 rows and 6 variables:
unique water quality monitoring station identifier
sampling date in yyyy-mm-dd format
unique parameter code
parameter description with units
mean of dissolved oxygen measurement, in mg/L
minimum of dissolved oxygen measurement, in mg/L
https://www80.tceq.texas.gov/SwqmisPublic/public/default.htm
Data from the Texas Commission on Environmental Quality Surface Water Quality Monitoring Information System. The 'Value' field is the lab-measured value of Enterococci bacteria (MPN/100 mL) from grab samples collected at 'Station ID' on the Tres Palacios River on 'Date'.
data(Entero)
A data frame with 212 rows and 5 variables:
unique water quality monitoring station identifier
sampling date in yyyy-mm-dd format
unique parameter code
parameter description with units
Enterococci concentration, in MPN/100 mL
https://www80.tceq.texas.gov/SwqmisPublic/public/default.htm
Provides stairstep values for ribbon plots. This was originally in Bob Rudis's ggalt package which is no longer on CRAN.
stat_stepribbon(mapping = NULL, data = NULL, geom = "ribbon", position = "identity", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, direction = "hv", ...)
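| mapping | Set of aesthetic mappings created by aes(). |
|---|---|
| data | The data to be displayed in this layer. There are three options: if NULL (the default), the data is inherited from the plot data; a data.frame or other object overrides the plot data; a function is called with the plot data as its single argument. |
| geom | which geom to use; defaults to "ribbon" |
| position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. |
| na.rm | If FALSE (the default), missing values are removed with a warning. If TRUE, missing values are silently removed. |
| show.legend | logical. Should this layer be included in the legends? |
| inherit.aes | If FALSE, overrides the default aesthetics, rather than combining with them. |
| direction | direction of the steps; defaults to "hv" |
| ... | Other arguments passed on to layer(). |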
Bob Rudis
https://groups.google.com/forum/?fromgroups=#!topic/ggplot2/9cFWHaH1CPs
    library(ggplot2)
    library(tbrf)

    x <- 1:10
    df <- data.frame(x = x, y = x + 10, ymin = x + 7, ymax = x + 12)

    # Stepped ribbon using the default direction
    gg <- ggplot(df, aes(x, y))
    gg <- gg + geom_ribbon(aes(ymin = ymin, ymax = ymax),
                           stat = "stepribbon", fill = "#b2b2b2")
    gg <- gg + geom_step(color = "#2b2b2b")
    gg

    # Same plot with the direction given explicitly
    gg <- ggplot(df, aes(x, y))
    gg <- gg + geom_ribbon(aes(ymin = ymin, ymax = ymax),
                           stat = "stepribbon", fill = "#b2b2b2",
                           direction = "hv")
    gg <- gg + geom_step(color = "#2b2b2b")
    gg
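To make "stairstep values" concrete, the base R sketch below expands a series into horizontal-then-vertical steps, which is the usual meaning of "hv" for step geoms. The helper name stairstep_hv is hypothetical and the code is a simplified illustration, not the code used by stat_stepribbon.

```r
# Hypothetical helper: expand (x, y) into "hv" stairstep coordinates.
# Each value is held horizontally until the next x, then jumps vertically.
stairstep_hv <- function(x, y) {
  n <- length(x)
  data.frame(
    x = c(x[1], rep(x[-1], each = 2)),
    y = c(rep(y[-n], each = 2), y[n])
  )
}

steps <- stairstep_hv(1:3, c(10, 20, 30))
steps
```

Applying the same expansion to both ymin and ymax gives the outline of a stepped ribbon that lines up with a step line drawn from y.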
Produces a rolling time-window-based vector of binomial probabilities and confidence intervals.
tbr_binom(.tbl, x, tcolumn, unit = "years", n, alpha = 0.05, na.pad = TRUE)
| .tbl | dataframe with two variables. |
|---|---|
| x | indicates the variable column containing "success" and "failure" observations coded as 1 or 0. |
| tcolumn | indicates the variable column containing Date or Date-Time values. |
| unit | character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
| n | numeric, describing the length of the time window in the selected units. |
| alpha | numeric, probability of a type 1 error, so confidence coefficient = 1-alpha |
| na.pad | logical. If 'na.pad = TRUE', incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'. |
tibble with binomial point estimate and confidence intervals.
    ## Generate Sample Data
    df <- tibble::tibble(
      date = sample(seq(as.Date('2000-01-01'), as.Date('2015/12/30'), by = "day"), 100),
      value = rbinom(100, 1, 0.25)
    )

    ## Run Function
    tbr_binom(df, x = value, tcolumn = date, unit = "years", n = 5,
              alpha = 0.1, na.pad = FALSE)
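The calculation for a single window can be sketched in base R as follows. This is only an illustration under stated assumptions: the data are simulated, the five-year window is approximated as 5 * 365.25 days, and the interval is the exact (Clopper-Pearson) interval from binom.test, which may differ from the interval method that tbr_binom uses.

```r
set.seed(42)

# Hypothetical success/failure record sampled every six months
obs <- data.frame(
  date  = seq(as.Date("2000-01-01"), by = "6 months", length.out = 20),
  value = rbinom(20, 1, 0.25)
)

alpha       <- 0.1
window_days <- 5 * 365.25  # assumed approximation of a 5-year window

# Window ending at the last observation
i         <- nrow(obs)
elapsed   <- as.numeric(obs$date[i] - obs$date)
in_window <- elapsed >= 0 & elapsed <= window_days

successes <- sum(obs$value[in_window])
trials    <- sum(in_window)

# Exact binomial confidence interval with coefficient 1 - alpha
ci <- binom.test(successes, trials, conf.level = 1 - alpha)$conf.int

c(estimate = successes / trials, lower = ci[1], upper = ci[2])
```

Repeating this for every row, with the window ending at that row's date, gives one point estimate and one interval per observation.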
Produces a rolling time-window-based vector of geometric means and confidence intervals.
tbr_gmean(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, ...)
| .tbl | a data frame with at least two variables; time column formatted as date, date/time and value column. |
|---|---|
| x | column containing the values to calculate the geometric mean. |
| tcolumn | formatted time column. |
| unit | character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
| n | numeric, describing the length of the time window. |
| na.pad | logical. If 'na.pad = TRUE', incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'. |
| ... | additional arguments passed to |
tibble with columns for the rolling geometric mean and upper and lower confidence levels.
    ## Return a tibble with new rolling geometric mean column
    tbr_gmean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date,
              unit = "years", n = 5, na.pad = FALSE)

    ## Not run:
    ## Return a tibble with rolling geometric mean and 95% CI
    tbr_gmean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date,
              unit = "years", n = 5, conf = .95)
    ## End(Not run)
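For reference, the geometric mean of positive values is the exponential of the mean of their logarithms. The helper below is a hypothetical base R sketch of that definition, not the function the package uses internally.

```r
# Hypothetical helper: geometric mean of strictly positive values
gmean <- function(x) {
  exp(mean(log(x)))
}

# Values spanning two orders of magnitude
gm <- gmean(c(1, 10, 100))
am <- mean(c(1, 10, 100))
c(geometric = gm, arithmetic = am)
```

For skewed data such as bacteria counts, the geometric mean is pulled far less by a single large value than the arithmetic mean is.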
Produces a rolling time-window-based vector of means and confidence intervals.
tbr_mean(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, ...)
| .tbl | a data frame with at least two variables; time column formatted as date, date/time and value column. |
|---|---|
| x | column containing the numeric values to calculate the mean. |
| tcolumn | formatted time column. |
| unit | character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
| n | numeric, describing the length of the time window. |
| na.pad | logical. If 'na.pad = TRUE', incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'. |
| ... | additional arguments passed to |
tibble with columns for the rolling mean and upper and lower confidence intervals.
    ## Return a tibble with new rolling mean column
    tbr_mean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date,
             unit = "years", n = 5, na.pad = FALSE)

    ## Not run:
    ## Return a tibble with rolling mean and 95% CI
    tbr_mean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date,
             unit = "years", n = 5, conf = .95)
    ## End(Not run)
Produces a rolling time-window-based vector of medians and confidence intervals.
tbr_median(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, ...)
| .tbl | a data frame with at least two variables; time column formatted as date, date/time and value column. |
|---|---|
| x | column containing the numeric values to calculate the median. |
| tcolumn | formatted time column. |
| unit | character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
| n | numeric, describing the length of the time window. |
| na.pad | logical. If 'na.pad = TRUE', incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'. |
| ... | additional arguments passed to |
tibble with columns for the rolling median and upper and lower confidence intervals.
    ## Return a tibble with new rolling median column
    tbr_median(Dissolved_Oxygen, x = Average_DO, tcolumn = Date,
               unit = "years", n = 5, na.pad = FALSE)

    ## Not run:
    ## Return a tibble with rolling median and 95% CI
    tbr_median(Dissolved_Oxygen, x = Average_DO, tcolumn = Date,
               unit = "years", n = 5, conf = .95)
    ## End(Not run)
Use Generic Functions with Time Windows
tbr_misc(.tbl, x, tcolumn, unit = "years", n, na.pad = TRUE, func, ...)
| .tbl | a data frame with at least two variables; time column formatted as date, date/time and value column. |
|---|---|
| x | column containing the values the function is applied to. |
| tcolumn | formatted time column. |
| unit | character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
| n | numeric, describing the length of the time window. |
| na.pad | logical. If 'na.pad = TRUE', incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'. |
| func | specified function |
| ... | optional additional arguments passed to function |
tibble
tbr_misc(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE, func = mean)
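The idea of applying an arbitrary function over time windows can be sketched in base R. The helper roll_by_time, its arguments, and the sample data are hypothetical; the window is assumed to cover the current date and the preceding width_days days, and the function is assumed to return a single number.

```r
# Hypothetical helper: apply `func` to the values inside each time window.
# Extra arguments in `...` are forwarded to `func`.
roll_by_time <- function(dates, values, width_days, func, ...) {
  vapply(seq_along(dates), function(i) {
    elapsed <- as.numeric(dates[i] - dates)  # days since each observation
    func(values[elapsed >= 0 & elapsed <= width_days], ...)
  }, numeric(1))
}

dates  <- as.Date(c("2021-01-01", "2021-01-05", "2021-01-20", "2021-02-25"))
values <- c(4, 8, 6, 2)

# Rolling 30-day maximum
roll_max <- roll_by_time(dates, values, width_days = 30, func = max)
roll_max
```

Any function that reduces a numeric vector to one number (for example min, var, or a user-defined summary) can be swapped in for max.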
Time-Based Rolling Standard Deviation
tbr_sd(.tbl, x, tcolumn, unit = "years", n, na.rm = FALSE, na.pad = TRUE)
| .tbl | a data frame with at least two variables; time column formatted as date, date/time and value column. |
|---|---|
| x | column containing the values to calculate the standard deviation. |
| tcolumn | formatted time column. |
| unit | character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
| n | numeric, describing the length of the time window. |
| na.rm | logical. Should missing values be removed? |
| na.pad | logical. If 'na.pad = TRUE', incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'. |
tibble with column for the rolling sd.
tbr_sd(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE)
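The effect of na.rm on a single window can be seen with base R's sd. The values below are made up for illustration, and the sketch assumes the argument behaves like the na.rm argument of sd.

```r
# Hypothetical window containing one missing value
x <- c(5.1, NA, 6.3, 5.8)

sd_keep <- sd(x)                # NA: the missing value propagates
sd_drop <- sd(x, na.rm = TRUE)  # computed from the three observed values

c(na.rm_FALSE = sd_keep, na.rm_TRUE = sd_drop)
```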
Time-Based Rolling Sum
tbr_sum(.tbl, x, tcolumn, unit = "years", n, na.rm = FALSE, na.pad = TRUE)
| .tbl | a data frame with at least two variables; time column formatted as date, date/time and value column. |
|---|---|
| x | column containing the values to calculate the sum. |
| tcolumn | formatted time column. |
| unit | character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds" |
| n | numeric, describing the length of the time window. |
| na.rm | logical. Should missing values be removed? |
| na.pad | logical. If 'na.pad = TRUE', incomplete windows (duration of the window < 'n') return 'NA'. Defaults to 'TRUE'. |
dataframe with column for the rolling sum.
tbr_sum(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, na.pad = FALSE)
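The padding of incomplete windows can be sketched in base R. The rule used here is an assumption for illustration: a window counts as incomplete when less than the window length has elapsed since the first observation in the record. The package's exact rule may differ, and the data are made up.

```r
dates  <- as.Date(c("2021-01-01", "2021-01-10", "2021-02-05", "2021-02-20"))
values <- c(1, 2, 3, 4)
width_days <- 30

# Rolling 30-day sum without padding
roll_sum <- vapply(seq_along(dates), function(i) {
  elapsed <- as.numeric(dates[i] - dates)  # days since each observation
  sum(values[elapsed >= 0 & elapsed <= width_days])
}, numeric(1))

# Assumed padding rule: rows within the first 30 days of the record
# do not yet have a full window behind them, so they become NA
since_start <- as.numeric(dates - min(dates))
padded_sum  <- ifelse(since_start < width_days, NA, roll_sum)

data.frame(dates, values, roll_sum, padded_sum)
```

Padding trades the earliest results for comparability: every remaining value summarizes a window of the same duration.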
Provides rolling statistical functions based on date and time windows instead of n-lagged observations.
Michael Schramm
tbrf uses the ggproto class system to extend the functionality of ggplot2. In general, the actual classes should be of little interest to users, because the standard ggplot2 API of building up a plot with geom_* and stat_* functions is encouraged.
https://groups.google.com/forum/?fromgroups=#!topic/ggplot2/9cFWHaH1CPs