% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/monitor.R
\name{vetiver_compute_metrics}
\alias{vetiver_compute_metrics}
\alias{vetiver_pin_metrics}
\alias{vetiver_plot_metrics}
\title{Aggregate, update, and plot model metrics over time for monitoring}
\usage{
vetiver_compute_metrics(
  data,
  date_var,
  period,
  truth,
  estimate,
  ...,
  metric_set = yardstick::metrics,
  every = 1L,
  origin = NULL,
  before = 0L,
  after = 0L,
  complete = FALSE
)

vetiver_pin_metrics(
  board,
  df_metrics,
  metrics_pin_name,
  .index = .index,
  overwrite = FALSE
)

vetiver_plot_metrics(
  df_metrics,
  .index = .index,
  .estimate = .estimate,
  .metric = .metric,
  .n = .n
)
}
\arguments{
\item{data}{A \code{data.frame} containing the columns specified by \code{truth},
\code{estimate}, and \code{...}.}

\item{date_var}{The column in \code{data} containing dates or date-times for
monitoring, to be aggregated with \code{period}.}

\item{period}{\verb{[character(1)]}

A string defining the period to group by. Valid inputs can be roughly
broken into:
\itemize{
\item \code{"year"}, \code{"quarter"}, \code{"month"}, \code{"week"}, \code{"day"}
\item \code{"hour"}, \code{"minute"}, \code{"second"}, \code{"millisecond"}
\item \code{"yweek"}, \code{"mweek"}
\item \code{"yday"}, \code{"mday"}
}}

\item{truth}{The column identifier for the true results (that
is \code{numeric} or \code{factor}). This should be an unquoted column name,
although this argument is passed by expression and supports
\link[rlang:topic-inject]{quasiquotation} (you can unquote column
names).}

\item{estimate}{The column identifier for the predicted results
(that is also \code{numeric} or \code{factor}). As with \code{truth}, this can be
specified in different ways, but the primary method is to use an
unquoted variable name.}

\item{...}{A set of unquoted column names or one or more
\code{dplyr} selector functions to choose which variables contain the
class probabilities. If \code{truth} is binary, only 1 column should be selected.
Otherwise, there should be as many columns as factor levels of \code{truth}.}

\item{metric_set}{A \code{\link[yardstick:metric_set]{yardstick::metric_set()}} function for computing metrics.
Defaults to \code{\link[yardstick:metrics]{yardstick::metrics()}}.}

\item{every}{\verb{[positive integer(1)]}

The number of periods to group together.

For example, if the period was set to \code{"year"} with an every value of \code{2},
then the years 1970 and 1971 would be placed in the same group.}

\item{origin}{\verb{[Date(1) / POSIXct(1) / POSIXlt(1) / NULL]}

The reference date time value. The default when left as \code{NULL} is the
epoch time of \verb{1970-01-01 00:00:00}, \emph{in the time zone of the index}.

This is generally used to define the anchor time to count from, which is
relevant when the every value is \verb{> 1}.}

\item{before, after}{\verb{[integer(1) / Inf]}

The number of values before or after the current element to
include in the sliding window. Set to \code{Inf} to select all elements
before or after the current element. Negative values are allowed, which
allows you to "look forward" from the current element if used as the
\code{before} value, or "look backwards" if used as \code{after}.}

\item{complete}{\verb{[logical(1)]}

Should the function be evaluated on complete windows only? If \code{FALSE},
the default, then partial computations will be allowed.}

\item{board}{A pin board, created by \code{\link[pins:board_folder]{board_folder()}}, \code{\link[pins:board_rsconnect]{board_rsconnect()}},
\code{\link[pins:board_url]{board_url()}} or another \code{board_} function.}

\item{df_metrics}{A tidy dataframe of metrics over time, such as created by
\code{vetiver_compute_metrics()}.}

\item{metrics_pin_name}{Pin name for where the \emph{metrics} are stored (as
opposed to where the model object is stored with \code{\link[=vetiver_pin_write]{vetiver_pin_write()}}).}

\item{.index}{The variable in \code{df_metrics} containing the aggregated dates
or date-times (from \code{date_var} in \code{data}). Defaults to \code{.index}.}

\item{overwrite}{If \code{FALSE} (the default), error when the new metrics contain
dates that overlap with the existing pin. If \code{TRUE}, overwrite any metrics for
dates that exist both in the existing pin and new metrics with the \emph{new}
values.}

\item{.estimate}{The variable in \code{df_metrics} containing the metric estimate.
Defaults to \code{.estimate}.}

\item{.metric}{The variable in \code{df_metrics} containing the metric type.
Defaults to \code{.metric}.}

\item{.n}{The variable in \code{df_metrics} containing the number of observations
used for estimating the metric. Defaults to \code{.n}.}
}
\value{
Both \code{vetiver_compute_metrics()} and \code{vetiver_pin_metrics()} return
a dataframe of metrics. The \code{vetiver_plot_metrics()} function returns a
\code{ggplot2} object.
}
\description{
These three functions can be used for model monitoring (such as in a
monitoring dashboard):
\itemize{
\item \code{vetiver_compute_metrics()} computes metrics (such as accuracy for a
classification model or RMSE for a regression model) at a chosen time
aggregation \code{period}
\item \code{vetiver_pin_metrics()} updates an existing pin storing model metrics
over time
\item \code{vetiver_plot_metrics()} creates a plot of metrics over time
}
}
\details{
Sometimes when you monitor a model at a given time aggregation, you
may end up with dates in your new metrics (like \code{new_metrics} in the example)
that are the same as dates in your existing aggregated metrics (like
\code{original_metrics} in the example). This can happen if you need to re-run a
monitoring report because something failed. With \code{overwrite = FALSE} (the
default), \code{vetiver_pin_metrics()} will error when there are overlapping
dates. With \code{overwrite = TRUE}, \code{vetiver_pin_metrics()} will replace such
metrics with the new values. You probably want \code{FALSE} for interactive use
and \code{TRUE} for dashboards or reports that run on a schedule.

For arguments used more than once in your monitoring dashboard,
such as \code{date_var}, consider using
\href{https://bookdown.org/yihui/rmarkdown/parameterized-reports.html}{R Markdown parameters}
to reduce repetition and the risk of errors.
}
\examples{
\dontshow{if (rlang::is_installed(c("dplyr", "parsnip", "modeldata", "ggplot2"))) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
library(dplyr)
library(parsnip)
data(Chicago, package = "modeldata")
Chicago <- Chicago \%>\% select(ridership, date, all_of(stations))
training_data <- Chicago \%>\% filter(date < "2009-01-01")
testing_data <- Chicago \%>\% filter(date >= "2009-01-01", date < "2011-01-01")
monitoring <- Chicago \%>\% filter(date >= "2011-01-01", date < "2012-12-31")
lm_fit <- linear_reg() \%>\% fit(ridership ~ ., data = training_data)

library(pins)
b <- board_temp()

## before starting monitoring, initiate the metrics and pin
## (for example, with the testing data):
original_metrics <-
    augment(lm_fit, new_data = testing_data) \%>\%
    vetiver_compute_metrics(date, "week", ridership, .pred, every = 4L)
pin_write(b, original_metrics, "lm_fit_metrics")

## to continue monitoring with new data, compute metrics and update pin:
new_metrics <-
    augment(lm_fit, new_data = monitoring) \%>\%
    vetiver_compute_metrics(date, "week", ridership, .pred, every = 4L)
vetiver_pin_metrics(b, new_metrics, "lm_fit_metrics")
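
## a minimal sketch of the overlapping-dates case described in Details:
## updating the pin again with the same `new_metrics` (for example, after a
## scheduled report had to be re-run) means every date already exists in
## the pin, so `overwrite = TRUE` is used to replace those rows with the
## new values instead of raising an error
vetiver_pin_metrics(b, new_metrics, "lm_fit_metrics", overwrite = TRUE)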

library(ggplot2)
vetiver_plot_metrics(new_metrics) +
    scale_size(range = c(2, 4))
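
## a sketch of plotting the full monitoring history rather than only the
## newest metrics: read the stored metrics back from the board with
## pins::pin_read() and pass them to the plotting function
## (`all_metrics` is an illustrative name, not part of the package)
all_metrics <- pin_read(b, "lm_fit_metrics")
vetiver_plot_metrics(all_metrics) +
    scale_size(range = c(2, 4))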
\dontshow{\}) # examplesIf}
}
