Title: | Conditional Copula Model for Crop Yield Forecasting |
Version: | 1.0.0 |
Description: | Provides functions to model and forecast crop yields using a spatial temporal conditional copula approach. The package incorporates extreme weather covariates and Bayesian Structural Time Series models to analyze crop yield dependencies across multiple regions. Includes tools for fitting, simulating, and visualizing results. This method build upon established R packages, including 'Hofert' 'et' 'al'. (2025) <doi:10.32614/CRAN.package.copula>, 'Scott' (2024) <doi:10.32614/CRAN.package.bsts>, and 'Stephenson' 'et' 'al'. (2024) <doi:10.32614/CRAN.package.evd>. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Imports: | bsts, copula, evd, ggplot2, grDevices, rootSolve, stats, utils |
Depends: | R (≥ 4.0.0) |
LazyData: | true |
LazyDataCompression: | xz |
Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0), |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2025-03-26 05:39:16 UTC; liyongkun |
Author: | Marie Michaelides [aut], Mélina Mailhot [aut], Yongkun Li [aut, cre] |
Maintainer: | Yongkun Li <yongkun.li@concordia.ca> |
Repository: | CRAN |
Date/Publication: | 2025-03-27 17:30:04 UTC |
Compute Gumbel Copula Parameter from Kendall's Tau
Description
Computes the Gumbel-Hougaard copula dependence parameter based on Kendall's tau.
Usage
GH.theta(tau)
Arguments
tau |
Numeric, Kendall's tau correlation coefficient. |
Value
Numeric, estimated Gumbel copula parameter.
Compute Clayton Copula Parameter from Kendall's Tau
Description
Computes the Clayton copula dependence parameter based on Kendall's tau.
Usage
clayton.theta(tau)
Arguments
tau |
Numeric, Kendall's tau correlation coefficient. |
Value
Numeric, estimated Clayton copula parameter.
Supported copula types
Description
A list containing supported copula types.
Usage
copula_list
Format
A list of copula types.
- copulas
"Gaussian" "Clayton" "Frank" "Gumbel" "Joe"
Data of the article "Probabilistic Crop Yields Forecasts With Spatio-Temporal Conditional Copula Using Extreme Weather Covariates"
Description
Contains crop yields and climate indices data of 24 CD regions in Ontario from 1950 to 2022
Usage
cropyields_covariates
Format
A data frame with 1752 rows and 38 variables:
- time
chr: year from 1950-2022
- CAR_CODE
num: 1-4
- CAR
chr: Southern, Western, Central, Eastern Ontario
- CD_CODE
num
- CD
chr: 24 subregions
- ID
chr
- lat
num: latitude
- lon
num: longitude
- yield
num: wheat crop yield per census division, in bushel/acre
- cdd
num: Annual maximum number of consecutive days with daily precipitation below 1mm (unit = days)
- cddcold_18
num: Annual cooling degree days above 18C (unit = degree_days)
- dlyfrzthw_tx0_tn
num: Annual number of days with a diurnal freeze-thaw cycle : tmax > 0 degc and tmin <= -1 degc
- first_fall_frost
num: First day of year with temperature below 0 degc for at least 1 days
- frost_days
num: Annual number of days with minimum daily temperature below 0C
- ice_days
num: Annual number of days with maximum daily temperature below 0 degC
- nr_cdd
num: The annual number of dry periods of 6 days and more, during which the maximal precipitation on a window of 6 days is under 1.0 mm
- prcptot
num: Annual total precipitation (unit = mm)
- r1mm
num: Annual number of days with daily precipitation over 1.0 mm/day
- r10mm
num: Annual number of days with daily precipitation over 10.0 mm/day
- r20mm
num: Annual number of days with daily precipitation over 20.0 mm/day
- rx1day
num: Annual maximum 1-day total precipitation (unit = mm)
- rx5day
num: Annual maximum 5-day total precipitation (unit = mm)
- tg_mean
num: Annual mean of daily mean temperatures (unit = C degrees)
- tn_mean
num: Annual mean of daily minimum temperatures (unit = C degrees)
- tn_min
num: Annual minimum of daily minimum temperatures (unit = C degrees)
- tnlt_-15
num: Annual number of days where daily minimum temperature is below -15 degC
- tnlt_-25
num: Annual number of days where daily minimum temperature is below -25 degC
- tr_18
num: Annual number of tropical nights : defined as days with minimum daily temperature above 18 degc
- tr_20
num: Annual number of tropical nights : defined as days with minimum daily temperature above 20 degc
- tr_22
num: Annual number of tropical nights : defined as days with minimum daily temperature above 22 degc
- tx_max
num: Annual minimum of daily maximum temperature (unit = C degrees)
- tx_mean
num: Annual mean of daily maximum temperature (unit = C degrees)
- txgt_25
num: Annual number of days where daily maximum temperature exceeds 25 degC
- txgt_27
num: Annual number of days where daily maximum temperature exceeds 27 degC
- txgt_29
num: Annual number of days where daily maximum temperature exceeds 29 degC
- txgt_30
num: Annual number of days where daily maximum temperature exceeds 30 degC
- txgt_32
num: Annual number of days where daily maximum temperature exceeds 32 degC
Source
ClimateData.ca
Compute Dynamic Gaussian Copula Correlation Parameter (rho)
Description
Computes the time-varying correlation parameter (rho) for a Gaussian copula.
Usage
dynamic.rho(params, lagged_rho, X_t)
Arguments
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_rho |
Numeric, the previous rho value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Value
Numeric, estimated dynamic Gaussian copula correlation.
Compute Dynamic Clayton Copula Parameter
Description
Computes the Clayton copula parameter dynamically based on lagged values and covariates.
Usage
dynamic.theta.clayton(params, lagged_theta, X_t)
Arguments
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Value
Numeric, estimated dynamic Clayton copula parameter.
Compute Dynamic Frank Copula Parameter
Description
Computes the Frank copula parameter dynamically based on lagged values and covariates.
Usage
dynamic.theta.frank(params, lagged_theta, X_t)
Arguments
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Value
Numeric, estimated dynamic Frank copula parameter.
Compute Dynamic Gumbel Copula Parameter
Description
Computes the Gumbel copula parameter dynamically based on lagged values and covariates.
Usage
dynamic.theta.gumbel(params, lagged_theta, X_t)
Arguments
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Value
Numeric, estimated dynamic Gumbel copula parameter.
Compute Dynamic Joe Copula Parameter
Description
Computes the Joe copula parameter dynamically based on lagged values and covariates.
Usage
dynamic.theta.joe(params, lagged_theta, X_t)
Arguments
params |
Numeric vector of parameters: omega, alpha, and gamma coefficients. |
lagged_theta |
Numeric, the previous theta value. |
X_t |
Numeric vector or matrix of covariates at time t. |
Value
Numeric, estimated dynamic Joe copula parameter.
Fit a Bayesian Structural Time Series (BSTS) Model
Description
Fits a BSTS model for a time series y, given a vector or matrix of covariates z.
Usage
fit_bsts(y, z, lags = 0, MCMC.iter = 5000)
Arguments
y |
A numeric vector (time series response variable). |
z |
A numeric vector or matrix (covariates). |
lags |
Integer, number of lags for the autoregressive component. |
MCMC.iter |
Integer, number of MCMC iterations. |
Value
A fitted BSTS model.
Compute Frank Copula Parameter from Kendall's Tau
Description
Computes the Frank copula dependence parameter based on Kendall's tau.
Usage
frank.theta(tau)
Arguments
tau |
Numeric, Kendall's tau correlation coefficient. |
Value
Numeric, estimated Frank copula parameter.
Initial Parameters for 2D Pseudo-Loglikelihood Estimation
Description
Initial Parameters for 2D Pseudo-Loglikelihood Estimation
Usage
init_params_full
Format
A numeric vector of length (2+M+4*D*M)
where:
- omega
Baseline autoregressive coefficient.
- alpha
Parameter controlling variance.
- gamma1, gamma2, gamma3
Coefficients related to external factors.
- phi_gev
AR(1) coefficient for GEV.
- sigma_mu
Std dev of innovations for AR(1) process for GEV.
- sigma_gev
GEV scale parameter for GEV.
- xi_gev
GEV shape parameter for GEV.
Initial Parameters for 2D Pseudo-Loglikelihood-Generalized Estimation
Description
Initial Parameters for 2D Pseudo-Loglikelihood-Generalized Estimation
Usage
init_params_full_G
Format
A numeric vector of length (2+M+4*D*M)
, structured as follows:
- omega
Baseline autoregressive coefficient.
- alpha
Parameter controlling variance.
- gamma1, gamma2, gamma3
Coefficients related to external factors.
- Climate variable parameters
For each climate variable in each region, the following parameters are included:
-
mean(z)
,sd(z)
,sd(z)
,xi_gev
for each region and variable.
Initial Parameters for 2D Pseudo-Loglikelihood Estimation without GEV models for covariates
Description
Initial Parameters for 2D Pseudo-Loglikelihood Estimation without GEV models for covariates
Usage
init_params_noGEV
Format
A numeric vector of length (2+M)
where:
- omega
Baseline autoregressive coefficient.
- alpha
Parameter controlling variance.
- gamma1, gamma2, gamma3
Coefficients related to external factors.
Compute Joe Copula Parameter from Kendall's Tau
Description
Computes the Joe copula dependence parameter based on Kendall's tau.
Usage
joe.theta(tau)
Arguments
tau |
Numeric, Kendall's tau correlation coefficient. |
Value
Numeric, estimated Joe copula parameter.
Compute Log-Likelihood for a Generalized Dynamic Copula-GEV Model
Description
Computes the log-likelihood for a time-varying copula model combined with Generalized Extreme Value (GEV) margins.
Usage
log_likelihood_Generalized(params, U, Z, X, copula)
Arguments
params |
Numeric vector of model parameters, including copula parameters (omega, alpha, gamma) and GEV distribution parameters. |
U |
Numeric matrix (n_train x D), pseudo-observations for the copula. |
Z |
Numeric array (n_train x D x M), observed data for each margin and sub-feature. |
X |
Numeric matrix (n_train x M), risk factors for the dynamic copula parameter. |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
Value
Numeric, negative log-likelihood value.
Examples
test_ll <- log_likelihood_Generalized(init_params_full_G,uu,
zz_train,xx_train,"Gaussian")
Generalized Log-Likelihood Function for 2D Copula-GEV Model
Description
Computes the negative log-likelihood of a 2-dimensional copula-GEV model, incorporating dynamic Generalized Extreme Value (GEV) parameters and a time-varying copula structure.
Usage
log_likelihood_generalized_2d(params, u1, u2, X_t, z1, z2, copula)
Arguments
params |
Numeric vector, model parameters including copula and GEV parameters. |
u1 |
Numeric vector (length |
u2 |
Numeric vector (length |
X_t |
Numeric matrix ( |
z1 |
Numeric matrix ( |
z2 |
Numeric matrix ( |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
Value
The negative log-likelihood value for optimization.
Examples
test_ll_2d <-log_likelihood_generalized_2d(init_params_full,
uu[,1],
uu[,2],
xx_train,
zz_train[,1,],
zz_train[,2,],
"Gaussian")
Compute Log-Likelihood for a Generalized Dynamic Copula Model without GEV covariates
Description
Computes the log-likelihood for a time-varying copula model.
Usage
log_likelihood_noGEV(params, U, Z, X, copula)
Arguments
params |
Numeric vector of model parameters, including copula parameters (omega, alpha, gamma). |
U |
Numeric matrix (n_train x D), pseudo-observations for the copula. |
Z |
Numeric array (n_train x D x M), observed data for each margin and sub-feature. |
X |
Numeric matrix (n_train x M), risk factors for the dynamic copula parameter. |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
Value
Numeric, negative log-likelihood value.
Examples
test_ll_noGEV <- log_likelihood_noGEV(init_params_noGEV,uu,
zz_train,x_train,"Gaussian")
list containing Dufferin and Wellington
Description
list containing Dufferin and Wellington
Usage
medoid_names
Format
An object of class list
of length 2.
19
Description
19
Usage
n_test
Format
An object of class integer
of length 1.
54
Description
54
Usage
n_train
Format
An object of class integer
of length 1.
Plot Observed Data and BSTS Forecast
Description
Creates a plot of observed data, forecasted values, and confidence intervals.
Usage
plot_forecast(
forecast,
data_train,
data_test,
time,
quant_high,
quant_low,
observed_col,
forecast_col,
title
)
Arguments
forecast |
A matrix of BSTS forecast samples. |
data_train |
Numeric vector, training data. |
data_test |
Numeric vector, test data. |
time |
Numeric vector, representing time indices. |
quant_high |
Numeric, upper quantile for confidence interval. |
quant_low |
Numeric, lower quantile for confidence interval. |
observed_col |
Character, color for observed data. |
forecast_col |
Character, color for forecasted data. |
title |
Character, title of the plot. |
Value
A ggplot2 object.
Compare Forecasts from Two Models
Description
Generates a time series plot comparing the forecasts from two models along with observed data.
Usage
plot_forecast_compare(
forecast1,
forecast2,
data_train,
data_test,
time,
quant_high,
quant_low,
col1,
title
)
Arguments
forecast1 |
Numeric matrix, forecasted values from the first model (columns: time points). |
forecast2 |
Numeric matrix, forecasted values from the second model (columns: time points). |
data_train |
Numeric vector, training data used for modeling. |
data_test |
Numeric vector, actual test data for evaluation. |
time |
Numeric vector, representing the time points corresponding to the data. |
quant_high |
Numeric, upper quantile (e.g., 0.9) for confidence interval. |
quant_low |
Numeric, lower quantile (e.g., 0.1) for confidence interval. |
col1 |
Character, color for observed data lines. |
title |
Character, title for the plot. |
Value
A ggplot2
object showing the forecast comparison.
Simulate Multivariate Crop Yield Data Using a Generalized Copula-BSTS Model Without GEV Covariates
Description
This function simulates multivariate crop yield data using a time-varying copula combined with Bayesian Structural Time Series (BSTS) models without GEV covariates for comparision.
Usage
simul.fun.noGEV(
nsim = 100,
n_train,
n_test,
copula,
init_params,
fn,
U_train,
Z_train,
Z_test,
X_train,
X_test,
Y_test,
BSTS_list
)
Arguments
nsim |
Integer, number of simulation replications. |
n_train |
Integer, number of training observations. |
n_test |
Integer, number of test observations. |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
init_params |
Numeric vector, initial parameter values for optimization. |
fn |
Function, log-likelihood function for parameter estimation. |
U_train |
Numeric matrix (n_train x D), pseudo-observations for the copula. |
Z_train |
Numeric array (n_train x D x M), observed data for each margin and sub-feature. |
Z_test |
Numeric array (n_test x D x M), observed data for each margin and sub-feature. |
X_train |
Numeric matrix (n_train x M), risk factors for the dynamic copula parameter. |
X_test |
Numeric matrix (n_test x M), risk factors for the dynamic copula parameter. |
Y_test |
Numeric matrix (n_test x D), true future values for MSE calculation. |
BSTS_list |
List of length D, each element is a BSTS model for a different margin. |
Value
A list containing:
optim_results |
Results from the optimization process. |
theta_sim |
Simulated copula parameters across replications. |
Y_sim |
Simulated final BSTS-based forecasts. |
MSE |
Mean squared error for each simulation run. |
A Special Case of simulation_generalized in 2 Dimensions
Description
A Special Case of simulation_generalized in 2 Dimensions
Usage
simul_fun_generalized_2d(
nsim,
n_train,
n_test,
copula,
init_params,
fn,
u1,
u2,
z1_train,
z2_train,
X_t,
y1_test,
y2_test,
BSTS_1,
BSTS_2
)
Arguments
nsim |
Integer, number of simulation replications. |
n_train |
Integer, number of training observations. |
n_test |
Integer, number of test observations. |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
init_params |
Numeric vector, initial parameter values for optimization. |
fn |
Function, log-likelihood function for parameter estimation. |
u1 |
Numeric vector (n_train), first pseudo-observation for the copula. |
u2 |
Numeric vector (n_train), second pseudo-observation for the copula. |
z1_train |
Numeric matrix (n_train x M), observed data for the first margin. |
z2_train |
Numeric matrix (n_train x M), observed data for the second margin. |
X_t |
Numeric matrix (n_train x M), risk factors for the dynamic copula parameter. |
y1_test |
Numeric vector (n_test), true future values for the first response variable. |
y2_test |
Numeric vector (n_test), true future values for the second response variable. |
BSTS_1 |
Fitted BSTS model for the first response variable. |
BSTS_2 |
Fitted BSTS model for the second response variable. |
Value
A list containing:
theta_simulated |
Simulated copula parameters across replications. |
y1_simulated |
Simulated values for the first response variable. |
y2_simulated |
Simulated values for the second response variable. |
MSE |
Mean squared error for each simulation run. |
optim_results |
Results from the optimization process. |
Simulate Multivariate Crop Yield Data Using a Generalized Copula-GEV-BSTS Model
Description
This function simulates multivariate crop yield data using a time-varying copula combined with Generalized Extreme Value (GEV) margins and Bayesian Structural Time Series (BSTS) models.
Usage
simulation_generalized(
nsim = 100,
n_train,
n_test,
copula,
init_params,
fn,
U_train,
Z_train,
X,
Y_test,
BSTS_list
)
Arguments
nsim |
Integer, number of simulation replications. |
n_train |
Integer, number of training observations. |
n_test |
Integer, number of test observations. |
copula |
Character, specifying the copula type: "Clayton", "Frank", "Gumbel", "Joe", or "Gaussian". |
init_params |
Numeric vector, initial parameter values for optimization. |
fn |
Function, log-likelihood function for parameter estimation. |
U_train |
Numeric matrix (n_train x D), pseudo-observations for the copula. |
Z_train |
Numeric array (n_train x D x M), observed data for each margin and sub-feature. |
X |
Numeric matrix (n_train x M), risk factors for the dynamic copula parameter. |
Y_test |
Numeric matrix (n_test x D), true future values for MSE calculation. |
BSTS_list |
List of length D, each element is a BSTS model for a different margin. |
Value
A list containing:
optim_results |
Results from the optimization process. |
theta_sim |
Simulated copula parameters across replications. |
Y_sim |
Simulated final BSTS-based forecasts. |
MSE |
Mean squared error for each simulation run. |
1950-2022
Description
1950-2022
Usage
time_all
Format
An object of class character
of length 73.
2004-2022
Description
2004-2022
Usage
time_test
Format
An object of class character
of length 19.
1950-2003
Description
1950-2003
Usage
time_train
Format
An object of class character
of length 54.
Pseudo-Observations of BSTS Residuals for Crop Yield Forecasting
Description
Pseudo-Observations of BSTS Residuals for Crop Yield Forecasting
Usage
uu
Format
A matrix with dimensions (n_train, D)
:
- n_train
Number of time points used in the training set.
- D
Number of regions analyzed (
Dufferin
,Wellington
).
Source
Derived from residuals of BSTS models fitted to crop yield data.
Maximized Covariates Matrix for Crop Yield Forecasting
Description
Maximized Covariates Matrix for Crop Yield Forecasting
Usage
xx_all
Format
A three-dimensional array with dimensions (n_train+n_test, M)
:
- n_train+n_test
Number of time points used in the training set.
- M
Number of selected climate covariates used for modeling (
cdd
,frost_days
,rx1day
,tg_mean
,txgt_25
).
Source
Derived from historical climate data from ClimateData.ca.
Maximized Covariates Matrix for Crop Yield Forecasting
Description
Maximized Covariates Matrix for Crop Yield Forecasting
Usage
xx_test
Format
A three-dimensional array with dimensions (n_test, M)
:
- n_test
Number of time points used in the testing set.
- M
Number of selected climate covariates used for modeling (
cdd
,frost_days
,rx1day
,tg_mean
,txgt_25
).
Source
Derived from historical climate data from ClimateData.ca.
Maximized Covariates Matrix for Crop Yield Forecasting
Description
Maximized Covariates Matrix for Crop Yield Forecasting
Usage
xx_train
Format
A three-dimensional array with dimensions (n_train, M)
:
- n_test
Number of time points used in the training set.
- M
Number of selected climate covariates used for modeling (
cdd
,frost_days
,rx1day
,tg_mean
,txgt_25
).
Source
Derived from historical climate data from ClimateData.ca.
Crop Yield Data
Description
Crop Yield Data
Usage
yy_all
Format
A matrix with dimensions (n_train+n_test, D)
:
- n_train+n_test
Number of time points used in the test set.
- D
Number of regions analyzed (
Dufferin
,Wellington
).
Source
Historical crop yield records from ClimateData.ca.
Crop Yield Data for Testing in BSTS Models
Description
Crop Yield Data for Testing in BSTS Models
Usage
yy_test
Format
A matrix with dimensions (n_train, D)
:
- n_train
Number of time points used in the test set.
- D
Number of regions analyzed (
Dufferin
,Wellington
).
Source
Historical crop yield records from ClimateData.ca.
Crop Yield Data for Training in BSTS Models
Description
Crop Yield Data for Training in BSTS Models
Usage
yy_train
Format
A matrix with dimensions (n_test, D)
:
- n_test
Number of time points used in the train set.
- D
Number of regions analyzed (
Dufferin
,Wellington
).
Source
Historical crop yield records from ClimateData.ca.
Standardized Covariates Array for Crop Yield Forecasting
Description
Standardized Covariates Array for Crop Yield Forecasting
Usage
zz_all
Format
A three-dimensional array with dimensions (n_train+n_test, D, M)
:
- n_train+n_test
Number of time points used in the training set.
- D
Number of regions analyzed (
Dufferin
,Wellington
).- M
Number of selected climate covariates used for modeling (
cdd
,frost_days
,rx1day
,tg_mean
,txgt_25
).
Source
Derived from historical climate data.
Standardized Covariates Array for Crop Yield Forecasting
Description
Standardized Covariates Array for Crop Yield Forecasting
Usage
zz_test
Format
A three-dimensional array with dimensions (n_test, D, M)
:
- n_test
Number of time points used in the testing set.
- D
Number of regions analyzed (
Dufferin
,Wellington
).- M
Number of selected climate covariates used for modeling (
cdd
,frost_days
,rx1day
,tg_mean
,txgt_25
).
Source
Derived from historical climate data.
Standardized Covariates Array for Crop Yield Forecasting
Description
Standardized Covariates Array for Crop Yield Forecasting
Usage
zz_train
Format
A three-dimensional array with dimensions (n_train, D, M)
:
- n_test
Number of time points used in the training set.
- D
Number of regions analyzed (
Dufferin
,Wellington
).- M
Number of selected climate covariates used for modeling (
cdd
,frost_days
,rx1day
,tg_mean
,txgt_25
).
Source
Derived from historical climate data from ClimateData.ca.