Title: | An Improved Version of WA-PLS |
Version: | 0.1.3 |
Description: | The goal of this package is to provide an improved version of WA-PLS (Weighted Averaging Partial Least Squares) by including the tolerances of taxa and the frequency of the sampled climate variable. The package also provides a leave-out cross-validation scheme that, for each cycle, removes both the test site and any sites that are both geographically and climatically close to it, to avoid the risk of pseudo-replication. |
License: | GPL-3 |
Encoding: | UTF-8 |
URL: | https://github.com/special-uor/fxTWAPLS/, https://special-uor.github.io/fxTWAPLS/, https://research.reading.ac.uk/palaeoclimate/ |
BugReports: | https://github.com/special-uor/fxTWAPLS/issues/ |
Imports: | doFuture, foreach, future, geosphere, ggplot2, JOPS, MASS, parallel, progressr |
Suggests: | magrittr, progress, scales, tictoc |
Depends: | R (≥ 3.6) |
RoxygenNote: | 7.3.1 |
Language: | en-GB |
NeedsCompilation: | no |
Packaged: | 2024-06-25 11:27:00 UTC; r.villegas-diaz |
Author: | Mengmeng Liu |
Maintainer: | Roberto Villegas-Diaz <r.villegas-diaz@outlook.com> |
Repository: | CRAN |
Date/Publication: | 2024-06-25 11:50:13 UTC |
fxTWAPLS: An Improved Version of WA-PLS
Description
The goal of this package is to provide an improved version of WA-PLS (Weighted Averaging Partial Least Squares) by including the tolerances of taxa and the frequency of the sampled climate variable. The package also provides a leave-out cross-validation scheme that, for each cycle, removes both the test site and any sites that are both geographically and climatically close to it, to avoid the risk of pseudo-replication.
Author(s)
Maintainer: Roberto Villegas-Diaz r.villegas-diaz@outlook.com (ORCID)
Authors:
Mengmeng Liu m.liu18@imperial.ac.uk (ORCID)
Iain Colin Prentice c.prentice@imperial.ac.uk (ORCID)
Cajo J. F. ter Braak cajo.terbraak@wur.nl (ORCID)
Sandy P. Harrison s.p.harrison@reading.ac.uk (ORCID)
Other contributors:
SPECIAL Research Group @ University of Reading [copyright holder]
See Also
Useful links:
Report bugs at https://github.com/special-uor/fxTWAPLS/issues/
TWA-PLS predict function
Description
TWA-PLS predict function
Usage
TWAPLS.predict.w(TWAPLSoutput, fossil_taxa)
Arguments
TWAPLSoutput |
The output of the |
fossil_taxa |
Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon. |
Value
A list of the reconstruction results. Each element in the list is described below:
fit
the fitted values using each number of components.
nPLS
the total number of components extracted.
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Load reconstruction data
Holocene <- read.csv("/path/to/Holocene.csv")
taxaColMin <- which(colnames(Holocene) == "taxa0")
taxaColMax <- which(colnames(Holocene) == "taxaN")
core <- Holocene[, taxaColMin:taxaColMax]
## Train
fit_t_Tmin <- fxTWAPLS::TWAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin <- fxTWAPLS::TWAPLS.w(
taxa,
modern_pollen$Tmin,
nPLS = 5,
usefx = TRUE,
fx_method = "bin",
bin = 0.02
)
fit_t_Tmin2 <- fxTWAPLS::TWAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
taxa,
modern_pollen$Tmin,
nPLS = 5,
usefx = TRUE,
fx_method = "bin",
bin = 0.02
)
## Predict
fossil_t_Tmin <- fxTWAPLS::TWAPLS.predict.w(fit_t_Tmin, core)
fossil_tf_Tmin <- fxTWAPLS::TWAPLS.predict.w(fit_tf_Tmin, core)
fossil_t_Tmin2 <- fxTWAPLS::TWAPLS.predict.w(fit_t_Tmin2, core)
fossil_tf_Tmin2 <- fxTWAPLS::TWAPLS.predict.w(fit_tf_Tmin2, core)
## End(Not run)
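The reconstruction list stores fitted values for each number of components. The following is a minimal sketch of extracting the reconstruction for a chosen number of components. It assumes that `fit` is a matrix with one column per number of components; `recon` is a hand-built, hypothetical stand-in rather than real package output.

```r
# Hypothetical stand-in for the list returned by a predict function:
# 4 fossil samples (rows) x 5 numbers of components (columns).
recon <- list(
  fit = matrix(seq(1, 20) / 2, nrow = 4, ncol = 5),
  nPLS = 5
)

# Number of significant components (normally chosen from cross-validation).
nsig <- 3
stopifnot(nsig <= recon$nPLS)

# Reconstruction obtained with `nsig` components (assumed column layout).
reconstruction <- recon$fit[, nsig]
print(reconstruction)
```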
TWA-PLS training function
Description
TWA-PLS training function, which can perform fx correction. 1/fx^2 correction will be applied at step 7.
Usage
TWAPLS.w(
modern_taxa,
modern_climate,
nPLS = 5,
usefx = FALSE,
fx_method = "bin",
bin = NA
)
Arguments
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
usefx |
Boolean flag on whether or not to use fx correction. |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
Value
A list of the training results, which will be used by the predict function. Each element in the list is described below:
fit
the fitted values using each number of components.
x
the observed modern climate values.
taxon_name
the name of each taxon.
optimum
the updated taxon optimum
comp
each component extracted (will be used in step 7 regression).
u
taxon optimum for each component (step 2).
t
taxon tolerance for each component (step 2).
z
a parameter used in standardization for each component (step 5).
s
a parameter used in standardization for each component (step 5).
orth
a list that stores orthogonalization parameters (step 4).
alpha
a list that stores regression coefficients (step 7).
meanx
mean value of the observed modern climate values.
nPLS
the total number of components extracted.
See Also
fx, TWAPLS.predict.w, and WAPLS.w
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Training
fit_t_Tmin <- fxTWAPLS::TWAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin <- fxTWAPLS::TWAPLS.w(
taxa,
modern_pollen$Tmin,
nPLS = 5,
usefx = TRUE,
fx_method = "bin",
bin = 0.02
)
## End(Not run)
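The description above says that a 1/fx^2 correction is applied at the step 7 regression. One way to picture this is as a weighted least-squares regression of the climate values on an extracted component. The sketch below uses base R only; the simulated data, the bin width of 1, and the single component `comp` are illustrative assumptions, not the package's internal code.

```r
set.seed(1)
n <- 50
# Unevenly sampled climate: many sites near 10, few near 20 (simulated).
climate <- c(rnorm(40, mean = 10, sd = 1), rnorm(10, mean = 20, sd = 1))
# Hypothetical extracted component that is correlated with climate.
comp <- climate + rnorm(n, sd = 0.5)

# Binned frequency of each climate value (bin width 1, chosen for illustration).
breaks <- seq(min(climate), max(climate) + 1, by = 1)
bin_id <- findInterval(climate, breaks)
freq <- tabulate(bin_id, nbins = length(breaks) - 1)[bin_id]

# Ordinary regression versus regression weighted by 1 / fx^2.
fit_unweighted <- lm(climate ~ comp)
fit_weighted <- lm(climate ~ comp, weights = 1 / freq^2)

print(rbind(unweighted = coef(fit_unweighted), weighted = coef(fit_weighted)))
```

Down-weighting the densely sampled part of the climate range lets the sparsely sampled part contribute more to the fitted coefficients.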
TWA-PLS training function v2
Description
TWA-PLS training function, which can perform fx correction. 1/fx correction will be applied at step 2 and step 7.
Usage
TWAPLS.w2(
modern_taxa,
modern_climate,
nPLS = 5,
usefx = FALSE,
fx_method = "bin",
bin = NA
)
Arguments
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
usefx |
Boolean flag on whether or not to use fx correction. |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
Value
A list of the training results, which will be used by the predict function. Each element in the list is described below:
fit
the fitted values using each number of components.
x
the observed modern climate values.
taxon_name
the name of each taxon.
optimum
the updated taxon optimum
comp
each component extracted (will be used in step 7 regression).
u
taxon optimum for each component (step 2).
t
taxon tolerance for each component (step 2).
z
a parameter used in standardization for each component (step 5).
s
a parameter used in standardization for each component (step 5).
orth
a list that stores orthogonalization parameters (step 4).
alpha
a list that stores regression coefficients (step 7).
meanx
mean value of the observed modern climate values.
nPLS
the total number of components extracted.
See Also
fx, TWAPLS.predict.w, and WAPLS.w
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Training
fit_t_Tmin2 <- fxTWAPLS::TWAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
taxa,
modern_pollen$Tmin,
nPLS = 5,
usefx = TRUE,
fx_method = "bin",
bin = 0.02
)
## End(Not run)
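Step 2 estimates a taxon optimum and tolerance. The sketch below uses the standard weighted-average definitions (abundance-weighted mean and standard deviation of the climate values) and shows how a 1/fx weight could enter them. The helper `wa_optimum_tolerance` and the toy data are hypothetical, and the package's exact formulas may differ, for example through additional corrections to the tolerance.

```r
# Abundance-weighted optimum and tolerance per taxon, optionally with
# each site down-weighted by its climate frequency fx (assumed form).
wa_optimum_tolerance <- function(taxa, climate, fx = rep(1, length(climate))) {
  taxa <- as.matrix(taxa)
  w <- taxa / fx                            # row i is divided by fx[i]
  total <- colSums(w)
  optimum <- colSums(w * climate) / total   # weighted mean of climate
  dev2 <- outer(climate, optimum, "-")^2    # sites x taxa squared deviations
  tolerance <- sqrt(colSums(w * dev2) / total)
  list(optimum = optimum, tolerance = tolerance)
}

# Toy data: 3 sites, 2 taxa.
taxa <- cbind(a = c(1, 1, 0), b = c(0, 1, 1))
climate <- c(0, 2, 4)

plain <- wa_optimum_tolerance(taxa, climate)
corrected <- wa_optimum_tolerance(taxa, climate, fx = c(1, 2, 1))

print(plain$optimum)
print(corrected$optimum)
```

With the frequency weights, the middle site (which has the highest frequency in this toy example) pulls the optima less strongly towards its climate value.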
WA-PLS predict function
Description
WA-PLS predict function
Usage
WAPLS.predict.w(WAPLSoutput, fossil_taxa)
Arguments
WAPLSoutput |
The output of the |
fossil_taxa |
Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon. |
Value
A list of the reconstruction results. Each element in the list is described below:
fit
The fitted values using each number of components.
nPLS
The total number of components extracted.
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Load reconstruction data
Holocene <- read.csv("/path/to/Holocene.csv")
taxaColMin <- which(colnames(Holocene) == "taxa0")
taxaColMax <- which(colnames(Holocene) == "taxaN")
core <- Holocene[, taxaColMin:taxaColMax]
## Train
fit_Tmin <- fxTWAPLS::WAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin <- fxTWAPLS::WAPLS.w(
taxa,
modern_pollen$Tmin,
nPLS = 5,
usefx = TRUE,
fx_method = "bin",
bin = 0.02
)
fit_Tmin2 <- fxTWAPLS::WAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin2 <- fxTWAPLS::WAPLS.w2(
taxa,
modern_pollen$Tmin,
nPLS = 5,
usefx = TRUE,
fx_method = "bin",
bin = 0.02
)
## Predict
fossil_Tmin <- fxTWAPLS::WAPLS.predict.w(fit_Tmin, core)
fossil_f_Tmin <- fxTWAPLS::WAPLS.predict.w(fit_f_Tmin, core)
fossil_Tmin2 <- fxTWAPLS::WAPLS.predict.w(fit_Tmin2, core)
fossil_f_Tmin2 <- fxTWAPLS::WAPLS.predict.w(fit_f_Tmin2, core)
## End(Not run)
WA-PLS training function
Description
WA-PLS training function, which can perform fx correction. 1/fx^2 correction will be applied at step 7.
Usage
WAPLS.w(
modern_taxa,
modern_climate,
nPLS = 5,
usefx = FALSE,
fx_method = "bin",
bin = NA
)
Arguments
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
usefx |
Boolean flag on whether or not to use fx correction. |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
Value
A list of the training results, which will be used by the predict function. Each element in the list is described below:
fit
the fitted values using each number of components.
x
the observed modern climate values.
taxon_name
the name of each taxon.
optimum
the updated taxon optimum (u* in the WA-PLS paper).
comp
each component extracted (will be used in step 7 regression).
u
taxon optimum for each component (step 2).
z
a parameter used in standardization for each component (step 5).
s
a parameter used in standardization for each component (step 5).
orth
a list that stores orthogonalization parameters (step 4).
alpha
a list that stores regression coefficients (step 7).
meanx
mean value of the observed modern climate values.
nPLS
the total number of components extracted.
See Also
fx, TWAPLS.w, and WAPLS.predict.w
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Training
fit_Tmin <- fxTWAPLS::WAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin <- fxTWAPLS::WAPLS.w(
taxa,
modern_pollen$Tmin,
nPLS = 5,
usefx = TRUE,
fx_method = "bin",
bin = 0.02
)
## End(Not run)
WA-PLS training function v2
Description
WA-PLS training function, which can perform fx correction. 1/fx correction will be applied at step 2 and step 7.
Usage
WAPLS.w2(
modern_taxa,
modern_climate,
nPLS = 5,
usefx = FALSE,
fx_method = "bin",
bin = NA
)
Arguments
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
usefx |
Boolean flag on whether or not to use fx correction. |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
Value
A list of the training results, which will be used by the predict function. Each element in the list is described below:
fit
the fitted values using each number of components.
x
the observed modern climate values.
taxon_name
the name of each taxon.
optimum
the updated taxon optimum (u* in the WA-PLS paper).
comp
each component extracted (will be used in step 7 regression).
u
taxon optimum for each component (step 2).
z
a parameter used in standardization for each component (step 5).
s
a parameter used in standardization for each component (step 5).
orth
a list that stores orthogonalization parameters (step 4).
alpha
a list that stores regression coefficients (step 7).
meanx
mean value of the observed modern climate values.
nPLS
the total number of components extracted.
See Also
fx, TWAPLS.w, and WAPLS.predict.w
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Training
fit_Tmin2 <- fxTWAPLS::WAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5)
fit_f_Tmin2 <- fxTWAPLS::WAPLS.w2(
taxa,
modern_pollen$Tmin,
nPLS = 5,
usefx = TRUE,
fx_method = "bin",
bin = 0.02
)
## End(Not run)
Pseudo-removed leave-out cross-validation
Description
Pseudo-removed leave-out cross-validation
Usage
cv.pr.w(
modern_taxa,
modern_climate,
nPLS = 5,
trainfun,
predictfun,
pseudo,
usefx = FALSE,
fx_method = "bin",
bin = NA,
cpus = 4,
test_mode = TRUE,
test_it = 5
)
Arguments
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
trainfun |
Training function you want to use, either
|
predictfun |
Predict function you want to use: if |
pseudo |
The geographically and climatically close sites to each test
site, obtained from |
usefx |
Boolean flag on whether or not to use fx correction. |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
cpus |
Number of CPUs for simultaneous iterations to execute, check
|
test_mode |
Boolean flag to execute the function with a limited number
of iterations, |
test_it |
Number of iterations to use in the test mode. |
Value
Leave-one-out cross validation results.
See Also
fx, TWAPLS.w, TWAPLS.predict.w, WAPLS.w, and WAPLS.predict.w
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
point <- modern_pollen[, c("Long", "Lat")]
test_mode <- TRUE # It should be set to FALSE before running
dist <- fxTWAPLS::get_distance(
point,
cpus = 2, # Remove the following line
test_mode = test_mode
)
pseudo_Tmin <- fxTWAPLS::get_pseudo(
dist,
modern_pollen$Tmin,
cpus = 2, # Remove the following line
test_mode = test_mode
)
cv_pr_tf_Tmin2 <- fxTWAPLS::cv.pr.w(
taxa,
modern_pollen$Tmin,
nPLS = 5,
fxTWAPLS::TWAPLS.w2,
fxTWAPLS::TWAPLS.predict.w,
pseudo_Tmin,
usefx = TRUE,
fx_method = "bin",
bin = 0.02,
cpus = 2, # Remove the following line
test_mode = test_mode
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
cv_pr_tf_Tmin2 <- fxTWAPLS::cv.pr.w(
taxa,
modern_pollen$Tmin,
nPLS = 5,
fxTWAPLS::TWAPLS.w2,
fxTWAPLS::TWAPLS.predict.w,
pseudo_Tmin,
usefx = TRUE,
fx_method = "bin",
bin = 0.02,
cpus = 2, # Remove the following line
test_mode = test_mode
) %>%
fxTWAPLS::pb()
## End(Not run)
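The idea of pseudo-removed cross-validation can be sketched in base R: for each test site, the model is trained without that site and without the sites listed as close to it, and the held-out site is then predicted. The toy train and predict functions below are simple weighted-averaging stand-ins for `trainfun` and `predictfun`, and all names and data are hypothetical; this is not the package's implementation.

```r
# Toy weighted-averaging train/predict pair (hypothetical stand-ins).
toy_train <- function(taxa, climate) colSums(taxa * climate) / colSums(taxa)
toy_predict <- function(optima, taxa) {
  as.numeric(taxa %*% optima / rowSums(taxa))
}

# For each site i, drop i and its "pseudo" neighbours before training.
cv_pseudo_removed <- function(taxa, climate, trainfun, predictfun, pseudo) {
  predicted <- numeric(length(climate))
  for (i in seq_along(climate)) {
    drop <- unique(c(i, pseudo[[i]]))
    model <- trainfun(taxa[-drop, , drop = FALSE], climate[-drop])
    predicted[i] <- predictfun(model, taxa[i, , drop = FALSE])
  }
  cbind(observed = climate, predicted = predicted)
}

# Toy data: 5 sites, 2 taxa.
taxa <- matrix(c(8, 6, 4, 2, 1,
                 2, 4, 6, 8, 9), ncol = 2)
climate <- c(1, 2, 3, 4, 5)
# Sites regarded as close to each test site (illustrative).
pseudo <- list(2L, 1L, integer(0), 5L, 4L)

cv <- cv_pseudo_removed(taxa, climate, toy_train, toy_predict, pseudo)
print(cv)
```

Passing a list of empty vectors as `pseudo` reduces this loop to ordinary leave-one-out cross-validation.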
Leave-one-out cross-validation
Description
Leave-one-out cross-validation, as in rioja (https://cran.r-project.org/package=rioja).
Usage
cv.w(
modern_taxa,
modern_climate,
nPLS = 5,
trainfun,
predictfun,
usefx = FALSE,
fx_method = "bin",
bin = NA,
cpus = 4,
test_mode = FALSE,
test_it = 5
)
Arguments
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
trainfun |
Training function you want to use, either
|
predictfun |
Predict function you want to use: if |
usefx |
Boolean flag on whether or not to use fx correction. |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
cpus |
Number of CPUs for simultaneous iterations to execute, check
|
test_mode |
boolean flag to execute the function with a limited number
of iterations, |
test_it |
number of iterations to use in the test mode. |
Value
Leave-one-out cross-validation results.
See Also
fx, TWAPLS.w, TWAPLS.predict.w, WAPLS.w, and WAPLS.predict.w
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
## LOOCV
test_mode <- TRUE # It should be set to FALSE before running
cv_tf_Tmin2 <- fxTWAPLS::cv.w(
taxa,
modern_pollen$Tmin,
nPLS = 5,
fxTWAPLS::TWAPLS.w2,
fxTWAPLS::TWAPLS.predict.w,
usefx = TRUE,
fx_method = "bin",
bin = 0.02,
cpus = 2, # Remove the following line
test_mode = test_mode
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
cv_tf_Tmin2 <- fxTWAPLS::cv.w(
taxa,
modern_pollen$Tmin,
nPLS = 5,
fxTWAPLS::TWAPLS.w2,
fxTWAPLS::TWAPLS.predict.w,
usefx = TRUE,
fx_method = "bin",
bin = 0.02,
cpus = 2, # Remove the following line
test_mode = test_mode
) %>% fxTWAPLS::pb()
## End(Not run)
Get frequency of the climate value
Description
Function to get the frequency of the climate value, which will be used to
provide fx correction for WA-PLS and TWA-PLS.
Usage
fx(x, bin, show_plot = FALSE)
Arguments
x |
Numeric vector with the modern climate values. |
bin |
Binwidth to get the frequency of the modern climate values. |
show_plot |
Boolean flag to show a plot of |
Value
Numeric vector with the frequency of the modern climate values.
See Also
cv.w, cv.pr.w, and sse.sample
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Get the frequency of each climate variable fx
fx_Tmin <- fxTWAPLS::fx(modern_pollen$Tmin, bin = 0.02, show_plot = TRUE)
fx_gdd <- fxTWAPLS::fx(modern_pollen$gdd, bin = 20, show_plot = TRUE)
fx_alpha <- fxTWAPLS::fx(modern_pollen$alpha, bin = 0.002, show_plot = TRUE)
## End(Not run)
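To make the notion of a binned frequency concrete, the sketch below counts how many climate values fall into each bin and assigns that count back to every value. The helper `bin_frequency` is hypothetical and its bin edges start at the minimum value, which is an assumption; the package may place its bins differently.

```r
# For each climate value, return the number of values sharing its bin.
bin_frequency <- function(x, bin) {
  breaks <- seq(min(x), max(x) + bin, by = bin)   # last edge lies above max(x)
  idx <- findInterval(x, breaks)                  # bin index of each value
  counts <- tabulate(idx, nbins = length(breaks) - 1)
  counts[idx]
}

x <- c(1, 2, 3, 11, 12, 25)
freq <- bin_frequency(x, bin = 10)
print(freq)
```

Values in densely sampled parts of the climate range receive a high frequency, so weights such as 1/fx or 1/fx^2 reduce their influence.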
Get frequency of the climate value with p-spline smoothing
Description
Function to get the frequency of the climate value, which will be used to
provide fx correction for WA-PLS and TWA-PLS.
Usage
fx_pspline(x, bin, show_plot = FALSE)
Arguments
x |
Numeric vector with the modern climate values. |
bin |
Binwidth to get the frequency of the modern climate values; the resulting curve is then p-spline smoothed. |
show_plot |
Boolean flag to show a plot of |
Value
Numeric vector with the frequency of the modern climate values.
See Also
cv.w, cv.pr.w, and sse.sample
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Get the frequency of each climate variable fx
fx_pspline_Tmin <- fxTWAPLS::fx_pspline(
modern_pollen$Tmin,
bin = 0.02,
show_plot = TRUE
)
fx_pspline_gdd <- fxTWAPLS::fx_pspline(
modern_pollen$gdd,
bin = 20,
show_plot = TRUE
)
fx_pspline_alpha <- fxTWAPLS::fx_pspline(
modern_pollen$alpha,
bin = 0.002,
show_plot = TRUE
)
## End(Not run)
Get the distance between points
Description
Get the distance between points; the output will be used in get_pseudo.
Usage
get_distance(point, cpus = 4, test_mode = FALSE, test_it = 5)
Arguments
point |
Each row represents a sampling site, the first column is longitude and the second column is latitude, both in decimal format. |
cpus |
Number of CPUs for simultaneous iterations to execute, check
|
test_mode |
Boolean flag to execute the function with a limited number
of iterations, |
test_it |
Number of iterations to use in the test mode. |
Value
Distance matrix; the i-th row gives the distances between the i-th
sampling site and all of the sampling sites.
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
point <- modern_pollen[, c("Long", "Lat")]
test_mode <- TRUE # It should be set to FALSE before running
dist <- fxTWAPLS::get_distance(
point,
cpus = 2, # Remove the following line
test_mode = test_mode
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
dist <- fxTWAPLS::get_distance(
point,
cpus = 2, # Remove the following line
test_mode = test_mode
) %>%
fxTWAPLS::pb()
## End(Not run)
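A site-by-site distance matrix of this shape can be sketched with the haversine great-circle formula in base R. The helpers below are hypothetical, assume a spherical Earth with a radius of 6371 km, and return kilometres; the package's own method and units are not stated in this section and may differ.

```r
# Great-circle distance in km between points given in decimal degrees.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Row i holds the distances from site i to every site.
distance_matrix <- function(point) {
  n <- nrow(point)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    d[i, ] <- haversine_km(point[i, 1], point[i, 2], point[, 1], point[, 2])
  }
  d
}

# First column longitude, second column latitude (toy coordinates).
point <- data.frame(Long = c(0, 90, 0), Lat = c(0, 0, 0))
d <- distance_matrix(point)
print(round(d, 1))
```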
Get geographically and climatically close sites
Description
Get the sites that are both geographically and climatically close to the
test site, which could result in pseudo-replication and inflate the
cross-validation statistics. The output will be used in cv.pr.w.
Usage
get_pseudo(dist, x, cpus = 4, test_mode = FALSE, test_it = 5)
Arguments
dist |
Distance matrix which contains the distance from other sites. |
x |
The modern climate values. |
cpus |
Number of CPUs for simultaneous iterations to execute, check
|
test_mode |
Boolean flag to execute the function with a limited number
of iterations, |
test_it |
Number of iterations to use in the test mode. |
Value
The geographically and climatically close sites to each test site.
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
point <- modern_pollen[, c("Long", "Lat")]
test_mode <- TRUE # It should be set to FALSE before running
dist <- fxTWAPLS::get_distance(
point,
cpus = 2, # Remove the following line
test_mode = test_mode
)
pseudo_Tmin <- fxTWAPLS::get_pseudo(
dist,
modern_pollen$Tmin,
cpus = 2, # Remove the following line
test_mode = test_mode
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
pseudo_Tmin <- fxTWAPLS::get_pseudo(
dist,
modern_pollen$Tmin,
cpus = 2, # Remove the following line
test_mode = test_mode
) %>%
fxTWAPLS::pb()
## End(Not run)
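The selection of close sites can be sketched as a pair of thresholds applied to each row of the distance matrix and to the climate differences. The helper `find_close_sites` and both threshold values are hypothetical and chosen only for illustration; the thresholds used by the package are not given in this section.

```r
# For each site k, return the other sites that are within `max_dist` of k
# AND whose climate value differs from x[k] by at most `max_clim_diff`.
find_close_sites <- function(dist, x, max_dist, max_clim_diff) {
  lapply(seq_along(x), function(k) {
    close <- dist[k, ] <= max_dist & abs(x - x[k]) <= max_clim_diff
    setdiff(which(close), k)   # exclude the test site itself
  })
}

# Toy distance matrix (km) for 4 sites and their climate values.
dist <- matrix(c(
    0,  10, 500,  20,
   10,   0, 495,  25,
  500, 495,   0, 480,
   20,  25, 480,   0
), nrow = 4, byrow = TRUE)
x <- c(10, 10.1, 10.05, 15)

pseudo <- find_close_sites(dist, x, max_dist = 50, max_clim_diff = 0.2)
print(pseudo)
```

Site 3 is climatically similar to sites 1 and 2 but far away, and site 4 is nearby but climatically different, so neither is flagged: both conditions must hold.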
Show progress bar
Description
Show progress bar
Usage
pb(expr, ...)
Arguments
expr |
R expression. |
... |
Arguments passed on to
|
Value
Return data from the function called.
Plot the residuals
Description
Plot the residuals: the black line is the zero line, and the red line is the locally estimated scatterplot smoothing (LOESS) curve, which shows the degree of local compression.
Usage
plot_residuals(train_output, col)
Arguments
train_output |
Training output, can be the output of WA-PLS, WA-PLS with
|
col |
Choose which column of the fitted values to plot, in other words, how many components you want to use. |
Value
Plotting status.
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
taxa,
modern_pollen$Tmin,
nPLS = 5,
usefx = TRUE,
fx_method = "bin",
bin = 0.02
)
nsig <- 3 # This should be obtained from the random t-test of the cross-validation results
fxTWAPLS::plot_residuals(fit_tf_Tmin2, nsig)
## End(Not run)
Plot the training results
Description
Plot the training results: the black line is the 1:1 line, and the red line is
the linear regression line between fitted and x, which shows the degree
of overall compression.
Usage
plot_train(train_output, col)
Arguments
train_output |
Training output, can be the output of WA-PLS, WA-PLS with
|
col |
Choose which column of the fitted values to plot, in other words, how many components you want to use. |
Value
Plotting status.
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2(
taxa,
modern_pollen$Tmin,
nPLS = 5,
usefx = TRUE,
fx_method = "bin",
bin = 0.02
)
nsig <- 3 # This should be obtained from the random t-test of the cross-validation results
fxTWAPLS::plot_train(fit_tf_Tmin2, nsig)
## End(Not run)
Random t-test
Description
Perform a random t-test on the cross-validation results.
Usage
rand.t.test.w(cvoutput, n.perm = 999)
Arguments
cvoutput |
|
n.perm |
The number of permutations used to obtain the p-value, which assesses whether using the current number of components is significantly different from using one less. |
Value
A matrix of the statistics of the cross-validation results. Each component is described below:
R2
the coefficient of determination (the larger, the better the fit).
Avg.Bias
average bias.
Max.Bias
maximum bias.
Min.Bias
minimum bias.
RMSEP
root-mean-square error of prediction (the smaller, the better the fit).
delta.RMSEP
the percentage change in RMSEP when using the current number of components compared with using one component less.
p
assesses whether using the current number of components is significantly different from using one component less, which is used to choose the last significant number of components to avoid over-fitting.
The degree of overall compression is assessed by a linear regression between the cross-validation results and the observed climate values:
Compre.b0
the intercept.
Compre.b1
the slope (the closer to 1, the less the overall compression).
Compre.b0.se
the standard error of the intercept.
Compre.b1.se
the standard error of the slope.
Examples
## Not run:
## Random t-test
rand_pr_tf_Tmin2 <- fxTWAPLS::rand.t.test.w(cv_pr_tf_Tmin2, n.perm = 999)
# note: choose the last significant number of components based on the p-value,
# see details at Liu Mengmeng, Prentice Iain Colin, ter Braak Cajo J. F.,
# Harrison Sandy P.. 2020 An improved statistical approach for reconstructing
# past climates from biotic assemblages. Proc. R. Soc. A. 476: 20200346.
# <https://doi.org/10.1098/rspa.2020.0346>
## End(Not run)
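Several of the statistics listed above can be illustrated with base R on a pair of observed and cross-validated values. The helper `cv_summary` is hypothetical, and regressing the predictions on the observations for the compression terms is an assumption about the direction of the regression; the permutation-based p-value is not reproduced here.

```r
# Summary statistics for one set of cross-validated predictions.
cv_summary <- function(observed, predicted) {
  residual <- predicted - observed
  fit <- lm(predicted ~ observed)   # assumed direction of the regression
  c(
    R2 = cor(observed, predicted)^2,
    Avg.Bias = mean(residual),
    RMSEP = sqrt(mean(residual^2)),
    Compre.b0 = unname(coef(fit)[1]),
    Compre.b1 = unname(coef(fit)[2])
  )
}

# Toy example: predictions compressed towards the centre of the range.
observed <- c(1, 2, 3, 4, 5)
predicted <- 1.5 + 0.5 * observed
stats <- cv_summary(observed, predicted)
print(round(stats, 3))
```

A slope well below 1, as in this toy example, indicates that low values are overestimated and high values underestimated.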
Calculate Sample Specific Errors
Description
Calculate Sample Specific Errors
Usage
sse.sample(
modern_taxa,
modern_climate,
fossil_taxa,
trainfun,
predictfun,
nboot,
nPLS,
nsig,
usefx = FALSE,
fx_method = "bin",
bin = NA,
cpus = 4,
seed = NULL,
test_mode = FALSE,
test_it = 5
)
Arguments
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site |
fossil_taxa |
Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon. |
trainfun |
Training function you want to use, either
|
predictfun |
Predict function you want to use: if |
nboot |
The number of bootstrap cycles you want to use. |
nPLS |
The number of components to be extracted. |
nsig |
The significant number of components to use to reconstruct past climates, this can be obtained from the cross-validation results. |
usefx |
Boolean flag on whether or not to use fx correction. |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
cpus |
Number of CPUs for simultaneous iterations to execute, check
|
seed |
Seed for reproducibility. |
test_mode |
Boolean flag to execute the function with a limited number
of iterations, |
test_it |
Number of iterations to use in the test mode. |
Value
The bootstrapped standard error for each site.
See Also
fx, TWAPLS.w, TWAPLS.predict.w, WAPLS.w, and WAPLS.predict.w
Examples
## Not run:
# Load modern pollen data
modern_pollen <- read.csv("/path/to/modern_pollen.csv")
# Extract taxa
taxaColMin <- which(colnames(modern_pollen) == "taxa0")
taxaColMax <- which(colnames(modern_pollen) == "taxaN")
taxa <- modern_pollen[, taxaColMin:taxaColMax]
# Load reconstruction data
Holocene <- read.csv("/path/to/Holocene.csv")
taxaColMin <- which(colnames(Holocene) == "taxa0")
taxaColMax <- which(colnames(Holocene) == "taxaN")
core <- Holocene[, taxaColMin:taxaColMax]
## SSE
nboot <- 5 # Recommended 1000
nsig <- 3 # This should be obtained from the random t-test of the cross-validation results
sse_tf_Tmin2 <- fxTWAPLS::sse.sample(
modern_taxa = taxa,
modern_climate = modern_pollen$Tmin,
fossil_taxa = core,
trainfun = fxTWAPLS::TWAPLS.w2,
predictfun = fxTWAPLS::TWAPLS.predict.w,
nboot = nboot,
nPLS = 5,
nsig = nsig,
usefx = TRUE,
fx_method = "bin",
bin = 0.02,
cpus = 2,
seed = 1
)
# Run with progress bar
`%>%` <- magrittr::`%>%`
sse_tf_Tmin2 <- fxTWAPLS::sse.sample(
modern_taxa = taxa,
modern_climate = modern_pollen$Tmin,
fossil_taxa = core,
trainfun = fxTWAPLS::TWAPLS.w2,
predictfun = fxTWAPLS::TWAPLS.predict.w,
nboot = nboot,
nPLS = 5,
nsig = nsig,
usefx = TRUE,
fx_method = "bin",
bin = 0.02,
cpus = 2,
seed = 1
) %>% fxTWAPLS::pb()
## End(Not run)
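A bootstrapped standard error per fossil sample can be sketched as follows: resample the modern sites with replacement, retrain, predict the fossil samples, and take the spread of the predictions across bootstrap cycles. The toy train and predict functions are weighted-averaging stand-ins, all names and data are hypothetical, and using `sd()` as the spread measure is an assumption rather than the package's exact estimator.

```r
# Toy weighted-averaging train/predict pair (hypothetical stand-ins).
toy_train <- function(taxa, climate) colSums(taxa * climate) / colSums(taxa)
toy_predict <- function(optima, taxa) {
  as.numeric(taxa %*% optima / rowSums(taxa))
}

bootstrap_se <- function(taxa, climate, fossil, trainfun, predictfun,
                         nboot, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- length(climate)
  preds <- vapply(seq_len(nboot), function(b) {
    idx <- sample.int(n, n, replace = TRUE)   # resample modern sites
    model <- trainfun(taxa[idx, , drop = FALSE], climate[idx])
    predictfun(model, fossil)
  }, numeric(nrow(fossil)))
  preds <- matrix(preds, nrow = nrow(fossil)) # fossil samples x nboot
  apply(preds, 1, sd)
}

# Toy data: 6 modern sites, 2 taxa (all abundances positive), 2 fossil samples.
taxa <- matrix(c(8, 6, 4, 2, 1, 3,
                 2, 4, 6, 8, 9, 7), ncol = 2)
climate <- c(1, 2, 3, 4, 5, 4)
fossil <- matrix(c(5, 1,
                   5, 9), ncol = 2)

se <- bootstrap_se(taxa, climate, fossil, toy_train, toy_predict,
                   nboot = 20, seed = 1)
print(se)
```

Fixing the seed makes the resampling, and therefore the standard errors, reproducible between runs.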