% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/LASSO2plus_XGBtraining.R
\name{LASSO2plus_XGBtraining}
\alias{LASSO2plus_XGBtraining}
\title{XGBoost Modeling after Variable Selection with LASSO2plus}
\usage{
LASSO2plus_XGBtraining(
  data = NULL,
  standardization = FALSE,
  columnWise = TRUE,
  biomks = NULL,
  outcomeType = c("binary", "continuous", "time-to-event"),
  Y = NULL,
  time = NULL,
  event = NULL,
  nrounds = 5,
  nthread = 2,
  gamma = 1,
  max_depth = 3,
  eta = 0.3,
  outfile = "nameWithPath",
  height = 6
)
}
\arguments{
\item{data}{A data matrix or a data frame where samples are in rows, and features/traits are in columns.}

\item{standardization}{A logical variable to indicate if standardization is needed before variable selection. The default is FALSE.}

\item{columnWise}{A logical variable to indicate whether normalization is done column-wise or row-wise. The default is TRUE, which performs column-wise normalization. This is only meaningful when "standardization" is TRUE.}

\item{biomks}{A vector of potential biomarkers for variable selection. They should be a subset of "data" column names.}

\item{outcomeType}{Outcome variable type. There are three choices: "binary" (default), "continuous", and "time-to-event".}

\item{Y}{Outcome variable name when the outcome type is either "binary" or "continuous".}

\item{time}{Time variable name when outcome type is "time-to-event".}

\item{event}{Event variable name when outcome type is "time-to-event".}

\item{nrounds}{Maximum number of boosting iterations.}

\item{nthread}{Number of parallel threads used to run XGBoost.}

\item{gamma}{Minimum loss reduction required to make a further partition on a leaf node of the tree.}

\item{max_depth}{Maximum depth of a tree. Increasing this value will make the model more complex and more likely to overfit.}

\item{eta}{The learning rate for the XGBoost model.}

\item{outfile}{A string giving the output file name, including the path if necessary but without the file type extension.}

\item{height}{An integer to indicate the forest plot height in inches.}
}
\value{
A list is returned: 
\item{XGBoost_model}{An XGBoost model}
\item{XGBoost_model_score}{Model scores for the given training data set.
 For a continuous outcome variable, this is a vector of the estimated continuous values;
 for a binary outcome variable, this is a vector representing the probability or log-odds of the positive class;
 for a time-to-event outcome, this is a vector of risk scores.}
}
\description{
This function performs variable selection using LASSO2plus and then builds an XGBoost model.
}
\details{
The first part of LASSO2plus_XGBtraining involves variable selection with LASSO2plus. The LASSO2plus algorithm begins with variable selection using LASSO2,
which is typically a cross-validation-based LASSO regression. However, when only one or no variable is selected, the cross-validation results are ignored, 
and a minimum of two remaining variables is ensured through full-data lambda simulations. Additionally, it conducts variable selection through 
single-variable regression for each candidate variable. The variables selected from both LASSO2 and single-variable approaches are then combined to 
perform traditional variable selection using stepwise regression. 
The second part of LASSO2plus_XGBtraining involves ignoring the shrunk LASSO coefficients and building an XGBoost model.
It is suitable for three types of outcomes: continuous, binary, and time-to-event.
}
\examples{
# Load in data sets:
data("datlist", package = "csmpv")
tdat = datlist$training

# The function saves files locally. You can define your own temporary directory. 
# If not, tempdir() can be used to get the system's temporary directory.
temp_dir = tempdir()
# As an example, let's define Xvars, which will be used later:
Xvars = c("highIPI", "B.Symptoms", "MYC.IHC", "BCL2.IHC", "CD10.IHC", "BCL6.IHC")
# The function can work with three different outcome types. 
# Here, we use time-to-event as an example:
# tl2xfit = LASSO2plus_XGBtraining(data = tdat, biomks = Xvars,
#                                 outcomeType = "time-to-event",
#                                time = "FFP..Years.", event = "Code.FFP",
#                                outfile = paste0(temp_dir, "/survival_LASSO2plus_XGBoost"))
#
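# If the call above is run, the returned list holds the fitted model and the
# training scores. A minimal sketch of inspecting it, using the element names
# listed in the Value section:
# xgb_model = tl2xfit$XGBoost_model
# head(tl2xfit$XGBoost_model_score)
#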
# You can save the files to any directory you choose.
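#
# A sketch of the same call with the XGBoost tuning arguments set explicitly.
# The values and the output file name below are illustrative assumptions,
# not recommendations:
# tl2xfit2 = LASSO2plus_XGBtraining(data = tdat, biomks = Xvars,
#                                   outcomeType = "time-to-event",
#                                   time = "FFP..Years.", event = "Code.FFP",
#                                   nrounds = 10, max_depth = 2, eta = 0.1,
#                                   outfile = paste0(temp_dir, "/survival_LASSO2plus_XGBoost_tuned"))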

# To delete the temp_dir, use the following:
# unlink(temp_dir, recursive = TRUE)
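
# A minimal sketch of what column-wise versus row-wise standardization means,
# using base R scale() on a toy matrix. This only illustrates the two options
# behind the "columnWise" argument; it is an assumption, not a statement of how
# the function standardizes data internally.
toy = matrix(c(1, 2, 3, 10, 20, 30), nrow = 3)
# Column-wise: each column is centered and scaled
colwise = scale(toy)
# Row-wise: transpose, scale the former rows, then transpose back
rowwise = t(scale(t(toy)))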
}
\references{
Friedman, J., Hastie, T. and Tibshirani, R. (2010) Regularization Paths for Generalized Linear Models via Coordinate Descent, Journal of Statistical Software, Vol. 33(1), 1-22, doi:10.18637/jss.v033.i01.
  Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, doi:10.18637/jss.v039.i05.
  Chen, T. and Guestrin, C. (2016) XGBoost: A Scalable Tree Boosting System, 22nd SIGKDD Conference on Knowledge Discovery and Data Mining, https://arxiv.org/abs/1603.02754.
}
\author{
Aixiang Jiang
}
