% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/MiNTreesAjust.R
\name{MiNTreesAjust}
\alias{MiNTreesAjust}
\title{Adjust number of trees parameter for fitting generalized boosted regression models}
\usage{
MiNTreesAjust(Matrix, specimens, test.frac = 5, times = 5,
  ntrees = c(100, 500, 1000, 5000, 10000), shrinkage = 0.1,
  intdepth = 2, n.terminal = 10, bag.frac = 0.5)
}
\arguments{
\item{Matrix}{numeric matrix of expression data where each row corresponds to a probe (gene, transcript),
and each column corresponds to a specimen (patient).}

\item{specimens}{factor vector with two levels specifying specimens in the columns of the \code{Matrix}}

\item{test.frac}{integer specifying the fraction of data to use for model testing: 1/\code{test.frac} of the specimens is held out as test data.}

\item{times}{integer specifying number of trials}

\item{ntrees}{vector of integers specifying the total number of decision trees (boosting iterations). This is the tested parameter.}

\item{shrinkage}{numeric specifying the learning rate. Scales the step size in the gradient descent procedure.}

\item{intdepth}{integer specifying the maximum depth of each tree.}

\item{n.terminal}{integer specifying the actual minimum number of observations in the terminal nodes of the trees.}

\item{bag.frac}{the fraction of the training set observations randomly selected to propose the next tree in the expansion.}
}
\value{
list of 2
\cr
\code{train.accuracy} - a data frame of train data classification accuracy
for each \code{ntrees} value in each trial and their median.
\cr
\code{test.accuracy} - a data frame of test data classification accuracy
for each \code{ntrees} value in each trial and their median.
\cr
Also a plot for \code{ntrees} versus Accuracy is produced.
}
\description{
Test the total number of trees parameter for binary classification of microarray data using gradient boosting
over decision trees.
}
\details{
\code{test.frac} defines the fraction of specimens that will be used for model testing. For example, if
\code{test.frac}=5 then 4/5 of the specimens will be used for model fitting (train data) and 1/5 of the specimens
will be used for model testing (test data). Specimens for test and train data are selected at random,
so with \code{times}>1 the train and test data will differ in each trial.
\cr
During boosting, basis functions are added iteratively in a greedy fashion,
so that each additional basis function further reduces the selected loss function.
The Gaussian distribution (squared error) is used.
\code{ntrees}, \code{shrinkage} and \code{intdepth} are parameters for model tuning.
\code{bag.frac} introduces randomness into the model fit.
If \code{bag.frac} < 1, running the same model twice will result in similar but different fits.
The number of specimens in the train sample must be large enough to provide the minimum number of observations in terminal nodes, i.e.
\cr(number of specimens)*(1-1/\code{test.frac})*\code{bag.frac} > \code{n.terminal}.
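\cr
As a worked illustration only, assume a hypothetical data set of 50 specimens and the default values
\code{test.frac} = 5, \code{bag.frac} = 0.5 and \code{n.terminal} = 10. The train sample then holds 40 specimens,
of which 20 are drawn for each tree, which exceeds the required minimum of 10:
\deqn{50 \times (1 - 1/5) \times 0.5 = 20 > 10}{50 * (1 - 1/5) * 0.5 = 20 > 10}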
\cr
See \code{\link{gbm}} for details.
\cr
Use \code{\link{MiIntDepthAjust}} and \code{\link{MiShrinkAjust}} for adjusting other parameters.
\cr
The function is rather time-consuming. If specimens are not equally distributed between the two classified groups,
NA may be produced.
}
\examples{
#get gene expression and specimen data
data("IMexpression");data("IMspecimen")
#sample expression matrix and specimen data for binary classification,
#only "NORM" and "EBV" specimens are left
SampleMatrix<-MiDataSample(IMexpression, IMspecimen$diagnosis,"norm", "ebv")
SampleSpecimen<-MiSpecimenSample(IMspecimen$diagnosis, "norm", "ebv")
#Fitting, low tuning for faster running. Test ntrees
set.seed(1)
ClassRes<-MiNTreesAjust(SampleMatrix, SampleSpecimen, test.frac = 5, times = 3,
                       ntrees = c(10, 20), shrinkage = 1, intdepth = 2)
ClassRes[[1]] # train accuracy
ClassRes[[2]] # test accuracy
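
# A minimal sketch of the random train/test split described in Details.
# This illustrates the idea only and is an assumption, not the internal code
# of MiNTreesAjust. The names n.spec, test.idx and train.idx are hypothetical,
# and 20 is a made-up number of specimens.
n.spec <- 20
test.frac <- 5
# hold out 1/test.frac of the specimens, chosen at random, as test data
test.idx <- sample(n.spec, size = round(n.spec / test.frac))
# the remaining specimens form the train data
train.idx <- setdiff(seq_len(n.spec), test.idx)
length(test.idx)  # 4 specimens for testing
length(train.idx) # 16 specimens for fitting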


}
\seealso{
\code{\link{gbm}}, \code{\link{MiIntDepthAjust}}, \code{\link{MiShrinkAjust}}
}
\author{
Elena N. Filatova
}
