\name{data_generation}
\alias{data_generation}
\title{
Data generation function for examples
}
\description{
This function generates several types of data with different correlation structures. The generated data can be used in the examples of other functions, such as those for functional canonical correlation analysis and functional least angle regression.
}
\usage{
data_generation(seed,nsamples=80,hyper=NULL,var_type=c('f','m'),
                cor_type=1:6,uncorr=TRUE,nVar=8)
}
\arguments{
  \item{seed}{
Seed for the random number generator.
}
  \item{nsamples}{
Sample size of the data to generate.
}
  \item{hyper}{
Hyperparameters of the Gaussian process (GP). The GP is used to build the covariance structure of the functional variables.
}
  \item{var_type}{
Type of variables to generate, either \code{'f'} or \code{'m'}. See Details for more information.
}
  \item{cor_type}{
Correlation structure. See Details for more information.
}
  \item{uncorr}{
Whether the variables are built from linearly uncorrelated base variables. See Details for more information.
}
  \item{nVar}{
Number of base variables to generate. Note that this is not the exact number of variables generated at the end.
}
}

\details{
\code{var_type} can be either \code{'f'} or \code{'m'}. If \code{var_type='f'}, only functional variables are generated. If \code{var_type='m'}, both functional and scalar variables are generated.

When \code{uncorr} is \code{TRUE}, a few linearly uncorrelated variables are generated first, which gives better control over the correlation structure of the variables through \code{cor_type}. If you want to generate a large number of variables, \code{uncorr} should be \code{FALSE}.

\code{cor_type} is a number from 1 to 6 or from 1 to 4, depending on the choice of \code{var_type}. It is ONLY useful with the default number of variables, i.e., \code{nVar=8}, and when the initial variables are linearly uncorrelated, i.e., \code{uncorr=TRUE}. Larger values of \code{cor_type} give more complicated correlation structures.

If no correlation restriction is required for the variables, use \code{cor_type=1}.

\code{nVar} is the number of base variables generated. Users who need a different data set are recommended to modify the function. Alternatively, the function can be called repeatedly to obtain enough functional and scalar variables; the response variable can then be re-generated by the user. Increasing the value of this argument may give \code{NaN} for the response variable.
}

\value{
\item{x}{List of covariates.}
\item{y}{Response variable.}
\item{BetaT}{True shape of the functional coefficients and true values of the scalar variables.}
\item{bConst}{Normalizing constants of the functional coefficients. True functional coefficients are the shape times the corresponding normalizing constant.}
\item{noise}{Random noise.}
\item{mu}{True intercept.}
}

\examples{
library(flars)
dataL=data_generation(seed = 1,uncorr = TRUE,nVar = 8,nsamples = 120,
      var_type = 'f',cor_type = 1)
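
# A hedged sketch of a next step: inspect the top-level components
# listed in the Value section (x, y, BetaT, bConst, noise, mu).
names(dataL)
str(dataL, max.level = 1)

# Assuming the same call pattern applies to mixed data: with
# var_type = 'm' the covariates should include scalar variables as well.
# cor_type = 3 is an illustrative choice that lies in both documented ranges.
dataM=data_generation(seed = 1,uncorr = TRUE,nVar = 8,nsamples = 120,
      var_type = 'm',cor_type = 3)
length(dataM$x)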
}

