% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/famdcontour.R
\name{famdcontour}
\alias{famdcontour}
\title{Contour plots and FAMD function for classification modeling}
\usage{
famdcontour(dataf=dataf,listconti,listclass,vardep,proba="",
title="",title2="",depcol="",listacol="",alpha1=0.7,alpha2=0.7,alpha3=0.7,
classvar=1,intergrid=0,selec=0,modelo="glm",nodos=3,maxit=200,decay=0.01,
sampsize=400,mtry=2,nodesize=10,ntree=400,ntreegbm=500,shrink=0.01,
bag.fraction=1,n.minobsinnode=10,C=100,gamma=10,Dime1="Dim.1",Dime2="Dim.2")
}
\arguments{
\item{dataf}{data frame.}

\item{listconti}{Interval variables to use, in format c("var1","var2",...).}

\item{listclass}{Class variables to use, in format c("var1","var2",...).}

\item{vardep}{Dependent binary classification variable.}

\item{proba}{vector of probability predictions obtained externally (optional)}

\item{title}{plot main title}

\item{title2}{plot subtitle}

\item{depcol}{vector of two colors for points}

\item{listacol}{vector of colors for labels}

\item{alpha1}{alpha transparency for majority class}

\item{alpha2}{alpha transparency for minority class}

\item{alpha3}{alpha transparency for fit probability plots}

\item{classvar}{1 if dependent variable categories are plotted as supplementary}

\item{intergrid}{scale of grid for contour: 0 if automatic}

\item{selec}{1 if stepwise logistic variable selection is required, 0 if not.}

\item{modelo}{name of model: "glm","gbm","rf","nnet","svm".}

\item{nodos}{nnet: nodes}

\item{maxit}{nnet: iterations}

\item{decay}{nnet: decay}

\item{sampsize}{rf: sampsize}

\item{mtry}{rf: mtry}

\item{nodesize}{rf: nodesize}

\item{ntree}{rf: ntree}

\item{ntreegbm}{gbm: ntree}

\item{shrink}{gbm: shrink}

\item{bag.fraction}{gbm: bag.fraction}

\item{n.minobsinnode}{gbm: n.minobsinnode}

\item{C}{svm Radial: C}

\item{gamma}{svm Radial: gamma}

\item{Dime1, Dime2}{FAMD Dimensions to consider. Dim.1 and Dim.2 by default.}
}
\value{
A list with the following objects:\describe{
\item{graph1}{plot of points on FAMD first two dimensions}
\item{graph2}{plot of points and contour curves}
\item{graph3}{plot of points and variables}
\item{graph4}{plot of points, variables and contour curves}
\item{graph5}{plot of points colored by fitted probability}
\item{graph6}{plot of points colored by absolute difference}
\item{df1}{data frame used for graph1}
\item{df2}{data frame used for contour curves}
\item{df3}{data frame used for variable names}
\item{listconti}{interval variables used-selected}
\item{listclass}{class variables used-selected}
}
}
\description{
This function produces visual graphics by means of FAMD.
FAMD stands for Factorial Analysis for Mixed Data (interval and categorical).
The dependent classification variable is set as a supplementary variable.
Machine learning algorithm predictions are presented in a filled contour setting.
}
\details{
The FAMD algorithm from the FactoMineR package is used to compute point coordinates on the chosen dimensions
(Dim.1 and Dim.2 by default).
The minority category of the dependent variable is represented in red, the majority category in green.
The color scheme can be altered using depcol and listacol, as well as the alpha transparency values.
\subsection{Predictive modeling}{

For predictive modeling, selec=1 selects variables with a simple stepwise logistic regression.
By default selec=0.
Logistic regression is used by default. Basic parameter setting is supported for algorithms nnet, rf, gbm and svm-RBF.
A vector of fitted probabilities obtained externally from other
algorithms can be passed in parameter proba=nameofvector. Contour curves are then
computed based on this vector.
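
As an illustration of the external-probabilities route, the following is a minimal sketch, not a
prescribed workflow. It assumes that the fitted probabilities are aligned row by row with dataf;
the object names fit and extproba are hypothetical. Which category of the dependent variable the
probabilities should refer to is not stated here, so that should be checked before interpreting the plots.

\preformatted{
data(breastwisconsin1)
dataf <- breastwisconsin1
listconti <- c("clump_thickness", "uniformity_of_cell_shape", "mitosis")
listclass <- c("")
vardep <- "classes"

# Fit any model outside famdcontour; a plain logistic regression is used
# here only as a stand-in for an externally trained algorithm.
fit <- glm(factor(classes) ~ clump_thickness + uniformity_of_cell_shape + mitosis,
           data = dataf, family = binomial)

# One fitted probability per row of dataf (assumed ordering: same as dataf).
extproba <- predict(fit, type = "response")

# Contour curves are then computed from the supplied vector.
result <- famdcontour(dataf = dataf, listconti, listclass, vardep, proba = extproba)
}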
}

\subsection{Contour curves}{

Contour curves are built by the following process: i) the chosen algorithm model is trained and all
observations are predicted-fitted; ii) a grid of points on the two chosen FAMD dimensions is built;
iii) package MBA is used to interpolate probability estimates over the grid, based on
the previously fitted observations.
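
The grid and interpolation steps can be sketched roughly as follows. This is an illustration on
synthetic data, not the internal code of famdcontour: the objects coords and surf are hypothetical,
the 50 x 50 grid size is an arbitrary choice, and the exact MBA calls and settings used internally may differ.

\preformatted{
library(MBA)
set.seed(1)

# Synthetic stand-ins for FAMD coordinates and fitted probabilities.
coords <- data.frame(Dim.1 = rnorm(200), Dim.2 = rnorm(200))
coords$prob <- plogis(coords$Dim.1 - coords$Dim.2)

# Interpolate the fitted values over a regular 50 x 50 grid spanning
# the two dimensions (multilevel B-spline approximation).
surf <- mba.surf(as.matrix(coords), no.X = 50, no.Y = 50, extend = TRUE)$xyz.est

# surf$x and surf$y hold the grid axes, surf$z the interpolated surface.
filled.contour(surf$x, surf$y, surf$z, xlab = "Dim.1", ylab = "Dim.2")
}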
}

\subsection{Variable representation}{

In order to represent interval variables, categories of class variables, and points in the same plot,
a proportional projection of interval variables coordinates over the two dimensions range is applied.
Since the space of input variables usually has more than two dimensions, points sometimes overlap;
a frequency variable is used, and alpha values may be adjusted to avoid misinterpreting
the presence of each dependent variable category/color.
}

\subsection{Troubleshooting}{
\itemize{
\item Check for missing values; they are not allowed.
\item By default selec=0. Setting selec=1 may sometimes imply that no variables are selected; an error message is shown in this case.
\item Models with only two input variables could lead to plot generation problems.
\item Be sure that variables named in listconti are all numeric.
\item If some numeric variable is constant at one single value, the process is stopped, since numeric min-max standardization is performed
and NaN values are generated.
}
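
The checks above can be run before calling the function. The following is a minimal sketch using
base R only; it assumes dataf, listconti and vardep are defined as in the examples, and the
variable x is purely illustrative.

\preformatted{
# Missing values in the variables to be used: should be FALSE.
anyNA(dataf[, c(listconti, vardep), drop = FALSE])

# Every variable in listconti must be numeric: should all be TRUE.
sapply(dataf[, listconti, drop = FALSE], is.numeric)

# No variable in listconti may be constant: should all be TRUE.
sapply(dataf[, listconti, drop = FALSE], function(x) length(unique(x)) > 1)

# Why constants fail: min-max scaling divides by max(x) - min(x),
# which is zero for a constant variable, giving 0/0 = NaN.
x <- c(3, 3, 3)
(x - min(x)) / (max(x) - min(x))
}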
}
}
\examples{
data(breastwisconsin1)
dataf<-breastwisconsin1
listconti=c( "clump_thickness","uniformity_of_cell_shape","mitosis")
listclass=c("")
vardep="classes"
result<-famdcontour(dataf=dataf,listconti,listclass,vardep)
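# A further sketch, not run: the same call with a random forest instead of
# the default logistic regression, using arguments listed in the usage section.
# The name result2 is illustrative; graph2 is the points-and-contour plot
# described in the value section.
\dontrun{
result2<-famdcontour(dataf=dataf,listconti,listclass,vardep,
modelo="rf",mtry=2,nodesize=10,ntree=400)
result2$graph2
}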
}
\references{
Pages J. (2004). Analyse factorielle de donnees mixtes. Revue Statistique Appliquee. LII (4). pp. 93-111.
}
\keyword{FAMD}
\keyword{classification}
\keyword{contour_curves}
