% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/IC_clean_data.R
\name{IC_clean_data}
\alias{IC_clean_data}
\title{Clean a data frame before it is used with IC_threshold_matrix}
\usage{
IC_clean_data(
  data = stop("A dataframe object is required"),
  use = c("pairwise.complete.obs", "everything", "all.obs", "complete.obs",
    "na.or.complete"),
  method = c("pearson", "kendall", "spearman"),
  variable.retain = NULL,
  test.partial.correlation = TRUE,
  progress = TRUE,
  debug = FALSE
)
}
\arguments{
\item{data}{The data.frame to be cleaned}

\item{use}{an optional character string giving a method for computing covariances in the presence of missing values. This must be (an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs".}

\item{method}{a character string indicating which correlation coefficient (or covariance) is to be computed. One of "pearson" (default), "kendall", or "spearman": can be abbreviated.}

\item{variable.retain}{A character vector with the names of the columns to keep}

\item{test.partial.correlation}{Should the partial correlations be tested?}

\item{progress}{Show a progress bar}

\item{debug}{If TRUE, information about the progress of the cleaning is shown}
}
\value{
A dataframe
}
\description{
This function must be used if missing values are present in the dataset.\cr
It ensures that all correlations and partial correlations can be calculated. 
Columns of the data frame are removed one by one until all correlations can be calculated without error. 
One or more columns can be marked to be retained (see \code{variable.retain}) because they are of particular importance in the analysis. 
The \code{use} and \code{method} parameters are passed to the \code{cor()} function. By default, the function uses parallel computing on Unix and macOS systems. 
If \code{progress} is TRUE and the package pbmcapply is installed, a progress bar is displayed. If \code{debug} is TRUE, some information is shown during the process.
\url{https://fr.wikipedia.org/wiki/Iconographie_des_corrélations}
}
\details{
IC_clean_data checks and corrects the dataframe to be used with IC_threshold_matrix
}
\examples{
\dontrun{
library("HelpersMG")
# based on https://fr.wikipedia.org/wiki/Iconographie_des_corrélations
es <- structure(list(Student = c("e1", "e2", "e3", "e4", "e5", "e6", "e7", "e8"), 
                     Mass = c(52, 59, 55, 58, 66, 62, 63, 69), 
                     Age = c(12, 12.5, 13, 14.5, 15.5, 16, 17, 18), 
                     Assiduity = c(12, 9, 15, 5, 11, 15, 12, 9), 
                     Note = c(5, 5, 9, 5, 13.5, 18, 18, 18)), 
                     row.names = c(NA, -8L), class = "data.frame")
es

# Clean the data frame so that all correlations can be calculated
df_clean <- IC_clean_data(es, debug = TRUE)
# Matrix computed without a threshold
cor_matrix <- IC_threshold_matrix(data = df_clean, threshold = NULL, progress = FALSE)
# Same analysis with a threshold of 0.3
cor_threshold <- IC_threshold_matrix(data = df_clean, threshold = 0.3)
plot(cor_threshold, show.legend.strength = FALSE, show.legend.direction = FALSE)
# Simplify the result for the variable "Note"
cor_threshold_Note <- IC_correlation_simplify(matrix = cor_threshold, variable = "Note")
plot(cor_threshold_Note, show.legend.strength = FALSE, show.legend.direction = FALSE)

# Same analysis with a threshold of 0.6 and a user-defined layout
cor_threshold <- IC_threshold_matrix(data = df_clean, threshold = 0.6)
plot(cor_threshold, 
     layout = matrix(data = c(53, 53, 55, 55, 
                              55, 53, 55, 53), ncol = 2, byrow = FALSE), 
     show.legend.direction = FALSE,
     show.legend.strength = FALSE, xlim = c(-2, 2), ylim = c(-2, 2))
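
# A minimal sketch of the situation that IC_clean_data() is meant to handle.
# The data frame es_na below is hypothetical (it is not part of the original
# example): Age and Assiduity are never observed on the same row, so their
# correlation cannot be computed even with use = "pairwise.complete.obs".
es_na <- data.frame(Mass = c(52, 59, 55, 58, 66, 62, 63, 69), 
                    Age = c(12, 12.5, 13, 14.5, NA, NA, NA, NA), 
                    Assiduity = c(NA, NA, NA, NA, 11, 15, 12, 9), 
                    Note = c(5, 5, 9, 5, 13.5, 18, 18, 18))
cor_na <- cor(es_na, use = "pairwise.complete.obs")
# TRUE: there is no complete pair of observations for these two columns
is.na(cor_na["Age", "Assiduity"])
# Cleaning should remove columns until all correlations can be calculated.
# "Note" is asked to be retained; which other column is dropped is decided
# by the function and is not assumed here.
if (requireNamespace("HelpersMG", quietly = TRUE)) {
  df_clean_na <- HelpersMG::IC_clean_data(es_na, variable.retain = "Note", 
                                          progress = FALSE)
  colnames(df_clean_na)
}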
}
}
\references{
Lesty, M., 1999. Une nouvelle approche dans le choix des régresseurs de la régression multiple en présence d’interactions et de colinéarités. Revue de Modulad 22, 41-77.
}
\seealso{
Other Iconography of correlations: 
\code{\link{IC_correlation_simplify}()},
\code{\link{IC_threshold_matrix}()},
\code{\link{plot.IconoCorel}()}
}
\author{
Marc Girondot
}
\concept{Iconography of correlations}
