% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ENG.R
\name{ENG}
\alias{ENG}
\alias{ENG.default}
\alias{ENG.formula}
\title{Editing with Neighbor Graphs}
\usage{
\method{ENG}{formula}(formula, data, ...)

\method{ENG}{default}(x, graph = "RNG", classColumn = ncol(x), ...)
}
\arguments{
\item{formula}{A formula describing the classification variable and the attributes to be used.}

\item{data, x}{Data frame containing the training dataset to be filtered.}

\item{...}{Optional parameters to be passed to other methods.}

\item{graph}{Character string indicating the type of graph to be constructed: either 'GG'
(Gabriel Graph) or 'RNG' (Relative Neighborhood Graph, the default). See 'References' for more details on both graphs.}

\item{classColumn}{Positive integer indicating the column that contains the
class labels (a factor). By default, the last column is used.}
}
\value{
An object of class \code{filter}, which is a list with seven components:
\itemize{
   \item \code{cleanData} is a data frame containing the filtered dataset.
   \item \code{remIdx} is a vector of integers indicating the indexes for
   removed instances (i.e. their row number with respect to the original data frame).
   \item \code{repIdx} is a vector of integers indicating the indexes for
   repaired/relabelled instances (i.e. their row number with respect to the original data frame).
   \item \code{repLab} is a factor containing the new labels for repaired instances.
   \item \code{parameters} is a list containing the argument values.
   \item \code{call} contains the original call to the filter.
   \item \code{extraInf} is a character string with additional
   information not covered by the previous items.
}
}
\description{
Similarity-based filter for removing label noise from a dataset as a
preprocessing step for classification. For more information, see the 'Details' and
'References' sections.
}
\details{
\code{ENG} builds a neighborhood graph, which can be either a \emph{Gabriel Graph (GG)} or a
\emph{Relative Neighborhood Graph (RNG)} [S\'{a}nchez et al., 1997]. An
instance is then considered 'potentially noisy' if most of its neighbors in the graph belong to a different class.
To decide whether such an instance 'X' is removed, let S be the set formed by 'X' together with its neighbors of the same class.
The majority class 'C' among the neighbors of the instances in S is computed, and 'X' is removed if its class is not 'C'.
}
\examples{
# The example is not run because the graph construction is quite time-consuming.
\dontrun{
   data(iris)
   trainData <- iris[c(1:20,51:70,101:120),]
   out <- ENG(Species~Petal.Length + Petal.Width, data = trainData, graph = "RNG")
   print(out)
   identical(out$cleanData,trainData[setdiff(1:nrow(trainData),out$remIdx),])
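   # Illustrative follow-up, assuming the components listed under Value:
   # show the rows flagged as noisy, then repeat the filter with the
   # Gabriel Graph and through the default method (class label in column 3).
   # The object names outGG, sub and outDef are chosen here for illustration.
   trainData[out$remIdx, ]
   outGG <- ENG(Species~Petal.Length + Petal.Width, data = trainData, graph = "GG")
   length(outGG$remIdx)
   sub <- trainData[, c("Petal.Length", "Petal.Width", "Species")]
   outDef <- ENG(sub, graph = "RNG", classColumn = 3)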
}
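
# Illustrative sketch only, not the internal code of this package: a
# brute-force base R check of the Relative Neighborhood Graph condition.
# Two points i and j are treated as RNG neighbors when no third point k
# satisfies max(d(i,k), d(j,k)) < d(i,j), using Euclidean distance.
# The helper name rngNeighbors is hypothetical and used only here.
rngNeighbors <- function(x) {
   d <- as.matrix(dist(x))           # pairwise Euclidean distances
   n <- nrow(d)
   adj <- matrix(FALSE, n, n)        # adjacency matrix of the graph
   for (i in seq_len(n - 1)) {
      for (j in (i + 1):n) {
         others <- setdiff(seq_len(n), c(i, j))
         # The edge is blocked if some other point is closer to both ends
         blocked <- any(pmax(d[i, others], d[j, others]) < d[i, j])
         adj[i, j] <- adj[j, i] <- !blocked
      }
   }
   adj
}
# Three collinear points: the middle one blocks the edge between the ends.
pts <- matrix(c(0, 0,  1, 0,  2, 0), ncol = 2, byrow = TRUE)
rngNeighbors(pts)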
}
\references{
S\'{a}nchez J. S., Pla F., Ferri F. J. (1997): Prototype selection for the nearest
neighbour rule through proximity graphs. \emph{Pattern Recognition Letters}, 18(6), 507-513.
}

