% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/orderNorm.R
\name{orderNorm}
\alias{orderNorm}
\alias{predict.orderNorm}
\alias{print.orderNorm}
\title{Calculate and perform Ordered Quantile normalizing transformation}
\usage{
orderNorm(x, warn = TRUE)

\method{predict}{orderNorm}(object, newdata = NULL, inverse = FALSE,
  warn = TRUE, ...)

\method{print}{orderNorm}(x, ...)
}
\arguments{
\item{x}{A vector to normalize}

\item{warn}{if TRUE, a warning is issued when ties are present or when values outside the observed range are transformed}

\item{object}{an object of class 'orderNorm'}

\item{newdata}{a vector of data to be (reverse) transformed}

\item{inverse}{if TRUE, performs reverse transformation}

\item{...}{additional arguments}
}
\value{
A list of class \code{orderNorm} with elements
  
\item{x.t}{transformed original data} 
\item{x}{original data} 
\item{n}{number of nonmissing observations}
\item{ties_status}{indicator of whether ties are present}
\item{fit}{fit to be used for extrapolation, if needed}
\item{norm_stat}{Pearson's P / degrees of freedom}

The \code{predict} function returns the numeric value of the transformation 
performed on new data, and allows for the inverse transformation as well.
}
\description{
The Ordered Quantile (ORQ) normalization transformation,
  \code{orderNorm()}, is a rank-based procedure by which the values of a
  vector are mapped to their percentile, which is then mapped to the same
  percentile of the normal distribution. In the absence of ties, this
  essentially guarantees that the transformed data follow a normal
  distribution.

  The transformation is: \deqn{g(x) = \Phi ^ {-1} (rank(x) / (length(x) +
  1))}

  where \eqn{\Phi} is the standard normal CDF (so \eqn{\Phi^{-1}} is its
  inverse, the normal quantile function), rank(x) is each observation's
  rank, and length(x) is the number of observations.
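
  As a small illustrative case (numbers computed from the formula above,
  not taken from package output): with four distinct observations, the
  ranks 1, 2, 3, 4 map to the percentiles 0.2, 0.4, 0.6, 0.8, so the
  transformed values, in rank order, are approximately
  \deqn{g(x) \approx (-0.842, -0.253, 0.253, 0.842)}{g(x) ~ (-0.842, -0.253, 0.253, 0.842)}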
  
  By itself, this method is not new; the earliest mention of it that I
  could find is in a 1947 paper by Bartlett (see references). The formula
  was outlined explicitly in Van der Waerden (1952) and expounded upon in
  Beasley et al. (2009). However, this version differs in one key respect,
  as explained below.
  
  Using linear interpolation between these percentiles, the ORQ normalization
  becomes a 1-1 transformation that can be applied to new data. However,
  outside of the observed domain of x, it is unclear how to extrapolate the
  transformation. In the ORQ normalization procedure, a binomial glm with a
  logit link is fit to the ranks in order to extrapolate beyond the bounds
  of the original domain of x. The inverse normal CDF is then applied to
  these extrapolated predictions to extend the transformation.
  This mitigates the influence of heavy-tailed distributions while preserving
  the 1-1 nature of the transformation. Extrapolation produces a
  warning unless \code{warn = FALSE}. Nevertheless, we found that the
  extrapolation performs very well even on data as heavy-tailed as a Cauchy
  distribution (paper to be published).
  
  This transformation can be performed on new data and inverted via the
  \code{predict} function.
}
\examples{

x <- rgamma(100, 1, 1)

orderNorm_obj <- orderNorm(x)
orderNorm_obj
p <- predict(orderNorm_obj)
x2 <- predict(orderNorm_obj, newdata = p, inverse = TRUE)

all.equal(x2, x)
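
# A hedged sketch of transforming new data with the object fitted above.
# The values in x_new are illustrative only (not from the package docs);
# the first and last lie outside range(x), so they are extrapolated, and
# warn = FALSE is passed to silence the expected extrapolation warning.
x_new <- c(min(x) / 2, median(x), max(x) * 2)
p_new <- predict(orderNorm_obj, newdata = x_new, warn = FALSE)
p_new

# Apply the inverse transformation and compare against the inputs
x_new2 <- predict(orderNorm_obj, newdata = p_new, inverse = TRUE,
                  warn = FALSE)
cbind(x_new, x_new2)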
}
\references{
Bartlett, M. S. "The Use of Transformations." Biometrics, 
  vol. 3, no. 1, 1947, pp. 39-52. JSTOR 
  www.jstor.org/stable/3001536.

 Van der Waerden BL. Order tests for the two-sample problem and 
   their power. 1952;55:453-458. Ser A. 
   
 Beasley TM, Erickson S, Allison DB. Rank-based inverse normal transformations
   are increasingly used, but are they merited? Behav. Genet. 2009;39(5): 
   580-595. pmid:19526352
}
\seealso{
\code{\link{boxcox}},
 \code{\link{lambert}}, 
 \code{\link{bestNormalize}},
 \code{\link{yeojohnson}}
}
