\name{tpsPower}
\Rdversion{1.1}
\alias{tpsPower}
\title{
	Simulation-based estimation of power for the two-phase study design
}
\description{
  Monte Carlo based estimation of statistical power for estimators of the components of a logistic regression model, based on balanced two-phase and case-control study designs (Breslow and Chatterjee, 1999; Prentice and Pyke, 1979).
}
\usage{
tpsPower(B=1000, betaTruth, X, N, strata, nII, alpha=0.05,
         threshold=c(-Inf, Inf), digits=NULL, betaNames=NULL,
         monitor=NULL)
}
\arguments{
  \item{B}{
   The number of datasets generated by the simulation.
}
	\item{betaTruth}{
    Vector of true regression coefficients for the logistic regression model.
}
  \item{X}{
    Design matrix for the logistic regression model. The first column should correspond to the intercept. For each exposure, the baseline group should be coded as 0, the first level as 1, and so on. The number of levels cannot exceed 100.
}
  \item{N}{
    A numeric vector providing the sample size for each row of the design matrix, \code{X}.
}
  \item{strata}{
    A numeric vector indicating which columns of the design matrix, \code{X}, are used to form the phase I stratification variable. \code{strata=1} specifies the intercept and is, therefore, equivalent to a case-control study. \code{strata=0} indicates all possible stratified two-phase sampling schemes.
}
  \item{nII}{
    A numeric value indicating the phase II sample size. If a vector is provided, separate simulations are run for each element.
}
  \item{alpha}{
    Type I error rate assumed for the evaluation of coverage probabilities and power.
}
  \item{threshold}{
		An interval that specifies truncation of the Monte Carlo sampling distribution of each estimator.
}
  \item{digits}{
    Integer indicating the precision to be used for the output.
}
  \item{betaNames}{
    An optional character vector of names for the regression coefficients, 
    \code{betaTruth}.
}
\item{monitor}{
	Numeric value indicating how often \code{tpsPower()} reports real-time progress on the simulation, as the \code{B} datasets are generated and evaluated. The default of \code{NULL} indicates no output.
}
}
\details{
A simulation study is performed to estimate power for various estimators
of \code{beta}:
\itemize{
  \item (a) complete data maximum likelihood (CD)
  \item (b) case-control maximum likelihood (CC)
  \item (c) two-phase weighted likelihood (WL)
  \item (d) two-phase pseudo- or profile likelihood (PL)
  \item (e) two-phase maximum likelihood (ML)
}
	The overall simulation approach is the same as that described in \code{\link{tpsSim}}.
	
  In each case, power is estimated as the proportion of simulated datasets for which a hypothesis test of no effect is rejected.
	
	Only balanced designs are considered by \code{tpsPower()}. For unbalanced designs, power estimates can be obtained from \code{\link{tpsSim}}.
}
\value{
  \code{tpsPower()} returns an object of class "tpsPower", a list containing all the input arguments, as well as the following components:
    \item{power}{
    	Estimated power of a Wald-based test, with type I error rate \code{alpha}, of the null hypothesis that the regression coefficient is zero.
    }
    \item{na}{
    	A matrix giving, for each simulation performed, the number of datasets excluded from the power calculations (i.e., set to \code{NA}). A dataset is excluded for one of three reasons: (1) lack of convergence indicated by \code{NA} point estimates returned by \code{\link{glm}} or \code{\link{tps}}, (2) lack of convergence indicated by \code{NA} standard error estimates returned by \code{\link{glm}} or \code{\link{tps}}, (3) exclusion on the basis of the \code{threshold} argument.
    }
}
\note{
	A generic print method provides formatted output of the results.

	The plot function \code{\link{plotTpsPower}} plots the estimated power against sample size for each estimator of a regression coefficient.
}
\references{
  Prentice, R. and Pyke, R. (1979) "Logistic disease incidence models and case-control studies." Biometrika 66:403-411.

	Breslow, N. and Chatterjee, N. (1999) "Design and analysis of two phase studies with binary outcome applied to Wilms tumour prognosis." Applied Statistics 48:457-468.
}
\author{
  Takumi Saegusa, Sebastien Haneuse
}
\seealso{
  \code{\link{plotTpsPower}}.
}
\examples{
## Load the Ohio data
data(Ohio)

## Design matrix, fitted logistic regression model, and coefficient names
XM   <- cbind(Int=1, Ohio[,1:3])
fitM <- glm(cbind(Death, N-Death) ~ factor(Age) + Sex + Race, data=Ohio,
            family=binomial)
betaNamesM <- c("Int", "Age1", "Age2", "Sex", "Race")

## Power for the TPS design where phase I stratification is based on Race.

\dontrun{tpsResult1 <- tpsPower(B=1000, betaTruth=fitM$coef, X=XM, N=Ohio$N, strata=4,
                       nII=seq(from=100, to=1000, by=100),
                       betaNames=betaNamesM, monitor=100)
tpsResult1
}
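
## Illustrative sketch only (not part of the package API): the principle that
## power is estimated as the proportion of simulated datasets in which a Wald
## test of no effect is rejected can be reproduced in base R. The setting here
## is an assumption made for illustration: a single binary exposure, complete
## data, and simple random sampling. The names simPowerSketch, nSim, n and
## beta are hypothetical, and the code makes no claim to mirror the internals
## of tpsPower().
simPowerSketch <- function(nSim=200, n=500, beta=c(-1, 0.5), alpha=0.05) {
  reject <- rep(NA, nSim)
  for (b in 1:nSim) {
    ## Generate one dataset from the assumed logistic model
    x   <- rbinom(n, size=1, prob=0.5)
    y   <- rbinom(n, size=1, prob=plogis(beta[1] + beta[2] * x))
    fit <- glm(y ~ x, family=binomial)
    est <- coef(summary(fit))
    ## Wald test of a zero slope; fits that fail stay NA and are excluded
    if (fit$converged && nrow(est) == 2) {
      z         <- est[2, "Estimate"] / est[2, "Std. Error"]
      reject[b] <- abs(z) > qnorm(1 - alpha/2)
    }
  }
  ## Proportion of retained datasets in which the null is rejected
  mean(reject, na.rm=TRUE)
}
set.seed(1)
simPowerSketch()
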
## Power for the TPS design where phase I stratification is based on Age
##   * consider the setting where the age coefficients are halved from
##     their estimated values
##   * the intercept is modified, accordingly, using the beta0() function
newBetaM      <- fitM$coef
newBetaM[2:3] <- newBetaM[2:3] / 2
newBetaM[1]   <- beta0(betaX=newBetaM[-1], X=XM, N=Ohio$N,
                       rhoY=sum(Ohio$Death)/sum(Ohio$N))
\dontrun{tpsResult2 <- tpsPower(B=1000, betaTruth=newBetaM, X=XM, N=Ohio$N, strata=2,
                       nII=seq(from=100, to=500, by=50),
                       betaNames=betaNamesM, monitor=100)
tpsResult2
}}

