Type: | Package |
Title: | Reduced-Rank Multinomial Logistic Regression for Markov Chains |
Version: | 0.4.0 |
Description: | Fits the reduced-rank multinomial logistic regression model for Markov chains developed by Wang, Abner, Fardo, Schmitt, Jicha, Eldik and Kryscio (2021) <doi:10.1002/sim.8923>. The model combines multinomial logistic regression for Markov chains with reduced-rank regression. It is useful in studies where a multi-state model is assumed and each transition among the states depends on a set of covariates. The key advantage is a reduction in the number of parameters to be estimated. The final coefficients for all covariates and the p-values for the covariates of interest are reported. P-values for the whole coefficient matrix can be calculated by two bootstrap methods. |
License: | GPL-2 |
Encoding: | UTF-8 |
LazyData: | true |
Imports: | nnet |
Depends: | R (≥ 3.5.0) |
RoxygenNote: | 7.1.1 |
Suggests: | rmarkdown, knitr |
NeedsCompilation: | no |
Packaged: | 2021-06-04 19:05:32 UTC; pwa24 |
Author: | Pei Wang [aut, cre], Richard Kryscio [aut] |
Maintainer: | Pei Wang <wangp33@miamioh.edu> |
Repository: | CRAN |
Date/Publication: | 2021-06-07 07:20:07 UTC |
Aupdate
Description
This function is used to update the A matrix
Usage
Aupdate(Dfix, Gamma, Adata, R, p, q, I, iniA, eps, refA)
Arguments
Dfix |
the coefficient matrix for study covariates |
Gamma |
the G matrix value |
Adata |
the dataset |
R |
the rank of reduced rank model |
p |
the number of covariates in the dimension reduction |
q |
the number of study covariates |
I |
a U by U incidence matrix with elements; I(i,j)=1 if state j can be accessed from state i in one step and 0 otherwise |
iniA |
initial value for the iteration |
eps |
the tolerance for convergence, default is 10^-5 |
refA |
a vector of reference categories |
Value
a list of outputs:
NewA: the updated A matrix
loglikeA: the loglikelihood when updating A
Gupdate
Description
This function is used to update the G matrix
Usage
Gupdate(A, Gdata, p, q, I, refG)
Arguments
A |
numeric matrix |
Gdata |
the dataset used to update G |
p |
the number of covariates in the dimension reduction |
q |
the number of study covariates |
I |
a U by U incidence matrix with elements; I(i,j)=1 if state j can be accessed from state i in one step and 0 otherwise |
refG |
a vector of reference categories |
Value
a list of outputs:
NewG: the updated G matrix
loglikeK: the loglikelihood when updating G
sderr: standard errors for the coefficient matrix
Cognitive Dataset
Description
A dataset containing the states and covariates of 649 participants enrolled in the BRAiNS cohort at the University of Kentucky's Alzheimer's Disease Research Center.
Usage
cogdat
Format
A data frame with 6240 rows and 14 columns:
- ID
used to denote the participants; from 1 to 649
- visitno
used to denote the visit number for each participant
- prstate
denote the previous state
- custate
denote the current state
- bagec
baseline age (centered at age 72)
- famhx
family history of dementia
- HBP
self reported high blood pressure
- apoe4
at least one Apolipoprotein-E (APOE) ε4 allele
- smk1
cigarette smoking level (none versus < 10)
- smk2
cigarette smoking level (11-19)
- smk3
cigarette smoking level (>= 20 pack years)
- lowed
low education
- headinj
self reported head injury
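As an illustration of how the `prstate` and `custate` columns pair up, the following minimal sketch tabulates observed one-step transitions. The data frame `toy` is invented for this example and only mimics a few column names of `cogdat`; with the package loaded, `cogdat` could presumably be used in its place.

```r
# Toy data with a few of the cogdat column names (all values are invented)
toy <- data.frame(
  ID      = c(1, 1, 1, 2, 2),
  visitno = c(1, 2, 3, 1, 2),
  prstate = c(1, 1, 2, 1, 3),
  custate = c(1, 2, 2, 3, 3)
)

# Cross-tabulate previous state (rows) against current state (columns)
trans_counts <- table(previous = toy$prstate, current = toy$custate)

# Row-wise proportions give empirical one-step transition frequencies
trans_prop <- prop.table(trans_counts, margin = 1)

print(trans_counts)
print(round(trans_prop, 2))
```

Each row of `trans_prop` sums to 1, so it can be read as a crude estimate of the transition probabilities out of that previous state.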
derivativeB
Description
This function is used to calculate the loglikelihood with a given matrix B=AG
Usage
derivativeB(B, I, zy, refd)
Arguments
B |
a numeric coefficient matrix |
I |
U by U incidence matrix with elements; I(i,j)=1 if state j can be accessed from state i in one step and 0 otherwise |
zy |
the variable values for a given observation |
refd |
a vector of reference categories |
Value
loglikelihood
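The log-likelihood contribution of a single observation in a baseline-category multinomial logit model can be sketched as follows. This is not the package's implementation: the helper `loglik_one`, its argument layout (one row of `B` per non-reference category), and the numbers are assumptions made for illustration.

```r
# Hypothetical helper, not part of the package.
#   B : K x p coefficient matrix (one row per non-reference category)
#   z : length-p covariate vector for one observation
#   y : length-K indicator vector (all zeros means the reference category)
loglik_one <- function(B, z, y) {
  eta <- as.vector(B %*% z)               # linear predictors
  sum(y * eta) - log(1 + sum(exp(eta)))   # log of the category probability
}

# Made-up coefficients and covariates
B <- matrix(c(0.5, -0.2,
              0.1,  0.3), nrow = 2, byrow = TRUE)
z <- c(1, 2)

# Log-likelihood when the second non-reference category is observed
loglik_one(B, z, y = c(0, 1))
```

Exponentiating the value for each possible `y` (including the all-zero reference case) gives probabilities that sum to 1.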
derivatives
Description
This function is used to calculate the derivative values (first and second derivatives for the Newton-Raphson method) and the loglikelihood when updating A
Usage
derivatives(A, Gamma, Dmat, I, zy, refA)
Arguments
A |
matrix with value from previous iteration |
Gamma |
G matrix values |
Dmat |
the coefficient matrix for the fixed variables |
I |
a U by U incidence matrix with elements; I(i,j)=1 if state j can be accessed from state i in one step and 0 otherwise |
zy |
the variable values for a given observation |
refA |
a vector of reference categories |
Value
a list of outputs:
fird: the first derivative value
secd: the second derivative value
loglike: the loglikelihood
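To show how first and second derivatives feed a Newton-Raphson update, here is a minimal one-parameter sketch. It maximizes the log-likelihood of an intercept-only logistic model, not the reduced-rank model itself; the function `newton_raphson` and the counts `s` and `n` are hypothetical.

```r
# Generic Newton-Raphson iteration for a scalar parameter (illustrative only)
newton_raphson <- function(fird, secd, theta0, eps = 1e-5, maxit = 50) {
  theta <- theta0
  for (it in seq_len(maxit)) {
    step  <- fird(theta) / secd(theta)   # first derivative over second derivative
    theta <- theta - step                # Newton-Raphson update
    if (abs(step) < eps) break           # stop once the step is below the tolerance
  }
  list(theta = theta, niter = it)
}

# Intercept-only logistic model with s successes out of n trials (made-up counts):
# loglik(theta) = s * theta - n * log(1 + exp(theta))
s <- 3; n <- 10
fird <- function(th) s - n * plogis(th)
secd <- function(th) -n * plogis(th) * (1 - plogis(th))

fit <- newton_raphson(fird, secd, theta0 = 0)
fit$theta   # should be close to qlogis(s / n)
```

The stopping rule mirrors the role of a convergence tolerance such as `eps`, although the package may define convergence differently.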
expand
Description
This function is used to expand the Y (category) to an indicator vector
Usage
expand(pri, curr, I, refE)
Arguments
pri |
the prior state |
curr |
the current state |
I |
a U by U incidence matrix with elements; I(i,j)=1 if state j can be accessed from state i in one step and 0 otherwise |
refE |
a vector with the reference categories |
Value
ry: an indicator vector
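One plausible reading of this expansion is sketched below. It is an assumption, not the package's code: `expand_sketch` is a hypothetical helper that takes a single reference state `ref` for the prior state's row, whereas `refE` above is a vector of reference categories.

```r
# Hypothetical sketch of turning a (prior, current) pair into an indicator vector
expand_sketch <- function(pri, curr, I, ref) {
  reachable <- which(I[pri, ] == 1)      # states accessible from `pri` in one step
  stopifnot(curr %in% reachable, ref %in% reachable)
  targets <- setdiff(reachable, ref)     # drop the reference category
  as.integer(targets == curr)            # 1 at the observed state, 0 elsewhere
}

# Small made-up incidence matrix with three states
I <- rbind(c(1, 1, 1),
           c(0, 1, 1),
           c(0, 0, 0))

expand_sketch(pri = 1, curr = 3, I = I, ref = 1)  # positions stand for states 2 and 3
expand_sketch(pri = 1, curr = 1, I = I, ref = 1)  # reference observed: all zeros
```

Under this convention the reference category is encoded by an all-zero vector, as in standard baseline-category coding.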
norm
Description
This function is used to normalize a vector to have unit length
Usage
norm(x)
Arguments
x |
a numeric vector |
Value
a normalized vector with length 1
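A minimal stand-in for this kind of normalization is shown below, assuming that "unit length" means a Euclidean (L2) norm of 1. The name `unit_vec` is hypothetical; base R also has a `norm()` function for matrix norms, so the sketch uses a different name to avoid any clash.

```r
# Hypothetical helper: scale a numeric vector to Euclidean length 1
unit_vec <- function(x) {
  len <- sqrt(sum(x^2))
  if (len == 0) stop("cannot normalize a zero vector")
  x / len
}

v <- unit_vec(c(3, 4))
v          # 0.6 0.8
sum(v^2)   # 1
```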
rrmultinom
Description
This function is used to fit the reduced-rank multinomial logistic regression model for Markov chains
Usage
rrmultinom(I, z1 = NULL, z2 = NULL, T, R, eps = 1e-05, ref = NULL)
Arguments
I |
a U by U incidence matrix with elements; U is number of states; I(i,j)=1 if state j can be accessed from state i in one step and 0 otherwise |
z1 |
a n by p matrix with covariates involved in the dimension reduction(DR), n is the number of subjects, p is the number of covariates involved in DR |
z2 |
a n by q matrix with study covariates (not in dimension reduction), q is the number of study covariates |
T |
an M by 3 state matrix; in the Examples, its columns hold the subject ID, the visit index (starting at 0), and the state |
R |
the rank |
eps |
the tolerance for convergence; the default is 10^-5 |
ref |
a vector of reference categories; the default is NULL and if NULL is used, the function will use the first category as the reference category for each row |
Value
a list of outputs:
Alpha: the final A matrix
Gamma: the final G matrix
Beta: the coefficient matrix for variables involved in reduced rank
Dcoe: the coefficient matrix for the fixed variables
Dsderr: the standard error matrix for the fixed variables
Dpval: the p-value matrix for the fixed variables
coemat: the overall coefficient matrix
niter: the number of iterations needed to converge
df: the degrees of freedom
loglik: the final loglikelihood
converge: the convergence status; 0 means the algorithm failed to converge, 1 means it converged, and 2 means the maximum number of iterations was reached
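The parameter saving from a rank constraint can be illustrated with a small sketch. The dimensions, the random factors, and the orientation `A %*% t(G)` are assumptions for illustration; the package's own factorization and its `df` value may be defined differently.

```r
set.seed(1)

# Made-up dimensions: K transition equations, p covariates, rank R
K <- 6; p <- 5; R <- 2

A <- matrix(rnorm(K * R), K, R)   # K x R factor
G <- matrix(rnorm(p * R), p, R)   # p x R factor
Beta <- A %*% t(G)                # K x p coefficient matrix of rank at most R

qr(Beta)$rank                     # numerical rank of the product

# Free parameters: unrestricted K x p matrix versus a rank-R matrix
c(full = K * p, reduced = R * (K + p - R))
```

With these dimensions an unrestricted coefficient matrix has 30 entries, while a rank-2 matrix of the same size has 18 free parameters.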
Examples
# generate the Markov chain
U=7
I1=I2=I3=rep(1,7)
I4=c(0,0,0,1,1,1,1)
I5=I6=I7=rep(0,7)
I=rbind(I1,I2,I3,I4,I5,I6,I7)
# prepare the data
data=cogdat
n=length(unique(data[,1]))
M=nrow(data)+n
Mc=0
z=matrix(0,n,9)
colnames(z)=colnames(data)[5:13]
T=matrix(0,M,3)
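for(i in 1:n){
  # rows belonging to participant i
  subdat=data[which(data[,1]==i),,drop=FALSE]
  # covariates (columns 5 to 13) taken from the participant's first row
  z[i,]=subdat[1,5:13]
  mc=nrow(subdat)
  # fill the ID, the visit index (0 to mc), and the state sequence
  T[(Mc+1):(Mc+mc+1),1]=i
  T[(Mc+1):(Mc+mc+1),2]=0:mc
  # first previous state followed by all current states
  T[(Mc+1):(Mc+mc+1),3]=c(subdat[1,3],subdat[,4])
  Mc=Mc+mc+1
}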
#z1=z[,c(1:3),drop=FALSE]
z2=z[,4,drop=FALSE]
# fit the model with rank 1 (eps = 9 is a very loose tolerance; the default is 1e-05)
rrmultinom(I,z1=NULL,z2,T,1,eps=9,ref=c(1,1,1,4))
sdfun
Description
This function is used to get the standard error matrix from the bootstrap method. It returns the matrices of standard errors and p-values for the coefficient matrix.
Usage
sdfun(I, z1 = NULL, z2 = NULL, T, R, eps = 1e-05, B, tpoint = NULL, ref)
Arguments
I |
a U by U incidence matrix with elements; U is the number of states; I(i,j)=1 if state j can be accessed from state i in one step and 0 otherwise |
z1 |
a n by p matrix with covariates involved in the dimension reduction(DR), n is the number of subjects, p is the number of covariates involved in DR |
z2 |
a n by q matrix with study covariates (not in dimension reduction), q is the number of study covariates |
T |
an M by 3 state matrix; in the Examples, its columns hold the subject ID, the visit index (starting at 0), and the state |
R |
the rank |
eps |
the tolerance for convergence; the default is 10^-5 |
B |
the number of bootstrap replicates |
tpoint |
a two-column matrix with the participants' visit timeline information |
ref |
a vector of reference categories |
Value
a list of outputs:
coe: the coefficient matrix of the original data
sd: the standard error matrix
pvalue: the p-value matrix
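The general idea of bootstrap standard errors and p-values can be sketched on a much simpler model. The example below is not what `sdfun` does internally: it resamples rows of simulated data, refits a linear regression with `lm`, and assumes a two-sided normal (Wald-type) p-value. All data and names are invented for illustration.

```r
set.seed(42)

# Simulated data, for illustration only
n <- 200
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)
dat <- data.frame(x, y)

# Statistic of interest: the slope of y on x
est_fun <- function(d) coef(lm(y ~ x, data = d))[["x"]]

B <- 200                                     # number of bootstrap replicates
boot_est <- replicate(B, {
  idx <- sample.int(n, n, replace = TRUE)    # resample rows with replacement
  est_fun(dat[idx, ])
})

coe    <- est_fun(dat)                       # estimate on the original data
sd_hat <- sd(boot_est)                       # bootstrap standard error
pvalue <- 2 * pnorm(-abs(coe / sd_hat))      # two-sided normal p-value (an assumption)

c(coe = coe, sd = sd_hat, pvalue = pvalue)
```

For longitudinal data such as repeated visits, resampling would more naturally be done at the participant level rather than row by row, so that each participant's visits stay together.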
Examples
# generate the Markov chain
U=7
I1=I2=I3=rep(1,7)
I4=c(0,0,0,1,1,1,1)
I5=I6=I7=rep(0,7)
I=rbind(I1,I2,I3,I4,I5,I6,I7)
# prepare the data
data=cogdat
n=length(unique(data[,1]))
M=nrow(data)+n
Mc=0
z=matrix(0,n,9)
colnames(z)=colnames(data)[5:13]
T=matrix(0,M,3)
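for(i in 1:n){
  # rows belonging to participant i
  subdat=data[which(data[,1]==i),,drop=FALSE]
  # covariates (columns 5 to 13) taken from the participant's first row
  z[i,]=subdat[1,5:13]
  mc=nrow(subdat)
  # fill the ID, the visit index (0 to mc), and the state sequence
  T[(Mc+1):(Mc+mc+1),1]=i
  T[(Mc+1):(Mc+mc+1),2]=0:mc
  # first previous state followed by all current states
  T[(Mc+1):(Mc+mc+1),3]=c(subdat[1,3],subdat[,4])
  Mc=Mc+mc+1
}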
#z1=z[,c(1:3),drop=FALSE]
z2=z[,4,drop=FALSE]
# find the standard error matrix for the model with rank 1, using B = 2 bootstrap replicates
sdfun(I,z1=NULL,z2,T,1,eps = 9,2,ref=c(1,1,1,4))