Type: | Package |
Title: | Item Response Theory and Computer-Based Testing in R |
Version: | 2.1.2 |
Date: | 2019-3-21 |
Author: | Xiao Luo [aut, cre] |
Maintainer: | Xiao Luo <xluo1986@gmail.com> |
Description: | A suite of psychometric analysis tools for research and operation, including: (1) computation of probability, information, and likelihood for the 3PL, GPCM, and GRM; (2) parameter estimation using joint or marginal likelihood estimation method; (3) simulation of computerized adaptive testing using built-in or customized algorithms; (4) assembly and simulation of multistage testing. The full documentation and tutorials are at https://github.com/xluo11/xxIRT. |
License: | GPL (≥ 3) |
Depends: | R (≥ 3.5.0) |
URL: | https://github.com/xluo11/xxIRT |
BugReports: | https://github.com/xluo11/xxIRT/issues |
Imports: | ggplot2, glpkAPI, lpSolveAPI, reshape2, stats |
RoxygenNote: | 6.1.1 |
Encoding: | UTF-8 |
NeedsCompilation: | no |
Packaged: | 2019-03-21 23:54:08 UTC; xiaoluo |
Repository: | CRAN |
Date/Publication: | 2019-03-22 10:00:03 UTC |
Automated Test Assembly (ATA)
Description
ata
initiates an ATA model
ata_obj_relative
adds a relative objective to the model
ata_obj_absolute
adds an absolute objective to the model
ata_constraint
adds a constraint to the model
ata_item_use
limits the minimum and maximum usage for items
ata_item_enemy
adds an enemy-item constraint to the model
ata_item_fixedvalue
forces an item to be selected or not selected
ata_solve
solves the MIP model
Usage
ata(pool, num_form = 1, len = NULL, max_use = NULL, ...)
## S3 method for class 'ata'
print(x, ...)
## S3 method for class 'ata'
plot(x, ...)
ata_obj_relative(x, coef, mode = c("max", "min"), tol = NULL,
negative = FALSE, forms = NULL, collapse = FALSE,
internal_index = FALSE, ...)
ata_obj_absolute(x, coef, target, equal_tol = FALSE, tol_up = NULL,
tol_down = NULL, forms = NULL, collapse = FALSE,
internal_index = FALSE, ...)
ata_constraint(x, coef, min = NA, max = NA, level = NULL,
forms = NULL, collapse = FALSE, internal_index = FALSE)
ata_item_use(x, min = NA, max = NA, items = NULL)
ata_item_enemy(x, items)
ata_item_fixedvalue(x, items, min = NA, max = NA, forms)
ata_solve(x, solver = c("lpsolve", "glpk"), as.list = TRUE,
details = TRUE, time_limit = 10, message = FALSE, ...)
Arguments
pool |
item pool, a data.frame |
num_form |
number of forms to be assembled |
len |
test length of each form |
max_use |
maximum use of each item |
... |
options, e.g. group, common_items, overlap_items |
x |
an ATA object |
coef |
coefficients of the objective function |
mode |
optimization mode: 'max' for maximization and 'min' for minimization |
tol |
the tolerance paraemter |
negative |
|
forms |
forms where objectives are added. |
collapse |
|
internal_index |
|
target |
the target values of the objective function |
equal_tol |
|
tol_up |
the range of upward tolerance |
tol_down |
the range of downward tolerance |
min |
the lower bound of the constraint |
max |
the upper bound of the constraint |
level |
the level of a categorical variable to be constrained |
items |
a vector of item indices, |
solver |
use 'lpsolve' for lp_solve 5.5 or 'glpk' for GLPK |
as.list |
|
details |
|
time_limit |
the time limit in seconds passed along to solvers |
message |
|
Details
The ATA model stores the definition of a MIP model. ata_solve
converts the model definition to a real MIP object and attempts to solve it.
ata_obj_relative
:
when mode='max', maximize (y-tol), subject to y <= sum(x) <= y+tol;
when mode='min', minimize (y+tol), subject to y-tol <= sum(x) <= y.
When negative
is TRUE
, y < 0, tol > 0.
coef
can be a numeric vector that has the same length with the pool or forms,
or a variable name in the pool, or a numeric vector of theta points.
When tol
is NULL
, it is optimized; when FALSE
, ignored;
when a number, fixed; when a range, constrained with lower and upper bounds.
ata_obj_absolute
minimizes y0+y1 subject to t-y0 <= sum(x) <= t+y1.
When level
is NA
, it is assumed that the constraint is on
a quantitative item property; otherwise, a categorical item property.
coef
can be a variable name, a constant, or a numeric vector that has
the same size as the pool.
ata_solve
takes control options in ...
.
For lpsolve, see lpSolveAPI::lp.control.options
.
For glpk, see glpkAPI::glpkConstants
Once the model is solved, additional data are added to the model.
status
shows the status of the solution, optimum
the optimal value of the objective fucntion found in the solution,
obj_vars
the values of two critical variables in the objective
function, result
the assembly results in a binary matrix, and
items
the assembled items
Examples
## Not run:
## generate a pool of 100 items
n_items <- 100
pool <- with(model_3pl_gendata(1, nitems), data.frame(id=1:n_items, a=a, b=b, c=c))
pool$content <- sample(1:3, n_items, replace=TRUE)
pool$time <- round(rlnorm(n_items, log(60), .2))
pool$group <- sort(sample(1:round(n_items/3), n_items, replace=TRUE))
## ex. 1: four 10-item forms, maximize b parameter
x <- ata(pool, 4, len=10, max_use=1)
x <- ata_obj_relative(x, "b", "max")
x <- ata_solve(x, timeout=5)
data.frame(form=1:4, b=sapply(x$items, function(x) mean(x$b)))
## ex. 2: four 10-item forms, minimize b parameter
x <- ata(pool, 4, len=10, max_use=1)
x <- ata_obj_relative(x, "b", "min", negative=TRUE)
x <- ata_solve(x, as.list=FALSE, timeout=5)
with(x$items, aggregate(b, by=list(form=form), mean))
## ex. 3: two 10-item forms, mean(b)=0, sd(b)=1
## content = (3, 3, 4), avg. time = 58--62 seconds
constr <- data.frame(name='content',level=1:3, min=c(3,3,4), max=c(3,3,4), stringsAsFactors=F)
constr <- rbind(constr, c('time', NA, 58*10, 62*10))
x <- ata(pool, 2, len=10, max_use=1)
x <- ata_obj_absolute(x, pool$b, 0*10)
x <- ata_obj_absolute(x, (pool$b-0)^2, 1*10)
for(i in 1:nrow(constr))
x <- with(constr, ata_constraint(x, name[i], min[i], max[i], level=level[i]))
x <- ata_solve(x, timeout=5)
sapply(x$items, function(x) c(mean=mean(x$b), sd=sd(x$b)))
## ex. 4: two 10-item forms, max TIF over (-1, 1), consider item sets
x <- ata(pool, 2, len=10, max_use=1, group="group")
x <- ata_obj_relative(x, seq(-1, 1, .5), 'max')
x <- ata_solve(x, timeout=5)
plot(x)
## End(Not run)
Helper functions of ATA
Description
miscellaneous helper functions of ATA
ata_append_constraints
appends constraint definitions to the model
ata_form_index
converts input forms into actual form indices in the model
ata_obj_coef
processes input coefficients of the objective functions
ata_solve_lpsolve
solves the the MIP model using lp_solve
ata_solve_glpk
solves the the MIP model using GLPK
Usage
ata_append_constraints(x, mat, dir, rhs)
ata_form_index(x, forms, collapse, internal_index)
ata_obj_coef(x, coef, compensate)
ata_solve_lpsolve(x, time_limit, message, ...)
ata_solve_glpk(x, time_limit, message, ...)
Arguments
mat |
coefficient matrix |
dir |
direction |
rhs |
right-hand-side value |
forms |
indices of forms |
collapse |
|
internal_index |
|
coef |
coefficients |
compensate |
|
time_limit |
the time limit in seconds passed along to solvers |
message |
|
... |
additional control parameters for solvers |
Simulation of Computerized Adaptive Testing (CAT)
Description
cat_sim
runs a simulation of CAT. Use theta
in options to set the starting
value of theta estimate.
cat_estimate_mle
is the maximum likelihood estimation rule. Use
map_len
to apply MAP to the first K items and use map_prior
to set the
prior for MAP.
cat_estimate_eap
is the expected a posteriori estimation rule,
using eap_mean
and eap_sd
option parameters as the prior
cat_estimate_hybrid
is a hybrid estimation rule, which uses MLE for
mixed responses and EAP for all 1's or 0's responses
cat_stop_default
is a three-way stopping rule. When stop_se
is set in the options, it uses the standard error stopping rule. When
stop_mi
is set in the options, it uses the minimum information stopping rule. When
stop_cut
is set in the options, it uses the confidence interval (set by ci_width
)
stopping rule.
cat_select_maxinfo
is the maximum information selection rule. Use group
(a numeric vector) to group items belonging to the same set. Use info_random
to implement
the random-esque item exposure control method.
cat_select_ccat
is the constrained CAT selection rule. Use
ccat_var
to set the content variable in the pool. Use ccat_perc
to set
the desired content distribution, with the name of each element being the content code
and tue value of each element being the percentage. Use ccat_random
to add randomness
to initial item selections.
cat_select_shadow
is the shadow-test selection rule. Use shadow_id
to group item sets. Use constraints
to set constraints. Constraints should be in a data.frame
with four columns: var (variable name), level (variable level, NA
for quantitative variable),
min (lower bound), and max (upper bound).
cat_stop_projection
is the projection-based stopping rule. Use
projection_method
to choose the projection method ('info' or 'diff'). Use
stop_cut
to set the cut score. Use constraints
to set the constraints.
Constraints should be a data.frame with columns: var (variable name),
level (variable level, NA
for quantitative varialbe), min (lower bound), max (upper bound)
Usage
cat_sim(true, pool, ...)
cat_estimate_mle(len, theta, stats, admin, pool, opts)
cat_estimate_eap(len, theta, stats, admin, pool, opts)
cat_estimate_hybrid(len, theta, stats, admin, pool, opts)
cat_stop_default(len, theta, stats, admin, pool, opts)
cat_select_maxinfo(len, theta, stats, admin, pool, opts)
cat_select_ccat(len, theta, stats, admin, pool, opts)
cat_select_shadow(len, theta, stats, admin, pool, opts)
## S3 method for class 'cat'
print(x, ...)
## S3 method for class 'cat'
plot(x, ...)
cat_stop_projection(len, theta, stats, admin, pool, opts)
Arguments
true |
the true theta |
pool |
the item pool (data.frame) |
... |
option/control parameters |
len |
the current test length |
theta |
the current theta estimate |
stats |
a matrix of responses, theta estimate, information and std error |
admin |
a data frame of administered items |
opts |
a list of option/control parameters |
x |
a |
Details
...
takes a variety of option/control parameters for the simulations from users.
min
and max are mandatory for setting limits on the test length. User-defined
selection, estimation, and stopping rules are also passed to the simulator via options.
To write a new rule, the function siganiture must be: function(len, theta, stats, admin, pool, opts)
.
See built-in rules for examples.
Value
cat_sim
returns a cat
object
an estimation rule should return a theta estimate
a stopping rule should return a boolean: TRUE
to stop the CAT, FALSE
to continue
a selection rule should return a list of (a) the selected item and (b) the updated pool
Examples
## Not run:
## generate a 100-item pool
num_items <- 100
pool <- with(model_3pl_gendata(1, num_items), data.frame(a=a, b=b, c=c))
pool$set_id <- sample(1:30, num_items, replace=TRUE)
pool$content <- sample(1:3, num_items, replace=TRUE)
pool$time <- round(rlnorm(num_items, mean=4.1, sd=.2))
## MLE, EAP, and hybrid estimation rule
cat_sim(1.0, pool, min=10, max=20, estimate_rule=cat_estimate_mle)
cat_sim(1.0, pool, min=10, max=20, estimate_rule=cat_estimate_eap)
cat_sim(1.0, pool, min=10, max=20, estimate_rule=cat_estimate_hybrid)
## SE, MI, and CI stopping rule
cat_sim(1.0, pool, min=10, max=20, stop_se=.3)
cat_sim(1.0, pool, min=10, max=20, stop_mi=.6)
cat_sim(1.0, pool, min=10, max=20, stop_cut=0)
cat_sim(1.0, pool, min=10, max=20, stop_cut=0, ci_width=2.58)
## maximum information selection with item sets
cat_sim(1.0, pool, min=10, max=20, group="set_id")$admin
## maximum information with item exposure control
cat_sim(1.0, pool, min=10, max=20, info_random=5)$admin
## Constrained-CAT selection rule with and without initial randomness
cat_sim(1.0, pool, min=10, max=20, select_rule=cat_select_ccat,
ccat_var="content", ccat_perc=c("1"=.2, "2"=.3, "3"=.5))
cat_sim(1.0, pool, min=10, max=20, select_rule=cat_select_ccat, ccat_random=5,
ccat_var="content", ccat_perc=c("1"=.2, "2"=.3, "3"=.5))
## Shadow-test selection rule
cons <- data.frame(var='content', level=1:3, min=c(3,3,4), max=c(3,3,4))
cons <- rbind(cons, data.frame(var='time', level=NA, min=55*10, max=65*10))
cat_sim(1.0, pool, min=10, max=10, select_rule=cat_select_shadow, constraints=cons)
## Projection-based stopping rule
cons <- data.frame(var='content', level=1:3, min=5, max=15)
cons <- rbind(cons, data.frame(var='time', level=NA, min=60*20, max=60*40))
cat_sim(1.0, pool, min=20, max=40, select_rule=cat_select_shadow, stop_rule=cat_stop_projection,
projection_method="diff", stop_cut=0, constraints=cons)
## End(Not run)
Cronbach's alpha
Description
cronbach_alpha
computes Cronbach's alpha internal consistency reliability
Usage
cronbach_alpha(responses)
Arguments
responses |
the oberved responses, 2d matrix |
Examples
cronbach_alpha(model_3pl_gendata(1000, 20)$u)
Estimate 3-parameter-logistic model
Description
Estimate the 3PL model using the maximum likelihood estimation
model_3pl_eap_scoring
scores response vectors using the EAP method
model_3pl_map_scoring
scores response vectors using the MAP method
model_3pl_dv_jmle
calculates the first and second derivatives for
the joint maximum likelihood estimation
model_3pl_estimate_jmle
estimates the parameters using the
joint maximum likelihood estimation (JMLE) method
model_3pl_dv_mmle
calculates the first and second derivatives for
the marginal maximum likelihood estimation
model_3pl_estimate_mmle
estimates the parameters using the
marginal maximum likelihood estimation (MMLE) method
Usage
model_3pl_eap_scoring(u, a, b, c, D = 1.702, prior = c(0, 1),
bound = c(-3, 3))
model_3pl_map_scoring(u, a, b, c, D = 1.702, prior = c(0, 1),
bound = c(-3, 3), nr_iter = 30, nr_conv = 0.001)
model_3pl_dv_Pt(t, a, b, c, D)
model_3pl_dv_Pa(t, a, b, c, D)
model_3pl_dv_Pb(t, a, b, c, D)
model_3pl_dv_Pc(t, a, b, c, D)
model_3pl_dv_jmle(dv, u)
model_3pl_estimate_jmle(u, t = NA, a = NA, b = NA, c = NA,
D = 1.702, iter = 100, conv = 1, nr_iter = 10, nr_conv = 0.001,
scale = c(0, 1), bounds_t = c(-3, 3), bounds_a = c(0.01, 2),
bounds_b = c(-3, 3), bounds_c = c(0, 0.25), priors = list(t = c(0,
1), a = c(-0.1, 0.2), b = c(0, 1), c = c(4, 20)), decay = 1,
debug = FALSE, true_params = NULL)
model_3pl_dv_mmle(pdv_fn, u, quad, a, b, c, D)
model_3pl_estimate_mmle(u, t = NA, a = NA, b = NA, c = NA,
D = 1.702, iter = 100, conv = 1, nr_iter = 10, nr_conv = 0.001,
bounds_t = c(-3, 3), bounds_a = c(0.01, 2), bounds_b = c(-3, 3),
bounds_c = c(0, 0.25), priors = list(t = c(0, 1), a = c(-0.1, 0.2), b
= c(0, 1), c = c(4, 20)), decay = 1, quad_degree = "11",
scoring = c("eap", "map"), debug = FALSE, true_params = NULL)
model_3pl_fitplot(u, t, a, b, c, D = 1.702, index = NULL,
intervals = seq(-3, 3, 0.5), show_points = TRUE)
Arguments
u |
observed response matrix, 2d matrix |
a |
discrimination parameters, 1d vector (fixed value) or NA (freely estimate) |
b |
difficulty parameters, 1d vector (fixed value) or NA (freely estimate) |
c |
pseudo-guessing parameters, 1d vector (fixed value) or NA (freely estimate) |
D |
the scaling constant, 1.702 by default |
prior |
the prior distribution |
nr_iter |
the maximum iterations of newton-raphson |
nr_conv |
the convegence criterion for newton-raphson |
t |
ability parameters, 1d vector (fixed value) or NA (freely estimate) |
iter |
the maximum iterations |
conv |
the convergence criterion of the -2 log-likelihood |
scale |
the meand and SD of the theta scale, N(0, 1) for JMLE by default |
bounds_t |
bounds of ability parameters |
bounds_a |
bounds of discrimination parameters |
bounds_b |
bounds of difficulty parameters |
bounds_c |
bounds of guessing parameters |
priors |
a list of prior distributions |
decay |
decay rate |
debug |
TRUE to print debuggin information |
true_params |
a list of true parameters for evaluating the estimation accuracy |
pdv_fn |
the function to compute derivatives of P w.r.t the estimating parameters |
quad_degree |
the number of quadrature points |
scoring |
the scoring method: 'eap' or 'map' |
index |
the indices of items being plotted |
intervals |
intervals on the x-axis |
show_points |
TRUE to show points |
Examples
with(model_3pl_gendata(10, 40), cbind(true=t, est=model_3pl_eap_scoring(u, a, b, c)$t))
with(model_3pl_gendata(10, 40), cbind(true=t, est=model_3pl_map_scoring(u, a, b, c)$t))
## Not run:
# generate data
x <- model_3pl_gendata(2000, 40)
# free estimation
y <- model_3pl_estimate_jmle(x$u, true_params=x)
# fix c-parameters
y <- model_3pl_estimate_jmle(x$u, c=0, true_params=x)
# no priors
y <- model_3pl_estimate_jmle(x$u, priors=NULL, iter=30, debug=T)
## End(Not run)
## Not run:
# generate data
x <- model_3pl_gendata(2000, 40)
# free estimation
y <- model_3pl_estimate_mmle(x$u, true_params=x)
# fix c-parameters
y <- model_3pl_estimate_mmle(x$u, c=0, true_params=x)
# no priors
y <- model_3pl_estimate_mmle(x$u, priors=NULL, iter=30, debug=T)
## End(Not run)
with(model_3pl_gendata(1000, 20), model_3pl_fitplot(u, t, a, b, c, index=c(1, 3, 5)))
Estimate Generalizaed Partial Credit Model
Description
Estimate the GPCM using the maximum likelihood estimation
model_gpcm_eap_scoring
scores response vectors using the EAP method
model_gpcm_map_scoring
scores response vectors using maximum a posteriori
model_gpcm_estimate_jmle
estimates the parameters using the
joint maximum likelihood estimation (JMLE) method
model_gpcm_estimate_mmle
estimates the parameters using the
marginal maximum likelihood estimation (MMLE) method
Usage
model_gpcm_eap_scoring(u, a, b, d, D = 1.702, prior = c(0, 1),
bound = c(-3, 3))
model_gpcm_map_scoring(u, a, b, d, D = 1.702, prior = NULL,
bound = c(-3, 3), nr_iter = 30, nr_conv = 0.001)
model_gpcm_dv_Pt(t, a, b, d, D)
model_gpcm_dv_Pa(t, a, b, d, D)
model_gpcm_dv_Pb(t, a, b, d, D)
model_gpcm_dv_Pd(t, a, b, d, D)
model_gpcm_dv_jmle(ix, dvp)
model_gpcm_estimate_jmle(u, t = NA, a = NA, b = NA, d = NA,
D = 1.702, iter = 100, nr_iter = 10, conv = 1, nr_conv = 0.001,
scale = c(0, 1), bounds_t = c(-4, 4), bounds_a = c(0.01, 2),
bounds_b = c(-4, 4), bounds_d = c(-4, 4), priors = list(t = c(0,
1), a = c(-0.1, 0.2), b = c(0, 1), d = c(0, 1)), decay = 1,
debug = FALSE, true_params = NULL)
model_gpcm_dv_mmle(u_ix, quad, pdv)
model_gpcm_estimate_mmle(u, t = NA, a = NA, b = NA, d = NA,
D = 1.702, iter = 100, nr_iter = 10, conv = 1, nr_conv = 0.001,
bounds_t = c(-4, 4), bounds_a = c(0.01, 2), bounds_b = c(-4, 4),
bounds_d = c(-4, 4), priors = list(t = c(0, 1), a = c(-0.1, 0.2), b =
c(0, 1), d = c(0, 1)), decay = 1, quad_degree = "11",
scoring = c("eap", "map"), debug = FALSE, true_params = NULL)
model_gpcm_fitplot(u, t, a, b, d, D = 1.702, insert_d0 = NULL,
index = NULL, intervals = seq(-3, 3, 0.5), show_points = TRUE)
Arguments
u |
the observed response matrix, 2d matrix |
a |
discrimination parameters, 1d vector (fixed value) or NA (freely estimate) |
b |
difficulty parameters, 1d vector (fixed value) or NA (freely estimate) |
d |
category parameters, 2d matrix (fixed value) or NA (freely estimate) |
D |
the scaling constant, 1.702 by default |
prior |
the prior distribution |
nr_iter |
the maximum iterations of newton-raphson |
nr_conv |
the convegence criterion for newton-raphson |
t |
ability parameters, 1d vector (fixed value) or NA (freely estimate) |
ix |
the 3d indices |
dvp |
the derivatives of P |
iter |
the maximum iterations |
conv |
the convergence criterion of the -2 log-likelihood |
scale |
the scale of theta parameters |
bounds_t |
bounds of ability parameters |
bounds_a |
bounds of discrimination parameters |
bounds_b |
bounds of location parameters |
bounds_d |
bounds of category parameters |
priors |
a list of prior distributions |
decay |
decay rate |
debug |
TRUE to print debuggin information |
true_params |
a list of true parameters for evaluating the estimation accuracy |
quad_degree |
the number of quadrature points |
scoring |
the scoring method: 'eap' or 'map' |
insert_d0 |
insert an initial category value |
index |
the indices of items being plotted |
intervals |
intervals on the x-axis |
show_points |
TRUE to show points |
Examples
with(model_gpcm_gendata(10, 40, 3), cbind(true=t, est=model_gpcm_eap_scoring(u, a, b, d)$t))
with(model_gpcm_gendata(10, 40, 3), cbind(true=t, est=model_gpcm_map_scoring(u, a, b, d)$t))
## Not run:
# generate data
x <- model_gpcm_gendata(1000, 40, 3)
# free calibration
y <- model_gpcm_estimate_jmle(x$u, true_params=x)
# no priors
y <- model_gpcm_estimate_jmle(x$u, priors=NULL, true_params=x)
## End(Not run)
## Not run:
# generate data
x <- model_gpcm_gendata(1000, 40, 3)
# free estimation
y <- model_gpcm_estimate_mmle(x$u, true_params=x)
# no priors
y <- model_gpcm_estimate_mmle(x$u, priors=NULL, true_params=x)
## End(Not run)
with(model_gpcm_gendata(1000, 20, 3), model_gpcm_fitplot(u, t, a, b, d, index=c(1, 3, 5)))
Estimate Graded Response Model
Description
Estimate the GRM using the maximum likelihood estimation
model_grm_eap_scoring
scores response vectors using the EAP method
model_grm_map_scoring
scores response vectors using the MAP method
model_grm_estimate_jmle
estimates the parameters using the
joint maximum likelihood estimation (JMLE) method
model_grm_estimate_mmle
estimates the parameters using the
marginal maximum likelihood estimation (MMLE) method
Usage
model_grm_eap_scoring(u, a, b, D = 1.702, prior = c(0, 1),
bound = c(-3, 3))
model_grm_map_scoring(u, a, b, D = 1.702, prior = NULL, bound = c(-3,
3), nr_iter = 30, nr_conv = 0.001)
model_grm_dv_Pt(t, a, b, D)
model_grm_dv_Pa(t, a, b, D)
model_grm_dv_Pb(t, a, b, D)
model_grm_dv_jmle(ix, dvp)
model_grm_estimate_jmle(u, t = NA, a = NA, b = NA, D = 1.702,
iter = 100, nr_iter = 10, conv = 1, nr_conv = 0.001,
scale = c(0, 1), bounds_t = c(-4, 4), bounds_a = c(0.01, 2),
bounds_b = c(-4, 4), priors = list(t = c(0, 1), a = c(-0.1, 0.2), b =
c(0, 1)), decay = 1, debug = FALSE, true_params = NULL)
model_grm_dv_mmle(u_ix, quad, pdv)
model_grm_estimate_mmle(u, t = NA, a = NA, b = NA, d = NA,
D = 1.702, iter = 100, nr_iter = 10, conv = 1, nr_conv = 0.001,
bounds_t = c(-4, 4), bounds_a = c(0.01, 2), bounds_b = c(-4, 4),
bounds_d = c(-4, 4), priors = list(t = c(0, 1), a = c(-0.1, 0.2), b =
c(0, 1)), decay = 1, quad_degree = "11", scoring = c("eap", "map"),
debug = FALSE, true_params = NULL)
model_grm_fitplot(u, t, a, b, D = 1.702, index = NULL,
intervals = seq(-3, 3, 0.5), show_points = TRUE)
Arguments
u |
the observed response matrix, 2d matrix |
a |
discrimination parameters, 1d vector (fixed value) or NA (freely estimate) |
b |
difficulty parameters, 2d matrix (fixed value) or NA (freely estimate) |
D |
the scaling constant, 1.702 by default |
prior |
the prior distribution |
nr_iter |
the maximum iterations of newton-raphson |
nr_conv |
the convegence criterion of newton-raphson |
t |
ability parameters, 1d vector (fixed value) or NA (freely estimate) |
ix |
the 3d indices |
dvp |
the derivatives of P |
iter |
the maximum iterations |
conv |
the convergence criterion for the -2 log-likelihood |
scale |
the scale of theta parameters |
bounds_t |
bounds of ability parameters |
bounds_a |
bounds of discrimination parameters |
bounds_b |
bounds of location parameters |
priors |
a list of prior distributions |
decay |
decay rate |
debug |
TRUE to print debuggin information |
true_params |
a list of true parameters for evaluating the estimation accuracy |
quad_degree |
the number of quadrature points |
scoring |
the scoring method: 'eap' or 'map' |
index |
the indices of items being plotted |
intervals |
intervals on the x-axis |
show_points |
TRUE to show points |
Examples
with(model_grm_gendata(10, 50, 3), cbind(true=t, est=model_grm_eap_scoring(u, a, b)$t))
with(model_grm_gendata(10, 50, 3), cbind(true=t, est=model_grm_map_scoring(u, a, b)$t))
## Not run:
# generate data
x <- model_grm_gendata(1000, 40, 3)
# free calibration
y <- model_grm_estimate_jmle(x$u, true_params=x)
# no priors
y <- model_grm_estimate_jmle(x$u, priors=NULL, true_params=x)
## End(Not run)
## Not run:
# generate data
x <- model_grm_gendata(1000, 40, 3)
# free estimation
y <- model_grm_estimate_mmle(x$u, true_params=x)
# no priors
y <- model_grm_estimate_mmle(x$u, priors=NULL, true_params=x)
## End(Not run)
with(model_grm_gendata(1000, 20, 3), model_grm_fitplot(u, t, a, b, index=c(1, 3, 5)))
Helper functions of Model Estimation
Description
miscellaneous helper functions for estimating IRT models
estimate_nr_iteration
updates the parameters using the newton-raphson method
Usage
estimate_nr_iteration(param, free, dv, h_max, lr, bound)
model_polytomous_3dindex(u)
model_polytomous_3dresponse(u)
Arguments
param |
the parameter being estimated |
free |
TRUE to freely estimate specific parameters |
dv |
the first and second derivatives |
h_max |
the maximum value of h |
lr |
the learning rate |
bound |
the lower and upper bounds of the parameter |
u |
the observed response, 2d matrix, values start from 0 |
#' Distribution of Expected Raw Scores
Description
Calculate the distribution of expected raw scores
Usage
expected_raw_score_dist(t, a, b, c)
Arguments
t |
the ability parameters, 1d vector |
a |
the item discrimination parameters, 1d vector |
b |
the item difficulty parameters, 1d vector |
c |
the item guessing parameters, 1d vector |
Frequency Counts
Description
Frequency counts of a vector
Usage
freq(x, values = NULL, rounding = NULL)
Arguments
x |
a numeric or character vector |
values |
valid values, |
rounding |
round percentage to n-th decimal places |
Hermite-Gauss Quadrature
Description
Pre-computed hermite gaussian quadratures points and weights
Usage
hermite_gauss(degree = c("20", "11", "7"))
Arguments
degree |
the degree of hermite-gauss quadrature: '20', '11', '7' |
3-parameter-logistic model
Description
Routine functions for the 3PL model
Usage
model_3pl_prob(t, a, b, c, D = 1.702)
model_3pl_info(t, a, b, c, D = 1.702)
model_3pl_lh(u, t, a, b, c, D = 1.702, log = FALSE)
model_3pl_rescale(t, a, b, c, param = c("t", "b"), mean = 0, sd = 1)
model_3pl_gendata(n_p, n_i, t = NULL, a = NULL, b = NULL, c = NULL,
D = 1.702, t_dist = c(0, 1), a_dist = c(-0.1, 0.2), b_dist = c(0,
0.7), c_dist = c(5, 46), missing = NULL)
model_3pl_plot(a, b, c, D = 1.702, type = c("prob", "info"),
total = FALSE, xaxis = seq(-4, 4, 0.1))
model_3pl_plot_loglh(u, a, b, c, D = 1.702, xaxis = seq(-4, 4, 0.1),
show_mle = FALSE)
Arguments
t |
ability parameters, 1d vector |
a |
discrimination parameters, 1d vector |
b |
difficulty parameters, 1d vector |
c |
guessing parameters, 1d vector |
D |
the scaling constant, 1.702 by default |
u |
observed responses, 2d matrix |
log |
True to return log-likelihood |
param |
the parameter of the new scale: 't' or 'b' |
mean |
the mean of the new scale |
sd |
the standard deviation of the new scale |
n_p |
the number of people to be generated |
n_i |
the number of items to be generated |
t_dist |
parameters of the normal distribution used to generate t-parameters |
a_dist |
parameters of the lognormal distribution used to generate a-parameters |
b_dist |
parameters of the normal distribution used to generate b-parameters |
c_dist |
parameters of the beta distribution used to generate c-parameters |
missing |
the proportion or number of missing responses |
type |
the type of plot: 'prob' for item characteristic curve (ICC) and 'info' for item information function curve (IIFC) |
total |
TRUE to sum values over items |
xaxis |
the values of x-axis |
show_mle |
TRUE to print maximum likelihood estimates |
Examples
with(model_3pl_gendata(10, 5), model_3pl_prob(t, a, b, c))
with(model_3pl_gendata(10, 5), model_3pl_info(t, a, b, c))
with(model_3pl_gendata(10, 5), model_3pl_lh(u, t, a, b, c))
model_3pl_gendata(10, 5)
model_3pl_gendata(10, 5, a=1, c=0, missing=.1)
with(model_3pl_gendata(10, 5), model_3pl_plot(a, b, c, type="prob"))
with(model_3pl_gendata(10, 5), model_3pl_plot(a, b, c, type="info", total=TRUE))
with(model_3pl_gendata(5, 50), model_3pl_plot_loglh(u, a, b, c, show_mle=TRUE))
Generalized Partial Credit Model
Description
Routine functions for the GPCM
Usage
model_gpcm_prob(t, a, b, d, D = 1.702, insert_d0 = NULL)
model_gpcm_info(t, a, b, d, D = 1.702, insert_d0 = NULL)
model_gpcm_lh(u, t, a, b, d, D = 1.702, insert_d0 = NULL,
log = FALSE)
model_gpcm_gendata(n_p, n_i, n_c, t = NULL, a = NULL, b = NULL,
d = NULL, D = 1.702, sort_d = FALSE, t_dist = c(0, 1),
a_dist = c(-0.1, 0.2), b_dist = c(0, 0.8), missing = NULL)
model_gpcm_rescale(t, a, b, d, param = c("t", "b"), mean = 0, sd = 1)
model_gpcm_plot(a, b, d, D = 1.702, insert_d0 = NULL,
type = c("prob", "info"), by_item = FALSE, total = FALSE,
xaxis = seq(-6, 6, 0.1))
model_gpcm_plot_loglh(u, a, b, d, D = 1.702, insert_d0 = NULL,
xaxis = seq(-6, 6, 0.1), show_mle = FALSE)
Arguments
t |
ability parameters, 1d vector |
a |
discrimination parameters, 1d vector |
b |
item location parameters, 1d vector |
d |
item category parameters, 2d vector |
D |
the scaling constant, 1.702 by default |
insert_d0 |
insert an initial category value |
u |
the observed scores (starting from 0), 2d matrix |
log |
TRUE to return log-likelihood |
n_p |
the number of people to be generated |
n_i |
the number of items to be generated |
n_c |
the number of score categories |
sort_d |
|
t_dist |
parameters of the normal distribution used to generate t-parameters |
a_dist |
parameters of the lognormal distribution parameters of a-parameters |
b_dist |
parameters of the normal distribution used to generate b-parameters |
missing |
the proportion or number of missing responses |
param |
the parameter of the new scale: 't' or 'b' |
mean |
the mean of the new scale |
sd |
the standard deviation of the new scale |
type |
the type of plot, prob for ICC and info for IIFC |
by_item |
TRUE to combine categories |
total |
TRUE to sum values over items |
xaxis |
the values of x-axis |
show_mle |
TRUE to print maximum likelihood values |
Details
Use NA
to represent unused category.
Examples
with(model_gpcm_gendata(10, 5, 3), model_gpcm_prob(t, a, b, d))
with(model_gpcm_gendata(10, 5, 3), model_gpcm_info(t, a, b, d))
with(model_gpcm_gendata(10, 5, 3), model_gpcm_lh(u, t, a, b, d))
model_gpcm_gendata(10, 5, 3)
model_gpcm_gendata(10, 5, 3, missing=.1)
# Figure 1 in Muraki, 1992 (APM)
b <- matrix(c(-2,0,2,-.5,0,2,-.5,0,2), nrow=3, byrow=TRUE)
model_gpcm_plot(a=c(1,1,.7), b=rowMeans(b), d=rowMeans(b)-b, D=1.0, insert_d0=0)
# Figure 2 in Muraki, 1992 (APM)
b <- matrix(c(.5,0,NA,0,0,0), nrow=2, byrow=TRUE)
model_gpcm_plot(a=.7, b=rowMeans(b, na.rm=TRUE), d=rowMeans(b, na.rm=TRUE)-b, D=1.0, insert_d0=0)
# Figure 3 in Muraki, 1992 (APM)
b <- matrix(c(1.759,-1.643,3.970,-2.764), nrow=2, byrow=TRUE)
model_gpcm_plot(a=c(.778,.946), b=rowMeans(b), d=rowMeans(b)-b, D=1.0, insert_d0=0)
# Figure 1 in Muraki, 1993 (APM)
b <- matrix(c(0,-2,4,0,-2,2,0,-2,0,0,-2,-2,0,-2,-4), nrow=5, byrow=TRUE)
model_gpcm_plot(a=1, b=rowMeans(b), d=rowMeans(b)-b, D=1.0)
# Figure 2 in Muraki, 1993 (APM)
b <- matrix(c(0,-2,4,0,-2,2,0,-2,0,0,-2,-2,0,-2,-4), nrow=5, byrow=TRUE)
model_gpcm_plot(a=1, b=rowMeans(b), d=rowMeans(b)-b, D=1.0, type='info', by_item=TRUE)
with(model_gpcm_gendata(5, 50, 3), model_gpcm_plot_loglh(u, a, b, d))
Graded Response Model
Description
Routine functions for the GRM
Usage
model_grm_prob(t, a, b, D = 1.702, raw = FALSE)
model_grm_info(t, a, b, D = 1.702)
model_grm_lh(u, t, a, b, D = 1.702, log = FALSE)
model_grm_gendata(n_p, n_i, n_c, t = NULL, a = NULL, b = NULL,
D = 1.702, t_dist = c(0, 1), a_dist = c(-0.1, 0.2), b_dist = c(0,
0.8), missing = NULL)
model_grm_rescale(t, a, b, param = c("t", "b"), mean = 0, sd = 1)
model_grm_plot(a, b, D = 1.702, type = c("prob", "info"),
by_item = FALSE, total = FALSE, xaxis = seq(-6, 6, 0.1),
raw = FALSE)
model_grm_plot_loglh(u, a, b, D = 1.702, xaxis = seq(-6, 6, 0.1),
show_mle = FALSE)
Arguments
t |
ability parameters, 1d vector |
a |
discrimination parameters, 1d vector |
b |
item location parameters, 2d matrix |
D |
the scaling constant, 1.702 by default |
raw |
TRUE to return P* |
u |
the observed scores (starting from 0), 2d matrix |
log |
TRUE to return log-likelihood |
n_p |
the number of people to be generated |
n_i |
the number of items to be generated |
n_c |
the number of score categories |
t_dist |
parameters of the normal distribution used to generate t-parameters |
a_dist |
parameters of the lognormal distribution used to generate a-parameters |
b_dist |
parameters of the normal distribution used to generate b-parameters |
missing |
the proportion or number of missing responses |
param |
the parameter of the new scale: 't' or 'b' |
mean |
the mean of the new scale |
sd |
the standard deviation of the new scale |
type |
the type of plot, prob for ICC and info for IIFC |
by_item |
TRUE to combine categories |
total |
TRUE to sum values over items |
xaxis |
the values of x-axis |
show_mle |
TRUE to print maximum likelihood values |
Examples
with(model_grm_gendata(10, 5, 3), model_grm_prob(t, a, b))
with(model_grm_gendata(10, 5, 3), model_grm_info(t, a, b))
with(model_grm_gendata(10, 5, 3), model_grm_lh(u, t, a, b))
model_grm_gendata(10, 5, 3)
model_grm_gendata(10, 5, 3, missing=.1)
with(model_grm_gendata(10, 5, 3), model_grm_plot(a, b, type='prob'))
with(model_grm_gendata(10, 5, 3), model_grm_plot(a, b, type='info', by_item=TRUE))
with(model_grm_gendata(5, 50, 3), model_grm_plot_loglh(u, a, b))
Computerized Multistage Testing (MST)
Description
mst
creates a multistage (MST) object for assembly
mst_route
adds/removes a route to/from the MST
mst_get_indices
maps the input indices to the actual indices
mst_obj
adds objective functions to the MST
mst_constraint
adds constraints to the MST
mst_stage_length
sets length limits on stages
mst_rdp
anchors the routing decision point (rdp) between adjacent modules
mst_module_mininfo
sets the minimum information for modules
mst_assemble
assembles the mst
mst_get_items
extracts items from the assembly results
Usage
mst(pool, design, num_panel, method = c("topdown", "bottomup"),
len = NULL, max_use = NULL, group = NULL, ...)
mst_route(x, route, op = c("+", "-"))
mst_get_indices(x, indices)
mst_obj(x, theta, indices = NULL, target = NULL, ...)
mst_constraint(x, coef, min = NA, max = NA, level = NULL,
indices = NULL)
mst_stage_length(x, stages, min = NA, max = NA)
mst_rdp(x, theta, indices, tol = 0)
mst_module_info(x, thetas, min, max, indices)
mst_assemble(x, ...)
## S3 method for class 'mst'
print(x, ...)
## S3 method for class 'mst'
plot(x, ...)
mst_get_items(x, panel_ix = NULL, stage_ix = NULL, module_ix = NULL,
route_ix = NULL)
Arguments
pool |
the item pool (data.frame) |
design |
the MST design (string): e.g., "1-3", "1-2-2", "1-2-3" |
num_panel |
the number of panels (integer) |
method |
the design method (string): 'topdown' or 'bottomup' |
len |
the module/route length (integer) |
max_use |
the maximum selection of items (integer) |
group |
the grouping variable (string or vector) |
... |
further arguments |
x |
the MST object |
route |
a MST route represented by a vector of module indices |
op |
"+" to add a route and "-" to remove a route |
indices |
the indices of the route (topdown) or the module (bottomup) where objectives are added |
theta |
a theta point or interval over which the TIF is optimized |
target |
the target values of the TIF objectives. |
coef |
the coefficients of the constraint |
min |
the lower bound of the constraint |
max |
the upper bound of the constraint |
level |
the constrained level, |
stages |
the stage indices |
tol |
tolerance parameter (numeric) |
thetas |
theta points, a vector |
panel_ix |
the panel index, an int vector |
stage_ix |
the stage index, an int vector |
module_ix |
the module index, an int vector |
route_ix |
the route index, an integer |
Details
There are two methods for designing a MST. The bottom-up approach adds objectives and constraints on individual modules, whereas the topdown approach adds objectives and constraints directly on routes.
plot.mst
draws module information functions when byroute=FALSE
and route information functions when byroute=TRUE
Examples
## Not run:
## generate item pool
num_item <- 300
pool <- with(model_3pl_gendata(1, num_item), data.frame(a=a, b=b, c=c))
pool$id <- 1:num_item
pool$content <- sample(1:3, num_item, replace=TRUE)
pool$time <- round(rlnorm(num_item, 4, .3))
pool$group <- sort(sample(1:round(num_item/3), num_item, replace=TRUE))
## ex. 1: 1-2-2 MST, 2 panels, topdown
## 20 items in total and 10 items in content area 1 in each route
## maximize info. at -1 and 1 for easy and hard routes
x <- mst(pool, "1-2-2", 2, 'topdown', len=20, max_use=1)
x <- mst_obj(x, theta=-1, indices=1:2)
x <- mst_obj(x, theta=1, indices=3:4)
x <- mst_constraint(x, "content", 10, 10, level=1)
x <- mst_assemble(x, timeout=5)
plot(x, byroute=TRUE)
for(p in 1:x$num_panel)
for(r in 1:x$num_route) {
route <- paste(x$route[r, 1:x$num_stage], collapse='-')
count <- sum(mst_get_items(x, panel_ix=p, route_ix=r)$content==1)
cat('panel=', p, ', route=', route, ': ', count, ' items in content area #1\n', sep='')
}
## ex. 2: 1-2-3 MST, 2 panels, bottomup,
## remove two routes with large theta change: 1-2-6, 1-3-4
## 10 items in total and 4 items in content area 2 in each module
## maximize info. at -1, 0 and 1 for easy, medium, and hard modules
x <- mst(pool, "1-2-3", 2, 'bottomup', len=10, max_use=1)
x <- mst_route(x, c(1, 2, 6), "-")
x <- mst_route(x, c(1, 3, 4), "-")
x <- mst_obj(x, theta= 0, indices=c(1, 5))
x <- mst_obj(x, theta=-1, indices=c(2, 4))
x <- mst_obj(x, theta= 1, indices=c(3, 6))
x <- mst_constraint(x, "content", 4, 4, level=2)
x <- mst_assemble(x, timeout=10)
plot(x, byroute=FALSE)
for(p in 1:x$num_panel)
for(m in 1:x$num_module){
count <- sum(mst_get_items(x, panel_ix=p, module_ix=m)$content==2)
cat('panel=', p, ', module=', m, ': ', count, ' items in content area #2\n', sep='')
}
## ex.3: same with ex.2 (w/o content constraints), but group-based
x <- mst(pool, "1-2-3", 2, 'bottomup', len=12, max_use=1, group="group")
x <- mst_route(x, c(1, 2, 6), "-")
x <- mst_route(x, c(1, 3, 4), "-")
x <- mst_obj(x, theta= 0, indices=c(1, 5))
x <- mst_obj(x, theta=-1, indices=c(2, 4))
x <- mst_obj(x, theta= 1, indices=c(3, 6))
x <- mst_assemble(x, timeout=10)
plot(x, byroute=FALSE)
for(p in 1:x$num_panel)
for(m in 1:x$num_module){
items <- mst_get_items(x, panel_ix=p, module_ix=m)
cat('panel=', p, ', module=', m, ': ', length(unique(items$id)), ' items from ',
length(unique(items$group)), ' groups\n', sep='')
}
## ex.4: 2 panels of 1-2-3 top-down design
## 20 total items and 10 items in content area 3
## 6+ items in stage 1 & 2
x <- mst(pool, "1-2-3", 2, "topdown", len=20, max_use=1)
x <- mst_route(x, c(1, 2, 6), "-")
x <- mst_route(x, c(1, 3, 4), "-")
x <- mst_obj(x, theta=-1, indices=1)
x <- mst_obj(x, theta=0, indices=2:3)
x <- mst_obj(x, theta=1, indices=4)
x <- mst_constraint(x, "content", 10, 10, level=3)
x <- mst_stage_length(x, 1:2, min=6)
x <- mst_assemble(x, timeout=15)
head(x$items)
plot(x, byroute=FALSE)
for(p in 1:x$num_panel)
for(s in 1:x$num_stage){
items <- mst_get_items(x, panel_ix=p, stage_ix=s)
cat('panel=', p, ', stage=', s, ': ', length(unique(items$id)), ' items\n', sep='')
}
## ex.5: same with ex.4, but use RDP instead of stage length to control routing errors
x <- mst(pool, "1-2-3", 2, "topdown", len=20, max_use=1)
x <- mst_route(x, c(1, 2, 6), "-")
x <- mst_route(x, c(1, 3, 4), "-")
x <- mst_obj(x, theta=-1, indices=1)
x <- mst_obj(x, theta=0, indices=2:3)
x <- mst_obj(x, theta=1, indices=4)
x <- mst_constraint(x, "content", 10, 10, level=3)
x <- mst_rdp(x, 0, 2:3, .1)
x <- mst_module_mininfo(x, 0, 5, 2:3)
x <- mst_assemble(x, timeout=15)
plot(x, byroute=FALSE)
## End(Not run)
Simulation of Multistage Testing
Description
mst_sim
simulates a MST administration
Usage
mst_sim(x, true, rdp = NULL, ...)
## S3 method for class 'mst_sim'
print(x, ...)
## S3 method for class 'mst_sim'
plot(x, ...)
Arguments
x |
the assembled MST |
true |
the true theta parameter (numeric) |
rdp |
routing decision points (list) |
... |
additional option/control parameters |
Examples
## Not run:
## assemble a MST
nitems <- 200
pool <- with(model_3pl_gendata(1, nitems), data.frame(a=a, b=b, c=c))
pool$content <- sample(1:3, nrow(pool), replace=TRUE)
x <- mst(pool, "1-2-2", 2, 'topdown', len=20, max_use=1)
x <- mst_obj(x, theta=-1, indices=1)
x <- mst_obj(x, theta=0, indices=2:3)
x <- mst_obj(x, theta=1, indices=4)
x <- mst_constraint(x, "content", 6, 6, level=1)
x <- mst_constraint(x, "content", 6, 6, level=2)
x <- mst_constraint(x, "content", 8, 8, level=3)
x <- mst_stage_length(x, 1:2, min=5)
x <- mst_assemble(x)
## ex. 1: administer the MST using fixed RDP for routing
x_sim <- mst_sim(x, .5, list(stage1=0, stage2=0))
plot(x_sim)
## ex. 2: administer the MST using the max. info. for routing
x_sim <- mst_sim(x, .5)
plot(x_sim, ylim=c(-5, 5))
## End(Not run)
Root Mean Squared Error
Description
Root mean squared error (RMSE) of two numeric vectors/matrices
Usage
rmse(x, y)
Arguments
x |
a numeric vector/matrix |
y |
a numeric vector/matrix |
Spearman Brown Prophecy
Description
Use Spearman-brown formula to compute the predicted reliability when the test length is extened to n-fold or reversely the n-fold extension of test length in order to reach the targeted reliability
Usage
spearman_brown(n, rho)
spearman_brown_reverse(rho, target)
Arguments
n |
extend the test length to n-fold |
rho |
the reliability of current test |
target |
the targeted reliability |
Examples
spearman_brown(2, .70)
spearman_brown_reverse(.70, .85)