Version: | 0.1.18 |
Date: | 2025-1-6 |
Title: | Clustering with Matrix Gaussian and Matrix Transformation Mixture Models |
Depends: | R (≥ 3.0.0) |
LazyLoad: | yes |
LazyData: | no |
Description: | Provides matrix Gaussian mixture models, matrix transformation mixture models and their model-based clustering results. The parsimonious models of the mean matrices and variance covariance matrices are implemented with a total of 196 variations. For more information, please check: Xuwen Zhu, Shuchismita Sarkar, and Volodymyr Melnykov (2021), "MatTransMix: an R package for matrix model-based clustering and parsimonious mixture modeling", <doi:10.1007/s00357-021-09401-9>. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Imports: | mvtnorm |
Packaged: | 2025-01-24 20:13:27 UTC; zhu |
Author: | Xuwen Zhu [aut, cre], Volodymyr Melnykov [aut], Shuchismita Sarkar [ctb], Michael Hutt [ctb, cph], Stephen Moshier [ctb, cph], Rouben Rostamian [ctb, cph], Carl Edward Rasmussen [ctb, cph], Dianne Cook [ctb, cph] |
Maintainer: | Xuwen Zhu <xzhu20@ua.edu> |
NeedsCompilation: | yes |
Repository: | CRAN |
Date/Publication: | 2025-01-29 17:20:08 UTC |
Finite mixture modeling and model-based clustering of matrices based on matrix Gaussian mixture and matrix transformation mixture models.
Description
The utility of this package is the clustering of random matrices. Finite mixture modeling and model-based clustering based on matrix Gaussian mixtures and matrix transformation mixtures are employed.
Details
Package: | MatTransMix |
Type: | Package |
Version: | 0.1.1 |
Date: | 2017-02-09 |
License: | GPL (>= 2) |
LazyLoad: | no |
Function 'MatTrans.init' runs the initialization for the EM algorithm.
Function 'MatTrans.EM' runs the EM algorithm for matrix-variate mixtures to cluster matrices.
Author(s)
Xuwen Zhu and Volodymyr Melnykov.
Maintainer: Xuwen Zhu <xuwen.zhu@louisville.edu>
IMDb data
Description
Data collected from IMDb.com on the ratings of 105 popular comedy movies.
Usage
data(IMDb)
Format
A list of 2 objects: Y and name, where Y represents the data array of ratings and name represents the comedy movie names. Y is the of dimensionality 2 x 4 x 105 with ratings on 105 movies from female and male by age groups 0-18, 18-29, 30-44, 45+.
Details
The data are publicly available on www.IMDb.com.
Examples
data(IMDb)
EM algorithm for matrix clustering
Description
Runs the EM algorithm for matrix clustering
Usage
MatTrans.EM(Y, initial = NULL, la = NULL, nu = NULL,
model = NULL, trans = "None", la.type = 0,
row.skew = TRUE, col.skew = TRUE, tol = 1e-05,
short.iter = NULL, long.iter = 1000, all.models = TRUE,
size.control = 0, silent = TRUE)
Arguments
Y |
dataset of random matrices (p x T x n), n random matrices of dimensionality (p x T) |
initial |
initialization parameters provided by function MatTrans.init() |
la |
initial skewness for rows (K x p) |
nu |
initial skewness for columns (K x T) |
model |
parsimonious model type, if null, then all 210 models are run |
trans |
transformation method: None (Gaussian models), Power, Manly |
la.type |
lambda type 0 or 1, 0: unrestricted, 1: same lambda across all variables |
row.skew |
if skewness for rows are fitted: TRUE or FALSE |
col.skew |
if skewness for columns are fitted: TRUE or FALSE |
tol |
tolerance level |
short.iter |
number of short EM iterations; if not specified, just run long EM |
long.iter |
number of long EM iterations |
all.models |
if true, run long EM for all models; otherwise just the best model returned by short EM in terms of BIC |
size.control |
minimum size of clusters allowed for controlling spurious solutions |
silent |
whether to produce output of steps or not |
Details
Runs the EM algorithm for modeling and clustering matrices for a provided dataset. Both matrix Gaussian mixture, matrix Power mixture and matrix Manly transformation mixture can be employed. The user should use the MatTrans.init() function to get initial parameters and input them as 'initial'. In the case when transformation parameters are not provided but 'trans' is specified to be 'Power' or 'Manly', 'la' and 'nu' take value of 0.5. 'model' can be specified as 'X-XXX-XX'. The first digit 'X' stands for the mean structure. It is either 'G': general mean or 'A': additive mean. The second 'XXX' specifies the variance-covariance Sigma. There are 14 options including EII, VII, EEI, VEI, EVI, VVI, EEE, EVE, VEE, VVE, EEV, VEV, EVV and VVV with detailed explanation as follows: "EII" spherical, equal volume "VII" spherical, unequal volume "EEI" diagonal, equal volume and shape "VEI" diagonal, varying volume, equal shape "EVI" diagonal, equal volume, varying shape "VVI" diagonal, varying volume and shape "EEE" ellipsoidal, equal volume, shape, and orientation "EVE" ellipsoidal, equal volume and orientation (*) "VEE" ellipsoidal, equal shape and orientation (*) "VVE" ellipsoidal, equal orientation (*) "EEV" ellipsoidal, equal volume and equal shape "VEV" ellipsoidal, equal shape "EVV" ellipsoidal, equal volume (*) "VVV" ellipsoidal, varying volume, shape, and orientation The last 2-digit 'XX' specifies the variance-covariance Psi. There are 8 options including II, EI, VI, EE, VE, EV, VV, AR. The user can specify the 'model' to be for example 'X-VVV-EV', then both 'G' and 'A' mean structures will be fitted while Sigma and Psi are fixed at 'VVV' and 'EV', respectively. Similarly, 'model' can be specified as 'G-XXX-EV' or 'G-VVV-XX' for selection of Sigma and Psi structures.
Value
No return value, called for side effects
Initialization for the EM algorithm for matrix clustering
Description
Runs the initialization for the EM algorithm for matrix clustering
Usage
MatTrans.init(Y, K, n.start = 10, scale = 1)
Arguments
Y |
dataset of random matrices (p x T x n), n random matrices of dimensionality (p x T) |
K |
number of clusters |
n.start |
initial random starts |
scale |
scaling parameter |
Details
Random starts are used to obtain different starting values. The number of clusters, the skewness parameters, and number of random starts need to be specified. In the case when transformation parameters are not provided, the function runs the EM algorithm without any transformations, i.e., it is equivalent to the EM algorithm for a matrix Gaussian mixture. Notation: n - sample size, p x T - dimensionality of the random matrices, K - number of mixture components.
Value
scale |
scale parameter set by the user |
result |
parsimonious models |
model |
model types |
loglik |
log likelihood values |
bic |
bic values |
best.result |
best parsimonious model |
best.model |
best model type |
best.loglik |
best logliklihood |
best.bic |
best bic |
trans |
transformation type |
Examples
set.seed(123)
data(crime)
Y <- crime$Y[c(2,7),,] / 1000
p <- dim(Y)[1]
T <- dim(Y)[2]
n <- dim(Y)[3]
K <- 2
init <- MatTrans.init(Y, K = K, n.start = 2)
Mean coordinate plot
Description
Mean coordinate plot provided for the best fitted model returned by MatTrans.EM model.
Usage
MatTrans.plot(X, model = NULL, xlab = "",
ylab = "", rownames = NULL, colnames = NULL,
lwd.obs = 0.8, lwd.mean = 2, line.cols = NULL, ...)
Arguments
X |
dataset of random matrices (p x T x n), n random matrices of dimensionality (p x T) |
model |
fitted MatTrans mixture model returned by function MatTrans.plot() |
xlab |
label on the X-axis |
ylab |
label on the Y-axis |
rownames |
input row variable names |
colnames |
input column variable names |
lwd.obs |
line width of observations |
lwd.mean |
line width of the mean profile |
line.cols |
line colors of the mean and observations |
... |
Details
Provides the mean profile plot for the fitted data by MatTrans.EM model.
Value
No return value, called for side effects
Salary data
Description
Data collected from the Chronicle of Higher Education web site reporting the average faculty salaries from 696 universities presented in the form of a 2 by 3 by 13 -dimensional tensor.
Usage
data(Salary)
Format
A list of 2 objects: Y, uni_info. Y represents the salary data array from 696 universities. uni_info has the university names, state and category. Y is of dimensionality 2 by 3 by 13 categorized by the following factors: gender (Male, Female), professor rank (Assistant, Associate, Full), and academic year (2003-2004,2015-2016).
Details
The data have been made publicly available by the Chronicle of Higher Education web site.
Examples
data(Salary)
Crime data
Description
Data collected by FBI's Uniform Crime on the violent and property crimes of 236 cities.
Usage
data(crime)
Format
A list of 3 objects: Y, department and state. Y represents the crime rate data array from 236 cities. Department is the police department names and state represents the states where each city is located at. Y is of dimensionality 10 x 13 x 236 with 236 crime rates on the following 10 variables from year 2000 through 2012.
- Population
Population of each city;
- Violent Crime rate
Total number of violent crimes;
- Murder and non-negligent manslaughter rate
Number of murders;
- Forcible rape rate
Number of rape crimes;
- Robbery rate
Number of robberies;
- Aggravated assault rate
Number of assaults;
- Property crime rate
Total number of property crimes;
- Burglary rate
Number of burglary crimes;
- Larceny-theft rate
Number of theft crimes;
- Motor vehicle theft rate
Number of vehicle theft crimes;
Details
The data have been made publicly available by FBI's Uniform Crime Reports.
Examples
data(crime)
Functions for Printing or Summarizing Objects
Description
EM
classes for printing and summarizing objects.
Usage
## S3 method for class 'EM'
print(x, ...)
## S3 method for class 'EM'
summary(object, ...)
Arguments
x |
an object with the 'EM' class attributes. |
object |
an object with the 'EM' class attributes. |
... |
other possible options. |
Details
Some useful functions for printing and summarizing results.
Value
No return value, called for side effects
See Also
MatTrans.EM
.