Type: Package
Title: Ferguson-Klass Type Algorithm for Posterior Normalized Random Measures
Version: 2023.3.8
Description: Bayesian nonparametric density estimation modeling mixtures by a Ferguson-Klass type algorithm for posterior normalized random measures.
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Encoding: UTF-8
LazyData: yes
ByteCompile: yes
RoxygenNote: 7.2.3
Depends: R(≥ 3.5.0)
Maintainer: Guillaume Kon Kam King <guillaume.konkamking.work@gmail.com>
Imports: ggplot2, gridExtra, survival, coda, dplyr, tidyr, viridis
Suggests: GreedyEPL, Rmpfr, gmp, knitr, rmarkdown, testthat, BNPmix
NeedsCompilation: no
Packaged: 2023-03-24 14:47:00 UTC; gkonkamking
Author: Julyan Arbel [ctb], Ernesto Barrios [aut], Guillaume Kon Kam King [aut, cre], Antonio Lijoi [aut], Luis E. Nieto-Barajas [aut], Igor Prünster [aut]
Repository: CRAN
Date/Publication: 2023-03-24 15:10:02 UTC

Bayesian nonparametric density estimation

Description

This package performs Bayesian nonparametric density estimation for exact and censored data via a normalized random measure mixture model. The package allows the user to specify the mixture kernel, the mixing normalized measure and the choice of performing fully nonparametric mixtures on locations and scales, or semiparametric mixtures on locations only with common scale parameter. Options for the kernels are: two kernels with support in the real line (gaussian and double exponential), two more kernels in the positive line (gamma and lognormal) and one with bounded support (beta). The options for the normalized random measures are members of the class of normalized generalized gamma, which include the Dirichlet process, the normalized inverse gaussian process and the normalized stable process. The type of censored data handled by the package is right, left and interval.

Details

Package: BNPdensity
Type: Package
Version: 2016.10
Date: 2016-10-14
License: GPL version 2 or later
LazyLoad: yes

The package includes four main functions: MixNRMI1, MixNRMI2, MixNRMI1cens and MixNRMI2cens which implement semiparametric and fully nonparametric mixtures for exact data, and semiparametric and fully nonparametric mixtures for censored data respectively. Additionally, the package includes several other functions required for sampling from conditional distributions in the MCMC implementation. These functions are intended for internal use only.

Author(s)

Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I.; Contributor: Guillaume Kon Kam King.; Maintainer: Ernesto Barrios <ebarrios at itam.mx>

References

Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I. (2013). Modeling with Normalized Random Measure Mixture Models. Statistical Science. Vol. 28, No. 3, 313-334.

Kon Kam King, G., Arbel, J. and Prünster, I. (2016). Species Sensitivity Distribution revisited: a Bayesian nonparametric approach. In preparation.

See Also

MixNRMI1, MixNRMI2, MixNRMI1cens, MixNRMI2cens

Examples


example(MixNRMI1)
example(MixNRMI2)
example(MixNRMI1cens)
example(MixNRMI2cens)

Fit of MixNRMI1 function to the enzyme dataset

Description

This object contains the output when setting set.seed(150520) and running the function Enzyme1.out <- MixNRMI1(enzyme, Alpha = 1, Kappa = 0.007, Gama = 0.5, distr.k = "gamma", distr.p0 = "gamma", asigma = 1, bsigma = 1, Meps = 0.005, Nit = 5000, Pbi = 0.2)

Details

See function MixNRMI1

Examples


data(Enzyme1.out)

Fit of MixNRMI2 function to the enzyme dataset

Description

This object contains the output when setting set.seed(150520) and running the function Enzyme2.out <- MixNRMI2(enzyme, Alpha = 1, Kappa = 0.007, Gama = 0.5, distr.k = "gamma", distr.py0 = "gamma", distr.pz0 = "gamma", mu.pz0 = 1, sigma.pz0 = 1, Meps = 0.005, Nit = 5000, Pbi = 0.2) See function MixNRMI2

Examples


data(Enzyme2.out)

Plot Goodness of fits graphical checks for censored data

Description

Plot Goodness of fits graphical checks for censored data

Usage

GOFplots(fit, qq_plot = FALSE, thinning_to = 500)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1 or MixNRMI2, MixMRMI1cens or MixMRMI2cens

qq_plot

Whether to compute the QQ-plot

thinning_to

How many iterations to compute the mean posterior quantiles

Value

A density plot, a cumulative density plot with the Turnbull cumulative distribution, a percentile-percentile plot, and potentially a quantile-quantile plot.

Examples


set.seed(150520)
data(salinity)
out <- MixNRMI1cens(salinity$left, salinity$right, extras = TRUE, Nit = 100)
GOFplots(out)

Plot Goodness of fits graphical checks for censored data

Description

Plot Goodness of fits graphical checks for censored data

Usage

GOFplots_censored(fit, qq_plot = FALSE, thinning_to = 500)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1 or MixNRMI2, MixMRMI1cens or MixMRMI2cens

qq_plot

Whether to compute the QQ-plot

thinning_to

How many iterations to compute the mean posterior quantiles

Value

A density plot, a cumulative density plot with the Turnbull cumulative distribution, and a percentile-percentile plot.

Examples


set.seed(150520)
data(salinty)
out <- MixNRMI1cens(salinity$left, salinity$right, extras = TRUE, Nit = 100)
BNPdensity:::GOFplots_censored(out)

Plot Goodness of fits graphical checks for non censored data

Description

Plot Goodness of fits graphical checks for non censored data

Usage

GOFplots_noncensored(fit, qq_plot = FALSE, thinning_to = 500)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1 or MixNRMI2, MixMRMI1cens or MixMRMI2cens

qq_plot

Whether to compute the QQ-plot

thinning_to

How many iterations to compute the mean posterior quantiles

Value

A density plot with histogram, a cumulative density plot with the empirical cumulative distribution, and a percentile-percentile plot.

Examples


set.seed(150520)
data(acidity)
out <- MixNRMI1(acidity, extras = TRUE, Nit = 100)
BNPdensity:::GOFplots_noncensored(out)

Fit of MixNRMI1 function to the galaxy dataset

Description

This object contains the output when setting set.seed(150520) and running the function MixNRMI1(galaxy, Alpha = 1, Kappa = 0.015, Gama = 0.5, distr.k = "normal", distr.p0 = "gamma", asigma = 1, bsigma = 1, delta = 7, Meps = 0.005, Nit = 5000, Pbi = 0.2)

Details

See function MixNRMI1.

Examples


data(Galaxy1.out)

Fit of MixNRMI2 function to the galaxy dataset

Description

This object contains the output when setting set.seed(150520) and running the function Enzyme2.out <- MixNRMI2(x, Alpha = 1, Kappa = 0.007, Gama = 0.5, distr.k = "gamma", distr.py0 = "gamma", distr.pz0 = "gamma", mu.pz0 = 1, sigma.pz0 = 1, Meps = 0.005, Nit = 5000, Pbi = 0.2)

Details

See function MixNRMI2.

Examples


data(Galaxy2.out)

Normalized Random Measures Mixture of Type I

Description

Bayesian nonparametric estimation based on normalized measures driven mixtures for locations.

Usage

MixNRMI1(
  x,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.p0 = 1,
  asigma = 0.5,
  bsigma = 0.5,
  delta_S = 3,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE
)

Arguments

x

Numeric vector. Data set to which the density is fitted.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See details.

Kappa

Numeric positive constant. See details.

Gama

Numeric constant. 0\leq \texttt{Gama} \leq 1. See details.

distr.k

The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.

distr.p0

The distribution name for the centering measure. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure: 1 = Normal; 2 = Gamma; 3 = Beta.

asigma

Numeric positive constant. Shape parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.

bsigma

Numeric positive constant. Rate parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.

delta_S

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling sigma.

delta_U

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U.

Meps

Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply larger number of jumps.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means, weights and Js.

adaptive

Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).

Details

This generic function fits a normalized random measure (NRMI) mixture model for density estimation (James et al. 2009). Specifically, the model assumes a normalized generalized gamma (NGG) prior for the locations (means) of the mixture kernel and a parametric prior for the common smoothing parameter sigma, leading to a semiparametric mixture model.

The details of the model are:

X_i|Y_i,\sigma \sim k(\cdot |Y_i,\sigma)

Y_i|P \sim P,\quad i=1,\dots,n

P \sim \textrm{NGG(\texttt{Alpha, Kappa, Gama; P\_0})}

\sigma \sim \textrm{Gamma(asigma, bsigma)}

where X_i's are the observed data, Y_i's are latent (location) variables, sigma is the smoothing parameter, k is a parametric kernel parameterized in terms of mean and standard deviation, (Alpha, Kappa, Gama; P_0) are the parameters of the NGG prior with P_0 being the centering measure whose parameters are assigned vague hyper prior distributions, and (asigma,bsigma) are the hyper-parameters of the gamma prior on the smoothing parameter sigma. In particular: NGG(Alpha, 1, 0; P_0) defines a Dirichlet process; NGG(1, Kappa, 1/2; P_0) defines a Normalized inverse Gaussian process; and NGG(1, 0, Gama; P_0) defines a normalized stable process.

The evaluation grid ranges from min(x) - epsilon to max(x) + epsilon. By default epsilon=sd(x)/4.

Value

The function returns a MixNRMI1 object. It is based on a list with the following components:

xx

Numeric vector. Evaluation grid.

qx

Numeric array. Matrix of dimension \texttt{Nx} \times (\texttt{length(probs)} + 1) with the posterior mean and the desired quantiles input in probs.

cpo

Numeric vector of length(x) with conditional predictive ordinates.

R

Numeric vector of length(Nit*(1-Pbi)) with the number of mixtures components (clusters).

S

Numeric vector of length(Nit*(1-Pbi)) with the values of common standard deviation sigma.

U

Numeric vector of length(Nit*(1-Pbi)) with the values of the latent variable U.

Allocs

List of length(Nit*(1-Pbi)) with the clustering allocations.

means

List of length(Nit*(1-Pbi)) with the cluster means (locations). Only if extras = TRUE.

weights

List of length(Nit*(1-Pbi)) with the mixture weights. Only if extras = TRUE.

Js

List of length(Nit*(1-Pbi)) with the unnormalized weights (jump sizes). Only if extras = TRUE.

Nm

Integer constant. Number of jumps of the continuous component of the unnormalized process.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

procTime

Numeric vector with execution time provided by proc.time function.

distr.k

Integer corresponding to the kernel chosen for the mixture

data

Data used for the fit

NRMI_params

A named list with the parameters of the NRMI process

Warning

The function is computing intensive. Be patient.

Author(s)

Barrios, E., Kon Kam King, G., Lijoi, A., Nieto-Barajas, L.E. and Prüenster, I.

References

1.- Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I. (2013). Modeling with Normalized Random Measure Mixture Models. Statistical Science. Vol. 28, No. 3, 313-334.

2.- James, L.F., Lijoi, A. and Prünster, I. (2009). Posterior analysis for normalized random measure with independent increments. Scand. J. Statist 36, 76-97.

See Also

MixNRMI2, MixNRMI1cens, MixNRMI2cens, multMixNRMI1

Examples


### Example 1
## Not run: 
# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixNRMI1(x)
# Plotting density estimate + 95% credible interval
plot(out)
### Example 2
set.seed(150520)
data(enzyme)
x <- enzyme
Enzyme1.out <- MixNRMI1(x, Alpha = 1, Kappa = 0.007, Gama = 0.5,
                         distr.k = "gamma", distr.p0 = "gamma",
                         asigma = 1, bsigma = 1, Meps=0.005,
                         Nit = 5000, Pbi = 0.2)
attach(Enzyme1.out)
# Plotting density estimate + 95% credible interval
plot(Enzyme1.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting sigma
par(mfrow = c(2, 1))
plot(S, type = "l", main = "Trace of sigma")
hist(S, nclass = 20, probability = TRUE, main = "Histogram of sigma")
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()

## End(Not run)

### Example 3
## Do not run
# set.seed(150520)
# data(galaxy)
# x <- galaxy
#  Galaxy1.out <- MixNRMI1(x, Alpha = 1, Kappa = 0.015, Gama = 0.5,
#                          distr.k = "normal", distr.p0 = "gamma",
#                          asigma = 1, bsigma = 1, delta = 7, Meps=0.005,
#                          Nit = 5000, Pbi = 0.2)

# The output of this run is already loaded in the package
# To show results run the following
# Data
data(galaxy)
x <- galaxy
data(Galaxy1.out)
attach(Galaxy1.out)
# Plotting density estimate + 95% credible interval
plot(Galaxy1.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting sigma
par(mfrow = c(2, 1))
plot(S, type = "l", main = "Trace of sigma")
hist(S, nclass = 20, probability = TRUE, main = "Histogram of sigma")
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()

Normalized Random Measures Mixture of Type I for censored data

Description

Bayesian nonparametric estimation based on normalized measures driven mixtures for locations.

Usage

MixNRMI1cens(
  xleft,
  xright,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.p0 = "normal",
  asigma = 0.5,
  bsigma = 0.5,
  delta_S = 3,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE
)

Arguments

xleft

Numeric vector. Lower limit of interval censoring. For exact data the same as xright

xright

Numeric vector. Upper limit of interval censoring. For exact data the same as xleft.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See details.

Kappa

Numeric positive constant. See details.

Gama

Numeric constant. 0\leq \texttt{Gama} \leq 1. See details.

distr.k

The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.

distr.p0

The distribution name for the centering measure. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure: 1 = Normal; 2 = Gamma; 3 = Beta.

asigma

Numeric positive constant. Shape parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.

bsigma

Numeric positive constant. Rate parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.

delta_S

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling sigma.

delta_U

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U.

Meps

Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply larger number of jumps.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means, weights and Js.

adaptive

Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).

Details

This generic function fits a normalized random measure (NRMI) mixture model for density estimation (James et al. 2009) with censored data. Specifically, the model assumes a normalized generalized gamma (NGG) prior for the locations (means) of the mixture kernel and a parametric prior for the common smoothing parameter sigma, leading to a semiparametric mixture model.

This function coincides with MixNRMI1 when the lower (xleft) and upper (xright) censoring limits correspond to the same exact value.

The details of the model are:

X_i|Y_i,\sigma \sim k(\cdot |Y_i,\sigma)

Y_i|P \sim P,\quad i=1,\dots,n

P \sim \textrm{NGG(\texttt{Alpha, Kappa, Gama; P\_0})}

\sigma \sim \textrm{Gamma(asigma, bsigma)}

where X_i's are the observed data, Y_i's are latent (location) variables, sigma is the smoothing parameter, k is a parametric kernel parameterized in terms of mean and standard deviation, (Alpha, Kappa, Gama; P_0) are the parameters of the NGG prior with P_0 being the centering measure whose parameters are assigned vague hyper prior distributions, and (asigma,bsigma) are the hyper-parameters of the gamma prior on the smoothing parameter sigma. In particular: NGG(Alpha, 1, 0; P_0) defines a Dirichlet process; NGG(1, Kappa, 1/2; P_0) defines a Normalized inverse Gaussian process; and NGG(1, 0, Gama; P_0) defines a normalized stable process.

The evaluation grid ranges from min(x) - epsilon to max(x) + epsilon. By default epsilon=sd(x)/4.

Value

The function returns a list with the following components:

xx

Numeric vector. Evaluation grid.

qx

Numeric array. Matrix of dimension \texttt{Nx} \times (\texttt{length(probs)} + 1) with the posterior mean and the desired quantiles input in probs.

cpo

Numeric vector of length(x) with conditional predictive ordinates.

R

Numeric vector of length(Nit*(1-Pbi)) with the number of mixtures components (clusters).

S

Numeric vector of length(Nit*(1-Pbi)) with the values of common standard deviation sigma.

U

Numeric vector of length(Nit*(1-Pbi)) with the values of the latent variable U.

Allocs

List of length(Nit*(1-Pbi)) with the clustering allocations.

means

List of length(Nit*(1-Pbi)) with the cluster means (locations). Only if extras = TRUE.

weights

List of length(Nit*(1-Pbi)) with the mixture weights. Only if extras = TRUE.

Js

List of length(Nit*(1-Pbi)) with the unnormalized weights (jump sizes). Only if extras = TRUE.

Nm

Integer constant. Number of jumps of the continuous component of the unnormalized process.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

procTime

Numeric vector with execution time provided by proc.time function.

distr.k

Integer corresponding to the kernel chosen for the mixture

data

Data used for the fit

NRMI_params

A named list with the parameters of the NRMI process

Warning

The function is computing intensive. Be patient.

Author(s)

Barrios, E., Kon Kam King, G. and Nieto-Barajas, L.E.

References

1.- Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I. (2013). Modeling with Normalized Random Measure Mixture Models. Statistical Science. Vol. 28, No. 3, 313-334.

2.- James, L.F., Lijoi, A. and Prünster, I. (2009). Posterior analysis for normalized random measure with independent increments. Scand. J. Statist 36, 76-97.

3.- Kon Kam King, G., Arbel, J. and Prünster, I. (2016). Species Sensitivity Distribution revisited: a Bayesian nonparametric approach. In preparation.

See Also

MixNRMI2, MixNRMI1cens, MixNRMI2cens, multMixNRMI1

Examples


### Example 1
## Not run: 
# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixNRMI1cens(x, x)
# Plotting density estimate + 95% credible interval
plot(out)

## End(Not run)

## Not run: 
### Example 2
# Data
data(salinity)
# Fitting the model under default specifications
out <- MixNRMI1cens(xleft = salinity$left, xright = salinity$right, Nit = 5000)
# Plotting density estimate + 95% credible interval
attach(out)
plot(out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
detach()

## End(Not run)


Normalized Random Measures Mixture of Type II

Description

Bayesian nonparametric estimation based on normalized measures driven mixtures for locations and scales.

Usage

MixNRMI2(
  x,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.py0 = "normal",
  distr.pz0 = "gamma",
  mu.pz0 = 3,
  sigma.pz0 = sqrt(10),
  delta_S = 4,
  kappa = 2,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE
)

Arguments

x

Numeric vector. Data set to which the density is fitted.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See details.

Kappa

Numeric positive constant. See details.

Gama

Numeric constant. 0 \leq Gama \leq 1. See details.

distr.k

The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.

distr.py0

The distribution name for the centering measure for locations. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure for locations: 1 = Normal; 2 = Gamma; 3 = Beta.

distr.pz0

The distribution name for the centering measure for scales. Allowed names are "gamma", or an integer number identifying the centering measure for scales: 2 = Gamma. For more options use MixNRMI2cens.

mu.pz0

Numeric constant. Prior mean of the centering measure for scales.

sigma.pz0

Numeric constant. Prior standard deviation of the centering measure for scales.

delta_S

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the scales.

kappa

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the location parameters.

delta_U

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U. If 'adaptive=TRUE', 'delta_U'is the starting value for the adaptation.

Meps

Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply larger number of jumps.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means, sigmas, weights and Js.

adaptive

Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).

Details

This generic function fits a normalized random measure (NRMI) mixture model for density estimation (James et al. 2009). Specifically, the model assumes a normalized generalized gamma (NGG) prior for both, locations (means) and standard deviations, of the mixture kernel, leading to a fully nonparametric mixture model.

The details of the model are:

X_i|Y_i,Z_i \sim k(\cdot|Y_i,Z_i)

(Y_i,Z_i)|P \sim P, i=1,\dots,n

P \sim \textrm{NGG}(\texttt{Alpha, Kappa, Gama; P\_0})

where, X_i's are the observed data, (Y_i,Z_i)'s are bivariate latent (location and scale) vectors, k is a parametric kernel parameterized in terms of mean and standard deviation, (Alpha, Kappa, Gama; P_0) are the parameters of the NGG prior with a bivariate P_0 being the centering measure with independent components, that is, P_0(Y,Z) = P_0(Y)*P_0(Z). The parameters of P_0(Y) are assigned vague hyper prior distributions and (mu.pz0,sigma.pz0) are the hyper-parameters of P_0(Z). In particular, NGG(Alpha, 1, 0; P_0) defines a Dirichlet process; NGG(1, Kappa, 1/2;P_0) defines a Normalized inverse Gaussian process; and NGG(1, 0, Gama; P_0) defines a normalized stable process. The evaluation grid ranges from min(x) - epsilon to max(x) + epsilon. By default epsilon=sd(x)/4.

Value

The function returns a list with the following components:

xx

Numeric vector. Evaluation grid.

qx

Numeric array. Matrix of dimension \texttt{Nx} \times (\texttt{length(probs)} + 1) with the posterior mean and the desired quantiles input in probs.

cpo

Numeric vector of length(x) with conditional predictive ordinates.

R

Numeric vector of length(Nit*(1-Pbi)) with the number of mixtures components (clusters).

U

Numeric vector of length(Nit*(1-Pbi)) with the values of the latent variable U.

Allocs

List of length(Nit*(1-Pbi)) with the clustering allocations.

means

List of length(Nit*(1-Pbi)) with the cluster means (locations). Only if extras = TRUE.

sigmas

Numeric vector of length(Nit*(1-Pbi)) with the cluster standard deviations. Only if extras = TRUE.

weights

List of length(Nit*(1-Pbi)) with the mixture weights. Only if extras = TRUE.

Js

List of length(Nit*(1-Pbi)) with the unnormalized weights (jump sizes). Only if extras = TRUE.

Nm

Integer constant. Number of jumps of the continuous component of the unnormalized process.

delta_Us

List of length(Nit*(1-Pbi)) with the sequence of adapted delta_U used in the MH step for the latent variable U.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

procTime

Numeric vector with execution time provided by proc.time function.

distr.k

Integer corresponding to the kernel chosen for the mixture

data

Data used for the fit

NRMI_params

A named list with the parameters of the NRMI process

Warning

The function is computing intensive. Be patient.

Author(s)

Barrios, Kon Kam King, G., E., Lijoi, A., Nieto-Barajas, L.E. and Prüenster, I.

References

1.- Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I. (2013). Modeling with Normalized Random Measure Mixture Models. Statistical Science. Vol. 28, No. 3, 313-334.

2.- James, L.F., Lijoi, A. and Prünster, I. (2009). Posterior analysis for normalized random measure with independent increments. Scand. J. Statist 36, 76-97.

3.- Arbel, J., Kon Kam King, G., Lijoi, A., Nieto-Barajas, L.E. and Prüenster, I. (2021). BNPdensity: a package for Bayesian Nonparametric density estimation using Normalised Random Measures with Independent Increments.. Australian and New Zealand Journal of Statistics, to appear

See Also

MixNRMI2, MixNRMI1cens, MixNRMI2cens, multMixNRMI1

Examples

## Not run: 
### Example 1
# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixNRMI2(x)
# Plotting density estimate + 95% credible interval
plot(out)

## End(Not run)

### Example 2
## Do not run
# set.seed(150520)
# data(enzyme)
# x <- enzyme
#  Enzyme2.out <- MixNRMI2(x, Alpha = 1, Kappa = 0.007, Gama = 0.5,
#                          distr.k = "gamma", distr.py0 = "gamma",
#                          distr.pz0 = "gamma", mu.pz0 = 1, sigma.pz0 = 1, Meps=0.005,
#                          Nit = 5000, Pbi = 0.2)
# The output of this run is already loaded in the package
# To show results run the following
# Data
data(enzyme)
x <- enzyme
data(Enzyme2.out)
attach(Enzyme2.out)
# Plotting density estimate + 95% credible interval
plot(Enzyme2.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()

### Example 3
## Do not run
# set.seed(150520)
# data(galaxy)
# x <- galaxy
#  Galaxy2.out <- MixNRMI2(x, Alpha = 1, Kappa = 0.015, Gama = 0.5,
#                          distr.k = "normal", distr.py0 = "gamma",
#                          distr.pz0 = "gamma", mu.pz0 = 1, sigma.pz0 = 1,  Meps=0.005,
#                          Nit = 5000, Pbi = 0.2)
# The output of this run is already loaded in the package
# To show results run the following
# Data
data(galaxy)
x <- galaxy
data(Galaxy2.out)
attach(Galaxy2.out)
# Plotting density estimate + 95% credible interval
plot(Galaxy2.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()

Normalized Random Measures Mixture of Type II for censored data

Description

Bayesian nonparametric estimation based on normalized measures driven mixtures for locations and scales.

Usage

MixNRMI2cens(
  xleft,
  xright,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.py0 = "normal",
  distr.pz0 = "gamma",
  mu.pz0 = 3,
  sigma.pz0 = sqrt(10),
  delta_S = 4,
  kappa = 2,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE
)

Arguments

xleft

Numeric vector. Lower limit of interval censoring. For exact data the same as xright

xright

Numeric vector. Upper limit of interval censoring. For exact data the same as xleft.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See details.

Kappa

Numeric positive constant. See details.

Gama

Numeric constant. 0 \leq Gama \leq 1. See details.

distr.k

The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.

distr.py0

The distribution name for the centering measure for locations. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure for locations: 1 = Normal; 2 = Gamma; 3 = Beta.

distr.pz0

The distribution name for the centering measure for scales. Allowed names are "gamma", "lognormal", "half-Cauchy", "half-normal", "half-student", "uniform" and "truncated normal", or their common abbreviations "norm", "exp", "lnorm", "halfcauchy", "halfnorm", "halft" and "unif", or an integer number identifying the centering measure for scales: 2 = Gamma, 5 = Lognormal, 6 = Half Cauchy, 7 = Half Normal, 8 = Half Student-t, 9 = Uniform, 10 = Truncated Normal.

mu.pz0

Numeric constant. Prior mean of the centering measure for scales.

sigma.pz0

Numeric constant. Prior standard deviation of the centering measure for scales.

delta_S

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the scales.

kappa

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the location parameters.

delta_U

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U. If 'adaptive=TRUE', 'delta_U'is the starting value for the adaptation.

Meps

Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply larger number of jumps.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means, sigmas, weights and Js.

adaptive

Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).

Details

This generic function fits a normalized random measure (NRMI) mixture model for density estimation (James et al. 2009). Specifically, the model assumes a normalized generalized gamma (NGG) prior for both, locations (means) and standard deviations, of the mixture kernel, leading to a fully nonparametric mixture model.

The details of the model are:

X_i|Y_i,Z_i \sim k(\cdot|Y_i,Z_i)

(Y_i,Z_i)|P \sim P, i=1,\dots,n

P \sim \textrm{NGG}(\texttt{Alpha, Kappa, Gama; P\_0})

where, X_i's are the observed data, (Y_i,Z_i)'s are bivariate latent (location and scale) vectors, k is a parametric kernel parameterized in terms of mean and standard deviation, (Alpha, Kappa, Gama; P_0) are the parameters of the NGG prior with a bivariate P_0 being the centering measure with independent components, that is, P_0(Y,Z) = P_0(Y)*P_0(Z). The parameters of P_0(Y) are assigned vague hyper prior distributions and (mu.pz0,sigma.pz0) are the hyper-parameters of P_0(Z). In particular, NGG(Alpha, 1, 0; P_0) defines a Dirichlet process; NGG(1, Kappa, 1/2;P_0) defines a Normalized inverse Gaussian process; and NGG(1, 0, Gama; P_0) defines a normalized stable process. The evaluation grid ranges from min(x) - epsilon to max(x) + epsilon. By default epsilon=sd(x)/4.

Value

The function returns a list with the following components:

xx

Numeric vector. Evaluation grid.

qx

Numeric array. Matrix of dimension \texttt{Nx} \times (\texttt{length(probs)} + 1) with the posterior mean and the desired quantiles input in probs.

cpo

Numeric vector of length(x) with conditional predictive ordinates.

R

Numeric vector of length(Nit*(1-Pbi)) with the number of mixtures components (clusters).

U

Numeric vector of length(Nit*(1-Pbi)) with the values of the latent variable U.

Allocs

List of length(Nit*(1-Pbi)) with the clustering allocations.

means

List of length(Nit*(1-Pbi)) with the cluster means (locations). Only if extras = TRUE.

sigmas

Numeric vector of length(Nit*(1-Pbi)) with the cluster standard deviations. Only if extras = TRUE.

weights

List of length(Nit*(1-Pbi)) with the mixture weights. Only if extras = TRUE.

Js

List of length(Nit*(1-Pbi)) with the unnormalized weights (jump sizes). Only if extras = TRUE.

Nm

Integer constant. Number of jumps of the continuous component of the unnormalized process.

delta_Us

List of length(Nit*(1-Pbi)) with the sequence of adapted delta_U used in the MH step for the latent variable U.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

procTime

Numeric vector with execution time provided by proc.time function.

distr.k

Integer corresponding to the kernel chosen for the mixture

data

Data used for the fit

NRMI_params

A named list with the parameters of the NRMI process

Warning

The function is computing intensive. Be patient.

Author(s)

Barrios, E., Kon Kam King, G. and Nieto-Barajas, L.E.

References

1.- Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I. (2013). Modeling with Normalized Random Measure Mixture Models. Statistical Science. Vol. 28, No. 3, 313-334.

2.- James, L.F., Lijoi, A. and Prünster, I. (2009). Posterior analysis for normalized random measure with independent increments. Scand. J. Statist 36, 76-97.

3.- Kon Kam King, G., Arbel, J. and Prünster, I. (2016). Species Sensitivity Distribution revisited: a Bayesian nonparametric approach. In preparation.

See Also

MixNRMI2, MixNRMI1cens, MixNRMI2cens, multMixNRMI1

Examples

## Not run: 
### Example 1
# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixNRMI2cens(x, x)
# Plotting density estimate + 95% credible interval
plot(out)

## End(Not run)

## Not run: 
### Example 2
# Data
data(salinity)
# Fitting the model under special specifications
out <- MixNRMI2cens(
  xleft = salinity$left, xright = salinity$right, Nit = 5000, distr.pz0 = 10,
  mu.pz0 = 1, sigma.pz0 = 2
)
# Plotting density estimate + 95% credible interval
attach(out)
plot(out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
detach()

## End(Not run)


Pitman-Yor process mixture of Type I

Description

This function calls the PYdensity function from package BNPmix, to allow fitting a Pitman-Yor process mixture to the data.

Usage

MixPY1(
  x,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Gama = 0.4,
  asigma = 2,
  bsigma = 1/var(x),
  Nx = 100,
  Nit = 1500,
  Pbi = 0.5,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE
)

Arguments

x

Numeric vector. Data set to which the density is fitted.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See

Gama

Numeric constant. 0\leq \texttt{Gama} \leq 1. See details.

asigma

Numeric positive constant. Shape parameter of the gamma prior on the standard deviation of the mixture kernel. Default value suggested by package BNPmix.

bsigma

Numeric positive constant. Rate parameter of the gamma prior on the standard deviation of the mixture kernel. Default value suggested by package BNPmix.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means and weights

Value

The function returns a MixPY1 object. It is based on a list with the following components:

xx

Numeric vector. Evaluation grid.

qx

Numeric array. Matrix of dimension \texttt{Nx} \times (\texttt{length(probs)} + 1) with the posterior mean and the desired quantiles input in probs.

R

Numeric vector of length(Nit*(1-Pbi)) with the number of mixtures components (clusters).

S

Numeric vector of length(Nit*(1-Pbi)) with the values of common standard deviation sigma.

Allocs

List of length(Nit*(1-Pbi)) with the clustering allocations.

means

List of length(Nit*(1-Pbi)) with the cluster means (locations). Only if extras = TRUE.

weights

List of length(Nit*(1-Pbi)) with the mixture weights. Only if extras = TRUE.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

distr.k

Integer corresponding to the kernel chosen for the mixture. Always 1, since the Pitman-Yor process is only written to work with Gaussian kernels.

data

Data used for the fit

PY_params

A named list with the parameters of the Pitman-Yor process

Examples

# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixPY1(x)
# Plotting density estimate + 95% credible interval
plot(out)

Pitman-Yor process mixture of Type II

Description

This function calls the PYdensity function from package BNPmix, to allow fitting a Pitman-Yor process mixture to the data.

Usage

MixPY2(
  x,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Gama = 0.4,
  asigma = 2,
  bsigma = 1/var(x),
  Nx = 100,
  Nit = 1500,
  Pbi = 0.5,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE
)

Arguments

x

Numeric vector. Data set to which the density is fitted.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See

Gama

Numeric constant. 0\leq \texttt{Gama} \leq 1. See details.

asigma

Numeric positive constant. Shape parameter of the gamma prior on the standard deviation of the mixture kernel. Default value suggested by package BNPmix.

bsigma

Numeric positive constant. Rate parameter of the gamma prior on the standard deviation of the mixture kernel. Default value suggested by package BNPmix.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means and weights

Value

The function returns a MixPY2 object. It is based on a list with the following components:

xx

Numeric vector. Evaluation grid.

qx

Numeric array. Matrix of dimension \texttt{Nx} \times (\texttt{length(probs)} + 1) with the posterior mean and the desired quantiles input in probs.

R

Numeric vector of length(Nit*(1-Pbi)) with the number of mixtures components (clusters).

Allocs

List of length(Nit*(1-Pbi)) with the clustering allocations.

means

List of length(Nit*(1-Pbi)) with the cluster means (locations). Only if extras = TRUE.

sigmas

List of length(Nit*(1-Pbi)) with the cluster standard deviations (scales). Only if extras = TRUE.

weights

List of length(Nit*(1-Pbi)) with the mixture weights. Only if extras = TRUE.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

distr.k

Integer corresponding to the kernel chosen for the mixture. Always 1, since the Pitman-Yor process is only written to work with Gaussian kernels.

data

Data used for the fit

PY_params

A named list with the parameters of the Pitman-Yor process

Examples

# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixPY2(x)
# Plotting density estimate + 95% credible interval
plot(out)

Continuous Jump heights function

Description

This function evaluates the M(v) function that determines the jump heights in the "continuous" part of an increasing additive process.

Usage

Mv(u, alpha, beta, gama, low, upp, N)

Details

For internal use.

Examples


## The function is currently defined as
function(u = 0.5, alpha = 1, beta = 1, gama = 1 / 2, low = 1e-04,
         upp = 10, N = 5001) {
  x <- -log(seq(from = exp(-low), to = exp(-upp), length = N))
  f <- alpha / gamma(1 - gama) * x^(-(1 + gama)) * exp(-(u +
    beta) * x)
  dx <- diff(x)
  h <- (f[-1] + f[-N]) / 2
  Mv <- rep(0, N)
  for (i in seq(N - 1, 1)) Mv[i] <- Mv[i + 1] + dx[i] * h[i]
  return(list(v = x, Mv = Mv))
}

Invert jump heights function

Description

Determines the jump heights of an increasing additive process by inverting the M(v) function. Use a truncation level based on expected moments of the NGG process (thresholdGG). For internal use.

Usage

MvInv(eps, u = 0.5, alpha = 1, kappa = 1, gama = 1/2, N = 3001)

Arguments

eps

Dummy argument kept for consistency with past versions of the functions

u

Real number. The value of the latent variable at the current step.

alpha

Numeric constant. Total mass of the centering measure.

kappa

Numeric positive constant.

gama

Numeric constant. Discount parameter of the NRMI process.

N

Number of steps in the discretization scheme for the grid inversion.

## The function has been optimised but it is morally defined as: function(eps, u = 0.5, alpha = 1, kappa = 1, gama = 1 / 2, N = 3001) n <- length(w) v <- rep(NA, n) x <- -log(seq(from = exp(-1e-05), to = exp(-10), length = N)) f <- alpha / gamma(1 - gama) * x^(-(1 + gama)) * exp(-(u + kappa) * x) dx <- diff(x) h <- (f[-1] + f[-N]) / 2 Mv <- rep(0, N) for (i in seq(N - 1, 1)) Mv[i] <- Mv[i + 1] + dx[i] * h[i] for (j in seq(n)) v[j] <- x[which.min(Mv > w[j])] return(v)


Acidity Index Dataset

Description

Concerns an acidity index measured in a sample of 155 lakes in north-central Wisconsin.

Format

A real vector with 155 observations.

References

Crawford, S. L., DeGroot, M. H., Kadane, J. B. and Small, M. J. (1992). Modeling lake chemistry distributions: approximate Bayesian methods for estimating a finite mixture model. Technometrics, 34, 441-453.

Examples


data(acidity)
hist(acidity)

Add x and y

Description

This is a helper function for use in Reduce() over a list of vectors

Usage

add(x, y)

Arguments

x

first argument of the sum

y

second argument of the sum

Value

x + y


Convert the output of multMixNRMI into a coda mcmc object

Description

Convert the output of multMixNRMI into a coda mcmc object

Usage

## S3 method for class 'multNRMI'
as.mcmc(x, ..., thinning_to = 1000, ncores = parallel::detectCores())

Arguments

x

Output of multMixNRMI.

...

Further arguments to be passed to specific methods

thinning_to

Final length of the chain after thinning.

ncores

Specify the number of cores to use in the conversion

Value

a coda::mcmc object

Examples

data(acidity)
out <- multMixNRMI1(acidity, parallel = TRUE, Nit = 10, ncores = 2)
coda::as.mcmc(out, ncores = 2)

If the function Rmpfr::asNumeric returns a warning about inefficiency, silence it.

Description

The function Rmpfr::asNumeric prints the following warning: In asMethod(object) : coercing "mpfr1" via "mpfr" (inefficient). It is not clear how to avoid it nor how to silence it, hence this function. A cleaner solution may be available at: https://stackoverflow.com/questions/4948361/how-do-i-save-warnings-and-errors-as-output-from-a-function/4952908#4952908

Usage

asNumeric_no_warning(x)

Arguments

x

An object of class Rmpfr::mpfr1

Value

a "numeric" number


Censoring data check

Description

Checks that a censored dataset is valid. This performs two checks: check that the dataset does not contain only NA, and check that the for interval censored data, the bounds are in the right order.

Usage

cens_data_check(xleft, xright)

Arguments

xleft

left bounds for the censored dataset. Can be a real number or NA

xright

right bounds for the censored dataset

Details

For internal use

Examples


## The function is currently defined as
function(xleft, xright) {
  if (any(xright < xleft, na.rm = T)) {
    stop("in censored data, left bound not always smaller than right bound")
  }
  if (any(mapply(FUN = function(xileft, xiright) {
    is.na(xileft) & is.na(xiright)
  }, xleft, xright))) {
    stop("in censored data, there is an NA NA")
  }
}

Censor code right-left

Description

Creates censoring code 0:3.

Usage

censor_code_rl(left, right)

Details

For internal use

Examples


## The function is currently defined as
function(left, right) {
  test_ <- function(k) {
    if (is.na(left[[k]]) & is.na(right[[k]])) {
      NA
    } else if (is.na(left[[k]])) {
      2
    } else if (is.na(right[[k]])) {
      0
    } else if (left[[k]] == right[[k]]) {
      1
    } else {
      3
    }
  }
  sapply(seq_along(left), FUN = test_)
}

Comment on the NRMI process depending on the value of the parameters

Description

Comment on the NRMI process depending on the value of the parameters

Usage

comment_on_NRMI_type(NRMI_param = list(Alpha = 1, Kappa = 0, Gamma = 0.4))

Arguments

NRMI_param

A named list of the form list("Alpha" = 1, "Kappa" = 0, "Gamma" = 0.4)

Value

A string containing a comment on the NRMI process

Examples

BNPdensity:::comment_on_NRMI_type(list("Alpha" = 1, "Kappa" = 0, "Gamma" = 0.4))
BNPdensity:::comment_on_NRMI_type(list("Alpha" = 1, "Kappa" = 0.1, "Gamma" = 0.4))
BNPdensity:::comment_on_NRMI_type(list("Alpha" = 1, "Kappa" = 0.1, "Gamma" = 0.5))

Ties function: univariate

Description

This function computes the distinct observations and their frequencies in a numeric vector.

Usage

comp1(y)

Details

For internal use.

Examples


## The function is currently defined as
function(y) {
  n <- length(y)
  mat <- outer(y, y, "==")
  jstar <- led <- rep(FALSE, n)
  for (j in seq(n)) {
    if (!led[j]) {
      jstar[j] <- TRUE
      if (j == n) {
        break
      }
      ji <- seq(j + 1, n)
      tt <- mat[ji, j] %in% TRUE
      led[ji] <- led[ji] | tt
    }
    if (all(led[-seq(j)])) {
      break
    }
  }
  ystar <- y[jstar]
  nstar <- apply(mat[, jstar], 2, sum)
  r <- length(nstar)
  idx <- match(y, ystar)
  return(list(ystar = ystar, nstar = nstar, r = r, idx = idx))
}

Ties function: bivariate

Description

This function computes the distinct observations (couples) and their frequencies in a bivariate numeric vector.

Usage

comp2(y, z)

Details

For internal use.

Examples


## The function is currently defined as
function(y, z) {
  if (length(y) != length(z)) {
    stop("Vectors y and z should have equal length!")
  }
  n <- length(y)
  matY <- outer(y, y, "==")
  matZ <- outer(z, z, "==")
  mat <- matY & matZ
  jstar <- led <- rep(FALSE, n)
  for (j in seq(n)) {
    if (!led[j]) {
      jstar[j] <- TRUE
      if (j == n) {
        break
      }
      ji <- seq(j + 1, n)
      tt <- mat[ji, j] %in% TRUE
      led[ji] <- led[ji] | tt
    }
    if (all(led[-seq(j)])) {
      break
    }
  }
  ystar <- y[jstar]
  zstar <- z[jstar]
  nstar <- apply(mat[, jstar], 2, sum)
  rstar <- length(nstar)
  idx <- match(y, ystar)
  return(list(
    ystar = ystar, zstar = zstar, nstar = nstar,
    rstar = rstar, idx = idx
  ))
}

Compute the optimal clustering from an MCMC sample

Description

Summarizes the posterior on all possible clusterings by an optimal clustering where optimality is defined as minimizing the posterior expectation of a specific loss function, the Variation of Information or Binder's loss function. Computation can be lengthy for large datasets, because of the large size of the space of all clusterings.

Usage

compute_optimal_clustering(fit, loss_type = "VI")

Arguments

fit

The fitted object, obtained from one of the MixNRMIx functions

loss_type

Defines the loss function to be used in the expected posterior loss minimization. Can be one of "VI" (Variation of Information), "B" (Binder's loss), "NVI" (Normalized Variation of Information) or "NID" (Normalized Information Distance). Defaults to "VI".

Value

A vector of integers with the same size as the data, indicating the allocation of each data point.


Compute the grid for thinning the MCMC chain

Description

This function creates an real grid then rounds it. If the grid is fine enough, there is a risk that rounding ties, i.e. iteration which are kept twice. To avoid this, if the total number of iterations is smaller than twice the number of iterations desired after thinning, the chain is not thinned.

Usage

compute_thinning_grid(Nit, thinning_to = 10)

Arguments

Nit

Length of the MCMC chain

thinning_to

Desired number of iterations after thinning.

Value

an integer vector of the MCMC iterations retained.


Convert the output of multMixNRMI into a coda mcmc object

Description

Convert the output of multMixNRMI into a coda mcmc object

Usage

convert_to_mcmc(fitlist, thinning_to = 1000, ncores = parallel::detectCores())

Arguments

fitlist

Output of multMixNRMI.

thinning_to

Final length of the chain after thinning.

ncores

Specify the number of cores to use in the conversion

Value

a coda::mcmc object


Conditional predictive ordinate function

Description

This function computes conditional predictive ordinates for each data point.

Usage

cpo(object, ...)

cpo(object, ...)

Arguments

object

A fit obtained through one of the NRMI functions

...

Details

For internal use.

Value

A vector of Conditional Predictive Ordinates (CPOs)

Examples


## The function is currently defined as
function(obj) {
  fx <- obj$fx
  cpo <- 1 / apply(1 / fx, 1, mean)
  return(cpo)
}

Extract the Conditional Predictive Ordinates (CPOs) from a fitted object

Description

Extract the Conditional Predictive Ordinates (CPOs) from a fitted object

Usage

## S3 method for class 'NRMI1'
cpo(object, ...)

Arguments

object

A fit obtained through from the functions MixNRMI1/MixNRMI1cens

...

Further arguments to be passed to generic function, ignored at the moment

Value

A vector of Conditional Predictive Ordinates (CPOs)

Examples

data(acidity)
out <- MixNRMI1(acidity, Nit = 50)
cpo(out)

Extract the Conditional Predictive Ordinates (CPOs) from a fitted object

Description

Extract the Conditional Predictive Ordinates (CPOs) from a fitted object

Usage

## S3 method for class 'NRMI2'
cpo(object, ...)

Arguments

object

A fit obtained through from the function MixNRMI2/MixNRMI2cens

...

Further arguments to be passed to generic function, ignored at the moment

Value

A vector of Conditional Predictive Ordinates (CPOs)

Examples

data(acidity)
out <- MixNRMI2(acidity, Nit = 50)
cpo(out)

Extract the Conditional Predictive Ordinates (CPOs) from a list of fitted objects

Description

This function assumes that all chains have the same size. To allow for different chain sizes, care should be paid to proper weighting.

Usage

## S3 method for class 'multNRMI'
cpo(object, ...)

Arguments

object

A fit obtained through from the functions MixNRMI1/MixNRMI1cens

...

Further arguments to be passed to generic function, ignored at the moment

Value

A vector of Conditional Predictive Ordinates (CPOs)

Examples

data(acidity)
out <- multMixNRMI1(acidity, parallel = TRUE, Nit = 10, ncores = 2)
cpo(out)

Density half Cauchy

Description

Computes the density.

Usage

dhalfcauchy(x, location = 0, scale = 1)

Details

For internal use

Examples


## The function is currently defined as
function(x, location = 0, scale = 1) {
  ifelse(x < 0, 0, 1) * dcauchy(x, location, scale) / (1 - pcauchy(
    0,
    location, scale
  ))
}

Density half normal

Description

Computes the density.

Usage

dhalfnorm(x, mean = 0, sd = 1)

Details

For internal use

Examples


## The function is currently defined as
function(x, mean = 0, sd = 1) {
  ifelse(x < 0, 0, 1) * dnorm(x, mean, sd) / (1 - pnorm(
    0, mean,
    sd
  ))
}

Density half Student-t

Description

Computes the density.

Usage

dhalft(x, df = 1, mean = 0, sd = 1)

Details

For internal use

Examples


## The function is currently defined as
function(x, df = 1, mean = 0, sd = 1) {
  ifelse(x < 0, 0, 1) * dt_(x, df, mean, sd) / (1 - pt_(
    0, df,
    mean, sd
  ))
}

Convert distribution names to indices

Description

Convert distribution names to indices

Usage

dist_name_k_index_converter(distname)

Arguments

distname

a character representing the distribution name. Allowed names are "normal", "gamma", "beta", "exponential", "double exponential", "lognormal", "half-Cauchy", "half-normal", "half-student", "uniform" and "truncated normal", or their common abbreviations "norm", "exp", "lnorm", "halfcauchy", "halfnorm", "halft" and "unif".

Value

an index describing the distribution. 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal, 6 = Half-Cauchy, 7 = Half-normal, 8 = Half-Student, 9 = Uniform, 10 = Truncated normal


Kernel density function

Description

This functions evaluates a density at a certain data point. There are 4 density options (1 (normal), 2 (gamma), 3 (beta), 4 (exponential), 5 (lognormal), 6 (half-Cauchy), 7 (half-normal), 8 (half-student), 9 (uniform) and 10 (truncated normal)). All densities are parameterized in terms of mean and standard deviation.

Usage

dk(x, distr = NULL, mu = NULL, sigma = NULL)

Details

For internal use.


Density of the chosen kernel

Description

Computes likelihood contribution for censored data.

Usage

dkcens2(xleft, xright, c_code_filters, distr = NULL, mu = NULL, sigma = NULL)

Details

For internal use

Examples


## The function is currently defined as
function(xleft, xright, c_code_filters, distr = NULL, mu = NULL,
         sigma = NULL) {
  res <- seq_along(xleft)
  res[c_code_filters[["1"]]] <- dk(
    x = xleft[c_code_filters[["1"]]],
    distr, mu, sigma
  )
  res[c_code_filters[["2"]]] <- pk(
    xright[c_code_filters[["2"]]],
    distr, mu, sigma
  )
  res[c_code_filters[["0"]]] <- 1 - pk(
    xleft[c_code_filters[["0"]]],
    distr, mu, sigma
  )
  res[c_code_filters[["3"]]] <- pk(
    xright[c_code_filters[["3"]]],
    distr, mu, sigma
  ) - pk(
    xleft[c_code_filters[["3"]]],
    distr, mu, sigma
  )
  return(res)
}

Density evaluation once

Description

Computes the likelihood contribution for one data point in the case of censoring.

Usage

dkcens2_1val(xleft, xright, c_code, distr = NULL, mu = NULL, sigma = NULL)

Details

For internal use

Examples


## The function is currently defined as
function(xleft, xright, c_code, distr = NULL, mu = NULL, sigma = NULL) {
  if (c_code == 1) {
    dk(x = xleft, distr, mu, sigma)
  } else if (c_code == 2) {
    pk(xright, distr, mu, sigma)
  } else if (c_code == 0) {
    1 - pk(xleft, distr, mu, sigma)
  } else if (c_code == 3) {
    pk(xright, distr, mu, sigma) - pk(xleft, distr, mu, sigma)
  } else {
    stop("Wrong integer code for censored data")
  }
}

Non-standard student-t density

Description

Computes the density.

Usage

dt_(x, df, mean, sd)

Arguments

x

Numeric vector. Data set to which the density is evaluated.

df

Numeric constant. Degrees of freedom (> 0, maybe non-integer)

mean

Numeric constant. Location parameter.

sd

Positive numeric constant. Scale parameter.

## The function is currently defined as function(x, df, mean, sd) dt((x - mean) / sd, df, ncp = 0) / sd

Details

For internal use


Density truncated normal

Description

Computes the density.

Usage

dtnorm(x, mean = 0, sd = 1, lower = -Inf, upper = Inf, log = FALSE)

Details

For internal use

Note

Taken from msm R-package.

Author(s)

C. H. Jackson

References

Taken from

Examples


## The function is currently defined as
function(x, mean = 0, sd = 1, lower = -Inf, upper = Inf, log = FALSE) {
  ret <- numeric(length(x))
  ret[x < lower | x > upper] <- if (log) {
    -Inf
  } else {
    0
  }
  ret[upper < lower] <- NaN
  ind <- x >= lower & x <= upper
  if (any(ind)) {
    denom <- pnorm(upper, mean, sd) - pnorm(
      lower, mean,
      sd
    )
    xtmp <- dnorm(x, mean, sd, log)
    if (log) {
      xtmp <- xtmp - log(denom)
    } else {
      xtmp <- xtmp / denom
    }
    ret[x >= lower & x <= upper] <- xtmp[ind]
  }
  ret
}

Enzyme Dataset

Description

Concerns the distribution of enzymatic activity in the blood, for an enzyme involved in the metabolism of carcinogenetic substances, among a group of 245 unrelated individuals.

Format

A data frame with 244 observations on the following variable:

list("enzyme")

A numeric vector.

References

Bechtel, Y. C., Bonaiti-Pellie, C., Poisson, N., Magnette, J. and Bechtel, P.R. (1993). A population and family study of N-acetyltransferase using caffeine urinary metabolites. Clin. Pharm. Therp., 54, 134-141.

Examples


data(enzyme)
hist(enzyme)

Computes the expected number of components for a Dirichlet process.

Description

Computes the expected number of components for a Dirichlet process.

Usage

expected_number_of_components_Dirichlet(
  n,
  Alpha,
  ntrunc = NULL,
  silence = TRUE
)

Arguments

n

Number of data points

Alpha

Numeric constant. Total mass of the centering measure.

ntrunc

Level of truncation when computing the expectation. Defaults to n. If greater than n, it is fixed to n.

silence

Boolean. Whether to print the current calculation step for the Stable process, as the function can be long

Value

A real value which approximates the expected number of components

Reference: P. De Blasi, S. Favaro, A. Lijoi, R. H. Mena, I. Prünster, and M. Ruggiero, “Are Gibbs-type priors the most natural generalization of the Dirichlet process?,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 2, pp. 212–229, 2015.

Examples


expected_number_of_components_Dirichlet(100, 1.2)

Computes the expected number of components for a stable process.

Description

Computes the expected number of components for a stable process.

Usage

expected_number_of_components_stable(n, Gama, ntrunc = NULL)

Arguments

n

Number of data points

Gama

Numeric constant. 0 <= Gama <=1.

ntrunc

Level of truncation when computing the expectation. Defaults to n. If greater than n, it is fixed to n.

Value

A real value of type mpfr1 which approximates the expected number of components

In spite of the high precision arithmetic packages used for in function, it can be numerically unstable for small values of Gama. This is because evaluating a sum with alternated signs, in the generalized factorial coefficients, is tricky. Reference: P. De Blasi, S. Favaro, A. Lijoi, R. H. Mena, I. Prünster, and M. Ruggiero, “Are gibbs-type priors the most natural generalization of the Dirichlet process?,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 2, pp. 212–229, 2015.

Examples


expected_number_of_components_stable(100, 0.8)

Conditional density evaluation in the semiparametric model

Description

This function evaluates a density path conditionally on a posterior realization of the normalized measure.

Usage

fcondXA(x, distr, Tau, J, sigma)

Details

For internal use.

Examples


## The function is currently defined as
function(x, distr = 1, Tau, J, sigma) {
  pJ <- J / sum(J)
  K <- matrix(NA, nrow = length(Tau), ncol = length(x))
  for (i in seq(Tau)) {
    K[i, ] <- dk(x, distr = distr, mu = Tau[i], sigma = sigma)
  }
  fcondXA <- apply(K, 2, function(x) sum(x * pJ))
  return(fcondXA)
}

Conditional density evaluation in the fully nonparametric model

Description

This function evaluates a density path conditionally on a posterior realization of the normalized measure.

Usage

fcondXA2(x, distr, Tauy, Tauz, J)

Details

For internal use.

Examples


## The function is currently defined as
function(x, distr = 1, Tauy, Tauz, J) {
  pJ <- J / sum(J)
  K <- matrix(NA, nrow = length(Tauy), ncol = length(x))
  for (i in seq(Tauy)) {
    K[i, ] <- dk(x, distr = distr, mu = Tauy[i], sigma = Tauz[i])
  }
  fcondXA2 <- apply(K, 2, function(x) sum(x * pJ))
  return(fcondXA2)
}

Conditional density evaluation in the fully nonparametric model for censored data

Description

This function evaluates a density path conditionally on a posterior realization of the normalized measure.

Usage

fcondXA2cens2(xleft, xright, censor_code_filters, distr, Tauy, Tauz, J)

Details

For internal use

Examples


## The function is currently defined as
function(xleft, xright, censor_code_filters, distr, Tauy, Tauz,
         J) {
  pJ <- J / sum(J)
  K <- matrix(NA, nrow = length(Tauy), ncol = length(xleft))
  for (i in seq(Tauy)) {
    K[i, ] <- dkcens2(
      xleft = xleft, xright = xright, c_code_filters = censor_code_filters,
      distr = distr, mu = Tauy[i], sigma = Tauz[i]
    )
  }
  fcondXA2cens <- apply(K, 2, function(x) sum(x * pJ))
  return(fcondXA2cens)
}

Conditional posterior distribution of the latents Y

Description

This function simulates form the conditional posterior distribution of the latents Y.

Usage

fcondYXA(x, distr, Tau, J, sigma)

Details

For internal use.

Examples


## The function is currently defined as
function(x, distr = 1, Tau, J, sigma) {
  K <- matrix(NA, nrow = length(Tau), ncol = length(x))
  for (i in seq(Tau)) {
    K[i, ] <- dk(x, distr = distr, mu = Tau[i], sigma = sigma) *
      J[i]
  }
  pK <- prop.table(K, margin = 2)
  y <- apply(pK, 2, function(x) sample(Tau, size = 1, prob = x))
  return(y)
}

Conditional posterior distribution of the latents Y in the censoring case

Description

This function simulates form the conditional posterior distribution of the latents Y.

Usage

fcondYXAcens2(xleft, xright, censor_code_filters, distr, Tau, J, sigma)

Details

For internal use

Examples


## The function is currently defined as
function(xleft, xright, censor_code_filters, distr, Tau, J,
         sigma) {
  K <- matrix(NA, nrow = length(Tau), ncol = length(xleft))
  for (i in seq(Tau)) {
    K[i, ] <- dkcens2(
      xleft = xleft, xright = xright, c_code_filters = censor_code_filters,
      distr = distr, mu = Tau[i], sigma = sigma
    ) * J[i]
  }
  pK <- prop.table(K, margin = 2)
  y <- apply(pK, 2, function(x) sample(Tau, size = 1, prob = x))
  return(y)
}

Conditional posterior distribution of the bivariate latents (Y,Z)

Description

This function simulates form the conditional posterior distribution of the latents (Y,Z).

Usage

fcondYZXA(x, distr, Tauy, Tauz, J)

Details

For internal use.

Examples


## The function is currently defined as
function(x, distr = 1, Tauy, Tauz, J) {
  K <- matrix(NA, nrow = length(Tauy), ncol = length(x))
  for (i in seq(Tauy)) {
    K[i, ] <- dk(x, distr = distr, mu = Tauy[i], sigma = Tauz[i]) *
      J[i]
  }
  if (any(is.na(K))) {
    print(K, Tauy, Tauz, J)
  }
  pK <- prop.table(K, margin = 2)
  j <- apply(pK, 2, function(x) {
    sample(length(Tauy),
      size = 1,
      prob = x
    )
  })
  return(matrix(c(y = Tauy[j], z = Tauz[j]),
    nrow = length(x),
    ncol = 2
  ))
}

Conditional posterior distribution of the bivariate latents (Y,Z) in the case of censoring

Description

This function simulates form the conditional posterior distribution of the latents (Y,Z).

Usage

fcondYZXAcens2(xleft, xright, censor_code_filters, distr, Tauy, Tauz, J)

Details

For internal use

Examples


## The function is currently defined as
function(xleft, xright, censor_code_filters, distr, Tauy, Tauz,
         J) {
  K <- matrix(NA, nrow = length(Tauy), ncol = length(xleft))
  for (i in seq(Tauy)) {
    K[i, ] <- dkcens2(xleft, xright,
      c_code_filters = censor_code_filters,
      distr = distr, mu = Tauy[i], sigma = Tauz[i]
    ) * J[i]
  }
  if (any(is.na(K))) {
    print(K, Tauy, Tauz, J)
  }
  pK <- prop.table(K, margin = 2)
  j <- apply(pK, 2, function(x) {
    sample(length(Tauy),
      size = 1,
      prob = x
    )
  })
  return(matrix(c(y = Tauy[j], z = Tauz[j]),
    nrow = length(xleft),
    ncol = 2
  ))
}

Repeat the common scale parameter of a semiparametric model to match the dimension of the location parameters.

Description

Repeat the common scale parameter of a semiparametric model to match the dimension of the location parameters.

Usage

fill_sigmas(semiparametric_fit)

Arguments

semiparametric_fit

The result of the fit, obtained through the function MixNRMI1.

Value

an adequate list of vectors of sigmas


Galaxy Data Set

Description

Velocities of 82 galaxies diverging from our own galaxy.

Format

A data frame with 82 observations on the following variable:

list("velocity")

A numeric vector.

References

Roeder, K. (1990) "Density estimation with confidence sets exemplified by superclusters and voids in the galaxies". Journal of the American Statistical Association. 85, 617-624.

Examples


data(galaxy)
hist(galaxy)

Gives the kernel name from the integer code

Description

This function is used in the print methods for MixNRMI1, MixNRMI2, MixNRMI1cens, MixNRMI2cens, and all the multMixNRMIx versions

Usage

give_kernel_name(distr.k)

Arguments

distr.k

The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.

Value

A character with the name of the distribution used as the kernel

Examples

BNPdensity:::give_kernel_name(4)

Create a plotting grid from censored or non-censored data.

Description

Create a plotting grid from censored or non-censored data.

Usage

grid_from_data(data, npoints = 100)

Arguments

data

Input data from which to compute the grid.

npoints

Number of points on the grid.

Value

a vector containing the plotting grid


Create a plotting grid from censored data.

Description

Create a plotting grid from censored data.

Usage

grid_from_data_censored(data, npoints = 100)

Arguments

data

Censored input data from which to compute the grid.

npoints

Number of points on the grid.

Value

a vector containing the plotting grid


Create a plotting grid from non-censored data.

Description

Create a plotting grid from non-censored data.

Usage

grid_from_data_noncensored(data, npoints = 100)

Arguments

data

Non-censored input data from which to compute the grid.

npoints

Number of points on the grid.

Value

a vector containing the plotting grid


Conditional posterior distribution of latent U

Description

This function simulates from the conditional posterior distribution of the latent U.

Usage

gs3(ut, n, r, alpha, kappa, gama, delta)

Details

For internal use.

Examples


## The function is currently defined as
function(ut, n = 200, r = 20, alpha = 1, kappa = 1, gama = 1 / 2,
         delta = 2) {
  w <- ut
  ratio <- NaN
  while (is.nan(ratio)) {
    v <- ustar <- rgamma(1, shape = delta, rate = delta / ut)
    vw <- v / w
    vb <- v + kappa
    wb <- w + kappa
    A <- vw^(n - 2 * delta)
    B <- (vb / wb)^(r * gama - n)
    D <- vb^gama - wb^gama
    E <- 1 / vw - vw
    ratio <- A * B * exp(-alpha / gama * D - delta * E)
  }
  p <- min(1, ratio)
  u <- ifelse(runif(1) <= p, ustar, ut)
  return(u)
}

Conditional posterior distribution of latent U

Description

This function simulates from the conditional posterior distribution of the latent U, with an adaptive proposal

Usage

gs3_adaptive3(ut, n, r, alpha, kappa, gama, delta, U, iter, adapt = FALSE)

Conditional posterior distribution of latent logU

Description

This function simulates from the conditional posterior distribution of a log transformation of the latent U.

Usage

gs3_log(logut, n, r, alpha, kappa, gama, delta)

Arguments

logut

Real, log of the latent variable U at the current iteration.

n

Integer, number of data points.

r

Integer, number of clusters.

alpha

Positive real. Total mass of the centering measure.

kappa

Positive real. A parameter of the NRMI process.

gama

Real. 0\leq \texttt{gama} \leq 1. See details.

delta

Scale of the Metropolis-Hastings proposal distribution


Resampling Ystar function

Description

This function resamples the distinct Ystar in the semiparametric model.

Usage

gs4(ystar, x, idx, distr.k, sigma.k, distr.p0, mu.p0, sigma.p0)

Details

For internal use.

Examples


## The function is currently defined as
function(ystar, x, idx, distr.k, sigma.k, distr.p0, mu.p0, sigma.p0) {
  r <- length(ystar)
  nstar <- as.numeric(table(idx))
  for (j in seq(r)) {
    id <- which(!is.na(match(idx, j)))
    xj <- x[id]
    xbar <- sum(xj) / nstar[j]
    y2star <- rk(1, distr = distr.k, mu = xbar, sigma = sigma.k / sqrt(nstar[j]))
    f.ratio <- rfyzstar(y2star, ystar[j], xj,
      distr = distr.k, sigma = sigma.k,
      distr.p0 = distr.p0, mu.p0 = mu.p0, sigma.p0 = sigma.p0
    )
    k.ratio <- dk(ystar[j],
      distr = distr.k,
      mu = xbar, sigma = sigma.k / sqrt(nstar[j])
    ) / dk(y2star,
      distr = distr.k, mu = xbar, sigma = sigma.k / sqrt(nstar[j])
    )
    q2 <- min(1, f.ratio * k.ratio)
    ystar[j] <- ifelse(runif(1) <= q2, y2star, ystar[j])
  }
  return(ystar)
}

Resampling Ystar function in the case of censoring

Description

This function resamples the distinct Ystar in the semiparametric model.

Usage

gs4cens2(
  ystar,
  xleft,
  xright,
  censor_code,
  idx,
  distr.k,
  sigma.k,
  distr.p0,
  mu.p0,
  sigma.p0
)

Details

For internal use

Examples


## The function is currently defined as
function(ystar, xleft, xright, censor_code, idx, distr.k, sigma.k,
         distr.p0, mu.p0, sigma.p0) {
  r <- length(ystar)
  nstar <- as.numeric(table(idx))
  for (j in seq(r)) {
    id <- which(!is.na(match(idx, j)))
    xjleft <- xleft[id]
    xjright <- xright[id]
    xbar <- 0.5 * sum(xjleft + xjright, na.rm = T) / nstar[j]
    y2star <- rk(1, distr = distr.k, mu = xbar, sigma = sigma.k / sqrt(nstar[j]))
    f.ratio <- rfystarcens2(
      v = y2star, v2 = ystar[j], xleft = xjleft,
      xright = xjright, censor_code = censor_code[id],
      distr.k = distr.k, sigma.k = sigma.k, distr.p0 = distr.p0,
      mu.p0 = mu.p0, sigma.p0 = sigma.p0
    )
    k.ratio <- dk(ystar[j], distr = distr.k, mu = xbar, sigma = sigma.k / sqrt(nstar[j])) /
      dk(y2star,
        distr = distr.k, mu = xbar, sigma = sigma.k / sqrt(nstar[j])
      )
    if (!is.nan(f.ratio * k.ratio)) {
      q2 <- min(1, f.ratio * k.ratio)
      ystar[j] <- ifelse(runif(1) <= q2, y2star, ystar[j])
    }
  }
  return(ystar)
}

Conditional posterior distribution of sigma

Description

This function simulates from the conditional posterior distribution of sigma.

Usage

gs5(sigma, x, y, distr, asigma, bsigma, delta)

Details

For internal use.

Examples


## The function is currently defined as
function(sigma, x, y, distr = 1, asigma = 1, bsigma = 2, delta = 4) {
  sigmaStar <- rgamma(1, shape = delta, rate = delta / sigma)
  sigmaT <- sigma
  qgammas <- sigmaT / sigmaStar
  Qgammas <- sigmaStar / sigmaT
  Term2 <- qgammas^(2 * delta - 1) * exp(-delta * (qgammas -
    Qgammas))
  Kgamma <- Qgammas^(asigma - 1) * exp(-bsigma * (sigmaStar -
    sigmaT))
  Prod <- 1
  for (i in seq(length(x))) {
    Prod <- Prod * (dk(x[i], distr = distr, mu = y[i], sigma = sigmaStar) / dk(x[i],
      distr = distr, mu = y[i], sigma = sigmaT
    ))
  }
  q3 <- min(1, Kgamma * Prod * Term2)
  sigma <- ifelse(runif(1) <= q3, sigmaStar, sigmaT)
  return(sigma)
}

Conditional posterior distribution of sigma in the case of censoring

Description

This function simulates from the conditional posterior distribution of sigma.

Usage

gs5cens2(
  sigma,
  xleft,
  xright,
  censor_code,
  y,
  distr = 1,
  asigma = 1,
  bsigma = 2,
  delta = 4
)

Details

For internal use

Examples


## The function is currently defined as
function(sigma, xleft, xright, censor_code, y, distr = 1, asigma = 1,
         bsigma = 2, delta = 4) {
  sigmaStar <- rgamma(1, shape = delta, rate = delta / sigma)
  sigmaT <- sigma
  qgammas <- sigmaT / sigmaStar
  Qgammas <- sigmaStar / sigmaT
  Term2 <- qgammas^(2 * delta - 1) * exp(-delta * (qgammas -
    Qgammas))
  Kgamma <- Qgammas^(asigma - 1) * exp(-bsigma * (sigmaStar -
    sigmaT))
  Prod <- 1
  for (i in seq_along(xleft)) {
    Prod <- Prod * dkcens2_1val(
      xleft = xleft[i], xright = xright[i],
      c_code = censor_code[i], distr = distr, mu = y[i],
      sigma = sigmaStar
    ) / dkcens2_1val(
      xleft = xleft[i],
      xright = xright[i], c_code = censor_code[i], distr = distr,
      mu = y[i], sigma = sigmaT
    )
  }
  q3 <- min(1, Kgamma * Prod * Term2)
  sigma <- ifelse(runif(1) <= q3, sigmaStar, sigmaT)
  return(sigma)
}

Updates the hyper-parameters of py0

Description

This function updates the hyper-parameters of the centering distribution py0.

Usage

gsHP(ystar, rstar, distr)

Details

For internal use.

Examples


## The function is currently defined as
function(ystar, rstar, distr) {
  if (distr == 1) {
    mu0 <- 0
    s0 <- 0.01
    q1 <- 0.1
    q2 <- 0.1
    a <- q1 + rstar / 2
    b <- q2 + (rstar - 1) * var(ystar) / 2 + s0 * rstar * (mean(ystar) - mu0)^2 / 2 / (s0 + rstar)
    t2 <- rgamma(1, shape = a, rate = b)
    a <- (s0 * mu0 + sum(ystar)) / (s0 + rstar)
    b <- (s0 + rstar) * t2
    t1 <- rnorm(1, mean = a, sd = 1 / sqrt(b))
    mu.py0 <- t1
    sigma.py0 <- 1 / sqrt(t2)
  }
  else if (distr == 2) {
    q1 <- 0.01
    q2 <- 0.01
    t1 <- rgamma(1, shape = q1 + rstar, rate = q2 + sum(ystar))
    mu.py0 <- sigma.py0 <- 1 / t1
  }
  else if (distr == 3) {
    q1 <- 0.01
    q2 <- 0.01
    t1 <- rgamma(1, shape = q1 + rstar, rate = q2 - sum(log(ystar)))
    mu.py0 <- t1 / (t1 + 1)
    sigma.py0 <- sqrt(t1 / (t1 + 1)^2 / (t1 + 2))
  }
  else {
    stop("Argument \"distr\" should be defined numeric with possible values 1,2 or 3")
  }
  return(list(mu.py0 = mu.py0, sigma.py0 = sigma.py0))
}

Jointly resampling Ystar and Zstar function

Description

This function resamples jointly the distinct pairs (Ystar,Zstar) in the fully nonparametric model.

Usage

gsYZstar(
  ystar,
  zstar,
  nstar,
  rstar,
  idx,
  x,
  delta,
  kappa,
  distr.k,
  distr.py0,
  mu.py0,
  sigma.py0,
  distr.pz0,
  mu.pz0,
  sigma.pz0
)

Details

For internal use.

Examples


## The function is currently defined as
function(ystar, zstar, nstar, rstar, idx, x, delta, kappa, distr.k,
         distr.py0, mu.py0, sigma.py0, distr.pz0, mu.pz0, sigma.pz0) {
  for (j in seq(rstar)) {
    flag <- 1
    while (flag == 1) {
      id <- which(!is.na(match(idx, j)))
      xj <- x[id]
      xbar <- sum(xj) / nstar[j]
      z2star <- rk(1, distr = distr.pz0, mu = zstar[j], sigma = zstar[j] / sqrt(delta))
      y2star <- rk(1, distr = distr.py0, mu = xbar, sigma = kappa * z2star / sqrt(nstar[j]))
      f.ratio <- rfyzstar(y2star, ystar[j], z2star, zstar[j], xj,
        distr.k = distr.k,
        distr.py0 = distr.py0, mu.py0 = mu.py0, sigma.py0 = sigma.py0,
        distr.pz0 = distr.pz0, mu.pz0 = mu.pz0, sigma.pz0 = sigma.pz0
      )
      k.ratioNum <- dk(zstar[j],
        distr = distr.pz0, mu = z2star,
        sigma = z2star / sqrt(delta)
      )
      k.ratioDen <- dk(z2star,
        distr = distr.pz0, mu = zstar[j],
        sigma = zstar[j] / sqrt(delta)
      )
      k.ratio <- k.ratioNum / k.ratioDen
      k.ratioNum <- dk(ystar[j],
        distr = distr.py0, mu = xbar,
        sigma = kappa * zstar[j] / sqrt(nstar[j])
      )
      k.ratioDen <- dk(y2star,
        distr = distr.py0, mu = xbar,
        sigma = kappa * z2star / sqrt(nstar[j])
      )
      k.ratio <- k.ratio * k.ratioNum / k.ratioDen
      q2 <- min(1, f.ratio * k.ratio)
      if (is.na(q2)) {
        flag <- 1
      } else {
        if (runif(1) <= q2) {
          ystar[j] <- y2star
          zstar[j] <- z2star
          flag <- 0
        }
      }
    }
  }
  return(list(ystar = ystar, zstar = zstar))
}

Jointly resampling Ystar and Zstar function in the case of censoring

Description

This function resamples jointly the distinct pairs (Ystar,Zstar) in the fully nonparametric model.

Usage

gsYZstarcens2(
  ystar,
  zstar,
  nstar,
  rstar,
  idx,
  xleft,
  xright,
  censor_code,
  delta,
  kappa,
  distr.k,
  distr.py0,
  mu.py0,
  sigma.py0,
  distr.pz0,
  mu.pz0,
  sigma.pz0
)

Details

For internal use

Examples


## The function is currently defined as
function(ystar, zstar, nstar, rstar, idx, xleft, xright, censor_code,
         delta, kappa, distr.k, distr.py0, mu.py0, sigma.py0, distr.pz0,
         mu.pz0, sigma.pz0) {
  for (j in seq(rstar)) {
    flag <- 1
    while (flag == 1) {
      id <- which(!is.na(match(idx, j)))
      xjleft <- xleft[id]
      xjright <- xright[id]
      xbar <- 0.5 * sum(xjleft + xjright, na.rm = T) / nstar[j]
      z2star <- rk(1,
        distr = distr.pz0, mu = zstar[j],
        sigma = zstar[j] / sqrt(delta)
      )
      y2star <- rk(1, distr = distr.py0, mu = xbar, sigma = kappa *
        z2star / sqrt(nstar[j]))
      f.ratio <- rfyzstarcens2(
        v = y2star, v2 = ystar[j],
        z = z2star, z2 = zstar[j], xleft = xjleft, xright = xjright,
        censor_code = censor_code[id], distr.k = distr.k,
        distr.py0 = distr.py0, mu.py0 = mu.py0, sigma.py0 = sigma.py0,
        distr.pz0 = distr.pz0, mu.pz0 = mu.pz0, sigma.pz0 = sigma.pz0
      )
      k.ratioNum <- dk(zstar[j],
        distr = distr.pz0, mu = z2star,
        sigma = z2star / sqrt(delta)
      )
      k.ratioDen <- dk(z2star,
        distr = distr.pz0, mu = zstar[j],
        sigma = zstar[j] / sqrt(delta)
      )
      k.ratio <- k.ratioNum / k.ratioDen
      k.ratioNum <- dk(ystar[j],
        distr = distr.py0, mu = xbar,
        sigma = kappa * zstar[j] / sqrt(nstar[j])
      )
      k.ratioDen <- dk(y2star,
        distr = distr.py0, mu = xbar,
        sigma = kappa * z2star / sqrt(nstar[j])
      )
      k.ratio <- k.ratio * k.ratioNum / k.ratioDen
      q2 <- min(1, f.ratio * k.ratio)
      if (is.na(q2)) {
        flag <- 1
      }
      else {
        flag <- 0
        if (runif(1) <= q2) {
          ystar[j] <- y2star
          zstar[j] <- z2star
        }
      }
    }
  }
  return(list(ystar = ystar, zstar = zstar))
}

Test if the data is censored

Description

Test if the data is censored

Usage

is_censored(dat)

Arguments

dat

The dataset to be tested

Value

TRUE if the data is censored

Examples


data(salinity)
BNPdensity:::is_censored(salinity)

Tests if a fit is a semi parametric or nonparametric model.

Description

Tests if a fit is a semi parametric or nonparametric model.

Usage

is_semiparametric(fit)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1 or MixNRMI2.

Value

TRUE if the fit is a semiparametric model

Examples


set.seed(150520)
data(acidity)
x <- enzyme
out <- MixNRMI1(enzyme, extras = TRUE, Nit = 10)
BNPdensity:::is_semiparametric(out)

Metropolis-Hastings ratio for the conditional of logU

Description

This function computes the Metropolis-Hastings ratio to decide whether to accept or reject a new value for logU.

Usage

logacceptance_ratio_logu(logu, logu_prime, n, r, gamma, kappa, a, delta)

Arguments

logu

Real, log of the latent variable U at the current iteration.

logu_prime

Real, log of the new proposed latent variable U.

n

Integer, number of data points.

r

Integer, number of clusters.

kappa

Positive real. A parameter of the NRMI process.

a

Positive real. Total mass of the centering measure.

delta

Scale of the Metropolis-Hastings proposal distribution


Contribution of the proposal kernel logdensity to the Metropolis-Hastings ratio

Description

Contribution of the proposal kernel logdensity to the Metropolis-Hastings ratio

Usage

logdprop_logu(logu_prime, logu, delta)

Contribution of the target logdensity of logU to the Metropolis-Hastings ratio

Description

Contribution of the target logdensity of logU to the Metropolis-Hastings ratio

Usage

logf_logu_cond_y(logu, n, r, gamma, kappa, a)

Target logdensity of U given the data

Description

Target logdensity of U given the data

Usage

logf_u_cond_y(u, n, r, gamma, kappa, a)

Multiple chains of MixNRMI1

Description

Multiple chains of MixNRMI1

Usage

multMixNRMI1(
  x,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.p0 = "normal",
  asigma = 0.5,
  bsigma = 0.5,
  delta_S = 3,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE,
  nchains = 4,
  parallel = TRUE,
  ncores = parallel::detectCores()
)

Arguments

x

Numeric vector. Data set to which the density is fitted.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See details.

Kappa

Numeric positive constant. See details.

Gama

Numeric constant. 0\leq \texttt{Gama} \leq 1. See details.

distr.k

The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.

distr.p0

The distribution name for the centering measure. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure: 1 = Normal; 2 = Gamma; 3 = Beta.

asigma

Numeric positive constant. Shape parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.

bsigma

Numeric positive constant. Rate parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.

delta_S

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling sigma.

delta_U

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U.

Meps

Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply larger number of jumps.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means, weights and Js.

adaptive

Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).

nchains

The number of chains to run.

parallel

Whether to run the chains in parallel. Only works on UNIX-like systems as it rests on Fork parallelism

ncores

Number of cores for the parallel run. Defaults to parallel::detectCores(), i.e. the maximum number of cores detected by R on your system.

Value

a list containing the multiple fits.

See Also

MixNRMI2, MixNRMI1cens, MixNRMI2cens

Examples


data(acidity)
multMixNRMI1(acidity, parallel = TRUE, Nit = 10, ncores = 2)

Multiple chains of MixNRMI1cens

Description

Multiple chains of MixNRMI1cens

Usage

multMixNRMI1cens(
  xleft,
  xright,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.p0 = "normal",
  asigma = 0.5,
  bsigma = 0.5,
  delta_S = 3,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE,
  nchains = 4,
  parallel = TRUE,
  ncores = parallel::detectCores()
)

Arguments

xleft

Numeric vector. Lower limit of interval censoring. For exact data the same as xright

xright

Numeric vector. Upper limit of interval censoring. For exact data the same as xleft.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See details.

Kappa

Numeric positive constant. See details.

Gama

Numeric constant. 0\leq \texttt{Gama} \leq 1. See details.

distr.k

The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.

distr.p0

The distribution name for the centering measure. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure: 1 = Normal; 2 = Gamma; 3 = Beta.

asigma

Numeric positive constant. Shape parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.

bsigma

Numeric positive constant. Rate parameter of the gamma prior on the standard deviation of the mixture kernel distr.k.

delta_S

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling sigma.

delta_U

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U.

Meps

Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply larger number of jumps.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means, weights and Js.

adaptive

Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).

nchains

The number of chains to run.

parallel

Whether to run the chains in parallel. Only works on UNIX-like systems as it rests on Fork parallelism

ncores

Number of cores for the parallel run. Defaults to parallel::detectCores(), i.e. the maximum number of cores detected by R on your system.

Value

a list containing the multiple fits.

See Also

MixNRMI2, MixNRMI1cens, MixNRMI2cens, multMixNRMI1

Examples


data(salinity)
multMixNRMI1cens(salinity$left, salinity$right, parallel = TRUE, Nit = 10, ncores = 2)

Multiple chains of MixNRMI2

Description

Multiple chains of MixNRMI2

Usage

multMixNRMI2(
  x,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.py0 = "normal",
  distr.pz0 = "gamma",
  mu.pz0 = 3,
  sigma.pz0 = sqrt(10),
  delta_S = 4,
  kappa = 2,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE,
  nchains = 4,
  parallel = FALSE,
  ncores = parallel::detectCores()
)

Arguments

x

Numeric vector. Data set to which the density is fitted.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See details.

Kappa

Numeric positive constant. See details.

Gama

Numeric constant. 0 \leq Gama \leq 1. See details.

distr.k

The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.

distr.py0

The distribution name for the centering measure for locations. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure for locations: 1 = Normal; 2 = Gamma; 3 = Beta.

distr.pz0

The distribution name for the centering measure for scales. Allowed names are "gamma", or an integer number identifying the centering measure for scales: 2 = Gamma. For more options use MixNRMI2cens.

mu.pz0

Numeric constant. Prior mean of the centering measure for scales.

sigma.pz0

Numeric constant. Prior standard deviation of the centering measure for scales.

delta_S

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the scales.

kappa

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the location parameters.

delta_U

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U. If 'adaptive=TRUE', 'delta_U'is the starting value for the adaptation.

Meps

Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply larger number of jumps.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means, sigmas, weights and Js.

adaptive

Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).

nchains

The number of chains to run.

parallel

Whether to run the chains in parallel. Only works on UNIX-like systems as it rests on Fork parallelism

ncores

Number of cores for the parallel run. Defaults to parallel::detectCores(), i.e. the maximum number of cores detected by R on your system.

Value

a list containing the multiple fits.

See Also

MixNRMI2, MixNRMI1cens, MixNRMI2cens, multMixNRMI1

Examples


data(acidity)
multMixNRMI2(acidity, parallel = TRUE, Nit = 10, ncores = 2)

Multiple chains of MixNRMI2cens

Description

Multiple chains of MixNRMI2cens

Usage

multMixNRMI2cens(
  xleft,
  xright,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.py0 = "normal",
  distr.pz0 = "gamma",
  mu.pz0 = 3,
  sigma.pz0 = sqrt(10),
  delta_S = 4,
  kappa = 2,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE,
  nchains = 4,
  parallel = TRUE,
  ncores = parallel::detectCores()
)

Arguments

xleft

Numeric vector. Lower limit of interval censoring. For exact data the same as xright

xright

Numeric vector. Upper limit of interval censoring. For exact data the same as xleft.

probs

Numeric vector. Desired quantiles of the density estimates.

Alpha

Numeric constant. Total mass of the centering measure. See details.

Kappa

Numeric positive constant. See details.

Gama

Numeric constant. 0 \leq Gama \leq 1. See details.

distr.k

The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.

distr.py0

The distribution name for the centering measure for locations. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure for locations: 1 = Normal; 2 = Gamma; 3 = Beta.

distr.pz0

The distribution name for the centering measure for scales. Allowed names are "gamma", "lognormal", "half-Cauchy", "half-normal", "half-student", "uniform" and "truncated normal", or their common abbreviations "norm", "exp", "lnorm", "halfcauchy", "halfnorm", "halft" and "unif", or an integer number identifying the centering measure for scales: 2 = Gamma, 5 = Lognormal, 6 = Half Cauchy, 7 = Half Normal, 8 = Half Student-t, 9 = Uniform, 10 = Truncated Normal.

mu.pz0

Numeric constant. Prior mean of the centering measure for scales.

sigma.pz0

Numeric constant. Prior standard deviation of the centering measure for scales.

delta_S

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the scales.

kappa

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the location parameters.

delta_U

Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U. If 'adaptive=TRUE', 'delta_U'is the starting value for the adaptation.

Meps

Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply larger number of jumps.

Nx

Integer constant. Number of grid points for the evaluation of the density estimate.

Nit

Integer constant. Number of MCMC iterations.

Pbi

Numeric constant. Burn-in period proportion of Nit.

epsilon

Numeric constant. Extension to the evaluation grid range. See details.

printtime

Logical. If TRUE, prints out the execution time.

extras

Logical. If TRUE, gives additional objects: means, sigmas, weights and Js.

adaptive

Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).

nchains

The number of chains to run.

parallel

Whether to run the chains in parallel. Only works on UNIX-like systems as it rests on Fork parallelism

ncores

Number of cores for the parallel run. Defaults to parallel::detectCores(), i.e. the maximum number of cores detected by R on your system.

Value

a list containing the multiple fits.

See Also

MixNRMI2, MixNRMI1cens, MixNRMI2cens, multMixNRMI1

Examples


data(salinity)
## Not run: 
multMixNRMI2cens(salinity$left, salinity$right, parallel = TRUE, Nit = 20, ncores = 2)

## End(Not run)


Centering function

Description

This function determines the centering density of the normalized random measure.

Usage

p0(x, distr = NULL, mu = NULL, sigma = NULL)

Details

For internal use.

Examples


## The function is currently defined as
function(x, distr = NULL, mu = NULL, sigma = NULL) {
  if (is.null(distr)) {
    stop("Argument \"distr\" should be defined numeric with possible values 1,2, or 3")
  }
  else if (distr == 1) {
    a <- ifelse(is.null(mu), 0, mu)
    b <- ifelse(is.null(sigma), 1, sigma)
    p0 <- dnorm(x, mean = a, sd = b)
  }
  else if (distr == 2) {
    a <- ifelse(is.null(mu), 1, mu^2 / sigma^2)
    b <- ifelse(is.null(sigma), 1, mu / sigma^2)
    p0 <- dgamma(x, shape = a, rate = b)
  }
  else if (distr == 3) {
    a <- ifelse(is.null(mu), 0.5, (1 - mu) * (mu / sigma)^2 -
      mu)
    b <- ifelse(is.null(sigma), 1 / sqrt(12), (mu * (1 - mu) / sigma^2 -
      1) * (1 - mu))
    if (any(c(a, b) <= 0)) {
      stop(paste(
        "\nNegative Beta parameters:\n a =", a,
        ";\t b =", b
      ))
    }
    p0 <- dbeta(x, shape1 = a, shape2 = b)
  }
  else {
    stop("Argument \"distr\" should be defined numeric with possible values 1,2, or 3")
  }
  return(p0)
}

Distribution function half Cauchy

Description

Computes the cdf.

Usage

phalfcauchy(q, location = 0, scale = 1)

Details

For internal use

Examples


## The function is currently defined as
function(q, location = 0, scale = 1) {
  ifelse(x < 0, 0, 1) * (pcauchy(q, location, scale) - pcauchy(
    0,
    location, scale
  )) / (1 - pcauchy(0, location, scale))
}

Distribution function half Normal

Description

Computes the cdf.

Usage

phalfnorm(q, mean = 0, sd = 1)

Details

For internal use

Examples


## The function is currently defined as
function(q, mean = 0, sd = 1) {
  ifelse(q < 0, 0, 1) * (pnorm(q, mean, sd) - pnorm(
    0, mean,
    sd
  )) / (1 - pnorm(0, mean, sd))
}

Distribution function half Student-t

Description

Computes the cumulative distribution function.

Usage

phalft(q, df = 1, mean = 0, sd = 1)

Details

For internal use

Examples


## The function is currently defined as
function(q, df = 1, mean = 0, sd = 1) {
  ifelse(x < 0, 0, 1) * (pt_(q, df, mean, sd) - pt_(
    0, df,
    mean, sd
  )) / (1 - pt_(0, df, mean, sd))
}

Kernel distribution function

Description

This functions evaluates the cumulative distribution function at a certain data point.

Usage

pk(q, distr = NULL, mu = NULL, sigma = NULL)

Details

For internal use

Examples


## The function is currently defined as
function(q, distr = NULL, mu = NULL, sigma = NULL) {
  if (is.null(distr)) {
    stop(msg)
  }
  else if (distr == 1) {
    a <- ifelse(is.null(mu), 0, mu)
    b <- ifelse(is.null(sigma), 1, sigma)
    pk <- pnorm(q, mean = a, sd = b)
  }
  else if (distr == 2) {
    a <- ifelse(is.null(mu), 1, mu^2 / sigma^2)
    b <- ifelse(is.null(sigma), 1, mu / sigma^2)
    pk <- pgamma(q, shape = a, rate = b)
  }
  else if (distr == 3) {
    a <- ifelse(is.null(mu), 0.5, (1 - mu) * (mu / sigma)^2 -
      mu)
    b <- ifelse(is.null(sigma), 1 / sqrt(12), (mu * (1 - mu) / sigma^2 -
      1) * (1 - mu))
    if (any(c(a, b) <= 0)) {
      stop(paste(
        "\nNegative Beta parameters:\n a =", a,
        ";\t b =", b
      ))
    }
    pk <- pbeta(q, shape1 = a, shape2 = b)
  }
  else if (distr == 4) {
    a <- ifelse(is.null(mu), 0, mu)
    b <- ifelse(is.null(sigma), 1 / sqrt(2), sigma / sqrt(2))
    pk <- ifelse(q < a, exp((q - a) / b) / 2, 1 - exp((a - q) / b) / 2)
  }
  else if (distr == 5) {
    a <- ifelse(is.null(mu), exp(1 / 2), log(mu / sqrt(1 + (sigma / mu)^2)))
    b <- ifelse(is.null(sigma), exp(1) * (exp(1) - 1), sqrt(log(1 +
      (sigma / y)^2)))
    pk <- plnorm(q, meanlog = a, sdlog = b)
  }
  else if (distr == 6) {
    pk <- phalfcauchy(q, location = ifelse(is.null(mu), 0,
      mu
    ), scale = ifelse(is.null(sigma), 1, sigma))
  }
  else if (distr == 7) {
    pk <- phalfnorm(q,
      mean = ifelse(is.null(mu), 0, mu),
      sd = ifelse(is.null(sigma), 1, sigma)
    )
  }
  else if (distr == 8) {
    pk <- phalft(q, df = 10, mean = ifelse(is.null(mu), 0,
      mu
    ), sd = ifelse(is.null(sigma), 1, sigma))
  }
  else if (distr == 9) {
    pk <- punif(q, min = ifelse(is.null(mu), 0, mu), max = ifelse(is.null(sigma),
      1, sigma
    ))
  }
  else if (distr == 10) {
    pk <- ptnorm(q, mean = ifelse(is.null(mu), 0, mu), sd = ifelse(is.null(sigma),
      1, sigma
    ), lower = 0.1)
  }
  else {
    stop(msg)
  }
  return(pk)
}

Plot the density estimate and the 95% credible interval

Description

The density estimate is the mean posterior density computed on the data points.

Usage

## S3 method for class 'NRMI1'
plot(x, ...)

Arguments

x

A fitted object of class NRMI1

...

Further arguments to be passed to generic function, ignored at the moment

Value

A graph with the density estimate, the 95% credible interval and a histogram of the data

Examples


## Example for non censored data

data(acidity)
out <- MixNRMI1(acidity, Nit = 50)
plot(out)

## Example for censored data

data(salinity)
out <- MixNRMI1cens(salinity$left, salinity$right, Nit = 50)
plot(out)

Plot the density estimate and the 95% credible interval

Description

The density estimate is the mean posterior density computed on the data points.

Usage

## S3 method for class 'NRMI2'
plot(x, ...)

Arguments

x

A fitted object of class NRMI2

...

Further arguments to be passed to generic function, ignored at the moment

Value

A graph with the density estimate, the 95% credible interval and a histogram of the data

Examples

## Example for non censored data

data(acidity)
out <- MixNRMI2(acidity, Nit = 20)
plot(out)

## Example for censored data

data(salinity)
out <- MixNRMI2cens(salinity$left, salinity$right, Nit = 20)
plot(out)

Plot the density estimate and the 95% credible interval

Description

Plot the density estimate and the 95% credible interval

Usage

## S3 method for class 'PY1'
plot(x, ...)

Arguments

x

A fitted object of class PY1

...

Further arguments to be passed to generic function, ignored at the moment

Value

A graph with the density estimate, the 95% credible interval and a histogram of the data

Examples

data(acidity)
out <- MixPY1(acidity, Nit = 50)
plot(out)

Plot the density estimate and the 95% credible interval

Description

Plot the density estimate and the 95% credible interval

Usage

## S3 method for class 'PY2'
plot(x, ...)

Arguments

x

A fitted object of class PY2

...

Further arguments to be passed to generic function, ignored at the moment

Value

A graph with the density estimate, the 95% credible interval and a histogram of the data

Examples

data(acidity)
out <- MixPY2(acidity, Nit = 50)
plot(out)

Plot the density estimate and the 95% credible interval

Description

The density estimate is the mean posterior density computed on the data points.

Usage

## S3 method for class 'multNRMI'
plot(x, ...)

Arguments

x

An object of class multNRMI

...

Further arguments to be passed to generic functions, ignored at the moment

Value

A graph with the density estimate, the 95% credible interval. Includes a histogram if the data is non censored.

Examples


data(salinity)
fit <- multMixNRMI2cens(salinity$left, salinity$right, parallel = TRUE, Nit = 10, ncores = 2)
plot(fit)


Plot the Turnbull CDF and fitted CDF for censored data.

Description

Plot the Turnbull CDF and fitted CDF for censored data.

Usage

plotCDF_censored(fit)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1cens or MixNRMI2cens.

Value

Plot of the empirical and fitted CDF for non censored data.

Examples


set.seed(150520)
data(salinity)
out <- MixNRMI1cens(salinity$left, salinity$right, extras = TRUE, Nit = 100)
BNPdensity:::plotCDF_censored(out)

Plot the empirical and fitted CDF for non censored data.

Description

Plot the empirical and fitted CDF for non censored data.

Usage

plotCDF_noncensored(fit)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1 or MixNRMI2.

Value

Plot of the empirical and fitted CDF for non censored data.

Examples


set.seed(150520)
data(acidity)
out <- MixNRMI1(acidity, extras = TRUE, Nit = 10)
BNPdensity:::plotCDF_noncensored(out)

Plot the density for censored data.

Description

Plot the density for censored data.

Usage

plotPDF_censored(fit)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1cens or MixNRMI2cens.

Value

Plot of the density and a histogram for non censored data.

Examples


set.seed(150520)
data(salinity)
out <- MixNRMI1cens(xleft = salinity$left, xright = salinity$right, extras = TRUE, Nit = 100)
BNPdensity:::plotPDF_censored(out)

Plot the density and a histogram for non censored data.

Description

Plot the density and a histogram for non censored data.

Usage

plotPDF_noncensored(fit)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1 or MixNRMI2.

Value

Plot of the density and a histogram for non censored data.

Examples


set.seed(150520)
data(acidity)
out <- MixNRMI1(acidity, extras = TRUE, Nit = 100)
BNPdensity:::plotPDF_noncensored(out)

Plot the clustering and the Cumulative Distribution Function

Description

This is a function to visualize the clustering induced by the BNP model. The data points are plotted with a color reflecting their cluster.

Usage

plot_clustering_and_CDF(fit, clustering, label_vector = NULL)

Arguments

fit

The fitted object, obtained from one of the MixNRMIx functions

clustering

A vector of integers with the same length as the data, representing the allocation variable for data each point.

label_vector

A vector of data labels to be plotted, to provide some identification to each point.

Value

A plot of the Cumulative Distribution Function (or Turnbull estimate for censored data) with data points whose color denotes the cluster allocation. For censored data, right or left censored data points are not represented, while interval censored data points are represented at the middle of the censoring interval.


This plots the prior distribution on the number of components for the stable process. The Dirichlet process is provided for comparison.

Description

This plots the prior distribution on the number of components for the stable process. The Dirichlet process is provided for comparison.

Usage

plot_prior_number_of_components(
  n,
  Gama,
  Alpha = 1,
  grid = NULL,
  silence = TRUE
)

Arguments

n

Number of data points

Gama

Numeric constant. 0 <= Gama <=1.

Alpha

Numeric constant. Total mass of the centering measure for the Dirichlet process.

grid

Integer vector. Level of truncation when computing the expectation. Defaults to n. If greater than n, it is fixed to n.

silence

Boolean. Whether to print the current calculation step for the Stable process, as the function can be long

Value

A plot with the prior distribution on the number of components.

Examples


plot_prior_number_of_components(50, 0.4)

Plot the density estimate and the 95% credible interval for censored data

Description

The density estimate is the mean posterior density computed on the data points. It is not possible to display a histogram for censored data.

Usage

plotfit_censored(fit)

Arguments

fit

A fitted object of class NRMI1cens or NRMI2cens

Value

A graph with the density estimate and the 95% credible interval

Examples


data(acidity)
out <- MixNRMI1(acidity, Nit = 50)
plot(out)

Plot the density estimate and the 95% credible interval for noncensored data

Description

The density estimate is the mean posterior density computed on the data points.

Usage

plotfit_noncensored(fit)

Arguments

fit

A fitted object of class NRMI1 or NRMI2

Value

A graph with the density estimate, the 95% credible interval and a histogram of the data

Examples


data(acidity)
out <- MixNRMI1(acidity, Nit = 50)
plot(out)

Plot the percentile-percentile graph for non censored data, using the Turnbull estimator the position of the percentiles.

Description

Plot the percentile-percentile graph for non censored data, using the Turnbull estimator the position of the percentiles.

Usage

pp_plot_censored(fit)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1cens or MixNRMI2cens.

Value

Percentile-percentile graph using the Turnbull estimator

Examples


set.seed(150520)
data(salinity)
out <- MixNRMI1cens(xleft = salinity$left, xright = salinity$right, extras = TRUE, Nit = 100)
BNPdensity:::pp_plot_censored(out)

Plot the percentile-percentile graph for non censored data.

Description

Plot the percentile-percentile graph for non censored data.

Usage

pp_plot_noncensored(fit)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1 or MixNRMI2.

Value

Percentile-percentile plot for non censored data.

Examples


set.seed(150520)
data(acidity)
out <- MixNRMI1(acidity, extras = TRUE, Nit = 100)
BNPdensity:::pp_plot_noncensored(out)

S3 method for class 'MixNRMI1'

Description

S3 method for class 'MixNRMI1'

Usage

## S3 method for class 'NRMI1'
print(x, ...)

Arguments

x

A fitted object of class NRMI1

...

Further arguments to be passed to generic function, ignored at the moment

Value

A visualization of the important information about the object

Examples


## Example for non censored data

data(acidity)
out <- MixNRMI1(acidity, Nit = 50)
print(out)

## Example for censored data

data(salinity)
out <- MixNRMI1cens(salinity$left, salinity$right, Nit = 50)
print(out)

S3 method for class 'MixNRMI2'

Description

S3 method for class 'MixNRMI2'

Usage

## S3 method for class 'NRMI2'
print(x, ...)

Arguments

x

A fitted object of class NRMI2

...

Further arguments to be passed to generic function, ignored at the moment

Value

A visualization of the important information about the object

Examples

#' ## Example for censored data
data(acidity)
out <- MixNRMI2(acidity, Nit = 20)
print(out)

data(salinity)
out <- MixNRMI2cens(salinity$left, salinity$right, Nit = 20)
print(out)

S3 method for class 'PY1'

Description

S3 method for class 'PY1'

Usage

## S3 method for class 'PY1'
print(x, ...)

Arguments

x

A fitted object of class PY1

...

Further arguments to be passed to generic function, ignored at the moment

Value

A visualization of the important information about the object

Examples


## Example for non censored data

data(acidity)
out <- MixPY1(acidity, Nit = 50)
print(out)

S3 method for class 'PY2'

Description

S3 method for class 'PY2'

Usage

## S3 method for class 'PY2'
print(x, ...)

Arguments

x

A fitted object of class PY2

...

Further arguments to be passed to generic function, ignored at the moment

Value

A visualization of the important information about the object

Examples


## Example for non censored data

data(acidity)
out <- MixPY2(acidity, Nit = 50)
print(out)

S3 method for class 'multNRMI'

Description

S3 method for class 'multNRMI'

Usage

## S3 method for class 'multNRMI'
print(x, ...)

Arguments

x

An object of class multNRMI

...

Further arguments to be passed to generic functions, ignored at the moment

Value

A visualization of the important information about the object

Examples


data(salinity)
out <- multMixNRMI2cens(salinity$left, salinity$right, parallel = TRUE, Nit = 10, ncores = 2)
print(out)


Process the distribution name argument into a distribution index

Description

This function is intended to help with compatibility with the previous versions of the package.

Usage

process_dist_name(distname)

Arguments

distname

Can be an integer or a distribution name. Allowed names are "normal", "gamma", "beta", "exponential", "lognormal", "half-Cauchy", "half-normal", "half-student", "uniform" and "truncated normal", or their common abbreviations "norm", "exp", "halfcauchy", "halfnorm", "halft" and "unif".

Value

an integer both if distname is an integer or a character


Distribution function non-standard student-t

Description

Computes the cdf.

Usage

pt_(x, df, mean, sd)

Details

For internal use

Examples


## The function is currently defined as
function(x, df, mean, sd) {
  pt((x - mean) / sd, df, ncp = 0)
}

Distribution function truncated normal

Description

Computes the cumulative distribution function.

Usage

ptnorm(
  q,
  mean = 0,
  sd = 1,
  lower = -Inf,
  upper = Inf,
  lower.tail = TRUE,
  log.p = FALSE
)

Details

For internal use

Note

Taken from msm R-package.

Author(s)

C. H. Jackson

References

Taken from

Examples


## The function is currently defined as
function(q, mean = 0, sd = 1, lower = -Inf, upper = Inf, lower.tail = TRUE,
         log.p = FALSE) {
  ret <- numeric(length(q))
  if (lower.tail) {
    ret[q < lower] <- 0
    ret[q > upper] <- 1
  }
  else {
    ret[q < lower] <- 1
    ret[q > upper] <- 0
  }
  ret[upper < lower] <- NaN
  ind <- q >= lower & q <= upper
  if (any(ind)) {
    denom <- pnorm(upper, mean, sd) - pnorm(
      lower, mean,
      sd
    )
    if (lower.tail) {
      qtmp <- pnorm(q, mean, sd) - pnorm(lower, mean, sd)
    } else {
      qtmp <- pnorm(upper, mean, sd) - pnorm(
        q, mean,
        sd
      )
    }
    if (log.p) {
      qtmp <- log(qtmp) - log(denom)
    } else {
      qtmp <- qtmp / denom
    }
    ret[q >= lower & q <= upper] <- qtmp[ind]
  }
  ret
}

Generic function to find quantiles of a distribution

Description

Computes quantiles.

Usage

qgeneric(pdist, p, ...)

Details

For internal use

Note

Taken from msm R-package.

Author(s)

Christopher Jackson

Examples


## The function is currently defined as
function(pdist, p, ...) {
  args <- list(...)
  if (is.null(args$log.p)) {
    args$log.p <- FALSE
  }
  if (is.null(args$lower.tail)) {
    args$lower.tail <- TRUE
  }
  if (is.null(args$lbound)) {
    args$lbound <- -Inf
  }
  if (is.null(args$ubound)) {
    args$ubound <- Inf
  }
  if (args$log.p) {
    p <- exp(p)
  }
  if (!args$lower.tail) {
    p <- 1 - p
  }
  ret <- numeric(length(p))
  ret[p == 0] <- args$lbound
  ret[p == 1] <- args$ubound
  args[c("lower.tail", "log.p", "lbound", "ubound")] <- NULL
  maxlen <- max(sapply(c(args, p = list(p)), length))
  for (i in seq(along = args)) {
    args[[i]] <- rep(args[[i]],
      length.out = maxlen
    )
  }
  p <- rep(p, length.out = maxlen)
  ret[p < 0 | p > 1] <- NaN
  ind <- (p > 0 & p < 1)
  if (any(ind)) {
    hind <- seq(along = p)[ind]
    h <- function(y) {
      args <- lapply(args, function(x) x[hind[i]])
      p <- p[hind[i]]
      args$q <- y
      (do.call(pdist, args) - p)
    }
    ptmp <- numeric(length(p[ind]))
    for (i in 1:length(p[ind])) {
      interval <- c(-1, 1)
      while (h(interval[1]) * h(interval[2]) >= 0) {
        interval <- interval + c(-1, 1) * 0.5 * (interval[2] -
          interval[1])
      }
      ptmp[i] <- uniroot(h, interval, tol = .Machine$double.eps)$root
    }
    ret[ind] <- ptmp
  }
  if (any(is.nan(ret))) {
    warning("NaNs produced")
  }
  ret
}

Quantile function half Cauchy

Description

Computes the quantiles.

Usage

qhalfcauchy(p, location = 0, scale = 1)

Details

For internal use

Examples


## The function is currently defined as
function(p, location = 0, scale = 1) {
  qcauchy(p * (1 - pcauchy(0, location, scale)) + pcauchy(
    0,
    location, scale
  ), location, scale)
}

Quantile function half Normal

Description

Computes the quantiles.

Usage

qhalfnorm(p, mean = 0, sd = 1)

Details

For internal use

Examples


## The function is currently defined as
function(p, mean = 0, sd = 1) {
  qnorm(
    p * (1 - pnorm(0, mean, sd)) + pnorm(0, mean, sd),
    mean, sd
  )
}

Quantile function half Student-t

Description

Computes the quantiles.

Usage

qhalft(p, df = 1, mean = 0, sd = 1)

Details

For internal use

Examples


## The function is currently defined as
function(p, df = 1, mean = 0, sd = 1) {
  qt_(
    p * (1 - pt_(0, df, mean, sd)) + pt_(0, df, mean, sd),
    df, mean, sd
  )
}

Plot the quantile-quantile graph for censored data.

Description

This function may be rather slow for many iterations/many data because it relies on numerical inversion of the mixture Cumulative Distribution Function. set.seed(150520) data(salinity) out <- MixNRMI1cens(xleft = salinity$left, xright = salinity$right, extras = TRUE, Nit = 100) BNPdensity:::qq_plot_censored(out)

Usage

qq_plot_censored(fit, thinning_to = 500)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1 or MixNRMI2, MixMRMI1cens or MixMRMI2cens

thinning_to

How many iterations to compute the mean posterior quantiles

Value

quantile-quantile plot for non censored data.


Plot the quantile-quantile graph for non censored data.

Description

This function may be rather slow for many iterations/many data because it relies on numerical inversion of the mixture Cumulative Distribution Function.

Usage

qq_plot_noncensored(fit, thinning_to = 500)

Arguments

fit

The result of the fit, obtained through the function MixNRMI1 or MixNRMI2, MixMRMI1cens or MixMRMI2cens

thinning_to

How many iterations to compute the mean posterior quantiles

Value

quantile-quantile plot for non censored data.

Examples



### Not run
# set.seed(150520)
# data(acidity)
# out <- MixNRMI1(acidity, extras = TRUE, Nit = 100)
# BNPdensity:::qq_plot_noncensored(out)

Quantile function non-standard Student-t

Description

Computes the quantiles.

Usage

qt_(p, df, mean, sd)

Details

For internal use

Examples


## The function is currently defined as
function(p, df, mean, sd) {
  sd * qt(p, df, ncp = 0) + mean
}

Quantile function truncated normal

Description

Computes the quantiles.

Usage

qtnorm(
  p,
  mean = 0,
  sd = 1,
  lower = -Inf,
  upper = Inf,
  lower.tail = TRUE,
  log.p = FALSE
)

Details

For internal use

Note

Taken from msm R-package.

Author(s)

C. H. Jackson

References

Taken from

Examples


## The function is currently defined as
function(p, mean = 0, sd = 1, lower = -Inf, upper = Inf, lower.tail = TRUE,
         log.p = FALSE) {
  qgeneric(ptnorm,
    p = p, mean = mean, sd = sd, lower = lower,
    upper = upper, lbound = lower, ubound = upper, lower.tail = lower.tail,
    log.p = log.p
  )
}

Conditional posterior distribution of the distinct Ystar

Description

This function evaluates the ratio of conditional posterior distributions of the distinct latents Ystar.

Usage

rfystar(v, v2, x, distr.k, sigma.k, distr.p0, mu.p0, sigma.p0)

Details

For internal use.

Examples


## The function is currently defined as
function(v, v2, x, distr.k, sigma.k, distr.p0, mu.p0, sigma.p0) {
  alpha <- p0(v, distr = distr.p0, mu = mu.p0, sigma = sigma.p0) /
    p0(v2, distr = distr.p0, mu = mu.p0, sigma = sigma.p0)
  Prod <- 1
  for (i in seq(length(x))) {
    fac <- dk(x[i], distr = distr.k, mu = v, sigma = sigma.k) /
      dk(x[i], distr = distr.k, mu = v2, sigma = sigma.k)
    Prod <- Prod * fac
  }
  f <- alpha * Prod
  return(f)
}

Conditional posterior distribution of the distinct Ystar in the case of censoring

Description

This function evaluates the ratio of conditional posterior distributions of the distinct latents Ystar.

Usage

rfystarcens2(
  v,
  v2,
  xleft,
  xright,
  censor_code,
  distr.k,
  sigma.k,
  distr.p0,
  mu.p0,
  sigma.p0
)

Details

For internal use

Examples


## The function is currently defined as
function(v, v2, xleft, xright, censor_code, distr.k, sigma.k,
         distr.p0, mu.p0, sigma.p0) {
  alpha <- p0(v, distr = distr.p0, mu = mu.p0, sigma = sigma.p0) / p0(v2,
    distr = distr.p0, mu = mu.p0, sigma = sigma.p0
  )
  Prod <- 1
  for (i in seq_along(xleft)) {
    fac <- dkcens2_1val(
      xleft = xleft[i], xright = xright[i],
      c_code = censor_code[i], distr = distr.k, mu = v,
      sigma = sigma.k
    ) / dkcens2_1val(
      xleft = xleft[i], xright = xright[i],
      c_code = censor_code[i], distr = distr.k, mu = v2,
      sigma = sigma.k
    )
    Prod <- Prod * fac
  }
  f <- alpha * Prod
  return(f)
}

Conditional posterior distribution of the distinct vectors (Ystar,Zstar)

Description

This function evaluates the ratio of conditional posterior distributions of the distinct latent vectors (Ystar,Zstar).

Usage

rfyzstar(
  v,
  v2,
  z,
  z2,
  x,
  distr.k,
  distr.py0,
  mu.py0,
  sigma.py0,
  distr.pz0,
  mu.pz0,
  sigma.pz0
)

Details

For internal use.

Examples


## The function is currently defined as
function(v, v2, z, z2, x, distr.k, distr.py0, mu.py0, sigma.py0, distr.pz0, mu.pz0, sigma.pz0) {
  alpha <- p0(v, distr = distr.py0, mu = mu.py0, sigma = sigma.py0) /
    p0(v2, distr = distr.py0, mu = mu.py0, sigma = sigma.py0) *
    p0(z, distr = distr.pz0, mu = mu.pz0, sigma = sigma.pz0) /
    p0(z2, distr = distr.pz0, mu = mu.pz0, sigma = sigma.pz0)
  Prod <- 1
  for (i in seq(length(x))) {
    fac <- dk(x[i], distr = distr.k, mu = v, sigma = z) / dk(x[i],
      distr = distr.k, mu = v2, sigma = z2
    )
    Prod <- Prod * fac
  }
  f <- alpha * Prod
  return(f)
}

Conditional posterior distribution of the distinct vectors (Ystar,Zstar) in the case of censoring

Description

This function evaluates the ratio of conditional posterior distributions of the distinct latent vectors (Ystar,Zstar).

Usage

rfyzstarcens2(
  v,
  v2,
  z,
  z2,
  xleft,
  xright,
  censor_code,
  distr.k,
  distr.py0,
  mu.py0,
  sigma.py0,
  distr.pz0,
  mu.pz0,
  sigma.pz0
)

Details

For internal use

Examples


## The function is currently defined as
function(v, v2, z, z2, xleft, xright, censor_code, distr.k,
         distr.py0, mu.py0, sigma.py0, distr.pz0, mu.pz0, sigma.pz0) {
  alpha <- p0(v, distr = distr.py0, mu = mu.py0, sigma = sigma.py0) / p0(v2,
    distr = distr.py0, mu = mu.py0, sigma = sigma.py0
  ) *
    p0(z, distr = distr.pz0, mu = mu.pz0, sigma = sigma.pz0) / p0(z2,
      distr = distr.pz0, mu = mu.pz0, sigma = sigma.pz0
    )
  Prod <- 1
  for (i in seq_along(xleft)) {
    fac <- dkcens2_1val(
      xleft = xleft[i], xright = xright[i],
      c_code = censor_code[i], distr = distr.k, mu = v,
      sigma = z
    ) / dkcens2_1val(
      xleft = xleft[i], xright = xright[i],
      c_code = censor_code[i], distr = distr.k, mu = v2,
      sigma = z2
    )
    Prod <- Prod * fac
  }
  f <- alpha * Prod
  return(f)
}

Random number generator half Cauchy

Description

Computes a random number.

Usage

rhalfcauchy(n, location = 0, scale = 1)

Details

For internal use

Examples


## The function is currently defined as
function(n, location = 0, scale = 1) {
  abs(rcauchy(n, location, scale))
}

Random number generator half Normal

Description

Computes a random number.

Usage

rhalfnorm(n, mean = 0, sd = 1)

Details

For internal use

Examples


## The function is currently defined as
function(n, mean = 0, sd = 1) {
  abs(rnorm(n, mean, sd))
}

Random number generator half Student-t

Description

Generates a random number.

Usage

rhalft(n, df = 1, mean = 0, sd = 1)

Details

For internal use

Examples


## The function is currently defined as
function(n, df = 1, mean = 0, sd = 1) {
  abs(rt_(n, df, mean, sd))
}

Kernel density sampling function

Description

This function simulates from a density. There are 4 density options (1 = Gaussian, 2 = Gamma, 3 = Beta, 4 = double exponential, 5 = lognormal). All densities are parameterized in terms of mean and standard deviation.

Usage

rk(n, distr = NULL, mu = NULL, sigma = NULL)

Details

For internal use.

Examples


## The function is currently defined as
function(n, distr = NULL, mu = NULL, sigma = NULL) {
  if (is.null(distr)) {
    stop("Argument \"distr\" should be defined numeric with possible values 1,2,3,4 or 5")
  }
  else if (distr == 1) {
    a <- ifelse(is.null(mu), 0, mu)
    b <- ifelse(is.null(sigma), 1, sigma)
    rk <- rnorm(n, mean = a, sd = b)
  }
  else if (distr == 2) {
    a <- ifelse(is.null(mu), 0, mu)
    b <- ifelse(is.null(sigma), 1 / sqrt(2), sigma / sqrt(2))
    rk <- a + b * sample(c(-1, +1), size = n, replace = TRUE) *
      rexp(n)
  }
  else if (distr == 3) {
    a <- ifelse(is.null(mu), exp(1 / 2), log(mu / sqrt(1 + (sigma / mu)^2)))
    b <- ifelse(is.null(sigma), exp(1) * (exp(1) - 1), sqrt(log(1 +
      (sigma / y)^2)))
    rk <- rlnorm(n, meanlog = a, sdlog = b)
  }
  else if (distr == 4) {
    a <- ifelse(is.null(mu), 1, mu^2 / sigma^2)
    b <- ifelse(is.null(sigma), 1, mu / sigma^2)
    rk <- rgamma(n, shape = a, rate = b)
  }
  else if (distr == 5) {
    a <- ifelse(is.null(mu), 0.5, (1 - mu) * (mu / sigma)^2 -
      mu)
    b <- ifelse(is.null(sigma), 1 / sqrt(12), (mu * (1 - mu) / sigma^2 -
      1) * (1 - mu))
    if (any(c(a, b) <= 0)) {
      stop(paste(
        "\nNegative Beta parameters:\n a =", a,
        ";\t b =", b
      ))
    }
    rk <- rbeta(n, shape1 = a, shape2 = b)
  }
  else {
    stop("Argument \"distr\" should be defined numeric with possible values 1,2,3,4 or 5")
  }
  return(rk)
}

Proposal distribution for logU

Description

This function makes a proposal for a new value of logU

Usage

rprop_logu(logu, delta)

Arguments

logu

Real, log of the latent variable U at the current iteration.

delta

Scale of the Metropolis-Hastings proposal distribution


Random number generator non-standard Student-t

Description

Computes a random number.

Usage

rt_(n, df, mean, sd)

Details

For internal use

Examples


## The function is currently defined as
function(n, df, mean, sd) {
  mean + sd * rt(n, df, ncp = 0)
}

Random number generator for a truncated normal distribution

Description

Generates a random number from a truncated normal distribution.

Usage

rtnorm(n, mean = 0, sd = 1, lower = -Inf, upper = Inf)

Details

For internal use

Note

Taken from msm R-package.

Author(s)

C. H. Jackson

Examples


## The function is currently defined as
function(n, mean = 0, sd = 1, lower = -Inf, upper = Inf) {
  if (length(n) > 1) {
    n <- length(n)
  }
  mean <- rep(mean, length = n)
  sd <- rep(sd, length = n)
  lower <- rep(lower, length = n)
  upper <- rep(upper, length = n)
  lower <- (lower - mean) / sd
  upper <- (upper - mean) / sd
  ind <- seq(length = n)
  ret <- numeric(n)
  alg <- ifelse(lower > upper, -1, ifelse(((lower < 0 & upper ==
    Inf) | (lower == -Inf & upper > 0) | (is.finite(lower) &
    is.finite(upper) & (lower < 0) & (upper > 0) & (upper -
    lower > sqrt(2 * pi)))), 0, ifelse((lower >= 0 & (upper >
    lower + 2 * sqrt(exp(1)) / (lower + sqrt(lower^2 + 4)) *
      exp((lower * 2 - lower * sqrt(lower^2 + 4)) / 4))),
  1, ifelse(upper <= 0 & (-lower > -upper + 2 * sqrt(exp(1)) / (-upper +
    sqrt(upper^2 + 4)) * exp((upper * 2 - -upper * sqrt(upper^2 +
    4)) / 4)), 2, 3)
  )))
  ind.nan <- ind[alg == -1]
  ind.no <- ind[alg == 0]
  ind.expl <- ind[alg == 1]
  ind.expu <- ind[alg == 2]
  ind.u <- ind[alg == 3]
  ret[ind.nan] <- NaN
  while (length(ind.no) > 0) {
    y <- rnorm(length(ind.no))
    done <- which(y >= lower[ind.no] & y <= upper[ind.no])
    ret[ind.no[done]] <- y[done]
    ind.no <- setdiff(ind.no, ind.no[done])
  }
  stopifnot(length(ind.no) == 0)
  while (length(ind.expl) > 0) {
    a <- (lower[ind.expl] + sqrt(lower[ind.expl]^2 + 4)) / 2
    z <- rexp(length(ind.expl), a) + lower[ind.expl]
    u <- runif(length(ind.expl))
    done <- which((u <= exp(-(z - a)^2 / 2)) & (z <= upper[ind.expl]))
    ret[ind.expl[done]] <- z[done]
    ind.expl <- setdiff(ind.expl, ind.expl[done])
  }
  stopifnot(length(ind.expl) == 0)
  while (length(ind.expu) > 0) {
    a <- (-upper[ind.expu] + sqrt(upper[ind.expu]^2 + 4)) / 2
    z <- rexp(length(ind.expu), a) - upper[ind.expu]
    u <- runif(length(ind.expu))
    done <- which((u <= exp(-(z - a)^2 / 2)) & (z <= -lower[ind.expu]))
    ret[ind.expu[done]] <- -z[done]
    ind.expu <- setdiff(ind.expu, ind.expu[done])
  }
  stopifnot(length(ind.expu) == 0)
  while (length(ind.u) > 0) {
    z <- runif(length(ind.u), lower[ind.u], upper[ind.u])
    rho <- ifelse(lower[ind.u] > 0, exp((lower[ind.u]^2 -
      z^2) / 2), ifelse(upper[ind.u] < 0, exp((upper[ind.u]^2 -
      z^2) / 2), exp(-z^2 / 2)))
    u <- runif(length(ind.u))
    done <- which(u <= rho)
    ret[ind.u[done]] <- z[done]
    ind.u <- setdiff(ind.u, ind.u[done])
  }
  stopifnot(length(ind.u) == 0)
  ret * sd + mean
}

Salinity tolerance

Description

72-hour acute salinity tolerance (LC50 values) of riverine macro-invertebrates.

Format

A data frame with 108 observations on the following two variables:

left

A numeric vector.

right

A numeric vector.

Source

fitdistrplus R-package

References

Kefford, B.J., Nugegoda, D., Metzeling, L., Fields, E. 2006. Validating species sensitivity distributions using salinity tolerance of riverine macroinvertebrates in the southern Murray-darling Basin (Victoria, Australia). Canadian Journal of Fisheries and Aquatic Science, 63, 1865-1877.

Examples


data(salinity)
hist(salinity$left)

S3 method for class 'MixNRMI1'

Description

S3 method for class 'MixNRMI1'

Usage

## S3 method for class 'NRMI1'
summary(object, number_of_clusters = FALSE, ...)

Arguments

object

A fitted object of class NRMI1

number_of_clusters

Whether to compute the optimal number of clusters, which can be a time-consuming operation (see compute_optimal_clustering)

...

Further arguments to be passed to generic function, ignored at the moment

Value

Prints out the text for the summary S3 methods

Examples


## Example for non censored data

data(acidity)
out <- MixNRMI1(acidity, Nit = 50)
summary(out)

S3 method for class 'MixNRMI2'

Description

S3 method for class 'MixNRMI2'

Usage

## S3 method for class 'NRMI2'
summary(object, number_of_clusters = FALSE, ...)

Arguments

object

A fitted object of class NRMI2

number_of_clusters

Whether to compute the optimal number of clusters, which can be a time-consuming operation (see compute_optimal_clustering)

...

Further arguments to be passed to generic function, ignored at the moment

Value

Prints out the text for the summary S3 methods

Examples

data(acidity)
out <- MixNRMI2(acidity, Nit = 20)
summary(out)

data(salinity)
out <- MixNRMI2cens(salinity$left, salinity$right, Nit = 20)
summary(out)

S3 method for class 'PY1'

Description

S3 method for class 'PY1'

Usage

## S3 method for class 'PY1'
summary(object, number_of_clusters = FALSE, ...)

Arguments

object

A fitted object of class PY1

number_of_clusters

Whether to compute the optimal number of clusters, which can be a time-consuming operation (see compute_optimal_clustering)

...

Further arguments to be passed to generic function, ignored at the moment

Value

Prints out the text for the summary S3 methods

Examples


## Example for non censored data

data(acidity)
out <- MixPY1(acidity, Nit = 50)
summary(out)

S3 method for class 'PY2'

Description

S3 method for class 'PY2'

Usage

## S3 method for class 'PY2'
summary(object, number_of_clusters = FALSE, ...)

Arguments

object

A fitted object of class PY2

number_of_clusters

Whether to compute the optimal number of clusters, which can be a time-consuming operation (see compute_optimal_clustering)

...

Further arguments to be passed to generic function, ignored at the moment

Value

Prints out the text for the summary S3 methods

Examples


## Example for non censored data

data(acidity)
out <- MixPY2(acidity, Nit = 50)
summary(out)

S3 method for class 'multNRMI'

Description

S3 method for class 'multNRMI'

Usage

## S3 method for class 'multNRMI'
summary(object, number_of_clusters = FALSE, ...)

Arguments

object

A fitted object of class NRMI1cens

number_of_clusters

Whether to compute the optimal number of clusters, which can be a time-consuming operation (see compute_optimal_clustering)

...

Further arguments to be passed to generic function, ignored at the moment

Value

Prints out the text for the summary S3 methods

Examples


data(salinity)
out <- multMixNRMI2cens(salinity$left, salinity$right, parallel = TRUE, Nit = 10, ncores = 2)
summary(out)


Common text for the summary S3 methods

Description

Common text for the summary S3 methods

Usage

summarytext(
  fit,
  kernel_comment,
  BNP_process_comment,
  number_of_clusters = FALSE
)

Arguments

fit

NRMIx or PYx object

kernel_comment

Text specific to the parametric and nonparametric nature of the model

BNP_process_comment

Text specific to the nonparametric process, NRMI or Pitman-Yor

number_of_clusters

Flag to decide whether to compute the optimal clustering

Value

Prints out the text for the summary S3 methods


Choosing the truncation level for the NGG process

Description

This function uses the M_array which provides the threshold which ensures a moment match of 5

Usage

thresholdGG(alpha = 1, kappa = 1, gama = 1/2, max_threshold = 200)

Arguments

alpha

Numeric constant. Total mass of the centering measure

kappa

Numeric positive constant.

gama

Numeric constant. 0 \leq Gama \leq 1.

max_threshold

Numeric positive integer. Maximum allowed value for the threshold

Details

For internal use

Value

Numeric positive integer, the truncation level of the NGG process


Draw a traceplot for multiple chains

Description

This is a convenience function which works when coda is not yet loaded by the user. If coda is loaded, it gets masked. See also file multMixNRMI.R

Usage

traceplot(fitlist)

Arguments

fitlist

Output of multMixNRMI.

Value

A traceplot for multiple chains.