Type: | Package |
Title: | Bivariate Distributions via Conditional Specification |
Version: | 0.1.1 |
Maintainer: | Mina Norouzirad <mina.norouzirad@gmail.com> |
Description: | Implementation of bivariate binomial, geometric, and Poisson distributions based on conditional specifications. The package also includes tools for data generation and goodness-of-fit testing for these three distribution families. For methodological details, see Ghosh, Marques, and Chakraborty (2025) <doi:10.1080/03610926.2024.2315294>, Ghosh, Marques, and Chakraborty (2023) <doi:10.1080/03610918.2021.2004419>, and Ghosh, Marques, and Chakraborty (2021) <doi:10.1080/02664763.2020.1793307>. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
Depends: | R (≥ 3.5) |
Imports: | stats |
Suggests: | knitr, rmarkdown |
VignetteBuilder: | knitr |
URL: | https://github.com/mnrzrad/BCD |
BugReports: | https://github.com/mnrzrad/BCD/issues |
NeedsCompilation: | no |
Packaged: | 2025-06-25 10:19:31 UTC; minan |
Author: | Mina Norouzirad |
Repository: | CRAN |
Date/Publication: | 2025-06-25 10:30:02 UTC |
Freeman–Tukey Test for Bivariate Distributions via Conditional Specification
Description
Performs a goodness-of-fit test using the Freeman–Tukey (F–T) statistic for a given dataset and a specified bivariate distribution via Conditional Specification.
Usage
FTtest(data, distribution, params, num_params)
Arguments
data |
a dataset or matrix with two columns. |
distribution |
a string specifying the theoretical distribution ('"BBCD"', '"BPCD"', or '"BGCD"'). |
params |
a named list of parameters required by the specified distribution. |
num_params |
an integer specifying the number of parameters that were estimated. |
Details
The Freeman–Tukey (F–T) statistic is used to assess the goodness of fit in contingency tables. It is defined as:
T^2 = 4 \sum_{i=1}^{r} \sum_{j=1}^{c} \left( \sqrt{O_{ij}} - \sqrt{E_{ij}} \right)^2
where O_{ij} and E_{ij} are the observed and expected frequencies, respectively.
The statistic T^2 asymptotically follows a chi-squared distribution with (r \cdot c - 1) degrees of freedom, where r is the number of rows and c is the number of columns in the contingency table.
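As a minimal sketch of the formula above, the statistic and its chi-squared p-value can be computed in base R from two frequency tables. The function name ft_statistic and the example tables are hypothetical, and this is not necessarily how FTtest is implemented; in particular, how FTtest uses num_params is not stated here, so the sketch uses the r * c - 1 degrees of freedom given in the text.

```r
# Freeman-Tukey statistic for observed and expected frequency tables
# (both r x c matrices with non-negative entries). Hypothetical helper.
ft_statistic <- function(observed, expected) {
  stopifnot(all(dim(observed) == dim(expected)))
  stat <- 4 * sum((sqrt(observed) - sqrt(expected))^2)
  df <- length(observed) - 1            # r * c - 1, as in the text
  p_value <- pchisq(stat, df = df, lower.tail = FALSE)
  list(statistic = stat, df = df, p.value = p_value)
}

# Illustrative 2 x 2 tables (made-up numbers)
observed <- matrix(c(12, 5, 7, 6), nrow = 2)
expected <- matrix(c(10, 6, 8, 6), nrow = 2)
ft_statistic(observed, expected)
```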
Value
A list with components:
- observed
Observed frequency table
- expected
Expected frequency table under the specified distribution
- test
Result of the Freeman–Tukey test, a list with test statistic and p-value
Examples
samples <- rgeomBCD(n = 20, q1 = 0.5, q2 = 0.5, q3 = 0.1, seed = 123)
params <- MLEgeomBCD(samples)
result_bgcd <- FTtest(samples, "BGCD", params, num_params = 3)
result_bgcd
samples <- rpoisBCD(20, lambda1=.5, lambda2=.5, lambda3=.5)
params <- MLEpoisBCD(samples)
result_bpcd <- FTtest(samples, "BPCD", params, num_params = 3)
result_bpcd
Maximum Likelihood Estimation for a Bivariate Binomial Distribution via Conditional Specification
Description
Estimates the parameters of a bivariate binomial conditionals distribution (BBCD) using maximum likelihood.
Usage
MLEbinomBCD(data, fixed_n1 = NULL, fixed_n2 = NULL, verbose = TRUE)
Arguments
data |
A data frame or matrix with columns 'X' and 'Y' |
fixed_n1 |
known value of 'n1' (NULL to estimate) |
fixed_n2 |
known value of 'n2' (NULL to estimate) |
verbose |
logical; print progress |
Value
A list of class "MLEpoisBCD"
containing:
n1
estimated n1
n2
estimated n2
p1
estimated p1
p2
estimated p2
lambda
estimated lambda
logLik
Maximum log-likelihood achieved.
AIC
Akaike Information Criterion.
BIC
Bayesian Information Criterion.
convergence
Convergence status from the optimizer (0 means successful).
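For orientation, the two information criteria follow the standard definitions AIC = 2k - 2 logLik and BIC = k log(n) - 2 logLik, where k is the number of estimated parameters and n the number of observations. The sketch below uses a hypothetical helper and made-up numbers; how the package counts k (for example when fixed_n1 or fixed_n2 is supplied) is an assumption not confirmed by this page.

```r
# Standard AIC and BIC from a maximized log-likelihood (hypothetical helper).
information_criteria <- function(logLik, k, n) {
  c(AIC = 2 * k - 2 * logLik,
    BIC = k * log(n) - 2 * logLik)
}

# Made-up values: 5 free parameters (n1, n2, p1, p2, lambda), 10 observations
information_criteria(logLik = -35.2, k = 5, n = 10)
```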
Examples
data <- rbinomBCD(n = 10, n1 = 5, n2 = 3, p1 = 0.6, p2 = 0.4, lambda = 1.2)
MLEbinomBCD(data)
MLEbinomBCD(data, fixed_n1 = 5, fixed_n2 = 3)
Maximum Likelihood Estimation for a Bivariate Geometric Distribution via Conditional Specification
Description
Estimates the parameters of a bivariate geometric distribution via Conditional Specification using maximum likelihood.
Usage
MLEgeomBCD(data, initial_values = c(0.5, 0.5, 0.5))
Arguments
data |
data frame or matrix with two columns, representing paired observations of count variables |
initial_values |
numeric vector of length 3 with initial values for the parameters |
Details
The model estimates parameters from a joint distribution for (X, Y) with the form:
P(X = x, Y = y) = K(q_1, q_2, q_3) q_1^x q_2^y q_3^{xy},
where K(q_1, q_2, q_3) is the normalizing constant.
Value
A list containing:
q1
estimated q1.
q2
estimated q2.
q3
estimated q3.
logLik
Maximum log-likelihood achieved.
AIC
Akaike Information Criterion.
BIC
Bayesian Information Criterion.
convergence
Convergence status from the optimizer (0 means successful).
References
Ghosh, I., Marques, F., & Chakraborty, S. (2023) A bivariate geometric distribution via conditional specification: properties and applications, Communications in Statistics - Simulation and Computation, 52:12, 5925–5945, doi:10.1080/03610918.2021.2004419
Examples
# Simulate data
samples <- rgeomBCD(n = 50, q1 = 0.2, q2 = 0.2, q3 = 0.5)
result <- MLEgeomBCD(samples)
print(result)
# For better estimation accuracy and stability, consider increasing the sample size (n = 1000)
data(abortflights)
MLEgeomBCD(abortflights)
Maximum Likelihood Estimation for a Bivariate Poisson Distribution via Conditional Specification
Description
Estimates the parameters of a bivariate Poisson distribution via Conditional Specification using maximum likelihood.
Usage
MLEpoisBCD(data, initial_values = NULL)
Arguments
data |
data frame or matrix with two columns, representing paired observations of count variables |
initial_values |
optional named list with initial values for the parameters. |
Details
The model estimates parameters from a joint distribution for (X, Y) with the form:
P(X = x, Y = y) = K(\lambda_1, \lambda_2, \lambda_3) \frac{\lambda_1^x \lambda_2^y \lambda_3^{xy}}{x! y!},
where x, y = 0, 1, 2, \ldots, and K(\lambda_1, \lambda_2, \lambda_3) is the normalizing constant.
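To illustrate what maximizing this likelihood involves, here is a minimal base-R sketch, not the package's implementation. It assumes the sum over y in 1/K is taken in closed form (giving exp(lambda2 * lambda3^x)), truncates the sum over x at a hypothetical x_max, and optimizes on an unconstrained scale with optim. The names bpcd_negloglik and fit_bpcd are hypothetical.

```r
# Negative log-likelihood of the BPCD for paired counts x and y.
# par = c(lambda1, lambda2, lambda3); x_max truncates the sum over x in 1/K.
bpcd_negloglik <- function(par, x, y, x_max = 200) {
  l1 <- par[1]; l2 <- par[2]; l3 <- par[3]
  k <- 0:x_max
  # log of each term of 1/K; the inner sum over y equals exp(l2 * l3^k)
  log_terms <- k * log(l1) - lgamma(k + 1) + l2 * l3^k
  log_inv_K <- max(log_terms) + log(sum(exp(log_terms - max(log_terms))))
  loglik <- -length(x) * log_inv_K +
    sum(x * log(l1) + y * log(l2) + x * y * log(l3) -
          lgamma(x + 1) - lgamma(y + 1))
  -loglik
}

# Optimize on an unconstrained scale: log for the rates, logit for lambda3
# (so lambda3 stays strictly inside (0, 1)).
fit_bpcd <- function(x, y, start = c(1, 1, 0.5)) {
  to_par <- function(theta) c(exp(theta[1:2]), plogis(theta[3]))
  obj <- function(theta) bpcd_negloglik(to_par(theta), x, y)
  opt <- optim(c(log(start[1:2]), qlogis(start[3])), obj)
  est <- to_par(opt$par)
  list(lambda1 = est[1], lambda2 = est[2], lambda3 = est[3],
       logLik = -opt$value, convergence = opt$convergence)
}

set.seed(1)
x <- rpois(200, 2); y <- rpois(200, 3)   # independent counts (the lambda3 = 1 case)
fit_bpcd(x, y)
```

The logit transform keeps lambda3 inside the open interval, so the boundary value 1 is only approached, never reached exactly.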
Value
A list of class "MLEpoisBCD"
containing:
lambda1
estimated lambda1.
lambda2
estimated lambda2.
lambda3
estimated dependence parameter (must be in (0, 1]).
logLik
Maximum log-likelihood achieved.
AIC
Akaike Information Criterion.
BIC
Bayesian Information Criterion.
convergence
Convergence status from the optimizer (0 means successful).
Examples
# Simulate data
data <- rpoisBCD(n = 50, lambda1 = 3, lambda2 = 5, lambda3 = 1)
result <- MLEpoisBCD(data)
print(result)
data(eplSeasonGoals)
MLEpoisBCD(eplSeasonGoals[["1819"]])
data(lensfaults)
MLEpoisBCD(lensfaults)
Aborted Flight Counts for 109 Aircraft
Description
This dataset records the number of aborted flights by 109 aircraft during two consecutive periods. The counts are cross-tabulated by the number of aborted flights in each period.
Usage
abortflights
Format
A data frame with 109 rows and 2 variables:
- X
Number of aborted flights in Period 1.
- Y
Number of aborted flights in Period 2.
References
Barbiero, A. (2019). A bivariate geometric distribution allowing for positive or negative correlation. Communications in Statistics - Theory and Methods, 48(11), 2842–2861. doi:10.1080/03610926.2018.1473428.
Ghosh, I., Marques, F., & Chakraborty, S. (2023) A bivariate geometric distribution via conditional specification: properties and applications, Communications in Statistics - Simulation and Computation, 52:12, 5925–5945, doi:10.1080/03610918.2021.2004419
Examples
data(abortflights)
head(abortflights)
table(abortflights$X, abortflights$Y)
Joint Probability Mass Function for a Bivariate Binomial Distribution via Conditional Specification
Description
Computes the probability mass function (p.m.f.) of the bivariate binomial conditionals distribution (BBCD) as defined by Ghosh, Marques, and Chakraborty (2025). The distribution is characterized by conditional binomial distributions for X and Y.
Usage
dbinomBCD(x, y, n1, n2, p1, p2, lambda)
Arguments
x |
value of X |
y |
value of Y |
n1 |
number of trials for X |
n2 |
number of trials for Y |
p1 |
base success probability for X |
p2 |
base success probability for Y |
lambda |
dependence parameter, must be positive. |
Details
The joint p.m.f. of the BBCD is
P(X = x, Y = y) = K_B(n_1, n_2, p_1, p_2, \lambda) \binom{n_1}{x} \binom{n_2}{y} p_1^x p_2^y (1 - p_1)^{n_1 - x} (1 - p_2)^{n_2 - y} \lambda^{xy},
where x = 0, 1, \ldots, n_1, y = 0, 1, \ldots, n_2, and K_B(n_1, n_2, p_1, p_2, \lambda) is the normalizing constant.
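Because the support is finite, the normalizing constant can be obtained exactly by summing the unnormalized kernel over all (x, y) pairs. The following base-R sketch tabulates the whole p.m.f. that way; bbcd_pmf_table is a hypothetical helper written for illustration and is not claimed to match the internals of dbinomBCD.

```r
# Joint p.m.f. table of the BBCD on the full support {0..n1} x {0..n2}.
bbcd_pmf_table <- function(n1, n2, p1, p2, lambda) {
  x <- 0:n1
  y <- 0:n2
  # Unnormalized kernel: product of binomial terms times lambda^(x * y)
  kernel <- outer(dbinom(x, n1, p1), dbinom(y, n2, p2)) * lambda^outer(x, y)
  pmf <- kernel / sum(kernel)           # dividing by the sum applies K_B
  dimnames(pmf) <- list(x = x, y = y)
  pmf
}

pmf <- bbcd_pmf_table(n1 = 5, n2 = 5, p1 = 0.5, p2 = 0.4, lambda = 0.5)
pmf["2", "1"]                           # P(X = 2, Y = 1)
sum(pmf)                                # total probability
```

With lambda = 1 the kernel already sums to one, so the table reduces to the product of two independent binomial p.m.f.s.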
Value
The probability P(X = x, Y = y).
References
Ghosh, I., Marques, F., & Chakraborty, S. (2025). A form of bivariate binomial conditionals distributions. Communications in Statistics - Theory and Methods, 54(2), 534–553. doi:10.1080/03610926.2024.2315294
See Also
pbinomBCD
rbinomBCD
MLEbinomBCD
Examples
# Compute P(X = 2, Y = 1) with n1 = 5, n2 = 5, p1 = 0.5, p2 = 0.4, lambda = 0.5
dbinomBCD(x = 2, y = 1, n1 = 5, n2 = 5, p1 = 0.5, p2 = 0.4, lambda = 0.5)
# Example with independence (lambda = 1)
dbinomBCD(x = 2, y = 1, n1 = 5, n2 = 5, p1 = 0.5, p2 = 0.4, lambda = 1.0)
Joint Probability Mass Function for a Bivariate Geometric Distribution via Conditional Specification
Description
Computes the joint probability mass function (p.m.f.) of a bivariate geometric conditionals distribution (BGCD) based on Ghosh, Marques, and Chakraborty (2023). This distribution models paired count data with geometric conditionals, incorporating dependence between variables X and Y.
Usage
dgeomBCD(x, y, q1, q2, q3)
Arguments
x |
value of X |
y |
value of Y |
q1 |
probability parameter for X |
q2 |
probability parameter for Y |
q3 |
dependence parameter, in (0, 1] |
Details
The joint p.m.f. of the BGCD is:
P(X = x, Y = y) = K(q_1, q_2, q_3) q_1^x q_2^y q_3^{xy},
where K(q_1, q_2, q_3) is the normalizing constant computed by the function normalize_constant_BGCD.
Note that:
- q_3 < 1: indicates negative correlation between X and Y
- q_3 = 1: indicates independence between X and Y
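The internals of normalize_constant_BGCD are not shown on this page, so the following is an independent sketch of one way to evaluate K. It assumes 0 < q1, q2 < 1 and 0 < q3 <= 1, sums the geometric series over y in closed form, which gives 1/K = \sum_{x \ge 0} q_1^x / (1 - q_2 q_3^x), and truncates the remaining sum at a hypothetical x_max. The function names are hypothetical.

```r
# Normalizing constant of the BGCD via a truncated single sum over x.
bgcd_norm_const <- function(q1, q2, q3, x_max = 1000) {
  x <- 0:x_max
  1 / sum(q1^x / (1 - q2 * q3^x))       # inner sum over y is geometric
}

# Joint p.m.f. at (x, y), vectorized over x and y.
bgcd_pmf <- function(x, y, q1, q2, q3) {
  bgcd_norm_const(q1, q2, q3) * q1^x * q2^y * q3^(x * y)
}

bgcd_pmf(x = 1, y = 2, q1 = 0.5, q2 = 0.6, q3 = 0.8)

# Independence check: with q3 = 1 the constant reduces to (1 - q1) * (1 - q2)
all.equal(bgcd_norm_const(0.5, 0.6, 1), (1 - 0.5) * (1 - 0.6))
```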
Value
The probability P(X = x, Y = y) for each pair of x and y.
References
Ghosh, I., Marques, F., & Chakraborty, S. (2023) A bivariate geometric distribution via conditional specification: properties and applications, Communications in Statistics - Simulation and Computation, 52:12, 5925–5945, doi:10.1080/03610918.2021.2004419
Examples
# Compute P(X = 1, Y = 2) with q1 = 0.5, q2 = 0.6, q3 = 0.8
dgeomBCD(x = 1, y = 2, q1 = 0.5, q2 = 0.6, q3 = 0.8)
# Compute P(X = 0, Y = 4) with q1 = 0.5, q2 = 0.6, q3 = 0.8
dgeomBCD(x = 0, y = 4, q1 = 0.5, q2 = 0.6, q3 = 0.8)
Joint Probability Mass Function for a Bivariate Poisson Distribution via Conditional Specification
Description
Computes the joint probability mass function (p.m.f.) of a Bivariate Poisson Conditionals distribution (BPCD) based on Ghosh, Marques, and Chakraborty (2021).
Usage
dpoisBCD(x, y, lambda1, lambda2, lambda3)
Arguments
x |
value of X |
y |
value of Y |
lambda1 |
rate parameter for X |
lambda2 |
rate parameter for Y |
lambda3 |
dependence parameter that must be in (0, 1] |
Details
The joint p.m.f. of the BPCD is
P(X = x, Y = y) = K(\lambda_1, \lambda_2, \lambda_3) \frac{\lambda_1^x \lambda_2^y \lambda_3^{xy}}{x! y!},
where x, y = 0, 1, 2, \ldots, and K(\lambda_1, \lambda_2, \lambda_3) is the normalizing constant computed by the function normalize_constant_BPCD.
Key properties of the BPCD include:
- Negative correlation for \lambda_3 < 1,
- Independence for \lambda_3 = 1.
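The independence property can be checked numerically. The sketch below is an assumption-laden illustration rather than the package's normalize_constant_BPCD: it sums over y in closed form, so that 1/K = \sum_{x \ge 0} (\lambda_1^x / x!) \exp(\lambda_2 \lambda_3^x), truncates the sum at a hypothetical x_max, and compares the \lambda_3 = 1 case with a product of Poisson p.m.f.s. The function names are hypothetical.

```r
# Normalizing constant of the BPCD via a truncated single sum over x.
bpcd_norm_const <- function(lambda1, lambda2, lambda3, x_max = 200) {
  x <- 0:x_max
  1 / sum(exp(x * log(lambda1) - lgamma(x + 1) + lambda2 * lambda3^x))
}

# Joint p.m.f. at (x, y).
bpcd_pmf <- function(x, y, lambda1, lambda2, lambda3) {
  bpcd_norm_const(lambda1, lambda2, lambda3) *
    lambda1^x * lambda2^y * lambda3^(x * y) / (factorial(x) * factorial(y))
}

bpcd_pmf(x = 1, y = 2, lambda1 = 0.5, lambda2 = 0.5, lambda3 = 0.5)

# With lambda3 = 1 the joint p.m.f. factorizes into two Poisson p.m.f.s
all.equal(bpcd_pmf(1, 2, 0.5, 0.5, 1), dpois(1, 0.5) * dpois(2, 0.5))
```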
Value
The probability P(X = x, Y = y) for each pair of x and y.
References
Ghosh, I., Marques, F., & Chakraborty, S. (2021). A new bivariate Poisson distribution via conditional specification: properties and applications. Journal of Applied Statistics, 48(16), 3025-3047. doi:10.1080/02664763.2020.1793307
Examples
# Compute P(X = 1, Y = 2) with lambda1 = 0.5, lambda2 = 0.5, lambda3 = 0.5
dpoisBCD(x = 1, y = 2, lambda1 = 0.5, lambda2 = 0.5, lambda3 = 0.5)
# Compute P(X = 0, Y = 1) with lambda1 = 0.5, lambda2 = 0.5, lambda3 = 0.5
dpoisBCD(x = 0, y = 1, lambda1 = 0.5, lambda2 = 0.5, lambda3 = 0.5)
English Premier League Goals (2014–2025)
Description
A list of data frames for eleven consecutive seasons (2014/15 to 2024/25) from the English Premier League. Each data frame contains the number of full-time home ('X') and away ('Y') goals scored in each match of the season.
Usage
data(eplSeasonGoals)
Format
A named list of 11 data frames:
- 1415
380 rows, variables: X (home goals), Y (away goals)
- 1516
380 rows, variables: X (home goals), Y (away goals)
- 1617
380 rows, variables: X (home goals), Y (away goals)
- 1718
380 rows, variables: X (home goals), Y (away goals)
- 1819
380 rows, variables: X (home goals), Y (away goals)
- 1920
380 rows, variables: X (home goals), Y (away goals)
- 2021
380 rows, variables: X (home goals), Y (away goals)
- 2122
380 rows, variables: X (home goals), Y (away goals)
- 2223
380 rows, variables: X (home goals), Y (away goals)
- 2324
380 rows, variables: X (home goals), Y (away goals)
- 2425
380 rows, variables: X (home goals), Y (away goals)
Details
Data source: English Premier League match results from https://football-data.co.uk/ (formerly hosted on datahub.io).
References
Ghosh, I., Marques, F., & Chakraborty, S. (2021). A new bivariate Poisson distribution via conditional specification: properties and applications. Journal of Applied Statistics, 48(16), 3025-3047. doi:10.1080/02664763.2020.1793307
Examples
data(eplSeasonGoals)
head(eplSeasonGoals[["1415"]])
head(eplSeasonGoals[["2425"]])
Surface and Interior Faults in 100 Lenses
Description
This dataset records counts of surface faults (X) and interior faults (Y) observed in 100 optical lenses.
Usage
lensfaults
Format
A data frame with 100 rows and 2 variables:
- X
Number of surface faults in a lens.
- Y
Number of interior faults in the same lens.
References
Aitchison, J., & Ho, C. H. (1989). The multivariate Poisson-log normal distribution. Biometrika, 76(4), 643–653.
Ghosh, I., Marques, F., & Chakraborty, S. (2021). A new bivariate Poisson distribution via conditional specification: properties and applications. Journal of Applied Statistics, 48(16), 3025-3047. doi:10.1080/02664763.2020.1793307
Cumulative Distribution Function for a Bivariate Binomial Distribution via Conditional Specification
Description
Computes the cumulative distribution function (c.d.f.) of a bivariate binomial conditionals distribution (BBCD) as defined by Ghosh, Marques, and Chakraborty (2025).
Usage
pbinomBCD(x, y, n1, n2, p1, p2, lambda)
Arguments
x |
value at which the c.d.f. is evaluated |
y |
value at which the c.d.f. is evaluated |
n1 |
number of trials for X |
n2 |
number of trials for Y |
p1 |
base success probability for X |
p2 |
base success probability for Y |
lambda |
dependence parameter, must be positive. |
Value
The probability P(X \leq x, Y \leq y).
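Since P(X \leq x, Y \leq y) is the sum of the joint p.m.f. over all pairs (i, j) with i \leq x and j \leq y, a c.d.f. table can be built by cumulating a p.m.f. table along both margins. The sketch below does this in base R for a BBCD kernel; joint_cdf_table is a hypothetical helper and is not claimed to be how pbinomBCD works internally.

```r
# Turn a joint p.m.f. table (rows: x = 0, 1, ...; columns: y = 0, 1, ...)
# into a joint c.d.f. table by cumulating along both margins.
joint_cdf_table <- function(pmf) {
  cdf <- apply(pmf, 2, cumsum)          # cumulate over x within each column
  cdf <- t(apply(cdf, 1, cumsum))       # then cumulate over y within each row
  dimnames(cdf) <- dimnames(pmf)
  cdf
}

# BBCD p.m.f. table built directly from the kernel (exact, finite support)
n1 <- 5; n2 <- 5; p1 <- 0.5; p2 <- 0.4; lambda <- 0.5
x <- 0:n1; y <- 0:n2
kernel <- outer(dbinom(x, n1, p1), dbinom(y, n2, p2)) * lambda^outer(x, y)
pmf <- kernel / sum(kernel)
dimnames(pmf) <- list(x = x, y = y)

cdf <- joint_cdf_table(pmf)
cdf["2", "1"]                           # P(X <= 2, Y <= 1)
```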
References
Ghosh, I., Marques, F., & Chakraborty, S. (2025). A form of bivariate binomial conditionals distributions. Communications in Statistics - Theory and Methods, 54(2), 534–553. doi:10.1080/03610926.2024.2315294
Examples
# Compute P(X \le 2, Y \le 5) with n1 = 5, n2 = 5, p1 = 0.5, p2 = 0.4, lambda = 0.5
pbinomBCD(x = 2, y = 5, n1 = 5, n2 = 5, p1 = 0.5, p2 = 0.4, lambda = 0.5)
# Example with independence (lambda = 1)
pbinomBCD(x = 1, y = 1, n1 = 10, n2 = 10, p1 = 0.3, p2 = 0.6, lambda = 1)
Cumulative Distribution Function for a Bivariate Geometric Distribution via Conditional Specification
Description
Computes the cumulative distribution function (c.d.f.) of a bivariate geometric conditionals distribution (BGCD) based on Ghosh, Marques, and Chakraborty (2023).
Usage
pgeomBCD(x, y, q1, q2, q3)
Arguments
x |
value at which the c.d.f. is evaluated |
y |
value at which the c.d.f. is evaluated |
q1 |
probability parameter for X |
q2 |
probability parameter for Y |
q3 |
dependence parameter, in (0, 1] |
Value
The probability P(X \leq x, Y \leq y).
References
Ghosh, I., Marques, F., & Chakraborty, S. (2023) A bivariate geometric distribution via conditional specification: properties and applications, Communications in Statistics - Simulation and Computation, 52:12, 5925–5945, doi:10.1080/03610918.2021.2004419
Examples
# Compute P(X \le 1, Y \le 2) with q1 = 0.5, q2 = 0.6, q3 = 0.8
pgeomBCD(x = 1, y = 2, q1 = 0.5, q2 = 0.6, q3 = 0.8)
# Example with small values
pgeomBCD(x = 0, y = 0, q1 = 0.4, q2 = 0.3, q3 = 0.9)
Cumulative Distribution Function for a Bivariate Poisson Distribution via Conditional Specification
Description
Computes the cumulative distribution function (c.d.f.) of a bivariate Poisson conditionals distribution (BPCD), as described by Ghosh, Marques, and Chakraborty (2021).
Usage
ppoisBCD(x, y, lambda1, lambda2, lambda3)
Arguments
x |
value at which the c.d.f. is evaluated |
y |
value at which the c.d.f. is evaluated |
lambda1 |
rate parameter for X |
lambda2 |
rate parameter for Y |
lambda3 |
dependence parameter that must be in (0, 1] |
Value
The probability P(X \leq x, Y \leq y).
References
Ghosh, I., Marques, F., & Chakraborty, S. (2021). A new bivariate Poisson distribution via conditional specification: properties and applications. Journal of Applied Statistics, 48(16), 3025-3047. doi:10.1080/02664763.2020.1793307
Examples
# Compute P(X \le 1, Y \le 1) with lambda1 = 0.5, lambda2 = 0.5, lambda3 = 0.5
ppoisBCD(x = 1, y = 1, lambda1 = 0.5, lambda2 = 0.5, lambda3 = 0.5)
# Example with larger values
ppoisBCD(x = 2, y = 2, lambda1 = 1.0, lambda2 = 1.0, lambda3 = 0.8)
Random Sampling from a Bivariate Binomial Distribution via Conditional Specification
Description
Generates random samples from a bivariate binomial conditionals distribution (BBCD).
Usage
rbinomBCD(n, n1, n2, p1, p2, lambda, seed = 123, verbose = TRUE)
Arguments
n |
number of samples to generate. |
n1 |
number of trials for X |
n2 |
number of trials for Y |
p1 |
base success probability for X |
p2 |
base success probability for Y |
lambda |
dependence parameter, must be positive. |
seed |
seed for random number generation (default = 123). |
verbose |
logical; if TRUE (default), prints progress updates and a summary. |
Value
A data frame with columns 'X' and 'Y', containing the sampled values.
Examples
samples <- rbinomBCD(n = 100, n1 = 10, n2 = 10, p1 = 0.5, p2 = 0.4, lambda = 1.2)
head(samples)
Random Sampling from a Bivariate Geometric Distribution via Conditional Specification
Description
Generates random samples from a bivariate geometric conditionals distribution (BGCD).
Usage
rgeomBCD(n, q1, q2, q3, seed = 123)
Arguments
n |
number of samples to generate |
q1 |
probability parameter for X |
q2 |
probability parameter for Y |
q3 |
dependence parameter, in (0, 1] |
seed |
seed for random number generation (default = 123) |
Value
A data frame with two columns: 'X' and 'Y', containing the sampled values.
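This page does not say which algorithm rgeomBCD uses. One way to draw from a distribution with geometric conditionals is a Gibbs sampler, sketched below under that assumption: from the joint p.m.f. q_1^x q_2^y q_3^{xy}, X given Y = y is geometric on 0, 1, ... with success probability 1 - q_1 q_3^y, and symmetrically for Y given X = x. The function gibbs_bgcd and its burn_in argument are hypothetical.

```r
# Gibbs sampler for the BGCD built from its two geometric conditionals.
# Successive draws are autocorrelated, unlike an exact i.i.d. sampler.
gibbs_bgcd <- function(n, q1, q2, q3, burn_in = 500) {
  x <- 0; y <- 0
  out <- matrix(0, nrow = n, ncol = 2, dimnames = list(NULL, c("X", "Y")))
  for (i in seq_len(n + burn_in)) {
    x <- rgeom(1, prob = 1 - q1 * q3^y)   # X | Y = y
    y <- rgeom(1, prob = 1 - q2 * q3^x)   # Y | X = x
    if (i > burn_in) out[i - burn_in, ] <- c(x, y)
  }
  as.data.frame(out)
}

set.seed(123)
samples <- gibbs_bgcd(n = 2000, q1 = 0.5, q2 = 0.5, q3 = 0.1)
head(samples)
cor(samples$X, samples$Y)                 # sample correlation for q3 < 1
```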
Examples
# Generate 100 samples
samples <- rgeomBCD(n = 100, q1 = 0.5, q2 = 0.5, q3 = 0.00001)
head(samples)
cor(samples$X, samples$Y) # Should be negative
Random Sampling from a Bivariate Poisson Distribution via Conditional Specification
Description
Generates random samples from a bivariate Poisson conditionals distribution (BPCD).
Usage
rpoisBCD(n, lambda1, lambda2, lambda3, seed = 123)
Arguments
n |
number of samples to generate |
lambda1 |
rate parameter for X |
lambda2 |
rate parameter for Y |
lambda3 |
dependence parameter that must be in (0, 1] |
seed |
seed for random number generation (default = 123) |
Value
A data frame with columns 'X' and 'Y', containing the sampled values.
Examples
samples <- rpoisBCD(n = 100, lambda1 = 0.5, lambda2 = 0.5, lambda3 = 0.5)
cor(samples$X, samples$Y) # Should be negative
Seed and Plant Count Data
Description
This dataset records the number of seeds sown and the number of resulting plants grown over plots of fixed area (5 square feet).
Usage
seedplant
Format
A data frame with n rows and 2 variables:
- X
Number of seeds sown.
- Y
Number of plants grown.
References
Lakshminarayana, J., Pandit, S. N. N., & Srinivasa Rao, K. (1999). On a bivariate Poisson distribution. Communications in Statistics - Theory and Methods, 28(2), 267–276. doi:10.1080/03610929908832297
Ghosh, I., Marques, F., & Chakraborty, S. (2025). A form of bivariate binomial conditionals distributions. Communications in Statistics - Theory and Methods, 54(2), 534–553. doi:10.1080/03610926.2024.2315294
Examples
data(seedplant)
head(seedplant)
plot(seedplant$X, seedplant$Y,
xlab = "Seeds Sown",
ylab = "Plants Grown",
main = "Seed vs. Plant Count per Plot")
Railway Shunter Accident Data (1937–1947)
Description
Accident records for 122 experienced railway shunters across two historical periods.
Usage
shacc
Format
A data frame with 122 rows and 2 variables:
- X
Number of accidents during the 6-year period from 1937 to 1942.
- Y
Number of accidents during the 5-year period from 1943 to 1947.
This dataset is useful for analyzing accident rates before and after possible policy or operational changes.
References
Arbous, A. G., & Kerrich, J. E. (1951). Accident statistics and the concept of accident-proneness. Biometrics, 7(4), 340. doi:10.2307/3001656
Ghosh, I., Marques, F., & Chakraborty, S. (2025). A form of bivariate binomial conditionals distributions. Communications in Statistics - Theory and Methods, 54(2), 534–553. doi:10.1080/03610926.2024.2315294
Examples
data(shacc)
head(shacc)
plot(shacc$X, shacc$Y, xlab = "Accidents 1937–42", ylab = "Accidents 1943–47")