Version: | 2.0.0.9.3 |
Date: | 2023-08-21 |
Title: | Fitting Single and Mixture of Generalised Lambda Distributions |
Maintainer: | Steve Su <allegro.su@gmail.com> |
Depends: | cluster, grDevices, graphics, stats, spacefillr |
Description: | The fitting algorithms considered in this package have two major objectives. One is to provide a smoothing device to fit distributions to data using the weight and unweighted discretised approach based on the bin width of the histogram. The other is to provide a definitive fit to the data set using the maximum likelihood and quantile matching estimation. Other methods such as moment matching, starship method, L moment matching are also provided. Diagnostics on goodness of fit can be done via qqplots, KS-resample tests and comparing mean, variance, skewness and kurtosis of the data with the fitted distribution. References include the following: Karvanen and Nuutinen (2008) "Characterizing the generalized lambda distribution by L-moments" <doi:10.1016/j.csda.2007.06.021>, King and MacGillivray (1999) "A starship method for fitting the generalised lambda distributions" <doi:10.1111/1467-842X.00089>, Su (2005) "A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data" <doi:10.22237/jmasm/1130803560>, Su (2007) "Nmerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions" <doi:10.1016/j.csda.2006.06.008>, Su (2007) "Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R" <doi:10.18637/jss.v021.i09>, Su (2009) "Confidence Intervals for Quantiles Using Generalized Lambda Distributions" <doi:10.1016/j.csda.2009.02.014>, Su (2010) "Chapter 14: Fitting GLDs and Mixture of GLDs to Data using Quantile Matching Method" <doi:10.1201/b10159>, Su (2010) "Chapter 15: Fitting GLD to data using GLDEX 1.0.4 in R" <doi:10.1201/b10159>, Su (2015) "Flexible Parametric Quantile Regression Model" <doi:10.1007/s11222-014-9457-1>, Su (2021) "Flexible parametric accelerated failure time model"<doi:10.1080/10543406.2021.1934854>. |
License: | GPL (≥ 3) |
NeedsCompilation: | yes |
Packaged: | 2023-08-20 23:06:07 UTC; allegro |
Author: | Steve Su |
Repository: | CRAN |
Date/Publication: | 2023-08-21 04:02:41 UTC |
This package fits RS and FMKL generalised lambda distributions using various methods. It also provides functions for fitting bimodal distributions using mixtures of generalised lambda distributions.
Description
The fitting algorithms considered in this package have two major objectives. One is to provide a smoothing device to fit distributions to data using the weight and unweighted discretised approach based on the bin width of the histogram. The other is to provide a definitive fit to the data set using the maximum likelihood estimation.
Copyright Information: To ensure the stability of this package, this package ports other functions from other open sourced packages directly so that any changes in other packages will not cause this package to malfunction.
All functions obtained from other sources have been acknowledged by the author in the authorship or the description sections of the help files and they are freely available online for all to use. Please contact the author for any copyright issues.
Specifically the following functions have been modified from R:
hist.su, ks.gof, pretty.su
The following functions are taken from other open source packages in R:
runif.pseudo, rnorm.pseudo, runif.halton, rnorm.halton, runif.sobol, rnorm.sobol by Diethelm Wuertz distributed under GPL.
digitsBase, QUnif and sHalton written by Martin Maechler distributed under GPL.
dgl, pgl, rgl, qgl, starship.adpativegrid, starship.obj and starship written by Robert King and some functions modified by Steve Su distributed under GPL.
Lmoments and t1lmoments written by Juha Karvanen distributed under GPL.
Details
This package allows a direct fitting method onto the data set using
fun.RMFMKL.ml
, fun.RMFMKL.ml.m
, fun.RMFMKL.hs
,
fun.RMFMKL.hs.nw
, fun.RPRS.ml
, fun.RPRS.ml.m
,
fun.RPRS.hs
, fun.RPRS.hs.nw
,
fun.RMFMKL.qs
, fun.RPRS.qs
,
fun.RMFMKL.mm
, fun.RPRS.mm
,
fun.RMFMKL.lm
, fun.RPRS.lm
and in the case of
bimodal data set: fun.auto.bimodal.qs
,
fun.auto.bimodal.ml
,
fun.auto.bimodal.pml
functions.
The resulting fits can be graphically gauged by fun.plot.fit
or
fun.plot.fit.bm
(for bimodal data), or examined by numbers
using the Kolmogorov-Smirnoff resample tests (fun.diag.ks.g
) and
fun.diag.ks.g.bimodal
). For unimodal data fits, it is also
possible to compare the mean, variance, skewness and kurtosis of the fitted
distribution with the data set using fun.comp.moments.ml
and
fun.comp.moments.ml.2
functions. Similarly, for bimodal data fits,
this is done via fun.theo.bi.mv.gld
and fun.moments.r
.
Additionally, L moments for single generalised lambda distribution can be
obtained using fun.lm.theo.gld
. For graphical display of
goodness of fit, quantile plots can be used, these can be done using
qqplot.gld
and qqplot.gld.bi
for univariate
and bimodal generalised lambda distribution fits respectively.
Author(s)
Steve Su <allegro.su@gmail.com>
References
Asquith, W. (2007), L-moments and TL-moments of the generalized lambda distribution, Computational Statistics and Data Analysis 51(9): 4484-4496.
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Gilchrist, Warren G. (2000), Statistical Modelling with Quantile Functions, Chapman & Hall
Karian, Z.A., Dudewicz, E.J., and McDonald, P. (1996), The extended generalized lambda distribution system for fitting distributions to data: history, completion of theory, tables, applications, the “Final Word” on Moment fits, Communications in Statistics - Simulation and Computation *25*, 611-642.
Karian, Zaven A. and Dudewicz, Edward J. (2000), Fitting statistical distributions: the Generalized Lambda Distribution and Generalized Bootstrap methods, Chapman & Hall
Karvanen, J. and A. Nuutinen (2008), Characterizing the generalized lambda distribution by L-moments, Computational Statistics and Data Analysis 52(4): 1971-1983. doi:10.1016/j.csda.2007.06.021
King, R.A.R. & MacGillivray, H. L. (1999), A starship method for fitting the generalised lambda distributions, Australian and New Zealand Journal of Statistics, 41, 353-374 doi:10.1111/1467-842X.00089
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424. doi:10.22237/jmasm/1130803560
Su, S. (2007). Nmerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational Statistics and Data Analysis: *51*, 8, 3983-3998. doi:10.1016/j.csda.2006.06.008
Su, S. (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9. doi:10.18637/jss.v021.i09
Su, S. (2009). Confidence Intervals for Quantiles Using Generalized Lambda Distributions. Computational Statistics and Data Analysis *53*, 9, 3324-3333. doi:10.1016/j.csda.2009.02.014
Su, S. (2010). Chapter 14: Fitting GLDs and Mixture of GLDs to Data using Quantile Matching Method. Handbook of Distribution Fitting Methods with R. (Karian and Dudewicz) 557-583. doi:10.1201/b10159-3
Su, S. (2010). Chapter 15: Fitting GLD to data using GLDEX 1.0.4 in R. Handbook of Distribution Fitting Methods with R. (Karian and Dudewicz) 585-608. doi:10.1201/b10159-3
Su, S. (2015) "Flexible Parametric Quantile Regression Model" Statistics & Computing 25 (3). 635-650. doi:10.1007/s11222-014-9457-1
Su S. (2021) "Flexible parametric accelerated failure time model" J Biopharm Stat. 2021 Sep 31(5):650-667. doi:10.1080/10543406.2021.1934854
See Also
GLDreg package in R for GLD regression models.
Examples
###Univariate distributions example:
set.seed(1000)
junk<-rweibull(300,3,2)
##A faster ML estimtion
junk.fit1<-fun.RMFMKL.ml.m(junk)
junk.fit2<-fun.RPRS.ml.m(junk)
qqplot.gld(junk.fit1,data=junk,param="fmkl")
qqplot.gld(junk.fit2,data=junk,param="rs")
##Using discretised approach, with 50 classes
#Using discretised method with weights
obj.fit1.hs<-fun.data.fit.hs(junk)
#Plot the resulting fit. The fun.plot.fit function also works for singular fits
#such as those by fun.plot.fit(obj.fit1.ml,junk,nclass=50,
#param=c("rs","fmkl","fmkl"),xlab="x")
fun.plot.fit(obj.fit1.hs,junk,nclass=50,param=c("rs","fmkl"),xlab="x")
#Compare the mean, variance, skewness and kurtosis of the fitted distribution
#with actual data
fun.theo.mv.gld(obj.fit1.hs[1,1],obj.fit1.hs[2,1],obj.fit1.hs[3,1],
obj.fit1.hs[4,1],param="rs")
fun.theo.mv.gld(obj.fit1.hs[1,2],obj.fit1.hs[2,2],obj.fit1.hs[3,2],
obj.fit1.hs[4,2],param="fmkl")
fun.moments.r(junk)
#Conduct resample KS tests
fun.diag.ks.g(obj.fit1.hs[,1],junk,param="rs")
fun.diag.ks.g(obj.fit1.hs[,2],junk,param="fmkl")
##Try another fit, say 15 classes
obj.fit2.hs<-fun.data.fit.hs(junk,rs.default="N",fmkl.default="N",no.c.rs = 15,
no.c.fmkl = 15)
fun.plot.fit(obj.fit2.hs,junk,nclass=50,param=c("rs","fmkl"),xlab="x")
fun.theo.mv.gld(obj.fit2.hs[1,1],obj.fit2.hs[2,1],obj.fit2.hs[3,1],
obj.fit2.hs[4,1],param="rs")
fun.theo.mv.gld(obj.fit2.hs[1,2],obj.fit2.hs[2,2],obj.fit2.hs[3,2],
obj.fit2.hs[4,2],param="fmkl")
fun.moments.r(junk)
fun.diag.ks.g(obj.fit2.hs[,1],junk,param="rs")
fun.diag.ks.g(obj.fit2.hs[,2],junk,param="fmkl")
##Uses the maximum likelihood estimation method
#Fit the function using fun.data.fit.ml
obj.fit1.ml<-fun.data.fit.ml(junk)
#Plot the resulting fit
fun.plot.fit(obj.fit1.ml,junk,nclass=50,param=c("rs","fmkl","fmkl"),xlab="x",
name=".ML")
#Compare the mean, variance, skewness and kurtosis of the fitted distribution
#with actual data
fun.comp.moments.ml(obj.fit1.ml,junk)
#Do a quantile plot
qqplot.gld(junk,obj.fit1.ml[,1],param="rs",name="RS ML fit")
#Run a KS resample test on the resulting fit
fun.diag2(obj.fit1.ml,junk,1000)
#It is possible to use say fun.data.fit.ml(junk,rs.leap=409,fmkl.leap=409,
#FUN="QUnif") to find solution under a different set of low discrepancy number
#generators.
###Bimodal distributions example:
#Fitting mixture of generalised lambda distributions on the data set using both
#the maximum likelihood and partition maximum likelihood and plot the resulting
#fits
opar <- par()
par(mfrow=c(2,1))
junk<-fun.auto.bimodal.ml(faithful[,1],per.of.mix=0.01,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
name="Maximum likelihood using",xlab="faithful1",param.vec=c("rs","fmkl"))
#Do a quantile plot
qqplot.gld.bi(faithful[,1],junk$par,param1="rs",param2="fmkl",
name="RS FMKL ML fit",range=c(0.001,0.999))
par(opar)
junk<-fun.auto.bimodal.pml(faithful[,1],clustering.m=clara,init1.sel="rprs",
init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
name="Partition maximum likelihood using",xlab="faithful1",
param.vec=c("rs","fmkl"))
#Fit the faithful[,1] data from the dataset library using sobol sequence
#generator for the first distribution fit and halton sequence for the second
#distribution fit.
fit1<-fun.auto.bimodal.ml(faithful[,1],init1.sel="rmfmkl",init2.sel="rmfmkl",
init1=c(-0.25,1.5),init2=c(-0.25,1.5),leap1=3,leap2=3,fun1="runif.sobol",
fun2="runif.halton")
#Run diagnostic KS tests
fun.diag.ks.g.bimodal(fit1$par[1:4],fit1$par[5:8],prop1=fit1$par[9],
data=faithful[,1],param1="fmkl",param2="fmkl")
#Find the theoretical moments of the fit
fun.theo.bi.mv.gld(fit1$par[1],fit1$par[2],fit1$par[3],fit1$par[4],"fmkl",
fit1$par[5],fit1$par[6],fit1$par[7],fit1$par[8],"fmkl",fit1$par[9])
#Compare this with the empirical moments from the data set.
fun.moments.r(faithful[,1])
#Do a quantile plot
qqplot.gld.bi(faithful[,1],fit1$par,param1="fmkl",param2="fmkl",
name="FMKL FMKL ML fit")
#Quantile matching method
#Fitting faithful data from the dataset library, with the clara clustering
#regime. The first distribution is RS and the second distribution is fmkl.
#The percentage of data mix is 1%.
#Fitting normal(3,2) distriution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.qs(junk)
fun.auto.bimodal.qs(faithful[,1],per.of.mix=0.01,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1,5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
#L Moment matching
#Fitting normal(3,2) distriution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.lm(junk)
# Moment matching method
#Fitting normal(3,2) distriution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.mm(junk)
# Example on fitting mixture of normal distributions
data1<-c(rnorm(1500,-1,2/3),rnorm(1500,1,2/3))
junk<-fun.auto.bimodal.ml(data1,per.of.mix=0.01,clustering.m=data1>0,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=data1,
name="Maximum likelihood using",xlab="faithful1",param.vec=c("rs","fmkl"))
qqplot.gld.bi(data1,junk$par,param1="rs",param2="fmkl",
name="RS FMKL ML fit",range=c(0.001,0.999))
# Generate random observations from FMKL generalised lambda distributions with
# parameters (1,2,3,4) and (4,3,2,1) with 50% of data from each distribution.
fun.simu.bimodal(c(1,2,3,4),c(4,3,2,1),prop1=0.5,param1="fmkl",param2="fmkl")
This is a collection of functions designed to find the initial values under the method of moments for FMKL generalised lambda distribution. It also includes basic FMKL GLD functions.
Description
These functions are for maintainers only and it is not designed for the users of this package.
Value
List of computation results for various fitting algorithms, not intended to be used by users of this package
Note
Please contact the author directly if you find a bug!
Author(s)
Steve Su
This is a collection of functions designed to implement the fitting algorithms for all the methods covered in this package.
Description
These functions are for maintainers only and it is not designed for the users of this package.
Value
List of unique optimisation parameters and objective values under different fitting methods, not intended to be used directly by users of this package
Note
Please contact the author directly if you find a bug!
Author(s)
Steve Su
This is a collection of functions designed to implement the basic GLD functions.
Description
These functions are for maintainers only and it is not designed for the users of this package.
Value
Various basic calculations for generalised lambda distributions, not intended to be used by users of this package
Note
Please contact the author directly if you find a bug!
Author(s)
Steve Su
This is a collection of functions designed to find the initial values of method of moments for RS generalised lambda distribution. It also includes basic RS GLD functions.
Description
These functions are for maintainers only and it is not designed for the users of this package.
Value
List of computation results for various fitting algorithms, not intended to be used by users of this package
Note
Please contact the author directly if you find a bug!
Author(s)
Steve Su
The Generalised Lambda Distribution Family
Description
Density, quantile density, distribution function, quantile
function and random generation for the generalised lambda
distribution (also known as the asymmetric lambda, or Tukey
lambda). Works for both the "fmkl"
and "rs"
parameterisations. These functions originate from the gld
library by Robert King and they are modified in this package
to allow greater functionality and adaptability to new fitting
methods. It does not give an error message for invalid distributions
but will return NAs instead. To allow comparability with the pkg(gld)
package, this package uses the same notation and description as those
written by Robert King.
Usage
dgl(x, lambda1 = 0, lambda2 = NULL, lambda3 = NULL, lambda4 = NULL,
param = "fmkl", inverse.eps = 1e-08,
max.iterations = 500)
pgl(q, lambda1 = 0, lambda2 = NULL, lambda3 = NULL, lambda4 = NULL,
param = "fmkl", inverse.eps = 1e-08,
max.iterations = 500)
qgl(p, lambda1, lambda2 = NULL, lambda3 = NULL, lambda4 = NULL,
param = "fmkl")
rgl(n, lambda1=0, lambda2 = NULL, lambda3 = NULL, lambda4 = NULL,
param = "fmkl")
Arguments
x |
Vector of actual values for |
p |
Vector of probabilities for |
q |
Vector of quantiles for |
n |
Number of observations to be generated for |
lambda1 |
This can be either a single numeric value or a vector. If it
is a vector, it must be of length 4 for parameterisations |
lambda2 |
Scale parameter |
lambda3 |
First shape parameter |
lambda4 |
Second shape parameter |
param |
|
inverse.eps |
Accuracy of calculation for the numerical determination of F(x), defaults to 1e-8 |
max.iterations |
Maximum number of iterations in the numerical determination of F(x), defaults to 500 |
Details
The generalised lambda distribution, also known as the asymmetric lambda, or Tukey lambda distribution, is a distribution with a wide range of shapes. The distribution is defined by its quantile function, the inverse of the distribution function. The 'gld' package implements three parameterisations of the distribution. The default parameterisation (the FMKL) is that due to Freimer Mudholkar, Kollia and Lin (1988) (see references below), with a quantile function:
F^{-1}(u)= \lambda_1 + { { \frac{u^{\lambda_3}-1}{\lambda_3} -
\frac{(1-u)^{\lambda_4}-1}{\lambda_4} } \over \lambda_2 }
for \lambda_2 > 0
.
A second parameterisation, the RS, chosen by setting param="rs"
is
that due to Ramberg and Schmeiser (1974), with the quantile function:
F^{-1}(u)= \lambda_1 + \frac{u^{\lambda_3} - (1-u)^{\lambda_4}}
{\lambda_2}
This parameterisation has a complex series of rules determining
which values of the parameters produce valid statistical
distributions. See gl.check.lambda
for details.
Value
dgl |
gives the density (based on the quantile density and a
numerical solution to |
qdgl |
gives the quantile density, |
pgl |
gives the distribution function (based on a numerical
solution to |
qgl |
gives the quantile function, and |
rgl |
generates random observations. |
References
Freimer, M., Kollia, G., Mudholkar, G. S. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Gilchrist, Warren G. (2000), Statistical Modelling with Quantile Functions, Chapman & Hall
Karian, Z.A., Dudewicz, E.J., and McDonald, P. (1996), The extended generalized lambda distribution system for fitting distributions to data: history, completion of theory, tables, applications, the “Final Word” on Moment fits, Communications in Statistics - Simulation and Computation *25*, 611-642.
Karian, Zaven A. and Dudewicz, Edward J. (2000), Fitting statistical distributions: the Generalized Lambda Distribution and Generalized Bootstrap methods, Chapman & Hall
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
Examples
qgl(seq(0,1,0.02),0,1,0.123,-4.3)
pgl(seq(-2,2,0.2),0,1,-.1,-.2,param="fmkl",inverse.eps=.Machine$double.eps)
L-moments
Description
Calculates sample L-moments, L-coefficients and covariance matrix of L-moments.
Usage
Lmoments(data,rmax=4,na.rm=FALSE,returnobject=FALSE,trim=c(0,0))
Lcoefs(data,rmax=4,na.rm=FALSE,trim=c(0,0))
Lmomcov(data,rmax=4,na.rm=FALSE)
Lmoments_calc(data,rmax=4)
Lmomcov_calc(data,rmax=4)
Arguments
data |
matrix or data frame. |
rmax |
maximum order of L-moments. |
na.rm |
a logical value indicating whether 'NA' values should be removed before the computation proceeds. |
returnobject |
a logical value indicating whether a list object should be returned instead of an array of L-moments. |
trim |
c(0,0) for ordinary L-moments and c(1,1) for trimmed (t=1) L-moments |
Value
Lmoments |
returns an array of L-moments containing a row for each variable in data, or if returnobject=TRUE, a list containing the following: |
lambdas |
an array of L-moments |
ratios |
an array of mean, L-scale and L-moment ratios |
trim |
the value of the parameter 'trim' |
source |
a string with value "Lmoments" or "t1lmoments" |
Lcoefs |
returns an array of L-coefficients (mean, L-scale, L-skewness, L-kurtosis, ...) |
Lmomcov |
returns the covariance matrix of L-moments or a list of covariance matrices if the input has multiple columns. |
Lmoments_calc |
is internal function. |
Lmomcov_calc |
is internal function. |
Note
Functions Lmoments
and Lcoefs
calculate trimmed L-moments
if you specify trim=c(1,1)
.
Author(s)
Juha Karvanen <juha.karvanen@ktl.fi>
References
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Elamir, E. A., Seheult, A. H. 2004. "Exact variance structure of sample L-moments" Journal of Statistical Planning and Inference 124 (2) 337-359.
Hosking, J. 1990. "L-moments: Analysis and estimation distributions using linear combinations of order statistics", Journal of Royal Statistical Society B 52, 105-124.
See Also
t1lmoments
for trimmed L-moments
Examples
x<-rnorm(500)
Lmoments(x)
This is a collection of functions used in the optimisation processes for all the fitting methods covered in this package.
Description
These functions are for maintainers only and it is not designed for the users of this package.
Value
Optimisation value under different fitting schemes, not intended to be used by users of this package
Note
Please contact the author directly if you find a bug!
Author(s)
Steve Su
Quasi Randum Numbers via Halton Sequences
Description
These functions provide quasi random numbers or space filling or
low discrepancy sequences in the p
-dimensional unit cube.
Usage
sHalton(n.max, n.min = 1, base = 2, leap = 1)
QUnif (n, min = 0, max = 1, n.min = 1, p, leap = 1)
Arguments
n.max |
maximal (sequence) number. |
n.min |
minimal sequence number. |
n |
number of |
base |
integer |
min , max |
lower and upper limits of the univariate intervals.
Must be of length 1 or |
p |
dimensionality of space (the unit cube) in which points are generated. |
leap |
integer indicating (if |
Value
sHalton(n,m)
returns a numeric vector of length n-m+1
of
values in [0,1]
.
QUnif(n, min, max, n.min, p=p)
generates n-n.min+1
p-dimensional points in [min,max]^p
returning a numeric matrix
with p columns.
Note
For leap
Kocis and Whiten recommend values of
L=31,61,149,409
, and particularly the L=409
for dimensions
up to 400.
Author(s)
Martin Maechler
References
James Gentle (1998) Random Number Generation and Monte Carlo Simulation; sec.\ 6.3. Springer.
Kocis, L. and Whiten, W.J. (1997) Computationsl Investigations of Low-Discrepancy Sequences. ACM Transactions of Mathematical Software 23, 2, 266–294.
Examples
32*sHalton(20, base=2)
QUnif(n=10,p=2,leap=409)
Digit/Bit Representation of Integers in any Base
Description
Integer number representations in other Bases.
Formally, for every element N =
x[i]
, compute the (vector
of) “digits” A
of the base
b
representation of the number N
, N = \sum_{k=0}^M A_{M-k} b ^ k
.
Revert such a representation to integers.
Usage
digitsBase(x, base = 2, ndigits = 1 + floor(log(max(x), base)))
Arguments
x |
For For |
base |
integer, at least 2 specifying the base for representation. |
ndigits |
number of bits/digits to use. |
Value
For digitsBase()
, an object, say m
, of class
"basedInt"
which is basically a (ndigits
x n
)
matrix
where m[,i]
corresponds to x[i]
,
n <- length(x)
and attr(m,"base")
is the input
base
.
as.intBase()
and the as.integer
method for
basedInt
objects return an integer
vector.
Note
digits
and digits.v
are now deprecated and will be
removed from the sfsmisc package.
Author(s)
Martin Maechler, Dec 4, 1991 (for S-plus; then called digits.v
).
Examples
digitsBase(0:12, 8) #-- octal representation
## This may be handy for just one number (and default decimal):
digits <- function(n, base = 10) as.vector(digitsBase(n, base = base))
digits(128, base = 8) # 2 0 0
## one way of pretty printing (base <= 10!)
b2ch <- function(db)
noquote(gsub("^0+(.{1,})$"," \1", apply(db, 2, paste, collapse = "")))
b2ch(digitsBase(0:33, 2)) #-> 0 1 10 11 100 101 ... 100001
b2ch(digitsBase(0:33, 4)) #-> 0 1 2 3 10 11 12 13 20 ... 200 201
## Hexadecimal:
i <- c(1:20, 100:106)
M <- digitsBase(i, 16)
hexdig <- c(0:9, LETTERS[1:6])
cM <- hexdig[1 + M]; dim(cM) <- dim(M)
b2ch(cM) #-> 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 ... 6A
Fit FMKL generalised distribution to data using discretised approach with weights.
Description
This function fits FMKL generalised distribution to data using discretised approach with weights. It is designed to act as a smoother device rather than as a definitive fit.
Usage
fun.RMFMKL.hs(data, default = "Y", fmkl.init = c(-0.25, 1.5), no.c.fmkl = 50,
leap = 3,FUN="runif.sobol",no=10000)
Arguments
data |
Dataset to be fitted |
default |
If yes, this function uses the default method
|
fmkl.init |
Initial values for FMKL distribution optimization,
|
no.c.fmkl |
Number of classes or bins of histogram to be optimized over.
This argument is ineffective if |
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function optimises the deviations of frequency of the bins to that of the theoretical so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be an adequate fit from a formal statistical point of view such as satisfying the goodness of fit for example, but it can be useful to suggest the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data. The weighting is designed to accentuate the peak or the dense part of the distribution and suppress the tails.
Value
A vector representing four parametefmkl of the FMKL generalised lambda distribution.
Note
In some cases, the resulting fit may not converge, there are currently no checking mechanism in place to ensure global convergence.
Author(s)
Steve Su
References
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.RMFMKL.hs.nw
, fun.RPRS.hs.nw
,
fun.RPRS.hs
, fun.data.fit.hs
,
fun.data.fit.hs.nw
Examples
# Using the default number of classes
fun.RMFMKL.hs(data=rnorm(1000,3,2),default="Y",fmkl.init=c(-0.25,1.5),leap=3)
# Using 20 classes
fun.RMFMKL.hs(data=rnorm(1000,3,2),default="N",fmkl.init=c(-0.25,1.5),
no.c.fmkl=20,leap=3)
Fit FMKL generalised distribution to data using discretised approach without weights.
Description
This function fits FMKL generalised distribution to data using discretised approach without weights. It is designed to act as a smoother device rather than as a definitive fit.
Usage
fun.RMFMKL.hs.nw(data, default = "Y", fmkl.init = c(-0.25, 1.5),
no.c.fmkl = 50, leap = 3,FUN="runif.sobol",no=10000)
Arguments
data |
Dataset to be fitted |
default |
If yes, this function uses the default method
|
fmkl.init |
Initial values for FMKL distribution optimization,
|
no.c.fmkl |
Number of classes or bins of histogram to be optimized over.
This argument is ineffective if |
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function optimises the deviations of frequency of the bins to that of the theoretical so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be an adequate fit from a formal statistical point of view such as satisfying the goodness of fit for example, but it can be useful to suggest the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data.
Value
A vector representing four parameters of the FMKL generalised lambda distribution.
Note
In some cases, the resulting fit may not converge, there are currently no checking mechanism in place to ensure global convergence.
Author(s)
Steve Su
References
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.RPRS.hs.nw
, fun.RMFMKL.hs
,
fun.RPRS.hs
, fun.data.fit.hs
,
fun.data.fit.hs.nw
Examples
# Using the default number of classes
fun.RMFMKL.hs.nw(data=rnorm(1000,3,2),default="Y",
fmkl.init=c(-0.25,1.5),leap=3)
# Using 20 classes
fun.RMFMKL.hs.nw(data=rnorm(1000,6,5),default="N",fmkl.init=c(-0.25,1.5),
no.c.fmkl=20,leap=3)
Fit FMKL generalised lambda distribution to data set using L moment matching
Description
This function fits FMKL generalised lambda distribution to data set using L moment matching
Usage
fun.RMFMKL.lm(data, fmkl.init = c(-0.25, 1.5), leap = 3, FUN = "runif.sobol",
no = 10000)
Arguments
data |
Dataset to be fitted |
fmkl.init |
Initial values for FMKL distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function provides method of L moment fitting scheme for FMKL GLD. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid FMKL generalised lambda distribution.
This function is based on scheme detailed in the literature below but it has been modified by the author (Steve Su).
Value
A vector representing four parametefmkl of the FMKL generalised lambda distribution.
Author(s)
Steve Su
References
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
See Also
fun.RMFMKL.ml
, fun.RMFMKL.mm
,
fun.RMFMKL.qs
, fun.data.fit.ml
fun.data.fit.lm
, fun.data.fit.qs
,
fun.data.fit.mm
Examples
# Fitting the normal distribution
fun.RMFMKL.lm(data=rnorm(1000,2,3),fmkl.init=c(-0.25,1.5),leap=3)
Fit FMKL generalised lambda distribution to data set using maximum likelihood estimation
Description
This function fits FMKL generalised lambda distribution to data set using maximum likelihood estimation.
Usage
fun.RMFMKL.ml(data, fmkl.init = c(-0.25, 1.5), leap = 3,FUN="runif.sobol",
no=10000)
Arguments
data |
Dataset to be fitted |
fmkl.init |
Initial values for FMKL distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function provides one of the definitive fit to data set using generalised lambda distributions.
Value
A vector representing four parameters of the FMKL generalised lambda distribution.
Author(s)
Steve Su
References
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Journal of Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.RPRS.ml
, fun.data.fit.ml
,
fun.data.fit.qs
Examples
# Fitting the normal distribution
fun.RMFMKL.ml(data=rnorm(1000,2,3),fmkl.init=c(-0.25,1.5),leap=3)
Fit RS generalised lambda distribution to data set using maximum likelihood estimation
Description
This function fits FMKL generalised lambda distribution to data set using maximum likelihood estimation using faster implementation through C programming
Usage
fun.RMFMKL.ml.m(data, fmkl.init = c(-0.25, 1.5), leap = 3, FUN = "runif.sobol",
no = 10000)
Arguments
data |
Dataset to be fitted |
fmkl.init |
Initial values for FMKL distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function provides one of the definitive fit to data set using generalised lambda distributions.
Value
A vector representing four parameters of the FMKL generalised lambda distribution.
Author(s)
Steve Su
References
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Journal of Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
Examples
# Fitting the normal distribution
fun.RMFMKL.ml.m(data=rnorm(1000,2,3),fmkl.init=c(-0.25,1.5),leap=3)
Fit FMKL generalised lambda distribution to data set using moment matching
Description
This function fits FMKL generalised lambda distribution to data set using moment matching
Usage
fun.RMFMKL.mm(data, fmkl.init = c(-0.25, 1.5), leap = 3, FUN = "runif.sobol",
no = 10000)
Arguments
data |
Dataset to be fitted |
fmkl.init |
Initial values for FMKL distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function provides method of moment fitting scheme for FMKL GLD. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid FMKL generalised lambda distribution.
This function is based on scheme detailed in the literature below but it has been modified by the author (Steve Su).
Value
A vector representing four parametefmkl of the FMKL generalised lambda distribution.
Author(s)
Steve Su
References
Karian, Z. and E. Dudewicz (2000). Fitting Statistical Distributions: The Generalized Lambda Distribution and Generalised Bootstrap Methods. New York, Chapman and Hall.
See Also
fun.RMFMKL.ml
, fun.RMFMKL.lm
,
fun.RMFMKL.qs
, fun.data.fit.ml
fun.data.fit.lm
, fun.data.fit.qs
,
fun.data.fit.mm
Examples
# Fitting the normal distribution
fun.RMFMKL.mm(data=rnorm(1000,2,3),fmkl.init=c(-0.25,1.5),leap=3)
Fit FMKL generalised lambda distribution to data set using quantile matching
Description
This function fits FMKL generalised lambda distribution to data set using quantile matching
Usage
fun.RMFMKL.qs(data, fmkl.init = c(-0.25, 1.5), leap = 3, FUN = "runif.sobol",
trial.n = 100, len = 1000, type = 7, no = 10000)
Arguments
data |
Dataset to be fitted |
fmkl.init |
Initial values for FMKL distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
trial.n |
Number of evenly spaced quantile ranging from 0 to 1 to be
used in the checking phase, to find the best set of initial values for
optimisation, this is intended to be lower than |
len |
Number of evenly spaced quantile ranging from 0 to 1 to be used, default is 1000 |
type |
Type of quantile to be used, default is 7, see |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function provides quantile matching fitting scheme for FMKL GLD. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid FMKL generalised lambda distribution.
Value
A vector representing four parametefmkl of the FMKL generalised lambda distribution.
Author(s)
Steve Su
References
Su (2008). Fitting GLD to data via quantile matching method. (Book chapter to appear)
See Also
fun.RMFMKL.ml
, fun.RMFMKL.lm
,
fun.RMFMKL.mm
, fun.data.fit.ml
fun.data.fit.lm
, fun.data.fit.qs
,
fun.data.fit.mm
Examples
# Fitting the normal distribution
fun.RMFMKL.qs(data=rnorm(1000,2,3),fmkl.init=c(-0.25,1.5),leap=3)
Fit RS generalised distribution to data using discretised approach with weights.
Description
This function fits RS generalised distribution to data using discretised approach with weights. It is designed to act as a smoother device rather than as a definitive fit.
Usage
fun.RPRS.hs(data, default = "Y", rs.init = c(-1.5, 1.5), no.c.rs = 50,
leap = 3,FUN="runif.sobol",no=10000)
Arguments
data |
Dataset to be fitted |
default |
If yes, this function uses the default method
|
rs.init |
Initial values for RS distribution optimization,
|
no.c.rs |
Number of classes or bins of histogram to be optimized over.
This argument is ineffective if |
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function optimises the deviations of frequency of the bins to that of the theoretical so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be an adequate fit from a formal statistical point of view such as satisfying the goodness of fit for example, but it can be useful to suggest the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data. The weighting is designed to accentuate the peak or the dense part of the distribution and suppress the tails.
Value
A vector representing four parameters of the RS generalised lambda distribution.
Note
In some cases, the resulting fit may not converge, there are currently no checking mechanism in place to ensure global convergence. The RPRS method can sometimes fail if there are no valid percentiles in the data set or if initial values do not give a valid distribution.
Author(s)
Steve Su
References
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.RPRS.hs.nw
, fun.RMFMKL.hs.nw
,
fun.RMFMKL.hs
, fun.data.fit.hs
,
fun.data.fit.hs.nw
Examples
# Using the default number of classes
fun.RPRS.hs(data=rnorm(1000,2,3),default="Y",rs.init=c(-1.5,1.5),leap=3)
# Using 20 classes
fun.RPRS.hs(data=rnorm(1000,2,3),default="N",rs.init=c(-1.5,1.5),
no.c.rs=20,leap=3)
Fit RS generalised distribution to data using discretised approach without weights.
Description
This function fits RS generalised distribution to data using discretised approach without weights. It is designed to act as a smoother device rather than as a definitive fit.
Usage
fun.RPRS.hs.nw(data, default = "Y", rs.init = c(-1.5, 1.5), no.c.rs = 50,
leap = 3,FUN="runif.sobol",no=10000)
Arguments
data |
Dataset to be fitted |
default |
If yes, this function uses the default method
|
rs.init |
Initial values for RS distribution optimization,
|
no.c.rs |
Number of classes or bins of histogram to be optimized over.
This argument is ineffective if |
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function optimises the deviations of frequency of the bins to that of the theoretical so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be an adequate fit from a formal statistical point of view such as satisfying the goodness of fit for example, but it can be useful to suggest the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data.
Value
A vector representing four parameters of the RS generalised lambda distribution.
Note
In some cases, the resulting fit may not converge, there are currently no checking mechanism in place to ensure global convergence. The RPRS method can sometimes fail if there are no valid percentiles in the data set or if initial values do not give a valid distribution.
Author(s)
Steve Su
References
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.RPRS.hs.nw
, fun.RPRS.hs
,
fun.RMFMKL.hs
, fun.data.fit.hs
,
fun.data.fit.hs.nw
Examples
# Using the default number of classes
fun.RPRS.hs.nw(data=rnorm(1000,3,2),default="Y",rs.init=c(-1.5,1.5),leap=3)
# Using 20 classes
fun.RPRS.hs.nw(data=rnorm(1000,3,2),default="N",rs.init=c(-1.5,1.5),
no.c.rs=20,leap=3)
Fit RS generalised lambda distribution to data set using L moment matching
Description
This function fits RS generalised lambda distribution to data set using L moment matching
Usage
fun.RPRS.lm(data, rs.init = c(-1.5, 1.5), leap = 3, FUN = "runif.sobol",
no = 10000)
Arguments
data |
Dataset to be fitted |
rs.init |
Initial values for RS distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function provides method of L moment fitting scheme for RS GLD. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid RS generalised lambda distribution.
This function is based on scheme detailed in the literature below but it has been modified by the author (Steve Su).
Value
A vector representing four parameters of the RS generalised lambda distribution.
Author(s)
Steve Su
References
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
See Also
fun.RPRS.ml
, fun.RPRS.mm
,
fun.RPRS.qs
, fun.data.fit.ml
fun.data.fit.lm
, fun.data.fit.qs
,
fun.data.fit.mm
Examples
# Fitting the normal distribution
fun.RPRS.lm(data=rnorm(1000,2,3),rs.init=c(-1.5,1.5),leap=3)
Fit RS generalised lambda distribution to data set using maximum likelihood estimation
Description
This function fits RS generalised lambda distribution to data set using maximum likelihood estimation.
Usage
fun.RPRS.ml(data, rs.init = c(-1.5, 1.5), leap = 3,FUN="runif.sobol",no=10000)
Arguments
data |
Dataset to be fitted |
rs.init |
Initial values for RS distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function provides one of the definitive fit to data set using generalised lambda distributions. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid RS generalised lambda distribution.
Value
A vector representing four parameters of the RS generalised lambda distribution.
Author(s)
Steve Su
References
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.RMFMKL.ml
, fun.data.fit.ml
,
fun.data.fit.qs
Examples
# Fitting the normal distribution
fun.RPRS.ml(data=rnorm(1000,2,3),rs.init=c(-1.5,1.5),leap=3)
Fit RS generalised lambda distribution to data set using maximum likelihood estimation
Description
This function fits RS generalised lambda distribution to data set using maximum likelihood estimation using faster implementation through C programming
Usage
fun.RPRS.ml.m(data, rs.init = c(-1.5, 1.5), leap = 3, FUN = "runif.sobol",
no = 10000)
Arguments
data |
Dataset to be fitted |
rs.init |
Initial values for RS distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function provides one of the definitive fit to data set using generalised lambda distributions. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid RS generalised lambda distribution.
Value
A vector representing four parameters of the RS generalised lambda distribution.
Author(s)
Steve Su
References
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
Examples
# Fitting the normal distribution
fun.RPRS.ml.m(data=rnorm(1000,2,3),rs.init=c(-1.5,1.5),leap=3)
Fit RS generalised lambda distribution to data set using moment matching
Description
This function fits RS generalised lambda distribution to data set using moment matching
Usage
fun.RPRS.mm(data, rs.init = c(-1.5, 1.5), leap = 3, FUN = "runif.sobol",
no = 10000)
Arguments
data |
Dataset to be fitted |
rs.init |
Initial values for RS distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function provides method of moment fitting scheme for RS GLD. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid RS generalised lambda distribution.
This function is based on scheme detailed in the literature below but it has been modified by the author (Steve Su).
Value
A vector representing four parameters of the RS generalised lambda distribution.
Author(s)
Steve Su
References
Karian, Z. and E. Dudewicz (2000). Fitting Statistical Distributions: The Generalized Lambda Distribution and Generalised Bootstrap Methods. New York, Chapman and Hall.
See Also
fun.RPRS.ml
, fun.RPRS.lm
,
fun.RPRS.qs
, fun.data.fit.ml
fun.data.fit.lm
, fun.data.fit.qs
,
fun.data.fit.mm
Examples
# Fitting the normal distribution
fun.RPRS.mm(data=rnorm(1000,2,3),rs.init=c(-1.5,1.5),leap=3)
Fit RS generalised lambda distribution to data set using quantile matching
Description
This function fits RS generalised lambda distribution to data set using quantile matching
Usage
fun.RPRS.qs(data, rs.init = c(-1.5, 1.5), leap = 3, FUN = "runif.sobol",
trial.n = 100, len = 1000, type = 7, no = 10000)
Arguments
data |
Dataset to be fitted |
rs.init |
Initial values for RS distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
trial.n |
Number of evenly spaced quantile ranging from 0 to 1 to be
used in the checking phase, to find the best set of initial values for
optimisation, this is intended to be lower than |
len |
Number of evenly spaced quantile ranging from 0 to 1 to be used, default is 1000 |
type |
Type of quantile to be used, default is 7, see |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function provides quantile matching fitting scheme for RS GLD. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid RS generalised lambda distribution.
Value
A vector representing four parameters of the RS generalised lambda distribution.
Author(s)
Steve Su
References
Su (2008). Fitting GLD to data via quantile matching method. (Book chapter to appear)
See Also
fun.RPRS.ml
, fun.RPRS.lm
,
fun.RPRS.mm
, fun.data.fit.ml
fun.data.fit.lm
, fun.data.fit.qs
,
fun.data.fit.mm
Examples
# Fitting the normal distribution
fun.RPRS.qs(data=rnorm(1000,2,3),rs.init=c(-1.5,1.5),leap=3)
Fitting mixture of generalied lambda distribtions to data using maximum likelihood estimation via the EM algorithm
Description
This function will fit mixture of generalised lambda distributions to dataset. It is restricted to two generalised lambda distributions. The method of fitting is maximum likelihood via EM algorithm. It is a two step optimization procedure, each unimodal part of the bimodal distribution is modelled using the maximum likelihood method or the starship method (FMKL GLD only), these initial values are the used to maximise the likelihood for the entire bimodal distribution using the EM algorithm. It fits mixture of the form p*(f1)+(1-p)*(f2) where f1 and f2 are pdfs of the generalised lambda distributions.
Usage
fun.auto.bimodal.ml(data, per.of.mix = 0.01, clustering.m = clara,
init1.sel = "rprs", init2.sel = "rprs", init1=c(-1.5, 1.5), init2=c(-1.5, 1.5),
leap1=3, leap2=3,fun1="runif.sobol",fun2="runif.sobol",no=10000, max.it=5000,
optim.further="Y")
Arguments
data |
A numerical vector representing the dataset. |
per.of.mix |
Level of mix between two parts of the distribution, usually 1-2% of cross mix is sufficient. |
clustering.m |
Clustering method used in classifying the dataset into two parts. Valid arguments include clara, fanny and pam from the cluster library. Default is clara. Or a logical vector specifying how data should be split. |
init1.sel |
This can be |
init2.sel |
This can be |
init1 |
Inititial values lambda3 and lambda4 for the first generalised lambda distribution. |
init2 |
Inititial values lambda3 and lambda4 for the second generalised lambda distribution. |
leap1 |
See scrambling argument in |
fun1 |
A character string of either |
leap2 |
See scrambling argument in |
fun2 |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
max.it |
Maximum number of iterations for numerical optimisation. |
optim.further |
Whether to optimise the function further using full maximum likelihood method, recommended setting is "Y" |
Details
The initial values that work well for RPRS are c(-1.5,1.5)
and for RMFMKL
are c(-0.25,1.5)
. For scrambling, if 1
, 2
or 3
the
sequence is scrambled otherwise not. If 1
, Owen type type of scrambling
is applied, if 2
, Faure-Tezuka type of scrambling, is applied, and if
3
, both Owen+Faure-Tezuka type of scrambling is applied. The star
method uses the same initial values as rmfmkl
since it uses the FMKL
generalised lambda distribution. Nelder-Simplex algorithm is used in the
numerical optimization. rprs
stands for revised percentile method for
RS generalised lambda distribution and "rmfmkl" stands for revised method of
moment for FMKL generalised lambda distribution. These acronyms represents the
initial optimization algorithm used to get a reasonable set of initial values
for the subsequent optimization procedues.
This function is an improvement from Su (2007) in Journal of Statistical
Software.
Value
par |
The best set of parameters found, the first four corresponds to the first distribution fit, the second four corresponds to the second distribution fit, the last value correspond to p for the first distribution fit. |
value |
The value of -ML for the paramters obtained. |
counts |
A two-element integer vector giving the number of calls to
|
convergence |
|
message |
A character string giving any additional information returned by
the optimizer, or |
Note
If the number of observations is small,
rprs
can sometimes fail as the percentiles may not exist for this data.
Also, if the initial values do not span a valid generalised lambda distribution,
try another set of initial values.
Author(s)
Steve Su
References
Bratley P. and Fox B.L. (1988) Algorithm 659: Implementing Sobol's quasi random sequence generator, ACM Transactions on Mathematical Software 14, 88-100.
Joe S. and Kuo F.Y. (1998) Remark on Algorithm 659: Implementing Sobol's quasi random Sequence Generator.
Nelder, J. A. and Mead, R. (1965) A simplex algorithm for function minimization. Computer Journal *7*, 308-313.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.auto.bimodal.pml
, fun.plot.fit.bm
,
fun.diag.ks.g.bimodal
Examples
# Fitting faithful data from the dataset library, with the clara clustering
# regime. The first distribution is RS and the second distribution is fmkl.
# The percentage of data mix is 1%.
fun.auto.bimodal.ml(faithful[,1],per.of.mix=0.01,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1,5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
Fitting mixture of generalied lambda distribtions to data using parition maximum likelihood estimation
Description
This function will fit mixture of generalised lambda distributions to dataset. It is restricted to two generalised lambda distributions. The method of fitting is parition maximum likelihood. It is a two step optimization procedure, each unimodal part of the bimodal distribution is modelled using the maximum likelihood method or the starship method (FMKL GLD only). These initial values the used to "maximise" the complete log likelihood for the entire bimodal distribution. It fits mixture of the form p*(f1)+(1-p)*(f2) where f1 and f2 are pdfs of the generalised lambda distributions.
Usage
fun.auto.bimodal.pml(data, clustering.m = clara, init1.sel = "rprs",
init2.sel = "rprs", init1=c(-1.5, 1.5), init2=c(-1.5, 1.5), leap1=3, leap2=3,
fun1="runif.sobol", fun2="runif.sobol",no=10000,max.it=5000, optim.further="Y")
Arguments
data |
A numerical vector representing the dataset. |
clustering.m |
Clustering method used in classifying the dataset into two parts. Valid arguments include clara, fanny and pam from the cluster library. Default is clara. Or a logical vector specifying how data should be split. |
init1.sel |
This can be |
init2.sel |
This can be |
init1 |
Inititial values lambda3 and lambda4 for the first generalised lambda distribution. |
init2 |
Inititial values lambda3 and lambda4 for the second generalised lambda distribution. |
leap1 |
See scrambling argument in |
fun1 |
A character string of either |
leap2 |
See scrambling argument in |
fun2 |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
max.it |
Maximum number of iterations for numerical optimisation. |
optim.further |
Whether to optimise the function further using full maximum likelihood method, recommended setting is "Y" |
Details
The initial values that work well for RPRS are c(-1.5,1.5)
and for RMFMKL
are c(-0.25,1.5)
. For scrambling, if 1
, 2
or 3
the
sequence is scrambled otherwise not. If 1
, Owen type type of scrambling
is applied, if 2
, Faure-Tezuka type of scrambling, is applied, and if
3
, both Owen+Faure-Tezuka type of scrambling is applied. The star
method uses the same initial values as rmfmkl
since it uses the FMKL
generalised lambda distribution. Nelder-Simplex algorithm is used in the
numerical optimization. rprs
stands for revised percentile method for RS
generalised lambda distribution and "rmfmkl" stands for revised method of moment
for FMKL generalised lambda distribution. These acronyms represents the initial
optimization algorithm used to get a reasonable set of initial values for the
subsequent optimization procedues. This function is an improvement from Su
(2007) in Journal of Statistical Software.
Value
par |
The best set of parameters found, the first four corresponds to the first distribution fit, the second four corresponds to the second distribution fit, the last value correspond to p for the first distribution fit. |
value |
The value of -PML for the paramters obtained. |
counts |
A two-element integer vector giving the number of calls to "fn" and "gr" respectively. This excludes those calls needed to compute the Hessian, if requested, and any calls to 'fn' to compute a finite-difference approximation to the gradient. |
convergence |
|
message |
A character string giving any additional information returned by
the optimizer, or |
Note
If the number of observations is small, rprs
can sometimes fail as
the percentiles may not exist for this data. Also, if the initial values do not
result in a valid generalised lambda distribution, try another set of initial
values.
References
Bratley P. and Fox B.L. (1988) Algorithm 659: Implementing Sobol's quasi random sequence generator, ACM Transactions on Mathematical Software 14, 88-100.
Joe S. and Kuo F.Y. (1998) Remark on Algorithm 659: Implementing Sobol's quasi random Sequence Generator.
Nelder, J. A. and Mead, R. (1965) A simplex algorithm for function minimization. Computer Journal *7*, 308-313.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.auto.bimodal.ml
,fun.plot.fit.bm
,
fun.diag.ks.g.bimodal
Examples
# Fitting faithful data from the dataset library, with the clara clustering
# regime. The first distribution is RS and the second distribution is fmkl.
fun.auto.bimodal.pml(faithful[,1],clustering.m=clara,init1.sel="rprs",
init2.sel="rmfmkl",init1=c(-1.5,1,5),init2=c(-0.25,1.5),leap1=3,leap2=3)
Fitting mixtures of generalied lambda distribtions to data using quantile matching method
Description
This function will fit mixture of generalised lambda distributions to dataset. It is restricted to two generalised lambda distributions. The method of fitting is quantile matching method. It is a two step optimization procedure, each unimodal part of the bimodal distribution is modelled using quantile matching method. The initial values obtained are then used to maximise the theoretical and empirical quantile match for the entire bimodal distribution. It fits mixture of the form p*(f1)+(1-p)*(f2) where f1 and f2 are pdfs of the generalised lambda distributions.
Usage
fun.auto.bimodal.qs(data, per.of.mix = 0.01, clustering.m = clara,
init1.sel = "rprs", init2.sel = "rprs", init1=c(-1.5, 1.5), init2=c(-1.5, 1.5),
leap1=3, leap2=3, fun1 = "runif.sobol", fun2 = "runif.sobol", trial.n = 100,
len = 1000, type = 7, no = 10000, maxit = 5000)
Arguments
data |
A numerical vector representing the dataset. |
per.of.mix |
Level of mix between two parts of the distribution, usually 1-2% of cross mix is sufficient. |
clustering.m |
Clustering method used in classifying the dataset into two parts. Valid arguments include clara, fanny and pam from the cluster library. Default is clara. Or a logical vector specifying how data should be split. |
init1.sel |
This can be |
init2.sel |
This can be |
init1 |
Inititial values lambda3 and lambda4 for the first generalised lambda distribution. |
init2 |
Inititial values lambda3 and lambda4 for the second generalised lambda distribution. |
leap1 |
See scrambling argument in |
fun1 |
A character string of either |
leap2 |
See scrambling argument in |
fun2 |
A character string of either |
trial.n |
Number of evenly spaced quantile ranging from 0 to 1 to be
used in the checking phase, to find the best set of initial values for
optimisation, this is intended to be lower than |
len |
Number of evenly spaced quantile ranging from 0 to 1 to be used, default is 1000 |
type |
Type of quantile to be used, default is 7, see |
no |
Number of initial random values to find the best initial values for optimisation. |
maxit |
Maximum number of iterations for numerical optimisation. Default is 5000. |
Details
The initial values that work well for RPRS are c(-1.5,1.5)
and for RMFMKL
are c(-0.25,1.5)
. For scrambling, if 1
, 2
or 3
the
sequence is scrambled otherwise not. If 1
, Owen type type of scrambling
is applied, if 2
, Faure-Tezuka type of scrambling, is applied, and if
3
, both Owen+Faure-Tezuka type of scrambling is applied. The star
method uses the same initial values as rmfmkl
since it uses the FMKL
generalised lambda distribution. Nelder-Simplex algorithm is used in the
numerical optimization. rprs
stands for revised percentile method for
RS generalised lambda distribution and "rmfmkl" stands for revised method of
moment for FMKL generalised lambda distribution. These acronyms represents the
initial optimization algorithm used to get a reasonable set of initial values
for the subsequent optimization procedues.
Value
par |
The best set of parameters found, the first four corresponds to the first distribution fit, the second four corresponds to the second distribution fit, the last value correspond to p for the first distribution fit. |
value |
The value of -ML for the paramters obtained. |
counts |
A two-element integer vector giving the number of calls to
|
convergence |
|
message |
A character string giving any additional information returned by
the optimizer, or |
Note
If the number of observations is small,
rprs
can sometimes fail as the percentiles may not exist for this data.
Also, if the initial values do not span a valid generalised lambda distribution,
try another set of initial values.
Author(s)
Steve Su
References
Bratley P. and Fox B.L. (1988) Algorithm 659: Implementing Sobol's quasi random sequence generator, ACM Transactions on Mathematical Software 14, 88-100.
Joe S. and Kuo F.Y. (1998) Remark on Algorithm 659: Implementing Sobol's quasi random Sequence Generator.
Nelder, J. A. and Mead, R. (1965) A simplex algorithm for function minimization. Computer Journal *7*, 308-313.
Su (2008). Fitting GLD to data via quantile matching method. (Book chapter to appear)
See Also
fun.auto.bimodal.pml
, fun.auto.bimodal.ml
,
fun.plot.fit.bm
,
fun.diag.ks.g.bimodal
Examples
# Fitting faithful data from the dataset library, with the clara clustering
# regime. The first distribution is RS and the second distribution is fmkl.
# The percentage of data mix is 1%.
fun.auto.bimodal.qs(faithful[,1],per.of.mix=0.01,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1,5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
This is a collection of functions used in the calculation of the beta function.
Description
These functions are for maintainers only and it is not designed for the users of this package.
Value
Numerical result for beta function
Note
Please contact the author directly if you find a bug!
Author(s)
Steve Su
Finds the final fits using the maximum likelihood estimation for the bimodal dataset.
Description
This is the secondary optimization procedure to evaluate the final bimodal
distribution fits using the maximum likelihood. It usually relies on initial
values found by fun.bimodal.init
function.
Usage
fun.bimodal.fit.ml(data, first.fit, second.fit, prop, param1, param2, selc1,
selc2)
Arguments
data |
Dataset to be fitted. |
first.fit |
The distribution parameters or the initial values of the first distribution fit. |
second.fit |
The distribution parameters or the initial values of the second distribution fit. |
prop |
The proportion of the data set, usually obtained from
|
param1 |
Can be either |
param2 |
Can be either |
selc1 |
Selection of initial values for the first distribution, can be
either |
selc2 |
Selection of initial values for the second distribution, can be
either |
Details
This function should be used in tandem with fun.bimodal.init
.
Value
par |
The first four numbers are the parameters of the first generalised lambda distribution, the second four numbers are the parameters of the second generalised lambda distribution and the last value is the proportion of the first generalised lambda distribution. |
value |
The objective value of negative likelihood obtained using the par above. |
counts |
A two-element integer vector giving the number of calls to functions. Gradient is not used in this case. |
convergence |
An integer code.
|
message |
A character string giving any additional information returned
by the optimizer, or |
Note
There is currently no guarantee of a global convergence.
Author(s)
Steve Su
References
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
link{fun.bimodal.fit.pml}
, fun.bimodal.init
Examples
# Extract faithful[,2] into faithful2
faithful2<-faithful[,2]
# Uses clara clustering method
clara.faithful2<-fun.class.regime.bi(faithful2, 0.01, clara)
# Save into two different objects
qqqq1.faithful2.cc<-clara.faithful2$data.a
qqqq2.faithful2.cc<-clara.faithful2$data.b
# Find the initial values
result.faithful2.init<-fun.bimodal.init(data1=qqqq1.faithful2.cc,
data2=qqqq2.faithful2.cc, rs.leap1=3,fmkl.leap1=3,rs.init1 = c(-1.5, 1.5),
fmkl.init1 = c(-0.25, 1.5), rs.leap2=3,fmkl.leap2=3,rs.init2 = c(-1.5, 1.5),
fmkl.init2 = c(-0.25, 1.5))
# Find the final fits
result.faithful2.rsrs<-fun.bimodal.fit.ml(data=faithful2,
result.faithful2.init[[2]],result.faithful2.init[[3]],
result.faithful2.init[[1]], param1="rs",param2="rs",selc1="rs",selc2="rs")
# Output
result.faithful2.rsrs
Finds the final fits using partition maximum likelihood estimation for the bimodal dataset.
Description
This is the secondary optimization procedure to evaluate the final bimodal
distribution fits using the partition
maximum likelihood. It usually relies on initial values found by
fun.bimodal.init
function.
Usage
fun.bimodal.fit.pml(data1, data2, first.fit, second.fit, prop, param1, param2,
selc1, selc2)
Arguments
data1 |
First data set, usually obtained by
|
data2 |
Second data set, usually obtained by
|
first.fit |
The distribution parameters or the initial values of the first distribution fit. |
second.fit |
The distribution parameters or the initial values of the second distribution fit. |
prop |
The proportion of the data set, usually obtained from
|
param1 |
Can be either |
param2 |
Can be either |
selc1 |
Selection of initial values for the first distribution, can be
either |
selc2 |
Selection of initial values for the second distribution, can be
either |
Details
This function should be used in tandem with fun.bimodal.init
function.
Value
par |
The first four numbers are the parameters of the first generalised lambda distribution, the second four numbers are the parameters of the second generalised lambda distribution and the last value is the proportion of the first generalised lambda distribution. |
value |
The objective value of negative likelihood obtained. |
counts |
A two-element integer vector giving the number of calls to functions. Gradient is not used in this case. |
convergence |
An integer code.
|
message |
A character string giving any additional information returned
by the optimizer, or |
Note
There is currently no guarantee of a global convergence.
Author(s)
Steve Su
References
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.bimodal.fit.ml
, fun.bimodal.init
Examples
# Extract faithful[,2] into faithful2
faithful2<-faithful[,2]
# Uses clara clustering method
clara.faithful2<-clara(faithful2,2)$clustering
# Save into two different objects
qqqq1.faithful2.cc<-faithful2[clara.faithful2==1]
qqqq2.faithful2.cc<-faithful2[clara.faithful2==2]
# Find the initial values
result.faithful2.init<-fun.bimodal.init(data1=qqqq1.faithful2.cc,
data2=qqqq2.faithful2.cc, rs.leap1=3,fmkl.leap1=3,rs.init1 = c(-1.5, 1.5),
fmkl.init1 = c(-0.25, 1.5), rs.leap2=3,fmkl.leap2=3,rs.init2 = c(-1.5, 1.5),
fmkl.init2 = c(-0.25, 1.5))
# Find the final fits
result.faithful2.rsrs<-fun.bimodal.fit.pml(data1=qqqq1.faithful2.cc,
data2=qqqq2.faithful2.cc, result.faithful2.init[[2]],
result.faithful2.init[[3]], result.faithful2.init[[1]],param1="rs",
param2="rs",selc1="rs",selc2="rs")
# Output
result.faithful2.rsrs
Finds the initial values for optimisation in fitting the bimodal generalised lambda distribution.
Description
After classifying the data using fun.class.regime.bi
, this
function evaluates the temporary or initial solutions
by estimating each part of the bimodal distribution using the maximum likelihood
estimation and starship method.
These initial solutions are then passed onto fun.bimodal.fit.ml
or
fun.bimodal.fit.pml
to obtain the final fits.
Usage
fun.bimodal.init(data1, data2, rs.leap1, fmkl.leap1, rs.init1, fmkl.init1,
rs.leap2, fmkl.leap2, rs.init2, fmkl.init2,fun1="runif.sobol",
fun2="runif.sobol",no=10000)
Arguments
data1 |
The first data obtained by the clustering algorithm. |
data2 |
The second data obtained by the clustering algorithm. |
rs.leap1 |
See scrambling argument in |
fmkl.leap1 |
See scrambling argument in |
rs.init1 |
Inititial values (lambda3 and lambda4) for the first RS
generalised lambda distribution. |
fmkl.init1 |
Inititial values (lambda3 and lambda4) for the first FMKL
generalised lambda distribution. |
rs.leap2 |
See scrambling argument in |
fmkl.leap2 |
See scrambling argument in |
rs.init2 |
Inititial values (lambda3 and lambda4) for the second RS
generalised lambda distribution. |
fmkl.init2 |
Inititial values (lambda3 and lambda4) for the second FMKL
generalised lambda distribution. |
fun1 |
A character string of either |
fun2 |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
All three methods of fitting (RPRS, RMFMKL and STAR) will be given for each part of the bimodal distribution.
Value
prop |
Proportion of the number of observations in the first data in relation to the entire data. |
first.fit |
A matrix comprising the parameters of GLD obtained from RPRS, RMFMKL and STAR for the first dataset. |
second.fit |
A matrix comprising the parameters of GLD obtained from RPRS, RMFMKL and STAR for the second dataset. |
Note
This is not designed to be called by the end user explicitly, the difficulties with RPRS parameterisation should be noted by the users.
Author(s)
Steve Su
References
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.class.regime.bi
,fun.bimodal.fit.pml
,
fun.bimodal.fit.ml
Examples
# Split the first column of the faithful data into two using
fun.class.regime.bi
faithful1.mod<-fun.class.regime.bi(faithful[,1], 0.1, clara)
# Save the datasets
qqqq1.faithful1.cc1<-faithful1.mod$data.a
qqqq2.faithful1.cc1<-faithful1.mod$data.b
# Find the initial values for secondary optimisation.
result.faithful1.init1<-fun.bimodal.init(data1=qqqq1.faithful1.cc1,
data2=qqqq2.faithful1.cc1, rs.leap1=3,fmkl.leap1=3,rs.init1 = c(-1.5, 1.5),
fmkl.init1 = c(-0.25, 1.5), rs.leap2=3,fmkl.leap2=3,rs.init2 = c(-1.5, 1.5),
fmkl.init2 = c(-0.25, 1.5))
# These initial values are then passed onto fun,bimodal.fit.ml to obtain the
# final fits.
Check whether the RS or FMKL/FKML GLD is a valid GLD for single values of L1, L2, L3 and L4
Description
This function will return a single logical value showing whether a combination of L1, L2, L3 and L4 is a valid GLD.
Usage
fun.check.gld(lambda1, lambda2, lambda3, lambda4, param)
Arguments
lambda1 |
A numerical vector for L1 of GLD |
lambda2 |
A numerical vector for L2 of GLD |
lambda3 |
A numerical vector for L3 of GLD |
lambda4 |
A numerical vector for L4 of GLD |
param |
Can be "rs", "fmkl", or "fkml" |
Value
A single logical value indicating whether the specified GLD is a valid probability density function
Author(s)
Steve Su
See Also
Examples
fun.check.gld(1,4,3,2,"rs")
fun.check.gld(1,4,3,2,"fkml")
fun.check.gld(1,4,3,-2,"rs")
Check whether the RS or FMKL/FKML GLD is a valid GLD for vectors of L1, L2, L3 and L4
Description
This function will return a logical vector showing whether vector combinations of L1, L2, L3 and L4 are valid GLDs.
Usage
fun.check.gld.multi(l1, l2, l3, l4, param)
Arguments
l1 |
A numerical vector for L1 of GLD |
l2 |
A numerical vector for L2 of GLD |
l3 |
A numerical vector for L3 of GLD |
l4 |
A numerical vector for L4 of GLD |
param |
Can be "rs", "fmkl", or "fkml" |
Details
This is an extension to fun.check.gld
Value
A logical vector indicating whether the specified parameters are valid GLDs
Author(s)
Steve Su
See Also
Examples
fun.check.gld.multi(c(1,2,3),c(4,5,6),c(7,8,9),c(10,11,12),param="rs")
fun.check.gld.multi(c(1,2,3),c(4,5,6),c(7,8,9),c(10,11,-12),param="rs")
Classifies data into two groups using a clustering regime.
Description
This function is primarily designed to split a bimodal data vector into two groups to allow the fitting of mixture generalised lambda distributions.
Usage
fun.class.regime.bi(data, perc.cross, fun.cross)
Arguments
data |
Data to be classified into two groups. |
perc.cross |
Percentage of cross over from one data to the other, usually set at 1% |
fun.cross |
Any clustering function such as |
Details
This function is part of the routine mixture fitting procedure provided in this
package. The perc.cross
argument or percentage of cross over is designed
to allow the use of maximum likelihood estimation via EM algorithm for fitting
bimodal data. When this is invoked, it will ensure both part of the data will
contain both the minmum and maximum of the data set as well as a proportion (
specified in perc.cross argument) of observations from each other. If 1% is
required, then data.a will contains 1% of the data.b and vice versa after the
full data set has been classified into data.a
and data.b
by the
fun.cross
classification regime.
Value
data.a |
First group of data obtained by the classification algorithm. |
data.b |
Second group of data obtained by the classification algorithm. |
Author(s)
Steve Su
References
Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.
Su (2006) Maximum Log Likelihood Estimation using EM Algorithm and Partition Maximum Log Likelihood Estimation for Mixtures of Generalized Lambda Distributions. Working Paper.
See Also
link{clara}
, pam
, fanny
Examples
# Classify the faithful[,1] data into two categories with 10% cross over mix.
fun.class.regime.bi(faithful[,1],0.1,clara)
# Classify the faithful[,1] data into two categories with no mixing:
fun.class.regime.bi(faithful[,1],0,clara)
Compare the moments of the data and the fitted univariate generalised lambda distribution.
Description
After fitting the distribution, it is often desirable to see whether the
moments of the data matches with the fitted distribution. This function computes
the theoretical and actual moments especially for fun.data.fit.ml
function output.
Usage
fun.comp.moments.ml(theo.obj, data, name = "ML")
Arguments
theo.obj |
Fitted distribution parameters, usually output from
|
data |
Data set used |
name |
Naming the method used in fitting the distribution, by default this is "ML". |
Value
r.mat |
A matrix showing the mean, variance, skewness and kurtosis of the fitted distribution in comparison to the data set. |
eval.mat |
Absolute difference in each of the four moments from the data under each of the distibutional fits. |
Note
Sometimes it is difficult to find RPRS type of fits to data set, so
instead fun.comp.moments.ml.2
is used to compare the theoretical
moments of RMFMKL.ML and STAR methods
with respect to the dataset fitted.
Author(s)
Steve Su
See Also
Examples
# Generate random normally distributed observations.
junk<-rnorm(1000,3,2)
# Fit the dataset using fun.data.ml
fit<-fun.data.fit.ml(junk)
# Compare the resulting fits. It is usually the case the maximum likelihood
# provides better estimation of the moments than the starship method.
fun.comp.moments.ml(fit,junk)
Compare the moments of the data and the fitted univariate generalised lambda distribution. Specialised funtion designed for RMFMKL.ML and STAR methods.
Description
After fitting the distribution, it is often desirable to see whether the moments of the data matches with the fitted distribution. This function computes the theoretical and actual moments for the FMKL GLD maximum likelihood estimation and starship method.
Usage
fun.comp.moments.ml.2(theo.obj, data, name = "ML")
Arguments
theo.obj |
Fitted distribution parameters, there should be two sets, both FMKL GLD. |
data |
Data set used |
name |
Naming the method used in fitting the distribution, by default this is "ML". |
Value
r.mat |
A matrix showing the mean, variance, skewness and kurtosis of the fitted distribution in comparison to the data set. |
eval.mat |
Absolute difference in each of the four moments from the data under each of the distibutional fits. |
Note
To compare all three fits under fun.data.fit.ml
see
fun.comp.moments.ml
function.
Author(s)
Steve Su
See Also
Examples
## Generate random normally distributed observations.
junk<-rnorm(1000,3,2)
## Fit the dataset using fun.data.ml
fit<-cbind(fun.RMFMKL.ml(junk),starship(junk)$lambda)
## Compare the resulting fits. It is usually the case the maximum likelihood
## provides better estimation of the moments than the starship method.
fun.comp.moments.ml.2(fit,junk)
Fit RS and FMKL generalised distributions to data using discretised approach with weights.
Description
This function fits RS and FMKL generalised distribution to data using discretised approach with weights. It is designed to act as a smoother device rather than a definitive fit.
Usage
fun.data.fit.hs(data, rs.default = "Y", fmkl.default = "Y", rs.leap = 3,
fmkl.leap = 3, rs.init = c(-1.5, 1.5), fmkl.init = c(-0.25, 1.5), no.c.rs = 50,
no.c.fmkl = 50,FUN="runif.sobol", no=10000)
Arguments
data |
Dataset to be fitted |
rs.default |
If yes, this function uses the default method
|
fmkl.default |
If yes, this function uses the default method
|
rs.leap |
See scrambling argument in |
fmkl.leap |
See scrambling argument in |
rs.init |
Initial values for RS distribution optimization,
|
fmkl.init |
Initial values for FMKL distribution optimization,
|
no.c.rs |
Number of classes or bins of histogram to be optimized over
for the RS GLD. This argument is ineffective if |
no.c.fmkl |
Number of classes or bins of histogram to be optimized over
for the FMKL GLD. This argument is ineffective if |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function optimises the deviations of frequency of the bins to that of the theoretical so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be an adequate fit from a formal statistical point of view such as satisfying the goodness of fit for example, but it can be useful to suggest the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data.The weighting is designed to accentuate the peak or the dense part of the distribution and suppress the tails.
Value
A matrix showing the four parameters of the RS and FMKL distribution fit.
Note
In some cases, the resulting fit may not converge, there are currently no checking mechanism in place to ensure global convergence. The RPRS method can sometimes fail if there are no valid percentiles in the data set or if initial values do not give a valid distribution.
Author(s)
Steve Su
References
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.RPRS.hs.nw
, fun.RMFMKL.hs.nw
,
fun.RMFMKL.hs
, fun.RPRS.hs
,
fun.data.fit.hs.nw
, fun.data.fit.ml
Examples
# Fitting normal(3,2) distriution using the default setting
junk<-rnorm(1000,3,2)
fun.data.fit.hs(junk)
Fit RS and FMKL generalised distributions to data using discretised approach without weights.
Description
This function fits RS and FMKL generalised distribution to data using discretised approach without weights. It is designed to act as a smoother device rather than a definitive fit.
Usage
fun.data.fit.hs.nw(data, rs.default = "Y", fmkl.default = "Y", rs.leap = 3,
fmkl.leap = 3, rs.init = c(-1.5, 1.5), fmkl.init = c(-0.25, 1.5), no.c.rs = 50,
no.c.fmkl = 50,FUN="runif.sobol",no=10000)
Arguments
data |
Dataset to be fitted |
rs.default |
If yes, this function uses the default method
|
fmkl.default |
If yes, this function uses the default method
|
rs.leap |
See scrambling argument in |
fmkl.leap |
See scrambling argument in |
rs.init |
Initial values for RS distribution optimization,
|
fmkl.init |
Initial values for FMKL distribution optimization,
|
no.c.rs |
Number of classes or bins of histogram to be optimized over
for the RS GLD. This argument is ineffective if |
no.c.fmkl |
Number of classes or bins of histogram to be optimized over
for the FMKL GLD. This argument is ineffective if |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function optimises the deviations of frequency of the bins to that of the theoretical so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be an adequate fit from a formal statistical point of view such as satisfying the goodness of fit for example, but it can be useful to suggest the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data.
Value
A matrix showing the four parameters of the RS and FMKL distribution fit.
Note
In some cases, the resulting fit may not converge, there are currently no checking mechanism in place to ensure global convergence. The RPRS method can sometimes fail if there are no valid percentiles in the data set or if initial values do not give a valid distribution.
Author(s)
Steve Su
References
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.RPRS.hs
, fun.RMFMKL.hs
,
fun.RMFMKL.hs.nw
, fun.RPRS.hs.nw
,
fun.data.fit.hs
, fun.data.fit.ml
Examples
# Fitting normal(3,2) distriution using the default setting
junk<-rnorm(1000,3,2)
fun.data.fit.hs.nw(junk)
Fit data using L moment matching estimation for RS and FMKL GLD
Description
This function fits generalised lambda distributions to data using L moment matching method
Usage
fun.data.fit.lm(data, rs.leap = 3, fmkl.leap = 3, rs.init = c(-1.5, 1.5),
fmkl.init = c(-0.25, 1.5), FUN = "runif.sobol", no = 10000)
Arguments
data |
Dataset to be fitted. |
rs.leap |
See scrambling argument in |
fmkl.leap |
See scrambling argument in |
rs.init |
Inititial values (lambda3 and lambda4) for the RS generalised lambda distribution. |
fmkl.init |
Inititial values (lambda3 and lambda4) for the FMKL generalised lambda distribution. |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function consolidates fun.RPRS.lm
and
fun.RMFMKL.lm
and gives all the fits in
one output.
Value
A matrix showing the parameters of RS and FMKL generalised lambda distributions.
Author(s)
Steve Su
References
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
See Also
fun.RPRS.qs
, fun.RMFMKL.qs
,
fun.data.fit.ml
, fun.data.fit.hs
,
fun.data.fit.hs.nw
, fun.data.fit.qs
,
fun.data.fit.mm
Examples
# Fitting normal(3,2) distriution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.lm(junk)
Fit data using RS, FMKL maximum likelihood estimation and the FMKL starship method.
Description
This function fits generalised lambda distributions to data using RPRS, RMFMKL and starship methods.
Usage
fun.data.fit.ml(data, rs.leap = 3, fmkl.leap = 3, rs.init = c(-1.5, 1.5),
fmkl.init = c(-0.25, 1.5),FUN="runif.sobol",no=10000)
Arguments
data |
Dataset to be fitted. |
rs.leap |
See scrambling argument in |
fmkl.leap |
See scrambling argument in |
rs.init |
Inititial values (lambda3 and lambda4) for the RS generalised lambda distribution. |
fmkl.init |
Inititial values (lambda3 and lambda4) for the FMKL generalised lambda distribution. |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function consolidates fun.RPRS.ml
,
fun.RMFMKL.ml
and starship
and gives all the fits in
one output.
Value
A matrix showing the parameters of generalised lambda distribution for RPRS, FMFKL and STAR methods.
Note
RPRS can sometimes fail if it is not possible to calculate the percentiles of the data set. This usually happens when the number of data point is small.
Author(s)
Steve Su
References
King, R.A.R. & MacGillivray, H. L. (1999), A starship method for fitting the generalised lambda distributions, Australian and New Zealand Journal of Statistics, 41, 353-374
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.RPRS.ml
, fun.RMFMKL.ml
,
starship
, fun.data.fit.hs
,
fun.data.fit.hs.nw
,
fun.data.fit.qs
,
fun.data.fit.mm
,
fun.data.fit.lm
Examples
# Fitting normal(3,2) distriution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.ml(junk)
Fit data using moment matching estimation for RS and FMKL GLD
Description
This function fits generalised lambda distributions to data using moment matching method
Usage
fun.data.fit.mm(data, rs.leap = 3, fmkl.leap = 3, rs.init = c(-1.5, 1.5),
fmkl.init = c(-0.25, 1.5), FUN = "runif.sobol", no = 10000)
Arguments
data |
Dataset to be fitted. |
rs.leap |
See scrambling argument in |
fmkl.leap |
See scrambling argument in |
rs.init |
Inititial values (lambda3 and lambda4) for the RS generalised lambda distribution. |
fmkl.init |
Inititial values (lambda3 and lambda4) for the FMKL generalised lambda distribution. |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function consolidates fun.RPRS.mm
and
fun.RMFMKL.mm
and gives all the fits in
one output.
Value
A matrix showing the parameters of RS and FMKL generalised lambda distributions.
Note
RPRS can sometimes fail if it is not possible to calculate the percentiles of the data set. This usually happens when the number of data point is small.
References
Karian, Z. and E. Dudewicz (2000). Fitting Statistical Distributions: The Generalized Lambda Distribution and Generalised Bootstrap Methods. New York, Chapman and Hall.
See Also
fun.RPRS.qs
, fun.RMFMKL.qs
,
fun.data.fit.ml
, fun.data.fit.hs
,
fun.data.fit.hs.nw
, fun.data.fit.qs
,
fun.data.fit.lm
Examples
# Fitting normal(3,2) distriution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.mm(junk)
Fit data using quantile matching estimation for RS and FMKL GLD
Description
This function fits generalised lambda distributions to data using quantile matching method
Usage
fun.data.fit.qs(data, rs.leap = 3, fmkl.leap = 3, rs.init = c(-1.5, 1.5),
fmkl.init = c(-0.25, 1.5), FUN = "runif.sobol", trial.n = 100, len = 1000,
type = 7, no = 10000)
Arguments
data |
Dataset to be fitted. |
rs.leap |
See scrambling argument in |
fmkl.leap |
See scrambling argument in |
rs.init |
Inititial values (lambda3 and lambda4) for the RS generalised lambda distribution. |
fmkl.init |
Inititial values (lambda3 and lambda4) for the FMKL generalised lambda distribution. |
FUN |
A character string of either |
trial.n |
Number of evenly spaced quantile ranging from 0 to 1 to be
used in the checking phase, to find the best set of initial values for
optimisation, this is intended to be lower than |
len |
Number of evenly spaced quantile ranging from 0 to 1 to be used, default is 1000 |
type |
Type of quantile to be used, default is 7, see |
no |
Number of initial random values to find the best initial values for optimisation. |
Details
This function consolidates fun.RPRS.qs
and
fun.RMFMKL.qs
and gives all the fits in
one output.
Value
A matrix showing the parameters of RS and FMKL generalised lambda distributions.
Note
RPRS can sometimes fail if it is not possible to calculate the percentiles of the data set. This usually happens when the number of data point is small.
Author(s)
Steve Su
References
Su (2008). Fitting GLD to data via quantile matching method. (Book chapter to appear)
See Also
fun.RPRS.qs
, fun.RMFMKL.qs
,
fun.data.fit.ml
, fun.data.fit.hs
,
fun.data.fit.hs.nw
, fun.data.fit.mm
,
fun.data.fit.lm
Examples
# Fitting normal(3,2) distriution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.qs(junk)
Compute the simulated Kolmogorov-Smirnov tests for the unimodal dataset
Description
This function counts the number of times the p-value exceed 0.05 for the null hypothesis that the observations simulated from the fitted distribution is the same as the observations simulated from the unimodal data set.
Usage
fun.diag.ks.g(result, data, no.test = 1000, len = floor(0.9 * length(data)),
param, alpha = 0.05)
Arguments
result |
A vector representing the four parameters of the generalised lambda distribution. |
data |
The unimodal dataset. |
no.test |
Total number of tests required. |
len |
Number of data to sample. |
param |
Type of the generalised lambda distribution, |
alpha |
Significance level of KS test. |
Value
A numerical value representing number of times the p-value exceeds alpha.
Note
If there are ties, jittering is used in ks.gof
.
Author(s)
Steve Su
References
Stephens, M. A. (1986). Tests based on EDF statistics. In Goodness-of-Fit Techniques. D'Agostino, R. B. and Stevens, M. A., eds. New York: Marcel Dekker.
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Nmerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational Statistics and Data Analysis: *51*, 8, 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
Examples
# Generate 1000 random observations from Normal distribution with mean=100,
# standard deviation=10. Save this as junk
junk<-rnorm(1000,100,10)
# Fit junk using RPRS method via the maxmum likelihood.
fit1<-fun.RPRS.ml(junk, c(-1.5, 1.5), leap = 3)
# Calculate the simulated KS test result:
fun.diag.ks.g(fit1,junk,param="rs")
Compute the simulated Kolmogorov-Smirnov tests for the bimodal dataset
Description
This function counts the number of times the p-value exceed 0.05 for the null hypothesis that the observations simulated from the fitted distribution is the same as the observations simulated from the bimodal data set.
Usage
fun.diag.ks.g.bimodal(result1, result2, prop1, prop2, data, no.test = 1000,
len = floor(0.9 * length(data)), param1, param2, alpha = 0.05)
Arguments
result1 |
A vector representing the four parameters of the first generalised lambda distribution. |
result2 |
A vector representing the four parameters of the second generalised lambda distribution. |
prop1 |
Proportion of the first distribution fitted to the bimodal dataset. |
prop2 |
Proportion of the second distribution fitted to the bimodal dataset. |
data |
The bimodal dataset. |
no.test |
Total number of tests required. |
len |
Number of data to sample. |
param1 |
Type of first generalised lambda distribution, can be
|
param2 |
Type of second generalised lambda distribution, can be
|
alpha |
Significance level of KS test. |
Value
A numerical value representing number of times the p-value exceeds alpha.
Note
If there are ties, jittering is used in ks.gof
.
Author(s)
Steve Su
References
Stephens, M. A. (1986). Tests based on EDF statistics. In Goodness- of-Fit Techniques. D'Agostino, R. B. and Stevens, M. A., eds. New York: Marcel Dekker.
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Nmerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational Statistics and Data Analysis: *51*, 8, 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
Examples
# Fit the faithful[,1] data from the MASS library
fit1<-fun.auto.bimodal.ml(faithful[,1],init1.sel="rprs",init2.sel="rmfmkl",
init1=c(-1.5,1,5),init2=c(-0.25,1.5),leap1=3,leap2=3)
# Run diagnostic KS tests
fun.diag.ks.g.bimodal(fit1$par[1:4],fit1$par[5:8],prop1=fit1$par[9],
data=faithful[,1],param1="rs",param2="fmkl")
Diagnostic function for theoretical distribution fits through the resample Kolmogorov-Smirnoff tests
Description
This function is primarily designed to be used for testing the fitted
distribution with reference to a theoretical distribution. It is also
tailored for output obtained from the fun.data.fit.ml
function.
Usage
fun.diag1(result, test, no.test = 1000, alpha = 0.05)
Arguments
result |
Output from |
test |
Simulated observations from theoretical distribution, the length
should be no.test |
no.test |
Number of times to do the KS tests. |
alpha |
Significance level of KS test. |
Value
A vector showing the number of times the KS p-value is greater than alpha for each of the distribution fit strategy.
Note
If there are ties, jittering is used in ks.gof
.
Author(s)
Steve Su
References
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Journal of Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.diag2
, fun.diag.ks.g
,
fun.diag.ks.g.bimodal
Examples
# Fits a Weibull 5,2 distribution:
weibull.approx.ml<-fun.data.fit.ml(rweibull(1000,5,2))
# Compute the resample K-S test results.
fun.diag1(weibull.approx.ml, rweibull(100000, 5, 2))
Diagnostic function for empirical data distribution fits through the resample Kolmogorov-Smirnoff tests
Description
This function is primarily designed to be used for testing the fitted
distribution with reference to an empirical data. It is also
tailored for output obtained from the fun.data.fit.ml
function.
Usage
fun.diag2(result, data, no.test = 1000, len=100, alpha = 0.05)
Arguments
result |
Output from |
data |
Observations in which the distribution was fitted upon. |
no.test |
Number of times to do the KS tests. |
len |
Number of observations to sample from the data. This is also the number of observations sampled from the fitted distribution in each KS test. |
alpha |
Significance level of KS test. |
Value
A vector showing the number of times the KS p-value is greater than alpha for each of the distribution fit strategy.
Note
If there are ties, jittering is used in ks.gof
.
Author(s)
Steve Su
References
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Journal of Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
See Also
fun.diag1
, fun.diag.ks.g
,
fun.diag.ks.g.bimodal
Examples
# Fits a Normal 3,2 distribution:
junk<-rnorm(1000,3,2)
fit<-fun.data.fit.ml(junk)
# Compute the resample K-S test results.
fun.diag2(fit,junk)
Estimates the mean and variance after cutting up a vector of variable into evenly spaced categories.
Description
This function supplements fun.nclass.e
and it is not intended to
be used by the users directly.
Usage
fun.disc.estimation(x, nint)
Arguments
x |
A vector of observations. |
nint |
Number of intervals to cut the vectors into. |
Details
The function cuts the vector into evenly spaced categories and estimate the mean and variance of the actual data based on the categorisation.
Value
Two numerical values, the first being the mean and the second being the variance.
Author(s)
Steve Su
See Also
Examples
## Cut up a randomly normally distributed observations into 5 evenly spaced
## categories and estimate the mean and variance based on this cateogorisation.
junk<-rnorm(1000,3,2)
fun.disc.estimation(junk,5)
Finds the low discrepancy quasi random numbers
Description
This function calls the runif.sobol
, runif.sobol.owen
and
runif.halton
essentially from the spacefillr package.
Usage
fun.gen.qrn(n, dimension, scrambling, FUN = "runif.sobol")
Arguments
n |
Number to generate. |
dimension |
Number of dimensions. |
scrambling |
seed used, or leap as in the case of
|
FUN |
This can be |
Value
A vector of values if dimension=1, otherwise a matrix of values between 0 and 1.
Author(s)
Steve Su
Examples
fun.gen.qrn(1000,5,3,"runif.sobol")
fun.gen.qrn(1000,5,409,"QUnif")
Find the theoretical first four L moments of the generalised lambda distribution.
Description
This function computes the first four L moments for both RS and FMKL generalised lambda distributions.
Usage
fun.lm.theo.gld(L1, L2, L3, L4, param)
Arguments
L1 |
Lambda 1. Or c(Lambda 1,Lambda 2,Lambda 3,Lambda 4). |
L2 |
Lambda 2. |
L3 |
Lambda 3. |
L4 |
Lambda 4. |
param |
"rs" or "fmkl" or "fkml" |
Value
A vector listing the first four L moments
Note
Sometimes the theoretical moments may not exist, in those cases,
NA
is returned.
Author(s)
Steve Su
References
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
See Also
Examples
fun.lm.theo.gld(1, 2, 3, 4, "rs")
fun.lm.theo.gld(1, 2, 3, 4, "fmkl")
Applying functions based on an index for a matrix.
Description
This is a generic function that can be used to find mean, variance, sum or other operations according to some index imposed on the matrix or vector.
Usage
fun.mApply(X, INDEX, FUN = NULL, ..., simplify = TRUE)
Arguments
X |
Matrix with n rows. |
INDEX |
Vector or list of vectors of length n. |
FUN |
Function to operate on submatrices of |
... |
Arguments to function. |
simplify |
Set as |
Value
If FUN
returns more than one number, fun.mApply
returns a matrix
with rows corresponding to unique values of INDEX
.
Author(s)
Tony Plate
Examples
# Finding the row medians of a matrix (matrix(1:20,nrow=5))
fun.mApply(matrix(1:20,nrow=5),list(1:5),median)
Check whether the specified GLDs cover the minimum and the maximum values in a dataset
Description
This function checks the lowest and highest quantiles of the specified GLDs against the specified dataset
Usage
fun.minmax.check.gld(data, lambdas, param, lessequalmin = 1,
greaterequalmax = 1)
Arguments
data |
A vector of numerical dataset |
lambdas |
A matrix of four columns representing lambda 1 to lambda 4 of the GLD |
param |
Can be "rs", "fkml" or "fmkl" |
lessequalmin |
Can be 0 or 1 |
greaterequalmax |
Can be 0 or 1 |
Details
lessequalmin==1 means the lowest value of GLD <= minimum value of data lessequalmin==0 means the lowest value of GLD < minimum value of data greaterequalmin==1 means the highest value of GLD >= maximum value of data greaterequalmin==0 means the highest value of GLD > maximum value of data
Value
A vector of logical values indicating whether the specified data the specified GLDs cover the minimum and the maximum values in a dataset
Author(s)
Steve Su
Examples
fun.minmax.check.gld(runif(100,.9,1),matrix(1:12,ncol=4),param="rs",0,0)
fun.minmax.check.gld(runif(100,.98,1),matrix(1:12,ncol=4),param="fkml",1,1)
Finds the moments of fitted mixture of generalised lambda distribution by simulation.
Description
This functions compute the mean, variance, skewness and kurtosis of the fitted generalised lambda distribution mixtures using Monte Carlo simulation.
Usage
fun.moments.bimodal(result1, result2, prop1, prop2, len = 1000,
no.test = 1000, param1, param2)
Arguments
result1 |
A vector comprising four values for the first generalised lambda distribution. |
result2 |
A vector comprising four values for the second generalised lambda distribution. |
prop1 |
Proportion of the first generalised lambda distribution |
prop2 |
1-prop1, this can be left unspecified. |
len |
Length of object for each simulation run. |
no.test |
Number of simulation run. |
param1 |
This can be |
param2 |
This can be |
Details
There is also a theoretical computation of the moments in
fun.theo.bi.mv.gld
, it should be noted
that the theoretical moments may not exist. The length of object in len
means how many observations should
be generated in each simulation run, with the number of simulation runs governed
by no.test
.
Value
A matrix with four columns showing the mean, variance, skewness and kurtosis of the fitted generalised lambda distribution mixtures using Monte Carlo simulation. Each row represents a simulation run.
Author(s)
Steve Su
See Also
fun.theo.bi.mv.gld
, fun.simu.bimodal
,
fun.rawmoments
Examples
# Fitting the first column of the Old Faithful Geyser data
fit1<-fun.auto.bimodal.ml(faithful[,1],init1.sel="rmfmkl",init2.sel="rmfmkl",
init1=c(-0.25,1.5),init2=c(-0.25,1.5),leap1=3,leap2=3)
# After fitting compute the monte carlo moments using fun.moments.bimodal
fun.moments.bimodal(fit1$par[1:4],fit1$par[5:8],prop1=fit1$par[9],
param1="fmkl",param2="fmkl")
# It is also possible to compare this with the moments of the original dataset:
fun.moments(faithful[,1])
Calculate mean, variance, skewness and kurtosis of a numerical vector
Description
This function evaluates the mean, variance, skewness and kurtosis of a numerical vector. Missing values are automatically removed.
Usage
fun.moments.r(x, normalise = "N")
Arguments
x |
A numeric vector |
normalise |
"Y" if you want kurtosis to be calculated with reference to kurtosis = 0 under Normal distribution. Default is "N". |
Value
A vector of mean, variance, skewness and kurtosis.
Note
Please contact the author directly if you find a bug!
Author(s)
Steve Su
See Also
Examples
fun.moments.r(rnorm(1000))
fun.moments.r(rnorm(1000),normalise="Y")
Estimates the number of classes or bins to smooth over in the discretised method of fitting generalised lambda distribution to data.
Description
Support function for discretised method of fitting distribution to data.
Usage
fun.nclass.e(x)
Arguments
x |
Vector of data. |
Details
This function calculates the mean and variance of the discretised data from 1 to the very last observation and chooses the best number of categories that represent the mean and variance of the actual data set through the criterion of squared deviations.
Value
A numerical value suggesting the best number of class that can be used to represent the mean and variane of the original data set.
Note
This is not designed to be called directly by end user.
Author(s)
Steve Su
See Also
Examples
fun.nclass.e(rnorm(100,3,2))
Plotting the univariate generalised lambda distribution fits on the data set.
Description
This function is designed for univariate generalised lambda distribution fits only.
Usage
fun.plot.fit(fit.obj, data, nclass = 50, xlab = "", name = "", param.vec,
ylab="Density", main="")
Arguments
fit.obj |
Fitted object from |
data |
Dataset to be plotted. |
nclass |
Number of class of histogram, the default is 50. |
xlab |
Label on the x axis. |
name |
Naming the type of distribution fits. |
param.vec |
A vector describing the type of generalised lambda
distribution used in the |
ylab |
Label on the y axis. |
main |
Title of the graph. |
Value
A graphical output showing the data and the resulting distributional fits.
Note
If the distribution fits over fits the peak of the distribution, it can be difficult to see the actual data set.
Author(s)
Steve Su
See Also
fun.plot.fit.bm
, fun.data.fit.ml
,
fun.data.fit.hs
, fun.data.fit.hs.nw
,
fun.RPRS.ml
, fun.RMFMKL.ml
,
fun.RPRS.hs
, fun.RMFMKL.hs
,
fun.RPRS.hs.nw
, fun.RMFMKL.hs.nw
Examples
# Generate Normally distribute random numbers as dataset
junk<-rnorm(1000,3,2)
# Fit the data set using fun.data.fit.ml.
# Also, fun.data.fit.hs or fun.data.fit.hs.nw can be used.
obj.fit<-fun.data.fit.ml(junk)
# Plot the resulting fits
fun.plot.fit(obj.fit,junk,xlab="x",name=".ML",param.vec=c("rs","fmkl","fmkl"))
# This function also works for singular fits such as those by fun.RPRS.ml,
# fun.RMFMKL.ml, fun.RPRS.hs, fun.RMFMKL.hs, fun.RPRS.hs.nw, fun.RMFMKL.hs.nw
junk<-rnorm(1000,3,2)
obj.fit<-fun.RPRS.ml(junk)
fun.plot.fit(obj.fit,junk,xlab="x",name="RPRS.ML",param.vec=c("rs"))
Plotting mixture of two generalised lambda distributions on the data set.
Description
This function is designed for mixture of two generalised lambda distributions only.
Usage
fun.plot.fit.bm(fit.obj, data, nclass = 50, xlab = "", name = "", main="",
param.vec, ylab="Density")
Arguments
fit.obj |
Fitted object from |
data |
Dataset to be plotted. |
nclass |
Number of class of histogram, the default is 50. |
xlab |
Label on the x axis. |
name |
Legend, usually used to identify type of GLD used if |
main |
Title of the graph. |
param.vec |
A vector describing the type of generalised lambda
distribution used in the |
ylab |
Label on the y axis. |
Value
A graphical output showing the data and the resulting distributional fits.
Note
If the distribution fits over fits the peak of the distribution, it can be difficult to see the actual data set.
Author(s)
Steve Su
See Also
fun.auto.bimodal.ml
, fun.auto.bimodal.pml
,
fun.plot.fit
Examples
opar <- par()
par(mfrow=c(2,1))
# Fitting mixture of generalised lambda distributions on the data set using
# both the maximum likelihood and partition maximum likelihood and plot
# the resulting fits
junk<-fun.auto.bimodal.ml(faithful[,1],per.of.mix=0.1,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1,5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
name="Maximum likelihood using",xlab="faithful1",param.vec=c("rs","fmkl"))
junk<-fun.auto.bimodal.pml(faithful[,1],clustering.m=clara,init1.sel="rprs",
init2.sel="rmfmkl",init1=c(-1.5,1,5),init2=c(-0.25,1.5),leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
name="Partition maximum likelihood using",xlab="faithful1",
param.vec=c("rs","fmkl"))
junk<-fun.auto.bimodal.ml(faithful[,1],per.of.mix=0.1,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1,5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
main="Mixture distribution fit",
name="RS and FMKL GLD",xlab="faithful1",param.vec=c("rs","fmkl"))
par(opar)
Plotting many univariate generalised lambda distributions on one page.
Description
This is a variant of the fun.plot.fit
function.
Usage
fun.plot.many.gld(fit.obj, data, xlab="", ylab="Density", main="", legd="",
param.vec)
Arguments
fit.obj |
A matrix of generalised lambda distibutions parameters
from |
data |
Dataset to be plotted or two values showing the ranges of value to be compared. |
xlab |
X-axis labels. |
ylab |
Y-axis labels. |
main |
Title for the plot. |
legd |
Legend for the plot. |
param.vec |
A vector showing the types of generalised lambda distributions. This can be "rs" or "fmkl", only needed if you want to put your own parameters for generalised lambda distributions which are not generated from a fitting algorithm in this package. |
Value
A graph showing the different distributions on the same page.
Note
The data part of the function is not plotted, to see the dataset use the
fun.plot.fit
function.
Author(s)
Steve Su
See Also
Examples
# Fit the dataset
junk<-rnorm(1000,3,2)
result.hs<-fun.data.fit.hs(junk,rs.default = "Y", fmkl.default = "Y",
rs.leap=3, fmkl.leap=3,rs.init = c(-1.5, 1.5), fmkl.init = c(-0.25, 1.5),
no.c.rs=50,no.c.fmkl=50)
opar <- par()
par(mfrow=c(2,2))
# Plot the entire data range
fun.plot.many.gld(result.hs,junk,"x","density","",
legd=c("RPRS.hs", "RMFMKL.hs"))
# Plot and compare parts of the distributions
fun.plot.many.gld(result.hs,c(1,2),"x","density","",legd=c("RPRS.hs",
"RMFMKL.hs"))
fun.plot.many.gld(result.hs,c(0.1,0,2),"x","density","",legd=c("RPRS.hs",
"RMFMKL.hs"))
fun.plot.many.gld(result.hs,c(3,4),"x","density","",legd=c("RPRS.hs",
"RMFMKL.hs"))
par(opar)
Computes the raw moments of the generalised lambda distribution up to 4th order.
Description
This function is of theoretical interest only.
Usage
fun.rawmoments(L1, L2, L3, L4, param = "fmkl")
Arguments
L1 |
Location parameter of the generalised lambda distribution. |
L2 |
Scale parameter of the generalised lambda distribution. |
L3 |
First shape parameter of the generalised lambda distribution. |
L4 |
Second shape parameter of the generalised lambda distribution. |
param |
|
Details
This function is the building block for fun.theo.bi.mv.gld
.
Value
A vector showing the raw moments of the specified generalised lambda distribution up to the fourth order.
Author(s)
Steve Su
References
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Karian, Zaven A. and Dudewicz, Edward J. (2000), Fitting statistical distributions: the Generalized Lambda Distribution and Generalized Bootstrap methods, Chapman & Hall
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
See Also
~~objects to See Also as help
, ~~~
Examples
## Generate some random numbers using FMKL and RS generalised lambda
## distributions and then compute the empirical and theoretical
## E(X), E(X^2), E(X^3), E(X^4)
junk<-rgl(100000,1,2,3,4)
mean(junk)
mean(junk^2)
mean(junk^3)
mean(junk^4)
junk<-rgl(100000,1,2,3,4,"rs")
mean(junk)
mean(junk^2)
mean(junk^3)
mean(junk^4)
fun.rawmoments(1,2,3,4)
fun.rawmoments(1,2,3,4,"rs")
Simulate a mixture of two generalised lambda distributions.
Description
This function allows the user to simulate observations from a mixture of two generalised lambda distributions. It can be very useful for sensitivity analysis.
Usage
fun.simu.bimodal(result1, result2, prop1, prop2, len = 1000,
no.test = 1000, param1, param2)
Arguments
result1 |
A vector comprising four values for the first generalised lambda distribution. |
result2 |
A vector comprising four values for the second generalised lambda distribution. |
prop1 |
Proportion of the first generalised lambda distribution |
prop2 |
1-prop1, this can be left unspecified. |
len |
Length of object for each simulation run. |
no.test |
Number of simulation run. |
param1 |
This can be |
param2 |
This can be |
Details
The length of object in len
means how many observations should
be generated in each simulation run, with the number of simulation runs governed
by no.test
.
Value
A list with length equal to the number of simulation runs. Each subset of the
list has random observations equal to the the number specified in
len
.
Author(s)
Steve Su
See Also
fun.theo.bi.mv.gld
, fun.moments.bimodal
,
fun.rawmoments
Examples
# Generate random observations from FMKL generalised lambda distributions with
# parameters (1,2,3,4) and (4,3,2,1) with 50% of data from each distribution.
junk<-fun.simu.bimodal(c(1,2,3,4),c(4,3,2,1),prop1=0.5,param1="fmkl",
param2="fmkl")
# Calculate the maximum number from each simulation run
sapply(junk,max)
# Calculate the median from each simulation run
sapply(junk,median)
Calculates the theoretical mean, variance, skewness and kurtosis for mixture of two generalised lambda distributions.
Description
This is the bimodal counterpart for fun.comp.moments.ml.2
and
fun.comp.moments.ml
.
Usage
fun.theo.bi.mv.gld(L1, L2, L3, L4, param1, M1, M2, M3, M4, param2, p1,
normalise="N")
Arguments
L1 |
Location parameter of the first generalised lambda distribution. Or all the parameters of mixture distribution in the form of c(L1,L2,L3,L4,M1,M2,M3,M4,p), you still must specify param1 and param2. |
L2 |
Scale parameter of the first generalised lambda distribution. |
L3 |
First shape parameter of the first generalised lambda distribution. |
L4 |
Second shape parameter of the first generalised lambda distribution. |
param1 |
|
M1 |
Location parameter of the second generalised lambda distribution |
M2 |
Scale parameter of the second generalised lambda distribution. |
M3 |
First shape parameter of the second generalised lambda distribution. |
M4 |
Second shape parameter of the second generalised lambda distribution. |
param2 |
|
p1 |
Proportion of the first generalisd lambda distribution. |
normalise |
"Y" if you want kurtosis to be calculated with reference to kurtosis = 0 under Normal distribution. |
Value
A vector showing the theoretical mean, variance, skewness and kurtosis for mixture of two generalised lambda distributions.
Note
The theoretical moments may not always exist for generalised lambda distributions.
Author(s)
Steve Su
References
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Karian, Zaven A. and Dudewicz, Edward J. (2000), Fitting statistical distributions: the Generalized Lambda Distribution and Generalized Bootstrap methods, Chapman & Hall
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
See Also
fun.moments.bimodal
, fun.simu.bimodal
,
fun.rawmoments
Examples
# Fits the Old Faithful geyser data (first column) using the maximum
# likelihood.
fit1<-fun.auto.bimodal.ml(faithful[,1],init1.sel="rmfmkl",init2.sel="rmfmkl",
init1=c(-0.25,1.5),init2=c(-0.25,1.5),leap1=3,leap2=3)
# Find the theoretical moments of the fit
fun.theo.bi.mv.gld(fit1$par[1],fit1$par[2],fit1$par[3],fit1$par[4],"fmkl",
fit1$par[5],fit1$par[6],fit1$par[7],fit1$par[8],"fmkl",fit1$par[9])
# Compare this with the empirical moments from the data set.
fun.moments(faithful[,1])
Find the theoretical first four moments of the generalised lambda distribution.
Description
Computes the "mean","variance","skewness","kurtosis" statistics from a given generalised lambda distribution.
Usage
fun.theo.mv.gld(L1, L2, L3, L4, param, normalise="N")
Arguments
L1 |
Lambda 1. Or c(Lambda 1,Lambda 2,Lambda 3,Lambda 4). |
L2 |
Lambda 2. |
L3 |
Lambda 3. |
L4 |
Lambda 4. |
param |
"rs" or "fmkl" or "fkml" |
normalise |
"Y" if you want kurtosis to be calculated with reference to kurtosis = 0 under Normal distribution. |
Value
A vector listing the values of mean, variance, skewness and kurtosis.
Note
Sometimes the theoretical moments may not exist, in those cases,
NA
is returned.
Author(s)
Steve Su
References
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Gilchrist, Warren G. (2000), Statistical Modelling with Quantile Functions, Chapman & Hall
Karian, Z.A., Dudewicz, E.J., and McDonald, P. (1996), The extended generalized lambda distribution system for fitting distributions to data: history, completion of theory, tables, applications, the “Final Word” on Moment fits, Communications in Statistics - Simulation and Computation *25*, 611-642.
Karian, Zaven A. and Dudewicz, Edward J. (2000), Fitting statistical distributions: the Generalized Lambda Distribution and Generalized Bootstrap methods, Chapman & Hall
See Also
fun.comp.moments.ml
,
fun.comp.moments.ml.2
, fun.lm.theo.gld
Examples
fun.theo.mv.gld(1, 2, 3, 4, "rs")
fun.theo.mv.gld(1, 2, 3, 4, "fmkl")
Determine which values are zero.
Description
Returns an integer vector showing the position of zero values in the data.
Usage
fun.which.zero(data)
Arguments
data |
A vector of data. |
Value
An integer vector showing the position of zero values in the data.
Note
Any missing values will be returned as missing.
Author(s)
Steve Su
See Also
Examples
# Finding where the zeros are in this vector: c(0,1,2,3,4,0,2)
fun.which.zero(c(0,1,2,3,4,0,2))
# Finding where the zeros are in this vector: c(0,1,2,3,NA,0,2)
fun.which.zero(c(0,1,2,3,NA,0,2))
Returns a vector after removing all the zeros.
Description
This function returns a vector after removing all the zeros.
Usage
fun.zero.omit(object)
Arguments
object |
A vector of data. |
Value
Returns a vector after removing zeros and also give information on the number of zeros in the data removed.
Note
Missing value and Inf values are not removed in this zero removing process.
Author(s)
Steve Su
See Also
Examples
# Removing zero entries from the vector c(0,1,2,3,4,0,2)
fun.zero.omit(c(0,1,2,3,4,0,2))
Checks whether the parameters provided constitute a valid generalised lambda distribution.
Description
An alternative to the gl.check.lambda
function in gld package.
Usage
gl.check.lambda.alt(l1, l2, l3, l4, param = "fmkl")
Arguments
l1 |
Lambda 1. |
l2 |
Lambda 2. |
l3 |
Lambda 3. |
l4 |
Lambda 4. |
param |
|
Details
This version differs from gl.check.lambda
in gld library.
Value
A logical value, TRUE
or FALSE
. TRUE
indicates the
parameters given is a valid probability distribution.
Author(s)
Steve Su
References
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Karian, Z.E., Dudewicz, E.J., and McDonald, P. (1996), The extended generalized lambda distribution system for fitting distributions to data: history, completion of theory, tables, applications, the “Final Word” on Moment fits, Communications in Statistics - Simulation and Computation *25*, 611-642.
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
See Also
Examples
gl.check.lambda.alt(0,1,.23,4.5,param="fmkl") ## TRUE
gl.check.lambda.alt(0,-1,.23,4.5,param="fmkl") ## FALSE
gl.check.lambda.alt(0,1,0.5,-0.5,param="rs") ## FALSE
Checks whether the parameters provided constitute a valid generalised lambda distribution.
Description
A replacement to the gl.check.lambda function in gld package.
Usage
gl.check.lambda.alt1(l1, l2 = NULL, l3 = NULL, l4 = NULL,
param = "fmkl", vect = FALSE)
Arguments
l1 |
Lambda 1. |
l2 |
Lambda 2. |
l3 |
Lambda 3. |
l4 |
Lambda 4. |
param |
|
vect |
A logical, set this to |
Details
This is a modified gl.check.lambda
function in replace of gld
library's gl.check.lambda
function to allow for 5 parameters FMKL
distributions and vector input of parameter values into this function.
Value
A logical value, TRUE
or FALSE
. TRUE
indicates the
parameters given is a valid probability distribution.
Author(s)
Steve Su
References
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Karian, Z.E., Dudewicz, E.J., and McDonald, P. (1996), The extended generalized lambda distribution system for fitting distributions to data: history, completion of theory, tables, applications, the “Final Word” on Moment fits, Communications in Statistics - Simulation and Computation *25*, 611-642.
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
See Also
Examples
gl.check.lambda.alt1(c(0,1,.23,4.5),param="fmkl",vect=TRUE)
## TRUE, Using vector input of parameter values.
gl.check.lambda.alt1(0,-1,.23,4.5,param="fmkl") ## FALSE
gl.check.lambda.alt1(0,1,0.5,-0.5,param="rs") ## FALSE
Histogram with exact number of bins specified by the user
Description
The generic function histsu computes a histogram of the given data values.
Usage
histsu(x, breaks = "Sturges", freq = NULL, probability = !freq,
include.lowest = TRUE, right = TRUE, density = NULL, angle = 45, col = NULL,
border = NULL, main = paste("Histogram of", xname), xlim = range(breaks),
ylim = NULL, xlab = xname, ylab, axes = TRUE, plot = TRUE, labels = FALSE,
nclass = NULL, ...)
Arguments
x |
A vector of values for which the histogram is desired. |
breaks |
Either: 1) A vector giving the breakpoints between histogram cells, OR 2) A single number giving the number of cells for the histogram, OR 3) A character string naming an algorithm to compute the number of cells (see Details), OR 4) A function to compute the number of cells. |
freq |
logical; if |
probability |
A logical value, |
include.lowest |
If |
right |
If |
density |
The density of shading lines, in lines per inch. The default
value of |
angle |
The slope of shading lines, given as an angle in degrees (counter-clockwise). |
col |
A colour to be used to fill the bars. The default of |
border |
The color of the border around the bars. The default is to use the standard foreground color. |
main |
Title of the graph. |
xlim |
A two valued vector specifying the lower and upper limits of the x axis. |
ylim |
A two valued vector specifying the lower and upper limits of the y axis. |
xlab |
X axis labels. |
ylab |
Y axis labels. |
axes |
Logical value, if |
plot |
Logical value, if |
labels |
Logical or character. Additionally draw labels on top of bars,
if not |
nclass |
Number of bins of the histogram. |
... |
Other graphical parameters, see par for details. |
Details
See hist
help file. This function forces the number of class of
histogram to that as specified by the user.
Value
An object of class "histogram"
which is a list with components:
breaks |
The n+1 cell boundaries (= |
counts |
N integers; for each cell, the number of |
density |
Values as estimated density values. If |
intensities |
Same as density, deprecated. |
mids |
The n cell midpoints. |
xname |
A character string with the actual |
equidist |
Logical, indicating if the distances between |
Note
Please see hist
help file.
Author(s)
R development team with modifications by Steve Su
References
Venables, W. N. and Ripley. B. D. (2002) Modern Applied Statistics with S. Springer.
See Also
Examples
# See hist for extended example:
junk<-rgamma(1000,5)
# Forcing the number of bins to be 10:
histsu(junk,nclass=10)
Returns a logical vecto, TRUE if the value is Inf or -Inf.
Description
This function works in similar fashion as in is.na
and
is.notinf
.
Usage
is.inf(x)
Arguments
x |
A numerical value or a vector of data. |
Value
A logical vector, T
if the value is Inf or -Inf.
Note
In the presence of missing value, the function will return a missing value.
Author(s)
Steve Su
See Also
Examples
is.inf(c(Inf,2,2,1,-Inf))
Returns a logical vector TRUE, if the value is not Inf or -Inf.
Description
This function works in similar fashion as in is.na
and is.inf
.
Usage
is.notinf(x)
Arguments
x |
A numerical value or a vector of data. |
Value
A logical vector, T
if the value is not Inf or -Inf.
Note
In the presence of missing value, the function will return a missing value.
Author(s)
Steve Su
See Also
Examples
is.notinf(c(Inf,2,2,1,-Inf))
Kolmogorov-Smirnov test
Description
Performs one or two sample Kolmogorov-Smirnov tests.
Usage
ks.gof(x, y, ..., alternative = c("two.sided", "less", "greater"),
exact = NULL)
Arguments
x |
A numeric vector of data values. |
y |
Either a numeric vector of data values, or a character string naming a distribution function. |
... |
Parameters of the distribution specified (as a character string) by 'y'. |
alternative |
Indicates the alternative hypothesis and must be one of
|
exact |
|
Details
If y
is numeric, a two-sample test of the null hypothesis that
x
and y
were drawn from the same continuous distribution is
performed.
Alternatively, y
can be a character string naming a continuous
distribution function. In this case, a one-sample test is carried
out of the null that the distribution function which generated x
is distribution y
with parameters specified by "...".
The possible values "two.sided"
(default),"less"
and "greater"
of alternative
specify the null hypothesis that the
true distribution function of x
is equal to, not less than or not
greater than the hypothesized distribution function (one-sample case) or the
distribution function of y
(two-sample case),
respectively.
Exact p-values are not available for the one-sided two-sample case, or in the
case of ties. exact = NULL
(the default), an exact p-value is computed
if the sample size if less than 100 in the one-sample case, and if the product
of the sample sizes is less than 10000 in the two-sample case. Otherwise,
asymptotic distributions are used whose approximations may be inaccurate in
small samples. In the one-sample two-sided case, exact p-values are obtained as
described in Marsaglia, Tsang & Wang (2003). The formula of Birnbaum & Tingey
(1951) is used for the one-sample one-sided case.
If a single-sample test is used, the parameters specified in "..." must be
pre-specified and not estimated from the data. There is some more refined
distribution theory for the KS test with estimated parameters
(see Durbin, 1973), but that is not implemented in ks.gof
.
Value
A list with class "htest"
containing the following components:
statistic |
Value of test statistics. |
p.value |
P-value. |
alternative |
Character string describing the alternative hypothesis. |
method |
Character string indicating what type of test was performed. |
data.name |
Character string giving the name(s) of the data. |
Note
This function handle ties by jittering, adding a very small uniform random number generated from the minimal value of the data set divided by 1e+08 to minimal value divided by 1e+07.
Author(s)
R
References
Z. W. Birnbaum & Fred H. Tingey (1951), One-sided confidence contours for probability distribution functions. The Annals of Mathematical Statistics, *22*/4, 592-596.
William J. Conover (1971), Practical nonparametric statistics. New York: John Wiley & Sons. Pages 295-301 (one-sample "Kolmogorov" test), 309-314 (two-sample "Smirnov" test).
Durbin, J. (1973) Distribution theory for tests based on the sample distribution function. SIAM.
George Marsaglia, Wai Wan Tsang & Jingbo Wang (2003), Evaluating Kolmogorov's distribution. Journal of Statistical Software, *8*/18. <URL: http://www.jstatsoft.org/v08/i18/>.
See Also
Examples
x <- rnorm(50)
y <- runif(30)
# Do x and y come from the same distribution?
ks.gof(x, y)
# Does x come from a shifted gamma distribution with shape 3 and rate 2?
ks.gof(x+2, "pgamma", 3, 2) # two-sided, exact
ks.gof(x+2, "pgamma", 3, 2, exact = FALSE)
ks.gof(x+2, "pgamma", 3, 2, alternative = "gr")
An alternative to the normal pretty function in R.
Description
Divide a range of values into equally spaced divisions. End points are given as output.
Usage
pretty.su(x, nint = 5)
Arguments
x |
A vector of values. |
nint |
Number of intervals required. |
Details
This is also used for the plotting of histogram in the histsu
function.
Value
A vector of endpoints dividing the data into equally spaced regions.
Author(s)
Steve Su
See Also
Examples
# Generate random numbers from normal distribution:
junk<-rnorm(1000,2,3)
# Cut them into 7 regions, 8 endpoints.
pretty.su(junk,7)
Do a quantile plot on the univariate distribution fits.
Description
This plots the theoretical and actual data quantiles to allow the user to examine the adequacy of a single gld distribution fit.
Usage
qqplot.gld(data, fit, param, len = 10000, name = "",
corner = "topleft",type="",range=c(0,1),xlab="",main="")
Arguments
data |
Data fitted. |
fit |
Parameters of distribution fit. |
param |
Can be either |
len |
Precision of the quantile calculatons. Default is 10000. This means 10000 points are taken from 0 to 1. |
name |
Name of the data set, added to the title of plot if |
corner |
Can be |
type |
This can be "" or "str.qqplot", the first produces the raw quantiles and the second plot them on a straight line. Default is "". |
range |
This is the range for which the quantiles are to be plotted.
Default is |
xlab |
x axis label, if left blank, then default is "Data". |
main |
Title of the plot, if left blank, a default title will be added. |
Value
A plot is given.
Author(s)
Steve Su
See Also
Examples
set.seed(1000)
junk<-rweibull(300,3,2)
# Fit the function using fun.data.fit.ml
obj.fit1.ml<-fun.data.fit.ml(junk)
# Do a quantile plot on the raw quantiles
qqplot.gld(junk,obj.fit1.ml[,1],param="rs",name="RS ML fit")
# Or a qq plot to examine deviation from straight line
qqplot.gld(junk,obj.fit1.ml[,1],param="rs",name="RS ML fit",type="str.qqplot")
Do a quantile plot on the bimodal distribution fits.
Description
This plots the theoretical and actual data quantiles to allow the user to examine the adequacy of two gld distributions mixture fit.
Usage
qqplot.gld.bi(data, fit, param1, param2, len = 10000, name = "",
corner = "topleft",type="",range=c(0,1),xlab="",main="")
Arguments
data |
Data fitted. |
fit |
Parameters of distribution fit. |
param1 |
Can be either |
param2 |
Can be either |
len |
Precision of the quantile calculatons. Default is 10000. This means 10000 points are taken from 0 to 1. |
name |
Name of the data set, added to the title of plot if |
corner |
Can be |
type |
This can be "" or "str.qqplot", the first produces the raw quantiles and the second plot them on a straight line. Default is "". |
range |
This is the range for which the quantiles are to be plotted.
Default is |
xlab |
x axis label, if left blank, then default is "Data" |
main |
Title of the plot, if left blank, a default title will be added. |
Value
A plot is given.
Author(s)
Steve Su
See Also
Examples
set.seed(1000)
junk<-rweibull(300,3,2)
## Fitting mixture of generalised lambda distributions on the data set using
## both the maximum likelihood and partition maximum likelihood and plot the
## resulting fits
junk<-fun.auto.bimodal.ml(faithful[,1],per.of.mix=0.1,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
name="Maximum likelihood using",xlab="faithful1",param.vec=c("rs","fmkl"))
## Do a quantile plot on the raw quantiles
qqplot.gld.bi(faithful[,1],junk$par,param1="rs",param2="fmkl",
name="RS FMKL ML fit")
## Or a qq plot to examine deviation from straight line
qqplot.gld.bi(faithful[,1],junk$par,param1="rs",param2="fmkl",
name="RS FMKL ML fit",type="str.qqplot")
Compute skewness and kurtosis statistics
Description
This uses the S+ version directly.
Usage
skewness(x, na.rm = FALSE, method = "fisher")
kurtosis(x, na.rm = FALSE, method = "fisher")
Arguments
x |
Any numerical object. Missing values |
na.rm |
Logical flag: if na.rm=TRUE, missing values are removed from x
before doing the computations. If |
method |
Character string specifying the computation method. The two
possible values are |
Details
The moment
forms are based on the definitions of skewness and kurtosis
for distributions; these forms should be used when resampling (bootstrap or
jackknife). The "fisher" forms correspond to the usual "unbiased" definition of
sample variance, though in the case of skewness and kurtosis exact unbiasedness
is not possible.
Value
A single value of skewness or kurtotis.
If y = x - mean(x), then the "moment" method computes the skewness value as
mean(y\mbox{\textasciitilde}
3)/mean(y\mbox{\textasciitilde}
2)
\mbox{\textasciitilde}
1.5
and the kurtosis value as mean(y\mbox{\textasciitilde}
4)/mean(y
\mbox{\textasciitilde}
2)\mbox{\textasciitilde}
2 - 3.
To see the "fisher" calculations, print out the functions.
Author(s)
Splus
See Also
var
Examples
x <- runif(30)
skewness(x)
skewness(x, method="moment")
kurtosis(x)
kurtosis(x, method="moment")
Carry out the “starship” estimation method for the generalised lambda distribution
Description
Calculates estimates for the FMKL parameterisation of the generalised lambda
distribution on the basis of data, using the starship method.
The starship method is built on the fact that the
generalised lambda distribution
is a transformation of the uniform distribution. This method finds the
parameters that transform the data closest to the uniform distribution.
This function uses a grid-based search to find a suitable starting point (using
starship.adaptivegrid
) then uses optim
to find
the parameters that do this.
Usage
starship(data, optim.method = "Nelder-Mead", initgrid = NULL, param="FMKL",
optim.control=NULL)
Arguments
data |
Data to be fitted, as a vector | |||||||||||||||||||||||||||||||||
optim.method |
Optimisation method for | |||||||||||||||||||||||||||||||||
initgrid |
Grid of values of If it is left as NULL, the default grid depends on the parameterisation.
For
( For
( For
and
| |||||||||||||||||||||||||||||||||
param |
choose parameterisation:
| |||||||||||||||||||||||||||||||||
optim.control |
List of options for the optimisation step. See
|
Details
The starship method is described in King and MacGillivray, 1999 (see references). It is built on the fact that the generalised lambda distribution is a transformation of the uniform distribution. Thus the inverse of this transformation is the distribution function for the gld. The starship method applies different values of the parameters of the distribution to the distribution function, calculates the depths q corresponding to the data and chooses the parameters that make the depths closest to a uniform distribution.
The closeness to the uniform is assessed by calculating the Anderson-Darling
goodness-of-fit test on the transformed data against the uniform, for a
sample of size length(data)
.
This is implemented in 2 stages in this function. First a grid search is
carried out, over a small number of possible parameter values
(see starship.adaptivegrid
for details). Then the minimum from
this search is given as a starting point for an optimisation of the
Anderson-Darling value using optim, with method given by optim.method
See references for details on parameterisations.
Value
Returns a list, with
lambda |
A vector of length 4, giving
the estimated parameters, in order,
|
grid.results |
output from the grid search - see
|
optim |
output from the optim search -
|
Author(s)
Robert King, Darren Wraith
References
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods 17, 3547–3567.
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM 17, 78–82.
King, R.A.R. & MacGillivray, H. L. (1999), A starship method for
fitting the generalised \lambda
distributions,
Australian and New Zealand Journal of
Statistics 41, 353–374
Owen, D. B. (1988), The starship, Communications in Statistics - Computation and Simulation 17, 315–323.
See Also
starship.adaptivegrid
,
starship.obj
Examples
data <- rgl(100,0,1,.2,.2)
starship(data,optim.method="Nelder-Mead",initgrid=list(lcvect=(0:4)/10,
ldvect=(0:4)/10))
Carry out the “starship” estimation method for the generalised lambda distribution using a grid-based search
Description
Calculates estimates for the generalised lambda distribution on the basis of data, using the starship method. The starship method is built on the fact that the generalised lambda distribution is a transformation of the uniform distribution. This method finds the parameters that transform the data closest to the uniform distribution. This function uses a grid-based search.
Usage
starship.adaptivegrid(data, initgrid=list(
lcvect = c(-1.5, -1, -0.5, -0.1, 0, 0.1, 0.2, 0.4, 0.8, 1, 1.5),
ldvect = c(-1.5, -1, -0.5, -0.1, 0, 0.1, 0.2, 0.4, 0.8, 1, 1.5),
levect = c(-0.5,-0.25,0,0.25,0.5)),param="FMKL")
Arguments
data |
Data to be fitted, as a vector |
initgrid |
A list with elements,
Note: if |
param |
choose parameterisation:
|
Details
The starship method is described in King and MacGillivray, 1999 (see references). It is built on the fact that the generalised lambda distribution is a transformation of the uniform distribution. Thus the inverse of this transformation is the distribution function for the gld. The starship method applies different values of the parameters of the distribution to the distribution function, calculates the depths q corresponding to the data and chooses the parameters that make the depths closest to a uniform distribution.
The closeness to the uniform is assessed by calculating the Anderson-Darling
goodness-of-fit test on the transformed data against the uniform, for a
sample of size length(data)
.
This function carries out a grid-based search. This was the original method
of King and MacGillivray, 1999, but you are advised to instead use
starship
which uses a grid-based search together with an
optimisation based search.
See references for details on parameterisations.
Value
response |
The minimum “response value” — the result of the internal goodness-of-fit measure. This is the return value of starship.obj. See King and MacGillivray, 1999 for more details |
lambda |
A vector of length 4 giving the values of
|
Author(s)
Robert King, Darren Wraith
References
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods 17, 3547–3567.
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM 17, 78–82.
King, R.A.R. & MacGillivray, H. L. (1999), A starship method for
fitting the generalised \lambda
distributions,
Australian and New Zealand Journal of
Statistics 41, 353–374
Owen, D. B. (1988), The starship, Communications in Statistics - Computation and Simulation 17, 315–323.
See Also
Examples
data <- rgl(100,0,1,.2,.2)
starship.adaptivegrid(data,list(lcvect=(0:4)/10,ldvect=(0:4)/10))
Objective function that is minimised in starship estimation method
Description
The starship is a method for fitting the generalised lambda distribution.
See starship
for more details.
This function is the objective funciton minimised in the methods. It is a goodness of fit measure carried out on the depths of the data.
Usage
starship.obj(par, data, param = "fmkl")
Arguments
par |
parameters of the generalised lambda distribution, a vector of
length 4, giving |
data |
Data — a vector |
param |
choose parameterisation:
|
Details
The starship method is described in King and MacGillivray, 1999 (see references). It is built on the fact that the generalised lambda distribution is a transformation of the uniform distribution. Thus the inverse of this transformation is the distribution function for the gld. The starship method applies different values of the parameters of the distribution to the distribution function, calculates the depths q corresponding to the data and chooses the parameters that make the depths closest to a uniform distribution.
The closeness to the uniform is assessed by calculating the Anderson-Darling
goodness-of-fit test on the transformed data against the uniform, for a
sample of size length(data)
.
This function returns that objective function. It is provided as a seperate
function to allow users to carry out minimisations using optim
or other methods. The recommended method is to use the link{starship}
function.
Value
The Anderson-Darling goodness of fit measure, computed on the transformed data, compared to a uniform distribution. Note that this is NOT the goodness-of-fit measure of the generalised lambda distribution with the given parameter values to the data.
Author(s)
Robert King, Darren Wraith
References
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods 17, 3547–3567.
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM 17, 78–82.
King, R.A.R. & MacGillivray, H. L. (1999), A starship method for
fitting the generalised \lambda
distributions,
Australian and New Zealand Journal of
Statistics 41, 353–374
Owen, D. B. (1988), The starship, Communications in Statistics - Computation and Simulation 17, 315–323.
See Also
starship
starship.adaptivegrid
,
Examples
data <- rgl(100,0,1,.2,.2)
starship.obj(c(0,1,.2,.2),data,"fmkl")
Trimmed L-moments
Description
Calculates sample trimmed L-moments with trimming parameter 1.
Usage
t1lmoments(data,rmax=4)
Arguments
data |
matrix or data frame. |
rmax |
maximum order of trimmed L-moments. |
Value
array of trimmed L-moments (trimming parameter = 1) up to order 4 containing a row for each variable in data.
Note
Functions link{Lmoments}
and link{Lcoefs}
calculate
trimmed L-moments if you specify trim=c(1,1)
.
Author(s)
Juha Karvanen juha.karvanen@ktl.fi
References
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Elamir, E. A., Seheult, A. H. 2004. "Exact variance structure of sample L-moments" Journal of Statistical Planning and Inference 124 (2) 337-359.
Hosking, J. 1990. "L-moments: Analysis and estimation distributions using linear combinations of order statistics", Journal of Royal Statistical Society B 52, 105-124.
See Also
Lmoments
for L-moments
Examples
x<-rnorm(500)
t1lmoments(x)
Determine Missing Values
Description
Returns a vector showing the position of missing values in a vector.
Usage
which.na(x)
Arguments
x |
An object which should be of |
Value
This returns the indices of values in x
which are missing or
"Not a Number".
Examples
# A non-zero number divided by zero creates infinity,
# zero over zero creates a NaN
weird.values <- c(1/0, -2.9/0, 0/0, NA)
which.na(weird.values)
# Produces:
# [1] 3 4