Help for package Path.Analysis

Type:

Package

Title:

Path Coefficient Analysis

Version:

0.1

Date:

2024-09-12

Author:

Ali Arminian

[aut, cre, cph]

Maintainer:

Ali Arminian <abeyran@gmail.com>

Description:

Facilitates several analyses, including simple and sequential path coefficient analysis, correlation estimation, and the drawing of correlograms, heatmaps, and path diagrams. Path coefficient analysis can be conducted when raw data with one or more dependent variables and one or more independent variables are available. It allows for testing direct effects, which can be a vital indicator in path coefficient analysis. The rules for preparing the dataset are explained in detail in the vignette file "Path.Analysis_manual.Rmd". You can find this in the folders labelled "data" and "~/inst/extdata". Also see: 1) the 'lavaan' package, 2) a sample of sequential path analysis in 'metan' suggested by Olivoto and Lúcio (2020) <doi:10.1111/2041-210X.13384>, 3) the simple 'PATHSAS' macro written in 'SAS' by Cramer et al. (1999) <doi:10.1093/jhered/90.1.260>, and 4) the semPlot() function of 'OpenMx' as initial tools for conducting path coefficient analyses and SEM (Structural Equation Modeling). To gain a comprehensive understanding of path coefficient analysis, both in theory and practice, see the 'Minitab' macro developed by Arminian, A. in the paper by Arminian et al. (2008) <doi:10.1080/15427520802043182>.

License:

GPL-3

URL:

https://github.com/abeyran/Path.Analysis

BugReports:

https://github.com/abeyran/Path.Analysis/issues

Depends:

R (≥ 4.1.0)

Imports:

stats, corrr, corrplot, Hmisc, gplots, mathjaxr, pastecs, graphics, grDevices, DiagrammeR, ComplexHeatmap, metan

Suggests:

car, ggplot2, devtools, usethis, testthat, knitr, rmarkdown, roxygen2, spelling

VignetteBuilder:

knitr

RdMacros:

mathjaxr

Encoding:

UTF-8

RoxygenNote:

7.3.2

Language:

en-US

LazyData:

true

LazyLoad:

true

NeedsCompilation:

BuildManual:

TRUE

Packaged:

2024-09-23 19:30:28 UTC; Administrator

Repository:

CRAN

Date/Publication:

2024-09-25 08:20:05 UTC

Path Coefficient Analysis

Description

Path.Analysis computes descriptive statistics for a dataset and, importantly, provides graphical representations of the data such as heatmaps, correlograms, and path diagrams.

Author(s)

Ali Arminian abeyran@gmail.com

Drawing the correlogram

Description

cor_plot() draws a correlogram for the data.

Usage

cor_plot(datap)

Arguments

datap

The data set

Value

Returns an object of class gg, ggmatrix.

Author(s)

Ali Arminian abeyran@gmail.com

References

Olivoto, T, and A Dal’Col Lúcio. 2020. “Metan: An R Package for Multi‐environment Trial Analysis.” Methods in Ecology and Evolution, 11(6): 783–89. https://doi.org/10.1111/2041-210X.13384.

Examples


data(dtsimp)
cor_plot(dtsimp)

Correlation Analysis

Description

corr() estimates Pearson correlation coefficients among numeric variables as follows:
The Pearson correlation coefficient: \[ r_{x,y} = \frac{n\sum{xy}-(\sum{x})(\sum{y})} {\sqrt{(n\sum{x^2}-(\sum{x})^2)(n\sum{y^2}-(\sum{y})^2)}} \]

or: \[ r_{x,y} = \frac{\sum(x-\bar{x})(y-\bar{y})} {\sqrt{\sum(x-\bar{x})^2 \sum(y-\bar{y})^2}} \]

where \(r_{x,y}\) is the correlation coefficient between the variables \(x\) and \(y\).
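The two forms of the formula above are algebraically equivalent. The following base R sketch checks this on a small made-up vector pair (the data and the names r_raw, r_dev, and r_ref are illustrative only and are not part of the package):

```r
# Illustrative data, invented for this sketch
x <- c(2, 4, 5, 7, 9)
y <- c(1, 3, 4, 8, 10)
n <- length(x)

# Raw-sums (computational) form of the Pearson coefficient
r_raw <- (n * sum(x * y) - sum(x) * sum(y)) /
  sqrt((n * sum(x^2) - sum(x)^2) * (n * sum(y^2) - sum(y)^2))

# Mean-centred (deviation) form
r_dev <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Reference value from base R
r_ref <- cor(x, y, method = "pearson")

print(c(raw = r_raw, deviation = r_dev, reference = r_ref))
```

All three values should agree up to floating-point rounding.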

Usage

corr(datap, verbose = FALSE)

Arguments

datap

The data set

verbose

If verbose = TRUE then some results are printed in the console.

Details

The corr() function estimates the correlation coefficients among the variables, i.e. one or more independent (exogenous) variables and a dependent (endogenous) variable, and tests their significance. The results are returned as tables.

Value

Returns a list of two objects:

Correlations: the data frame of Pearson's correlation coefficients
P_values: the data frame of significance of correlation coefficients (r):

p p-value for testing the r
lowCI lower confidence interval of r
uppCI upper confidence interval of r
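As a rough illustration of what the p, lowCI, and uppCI columns represent, the same three quantities can be obtained for a single pair of variables with stats::cor.test(). This is only a sketch on made-up data; it is an assumption that corr() computes them in a comparable way, and the vector res is a hypothetical name used here for display.

```r
# Illustrative data, invented for this sketch
x <- c(2, 4, 5, 7, 9, 11, 12)
y <- c(1, 3, 4, 8, 10, 9, 14)

# Pearson test of H0: r = 0, with a 95% confidence interval for r
ct <- cor.test(x, y, method = "pearson", conf.level = 0.95)

res <- c(
  r     = unname(ct$estimate),  # correlation coefficient
  p     = ct$p.value,           # p-value for testing r
  lowCI = ct$conf.int[1],       # lower confidence limit of r
  uppCI = ct$conf.int[2]        # upper confidence limit of r
)
print(res)
```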

Author(s)

Ali Arminian abeyran@gmail.com

Examples


data(dtsimp)
corr(dtsimp, verbose = FALSE)


data(dtraw)
corr(dtraw[, -1], verbose = FALSE)

Data preparation

Description

Prepares data for analyses

Usage

dataprep(datap)

Arguments

datap

dataset

Value

Returns a data frame

Descriptive statistics

Description

desc() computes descriptive statistics of the data set: Min (minimum), 1st Qu. (first quartile), Median, Mean (average), 3rd Qu. (third quartile), Max (maximum), var (variance), std.dev (standard deviation), and coef.var (CV or coefficient of variation).

Usage

desc(datap, resp)

Arguments

datap

The data set

resp

an integer value indicating the column in datap that corresponds to the response variable.

Details

The desc() function computes descriptive statistics, presented as tables, for a dependent (endogenous) variable and one or more independent (exogenous) variables. It acts only on numerical variables. For example, for the variable x with n observations:

1st quartile (position in the sorted data): \[Q_1 = (n + 1) \times \frac{1}{4}\]
2nd quartile or median (position in the sorted data): \[md = (n + 1) \times \frac{2}{4}\]
3rd quartile (position in the sorted data): \[Q_3 = (n + 1) \times \frac{3}{4}\]
Arithmetic mean: \[\bar{x}=\frac{1}{n} \sum_{i=1}^{n} x_{i}\]
Range: \[R_x = \max(x) - \min(x)\]
Variance: \[\sigma_{x}^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n} \]
Standard deviation: \[sd_x = \sqrt{\frac{\sum_{i=1}^{n} (x_{i} - \bar{x})^2}{n}}\]
SEM or SE.mean, the standard error of the mean, is calculated by dividing the standard deviation by the square root of the sample size: \[SEM_x = \frac{sd(x)}{\sqrt{n}}\]
coef.var or coefficient of variation: \[CV = \frac{sd(x)}{\bar{x} }\times 100\]
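The formulas above can be traced by hand in base R. The sketch below uses a made-up vector (x and all result names are illustrative, not package objects). Two points are worth noting: base R's var() and sd() divide by n - 1 rather than n, and the (n + 1) quartile positions correspond to quantile(type = 6), whereas summary() uses the default type 7. Which convention desc() follows internally is not stated here, so treat this as a sketch under those assumptions.

```r
# Illustrative data, invented for this sketch
x <- c(12, 15, 11, 18, 20, 14, 16)
n <- length(x)

mean_x  <- sum(x) / n                  # arithmetic mean
range_x <- max(x) - min(x)             # range

var_n   <- sum((x - mean_x)^2) / n     # variance with divisor n (formula above)
sd_n    <- sqrt(var_n)                 # standard deviation with divisor n
var_nm1 <- var(x)                      # base R uses divisor n - 1

sem_x <- sd(x) / sqrt(n)               # standard error of the mean
cv_x  <- sd(x) / mean_x * 100          # coefficient of variation, in percent

# Quartiles from the positions (n + 1) * k / 4 in the sorted data
pos <- (n + 1) * c(1, 2, 3) / 4        # positions 2, 4, 6 for n = 7
q   <- sort(x)[pos]                    # values at those positions

print(list(mean = mean_x, range = range_x, var_n = var_n,
           var_nm1 = var_nm1, sem = sem_x, cv = cv_x, quartiles = q))
```

With n = 7 the quartile positions are whole numbers; for other sample sizes the positions are fractional and require interpolation between neighbouring sorted values.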

Value

Returns a list of 3 objects:

desc1: Descriptive statistics1 of input data
desc2: Descriptive statistics2 of input data
corcf: A table of correlation coefficients

Author(s)

Ali Arminian abeyran@gmail.com

References

Bhattacharyya GK and Johnson RA 1997. Statistical Concepts and Methods, John Wiley and Sons, New York.

Draper N and Smith H 1981. Applied Regression Analysis, John Wiley & Sons, New York.

Neter, J, Whitmore, GA, Wasserman, W 1992. Applied Statistics. Allyn & Bacon, Incorporated, ISBN 10: 0205134785 / ISBN 13: 9780205134786.

Snedecor, G.W., Cochran, W.G. 1980. Statistical Methods. Iowa State University Press.

Examples


data(dtsimp)
desc(dtsimp, 1)


data(dtraw)
desc(dtraw[, -1], 1)

data(heart)
desc(heart, 2)

Dataset 2: nine traits measured on 35 Camelina DH lines.

Description

Dataset 2: nine traits measured on 35 Camelina DH lines.

Usage

data(dtraw)

Format

A data.frame with 35 observations of 9 variables.

DH lines: a character vector
y: a numeric vector
X1: a numeric vector
X2: a numeric vector
X3: a numeric vector
X4: a numeric vector
X5: a numeric vector
X6: a numeric vector
X7: a numeric vector
X8: a numeric vector

Examples


library(Path.Analysis)
data(dtraw)

Dataset 3: nine traits measured on 35 Camelina DH lines.

Description

Dataset 3: nine traits measured on 35 Camelina DH lines.

Usage

data(dtraw2)

Format

A data.frame with 35 observations of 9 variables.

DH lines: a character vector used as row names
y: a numeric vector
X1: a numeric vector
X2: a numeric vector
X3: a numeric vector
X4: a numeric vector
X5: a numeric vector
X6: a numeric vector
X7: a numeric vector
X8: a numeric vector

Examples


library(Path.Analysis)
data(dtraw2)

Dataset 4: a data frame of 7 variables measured on 8 observations.

Description

Dataset 4: a data frame of 7 variables measured on 8 observations.

Usage

data(dtseq)

Format

A data.frame with 8 observations of 7 variables.

Genotypes: a character vector
YLD: a numeric vector
DFT: a numeric vector
FS: a numeric vector
FV: a numeric vector
FW: a numeric vector
DFL: a numeric vector
FLP: a numeric vector

Examples


library(Path.Analysis)
data(dtseq)

Dataset 5

Description

Dataset 5

Usage

data(dtseqr)

Format

A data.frame with 24 observations of 7 variables.

Genotypes: a character vector
Rep: a numeric vector
YLD: a numeric vector
DFT: a numeric vector
FS: a numeric vector
FV: a numeric vector
FW: a numeric vector
DFL: a numeric vector
FLP: a numeric vector

Examples


library(Path.Analysis)
data(dtseqr)

Dataset 1: one dependent (y) and 3 independent (x1 to x3) variables.

Description

Dataset 1: one dependent (y) and 3 independent (x1 to x3) variables.

Usage

data(dtsimp)

Format

A data.frame with 105 observations of 4 variables.

y: a numeric vector
x1: a numeric vector
x2: a numeric vector
x3: a numeric vector

Examples


library(Path.Analysis)
data(dtsimp)

Dataset 6: Heart Disease data set

Description

A mixed-variable dataset containing 14 variables measured on 297 patients for heart disease diagnosis.

Usage

data(heart)

Format

A data.frame including 297 rows and 14 variables:

age: Age in years (numerical).
sex: Sex: 1 = male, 0 = female (logical).
heart.disease: a numeric vector as dependent.
biking: a numeric vector as the first independent.
smoking: a numeric vector as the 2nd independent.

Source

The data set belongs to the UCI Machine Learning Repository. The original data set includes 303 patients, 6 of whom have missing values (NA). After removing the missing values, 297 patients remain.

https://archive.ics.uci.edu/ml/datasets/Heart+Disease

References

Lichman, M. (2013). UCI machine learning repository.

Examples


library(Path.Analysis)
data(heart)

Creating the `Heatmap` chart

Description

heat_map() draws a double-clustered heatmap for path coefficient analysis. Note that this function acts only on numeric variables/columns (see the example on the dtraw2 data set). To draw other types of heatmaps, users may use heatmap.3 or the ComplexHeatmap and pheatmap R packages; an example is given in the vignette of this package (Path.Analysis_manual.Rmd).
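Because only numeric columns are accepted, a data set with identifier or label columns needs to be reduced to its numeric part first. A minimal base R sketch of that preparation step, on a made-up data frame (df, num, and z are hypothetical names, and the standardisation via scale() simply mirrors the package example):

```r
# Illustrative data frame with one non-numeric column, invented for this sketch
df <- data.frame(
  id     = c("a", "b", "c", "d"),
  height = c(1.2, 3.4, 2.2, 2.8),
  weight = c(10, 30, 20, 25)
)

# Keep only the numeric columns
num <- df[vapply(df, is.numeric, logical(1))]

# Standardise each column to mean 0 and standard deviation 1
z <- scale(num)

print(z)
```

The resulting matrix z contains only numeric columns and could then be passed to a heatmap function.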

Usage

heat_map(datap)

Arguments

datap

The data set

Value

Returns an object of class heatmap.2.

Author(s)

Ali Arminian abeyran@gmail.com

Examples


data(dtraw2)
dtraw2 <- scale(as.data.frame(dtraw2))
heat_map(dtraw2)

Direct and Indirect Effects Matrices and Diagram

Description

matdiag() extracts the matrices of direct and indirect effects in path analysis, along with the significance of the direct effects. Direct effects are shown as a column vector (n rows, one column) and indirect effects are the off-diagonal effects. It then draws a diagram for path coefficient analysis based on the DiagrammeR package.

Usage

matdiag(datap, resp, verbose = FALSE)

Arguments

datap

The data set

resp

The response variable

verbose

If verbose = TRUE then some results are printed

Details

The matdiag function estimates the direct and indirect effects in path coefficient analysis as tables and draws the path diagram. This is apparently the only program testing the significance of direct effects in a path analysis. Note: all variables must be numeric for the matrix calculations and the subsequent plotting.

In a path model, path coefficients or direct effects (Pi's) indicate the direct effect of one variable on another. They are standardized partial regression coefficients (in Wright's terminology) because they are estimated from correlations or from the transformed (standardized) data as:

\[P_i = \beta_i\frac{\sigma_{X_i}}{\sigma_Y} \]
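The relation between a raw regression coefficient and its path coefficient can be checked numerically. The base R sketch below uses simulated data (the variables, seed, and object names are illustrative and unrelated to the package data sets): it rescales the coefficients of an ordinary regression and compares them with the coefficients of a regression on standardized variables.

```r
# Simulated illustrative data
set.seed(42)
n  <- 50
X1 <- rnorm(n)
X2 <- 0.5 * X1 + rnorm(n)
Y  <- 2 * X1 - X2 + rnorm(n)

# Raw partial regression coefficients (intercept dropped)
fit <- lm(Y ~ X1 + X2)
b   <- coef(fit)[-1]

# Path coefficients: P_i = beta_i * sd(X_i) / sd(Y)
P <- b * c(sd(X1), sd(X2)) / sd(Y)

# Same quantities from a regression on standardized variables
zdat  <- as.data.frame(scale(cbind(Y, X1, X2)))
fit_z <- lm(Y ~ X1 + X2, data = zdat)
P_z   <- coef(fit_z)[-1]

print(rbind(rescaled = P, standardized = P_z))
```

Both rows should match up to rounding, which is why path coefficients are described as standardized partial regression coefficients.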

The path equations are as follows:
One dependent variable: \[P_1 + P_2r_{12} + P_3r_{13} + ... + P_nr_{1n} = r_{Y1}\] \[P_1r_{21} + P_2 + P_3r_{23} + ... + P_nr_{2n} = r_{Y2}\] \[...\] \[P_1r_{n1} + P_2r_{n2} + P_3r_{n3} + ... + P_n = r_{Yn}\]
Extension to more dependent variables: Path.Analysis is capable of performing this extension; the details are as follows. The linear regression model with a single response has the following form (Johnson and Wichern, 2007): \(Y = \beta_0 + \beta_1Z_1 + ... + \beta_rZ_r + \epsilon\)

where, written out for n observations, the multiple linear regression model is as follows: \[Y_1 = \beta_0 + \beta_1Z_{11} + \beta_2Z_{12} + ... + \beta_rZ_{1r} + \epsilon_1\] \[Y_2 = \beta_0 + \beta_1Z_{21} + \beta_2Z_{22} + ... + \beta_rZ_{2r} + \epsilon_2\] \[...\] \[Y_n = \beta_0 + \beta_1Z_{n1} + \beta_2Z_{n2} + ... + \beta_rZ_{nr} + \epsilon_n\]

As stated by Bondari (1990), for two dependent variables \(Y_1\) and \(Y_2\): \[ Y_1 = p_1X_1 + p_2X_2 + p_3X_3 + ... + p_nX_n \] \[ Y_2 = p'_1X_1 + p'_2X_2 + p'_3X_3 + ... + p'_nX_n \] \[ ... \]

where: \[ r_{Y_1Y_2} = p_1p'_1 + p_2p'_2 + p_3p'_3 + ... + p_np'_n + \sum_{i \neq j}p_ip'_jr_{ij} = \sum_{i,j}p_ip'_jr_{ij} \]
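For one dependent variable, the path equations form a linear system in the direct effects: the correlation matrix of the predictors times the vector of path coefficients equals the vector of predictor-response correlations. The base R sketch below solves that system on simulated data. It is not the internal code of matdiag(); all names (dat, Rxx, rxy, P, eff, resid_eff) are illustrative, and the residual effect is computed with the common definition sqrt(1 - R^2), which may or may not match the Residual value returned by the package.

```r
# Simulated illustrative data: response Y and three predictors
set.seed(123)
n  <- 60
X1 <- rnorm(n)
X2 <- 0.6 * X1 + rnorm(n)
X3 <- rnorm(n)
Y  <- 1.5 * X1 + 0.8 * X2 - 0.5 * X3 + rnorm(n)
dat <- data.frame(Y, X1, X2, X3)

R   <- cor(dat)
Rxx <- R[-1, -1]   # correlations among predictors (r_ij)
rxy <- R[-1, 1]    # correlations of each predictor with Y

# Direct effects: solve Rxx %*% P = rxy
P <- solve(Rxx, rxy)

# Effects matrix: entry (i, j) = P_j * r_ij
# diagonal = direct effects, off-diagonal = indirect effects
eff <- sweep(Rxx, 2, P, "*")

# Each row sums to the correlation of that predictor with Y
row_total <- rowSums(eff)

# Residual effect under the sqrt(1 - R^2) definition
resid_eff <- sqrt(1 - sum(P * rxy))

print(round(eff, 3))
print(round(cbind(row_total, rxy), 3))
print(resid_eff)
```

Reading a row of eff from left to right reproduces one path equation: the direct effect of that predictor plus its indirect effects through the other predictors add up to its correlation with the response.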

Value

Returns a list with three objects

direff: a data frame of direct effects
matall: a matrix of direct and indirect effects
Residual: a constant of residuals

Author(s)

Ali Arminian abeyran@gmail.com

References

Arminian, A, MS Kang, M Kozak, S Houshmand, and P Mathews. 2008. “MULTPATH: A Comprehensive Minitab Program for Computing Path Coefficients and Multiple Regression for Multivariate Analyses.” Journal of Crop Improvement, 22(1): 82–120.

Bondari, K. 1990. "Path Analysis in Agricultural Research," Conference on Applied Statistics in Agriculture. https://doi.org/10.4148/2475-7772.1439

Cramer, CS, TC Wehner, and SB Donaghy. 1999. “PATHSAS: A SAS Computer Program for Path Coefficient Analysis of Quantitative Data.” Journal of Heredity, 90(1): 260–62. https://doi.org/10.1093/jhered/90.1.260.

Johnson, R.A., Wichern, D.W. 2007. Applied Multivariate Statistical Analysis. Prentice Hall, USA.

Li, C.C. 1975. Path Analysis: A Primer. Boxwood Pr. 346 p.

Wolfle, LM. 2003. “The Introduction of Path Analysis to the Social Sciences, and Some Emergent Themes: An Annotated Bibliography.” Structural Equation Modeling, 10(1): 1–34.

Wright, S. 1923. “The Theory of Path Coefficients: A Reply to Niles’s Criticism.” Genetics, 8(3): 239.

Wright, S. 1934. “The Method of Path Coefficients.” The Annals of Mathematical Statistics, 5(3): 161–215.

Wright, S. 1960. “Path Coefficients and Path Regressions: Alternative or Complementary Concepts?” Biometrics, 16(2): 189–202.

Examples


data(dtsimp)
matdiag(dtsimp, 1, verbose = FALSE)

data(dtraw)
matdiag(dtraw[, -1], 1, verbose = FALSE)

data(heart)
matdiag(heart, 2, verbose = FALSE)

Network plot

Description

network.plot() draws the network plot of path coefficients analysis

Usage

network.plot(datap)

Arguments

datap

The data set

Details

network.plot() draws a correlogram and a heatmap for the data, if requested by the user.

Value

Returns an object of class network_plot.

Author(s)

Ali Arminian abeyran@gmail.com

References

Kuhn et al. 2022. corrr package. https://doi.org/10.32614/CRAN.package.corrr, https://github.com/tidymodels/corrr

Examples


data(dtraw2)
network.plot(dtraw2)

Multiple Linear Regression

Description

reg() performs a multiple linear regression analysis and extracts the associated parameters.

Usage

reg(datap, resp, verbose = FALSE)

Arguments

datap

The data set

resp

an integer value indicating the column in datap that corresponds to the response variable.

verbose

If verbose = TRUE then some results are printed in the console.

Details

The reg function fits a multiple linear regression of a dependent (endogenous) variable on one or more independent (exogenous) variables and tests the significance of the parameters. Depending on the data, it may produce warnings, e.g., for dtsimp: Warning message: In summary.lm(mlreg): essentially perfect fit: summary may be unreliable. This is due to the intrinsic characteristics of the data.
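The warning quoted above comes from base R's summary.lm() and appears when the residual variance is essentially zero. A small sketch with made-up, noise-free data (x, y, fit, and msg are illustrative names) shows how such a case arises and how the warning text can be captured rather than printed:

```r
# Response is an exact linear function of the predictor: no residual variation
x <- 1:10
y <- 3 + 2 * x

fit <- lm(y ~ x)

# Capture the warning message raised by summary(), if any
msg <- tryCatch(
  {
    summary(fit)
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)

print(msg)
print(coef(fit))
```

In such a case the coefficient estimates themselves are still meaningful, but standard errors, t-values, and p-values from the summary should be read with caution.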

Value

An object of class list

Author(s)

Ali Arminian abeyran@gmail.com

Examples


data(dtsimp)
reg(dtsimp, 1, verbose = FALSE)


data(heart)
reg(heart, 1, verbose = FALSE)
