Version: | 1.63-1 |
Date: | 2016-03-10 |
Title: | Methods for Detection of Clusters in Hierarchical Clustering Dendrograms |
Author: | Peter Langfelder <Peter.Langfelder@gmail.com> and Bin Zhang <binzhang.ucla@gmail.com>, with contributions from Steve Horvath <SHorvath@mednet.ucla.edu> |
Maintainer: | Peter Langfelder <Peter.Langfelder@gmail.com> |
Depends: | R (≥ 2.3.0), stats |
ZipData: | no |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Description: | Contains methods for detection of clusters in hierarchical clustering dendrograms. |
URL: | http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting/ |
NeedsCompilation: | no |
Packaged: | 2016-03-10 18:25:27 UTC; plangfelder |
Repository: | CRAN |
Date/Publication: | 2016-03-11 00:39:02 |
Methods for Detection of Clusters in Hierarchical Clustering Dendrograms
Description
Contains methods for detection of clusters in hierarchical clustering dendrograms.
Details
Package: | dynamicTreeCut |
Version: | 1.63-1 |
Date: | 2016-03-10 |
Depends: | R, stats |
ZipData: | no |
License: | GPL version 2 or newer |
URL: | http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting/ |
Index:
cutreeDynamic | Adaptive branch pruning of hierarchical clustering dendrograms. |
cutreeDynamicTree | Dynamic dendrogram pruning based on dendrogram only |
cutreeHybrid | Hybrid adaptive tree cut for hierarchical clustering dendrograms. |
indentSpaces | Spaces for indented output. |
merge2Clusters | Merge two clusters |
printFlush | Print arguments and flush the console. |
treecut-package | Methods for detection of clusters in hierarchical clustering dendrograms. |
Author(s)
Peter Langfelder <Peter.Langfelder@gmail.com> and Bin Zhang <binzhang.ucla@gmail.com>, with contributions from Steve Horvath <SHorvath@mednet.ucla.edu>
Maintainer: Peter Langfelder <Peter.Langfelder@gmail.com>
Adaptive Branch Pruning of Hierarchical Clustering Dendrograms
Description
This wrapper provides a common access point for two methods of adaptive branch pruning of hierarchical clustering dendrograms.
Usage
cutreeDynamic(
dendro, cutHeight = NULL, minClusterSize = 20,
# Basic tree cut options
method = "hybrid", distM = NULL,
deepSplit = (ifelse(method=="hybrid", 1, FALSE)),
# Advanced options
maxCoreScatter = NULL, minGap = NULL,
maxAbsCoreScatter = NULL, minAbsGap = NULL,
minSplitHeight = NULL, minAbsSplitHeight = NULL,
# External (user-supplied) measure of branch split
externalBranchSplitFnc = NULL, minExternalSplit = NULL,
externalSplitOptions = list(),
externalSplitFncNeedsDistance = NULL,
assumeSimpleExternalSpecification = TRUE,
# PAM stage options
pamStage = TRUE, pamRespectsDendro = TRUE,
useMedoids = FALSE, maxDistToLabel = NULL,
maxPamDist = cutHeight,
respectSmallClusters = TRUE,
# Various options
verbose = 2, indent = 0)
Arguments
dendro |
A hierarchical clustering dendrogram such as one returned by |
cutHeight |
Maximum joining heights that will be considered. For |
minClusterSize |
Minimum cluster size. |
method |
Chooses the method to use. Recognized values are "hybrid" and "tree". |
distM |
Only used for method "hybrid". The distance matrix used as input to |
deepSplit |
For method "hybrid", can be either logical or integer in the range 0 to 4. For method
"tree", must be logical. In both cases, provides a rough control over sensitivity to cluster splitting.
The higher the value (or if |
maxCoreScatter |
Only used for method "hybrid".
Maximum scatter of the core for a branch to be a cluster, given as the fraction of |
minGap |
Only used for method "hybrid".
Minimum cluster gap given as the fraction of the difference between |
maxAbsCoreScatter |
Only used for method "hybrid".
Maximum scatter of the core for a branch to be a cluster given as absolute heights. If given, overrides
|
minAbsGap |
Only used for method "hybrid".
Minimum cluster gap given as absolute height difference. If given, overrides |
minSplitHeight |
Minimum split height given as the fraction of the difference between
|
minAbsSplitHeight |
Minimum split height given as an absolute height.
Branches merging below this height will automatically be merged. If not given (default), will be determined
from |
externalBranchSplitFnc |
Optional function to evaluate split (dissimilarity) between two branches.
Either a single function or a list in which each component is a function (see
|
minExternalSplit |
Thresholds to decide whether two branches should be merged.
It should be a numeric vector of the same length as the number of functions in
|
externalSplitOptions |
Further arguments to function |
externalSplitFncNeedsDistance |
Optional specification of whether the external branch split
functions need the distance matrix as one of their arguments. Either |
assumeSimpleExternalSpecification |
Logical: when |
pamStage |
Only used for method "hybrid". If TRUE, the second (PAM-like) stage will be performed. |
pamRespectsDendro |
Logical, only used for method "hybrid". If |
useMedoids |
Only used for method "hybrid" and only if |
maxDistToLabel |
Deprecated, use |
maxPamDist |
Only used for method "hybrid" and only if |
respectSmallClusters |
Only used for method "hybrid" and only if |
verbose |
Controls the verbosity of the output. 0 will make the function completely quiet, values up to 4 gradually increase verbosity. |
indent |
Controls indentation of printed messages (see |
Details
This is a wrapper for two related but different methods for cluster detection in hierarchical clustering dendrograms.
In order to make the shape parameters maxCoreScatter and minGap more universal, their values are interpreted relative to cutHeight and the 5th percentile of the merging heights (we arbitrarily chose the 5th percentile rather than the minimum for reasons of stability). Thus, the absolute maximum allowable core scatter is calculated as maxCoreScatter * (cutHeight - refHeight) + refHeight and the absolute minimum allowable gap as minGap * (cutHeight - refHeight), where refHeight is the 5th percentile of the merging heights.
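As a rough illustration of the two formulas above, the following sketch computes the absolute thresholds with base R only. The toy data, the choice of cutHeight, and the values given to maxCoreScatter and minGap are assumptions made for illustration; they are not claimed to be the package defaults.

```r
# Toy data and dendrogram (illustrative assumptions, not taken from the package)
set.seed(1)
x <- matrix(rnorm(200), nrow = 50)
dendro <- hclust(dist(x), method = "average")

# Reference height: 5th percentile of the merging heights
refHeight <- unname(quantile(dendro$height, 0.05))

# Assumed illustrative settings
cutHeight <- max(dendro$height)
maxCoreScatter <- 0.64
minGap <- 0.36

# Absolute thresholds, following the formulas in the text
maxAbsCoreScatter <- maxCoreScatter * (cutHeight - refHeight) + refHeight
minAbsGap <- minGap * (cutHeight - refHeight)

print(c(refHeight = refHeight,
        maxAbsCoreScatter = maxAbsCoreScatter,
        minAbsGap = minAbsGap))
```

Because both relative parameters are scaled by cutHeight - refHeight, the same relative settings can be reused on dendrograms whose merging heights lie on different scales.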
Value
A vector of numerical labels giving assignment of objects to modules. Unassigned objects are labeled 0, the largest module has label 1, next largest 2 etc.
Author(s)
Peter Langfelder, Peter.Langfelder@gmail.com
References
Langfelder P, Zhang B, Horvath S, 2007. http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting
See Also
hclust, cutreeHybrid, cutreeDynamicTree.
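A minimal usage sketch for the wrapper follows. It assumes the dynamicTreeCut package is installed and uses made-up toy data; the values chosen for minClusterSize and deepSplit are illustrative only. The hybrid method is given the distance matrix through distM in addition to the dendrogram.

```r
# Runs only if dynamicTreeCut is installed; otherwise the block is skipped
if (requireNamespace("dynamicTreeCut", quietly = TRUE)) {
  set.seed(1)
  # Toy data: three well-separated groups of 30 objects (illustrative assumption)
  x <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
             matrix(rnorm(60, mean = 6), ncol = 2),
             matrix(rnorm(60, mean = 12), ncol = 2))
  d <- dist(x)
  dendro <- hclust(d, method = "average")

  # Hybrid method: pass the dendrogram and the distance matrix
  clusterLabels <- dynamicTreeCut::cutreeDynamic(
    dendro, distM = as.matrix(d), method = "hybrid",
    minClusterSize = 10, deepSplit = 1, verbose = 0)

  # Cluster sizes; label 0, if present, marks unassigned objects
  print(table(clusterLabels))
}
```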
Dynamic Dendrogram Pruning Based on Dendrogram Only
Description
Detect clusters in a hierarchical dendrogram using a variable cut height approach. Only the information in the dendrogram itself is used (which may give incorrect assignment for outlying objects).
Usage
cutreeDynamicTree(dendro, maxTreeHeight = 1, deepSplit = TRUE, minModuleSize = 50)
Arguments
dendro |
Hierarchical clustering dendrogram such as produced by |
maxTreeHeight |
Maximum joining height of objects to be considered part of clusters. |
deepSplit |
If |
minModuleSize |
Minimum module size. Branches containing fewer than |
Details
A variable height branch pruning technique for dendrograms produced by hierarchical clustering.
Initially, branches are cut off at the height maxTreeHeight; the resulting clusters are then examined for substructure and if subclusters are detected, they are assigned separate labels. Subclusters are detected by structure and are required to have a minimum of minModuleSize objects on them to be assigned a separate label. A rough degree of control over what it means to be a subcluster is implemented by the parameter deepSplit.
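The call below is a small sketch of how the function might be used. It assumes the dynamicTreeCut package is installed. The toy data and all parameter values are illustrative assumptions. In particular, maxTreeHeight is set relative to the tallest merge because the toy merging heights are not scaled to the default of 1.

```r
# Runs only if dynamicTreeCut is installed; otherwise the block is skipped
if (requireNamespace("dynamicTreeCut", quietly = TRUE)) {
  set.seed(2)
  # Toy data: two well-separated groups of 50 objects (illustrative assumption)
  x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
             matrix(rnorm(100, mean = 8), ncol = 2))
  dendro <- hclust(dist(x), method = "average")

  # Cut slightly below the tallest merge, then look for substructure
  treeLabels <- dynamicTreeCut::cutreeDynamicTree(
    dendro, maxTreeHeight = 0.99 * max(dendro$height),
    deepSplit = FALSE, minModuleSize = 10)

  # Module sizes; label 0, if present, marks unassigned objects
  print(table(treeLabels))
}
```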
Value
A vector of numerical labels giving assignment of objects to modules. Unassigned objects are labeled 0, the largest module has label 1, next largest 2 etc.
Author(s)
Bin Zhang, binzhang.ucla@gmail.com, with contributions by Peter Langfelder, Peter.Langfelder@gmail.com.
References
http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting
Hybrid Adaptive Tree Cut for Hierarchical Clustering Dendrograms
Description
Detect clusters in a dendrogram produced by the function hclust.
Usage
cutreeHybrid(
# Input data: basic tree cutting
dendro, distM,
# Branch cut criteria and options
cutHeight = NULL, minClusterSize = 20, deepSplit = 1,
# Advanced options
maxCoreScatter = NULL, minGap = NULL,
maxAbsCoreScatter = NULL, minAbsGap = NULL,
minSplitHeight = NULL, minAbsSplitHeight = NULL,
# External (user-supplied) measure of branch split
externalBranchSplitFnc = NULL, minExternalSplit = NULL,
externalSplitOptions = list(),
externalSplitFncNeedsDistance = NULL,
assumeSimpleExternalSpecification = TRUE,
# PAM stage options
pamStage = TRUE, pamRespectsDendro = TRUE,
useMedoids = FALSE,
maxPamDist = cutHeight,
respectSmallClusters = TRUE,
# Various options
verbose = 2, indent = 0)
Arguments
dendro |
a hierarchical clustering dendrogram such as one returned by |
distM |
Distance matrix that was used as input to |
cutHeight |
Maximum joining heights that will be considered. It defaults to 99% of the range between the 5th percentile and the maximum of the joining heights on the dendrogram. |
minClusterSize |
Minimum cluster size. |
deepSplit |
Either logical or integer in the range 0 to 4. Provides a rough control over
sensitivity to cluster splitting. The higher the value, the more and smaller clusters will be produced.
A finer control can be achieved via |
maxCoreScatter |
Maximum scatter of the core for a branch to be a cluster, given as the fraction
of |
minGap |
Minimum cluster gap given as the fraction of the difference between |
maxAbsCoreScatter |
Maximum scatter of the core for a branch to be a cluster given as absolute
heights. If given, overrides |
minAbsGap |
Minimum cluster gap given as absolute height difference. If given, overrides
|
minSplitHeight |
Minimum split height given as the fraction of the difference between
|
minAbsSplitHeight |
Minimum split height given as an absolute height.
Branches merging below this height will automatically be merged. If not given (default), will be determined
from |
externalBranchSplitFnc |
Optional function to evaluate split (dissimilarity) between two branches.
Either a single function or a list in which each component is a function (see
|
minExternalSplit |
Thresholds to decide whether two branches should be merged.
It should be a numeric vector of the same length as the number of functions in
|
externalSplitOptions |
Further arguments to function |
externalSplitFncNeedsDistance |
Optional specification of whether the external branch split
functions need the distance matrix as one of their arguments. Either |
assumeSimpleExternalSpecification |
Logical: when |
pamStage |
Logical, only used for method "hybrid". If |
pamRespectsDendro |
Logical, only used for method "hybrid".
If |
useMedoids |
If TRUE, the second stage will use object-to-medoid distance; if FALSE, it will use average object-to-cluster distance. The default (FALSE) is recommended. |
maxPamDist |
Maximum object distance to closest cluster that will result in the object
assigned to that cluster. Defaults to |
respectSmallClusters |
If TRUE, branches that failed to be clusters in stage 1 only because of insufficient size will be assigned together in stage 2. If FALSE, all objects will be assigned individually. |
verbose |
Controls the verbosity of the output. 0 will make the function completely quiet, values up to 4 gradually increase verbosity. |
indent |
Controls indentation of printed messages (see |
Details
The function detects clusters in a hierarchical dendrogram based on the shape of branches on the dendrogram. For details on the method, see http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting.
In order to make the shape parameters maxCoreScatter and minGap more universal, their values are interpreted relative to cutHeight and the 5th percentile of the merging heights (we arbitrarily chose the 5th percentile rather than the minimum for reasons of stability). Thus, the absolute maximum allowable core scatter is calculated as maxCoreScatter * (cutHeight - refHeight) + refHeight and the absolute minimum allowable gap as minGap * (cutHeight - refHeight), where refHeight is the 5th percentile of the merging heights.
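The default for cutHeight described under Arguments can be sketched in base R as follows. This is one reading of that description, not a copy of the package source. The toy data and the variable names defaultCutHeight and nMergesConsidered are assumptions introduced for illustration.

```r
# Toy dendrogram (illustrative assumption)
set.seed(4)
x <- matrix(rnorm(120), nrow = 40)
dendro <- hclust(dist(x), method = "average")

# Reference height: 5th percentile of the merging heights
refHeight <- unname(quantile(dendro$height, 0.05))
maxHeight <- max(dendro$height)

# 99% of the range between the 5th percentile and the maximum joining height
defaultCutHeight <- 0.99 * (maxHeight - refHeight) + refHeight

# Number of merges at or below that height
nMergesConsidered <- sum(dendro$height <= defaultCutHeight)

print(c(refHeight = refHeight, maxHeight = maxHeight,
        defaultCutHeight = defaultCutHeight))
```

Under this reading the tallest merge always lies above the default cut height, so the dendrogram is split into at least two branches before any shape criteria are applied.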
Value
A list containing the following elements:
labels |
Numerical labels of clusters, with 0 meaning unassigned, label 1 labeling the largest cluster etc. |
cores |
Numerical labels indicating cores of found clusters. |
smallLabels |
Numerical labels for branches that failed to be recognized clusters only because of insufficient number of objects. |
mergeDiagnostics |
A data.frame with one row per merge in the input dendrogram. The columns give the values of the various merging criteria used by the algorithm. Missing data indicate that at least one of the "branches" merged was actually a singleton (single node) and hence the branch merging was automatic. |
mergeCriteria |
Values of the merging thresholds. Either a copy of the corresponding input thresholds
or values determined by |
branches |
A list detailing the detected branch structure. |
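The following sketch shows one way to inspect the returned list. It assumes the dynamicTreeCut package is installed; the toy data and the value of minClusterSize are illustrative assumptions.

```r
# Runs only if dynamicTreeCut is installed; otherwise the block is skipped
if (requireNamespace("dynamicTreeCut", quietly = TRUE)) {
  set.seed(3)
  # Toy data: two well-separated groups of 40 objects (illustrative assumption)
  x <- rbind(matrix(rnorm(80, mean = 0), ncol = 2),
             matrix(rnorm(80, mean = 7), ncol = 2))
  d <- dist(x)
  dendro <- hclust(d, method = "average")

  hybridResult <- dynamicTreeCut::cutreeHybrid(
    dendro, distM = as.matrix(d), minClusterSize = 10, verbose = 0)

  # Names of the components of the returned list
  print(names(hybridResult))

  # Cluster sizes; label 0, if present, marks unassigned objects
  print(table(hybridResult$labels))

  # Indices of objects left unassigned
  unassigned <- which(hybridResult$labels == 0)
}
```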
Author(s)
Peter Langfelder, Peter.Langfelder@gmail.com
References
Langfelder P, Zhang B, Horvath S, 2007. http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting
Spaces for Indented Output
Description
Returns a character string containing twice as many spaces as the value of indent.
Usage
indentSpaces(indent = 0)
Arguments
indent |
Desired level of indentation. The number of returned spaces will be twice this argument. |
Value
A character string containing spaces, of length twice indent.
Author(s)
Peter Langfelder, Peter.Langfelder@gmail.com
Examples
spaces = indentSpaces(0);
print(paste(spaces, "This output is not indented..."));
spaces = indentSpaces(1);
print(paste(spaces, "...while this one is."))
Merge Two Clusters
Description
Merge 2 clusters into 1.
Usage
merge2Clusters(labels, mainClusterLabel, minorClusterLabel)
Arguments
labels |
a vector or factor giving the cluster labels |
mainClusterLabel |
label of the first merged cluster. The merged cluster will have this label. |
minorClusterLabel |
label of the second merged cluster. |
Value
A vector or factor of the merged labels.
Author(s)
Bin Zhang and Peter Langfelder
Examples
options(stringsAsFactors = FALSE);
# Works with character labels:
labels = c(rep("grey", 5), rep("blue", 2), rep("red", 3))
merge2Clusters(labels, "blue", "red")
# Works with factor labels:
labelsF = factor(labels)
merge2Clusters(labelsF, "blue", "red")
# Works also with numeric labels:
labelsN = as.numeric(factor(labels))
labelsN
merge2Clusters(labelsN, 1, 3)
Print Arguments and Flush the Console
Description
Passes all its arguments unchanged to the standard print function; after the execution of print it flushes the console, if possible.
Usage
printFlush(...)
Arguments
... |
Arguments to be passed to the standard |
Details
Passes all its arguments unchanged to the standard print function; after the execution of print it flushes the console, if possible.
Value
Returns the value of the print function.
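The behaviour described above can be approximated in a few lines of base R. The function printFlushSketch below is a hypothetical stand-in written for illustration, not the package's source code.

```r
# Hypothetical stand-in for printFlush (an assumption, not the package code)
printFlushSketch <- function(...) {
  # Pass all arguments unchanged to print and keep its return value
  x <- print(...)
  # flush.console() has an effect only on console-based versions of R
  utils::flush.console()
  invisible(x)
}

out <- printFlushSketch("Progress: step 1 done")
```

Flushing after each message is mainly useful inside long-running loops, where buffered output might otherwise appear only after the loop finishes.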
Author(s)
Peter Langfelder, Peter.Langfelder@gmail.com