%% $Id: panMatrix.Rd 185 2014-09-03 08:09:01Z larssn $

\name{panMatrix}
\alias{panMatrix}
\title{
  Computing the pan-matrix for a set of gene clusters
}
\description{
  A pan-matrix has one row for each genome and one column for each gene cluster, and cell \samp{[i,j]} indicates how many members genome \samp{i} has in gene cluster \samp{j}.
}
\usage{
panMatrix(clustering)
}
\arguments{
  \item{clustering}{A vector of integers indicating the gene cluster for every sequence. Sequences with the same number belong to the same cluster. The name of each element is the tag identifying the sequence.}
}
\details{
  The pan-matrix is a central data structure for pan-genomic analysis. It is a matrix with one row for each genome in the study, and one column for each gene cluster. Cell \samp{[i,j]} contains an integer indicating how many members genome \samp{i} has in cluster \samp{j}.
  
  The input \code{clustering} must be an integer vector with one element for each sequence in the study, typically produced by either \code{\link{bClust}} or \code{\link{dClust}}. The name of each element is a text tag identifying that sequence. The value of each element indicates the cluster, i.e. sequences with identical values are in the same cluster. IMPORTANT: The name of each sequence must contain the GID-tag of its genome, i.e. the names must be of the form \samp{GID111_seq1}, \samp{GID111_seq2},... where the \samp{GIDxxx} part indicates which genome the sequence belongs to. See \code{\link{panPrep}} for details.
  
  The rows of the pan-matrix are named by the GID-tag of each genome. The columns are named \samp{Cluster_x}, where \samp{x} is the integer cluster number copied from \samp{clustering}.
}
\value{
  The returned object belongs to the class \code{Panmat}, which is a small (S3) extension to a matrix. It can be treated as a matrix, but the S3 methods \code{\link{plot.Panmat}}, \code{\link{summary.Panmat}} and \code{\link{str.Panmat}} are also defined for a \code{Panmat} object. The input vector \samp{clustering} is attached as the attribute \samp{clustering} to the \code{Panmat} object.
}
\author{
  Lars Snipen and Kristian Hovde Liland.
}

\seealso{
  \code{\link{bClust}}, \code{\link{dClust}}, \code{\link{distManhattan}}, \code{\link{distJaccard}}, \code{\link{fluidity}}, \code{\link{chao}}, \code{\link{binomixEstimate}}, \code{\link{heaps}}, \code{\link{rarefaction}}.
}
\examples{
# Loading clustering data in the micropan package
data(list=c("Mpneumoniae.blast.clustering","Mpneumoniae.domain.clustering"),package="micropan")

# Pan-matrix based on BLAST clustering
panmat.blast <- panMatrix(Mpneumoniae.blast.clustering)

# Pan-matrix based on domain sequence clustering
panmat.domains <- panMatrix(Mpneumoniae.domain.clustering)

# Plotting the first pan-matrix, then printing its summary and structure
plot(panmat.blast)
summary(panmat.blast)
str(panmat.blast)
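
# A minimal base-R sketch of the layout described in Details, using a small
# made-up clustering vector. The tags and cluster numbers below are
# hypothetical, and this tabulation only illustrates the row/column structure;
# it is not necessarily how panMatrix() is implemented internally.
toy.clustering <- c(GID1_seq1=1L, GID1_seq2=1L, GID1_seq3=2L,
                    GID2_seq1=1L, GID2_seq2=3L)

# Extracting the GID-tag (genome identifier) from each sequence name
toy.gid <- regmatches(names(toy.clustering),
                      regexpr("GID[0-9]+", names(toy.clustering)))

# Counting members per genome (rows) and cluster (columns)
toy.counts <- table(toy.gid, paste("Cluster", toy.clustering, sep="_"))
print(toy.counts)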
}
