NMF Clustering


Non-negative matrix factorization (NMF) finds a small number of metagenes, each defined as a positive linear combination of the genes in the expression data. It then groups samples into clusters based on the gene expression pattern of these metagenes.
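The factorization can be illustrated with a small sketch. It approximates a genes-by-samples matrix V as the product of a genes-by-metagenes matrix W and a metagenes-by-samples matrix H, then assigns each sample to the metagene with the largest coefficient. The sketch uses the Lee-Seung multiplicative updates for squared error, chosen for brevity; this is an assumption and not necessarily the update rule used by the GenePattern module. The function names, the toy matrix, and the iteration count are all hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Sum of squared differences between V and its approximation W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, n_iter=200, seed=0, eps=1e-9):
    """Factor non-negative V (genes x samples) into W (genes x k) and H (k x samples)."""
    rng = random.Random(seed)
    n_genes, n_samples = len(v), len(v[0])
    # Random positive starting point; the updates below keep every entry non-negative.
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n_genes)]
    h = [[rng.random() + eps for _ in range(n_samples)] for _ in range(k)]
    for _ in range(n_iter):
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n_samples)]
             for a in range(k)]
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n_genes)]
    return w, h


def assign_clusters(h):
    """Assign each sample (column of H) to the metagene with the largest coefficient."""
    return [max(range(len(h)), key=lambda a: h[a][j]) for j in range(len(h[0]))]


# Toy data: 4 genes x 6 samples with two visible sample groups (made-up values).
V = [
    [9.0, 8.0, 9.5, 1.0, 0.5, 1.2],
    [8.5, 9.0, 8.0, 0.8, 1.1, 0.9],
    [1.0, 0.7, 1.2, 7.5, 8.0, 9.0],
    [0.9, 1.3, 0.8, 8.2, 9.1, 7.8],
]
W, H = nmf(V, k=2)
labels = assign_clusters(H)
print(labels)
```

Each column of W is one metagene (a non-negative weighting of the genes), and each column of H describes one sample as a non-negative combination of the metagenes.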

Before you begin

Gene expression data must be in a GCT or RES file.
Example file: all_aml_test.gct.

The gene expression data must contain only positive values. If your data contains negative values, see the NMFConsensus documentation for instructions.
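Before running the protocol, it can help to check a GCT file for negative values. The sketch below assumes the GCT version 1.2 layout (a `#1.2` version line, a line with the row and column counts, a header line beginning with `Name` and `Description`, then one tab-delimited row per gene). The function names `parse_gct` and `negative_entries` and the inline example data are hypothetical; for real use, read the text from your own file.

```python
def parse_gct(text):
    """Parse GCT v1.2 text into (sample_names, {gene_name: [values]})."""
    lines = text.strip().splitlines()
    if not lines[0].startswith("#1.2"):
        raise ValueError("expected a '#1.2' version line")
    n_rows, n_cols = (int(x) for x in lines[1].split("\t")[:2])
    samples = lines[2].split("\t")[2:]  # skip the Name and Description columns
    if len(samples) != n_cols:
        raise ValueError("sample count does not match the header")
    data = {}
    for line in lines[3:3 + n_rows]:
        fields = line.split("\t")
        values = [float(x) for x in fields[2:]]
        if len(values) != n_cols:
            raise ValueError("row %r has the wrong number of values" % fields[0])
        data[fields[0]] = values
    return samples, data


def negative_entries(data):
    """List (gene, sample_index, value) for every negative expression value."""
    return [(gene, j, x)
            for gene, values in data.items()
            for j, x in enumerate(values) if x < 0]


# Made-up two-gene example; a real file would be read with open(path).read().
example = "\n".join([
    "#1.2",
    "2\t3",
    "Name\tDescription\ts1\ts2\ts3",
    "geneA\tfirst gene\t10.5\t20.0\t15.0",
    "geneB\tsecond gene\t7.0\t-3.0\t4.2",
])
samples, data = parse_gct(example)
bad = negative_entries(data)
print(bad)
```

An empty list means the data contain no negative values and can be passed on unchanged.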

learn more:
file formats

Step 1: PreprocessDataset

Preprocess gene expression data to remove platform noise and genes that have little variation. Researchers generally preprocess data before clustering; however, if doing so would remove relevant biological information, skip this step.
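As a rough illustration of this kind of preprocessing, the sketch below clips values to a floor and ceiling (to limit platform noise) and then drops genes whose values vary little across samples. It is a minimal sketch, not the PreprocessDataset implementation. The function names, the parameter values, and the example data are illustrative assumptions, not the module's documented settings.

```python
def threshold(values, floor, ceiling):
    """Clip every value into the range [floor, ceiling]."""
    return [min(max(v, floor), ceiling) for v in values]


def passes_variation_filter(values, min_fold_change, min_delta):
    """Keep a gene only if max/min and max-min are both large enough."""
    low, high = min(values), max(values)
    if low <= 0:
        return False  # fold change is undefined; use a positive floor to avoid this
    return high / low >= min_fold_change and high - low >= min_delta


def preprocess(expression, floor=20, ceiling=20000, min_fold_change=3, min_delta=100):
    """Return the thresholded rows of the genes that pass the variation filter."""
    kept = {}
    for gene, values in expression.items():
        clipped = threshold(values, floor, ceiling)
        if passes_variation_filter(clipped, min_fold_change, min_delta):
            kept[gene] = clipped
    return kept


# Made-up expression values for three genes across four samples.
expression = {
    "geneA": [25, 30, 28, 27],     # nearly flat: filtered out
    "geneB": [50, 400, 900, 120],  # varies widely: kept
    "geneC": [5, 10, 30000, 40],   # clipped to [20, 20000], then kept
}
filtered = preprocess(expression)
print(sorted(filtered))
```

A useful side effect for this protocol is that a positive floor leaves no zero or negative values in the output.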

learn more:
PreprocessDataset

Step 2: NMFConsensus

NMFConsensus uses dimensionality reduction via non-negative matrix factorization (NMF) to find a small number of metagenes, each defined as a positive linear combination of the genes in the expression data. It then groups samples into clusters by expressing each sample's gene expression pattern as a positive linear combination of these metagenes. NMFConsensus repeatedly runs the clustering algorithm against perturbations of the gene expression data and creates a consensus matrix to assess the stability of the resulting clusters.
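The idea behind a consensus matrix can be sketched as follows: for each clustering run, record whether each pair of samples landed in the same cluster, then average those records over all runs. Entries near 1 or 0 indicate pairs that are consistently grouped together or apart; intermediate values indicate unstable assignments. The function names and the cluster labels below are made up for illustration and do not come from the module.

```python
def connectivity(labels):
    """Return a matrix with 1 where two samples share a cluster and 0 elsewhere."""
    n = len(labels)
    return [[1 if labels[i] == labels[j] else 0 for j in range(n)] for i in range(n)]


def consensus_matrix(runs):
    """Average the connectivity matrices of several clustering runs."""
    n = len(runs[0])
    total = [[0] * n for _ in range(n)]
    for labels in runs:
        c = connectivity(labels)
        for i in range(n):
            for j in range(n):
                total[i][j] += c[i][j]
    return [[total[i][j] / len(runs) for j in range(n)] for i in range(n)]


# Hypothetical cluster labels for four samples from four runs.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first run, with the labels swapped
    [0, 0, 0, 1],  # the third sample switches cluster in this run
    [0, 0, 1, 1],
]
consensus = consensus_matrix(runs)
for row in consensus:
    print(row)
```

Because only co-membership is counted, the arbitrary numbering of clusters in each run does not affect the result.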

Running this example on the GenePattern public server takes 3-4 hours. The results are provided here for your convenience: NMFConsensus_Results.zip.

learn more:
NMFConsensus

Step 3: View results

Plots of the results are written to .pdf files. Cluster membership results are written to GCT files. View the result files by clicking on them.

learn more:
NMFConsensus

References

Brunet, J-P., Tamayo, P., Golub, T.R., and Mesirov, J.P. 2004. Metagenes and molecular pattern discovery using matrix factorization. Proc. Natl. Acad. Sci. USA 101(12):4164–4169.