GenePattern defines two file formats for gene expression data: GCT and RES. Both are tab-delimited text files that contain a column for each sample, a row for each gene, and an expression value for each gene in each sample. The RES file format also contains the absent (A) versus present (P) calls as generated for each gene by Affymetrix GeneChip software. See File Formats for complete descriptions of all GenePattern file formats.
The GCT file format supports missing expression values; to mark a value as missing, leave its cell blank. The RES file format, which is specific to Affymetrix chips, does not allow missing expression values. A few modules (such as CART, GSEA and HierarchicalClustering) can be run against an expression dataset that is missing values, but most modules do not allow missing expression values.
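Because most modules reject missing values, it can help to check a GCT file for blank cells before running an analysis. The following Python sketch is illustrative only and is not part of GenePattern. It assumes the usual GCT layout (a version line such as `#1.2`, a line with the row and column counts, a header line beginning with `Name` and `Description`, then one row per gene); see File Formats for the authoritative description. The function name `find_missing_values` is hypothetical.

```python
def find_missing_values(gct_text):
    """Return (row_name, sample_name) pairs whose expression cell is blank."""
    lines = gct_text.splitlines()
    # Assumed layout: line 1 = version, line 2 = counts, line 3 = header.
    header = lines[2].split("\t")
    samples = header[2:]  # skip the Name and Description columns
    missing = []
    for line in lines[3:]:
        if not line.strip():
            continue  # ignore empty lines at the end of the file
        cells = line.split("\t")
        name, values = cells[0], cells[2:]
        # Pad short rows so that dropped trailing blank cells are still detected.
        values += [""] * (len(samples) - len(values))
        for sample, value in zip(samples, values):
            if value.strip() == "":
                missing.append((name, sample))
    return missing


# Small made-up dataset: geneA has no value for S2, geneB has none for S3.
example = (
    "#1.2\n"
    "2\t3\n"
    "Name\tDescription\tS1\tS2\tS3\n"
    "geneA\tna\t10.5\t\t7.2\n"
    "geneB\tna\t3.1\t4.4\t\n"
)
print(find_missing_values(example))
```

An empty result suggests the dataset has no blank cells; a non-empty result lists the gene and sample of each blank cell, so the values can be filled in or the rows removed before using a module that does not allow missing values.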
Gene expression data from any source can be converted into a GCT file. Gene expression data from Affymetrix can be converted into either a GCT or RES file. Most GenePattern modules that require gene expression data accept either GCT or RES file formats. A small number of GenePattern modules use the A/P calls included in the RES file.
To convert this data | Use this method |
Affymetrix data | Use the ExpressionFileCreator module to convert a set of Affymetrix CEL files to a GCT or RES file. |
cDNA data | Convert the data to a tab-delimited text file that contains a column for each sample, a row for each gene, and an expression value for each gene in each sample. Expression values can be intensity values or ratios. Open the file in a spreadsheet program such as Excel, or in a text editor, and modify the format to match the GCT or RES file format. |
PCL or CDT files from Stanford Microarray Database (SMD) | Open the file in a spreadsheet program such as Excel, or in a text editor, and modify the format to match the GCT or RES file format. |
Gene Expression Omnibus (GEO) data | Use the GEOImporter module to create a GCT file based on expression data extracted from GEO. |
caArray expression data | Use the caArrayImportViewer module to create a GCT file based on expression data extracted from the caArray microarray expression data repository. |
MAGE-ML data | Use the MAGEImportViewer module to convert MAGE-ML format data to a GCT file. The MAGE-ML format, which is popular for both Affymetrix and cDNA microarray data, is the standard format for storing data at the ArrayExpress repository. |
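For sources that call for manual reformatting, such as a tab-delimited cDNA table, the edit can also be scripted. The following Python sketch is one possible approach, not a GenePattern tool. It assumes the input has a header row whose first cell labels the gene column and whose remaining cells are sample names, and it assumes the usual GCT layout (a `#1.2` version line, a counts line, then a header starting with `Name` and `Description`). The function name `table_to_gct` and the placeholder description `na` are hypothetical choices.

```python
def table_to_gct(table_text, description="na"):
    """Convert a tab-delimited gene-by-sample table into GCT-formatted text."""
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    header = lines[0].split("\t")
    samples = header[1:]  # the first header cell labels the gene column
    out_rows = []
    for ln in lines[1:]:
        cells = ln.split("\t")
        gene, values = cells[0], cells[1:]
        # Pad short rows with blank cells; GCT uses a blank cell for a missing value.
        values += [""] * (len(samples) - len(values))
        out_rows.append("\t".join([gene, description] + values[:len(samples)]))
    out = [
        "#1.2",                                   # assumed GCT version line
        f"{len(out_rows)}\t{len(samples)}",       # number of genes, number of samples
        "\t".join(["Name", "Description"] + samples),
    ]
    return "\n".join(out + out_rows) + "\n"


# Made-up input: two genes, two samples, one missing value for geneB in S1.
table = "ID\tS1\tS2\ngeneA\t1.5\t2.0\ngeneB\t\t0.7\n"
print(table_to_gct(table))
```

The sketch writes the same placeholder into every `Description` cell; if the source table has a gene annotation column, it could be mapped there instead.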