CART Class Prediction: One Data Set


Use the CARTXValidation module to build and test classifiers using the Classification And Regression Trees (CART) class prediction method and one gene expression data set.

Before you begin

A gene expression data set consists of two files:

learn more:
file formats

Step 1: PreprocessDataset

Preprocess gene expression data to remove platform noise and genes that have little variation. Note: If preprocessing the data removes relevant biological information, skip this step.

Considerations
learn more:
PreprocessDataset

Step 2: CARTXValidation

CARTXValidation runs the CART class prediction method iteratively against the known data set, using leave-one-out cross-validation. In each iteration, it leaves one sample out, builds a classifier from the remaining samples, and then tests that classifier on the sample left out. It creates a prediction results file that assesses the accuracy of the classifiers.

learn more:
CARTXValidation
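
The leave-one-out procedure described in this step can be sketched in a few lines of Python. This is an illustration only, not the CARTXValidation implementation: a one-split decision tree (a "stump" chosen by Gini impurity) stands in for a full CART tree, and the function names and the toy expression values are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def majority(labels):
    """Most common class label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y):
    """Fit a one-split tree: choose the (gene, threshold) with the lowest weighted Gini."""
    best = None  # (score, gene index, threshold, left label, right label)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # midpoint between adjacent observed values
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t, majority(left), majority(right))
    if best is None:  # no gene varies: always predict the majority class
        label = majority(y)
        return lambda row: label
    _, j, t, left_label, right_label = best
    return lambda row: left_label if row[j] <= t else right_label

def leave_one_out(X, y):
    """For each sample: train on all other samples, then predict the held-out sample."""
    predictions = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        classifier = fit_stump(train_X, train_y)
        predictions.append(classifier(X[i]))
    return predictions

# Hypothetical toy data: six samples (rows) by two genes (columns).
expression = [[1.0, 5.0], [1.2, 4.0], [0.9, 6.0],
              [3.0, 5.5], [3.2, 4.5], [2.9, 5.0]]
labels = ["A", "A", "A", "B", "B", "B"]

predictions = leave_one_out(expression, labels)
for actual, predicted in zip(labels, predictions):
    print(actual, predicted)
```

Each classifier is discarded after it predicts its held-out sample, which mirrors the note later in this protocol that CARTXValidation does not save the classifiers it generates.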

Step 3: PredictionResultsViewer

Display the prediction results file (*.pred.odf). The viewer lists each sample, its actual class, its predicted class, and prediction error rates. Prediction error rates are averaged across all iterations.

Considerations
learn more:
PredictionResultsViewer
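
As a rough illustration of what the viewer reports, the sketch below tallies an overall error rate from a list of actual and predicted classes. It does not read the *.pred.odf format, and the viewer may compute its rates differently; the function name, sample names, and class values are hypothetical.

```python
def summarize(results):
    """Summarize (sample name, actual class, predicted class) records."""
    misclassified = [name for name, actual, predicted in results
                     if actual != predicted]
    return {
        "total": len(results),
        "errors": len(misclassified),
        "error_rate": len(misclassified) / len(results),
        "misclassified": misclassified,
    }

# Hypothetical results: one held-out prediction per sample.
results = [
    ("sample_1", "A", "A"),
    ("sample_2", "A", "B"),  # the only misclassified sample
    ("sample_3", "B", "B"),
    ("sample_4", "B", "B"),
]

summary = summarize(results)
print(summary)
```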

Building a classifier (model) file

CARTXValidation does not save the classifiers that it generates. Typically, an analyst builds a classifier using one data set and tests it using a second data set. Building a classifier (model) file from one data set without a second data set available for testing is rare, but it is possible. To do so, run the CART module and specify your one data set as the training data set.

learn more:
CART