KNN Class Prediction: One Data Set

Use the KNNXValidation module to build and test classifiers using the k-nearest-neighbors (KNN) class prediction method and one gene expression data set.

Before you begin

A gene expression data set consists of two files:

learn more:
file formats

Step 1: PreprocessDataset

Preprocess gene expression data to remove platform noise and genes that have little variation. Note: If preprocessing the data removes relevant biological information, skip this step.
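
The kind of filtering this step describes can be sketched as follows. This is a minimal illustration, not the PreprocessDataset implementation: the function name `variation_filter`, the threshold values, and the sample data are all assumptions chosen for the example.

```python
def variation_filter(rows, floor=20.0, ceiling=20000.0, min_fold=3.0, min_delta=100.0):
    """Drop genes that vary little across samples.

    rows maps a gene name to its expression values, one per sample.
    All thresholds are illustrative assumptions; floor must be positive.
    """
    kept = {}
    for gene, values in rows.items():
        # Clamp each value into [floor, ceiling] to damp noise at the extremes.
        clamped = [min(max(v, floor), ceiling) for v in values]
        lo, hi = min(clamped), max(clamped)
        # Keep the gene only if both its fold change and its absolute
        # range across samples are large enough.
        if hi / lo >= min_fold and hi - lo >= min_delta:
            kept[gene] = clamped
    return kept


# Hypothetical expression values for three genes across four samples.
expression = {
    "GENE_A": [15.0, 25.0, 30.0, 18.0],          # low signal, little variation
    "GENE_B": [100.0, 950.0, 120.0, 800.0],      # varies strongly
    "GENE_C": [5000.0, 5200.0, 4900.0, 5100.0],  # high signal, nearly flat
}
filtered = variation_filter(expression)
print(sorted(filtered))
```

With these assumed thresholds only GENE_B survives, which also shows the risk noted above: a flat but biologically relevant gene would be removed.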

Considerations
learn more:
PreprocessDataset

Step 2: KNNXValidation

KNNXValidation runs the KNN class prediction method iteratively against the known data set. For each iteration, it leaves one sample out, builds the classifier using the remaining samples, and then tests the classifier on the sample left out. It creates two files: a prediction results file (*.pred.odf) and a features results file (*.feat.odf).

learn more:
KNNXValidation
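
The leave-one-out procedure described in Step 2 can be sketched as follows. This is a simplified illustration, not the module's code: it assumes Euclidean distance and an unweighted majority vote, it omits any feature selection performed in each iteration, and the function names and sample data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, train_labels, query, k=3):
    """Predict the class of query from its k nearest training samples."""
    # Rank training samples by Euclidean distance to the query (an assumption).
    ranked = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    # Unweighted majority vote among the k nearest neighbors.
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out(samples, labels, k=3):
    """Return one prediction per sample, each made without that sample."""
    predictions = []
    for i in range(len(samples)):
        # Build the classifier from every sample except sample i.
        train = samples[:i] + samples[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        # Test the classifier on the sample that was left out.
        predictions.append(knn_predict(train, train_labels, samples[i], k))
    return predictions


# Hypothetical data: six samples, two genes each, two well-separated classes.
samples = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8], [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
predicted = leave_one_out(samples, labels, k=3)
print(list(zip(labels, predicted)))
```

Because every sample is held out exactly once, the number of iterations equals the number of samples in the data set.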

Step 3: View results

To view the prediction results file (*.pred.odf), use the PredictionResultsViewer module. The viewer lists each sample with its actual and predicted class. Error rates for class predictions are averaged across all iterations.

To view the features results file (*.feat.odf), use the FeatureSummaryViewer module. The viewer lists each gene used in a class predictor and the number of times it was used in a predictor.
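
The two summaries the viewers present can be reproduced by hand, which may help when reading them. The sketch below assumes the actual classes, predicted classes, and per-iteration gene lists have already been extracted into Python lists; it does not parse the .odf files, and the data shown is hypothetical.

```python
from collections import Counter


def error_rate(actual, predicted):
    """Fraction of left-out samples whose predicted class was wrong."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


def feature_usage(features_per_iteration):
    """Count how many iterations used each gene in their predictor."""
    return Counter(gene for genes in features_per_iteration for gene in genes)


# Hypothetical results from four leave-one-out iterations.
actual = ["ALL", "ALL", "AML", "AML"]
predicted = ["ALL", "AML", "AML", "AML"]
features = [["G1", "G2"], ["G1", "G3"], ["G1", "G2"], ["G2", "G3"]]

rate = error_rate(actual, predicted)
usage = feature_usage(features)
print(rate)
print(usage.most_common())
```

A gene that appears in nearly every iteration is a more stable marker than one selected only occasionally.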

Considerations
learn more:
PredictionResultsViewer
FeatureSummaryViewer

Building a classifier (model) file

KNNXValidation does not save the classifiers that it generates. Typically, an analyst builds a classifier using one data set and tests the classifier using a second data set. It is rare to build a classifier (model) file using one data set without having a second data set available for testing; however, it is possible. To build a classifier (model) file using one data set, run the KNN module and specify that data set as the training data set.
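
Conceptually, a saved KNN classifier is the training data plus the value of k, kept so that new samples can be classified later. The sketch below illustrates that idea only: the JSON layout, file name, and function names are assumptions and do not reflect the model file format that the KNN module writes.

```python
import json
import math
import os
import tempfile
from collections import Counter


def build_model(samples, labels, k=3):
    """Bundle the full training set and k into a model (hypothetical layout)."""
    return {"k": k, "samples": samples, "labels": labels}


def save_model(model, path):
    with open(path, "w") as handle:
        json.dump(model, handle)


def load_model(path):
    with open(path) as handle:
        return json.load(handle)


def predict(model, query):
    """Classify query by majority vote among its k nearest training samples."""
    samples, labels = model["samples"], model["labels"]
    ranked = sorted(range(len(samples)), key=lambda i: math.dist(samples[i], query))
    votes = Counter(labels[i] for i in ranked[:model["k"]])
    return votes.most_common(1)[0][0]


# Hypothetical training data: the one data set, used in full for training.
samples = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8], [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]

model = build_model(samples, labels, k=3)
model_path = os.path.join(tempfile.gettempdir(), "knn_model_example.json")
save_model(model, model_path)

loaded = load_model(model_path)
print(predict(loaded, [5.0, 5.0]))
```

Because every sample goes into training here, no sample remains for testing, which is why a second data set is normally used.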

learn more:
KNN