Table 1.

Statistics of the molecular signatures selected by 10 different studies that used the same microarray data set to differentiate colon cancer patients from normal people

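Columns n = 9 to n = 0: No. genes selected by other n studies.

| Study (reference) | No. selected genes in signature | Class differentiation method | Signature selection method | Validation method | Prediction accuracy | n = 9 | n = 8 | n = 7 | n = 6 | n = 5 | n = 4 | n = 3 | n = 2 | n = 1 | n = 0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Zhou and Mao, 2005 (29) | 15 | LS-SVM | A hybrid of filter and wrapper methods (LS bound measure) | Bootstrap | <85% | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| Ding and Peng, 2005 (30) | 60 | NB, SVM, LDA, LR | Filter method (MRMR) | LOOCV | 93.55% | 0 | 0 | 0 | 3 | 2 | 1 | 1 | 4 | 4 | 45 |
| Guyon et al., 2002 (24) | 7 | SVM (linear kernel) | Wrapper method (RFE) | LOOCV | 98% | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 2 | 0 | 3 |
| Inza et al., 2004 (28) | 5 | Decision tree 1 | Wrapper method | LOOCV | 87.1% | 0 | 0 | 0 | 3 | 2 | 0 | 0 | 0 | 0 | 0 |
| Inza et al., 2004 (28) | 4 | Decision tree 2 | Wrapper method | LOOCV | 88.81% | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 2 | 0 | 0 |
| Bo and Jonassen, 2002 (31) | 50 | Linear discriminant | Gene pair ranking | LOOCV | 87.8% | 0 | 0 | 0 | 3 | 2 | 1 | 2 | 7 | 8 | 27 |
| | | | | L-31-OCV | 85.9% | | | | | | | | | | |
| Huang and Kecman, 2005 (32) | 10 | SVM1 | Wrapper method (RFE) | LOOCV | Not indicated | 0 | 0 | 0 | 3 | 2 | 0 | 0 | 5 | 0 | 0 |
| Huang and Kecman, 2005 (32) | 10 | SVM2 | Wrapper method (RFE) | LOOCV | 88.84% | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 2 | 2 | 3 |
| Huang and Kecman, 2005 (32) | 10 | SVM3 | Wrapper method (RFE) | LOOCV | 88.1% | 0 | 0 | 0 | 3 | 2 | 0 | 0 | 4 | 0 | 1 |
| Liu et al., 2005 (33) | 6 | Clustering method | Filter method (mutual information) | LOOCV | 91.9% | 0 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 2 | 0 |

Total no. uniquely selected genes, 107
No. unique genes selected by only one study, 83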
  • NOTE: The data set is from ref. 16. The authors of the second study tried four methods (naïve Bayes classifier, SVM, linear discriminant analysis, and logistic regression) and chose the one that showed the highest accuracy.

    Abbreviations: LS-SVM, least squares SVM; MRMR, minimum redundancy-maximum relevance feature selection framework; NB, naïve Bayes classifier; LDA, linear discriminant analysis; LR, logistic regression; RFE, recursive feature elimination; LOOCV, leave-one-out cross-validation; SVM1, SVM2, and SVM3, SVM classifiers with different variables; decision tree 1 and decision tree 2, decision tree classifiers with different variables.
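
    The overlap columns of a table like this one (no. genes selected by other n studies) can be tallied from the gene lists of each signature. The following is a minimal sketch, not the procedure used to build Table 1: the function name `overlap_histogram`, the study labels, and the gene identifiers are hypothetical placeholders, and it assumes each table row is treated as a separate study.

```python
from collections import Counter

def overlap_histogram(signatures):
    """For each study, count its genes by the number of OTHER studies
    that also selected them (assumes each entry is a separate study)."""
    # How many studies selected each gene.
    gene_counts = Counter(g for genes in signatures.values() for g in set(genes))
    result = {}
    for study, genes in signatures.items():
        # gene_counts[g] - 1 excludes the study itself.
        result[study] = dict(Counter(gene_counts[g] - 1 for g in set(genes)))
    return result

# Hypothetical toy signatures; gene names are placeholders, not real data.
toy = {
    "study_A": {"g1", "g2", "g3"},
    "study_B": {"g1", "g2", "g4"},
    "study_C": {"g1", "g5"},
}

hist = overlap_histogram(toy)

# Summary rows: total unique genes and genes selected by only one study.
unique_genes = set().union(*toy.values())
only_one = {g for g in unique_genes if sum(g in s for s in toy.values()) == 1}

print(hist)
print(len(unique_genes), len(only_one))
```

    In this toy example each inner dictionary maps n (the number of other studies) to a gene count, which corresponds to one row of the n = 9 to n = 0 columns; the last two values correspond to the two summary lines below the table.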