A Hierarchical Feature and Sample Selection Framework and Its Application for Alzheimer's Disease Diagnosis

Le An, Ehsan Adeli, Mingxia Liu, Jun Zhang, Seong Whan Lee, Dinggang Shen

Research output: Research - peer-review › Article

Abstract

Classification is one of the most important tasks in machine learning. Because of feature redundancy or outliers among samples, using all available data to train a classifier may be suboptimal. For example, Alzheimer's disease (AD) is correlated with certain brain regions or single nucleotide polymorphisms (SNPs), so identifying the relevant features is critical for computer-aided diagnosis. Many existing methods first select features from structural magnetic resonance imaging (MRI) or SNP data and then use those features to build the classifier. However, when many redundant features are present, the most discriminative ones are difficult to identify in a single step. We therefore formulate a hierarchical feature and sample selection framework that gradually selects informative features and discards ambiguous samples over multiple steps to improve classifier learning. To guide the preservation of the data manifold, we use both labeled and unlabeled data during training, which makes our method semi-supervised. For validation, we conduct experiments on AD diagnosis, selecting mutually informative features from both MRI and SNP data and using the most discriminative samples for training. The classification results, which are superior to those of competing methods, demonstrate the effectiveness of our approach.
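
The multi-step idea in the abstract (select features, discard ambiguous samples, repeat) can be illustrated with a minimal sketch. This is not the authors' formulation: the Fisher-like feature score, the centroid-margin rule for ambiguous samples, the function names, the keep fractions, and the toy data are all assumptions chosen for illustration, and the sketch is fully supervised, so it does not use unlabeled data or any manifold regularization.

```python
from statistics import mean, pstdev


def feature_scores(X, y, features):
    """Fisher-like score per feature: class-mean gap over summed class spread."""
    scores = {}
    for j in features:
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        spread = pstdev(pos) + pstdev(neg) + 1e-12  # avoid division by zero
        scores[j] = abs(mean(pos) - mean(neg)) / spread
    return scores


def sample_margins(X, y, features):
    """Margin = distance to the other class centroid minus distance to own centroid."""
    centroids = {}
    for c in (0, 1):
        rows = [row for row, label in zip(X, y) if label == c]
        centroids[c] = [mean(r[j] for r in rows) for j in features]

    def dist(row, centroid):
        return sum((row[j] - m) ** 2 for j, m in zip(features, centroid)) ** 0.5

    return [dist(row, centroids[1 - label]) - dist(row, centroids[label])
            for row, label in zip(X, y)]


def hierarchical_select(X, y, levels=2, keep_features=0.5, keep_samples=0.8):
    """Alternate feature selection and sample pruning for a fixed number of levels.

    Assumes binary labels (0/1) with both classes present in the input.
    """
    features = list(range(len(X[0])))
    samples = list(range(len(X)))
    for _ in range(levels):
        Xs = [X[i] for i in samples]
        ys = [y[i] for i in samples]
        # Step 1: keep the top-scoring fraction of the remaining features.
        scores = feature_scores(Xs, ys, features)
        n_feat = max(1, int(len(features) * keep_features))
        features = sorted(sorted(features, key=lambda j: -scores[j])[:n_feat])
        # Step 2: within each class, drop the lowest-margin (most ambiguous) samples.
        margins = sample_margins(Xs, ys, features)
        keep = []
        for c in (0, 1):
            idx = [k for k in range(len(samples)) if ys[k] == c]
            idx.sort(key=lambda k: -margins[k])
            keep += idx[:max(1, int(len(idx) * keep_samples))]
        samples = sorted(samples[k] for k in keep)
    return features, samples


# Hypothetical toy data: features 0 and 1 separate the classes, 2 and 3 are noise.
X = [
    [0.0, 0.1, 0.5, 0.3], [0.1, 0.0, 0.2, 0.9], [0.2, 0.2, 0.8, 0.1],
    [0.1, 0.1, 0.4, 0.6], [0.9, 0.8, 0.5, 0.5],  # index 4: class 0, resembles class 1
    [1.0, 0.9, 0.5, 0.4], [0.9, 1.0, 0.3, 0.8], [1.1, 0.8, 0.7, 0.2],
    [1.0, 1.0, 0.4, 0.7], [0.1, 0.2, 0.5, 0.5],  # index 9: class 1, resembles class 0
]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

features, samples = hierarchical_select(X, y)
print("kept features:", features)
print("kept samples:", samples)
```

In this toy data, the two samples that sit close to the opposite class (indices 4 and 9) are the ones discarded at the first level, and the noise features are removed before the second level re-ranks what remains. Pruning per class rather than globally is a design choice of the sketch so that neither class can be emptied.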

Language: English (US)
Article number: 45269
Journal: Scientific Reports
Volume: 7
DOI: 10.1038/srep45269
State: Published - Mar 30 2017

Fingerprint

Nucleotides
Polymorphism
Classifiers
Magnetic resonance
Imaging techniques
Computer aided diagnosis
Redundancy
Learning systems
Brain
Experiments

ASJC Scopus subject areas

  • General

Cite this

A Hierarchical Feature and Sample Selection Framework and Its Application for Alzheimer's Disease Diagnosis. / An, Le; Adeli, Ehsan; Liu, Mingxia; Zhang, Jun; Lee, Seong Whan; Shen, Dinggang.

In: Scientific Reports, Vol. 7, 45269, 30.03.2017.

@article{9290d71df52e456cbcfc437498fd9193,
title = "A Hierarchical Feature and Sample Selection Framework and Its Application for Alzheimer's Disease Diagnosis",
abstract = "Classification is one of the most important tasks in machine learning. Due to feature redundancy or outliers in samples, using all available data for training a classifier may be suboptimal. For example, the Alzheimer's disease (AD) is correlated with certain brain regions or single nucleotide polymorphisms (SNPs), and identification of relevant features is critical for computer-aided diagnosis. Many existing methods first select features from structural magnetic resonance imaging (MRI) or SNPs and then use those features to build the classifier. However, with the presence of many redundant features, the most discriminative features are difficult to be identified in a single step. Thus, we formulate a hierarchical feature and sample selection framework to gradually select informative features and discard ambiguous samples in multiple steps for improved classifier learning. To positively guide the data manifold preservation process, we utilize both labeled and unlabeled data during training, making our method semi-supervised. For validation, we conduct experiments on AD diagnosis by selecting mutually informative features from both MRI and SNP, and using the most discriminative samples for training. The superior classification results demonstrate the effectiveness of our approach, as compared with the rivals.",
author = "Le An and Ehsan Adeli and Mingxia Liu and Jun Zhang and Lee, {Seong Whan} and Dinggang Shen",
year = "2017",
month = "3",
doi = "10.1038/srep45269",
volume = "7",
journal = "Scientific Reports",
issn = "2045-2322",
publisher = "Nature Publishing Group",

}

TY - JOUR

T1 - A Hierarchical Feature and Sample Selection Framework and Its Application for Alzheimer's Disease Diagnosis

AU - An,Le

AU - Adeli,Ehsan

AU - Liu,Mingxia

AU - Zhang,Jun

AU - Lee,Seong Whan

AU - Shen,Dinggang

PY - 2017/3/30

Y1 - 2017/3/30

N2 - Classification is one of the most important tasks in machine learning. Due to feature redundancy or outliers in samples, using all available data for training a classifier may be suboptimal. For example, the Alzheimer's disease (AD) is correlated with certain brain regions or single nucleotide polymorphisms (SNPs), and identification of relevant features is critical for computer-aided diagnosis. Many existing methods first select features from structural magnetic resonance imaging (MRI) or SNPs and then use those features to build the classifier. However, with the presence of many redundant features, the most discriminative features are difficult to be identified in a single step. Thus, we formulate a hierarchical feature and sample selection framework to gradually select informative features and discard ambiguous samples in multiple steps for improved classifier learning. To positively guide the data manifold preservation process, we utilize both labeled and unlabeled data during training, making our method semi-supervised. For validation, we conduct experiments on AD diagnosis by selecting mutually informative features from both MRI and SNP, and using the most discriminative samples for training. The superior classification results demonstrate the effectiveness of our approach, as compared with the rivals.

UR - http://www.scopus.com/inward/record.url?scp=85016748973&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85016748973&partnerID=8YFLogxK

U2 - 10.1038/srep45269

DO - 10.1038/srep45269

M3 - Article

VL - 7

JO - Scientific Reports

T2 - Scientific Reports

JF - Scientific Reports

SN - 2045-2322

M1 - 45269

ER -