Knowledge Management System of Northwest Institute of Plateau Biology, CAS
Prediction of Enzyme Subfamily Class via Pseudo Amino Acid Composition by Incorporating the Conjoint Triad Feature | |
Wang, Yong-Cui1,2; Wang, Xiao-Bo1; Yang, Zhi-Xia1,3; Deng, Nai-Yang1 | |
2010-11-01 | |
Journal | PROTEIN AND PEPTIDE LETTERS |
ISSN | 0929-8665 |
Volume | 17 ; Issue: 11 ; Pages: 1441-1449 |
Article type | Article |
Abstract | Predicting enzyme subfamily class is an imbalanced multi-class classification problem because the number of proteins differs greatly between subfamilies. In this paper, we focus on developing computational methods specifically designed for the imbalanced multi-class classification problem to predict enzyme subfamily class. We compare two support vector machine (SVM)-based methods for the imbalance problem in enzyme subfamily classification: the AdaBoost algorithm with RBFSVM (SVM with an RBF kernel) and SVM with an arithmetic mean (AM) offset (AM-SVM). As input features for our predictive model, we use the conjoint triad feature (CTF). We validate the two methods on an enzyme benchmark dataset that contains six enzyme main families with a total of thirty-four subfamily classes, in which each protein has less than 40% sequence identity to any other in the same functional class. In predicting oxidoreductase subfamilies, AM-SVM obtains a Matthews correlation coefficient (MCC) over 0.92 and an accuracy over 93%; in predicting lyase, isomerase and ligase subfamilies, it obtains an MCC over 0.73 and an accuracy over 82%. The improvement in predictive performance suggests that AM-SVM might play a complementary role to existing function annotation methods. |
Keywords | Enzyme Subfamily Class Prediction ; Conjoint Triad Feature ; Imbalance Problem ; Support Vector Machine |
WOS headings | Science & Technology ; Life Sciences & Biomedicine |
Keywords [WOS] | SUPPORT VECTOR MACHINES ; PROTEIN STRUCTURAL CLASSES ; SUBCELLULAR LOCATION PREDICTION ; FUNCTIONAL DOMAIN COMPOSITION ; COMPLEXITY MEASURE FACTOR ; APOPTOSIS PROTEINS ; CLEAVAGE SITES ; GRAPHIC RULES ; TURN TYPES ; KINETICS |
Indexed by | SCI |
Language | English |
WOS research area | Biochemistry & Molecular Biology |
WOS subject category | Biochemistry & Molecular Biology |
WOS accession number | WOS:000284651900017 |
Document type | Journal article |
Item identifier | http://210.75.249.4/handle/363003/1640 |
Collection | Northwest Institute of Plateau Biology, Chinese Academy of Sciences |
Author affiliations | 1. China Agr Univ, Coll Sci, Beijing 100083, Peoples R China ; 2. Chinese Acad Sci, NW Inst Plateau Biol, Key Lab Adaptat & Evolut Plateau Biota, Xining 810001, Peoples R China ; 3. Xinjiang Univ, Coll Math & Syst Sci, Urumuchi 830046, Peoples R China |
Recommended citation (GB/T 7714) | Wang, Yong-Cui, Wang, Xiao-Bo, Yang, Zhi-Xia, et al. Prediction of Enzyme Subfamily Class via Pseudo Amino Acid Composition by Incorporating the Conjoint Triad Feature[J]. PROTEIN AND PEPTIDE LETTERS, 2010, 17(11): 1441-1449. |
APA | Wang, Yong-Cui, Wang, Xiao-Bo, Yang, Zhi-Xia, & Deng, Nai-Yang. (2010). Prediction of Enzyme Subfamily Class via Pseudo Amino Acid Composition by Incorporating the Conjoint Triad Feature. PROTEIN AND PEPTIDE LETTERS, 17(11), 1441-1449. |
MLA | Wang, Yong-Cui, et al. "Prediction of Enzyme Subfamily Class via Pseudo Amino Acid Composition by Incorporating the Conjoint Triad Feature". PROTEIN AND PEPTIDE LETTERS 17.11 (2010): 1441-1449. |
Files in this item | There are no files associated with this item. |
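Note on the conjoint triad feature (CTF) named in the abstract: this record does not describe how the feature is computed, so the following is only a minimal sketch under stated assumptions. It assumes the commonly cited formulation in which the 20 amino acids are grouped into seven classes, every window of three consecutive residues is counted by its class triple (7 × 7 × 7 = 343 bins), and the counts are scaled as (count - min) / max. The class grouping, the scaling, and the names `CLASSES` and `conjoint_triad` are illustrative assumptions, not details confirmed by the paper or this record.

```python
from itertools import product

# Assumed seven-class residue grouping (commonly cited for the conjoint
# triad); this record itself does not list the classes.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}

# Fixed ordering of the 343 possible class triples.
TRIADS = list(product(range(len(CLASSES)), repeat=3))
TRIAD_INDEX = {triad: i for i, triad in enumerate(TRIADS)}


def conjoint_triad(sequence):
    """Return a 343-dimensional conjoint triad vector for a protein sequence.

    Windows containing a residue outside the 20 standard amino acids are
    skipped (an assumption made for this sketch).
    """
    counts = [0] * len(TRIADS)
    classes = [RESIDUE_TO_CLASS.get(aa) for aa in sequence.upper()]
    # Slide a window of three consecutive residues over the sequence.
    for window in zip(classes, classes[1:], classes[2:]):
        if None in window:
            continue
        counts[TRIAD_INDEX[window]] += 1
    lowest, highest = min(counts), max(counts)
    if highest == 0:
        # No valid triad found (sequence too short or all unknown residues).
        return [0.0] * len(TRIADS)
    # Assumed scaling: (count - min) / max, which reduces length dependence.
    return [(c - lowest) / highest for c in counts]


if __name__ == "__main__":
    vector = conjoint_triad("MKTAYIAKQRQISFVKSHFSRQ")
    print(len(vector))
```

A vector of this kind could then be passed to any SVM implementation as the per-protein input; the AM-SVM offset and the AdaBoost wrapper compared in the abstract are not sketched here because the record gives no details about them.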