Machine Learning Sequence Classification Techniques: Application To Cysteine Protease Cleavage Prediction

ISSN: 2212-392X (Online)
ISSN: 1574-8936 (Print)

Volume 11, 5 Issues, 2016


Current Bioinformatics


Yi-Ping Phoebe Chen
Department of Computer Science and Information Technology
La Trobe University


Current: 0.921
5 - Year: 1.045


Current Bioinformatics, Volume 8 (E-pub ahead of print)

Author(s): David A. duVerle and Hiroshi Mamitsuka.


Sequence classification is one of the most fundamental machine learning tasks in computational biology. With large corpora of annotated sequences now widely available, supervised learning techniques can greatly speed up the identification of new sequences that share a given function or property. Many methods have been proposed over the years, and we hope to provide an introduction to some of the more prominent ones by focussing on protease cleavage prediction, a typical representative of this class of problems. The proteolytic modes of action of cysteine proteases span a broad range of complexity and feature specificity, and so illustrate the strengths and limitations of the different machine learning techniques applied to them.

This review briefly introduces the particulars of predicting cleavage by calpains and caspases. We then offer some general practical considerations on preparing sequences for use with machine learning algorithms, before covering specific methods. The methods presented range from basic position-based statistical models to more technically advanced methods such as Markov models and kernel-based algorithms, as well as methods with more restricted goals such as decision trees. For each family of algorithms, we introduce example implementations, compare their performance, and discuss their particular strengths and weaknesses.

With this review, we aim to provide practical guidance for choosing an existing method or developing a new one, based on the complexity and specific needs of a given sequence classification problem.
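As a rough illustration of the simplest family mentioned in the abstract, position-based statistical models, the sketch below builds a log-odds position-specific scoring matrix from fixed-length windows and slides it along a sequence. It is not the authors' method or any published predictor: the function names, the uniform background, the pseudocount value, and the toy windows are all assumptions made for this example, and the windows are invented strings rather than curated cleavage sites.

```python
import math
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(windows, pseudocount=1.0):
    """Build a log-odds scoring matrix from equal-length positive windows.

    Assumptions (not from the review): uniform background frequencies
    and additive pseudocount smoothing.
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        column = {
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm


def score_window(pssm, window):
    """Sum per-position log-odds; unknown residues contribute 0."""
    return sum(col.get(aa, 0.0) for col, aa in zip(pssm, window))


def scan(pssm, sequence):
    """Score every window of the matrix length along a sequence."""
    k = len(pssm)
    return [
        (i, score_window(pssm, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    ]


# Toy windows invented for illustration only.
toy_windows = ["DEVDG", "DEVDS", "DEVDA"]
toy_pssm = build_pssm(toy_windows)
toy_hits = scan(toy_pssm, "AAADEVDGAAA")
best_start, best_score = max(toy_hits, key=lambda hit: hit[1])
```

A model of this kind treats each window position independently, which is the limitation that motivates the Markov and kernel-based approaches the review goes on to cover.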


Article Details

Volume: 8
First Page:
Page Count:
DOI: 10.2174/15748936113089990010


Copyright © 2016 Bentham Science