Machine Learning Sequence Classification Techniques: Application To Cysteine Protease Cleavage Prediction

ISSN: 2212-392X (Online)
ISSN: 1574-8936 (Print)


Volume 9, 5 Issues, 2014






Current Bioinformatics


Ranking and Category:
  • 20th of 52 in Mathematical & Computational Biology


Editor-in-Chief:
Alessandro Giuliani
Istituto Superiore di Sanità (Italian NIH) Environment and Health Dept
Roma
Italy




Current: 1.726
5 - Year: 1.577


Author(s): David A. duVerle and Hiroshi Mamitsuka


Abstract

Sequence classification is one of the most fundamental machine learning tasks in computational biology. With the wide availability of large corpora of annotated sequences, supervised learning techniques can greatly speed up the process of identifying new sequences that share a certain function or certain properties. Many methods have been proposed over the years, and we hope to provide an introduction to some of the more prominent ones by focussing on protease cleavage prediction, a typical representative of this class of problems. The proteolytic modes of action among cysteine proteases cover a broad range of complexity levels and feature specificity, illustrating the strengths and limitations of the different machine learning techniques applied to them.

This review briefly introduces the particulars of predicting cleavage by calpains and caspases. We then offer some general practical considerations on treating sequences for use with machine learning algorithms, before covering specific methods. The methods presented range from basic position-based statistical models to more technically advanced methods such as Markov models or kernel-based algorithms, as well as methods with more restricted goals such as decision trees. With each family of algorithms, examples of implementations are introduced and their performances compared, along with particular strengths and weaknesses.

With this review, we aim to provide useful guidance for choosing an existing method or developing a new one, based on the complexity and specific needs of a given sequence classification problem.
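As an illustration of the "basic position-based statistical models" mentioned in the abstract, the following is a minimal sketch of a position-specific scoring matrix built from aligned windows around known cleavage sites. It is not the authors' implementation: the function names `build_pssm` and `score`, the uniform background frequency, the pseudocount value, and the toy windows are all assumptions made for this example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def build_pssm(windows, pseudocount=1.0):
    """Build a log-odds scoring matrix from equal-length sequence windows.

    Assumption: a uniform background frequency (1/20) for every residue.
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        # Smoothed frequency of each residue at this position, as log-odds.
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, window):
    """Sum the per-position log-odds of a candidate window."""
    return sum(column[aa] for column, aa in zip(pssm, window))


# Toy windows invented for illustration only (not real annotated data).
positives = ["DEVDG", "DEVDS", "DETDG", "DEVDA"]
pssm = build_pssm(positives)
print(score(pssm, "DEVDG") > score(pssm, "AAAAA"))
```

A window that resembles the training examples receives a higher score than an unrelated one. In practice a score threshold would be tuned on held-out data, and residues outside the 20-letter alphabet would need explicit handling, since this sketch raises a `KeyError` for them.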




  
  



Article Details

Volume: 8
First Page:
Page Count:
DOI: 10.2174/15748936113089990010




Copyright © 2014 Bentham Science