World Library  

PLOS ONE: On the Relevance of Sophisticated Structural Annotations for Disulfide Connectivity Pattern Prediction, Volume 7

By Fernandez-Fuentes, Narcis

Book Id: WPLBN0003957784
Format Type: PDF eBook
File Size:
Reproduction Date: 2015

Title: PLOS ONE: On the Relevance of Sophisticated Structural Annotations for Disulfide Connectivity Pattern Prediction, Volume 7
Author: Fernandez-Fuentes, Narcis
Volume: Volume 7
Language: English
Subject: Journals, Science, Medical Science
Collections: Periodicals: Journal and Magazine Collection
Publication Date:
Publisher: PLOS


Fernandez-Fuentes, N. (n.d.). PLOS ONE: On the Relevance of Sophisticated Structural Annotations for Disulfide Connectivity Pattern Prediction, Volume 7. Retrieved from

Description: Disulfide bridges strongly constrain the native structure of many proteins, and predicting their formation is therefore a key sub-problem of protein structure and function inference. Most recently proposed approaches for this prediction problem adopt the following pipeline: first, they enrich the primary sequence with structural annotations; second, they apply a binary classifier to each candidate pair of cysteines to predict disulfide bonding probabilities; and finally, they use a maximum weight graph matching algorithm to derive the predicted disulfide connectivity pattern of a protein. In this paper, we adopt this three-step pipeline and propose an extensive study of the relevance of various structural annotations and feature encodings. In particular, we consider five kinds of structural annotations, three of which are novel in the context of disulfide bridge prediction. To be usable by machine learning algorithms, these annotations must be encoded into features. For this purpose, we propose four different feature encodings based on local windows and on different kinds of histograms. The combination of structural annotations with these possible encodings leads to a large number of possible feature functions. In order to identify a minimal subset of relevant feature functions among those, we propose an efficient and interpretable feature function selection scheme, designed so as to avoid any form of overfitting. We apply this scheme on top of three supervised learning algorithms: k-nearest neighbors, support vector machines and extremely randomized trees.
Our results indicate that using only the PSSM (position-specific scoring matrix) together with the CSP (cysteine separation profile) is sufficient to construct a high-performance disulfide pattern predictor, and that extremely randomized trees reach a disulfide pattern prediction accuracy of 58.2% on the benchmark dataset SPX+, which corresponds to a +3.2% improvement over the state of the art. A web-application is available at http:/
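
The final step of the pipeline described above, turning pairwise bonding probabilities into a single connectivity pattern, can be illustrated with a small sketch. This is not the authors' implementation: it swaps in a plain exhaustive search over perfect matchings in place of a dedicated maximum weight graph matching algorithm, and it assumes that every cysteine is bonded and that the number of cysteines is even. The function name `best_connectivity_pattern`, the cysteine positions and the probabilities are all hypothetical values chosen for illustration.

```python
def best_connectivity_pattern(scores, cysteines):
    """Exhaustively search for the highest-scoring perfect matching.

    scores    -- dict mapping (i, j) with i < j to a predicted bonding
                 probability for that pair of cysteine positions
    cysteines -- sorted tuple of cysteine positions (even length assumed)

    Returns (total_score, list_of_pairs). The search visits every perfect
    matching, so it is only practical for a handful of cysteines.
    """
    if not cysteines:
        return 0.0, []

    first, rest = cysteines[0], cysteines[1:]
    best_score, best_pairs = float("-inf"), []

    # Pair the first unmatched cysteine with each remaining candidate,
    # then solve the smaller problem on the cysteines that are left.
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        sub_score, sub_pairs = best_connectivity_pattern(scores, remaining)
        total = scores[(first, partner)] + sub_score
        if total > best_score:
            best_score = total
            best_pairs = [(first, partner)] + sub_pairs

    return best_score, best_pairs


# Hypothetical classifier output for a protein with four cysteines.
positions = (5, 12, 30, 41)
pair_probabilities = {
    (5, 12): 0.2, (5, 30): 0.9, (5, 41): 0.1,
    (12, 30): 0.3, (12, 41): 0.8, (30, 41): 0.4,
}

total, pattern = best_connectivity_pattern(pair_probabilities, positions)
print(pattern)  # [(5, 30), (12, 41)]
```

With four cysteines there are only three possible patterns, so exhaustive search is enough here. For proteins with many cysteines the number of candidate patterns grows very quickly, which is why a polynomial-time matching algorithm is the usual choice for this step.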
