Secondary Protein Structure Prediction Combining Protein Structural Class, Relative Surface Accessibility, and Contact Number

Computer Sciences | Physical Sciences and Mathematics


Imad Rahal, Computer Science


Neural networks are commonly used to predict protein structure from amino acid sequences. The accuracy of these methods varies greatly with the network design, the training method, and the input datasets. Neural networks tend to work well for secondary structure prediction because the task is largely one of pattern recognition. Procedural methods tend not to reach high accuracy because of the complexity of the interactions involved.

Several studies show how prediction accuracy can be increased by adding information such as contact number [17], relative surface accessibility [1, 18], protein structural class [16], and other data. While these studies have focused on the improvement gained from adding a single type of data, none has shown what effect adding more than one type together would have.

To see whether combining additional data has a positive effect on the accuracy of a prediction network, several types of additional data will be combined. Contact number, relative surface accessibility, and protein structural class have each been independently shown to increase accuracy, and they can be used in seven different ways: each one alone, each of the three pairs, and all three together. Comparing these combinations should show how much of an effect combining the data has on prediction accuracy.
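The count of seven follows from taking every non-empty subset of the three data types (2^3 - 1 = 7). The sketch below only enumerates those subsets; it is an illustration, not the study's own code, and the identifier names are assumptions chosen for readability.

```python
from itertools import combinations

# The three additional data types named in the abstract.
# The identifier spellings are illustrative, not taken from the study.
FEATURES = (
    "contact_number",
    "relative_surface_accessibility",
    "structural_class",
)


def feature_combinations(features):
    """Return every non-empty subset of features, smallest subsets first."""
    return [
        subset
        for size in range(1, len(features) + 1)
        for subset in combinations(features, size)
    ]


COMBOS = feature_combinations(FEATURES)

# Each line is one candidate input configuration for a prediction network.
for combo in COMBOS:
    print(" + ".join(combo))
```

A network given none of the additional data would be a natural point of comparison, but the abstract does not say whether one is included.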