Motivation: Insufficient knowledge of general principles for accurate quantitative inference of biological properties from sequences is a major obstacle in the rational design of proteins with predetermined activities. Because of this deficiency, protein engineering frequently relies on computational approaches that identify a quantitative structure-activity relationship (SAR) for each specific task. In the current article, a computational model was developed to define the SAR for a major conformational antigenic epitope of the hepatitis C virus (HCV) non-structural protein 3 (NS3) in order to facilitate the rational design of HCV antigens with improved diagnostically relevant properties.
Results: We present an artificial neural network (ANN) model that relates changes in antigenic properties to changes in the structure of HCV NS3 recombinant proteins representing all six HCV genotypes. The ANN quantitatively predicted enzyme immunoassay (EIA) Signal/Cutoff (S/Co) profiles from sequence information alone with 89.8% accuracy. Amino acid positions and physicochemical factors strongly associated with the HCV NS3 antigenic properties were identified. The positions contributing most significantly to the model were mapped onto the NS3 3D structure. The location of these positions validates the major associations found by the ANN model between the antigenicity and structure of the HCV NS3 proteins.
Availability: MATLAB code is available at http://bio-ai.myeweb.net/box_widget.html