Antibody sequence analysis

12/11/2022

Antibodies are capable of potently and specifically binding individual antigens and, in some cases, disrupting their functions. The key challenge in generating antibody-based inhibitors is the lack of fundamental information relating sequences of antibodies to their unique properties as inhibitors. We develop a pipeline, Antibody Sequence Analysis Pipeline using Statistical testing and Machine Learning (ASAP-SML), to identify features that distinguish one set of antibody sequences from antibody sequences in a reference set. The pipeline extracts feature fingerprints from sequences. The fingerprints represent germline, CDR canonical structure, isoelectric point, and frequent positional motifs. Machine learning and statistical significance testing techniques are applied to antibody sequences and extracted feature fingerprints to identify distinguishing feature values and combinations thereof. To demonstrate how it works, we applied the pipeline to sets of antibody sequences known to bind or inhibit the activities of matrix metalloproteinases (MMPs), a family of zinc-dependent enzymes that promote cancer progression and undesired inflammation under pathological conditions, against reference datasets that do not bind or inhibit MMPs. ASAP-SML identifies features and combinations of feature values found in the MMP-targeting sets that are distinct from those in the reference sets.

The availability of machine learning techniques and the exponential growth of sequencing data present new opportunities to identify features that endow antibodies with the ability to disrupt the functions of biological targets. We have created a pipeline that uses statistical testing and machine learning techniques to determine features that are overrepresented in a specified set of antibody sequences in comparison to a reference set. The pipeline is referred to as Antibody Sequence Analysis Pipeline using Statistical testing and Machine Learning (ASAP-SML). We demonstrate the use of ASAP-SML by analyzing sets of antibodies that inhibit matrix metalloproteinases (MMPs) against reference sets. ASAP-SML performs within-set and across-set similarity analysis. As in prior studies, our analysis of these datasets shows that features associated with the antibody heavy chain are more likely to differentiate MMP-targeting antibody sequences from reference antibody sequences. Further, ASAP-SML identifies several features in the MMP-targeting set that are distinct from the reference sets. Using design recommendation trees, ASAP-SML suggests combinations of features that can be included or excluded to augment the targeting set with additional candidate MMP-targeting antibody sequences.

Antibodies play an important role in treating diseases such as cancer and autoimmune disorders by blocking specific protein-protein interactions and recruiting the immune system to specific cells and tissues. While experimental methods for antibody discovery, including hybridoma technology and phage and yeast display, have allowed for significant advances in discovering specific binding proteins, difficulties remain in establishing general strategies for designing antibodies that disrupt enzymatic activity or other biological functions. In particular, relating the amino acid sequences of these antibodies to their unique abilities in disrupting biological functions remains a challenge. Data-driven computational approaches may shed light on such fundamental information. Recent computational tools provide first steps towards elucidating structural information that can guide rational antibody design. Several numbering tools (e.g., AbNum, DomainGapAlign, PyIgClassify, ANARCI, and AbYsis) annotate an antibody sequence to identify the Complementarity-Determining Regions (CDRs) and the Framework Regions (FRs). There are also tools (e.g., IgBLAST and SAbDab) to select templates from databases for the variable domains VH and VL. Other tools (e.g., PIGS, FREAD, PyIgClassify) predict the structures of CDR loops. Efforts to partially design antibody sequences that bind to specific targets have been made using several computational tools. These tools graft designed CDRs on backbones, and then use energy minimization and optional docking procedures to obtain complete antibody sequences. Despite these and other efforts, there remains a critical gap in linking antibody sequences directly to biological consequences such as target inhibition.
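To make the idea of an "overrepresented" feature concrete, here is a minimal Python sketch of one way such a comparison could be run on binary feature fingerprints. The post does not say which statistical test ASAP-SML uses, so the one-sided Fisher's exact test below is an assumption chosen for illustration, and the feature names, toy fingerprints, and the `alpha` threshold are all hypothetical.

```python
from math import comb


def fisher_right_tail(k, n_target, K, N):
    """One-sided Fisher's exact p-value, P(X >= k), from the hypergeometric
    distribution: N sequences in total, K of them carry the feature,
    n_target of them belong to the target set, and k target sequences
    carry the feature."""
    total = comb(N, n_target)
    upper = min(K, n_target)
    tail = sum(comb(K, i) * comb(N - K, n_target - i) for i in range(k, upper + 1))
    return tail / total


def overrepresented_features(target, reference, alpha=0.05):
    """Return (feature, p_value) pairs whose presence is significantly more
    frequent in `target` than expected from the pooled data.

    `target` and `reference` are lists of dicts mapping a feature name to a
    bool (a toy stand-in for a feature fingerprint; assumption)."""
    features = sorted({name for fp in target + reference for name in fp})
    n_target = len(target)
    n_total = n_target + len(reference)
    hits = []
    for name in features:
        k = sum(1 for fp in target if fp.get(name, False))
        r = sum(1 for fp in reference if fp.get(name, False))
        p = fisher_right_tail(k, n_target, k + r, n_total)
        if p < alpha:
            hits.append((name, p))
    return sorted(hits, key=lambda pair: pair[1])


# Hypothetical toy data: eight target and eight reference fingerprints.
# "cdr_h3_motif" appears only in the target set; "germline_x" appears in
# half of each set and so should not be flagged.
target = [{"cdr_h3_motif": True, "germline_x": i < 4} for i in range(8)]
reference = [{"cdr_h3_motif": False, "germline_x": i < 4} for i in range(8)]

hits = overrepresented_features(target, reference)
for name, p in hits:
    print(f"{name}: p = {p:.2e}")
```

In this toy case only the hypothetical `cdr_h3_motif` feature is flagged. A real analysis over many features would also need a multiple-testing correction, which this sketch leaves out.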
