Established Methods for Library Construction
Random Mutagenesis
Site-Directed Diversification
Scanning Mutagenesis
Diversity from Natural Sources
Critical Factors in Evaluation of Library-Construction Methods
Extent and Nature of Diversification
Sampling Problem
Library Quality
Design versus Randomization
Diversity-oriented protein engineering, also known as directed molecular evolution or directed protein evolution, relies on the construction of large libraries of variant genes, followed by high-throughput screening or selection to identify those members of each library that encode proteins with desired properties.
Typically, a protein-engineering library is based on the sequence of one or a small number of starting proteins, which already have properties similar to those required. For example, a diversity-based engineering project may start with a well-characterized antibody that binds to a specific antigen, with the goal of identifying a related antibody that will bind the same antigen with a higher affinity or specificity. Similarly, an enzyme may be re-engineered to increase its activity or thermostability, or to modify its substrate specificity. More recently, natural proteins with favorable biophysical properties have been used as scaffolds to design families of stable proteins selected for their ability to bind target macromolecules.
In order to modify or optimize an existing protein, a library of variant genes is designed and constructed with two, often conflicting, goals in mind:
First, library members need to be sufficiently similar in sequence to the starting protein to share a similar structure and function. For example, in the case of antibody affinity maturation, as many variants as possible should be similar enough to the starting antibody to also fold into functional antibodies that recognize the same antigen. Similarly, in the case of engineering an enzyme for change in substrate specificity, the variant enzymes should be similar enough in structure to the starting enzyme to catalyze the same chemical reaction.
Second, library members need to be sufficiently different in sequence from the starting protein to be slightly different in structure and thus in the functional property of interest. For example, in the case of antibody affinity maturation, the variant proteins should bind to the antigen of interest with different affinities. Similarly, in the case of engineering an enzyme for change in substrate specificity, the variant enzymes should differ enough in structure to bind a range of substrates other than the natural substrate of the starting enzyme.
In practice, striking the optimal balance between conservation and diversification of the sequence, and thus of the properties to be modified, is one of the biggest challenges in library design. Ideally, all available information on the relationship between the sequence, structure, and function of the starting protein and its relatives should be used in library design, just as it would be used in the computational design of a small number of improved site-directed mutants. The amount of design incorporated into library construction is limited by the level of structural and functional understanding of the protein being engineered, as well as by the technical constraints of the library-construction method being used.
Other factors that contribute to library design and construction are: (1) the disparity between accessible library size and the theoretical sequence space of interest, (2) the limited number of approaches that can incorporate knowledge and design rules into a library, (3) the use and control of randomization, (4) the natural versus synthetic origin of the diverse population, and (5) library quality.
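To make factor (1) concrete, the short Python sketch below compares the theoretical number of protein sequences with a library size that could be handled experimentally. It is an illustration under stated assumptions, not a method taken from this chapter: the function names are hypothetical, only the 20 canonical amino acids are counted, and the library size of 10^9 members is an arbitrary illustrative figure rather than a property of any particular screening or selection system.

```python
from math import comb

AMINO_ACIDS = 20  # assumption: only the 20 canonical amino acids are allowed


def full_randomization_size(positions: int) -> int:
    """Distinct protein sequences when `positions` residues are each fully randomized."""
    return AMINO_ACIDS ** positions


def variants_with_k_substitutions(length: int, k: int) -> int:
    """Variants that differ from a parent of `length` residues at exactly k positions."""
    return comb(length, k) * (AMINO_ACIDS - 1) ** k


def fraction_covered(library_size: float, space_size: int) -> float:
    """Upper bound on the fraction of a sequence space a library can contain.

    Assumes every library member is unique; duplicates and biased
    sampling in a real library would lower the true coverage.
    """
    return min(1.0, library_size / space_size)


if __name__ == "__main__":
    library_size = 1e9  # hypothetical accessible library size, for illustration only

    # Full randomization of a growing number of positions.
    for n in (4, 6, 8, 10, 12):
        space = full_randomization_size(n)
        print(f"{n:2d} randomized positions: {space:.3e} sequences, "
              f"max coverage {fraction_covered(library_size, space):.3e}")

    # Sparse substitutions spread over a hypothetical 100-residue parent protein.
    for k in (1, 2, 3, 4):
        space = variants_with_k_substitutions(100, k)
        print(f"{k} substitutions in 100 residues: {space:.3e} variants, "
              f"max coverage {fraction_covered(library_size, space):.3e}")
```

Under these assumptions, fully randomizing six positions gives 20^6 = 6.4 × 10^7 protein sequences, whereas ten positions give 20^10 ≈ 1.0 × 10^13, so a library of 10^9 unique members could contain at most about 0.01% of the ten-position space. The coverage figure is an upper bound, because it treats every library member as unique.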
This chapter opens with an overview of the library-construction methods most commonly used today. It then discusses these methods in light of the considerations listed above, including the implications for their use in conjunction with different screening and selection methods, and for their application to different types of protein-engineering problems. It concludes with an emphasis on the complementarity and synergy between different library-construction methods, as well as between the use of diverse libraries and computational protein design.