Protein Engineering 101


Chemical engineering in the 21st century is more versatile than ever before. Today’s chemical engineer can contribute to a wide range of research fields, including micro- and nanotechnology, novel polymeric materials, energy, the environment and biotechnology, to name a few. Broadly speaking, protein engineering is a branch of biotechnology. This article gives a very brief overview of protein engineering and should not be considered complete or extensive.


Every living being owes its structural integrity and functioning to the biomolecules that make up life. DNA, RNA, proteins and numerous small molecules are collectively known as biomolecules. Among these molecules, proteins play the most critical role in mediating most of the biological processes that are essential for maintaining and propagating life. For instance, large protein molecules called antibodies (also known as immunoglobulins or Ig) help us fight infections by binding their cognate antigens. If we zoom in on this binding event, we see that the binding between an antibody and its antigen is actually an interaction between a protein and its binding partner. The protein is the binding sub-unit of the antibody, while the binding partner can be a peptide derived from a pathogen. This non-covalent and specific interaction between a protein and another molecule is referred to as molecular recognition. Molecular recognition can be mediated by any combination of van der Waals forces, hydrogen bonds, hydrophobic interactions and electrostatic interactions. In addition to molecular recognition, properties of proteins such as thermal stability, folding and soluble expression can also be critical in determining the type of engineering problem to be addressed. Protein engineering involves altering the biochemical and/or biophysical properties of a protein such that the mutant protein attains a desired property. The emergence of protein engineering was duly captured by Ulmer some 30 years ago (Ulmer 1983).

Protein Engineering

In its simplest definition, protein engineering refers to the design of de novo proteins with a desired function by the substitution, addition and/or deletion of amino acids. The desired function can be better catalytic activity of an enzyme, improved affinity of a receptor for its ligand, better thermal stability or soluble expression, to name a few (Gai and Wittrup 2007). There are two main approaches to protein engineering: rational design and directed evolution. In the rational design approach, one or more specific amino acid residues of a protein are replaced by other residues, creating a mutant protein. When only one amino acid is substituted in the sequence of the naturally occurring, or wild type, protein, the product is called a point mutant. Protein function can be tremendously altered by substituting even a single amino acid (Songyang et al. 1995). Also, because there are 20 natural amino acids, a single position can be occupied by 20 different residues, that is, the wild type residue plus 19 possible substitutions. Given sufficient information about the protein to be mutagenized, introducing more than one point mutation is typical (Serrano, Day, and Fersht 1993). However, deciding which amino acid(s) to replace requires knowledge of the protein’s structure and the mechanism of its function. Thus, rational design is heavily dependent on the availability of a protein’s structure. The Protein Data Bank (PDB) is a good source of crystal structures for those proteins whose structure has been determined. Because of the vast number of proteins that exist across all species, it is highly unlikely that crystal structures will be obtained for all possible proteins. Even though deposits in the PDB are increasing exponentially, only some 70,000 structures have been solved so far.
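As a small illustration of the point-mutant idea, the sketch below enumerates every single substitution at one position of a sequence. It is only a minimal example: the peptide `MKTAYIAK`, the function name `point_mutants` and the label format (wild type residue, position, new residue, e.g. `A4G`) are assumptions made for illustration and do not come from any of the cited studies.

```python
# The 20 natural amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def point_mutants(sequence, position):
    """Return {label: mutant sequence} for every single substitution
    at a 1-based position of the given sequence."""
    wild_type = sequence[position - 1]
    mutants = {}
    for residue in AMINO_ACIDS:
        if residue == wild_type:
            continue  # the unchanged residue is the wild type, not a mutant
        label = f"{wild_type}{position}{residue}"
        mutants[label] = sequence[:position - 1] + residue + sequence[position:]
    return mutants


toy_sequence = "MKTAYIAK"  # hypothetical peptide, for illustration only
variants = point_mutants(toy_sequence, 4)
print(len(variants))    # 19
print(variants["A4G"])  # MKTGYIAK
```

Each position therefore yields 19 point mutants in addition to the wild type. Choosing which of them are worth making is where structural knowledge comes in.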

Figure 1: Outline of a directed protein evolution strategy

Contrary to rational design, directed evolution employs random mutagenesis and/or gene recombination (Arnold 2001) to create mutant libraries. Put simply, directed protein evolution can be thought of as evolution in test tubes (Figure 1). The idea is to isolate the best pool of mutants, those that have the desired property which the other billions of mutants lack. Just as natural selection favors certain phenotypes and rejects others, directed protein evolution applies a selection pressure on the library of mutant proteins. If the goal of the engineering is to obtain higher affinity of a protein for a ligand, the selection pressure can be successively lower concentrations of the ligand, a slower dissociation rate and/or stringent wash conditions for the protein-ligand complex. In random mutagenesis, any amino acid residue can be replaced by any of the 19 other amino acids, either at given positions or completely randomly at any position. Gene recombination by DNA shuffling is also an elegant way of creating molecular diversity (Stemmer 1994). A combinatorial library can also be created by mutagenizing only selected residues on a protein template instead of mutagenizing randomly. For instance, if 10 residues on a wild type protein are mutagenized, the theoretical diversity would be 20^10 ≈ 1×10^13. Once the top 5 to 10 variants of the wild type are identified, they are further characterized and the best clone is determined based on in vitro or in vivo analysis, whichever applies to the given problem. This final clone is then mutagenized randomly again and goes through the same process of selection and isolation of the very best clone (Figure 1). Three to six rounds of such selection are usually employed. The number of rounds to be carried out is determined by the round-to-round improvement of the mutant. A few examples of directed protein evolution can be found here: (Holler et al. 2000; May, Nguyen, and Arnold 2000; Varadarajan et al. 2005; Cho and Szostak 2006).
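The mutate-select-repeat cycle described above can be mimicked with a toy simulation. The sketch below is not a model of any real experiment. The scoring function (number of positions matching a hidden target sequence) is a hypothetical stand-in for a measured property such as affinity, the sequences, mutation rate and library size are arbitrary assumptions, and only the single best clone is carried forward each round rather than a pool of 5 to 10 variants. The helper `theoretical_diversity` reproduces the library-size arithmetic for 10 fully randomized positions.

```python
import random

# The 20 natural amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def theoretical_diversity(n_positions):
    """Number of distinct sequences when n positions are fully randomized."""
    return len(AMINO_ACIDS) ** n_positions


def mutate(sequence, rng, rate=0.1):
    """Random mutagenesis: each residue is replaced, with probability `rate`,
    by one of the 19 other amino acids."""
    mutated = []
    for residue in sequence:
        if rng.random() < rate:
            mutated.append(rng.choice([a for a in AMINO_ACIDS if a != residue]))
        else:
            mutated.append(residue)
    return "".join(mutated)


def score(sequence, target):
    """Toy stand-in for a measured property: positions matching a hidden target."""
    return sum(a == b for a, b in zip(sequence, target))


def directed_evolution(parent, target, rounds=4, library_size=500, seed=0):
    """Run several rounds of mutagenesis and selection; return the final
    clone and the best score after each round."""
    rng = random.Random(seed)
    history = [score(parent, target)]
    for _ in range(rounds):
        library = [mutate(parent, rng) for _ in range(library_size)]
        library.append(parent)  # keep the parent so the score never drops
        parent = max(library, key=lambda s: score(s, target))
        history.append(score(parent, target))
    return parent, history


print(theoretical_diversity(10))  # 10240000000000, i.e. about 1x10^13
best, history = directed_evolution("AAAAAAAAAA", "MKTAYIAKQR")
print(best, history)
```

Even in this toy setting, a library of a few hundred clones samples only a tiny fraction of the roughly 10^13 possible sequences, which is why several rounds of selection are used instead of a single exhaustive screen.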

Protein Engineering Platforms

Point mutation is achieved by site-directed mutagenesis, and with current techniques in molecular biology the protocols for performing site-directed mutagenesis are relatively straightforward. Screening for the best mutant from a combinatorial library, however, is challenging. The challenge is to isolate the mutants with improved properties at each stage of selection. This challenge is overcome by ‘Display Technology’.

Figure 2: Yeast surface display as a platform to link genotype to phenotype. The mutant protein is expressed on the yeast cell surface as a fusion to the Aga2p subunit of the yeast Saccharomyces cerevisiae. Aga2p is in turn linked to the Aga1p subunit of the yeast mating protein a-agglutinin. Aga1p anchors the entire assembly on the yeast cell such that the fusion protein can interact with other molecules in solution (Boder and Wittrup 1997).

Display technology links the genotype to the phenotype (Figure 2), either using a cellular platform or in a completely cell-free environment. The first of its kind was phage display, invented in the mid-1980s by George P. Smith (Smith 1985). This method uses bacteriophage to express a foreign gene library as fusion proteins on the phage surface. Bratkovic discussed the progress in phage display techniques and their applications in a recent review (Bratkovic 2010). Next in line was Yeast Surface Display (YSD), developed by K. D. Wittrup, which came into the picture in the late 1990s (Boder and Wittrup 1997). YSD has been particularly useful because it uses eukaryotic protein processing machinery and thus enables expression of mammalian proteins with post-translational modifications better than phage display does (Boder and Wittrup 1997). In YSD, the gene library is expressed on the surface of the yeast Saccharomyces cerevisiae as a fusion to the Aga2p mating protein (Figure 2). The use of yeast surface display in protein engineering is reviewed by Gai and Wittrup (Gai and Wittrup 2007). A review of protein engineering using cell-surface display can be found elsewhere (Wittrup 2001). Another display technique, messenger RNA (mRNA) display, differs from the former two in that it is a completely cell-free, in vitro platform. Ribosome display is also a cell-free display technique. Lipovsek and Pluckthun presented a review of the latter two display technologies (Lipovsek and Pluckthun 2004). Alternatively, proteins can be engineered in silico. Computational design of proteins to alter affinity, specificity or folding can aid experimental procedures in solving complex problems. Lippow and Tidor discussed the recent progress in computational protein design (Lippow and Tidor 2007).


Protein engineering has been employed in a variety of fields ranging from biocatalysis to biomedicine. Enzymes engineered for enhanced catalytic activity have tremendously benefited industries such as textiles, leather, detergent, pulp and paper, personal care, and food and beverage (Cherry and Fidantsef 2003; Johannes and Zhao 2006). Cellulases have been engineered for use in many of the above-mentioned industries (Cherry and Fidantsef 2003). More recently, Frances Arnold and co-workers have created a family of cellulases for the conversion of biomass into biofuel (Heinzelman et al. 2009). Recent advances in protein engineering in the field of biofuels have been reviewed by Wen et al. (Wen, Nair, and Zhao 2009).

Enzymes and single chain antibody fragments have been engineered for use in biosensors for a wide variety of targets, such as glucose (Sode et al. 2000) and viruses (Torrance et al. 2006), respectively. These studies have implications for blood-sugar monitoring and pathogen detection. The application of protein engineering in biosensor development has been extensively reviewed by Lambrianou et al. (Lambrianou, Demin, and Hall 2008).

Enzyme engineering has also been particularly useful in bioremediation for removing recalcitrant pollutants. For instance, directed evolution of aniline dioxygenase has proven very useful in the biodegradation of some aromatic amines (Ang, Obbard, and Zhao 2007, 2009). Ang et al. have reviewed the application of protein engineering in bioremediation (Ang, Zhao, and Obbard 2005).

Protein engineering has transformed biomedicine in many different ways. Protein therapeutics have been revolutionized ever since recombinant human insulin was approved by the United States Food and Drug Administration (USFDA) in 1982 for treating diabetes mellitus (Leader, Baca, and Golan 2008). However, a single amino acid substitution from the wild type insulin became necessary to mimic the recombinant insulin’s ability to retain its biological activity (Brange et al. 1988). A summary of all the protein therapeutics on the market is furnished in a review by Leader et al. (Leader, Baca, and Golan 2008). Engineered proteins have also been very useful in the diagnosis of several types of cancer [Table 10 in Leader et al. (Leader, Baca, and Golan 2008) and references therein]. While many of the proteins listed in the review by Leader et al. are wild type proteins, a number of the enzymes, hormones and monoclonal antibodies are engineered, de novo proteins.

Monoclonal antibodies (mAbs) have been widely engineered as protein therapeutics. A list of the monoclonal antibodies currently on the market can be found in a review by Li and Zhu (Li and Zhu 2010). Maynard and Georgiou have written an excellent review on antibody engineering (Maynard and Georgiou 2000). Some of the USFDA-approved novel mAbs and antibody-based therapeutics are used to treat several types of cancer, autoimmune disease, respiratory disease, kidney transplant rejection, asthma, cardiac disease such as ischemia, osteoporosis, etc. (Li and Zhu 2010). Even though antibodies have been extensively engineered to meet therapeutic needs, some features of antibodies are not desirable. For instance, antibodies have a large multi-domain structure secured by disulfide bonds, which makes them difficult to express in common bacterial expression systems. In addition, the generation and production of antibodies is time consuming and costly. These issues have prompted the use of antibody alternatives as template proteins for engineering. For instance, protein therapeutics called Adnectins™ have been engineered to bind biologically relevant targets (Lipovsek 2010). Adnectins™ are engineered variants of the 10th fibronectin type III domain. DARPins (Designed Ankyrin Repeat Proteins) (Stumpp, Binz, and Amstutz 2008), Affibody® (Wikman et al. 2004) and Avimer (Silverman et al. 2005) are some of the other engineered proteins that hold promise as a new generation of proteins with wide applicability in biotechnology and medicine (Skerra 2007). In an effort to engineer highly stable proteins, hyperthermophilic proteins have very recently appeared as a promising alternative to antibodies (Gera et al. 2011).

Concluding remarks

Since the invention of recombinant DNA technology (Cohen et al. 1973), biotechnology and medicine have taken unprecedented turns. Completion of the Human Genome Project and the exponential increase in protein structure deposits in the PDB have opened up new horizons in biotechnology and medicine. Our understanding of the protein structure-function relationship is also allowing us to approach biological problems at the molecular and systems levels in ways that were not previously possible. Protein engineering has been at the forefront of addressing biotechnological challenges by designing novel proteins. Fueled by ongoing research and development, this powerful discipline will continue to be a leading field of biology.


Amino acid: Building blocks of proteins. A protein can be composed of as many as thousands of amino acids.
Antigen: Any substance that elicits an immune response in the body, i.e. triggers the production of antibodies specific to that substance.
Bacteriophage: Commonly abbreviated as phage, bacteriophages are viruses that infect bacteria.
Bioremediation: Use of microorganisms to eliminate pollutants.
Expression: Production/synthesis of proteins in cells.
Gene: DNA sequence that codes for a protein.
Genome: The entire hereditary information of an organism. It is coded in the DNA of all living organisms. For certain viruses, the genome is coded in RNA.
Genotype: DNA sequence that codes for a phenotype.
In vitro: Outside the native environment. For instance, something performed in test tubes without the use of any cells.
In vivo: Opposite of in vitro. Performed inside the cell, in the body or in the natural environment.
In silico: Performed on a computer.
Ligand: Any molecule that binds to a protein, such as a receptor, and can thereby trigger biochemical processes.
Mutant: Variant of a protein produced (naturally or artificially) by substitution/addition/deletion of amino acid residue(s).
Mutagenize/Mutagenesis: The process of producing mutant proteins by altering the DNA sequence.
Pathogen: Disease causing agent such as virus/bacteria/fungus/protozoa.
Phenotype: Visible trait/characteristic of an organism. With regard to protein engineering, it refers to the protein in question.
Post-translational modification (PTM): Chemical modification of a protein that takes place after ‘translation’, the synthesis of the protein inside the cell. Examples include glycosylation and acetylation. PTMs are essential for regulating numerous cellular processes.
Receptor: Trans-membrane proteins that mediate many cellular processes, such as the T-cell receptors present on T-lymphocytes (cells of the immune system).
Recombinant DNA technology: The process of cloning a foreign gene in a host organism and producing the corresponding protein in bacteria, yeast or a mammalian cell line.

Ang, E. L., J. P. Obbard, and H. Zhao. 2007. Probing the molecular determinants of aniline dioxygenase substrate specificity by saturation mutagenesis. FEBS J 274 (4):928-939.
Ang, E. L., J. P. Obbard, and H. Zhao. 2009. Directed evolution of aniline dioxygenase for enhanced bioremediation of aromatic amines. Appl Microbiol Biotechnol 81 (6):1063-1070.
Ang, E. L., H. Zhao, and J. P. Obbard. 2005. Recent advances in the bioremediation of persistent organic pollutants via biomolecular engineering. Enzyme Microb Technol 37 (5):487-496.
Arnold, F. H. 2001. Combinatorial and computational challenges for biocatalyst design. Nature 409 (6817):253-257.
Boder, E. T., and K. D. Wittrup. 1997. Yeast surface display for screening combinatorial polypeptide libraries. Nat Biotechnol 15 (6):553-557.
Brange, J., U. Ribel, J. F. Hansen, G. Dodson, M. T. Hansen, S. Havelund, S. G. Melberg, F. Norris, K. Norris, L. Snel, et al. 1988. Monomeric insulins obtained by protein engineering and their medical implications. Nature 333 (6174):679-682.
Bratkovic, T. 2010. Progress in phage display: evolution of the technique and its application. Cell Mol Life Sci 67 (5):749-767.
Cherry, J. R., and A. L. Fidantsef. 2003. Directed evolution of industrial enzymes: an update. Curr Opin Biotechnol 14 (4):438-443.
Cho, G. S., and J. W. Szostak. 2006. Directed evolution of ATP binding proteins from a zinc finger domain by using mRNA display. Chem Biol 13 (2):139-147.
Cohen, S. N., A. C. Chang, H. W. Boyer, and R. B. Helling. 1973. Construction of biologically functional bacterial plasmids in vitro. Proc Natl Acad Sci U S A 70 (11):3240-3244.
Gai, S. A., and K. D. Wittrup. 2007. Yeast surface display for protein engineering and characterization. Curr Opin Struct Biol 17 (4):467-473.
Gera, N., M. Hussain, R. C. Wright, and B. M. Rao. 2011. Highly Stable Binding Proteins Derived from the Hyperthermophilic Sso7d Scaffold. J Mol Biol.
Heinzelman, P., C. D. Snow, I. Wu, C. Nguyen, A. Villalobos, S. Govindarajan, J. Minshull, and F. H. Arnold. 2009. A family of thermostable fungal cellulases created by structure-guided recombination. Proc Natl Acad Sci U S A 106 (14):5610-5615.
Holler, P. D., P. O. Holman, E. V. Shusta, S. O’Herrin, K. D. Wittrup, and D. M. Kranz. 2000. In vitro evolution of a T cell receptor with high affinity for peptide/MHC. Proc Natl Acad Sci U S A 97 (10):5387-5392.
Johannes, T. W., and H. Zhao. 2006. Directed evolution of enzymes and biosynthetic pathways. Curr Opin Microbiol 9 (3):261-267.
Lambrianou, A., S. Demin, and E. A. Hall. 2008. Protein engineering and electrochemical biosensors. Adv Biochem Eng Biotechnol 109:65-96.
Leader, B., Q. J. Baca, and D. E. Golan. 2008. Protein therapeutics: a summary and pharmacological classification. Nat Rev Drug Discov 7 (1):21-39.
Li, J., and Z. Zhu. 2010. Research and development of next generation of antibody-based therapeutics. Acta Pharmacol Sin 31 (9):1198-1207.
Lipovsek, D. 2010. Adnectins: engineered target-binding protein therapeutics. Protein Eng Des Sel 24 (1-2):3-9.
Lipovsek, D., and A. Pluckthun. 2004. In-vitro protein evolution by ribosome display and mRNA display. J Immunol Methods 290 (1-2):51-67.
Lippow, S. M., and B. Tidor. 2007. Progress in computational protein design. Curr Opin Biotechnol 18 (4):305-311.
May, O., P. T. Nguyen, and F. H. Arnold. 2000. Inverting enantioselectivity by directed evolution of hydantoinase for improved production of L-methionine. Nat Biotechnol 18 (3):317-320.
Maynard, J., and G. Georgiou. 2000. Antibody engineering. Annu Rev Biomed Eng 2:339-376.
Serrano, L., A. G. Day, and A. R. Fersht. 1993. Step-wise mutation of barnase to binase. A procedure for engineering increased stability of proteins and an experimental analysis of the evolution of protein stability. J Mol Biol 233 (2):305-312.
Silverman, J., Q. Liu, A. Bakker, W. To, A. Duguay, B. M. Alba, R. Smith, A. Rivas, P. Li, H. Le, E. Whitehorn, K. W. Moore, C. Swimmer, V. Perlroth, M. Vogt, J. Kolkman, and W. P. Stemmer. 2005. Multivalent avimer proteins evolved by exon shuffling of a family of human receptor domains. Nat Biotechnol 23 (12):1556-1561.
Skerra, A. 2007. Alternative non-antibody scaffolds for molecular recognition. Curr Opin Biotechnol 18 (4):295-304.
Smith, G. P. 1985. Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface. Science 228 (4705):1315-1317.
Sode, K., T. Ootera, M. Shirahane, A. B. Witarto, S. Igarashi, and H. Yoshida. 2000. Increasing the thermal stability of the water-soluble pyrroloquinoline quinone glucose dehydrogenase by single amino acid replacement. Enzyme Microb Technol 26 (7):491-496.
Songyang, Z., G. Gish, G. Mbamalu, T. Pawson, and L. C. Cantley. 1995. A single point mutation switches the specificity of group III Src homology (SH) 2 domains to that of group I SH2 domains. J Biol Chem 270 (44):26029-26032.
Stemmer, W. P. 1994. DNA shuffling by random fragmentation and reassembly: in vitro recombination for molecular evolution. Proc Natl Acad Sci U S A 91 (22):10747-10751.
Stumpp, M. T., H. K. Binz, and P. Amstutz. 2008. DARPins: a new generation of protein therapeutics. Drug Discov Today 13 (15-16):695-701.
Torrance, L., A. Ziegler, H. Pittman, M. Paterson, R. Toth, and I. Eggleston. 2006. Oriented immobilisation of engineered single-chain antibodies to develop biosensors for virus detection. J Virol Methods 134 (1-2):164-170.
Ulmer, K. M. 1983. Protein engineering. Science 219 (4585):666-671.
Varadarajan, N., J. Gam, M. J. Olsen, G. Georgiou, and B. L. Iverson. 2005. Engineering of protease variants exhibiting high catalytic activity and exquisite substrate selectivity. Proc Natl Acad Sci U S A 102 (19):6855-6860.
Wen, F., N. U. Nair, and H. Zhao. 2009. Protein engineering in designing tailored enzymes and microorganisms for biofuels production. Curr Opin Biotechnol 20 (4):412-419.
Wikman, M., A. C. Steffen, E. Gunneriusson, V. Tolmachev, G. P. Adams, J. Carlsson, and S. Stahl. 2004. Selection and characterization of HER2/neu-binding affibody ligands. Protein Eng Des Sel 17 (5):455-462.
Wittrup, K. D. 2001. Protein engineering by cell-surface display. Curr Opin Biotechnol 12 (4):395-399.



To cite this article, please use the following information:

(use the given format or any standard citation format)

Hussain, M., Protein Engineering 101, ChE Thoughts 2 (2), 6-12, 2011.

