User‐friendly algorithms for estimating completeness and diversity in randomized protein‐encoding libraries
Authors: Patrick, Wayne M.; Firth, Andrew E.; Blackburn, Jonathan M.
Source: Protein Engineering, Volume 16, Number 6, June 2003, pp. 451-457(7)
Publisher: Oxford University Press
Abstract:
Directed evolution of proteins depends on the production of molecular diversity by random mutagenesis. While a number of methods have been developed for introducing this diversity, the best ways to sample it are not always clear. Here we used simple statistics to analyse completeness and diversity in randomized libraries generated by oligonucleotide‐directed mutagenesis, error‐prone polymerase chain reaction (epPCR) and in vitro recombination of highly homologous sequences. For oligonucleotide‐directed mutagenesis, we derive equations to estimate how complete a given library is expected to be and also to predict the size of library required to give a fixed probability of being 100% complete. We describe the statistical bases for computer programs which estimate the number of distinct variants represented in epPCR and shuffled libraries, dubbed PEDEL and DRIVeR, respectively. These programs allow the user to calculate (rather than guess) the diversity represented in a given library and also provide empirical guidelines for maximizing this diversity. PEDEL and DRIVeR are available at www.bio.cam.ac.uk/~blackburn/stats.html.
Keywords: completeness/diversity/Poisson statistics/randomiz
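The two library-completeness calculations mentioned in the abstract can be illustrated with the textbook Poisson approximation. The sketch below is not taken from the paper: it assumes that all V possible variants are equally likely and that the library is a random sample of L independent clones, and every function name is hypothetical. The equations derived in the article, and the PEDEL and DRIVeR programs, may differ in detail.

```python
import math


def expected_completeness(library_size, num_variants):
    """Expected fraction of the possible variants present in the library.

    Assumption (not from the source): equiprobable variants, so the number of
    copies of any one variant is approximately Poisson with mean L/V.
    """
    return 1.0 - math.exp(-library_size / num_variants)


def library_size_for_completeness(num_variants, fraction):
    """Library size L at which the expected completeness equals `fraction`."""
    return -num_variants * math.log(1.0 - fraction)


def prob_fully_complete(library_size, num_variants):
    """Approximate probability that every one of the V variants is present.

    The number of missing variants is treated as Poisson with mean
    V * exp(-L/V), so the probability of zero missing variants is
    exp(-V * exp(-L/V)).
    """
    return math.exp(-num_variants * math.exp(-library_size / num_variants))


def library_size_for_full_coverage(num_variants, probability):
    """Library size L giving the requested probability of 100% completeness.

    Obtained by solving exp(-V * exp(-L/V)) = probability for L.
    """
    return -num_variants * math.log(-math.log(probability) / num_variants)


if __name__ == "__main__":
    # Illustrative case only: three positions randomized with 32 codons each.
    v = 32 ** 3
    size_95 = library_size_for_completeness(v, 0.95)
    size_full = library_size_for_full_coverage(v, 0.95)
    print(f"variants: {v}")
    print(f"clones for 95% expected completeness: {size_95:.0f}")
    print(f"clones for 95% chance of full coverage: {size_full:.0f}")
```

Under these assumptions, reaching 95% expected completeness takes roughly threefold oversampling (L is about 3V, since -ln 0.05 is close to 3), whereas a 95% chance of containing every variant takes considerably more, because the required oversampling grows with ln V.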
Document Type: Research article
DOI: http://dx.doi.org/10.1093/protein/gzg057
Publication date: 2003-06-01