Scoring functions for computational algorithms applicable to the design of spiked oligonucleotides

Research output: Contribution to journal › Journal article › Research › peer-review

Jensen, Lars Juhl
K V Andersen
A Svendsen
T Kretzschmar

Protein engineering by inserting stretches of random DNA sequence into target genes, combined with adequate screening or selection methods, is a versatile technique for elucidating and improving protein function. Spiked oligonucleotides are established reagents for generating semi-random DNA sequences; they are synthesised by interspersing the wild-type (wt) nucleotides of the target sequence with certain amounts of other nucleotides. Compared with completely random libraries, directed spiking strategies reduce the complexity of a library to a manageable size. Computational algorithms make it feasible to calculate nucleotide mixtures that encode specified amino acid subpopulations. The crucial element in ranking the spiked codons generated during an iterative algorithm is the scoring function. In this report three scoring functions are analysed: the sum-of-square-differences function s, a modified cubic function c, and a scoring function m derived from maximum likelihood considerations. The impact of these scoring functions on the calculated amino acid distributions is demonstrated by an example in which a domain surrounding the active site serine of subtilisin-like proteases is mutagenised. At the default weight setting of one for each amino acid, the new scoring function m is superior to functions s and c in finding matches to a given amino acid population.
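
As a rough illustration of what scoring a spiked codon involves, the sketch below computes the amino acid distribution encoded by one spiked codon and scores it against a target population with a weighted sum of squared differences. It is a minimal sketch under stated assumptions, not the implementation from the article: the exact forms of s, c and m are not given in this record, and the function names, the weighting scheme, and the treatment of stop codons as one more category ("*") are all illustrative choices.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_distribution(spiked_codon):
    """Return {amino acid: probability} for a spiked codon.

    spiked_codon is a list of three dicts, one per codon position,
    each mapping a base to its fraction in the nucleotide mixture.
    """
    dist = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for pos, base in enumerate(codon):
            p *= spiked_codon[pos].get(base, 0.0)
        if p > 0.0:
            dist[aa] = dist.get(aa, 0.0) + p
    return dist


def sum_of_squares_score(dist, target, weights=None):
    """Weighted sum of squared differences (assumed form; lower is better).

    Amino acids missing from `weights` get the default weight of one.
    """
    weights = weights or {}
    keys = set(dist) | set(target)
    return sum(
        weights.get(k, 1.0) * (dist.get(k, 0.0) - target.get(k, 0.0)) ** 2
        for k in keys
    )


# Hypothetical example: keep T at positions 1 and 3, spike position 2
# with an equal mix of C and G, which encodes Ser (TCT) and Cys (TGT).
codon = [{"T": 1.0}, {"C": 0.5, "G": 0.5}, {"T": 1.0}]
dist = amino_acid_distribution(codon)
score = sum_of_squares_score(dist, {"S": 1.0})
```

In an iterative search, a score of this kind would be evaluated for each candidate nucleotide mixture and the candidates ranked by it; swapping in a different scoring function changes only `sum_of_squares_score`.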

Original language	English
Journal	Nucleic Acids Research
Volume	26
Issue number	3
Pages (from-to)	697-702
Number of pages	6
ISSN	0305-1048
Publication status	Published - 1998
Externally published	Yes

ID: 40750041