DNA Microbial and Viral Identification using Ultraspecific Probes "Blind" to Host and Background DNA
Catherine Putonti, (University of Houston), putonti@bioinfo.uh.edu,
George E. Fox, (University of Houston), fox@uh.edu,
Richard C. Willson, (University of Houston), willson@uh.edu, and
Yuriy Fofanov, (University of Houston), yfofanov@bioinfo.uh.edu
Abstract
Reliably detecting and identifying microbes and viruses in complex samples, without first separating the DNA of the organism of interest from the sample background, is a challenging and important problem. We have developed a set of novel algorithms that make it feasible to analyze the occurrence of all possible short sequences of length 10 to 25 nucleotides in complete genome sequences of any size. As a result, we can identify all signature sequences that are present in each of a large set of pathogen genomes and absent from the human genome, meaning that no human sequence matches them exactly or with up to three mismatches. We found that it is unusual for a single genomic sequence to be present in all genomes of interest and absent from all other genomes, including that of the host organism, even for groups of closely related organisms (e.g., the West Nile virus). This result leads us to suggest identifying pathogens with a set of probes that are absent in the host genome, likely to be found in the pathogen genome, and expressed in a unique pattern for each pathogen. Herein we use an evolutionary programming approach to design microarrays so as to minimize the number of probes required, avoid false positives, and achieve maximal sensitivity. Initial in silico and in vitro microarray experimental results are provided in support of the proposed approach.
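The signature criterion described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. It is a brute-force illustration suitable only for toy sequences, and it makes several assumptions: the function names (`kmers`, `hamming`, `signature_kmers`, `presence_pattern`) are hypothetical, only one strand is considered (no reverse complements), and "likely to be found" is simplified to exact substring presence.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def signature_kmers(pathogen, host, k, max_mismatches):
    """Return pathogen k-mers with no host k-mer within max_mismatches.

    Brute force: every pathogen k-mer is compared with every host k-mer,
    so this is only practical for very small inputs (assumption: toy scale).
    """
    host_kmers = kmers(host, k)
    return {
        p for p in kmers(pathogen, k)
        if all(hamming(p, h) > max_mismatches for h in host_kmers)
    }


def presence_pattern(probes, genome):
    """Return a tuple of booleans: which probes occur exactly in genome."""
    return tuple(probe in genome for probe in probes)
```

For whole genomes and k between 10 and 25, the pairwise comparison above is infeasible, which is the problem that dedicated algorithms of the kind the abstract mentions are meant to address. The `presence_pattern` helper reflects the idea that a pathogen is identified by which probes in a set are present, rather than by any single unique probe.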