## Type of Work

Article (12) Book (0) Theses (0) Multimedia (0)

## Peer Review

Peer-reviewed only (12)

## Supplemental Material

Video (0) Audio (0) Images (0) Zip (0) Other files (0)

## Publication Year

## Campus

UC Berkeley (1) UC Davis (0) UC Irvine (1) UCLA (0) UC Merced (0) UC Riverside (6) UC San Diego (0) UCSF (0) UC Santa Barbara (0) UC Santa Cruz (0) UC Office of the President (1) Lawrence Berkeley National Laboratory (4) UC Agriculture & Natural Resources (0)

## Department

Bourns College of Engineering (3) Research Grants Program Office (RGPO) (1)

## Journal

## Discipline

## Reuse License

BY - Attribution required (1)

## Scholarly Works (12 results)

We introduce a new combinatorial optimization problem in this paper, called the Minimum Common Integer Partition (MCIP) problem, which was inspired by computational biology applications including ortholog assignment and DNA fingerprint assembly. A partition of a positive integer n is a multiset of positive integers that add up to exactly n, and an integer partition of a multiset S of integers is defined as the multiset union of partitions of integers in S. Given a sequence of multisets S_1,(...),S_k of integers, where k >= 2, we say that a multiset is a common integer partition if it is an integer partition of every multiset S_i, 1 = 3.

The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics. Existing methods that assign orthologs based on the similarity between DNA or protein sequences may make erroneous assignments when sequence similarity does not clearly delineate the evolutionary relationship among genes of the same families. In this paper, we present a new approach to ortholog assignment that takes into account both sequence similarity and evolutionary events at a genome level, where orthologous genes are assumed to correspond to each other in the most parsimonious evolving scenario under genome rearrangement. First, the problem is formulated as that of computing the signed reversal distance with duplicates between the two genomes of interest. Then, the problem is decomposed into two new optimization problems, called minimum common partition and maximum cycle decomposition, for which efficient heuristic algorithms are given. Following this approach, we have implemented a high-throughput system for assigning orthologs on a genome scale, called SOAR, and tested it on both simulated data and real genome sequence data. Compared to a recent ortholog assignment method based entirely on homology search (called INPARANOID), SOAR shows a marginally better performance in terms of sensitivity on the real data set because it is able to identify several correct orthologous pairs that are missed by INPARANOID. The simulation results demonstrate that SOAR, in general, performs better than the iterated exemplar algorithm in terms of computing the reversal distance and assigning correct orthologs.

Background: Expressed sequence tag (EST) datasets represent perhaps the largest collection of genetic information. ESTs can be exploited in a variety of biological experiments and analysis. Here we are interested in the design of overlapping oligonucleotide (overgo) probes from large unigene (EST-contigs) datasets. Results: OLIGOSPAWN is a suite of software tools that offers two complementary services, namely (1) the selection of "unique" oligos each of which appears in one unigene but does not occur (exactly or approximately) in any other and (2) the selection of "popular" oligos each of which occurs (exactly or approximately) in as many unigenes as possible. In this paper, we describe the functionalities of OLIGOSPAWN and the computational methods it employs, and we report on experimental results for the overgo probes designed with it. Conclusion: The algorithms we designed are highly efficient and capable of processing unigene datasets of sizes on the order of several tens of Mb in a few hours on a regular PC. The software has been used to design overgo probes employed to screen a barley BAC library (Hordeum vulgare). OLIGOSPAWN is freely available at http://oligospawn.ucr.edu/.