Ptitsyn, A. and Hide, W. orcid.org/0000-0002-8621-3271 (2005) CLU: a new algorithm for EST clustering. BMC Bioinformatics, 6 (Suppl 2). S3.
Abstract
BACKGROUND: The continuous flow of EST data remains one of the richest sources for discoveries in modern biology. The first step in EST data mining is usually EST clustering: grouping the original fragments according to their annotation, their similarity to known genomic DNA, or their similarity to each other. Clustered EST data, accumulated in databases such as UniGene, STACK and TIGR Gene Indices, have proven crucial in research areas from gene discovery to regulation of gene expression.
RESULTS: We have developed a new nucleotide sequence matching algorithm and its implementation for clustering EST sequences. The program is based on the original CLU match detection algorithm, which has improved performance over the widely used d2_cluster. The CLU algorithm automatically ignores low-complexity regions like poly-tracts and short tandem repeats.
CONCLUSION: CLU represents a new generation of EST clustering algorithm with improved performance over current approaches. An early implementation can be applied in small and medium-size projects. The CLU program is available on an open source basis free of charge. It can be downloaded from http://compbio.pbrc.edu/pti.
Metadata
Item Type: | Article |
---|---|
Authors/Creators: | Ptitsyn, A.; Hide, W. (orcid.org/0000-0002-8621-3271) |
Copyright, Publisher and Additional Information: | © Ptitsyn and Hide; licensee BioMed Central Ltd. 2006. This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
Dates: | |
Institution: | The University of Sheffield |
Academic Units: | The University of Sheffield > Faculty of Medicine, Dentistry and Health (Sheffield) > Department of Neuroscience (Sheffield) |
Depositing User: | Symplectic Sheffield |
Date Deposited: | 12 May 2017 14:48 |
Last Modified: | 12 May 2017 14:48 |
Published Version: | https://doi.org/10.1186/1471-2105-6-S2-S3 |
Status: | Published |
Publisher: | BioMed Central Ltd |
Refereed: | Yes |
Identification Number: | 10.1186/1471-2105-6-S2-S3 |
Related URLs: | |
Open Archives Initiative ID (OAI ID): | oai:eprints.whiterose.ac.uk:115755 |