ALIGNER: Detecting and Aligning Related Protein Sequences



Clustering of Protein Sequences Based on SMS, a New Similarity Measure: here.




SIPA was developed within the framework of the data mining research tasks of the ProspectUs laboratory.






Last update: 16 February 2010



Motivation: Existing protein alignment approaches are usually either global or local. Global approaches are the most successful, except for sequences with similar regions in remote positions (such as N/C-terminal extensions and internal insertions) or multi-modular sequences. However, it remains difficult to decide which approach is the most appropriate without prior knowledge of the structure of the proteins. In addition, these approaches do not check whether the sequences share enough conserved regions to produce biochemically significant alignments. As a result, biologists usually have to curate input protein datasets manually, discarding divergent sequences.
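
The difference between the two families can be illustrated with a minimal sketch. It is not ALIGNER's method: it uses the textbook Needleman-Wunsch (global) and Smith-Waterman (local) recurrences, and the function names, the toy sequences, and the scoring values (match +1, mismatch -1, gap -1) are all assumptions chosen for illustration.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score: both sequences are aligned end to end."""
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                            prev[j] + gap,      # gap in b
                            curr[j - 1] + gap)) # gap in a
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman score: best-scoring pair of substrings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Flooring at 0 lets an alignment start anywhere in either sequence.
            cell = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# Toy sequences (hypothetical): the shared region "WWWW" sits at the start
# of one sequence and at the end of the other.
seq_a = "WWWW" + "KKKKKK"
seq_b = "GGGGGG" + "WWWW"
print("global:", global_score(seq_a, seq_b))
print("local: ", local_score(seq_a, seq_b))
```

On these toy sequences the local score rewards the shared region, whereas the global score is negative: aligning the shared region end to end forces gaps opposite the flanking residues. This is the remote-position case described above.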

Results: We developed ALIGNER, an approach that effectively aligns protein sequences requiring either global or local alignment. ALIGNER aligns the entire length of the sequences while paying particular attention to local similarities. ALIGNER detects significant patterns underlying functional properties and discards patterns that occur by chance. In addition, ALIGNER detects and aligns structurally similar protein sequences, which facilitates the discovery of conserved regions when divergent proteins are concerned. Experimental assays showed that ALIGNER outperforms almost all alignment approaches, whether global or local, in discovering conserved regions in protein sequences, especially in sequences that cause difficulties for sequence-based approaches.