Tetsuo Shibuya
Human Genome Center, Institute of Medical Science, University of Tokyo
SMAD (Statistical Model-based Algorithm Design) is a strategy that we propose
to use for designing faster algorithms against inputs that follow some
statistical model.
For example, we designed very fast expected linear time searching algorithms
[1][2] for the fundamental problem of searching over protein 3-D structure
databases, considering that the 3-D structures in the databases follow
the famous statistical molecular model called the `freely-jointed chain
model'. The algorithm in [1] runs in O(Nm) time for arbirary inputs in the worst case (where N is the database size and m is the query size), but it runs in O(N) time in average against the databases that statistically follows the
model. (According to the experiments on the PDB database, the algorithm
*actually* runs in linear time and is practically much faster than previous
algorithms.)
Reference
[1] Tetsuo Shibuya, Searching Protein 3-D Structures in Linear Time, Proc.
13th Annual International Conference on Research in Computational Molecular
Biology (RECOMB 2009), 2009 LNBI 5541, pp 1-15. (Best Paper Award) (PDF file)
[2] Tetsuo Shibuya, Jesper Jansson and Kunihiko Sadakane, Linear-Time Protein
3-D Structure Searching with Insertions and Deletions, 9th Workshop on
Algorithms in Bioinformatics (WABI 2009), LNCS, vol. 5724, pp. 310-320,
2009.
[3] Tetsuo Shibuya, Jesper Jansson, and Kunihiko Sadakane, Linear-Time
Protein 3-D Structure Searching with Insertions and Deletions, BMC Algorithms
for Molecular Biology, to appear.
[4] Tetsuo Shibuya, Searching Protein 3-D Structures in Linear Time, Journal
of Computational Biology, to appear.
[5] Tetsuo Shibuya, Searching Protein 3-D Structures in Faster Than Linear
Time, Journal of Computational Biology, to appear.