Rolf Muertter









Needleman-Wunsch Algorithm

Let's say you have two genetic sequences from two different individuals, and you want to know whether they represent the same gene. How would you go about it? You could let your computer do a simple string comparison, but then you would almost never get a match, even if the two nucleotide sequences do come from the same gene. That's because there is so much genetic variation, especially if the individuals come from two different species. What you need is a more sophisticated approach, one that will give you a good match even if the sequences have diverged in the course of evolution through mutations such as insertions, deletions or nucleotide changes.

This is what the Needleman-Wunsch algorithm does. It finds an optimal alignment between two sequences under a given scoring scheme, and it assumes that the sequences are similar along their entire length. In other words, it is a global alignment algorithm. The Smith-Waterman algorithm is often more useful in practice, because it doesn't assume that the sequences are similar everywhere; instead, it lines up the sections of the two sequences that match well, which makes it a local alignment algorithm. I implemented the Needleman-Wunsch algorithm in C++ based on its description in Biological Sequence Analysis (Durbin, Eddy, Krogh and Mitchison), published in 1998. The book is an excellent introduction to biological sequence analysis, including other applications of dynamic programming, such as the algorithms used with hidden Markov models.
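
To make the idea of global alignment concrete, here is a minimal sketch of the dynamic programming approach in C++. It is not the original program from this page. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the names Alignment and needlemanWunsch are assumptions chosen for illustration; a real application would more likely use a substitution matrix and possibly affine gap penalties.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Result of a global alignment: the optimal score and both gapped sequences.
struct Alignment {
    int score;
    std::string alignedA;
    std::string alignedB;
};

// Illustrative scoring parameters (assumptions, not taken from the original program).
constexpr int kMatch = 1;
constexpr int kMismatch = -1;
constexpr int kGap = -1;  // linear gap penalty: every gap position costs the same

// Score for aligning two residues with each other.
inline int substitutionScore(char x, char y) {
    return x == y ? kMatch : kMismatch;
}

Alignment needlemanWunsch(const std::string& a, const std::string& b) {
    const std::size_t n = a.size();
    const std::size_t m = b.size();

    // F[i][j] = best score for aligning the first i residues of a
    // with the first j residues of b.
    std::vector<std::vector<int>> F(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = 1; i <= n; ++i) F[i][0] = static_cast<int>(i) * kGap;
    for (std::size_t j = 1; j <= m; ++j) F[0][j] = static_cast<int>(j) * kGap;

    // Fill the matrix: each cell takes the best of three possible moves.
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const int diag = F[i - 1][j - 1] + substitutionScore(a[i - 1], b[j - 1]);
            const int up = F[i - 1][j] + kGap;    // a[i-1] aligned to a gap
            const int left = F[i][j - 1] + kGap;  // b[j-1] aligned to a gap
            F[i][j] = std::max({diag, up, left});
        }
    }

    // Traceback from the bottom-right corner to recover one optimal alignment.
    std::string alignedA;
    std::string alignedB;
    std::size_t i = n;
    std::size_t j = m;
    while (i > 0 || j > 0) {
        if (i > 0 && j > 0 &&
            F[i][j] == F[i - 1][j - 1] + substitutionScore(a[i - 1], b[j - 1])) {
            alignedA += a[i - 1];
            alignedB += b[j - 1];
            --i;
            --j;
        } else if (i > 0 && F[i][j] == F[i - 1][j] + kGap) {
            alignedA += a[i - 1];
            alignedB += '-';
            --i;
        } else {
            alignedA += '-';
            alignedB += b[j - 1];
            --j;
        }
    }
    // The traceback builds the alignment back to front.
    std::reverse(alignedA.begin(), alignedA.end());
    std::reverse(alignedB.begin(), alignedB.end());

    return Alignment{F[n][m], alignedA, alignedB};
}
```

Calling needlemanWunsch("ACGT", "AGT") with these assumed scores gives a score of 2 and the alignment ACGT over A-GT. When several alignments share the optimal score, the traceback order used here (diagonal first, then a gap in the second sequence, then a gap in the first) decides which one is reported; that order is a design choice of this sketch. The matrix uses memory proportional to the product of the two sequence lengths, which is fine for short sequences like the examples on this page.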

Run the program:

Enter first sequence: e.g. ACCGTTTACGT
Enter second sequence: e.g. AGCGTTACAT


C++ Source Code:

© Rolf Muertter, muertter@gmail.com