Bioinformatics Multiple Choice Questions on “Exhaustive Algorithms”.
1. Related sequences are identified through database similarity searching, and because the process generates multiple matching sequence pairs, it is often necessary to convert the numerous pairwise alignments into a single alignment. Answer: A
2. Multiple sequence alignment has a unique advantage because it reveals more biological information than many pairwise alignments can. Answer: A
3. Which of the following cannot be related to multiple sequence alignment? Answer: D
4. The scoring function for multiple sequence alignment is based on the concept of sum of pairs (SP). Answer: A
5. Which of the following scores is not considered while calculating the SP scores? Answer: D
6. Given a multiple alignment of three sequences, the sum of scores is calculated as the sum of the dissimilarity scores of every pair of sequences at each position. Answer: B
7. Two approaches, exhaustive and heuristic, are used in multiple sequence alignment. Answer: A
8. In a multidimensional search matrix, for aligning N sequences, an (N+2)-dimensional matrix needs to be filled with alignment scores. Answer: B
9. As the amount of computational time and memory space required increases exponentially with the number of sequences, the multidimensional search matrix method is computationally prohibitive for a large data set. Answer: A
10. Which of the following is untrue about DCA? Answer: D
A. True
B. False
Explanation: A natural extension of pairwise alignment is multiple sequence alignment, which aligns multiple related sequences to achieve optimal matching of the sequences. Related sequences are identified through database similarity searching. As the process generates multiple matching sequence pairs, it is often necessary to convert the numerous pairwise alignments into a single alignment, which arranges sequences in such a way that evolutionarily equivalent positions across all sequences are matched.
A. True
B. False
Explanation: This is a genuine advantage of multiple sequence alignment. For example, it allows the identification of conserved sequence patterns and motifs in the whole sequence family, which are not easy to detect by comparing only two sequences.
A. Many conserved and functionally critical amino acid residues can be identified in a protein multiple alignment
B. Multiple sequence alignment is also an essential prerequisite to carrying out phylogenetic analysis of sequence families and prediction of protein secondary and tertiary structures
C. Multiple sequence alignment also has applications in designing degenerate polymerase chain reaction (PCR) primers based on multiple related sequences
D. This method does not contribute much to degenerate polymerase chain reaction (PCR) primers creation
Explanation: Multiple sequence alignment has applications in designing degenerate PCR primers based on multiple related sequences, so option D contradicts this and cannot be related to it. In practice, heuristic approaches are most often used.
A. True
B. False
Explanation: Multiple sequence alignment arranges sequences so that a maximum number of residues from each sequence are matched up according to a particular scoring function, which is based on the concept of sum of pairs (SP). As the name suggests, SP is the sum of the scores of all possible pairs of sequences in a multiple alignment based on a particular scoring matrix.
A. All possible pairwise matches
B. All possible mismatches
C. All possible gap costs
D. Number of gap penalties
Explanation: In calculating the SP scores, each column is scored by summing the scores for all possible pairwise matches, mismatches and gap costs. The score of the entire alignment is the sum of all of the column scores. The number of gap penalties is not itself a score, so option D is the irrelevant choice here.
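The column-by-column calculation described above can be sketched in Python. This is a minimal illustration, not the scoring used by any particular program: the values match +1, mismatch -1, residue-versus-gap -2 and gap-versus-gap 0 are assumptions chosen for readability, and the names pair_score, column_score and sp_score are hypothetical. A real scoring scheme would look residue pairs up in a substitution matrix instead.

```python
from itertools import combinations

# Assumed toy scoring scheme (not from the source):
# match +1, mismatch -1, residue-vs-gap -2, gap-vs-gap 0.
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a, b):
    """Score one pair of symbols taken from the same alignment column."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def column_score(column):
    """Sum the scores of every pair of symbols in one column."""
    return sum(pair_score(a, b) for a, b in combinations(column, 2))


def sp_score(alignment):
    """Sum of pairs score: the sum of all column scores.

    `alignment` is a list of equal-length strings, with '-' marking gaps.
    """
    return sum(column_score(col) for col in zip(*alignment))


alignment = ["ACG-", "ACGT", "A-GT"]
print(sp_score(alignment))  # columns score 3, -3, 3, -3, so this prints 0
```

Matches, mismatches and gap costs all enter through pair_score, while a count of gap penalties never appears as a separate term.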
A. True
B. False
Explanation: Given a multiple alignment of three sequences, the sum of scores is calculated as the sum of the similarity scores, not the dissimilarity scores, of every pair of sequences at each position. The scoring is based on the BLOSUM62 matrix. If the total score for the alignment is 5, the alignment is 2^5 = 32 times more likely to occur among homologous sequences than by random chance.
A. True
B. False
Explanation: The exhaustive alignment method involves examining all possible aligned positions simultaneously. Dynamic programming in pairwise alignment uses a two-dimensional matrix to search for an optimal alignment. To use dynamic programming for multiple sequence alignment, extra dimensions are needed to take all possible ways of sequence matching into consideration.
A. True
B. False
Explanation: For aligning N sequences in a multidimensional search matrix, an N-dimensional matrix, not an (N+2)-dimensional one, must be filled with alignment scores. For instance, for three sequences, a three-dimensional matrix is required to account for all possible alignment scores. Back-tracking is applied through the three-dimensional matrix to find the highest-scoring path, which represents the optimal alignment.
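The three-sequence case can be sketched as a three-dimensional dynamic programming table. The code below is a minimal illustration under stated assumptions: it uses a toy sum-of-pairs scheme (match +1, mismatch -1, residue-versus-gap -2, gap-versus-gap 0) rather than a substitution matrix, the function names are hypothetical, and it returns only the optimal score. The back-tracking step that recovers the alignment itself is left out to keep the sketch short.

```python
from itertools import product

# Assumed toy scoring scheme (not from the source).
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a, b):
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def column_score(col):
    a, b, c = col
    return pair_score(a, b) + pair_score(a, c) + pair_score(b, c)


def align3_score(x, y, z):
    """Optimal sum-of-pairs score for three sequences via a 3-D table."""
    n, m, p = len(x), len(y), len(z)
    neg_inf = float("-inf")
    # table[i][j][k]: best score for the prefixes x[:i], y[:j], z[:k]
    table = [[[neg_inf] * (p + 1) for _ in range(m + 1)] for _ in range(n + 1)]
    table[0][0][0] = 0
    # Each move consumes a residue (1) or inserts a gap (0) per sequence;
    # the all-gap move is excluded, leaving 2**3 - 1 = 7 moves.
    moves = [mv for mv in product((0, 1), repeat=3) if any(mv)]
    # Lexicographic order guarantees every predecessor cell is filled first.
    for i, j, k in product(range(n + 1), range(m + 1), range(p + 1)):
        if (i, j, k) == (0, 0, 0):
            continue
        best = neg_inf
        for di, dj, dk in moves:
            pi, pj, pk = i - di, j - dj, k - dk
            if pi < 0 or pj < 0 or pk < 0:
                continue
            col = (
                x[pi] if di else "-",
                y[pj] if dj else "-",
                z[pk] if dk else "-",
            )
            best = max(best, table[pi][pj][pk] + column_score(col))
        table[i][j][k] = best
    return table[n][m][p]


print(align3_score("ACG", "ACG", "AG"))
```

The table has one axis per sequence, which is why N sequences need an N-dimensional matrix.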
A. True
B. False
Explanation: This is indeed the drawback of that method. For this reason, full dynamic programming is limited to small datasets of fewer than ten short sequences. For the same reason, few multiple alignment programs employing this “brute force” approach are publicly available.
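A rough count makes the exponential growth concrete. The sketch below assumes one table cell per combination of sequence prefixes, so sequences of lengths L1, ..., LN need (L1+1) x ... x (LN+1) cells, and it assumes each cell considers every non-empty choice of sequences to advance, which gives 2^N - 1 candidate moves. The function names are hypothetical.

```python
def dp_cells(lengths):
    """Number of cells in the N-dimensional table, one axis per sequence."""
    cells = 1
    for length in lengths:
        cells *= length + 1  # prefixes of length 0..length
    return cells


def moves_per_cell(n_seqs):
    """Candidate predecessor moves per cell: every non-empty subset of
    sequences that advances by one residue."""
    return 2 ** n_seqs - 1


for n_seqs in (2, 3, 5, 10):
    lengths = [100] * n_seqs
    print(n_seqs, dp_cells(lengths), moves_per_cell(n_seqs))
```

Under these assumptions, two sequences of 100 residues need about ten thousand cells, while ten such sequences need more than 10^20.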
A. It stands for Divide-and-Conquer Alignment
B. It works by breaking each of the sequences into two smaller sections
C. The breaking points during the process are determined based on regional similarity of the sequences
D. If the sections are not short enough, further divisions are restricted as well
Explanation: DCA is a web-based program that is in fact semi-exhaustive because certain steps of computation are reduced to heuristics. If the sections are not short enough, further divisions are carried out rather than restricted. When the lengths of the sequences reach a predefined threshold, dynamic programming is applied for aligning each set of subsequences. The resulting short alignments are joined together head to tail to yield a multiple alignment of the entire length of all sequences.
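The divide, align and join control flow described above can be sketched as a short recursion. This is only an outline under loud assumptions and is not the DCA program's actual method: the cut points here are naive midpoints, whereas DCA chooses them from regional similarity, and the base-case aligner pad_align is a placeholder that merely pads with gaps where DCA would run dynamic programming. All names are hypothetical.

```python
def pad_align(segments):
    """Placeholder base-case aligner (assumption): right-pad with gaps.

    A real implementation would run exact dynamic programming here.
    """
    width = max(len(s) for s in segments)
    return [s.ljust(width, "-") for s in segments]


def dca_sketch(seqs, threshold=4, base_align=pad_align):
    """Recursively split sequences, align short pieces, join head to tail."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    # Base case: every section is short enough to align directly.
    if max(len(s) for s in seqs) <= threshold:
        return base_align(seqs)
    # Naive midpoint cuts (assumption); DCA picks similarity-based cuts.
    cuts = [len(s) // 2 for s in seqs]
    left = dca_sketch([s[:c] for s, c in zip(seqs, cuts)], threshold, base_align)
    right = dca_sketch([s[c:] for s, c in zip(seqs, cuts)], threshold, base_align)
    # Join the two partial alignments head to tail, row by row.
    return [l + r for l, r in zip(left, right)]


seqs = ["ACGTACGT", "ACGTACG", "ACGACGT"]
for row in dca_sketch(seqs):
    print(row)
```

Passing a different base_align function, such as an exact aligner for short sections, changes the quality of the pieces without changing the recursion.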