Bioinformatics Multiple Choice Questions on “Performance Evaluation”.
1. Rigorously evaluating the performance of RNA prediction programs has traditionally been hindered by the dearth of three-dimensional structural information for RNA. Answer: A
2. If prediction accuracy can be represented using a ______, the ______ programs score roughly 20% to 60% depending on the length of the sequences. Answer: B
3. For _________ RNA sequences, such as tRNA, some programs may be able to produce _________% accuracy. Answer: A
4. The pre-alignment independent programs fare __________ for predicting long sequences. Answer: D
5. Based on recent benchmark comparisons, the comparative-type algorithms can reach an accuracy range of 20% to 80%. Answer: A
6. In the comparative approach to RNA structure prediction, algorithms that do not use pre-alignment align multiple input sequences and infer a consensus structure. Answer: A
7. In the comparative approach to RNA structure prediction, Foldalign is a web-based only program for RNA alignment. Answer: B
8. In the comparative approach to RNA structure prediction, the Foldalign program doesn’t use the covariation information. Answer: B
9. In the comparative approach to RNA structure prediction, Dynalign is a ________ program. Answer: C
10. In the comparative approach to RNA structure prediction, in the Dynalign program, by comparing _________ from each sequence, a ______ structure common to both sequences is selected that serves as the basis for sequence alignment. Answer: A
A. True
B. False
Explanation: The availability of recently solved crystal structures of the entire ribosome provides a wealth of structural details relating to diverse types of RNA molecules. This high-resolution structural information can then be used as a benchmark for evaluating state-of-the-art RNA structure prediction programs in all categories.
A. multiple parameter, ab initio–based
B. single parameter, ab initio–based
C. multiple parameter, comparative–based
D. single parameter, comparative–based
Explanation: As mentioned, the scores depend on the length of the sequences. Generally speaking, the programs perform better for shorter RNA sequences than for longer ones.
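A minimal sketch of what a single-parameter accuracy score could look like. The quiz does not say which parameter the benchmarks used; this example assumes the fraction of reference base pairs that a prediction recovers, with structures written in dot-bracket notation. The function names `parse_pairs` and `pair_accuracy` are hypothetical.

```python
def parse_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '('")
    return pairs


def pair_accuracy(predicted, reference):
    """Fraction of reference base pairs recovered by the prediction (assumed metric)."""
    ref = parse_pairs(reference)
    pred = parse_pairs(predicted)
    if not ref:
        return 0.0
    return len(ref & pred) / len(ref)


# Toy hairpin: the reference has four pairs, the prediction recovers three.
reference = "((((....))))"
predicted = "(((......)))"
print(pair_accuracy(predicted, reference))  # 0.75
```

A score of this kind says nothing about falsely predicted pairs, so benchmarks often report a second measure alongside it.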
A. small, 70
B. small, 40
C. large, 90
D. large, 75
Explanation: The exact percentage may vary, but the qualitative point is that for small RNA sequences some programs may produce better accuracy. The major limitation for performance gains of this category appears to be dependence on energy parameters alone, which may not be sufficient to distinguish different structural possibilities of the same molecule.
A. slightly better
B. much better
C. a bit worse
D. much worse
Explanation: For small RNA sequences such as tRNA, both subtypes can achieve very high accuracy (up to 100%). This illustrates that the comparative approach is consistently more accurate than the ab initio one.
A. True
B. False
Explanation: The results depend on whether a program is pre-alignment dependent or not. Most of the superior performance comes from pre-alignment-dependent programs such as RNAalifold.
A. True
B. False
Explanation: The alignment is produced using dynamic programming with a scoring scheme that incorporates sequence similarity as well as energy terms. Because the full dynamic programming for multiple alignment is computationally too demanding, currently available programs limit the input to two sequences.
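The dynamic programming idea can be sketched for the two-sequence case. This is a simplified illustration, not the algorithm of any named program: it is a plain global alignment (Needleman-Wunsch) that scores sequence similarity only and leaves out the energy terms mentioned above. The function name and the match, mismatch, and gap values are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of sequences a and b (similarity terms only)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align two residues, or gap in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    return score[-1][-1]


print(global_alignment_score("GAUC", "GAC"))  # 1: three matches and one gap
```

The table has one cell per pair of positions, which is why extending the same scheme to many sequences at once quickly becomes too expensive.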
A. True
B. False
Explanation: Foldalign is a web-based program for RNA alignment and structure prediction. The user provides a pair of unaligned sequences.
A. True
B. False
Explanation: The program uses a combination of Clustal and dynamic programming with a scoring scheme that includes covariation information to construct the alignment. A commonly conserved structure for both sequences is subsequently derived based on the alignment. To reduce computational complexity, the program ignores multi-branch loops and is only suitable for handling short RNA sequences.
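To make "covariation information" concrete, here is a small sketch that measures how strongly two alignment columns vary together. It uses mutual information, a commonly used covariation measure; this is an assumption for illustration and is not taken from Foldalign's actual scoring scheme. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment."""
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    joint = Counter((seq[i], seq[j]) for seq in alignment)

    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = col_i[x] / n
        p_y = col_j[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Toy alignment: columns 0 and 4 change together (G-C, A-U, C-G, U-A),
# while column 1 is fully conserved.
alignment = ["GAAAC", "AAAAU", "CAAAG", "UAAAA"]
print(column_mutual_information(alignment, 0, 4))  # 2.0
print(column_mutual_information(alignment, 0, 1))  # 0.0
```

A high value between two columns suggests compensatory changes that preserve a base pair, whereas a conserved column carries no covariation signal.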
A. Windows based
B. Fedora
C. UNIX
D. iOS based
Explanation: Dynalign is a UNIX program with freely downloadable source code. Here, the user again provides two input sequences. The program calculates the possible secondary structures of each using a method similar to Mfold.
A. multiple alternative structures, lowest energy
B. single structure, lowest energy
C. single structure, highest energy
D. multiple alternative structures, highest energy
Explanation: The unique feature of this program is that it does not require sequence similarity and therefore can handle very divergent sequences. However, because of the computational complexity, the program only predicts small RNA sequences such as tRNA with reasonable accuracy.