Data Science Multiple Choice Questions on “Introduction to Reproducible Research”.
1. Which of the following problems is solved by reproducibility?
a) Scalability
b) Data availability
c) Improved data analysis
d) None of the mentioned
Answer: b
Explanation: Reproducibility makes the data and methods available to others, which brings more transparency.
2. Point out the correct statement with respect to replication.
a) Focuses on the validity of the data analysis
b) Focuses on the validity of the scientific claim
c) Arguably a minimum standard for any scientific study
d) All of the mentioned
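Answer: b
Explanation: Replication focuses on the validity of the scientific claim. Focusing on the validity of the data analysis and serving as a minimum standard are properties of reproducibility, not replication.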
3. Which of the following is an effective way of checking the validity of a data analysis?
a) Re-run the analysis
b) Review the code
c) Check the sensitivity
d) All of the mentioned
Answer: d
Explanation: Reproducibility addresses the most “downstream” aspect of the research process.
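The checks listed in question 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_analysis`, its simulated data, and the `trim` parameter are hypothetical and do not come from any particular study. The sketch re-runs the analysis with a fixed seed and then checks how sensitive the estimate is to one analytic choice.

```python
import random
import statistics


def run_analysis(seed, trim=0):
    """Hypothetical analysis: mean of simulated measurements, optionally trimmed."""
    rng = random.Random(seed)  # a fixed seed makes the run repeatable
    data = sorted(rng.gauss(10.0, 2.0) for _ in range(200))
    if trim:
        data = data[trim:-trim]  # drop the `trim` smallest and largest values
    return statistics.mean(data)


# Re-run the analysis: the same seed should give the same estimate.
first = run_analysis(seed=42)
second = run_analysis(seed=42)
rerun_matches = first == second

# Check the sensitivity: how far does the estimate move under a different choice?
trimmed = run_analysis(seed=42, trim=10)
sensitivity = abs(first - trimmed)

print("re-run matches:", rerun_matches)
print("change after trimming:", sensitivity)
```

Reviewing the code is the third check; here it amounts to reading `run_analysis` and confirming that it does what the analysis claims.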
4. Which of the following is similar to a pre-specified clinical trial protocol?
a) Caching-based Data Analysis
b) Evidence-based Data Analysis
c) Markdown-based Data Analysis
d) All of the mentioned
Answer: b
Explanation: Evidence-based data analysis works like a deterministic statistical machine: the analytic pipeline is fixed in advance, much as a clinical trial protocol is.
5. Point out the wrong statement with respect to reproducibility.
a) Focuses on the validity of the data analysis
b) The ultimate standard for strengthening scientific evidence
c) Important when replication is impossible
d) None of the mentioned
Answer: b
Explanation: Replication is particularly important in studies that can impact broad policy or regulatory decisions.
6. Which of the following can be used as a data analysis model?
a) CRAN
b) CPAN
c) CTAN
d) All of the mentioned
Answer: d
Explanation: Different problems require different approaches and expertise.
7. Reproducibility determines the correctness of a data analysis.
a) True
b) False
Answer: b
Explanation: Reproducibility does not guarantee that a data analysis is correct. An analysis can be reproducible and still be wrong.
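The point of question 7 can be illustrated with a minimal Python sketch. The function `buggy_mean` and the sample data are hypothetical, chosen only to show an analysis that gives the same result on every run while still being incorrect.

```python
def buggy_mean(values):
    """Hypothetical analysis step with a deliberate bug in the denominator."""
    # Bug: divides by len(values) - 1 instead of len(values).
    return sum(values) / (len(values) - 1)


data = [2.0, 4.0, 6.0, 8.0]

# Running the analysis twice gives identical results, so it is reproducible.
run_1 = buggy_mean(data)
run_2 = buggy_mean(data)
reproducible = run_1 == run_2

# Comparing against the true mean shows the result is nevertheless wrong.
true_mean = sum(data) / len(data)
correct = run_1 == true_mean

print("reproducible:", reproducible)
print("correct:", correct)
```

Re-running the code confirms only that the same numbers come out again; finding the bug requires reviewing the code or comparing against an independent calculation.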
8. Which of the following steps is not required in data analysis?
a) Synthesize results
b) Create reproducible code
c) Interpret results
d) None of the mentioned
Answer: d
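Explanation: Synthesizing results, creating reproducible code, and interpreting results are all steps in a data analysis. The data set you need may depend on your goal.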
9. Which of the following gives reviewers an important tool without dramatically increasing the burden?
a) Quality research
b) Replication research
c) Reproducible research
d) None of the mentioned
Answer: c
Explanation: Reproducible research is important, but does not necessarily solve the critical question of whether a data analysis is trustworthy.
10. Analysis results are relatively easy to replicate or reproduce.
a) True
b) False
Answer: b
Explanation: Results from complicated analyses are often hard to replicate or reproduce, so they should not be trusted without scrutiny.