300+ REAL TIME Data Science Objective Questions & Answers 2023

250+ TOP MCQs on Literate Statistical Programming and Answers

Data Science Multiple Choice Questions on “Literate Statistical Programming”.

1. What is the role of processing code in the research pipeline?
a) Transforms the analytical results into figures and tables
b) Transforms the analytic data into measured data
c) Transforms the measured data into analytic data
d) All of the mentioned

Answer: c
Explanation: Processing code transforms the measured data into analytic data. The data science workflow as a whole is a non-linear, iterative process.

2. Which of the following is a goal of literate statistical programming?
a) Combine explanatory text and data analysis code in a single document
b) Ensure that data analysis documents are always exported in JPEG format
c) Require that data analysis summaries are always written in R
d) None of the mentioned

Answer: a
Explanation: Literate statistical programming is a methodology that combines explanatory text and data analysis code in a single document.

3. What does it mean to weave a literate statistical program?
a) Convert a program from S to python
b) Convert the program into a human readable document
c) Convert a program to decompress it
d) All of the mentioned

Answer: b
Explanation: Weaving converts a literate program into a human-readable document. Literate statistical programming can be done with knitr.
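
As a rough illustration of weaving versus tangling, the Python sketch below processes a tiny document that uses Sweave-style chunk markers (`<<>>=` opens a code chunk, `@` closes it). The function names `split_chunks`, `tangle`, and `weave` and the sample document are hypothetical. Real tools such as knitr also run the code and insert its output, which this sketch does not attempt.

```python
# Toy literate document: prose plus one code chunk in Sweave-style markers.
DOC = """The mean of the sample is computed below.
<<>>=
x = [1, 2, 3, 4]
mean = sum(x) / len(x)
@
The result is reported in the text."""


def split_chunks(doc):
    """Return a list of (kind, line) pairs, where kind is 'text' or 'code'."""
    pairs, in_code = [], False
    for line in doc.splitlines():
        if line.strip() == "<<>>=":      # chunk opener
            in_code = True
        elif line.strip() == "@":        # chunk closer
            in_code = False
        else:
            pairs.append(("code" if in_code else "text", line))
    return pairs


def tangle(doc):
    """Keep only the code: the machine-readable view."""
    return "\n".join(line for kind, line in split_chunks(doc) if kind == "code")


def weave(doc):
    """Keep text and code together, indenting code: the human-readable view."""
    return "\n".join(
        ("    " + line) if kind == "code" else line
        for kind, line in split_chunks(doc)
    )


namespace = {}
exec(tangle(DOC), namespace)   # the tangled code runs on its own
print(weave(DOC))
```

Here `tangle` yields only the two code lines, while `weave` keeps the prose and shows the code as an indented block.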

4. Which of the following is required to implement a literate programming system?
a) A programming language like Perl
b) A programming language like Java
c) A programming language like R
d) All of the mentioned

Answer: c
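Explanation: A literate programming system needs a documentation language and a programming language. In the knitr and Sweave setting of this quiz, that programming language is R, a language and environment for statistical computing and graphics.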

5. What is one way in which the knitr system differs from Sweave?
a) knitr allows for the use of markdown instead of LaTeX
b) knitr is written in python instead of R
c) knitr lacks features like caching of code chunks
d) none of the mentioned

Answer: a
Explanation: knitr is an engine for dynamic report generation with R.

6. Which of the following is a useful way to put text, code, data, and output all in one document?
a) Literate statistical programming
b) Object oriented programming
c) Descriptive programming
d) All of the mentioned

Answer: a
Explanation: Literate statistical programming keeps text, code, data, and output in one document. Object-oriented programming, by contrast, is a programming model organized around objects rather than “actions” and data rather than logic.

7. Some chunks have to be re-computed every time you re-knit the file.
a) True
b) False

Answer: b
Explanation: By default, all chunks have to be re-computed every time you re-knit the file, unless caching is enabled for a chunk.

8. Which of the following tools can be used for integrating text and code in one document?
a) knitr
b) ggplot2
c) NumPy
d) None of the mentioned

Answer: a
Explanation: knitr is a way to write LaTeX, HTML, and Markdown with R code interlaced.

9. Which of the following should be set on a chunk-by-chunk basis to store the results of a computation?
a) cache=TRUE
b) cache=FALSE
c) caching=TRUE
d) none of the mentioned

Answer: a
Explanation: After the first run, the results are loaded from the cache.

10. With caching, dependencies between chunks are checked explicitly.
a) True
b) False

Answer: b
Explanation: A caveat of caching is that dependencies are not checked explicitly. If the data or code changes, the cached chunks need to be re-run.

250+ TOP MCQs on Model Based Prediction and Answers

Data Science Multiple Choice Questions on “Model Based Prediction”.

1. Which of the following is correct about regularized regression?
a) Can help with the bias/variance trade-off
b) Cannot help with model selection
c) Cannot help with the bias/variance trade-off
d) All of the mentioned

Answer: a
Explanation: Regularized regression can help with the bias/variance trade-off and with model selection, although it often does not perform as well as random forests.
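
To make the trade-off concrete, here is a minimal Python sketch of ridge regression with a single predictor and no intercept. Minimizing sum((y - b*x)^2) + penalty * b^2 gives the closed form b = sum(x*y) / (sum(x^2) + penalty). The function name `ridge_slope` and the data are made up for illustration.

```python
def ridge_slope(x, y, penalty):
    """Slope b minimizing sum((y - b*x)**2) + penalty * b**2 (no intercept)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + penalty)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]   # exactly y = 2x

# A larger penalty shrinks the slope toward zero: more bias, less variance.
for penalty in (0.0, 1.0, 10.0):
    print(penalty, ridge_slope(x, y, penalty))
```

With a penalty of 0 the slope is the ordinary least-squares estimate; increasing the penalty pulls the estimate toward zero.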

2. Point out the wrong statement.
a) Model-based approaches may be computationally convenient
b) Model-based approaches use Bayes' theorem
c) Model-based approaches are reasonably inaccurate on real problems
d) All of the mentioned

Answer: c
Explanation: Model-based approaches are reasonably accurate on real problems.

3. Which of the following methods are present in caret for regularized regression?
a) ridge
b) lasso
c) relaxo
d) all of the mentioned

Answer: d
Explanation: In caret, one can tune over the number of predictors to retain instead of defined values for the penalty.

4. Which of the following methods can be used to combine different classifiers?
a) Model stacking
b) Model combining
c) Model structuring
d) None of the mentioned

Answer: a
Explanation: Model ensembling is also used for combining different classifiers.

5. Point out the correct statement.
a) Combining classifiers improves interpretability
b) Combining classifiers reduces accuracy
c) Combining classifiers improves accuracy
d) All of the mentioned

Answer: c
Explanation: You can combine classifiers by averaging or voting.
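
A small Python sketch of combining classifiers by majority vote follows. The predictions and the helper names `majority_vote` and `accuracy` are invented for illustration. The example shows one case where the vote beats each individual classifier; it is not a guarantee that this always happens.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample."""
    combined = []
    for labels in zip(*predictions):          # labels for one sample
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def accuracy(predicted, truth):
    """Fraction of samples predicted correctly."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


truth = [1, 0, 1, 1, 0, 1]
clf_a = [1, 0, 1, 0, 0, 0]   # hypothetical classifier outputs
clf_b = [1, 1, 1, 1, 0, 0]
clf_c = [0, 0, 0, 1, 0, 1]

ensemble = majority_vote([clf_a, clf_b, clf_c])
individual = [accuracy(c, truth) for c in (clf_a, clf_b, clf_c)]
print(ensemble, accuracy(ensemble, truth), individual)
```

Each classifier errs on different samples, so the vote corrects most individual mistakes, at the cost of a model that is harder to interpret.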

6. Which of the following functions provides unsupervised prediction?
a) cl_forecast
b) cl_nowcast
c) cl_precast
d) none of the mentioned

Answer: d
Explanation: The cl_predict function in the clue package provides unsupervised prediction.

7. Model-based prediction considers a relatively easy version of the covariance matrix.
a) True
b) False

Answer: b
Explanation: Model-based prediction assumes more complicated versions of the covariance matrix.

8. Which of the following is used to assist the quantitative trader in the development of trading models?
a) quantmod
b) quantile
c) quantity
d) mboost

Answer: a
Explanation: The Quandl package is similar to quantmod.

9. Which of the following functions can be used for forecasting?
a) predict
b) forecast
c) ets
d) all of the mentioned

Answer: b
Explanation: Forecasting is the process of making predictions of the future based on past and present data and analysis of trends.
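
As a rough picture of predicting the future from past data, the Python sketch below implements simple exponential smoothing, one of the simplest forecasting methods. It is not the R `forecast` or `ets` functions named in the options, which fit richer models. The function name `ses_forecast`, the smoothing weight, and the data are assumptions for illustration.

```python
def ses_forecast(series, alpha, horizon):
    """Simple exponential smoothing: flat forecast from the final level."""
    level = series[0]
    for value in series[1:]:
        # Blend the newest observation with the previous smoothed level.
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


history = [10.0, 12.0, 14.0]
forecast = ses_forecast(history, alpha=0.5, horizon=2)
print(forecast)
```

A larger `alpha` gives more weight to recent observations, while a smaller one smooths more heavily over the past.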

10. Predictive analytics is the same as forecasting.
a) True
b) False

Answer: b
Explanation: Predictive analytics goes beyond forecasting.

250+ TOP MCQs on Pandas and Answers

Data Science Interview Questions and Answers for freshers focuses on “Pandas”.

1. Quandl API for Python wraps the ________ REST API to return Pandas DataFrames with time series indexes.
a) Quandl
b) PyDatastream
c) PyData
d) None of the mentioned

Answer: a
Explanation: PyDatastream is a Python interface to the Thomson Dataworks Enterprise (DWE/Datastream) SOAP API that returns indexed pandas DataFrames or Panels with financial data.

2. Point out the correct statement.
a) Statsmodels provides powerful statistics, econometrics, analysis and modeling functionality that is out of pandas' scope
b) Vintage leverages pandas objects as the underlying data container for computation
c) Bokeh is a Python interactive visualization library for small datasets
d) All of the mentioned

Answer: a
Explanation: Bokeh's goal is to provide elegant, concise construction of novel graphics in the style of D3.

3. Which of the following libraries is used to retrieve and acquire statistical data and metadata disseminated in SDMX 2.1?
a) pandaSDMX
b) fredapi
c) geopandas
d) all of the mentioned

Answer: a
Explanation: Geopandas extends pandas data objects to include geographic information which supports geometric operations.

4. Which of the following provides a standard API for doing computations with MongoDB?
a) Blaze
b) Geopandas
c) FRED
d) All of the mentioned

Answer: a
Explanation: If your work entails maps and geographical coordinates, and you love pandas, you should take a close look at Geopandas.

5. Point out the wrong statement.
a) qgrid is an interactive grid for sorting and filtering DataFrames
b) Pandas DataFrames implement _repr_html_ methods which are utilized by IPython Notebook
c) Spyder is a cross-platform Qt-based open-source R IDE
d) None of the mentioned

Answer: c
Explanation: Spyder is a cross-platform Qt-based open-source Python IDE.

6. Which of the following makes use of pandas and returns data in a Series or DataFrame?
a) pandaSDMX
b) fredapi
c) OutPy
d) none of the mentioned

Answer: b
Explanation: The fredapi module requires a FRED API key that you can obtain for free on the FRED website.

7. Spyder can introspect and display Pandas DataFrames.
a) True
b) False

Answer: a
Explanation: Spyder can introspect and display pandas DataFrames and shows both column-wise min/max and global min/max coloring.

8. Which of the following is used for machine learning in python?
a) scikit-learn
b) seaborn-learn
c) stats-learn
d) none of the mentioned

Answer: a
Explanation: scikit-learn is built on NumPy, SciPy, and matplotlib.

9. The ________ project builds on top of pandas and matplotlib to provide easy plotting of data.
a) yhat
b) Seaborn
c) Vincent
d) None of the mentioned

Answer: b
Explanation: Seaborn has great support for pandas data objects.

10. xray (now xarray) brings the labeled data power of pandas to the physical sciences.
a) True
b) False

Answer: a
Explanation: It aims to provide a pandas-like and pandas-compatible toolkit for analytics on multi-dimensional arrays.

250+ TOP MCQs on Literate Statistical Programming and Answers

Data Science Quiz focuses on “Literate Statistical Programming”.

1. The original idea of literate programming comes from _______________
a) Don Knuth
b) Don Cutting
c) Douglas Cutting
d) All of the mentioned

Answer: a
Explanation: Don Knuth originated the idea of literate programming. Literate programs are tangled to produce machine-readable documents.

2. Point out the correct statement.
a) An article is a stream of code and text
b) Analysis code is divided into code chunks only
c) Literate programs are tangled to produce human readable documents
d) None of the mentioned

Answer: a
Explanation: Analysis code is divided into text and code chunks.

3. Which of the following is required for literate programming?
a) documentation language
b) mapper language
c) reducer language
d) all of the mentioned

Answer: a
Explanation: Programming language is also required for literate programming.

5. Which of the following is required to make work reproducible?
a) Keep track of things
b) Save output
c) Save data in proprietary formats
d) None of the mentioned

Answer: a
Explanation: Save data in non-proprietary formats to make work reproducible.

6. Which of the following disadvantages does literate programming have?
a) Slow processing of documents
b) Code is not automatic
c) No logical order
d) All of the mentioned

Answer: a
Explanation: Code and text are in one place, which can slow down the processing of documents.

7. knitr supports only one documentation language.
a) True
b) False

Answer: b
Explanation: knitr supports various documentation languages.

8. Which of the following documentation languages is supported by knitr?
a) RMarkdown
b) LaTeX
c) HTML
d) All of the mentioned

Answer: d
Explanation: knitr supports RMarkdown, LaTeX, and HTML as documentation languages. It is available on CRAN.

9. Which of the following packages by Yihui Xie is built into the RStudio environment?
a) rpy2
b) knitr
c) ggplot2
d) none of the mentioned

Answer: b
Explanation: Documents written with knitr can be exported to PDF and HTML.

10. Literate programming code is live, which provides an automatic “regression test” when building a document.
a) True
b) False

Answer: a
Explanation: Data and results are automatically updated to reflect external changes.

250+ TOP MCQs on Shiny and Answers

Data Science Questions for entrance exams focuses on “Shiny”.

1. Which of the following projects is used for calling R products from the web?
a) OpenCPU
b) OpenDisk
c) OpenMem
d) All of the mentioned

Answer: a
Explanation: OpenCPU provides an HTTP API for calling R functions from the web.

2. Point out the wrong statement.
a) Shiny is a platform for creating interactive R programs embedded into a web page
b) Shiny was created by the folks at RStudio
c) Time required to create data products using shiny is more
d) All of the mentioned

Answer: c
Explanation: Time to create data products is less using shiny.

3. Which of the following statements will install shiny?
a) install.packages("shiny")
b) install.library("shiny")
c) install.lib("shiny")
d) all of the mentioned

Answer: a
Explanation: Shiny applications are automatically “live” in the same way that spreadsheets are live.

4. Which of the following can be done by shiny?
a) Tabbed main panels
b) Editable data tables
c) Dynamic UI
d) All of the mentioned

Answer: d
Explanation: shiny allows users to upload files.

5. Point out the correct statement.
a) A shiny project is a directory containing at least three parts
b) A shiny project is a file containing at least three parts
c) A shiny project is a directory containing only one part
d) none of the mentioned

Answer: d
Explanation: A shiny project is a directory containing at least two parts, typically ui.R and server.R.

6. Which of the following functions can interrupt execution and can be called conditionally?
a) browser()
b) browse()
c) search()
d) all of the mentioned

Answer: a
Explanation: Debugging shiny apps can be difficult.

7. runApp() will run the shiny and open the browser window.
a) True
b) False

Answer: a
Explanation: runApp() launches the shiny app and opens it in a browser window.

8. Which of the following functions creates a single checkbox widget?
a) checkboxInput
b) dateInput
c) singleboxInput
d) all of the mentioned

Answer: a
Explanation: Shiny comes with a family of pre-built widgets, each created with a transparently named R function.

9. How many components are involved in shiny?
a) 3
b) 4
c) 5
d) none of the mentioned

Answer: d
Explanation: Shiny apps have two components: a user-interface script and a server script.

10. All of the styled elements are handled through server.R.
a) True
b) False

Answer: b
Explanation: All of the styled elements are handled through ui.R.

250+ TOP MCQs on Pandas and Answers

Data Science Questions and Answers for experienced focuses on “Pandas”.

1. Which of the following is the base layer for all of the sparse indexed data structures?
a) SArray
b) SparseArray
c) PyArray
d) None of the mentioned

Answer: b
Explanation: SparseArray is a 1-dimensional ndarray-like object storing only values distinct from the fill_value.
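
A minimal Python sketch of this behaviour, assuming a recent pandas release in which the class is exposed as `pd.arrays.SparseArray` (older releases used `pd.SparseArray`); the sample values are arbitrary:

```python
import numpy as np
import pandas as pd

dense = np.array([0.0, 0.0, 1.5, 0.0, 2.5])
sparse = pd.arrays.SparseArray(dense, fill_value=0.0)

stored = sparse.sp_values          # only the values that differ from fill_value
restored = np.asarray(sparse)      # back to a regular dense ndarray

print(stored, sparse.fill_value, restored)
```

Only two of the five values are physically stored here; the zeros are implied by `fill_value`.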

2. Point out the correct statement.
a) All of the standard pandas data structures have a to_sparse method
b) Any sparse object can be converted back to the standard dense form by calling to_dense
c) The sparse objects exist for memory efficiency reasons
d) All of the mentioned

Answer: d
Explanation: The to_sparse method takes a kind argument and a fill_value.

3. Which of the following is not an indexed object?
a) SparseSeries
b) SparseDataFrame
c) SparsePanel
d) None of the mentioned

Answer: d
Explanation: SparseArray can be converted back to a regular ndarray by calling to_dense.

4. Which of the following list-like data structure is used for managing a dynamic collection of SparseArrays?
a) SparseList
b) GeoList
c) SparseSeries
d) All of the mentioned

Answer: a
Explanation: To create one, simply call the SparseList constructor with a fill_value.

5. Point out the wrong statement.
a) The SparseList append method can accept scalar values or any 2-dimensional sequence
b) Two kinds of SparseIndex are implemented
c) The integer format keeps an array of all of the locations where the data are not equal to the fill value
d) None of the mentioned

Answer: a
Explanation: The SparseList append method can accept scalar values or any 1-dimensional sequence.

6. Which of the following methods is used for transforming a SparseSeries indexed by a MultiIndex to a scipy.sparse.coo_matrix?
a) SparseSeries.to_coo()
b) Series.to_coo()
c) SparseSeries.to_cooser()
d) None of the mentioned

Answer: a
Explanation: This is an experimental API for transforming between sparse pandas and scipy.sparse structures.

7. The integer format tracks only the locations and sizes of blocks of data.
a) True
b) False

Answer: b
Explanation: The block format tracks only the locations and sizes of blocks of data.

8. Which of the following is used for testing for membership in the list of column names?
a) in
b) out
c) elseif
d) none of the mentioned

Answer: a
Explanation: For DataFrames, likewise, in applies to the column axis.
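
A short Python sketch of the membership test, using made-up column names and values:

```python
import pandas as pd

df = pd.DataFrame({"price": [1.0, 2.0], "volume": [10, 20]})
s = pd.Series([1.0, 2.0], index=["a", "b"])

has_price = "price" in df      # True: "price" is a column name
has_one = 1.0 in df            # False: the values are not searched
has_a = "a" in s               # True: for a Series, `in` checks the index

print(has_price, has_one, has_a)
```

To test whether a value occurs in the data itself, a method such as `isin` is the usual choice rather than `in`.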

9. Which of the following indexing capabilities is used as a concise means of selecting data from a pandas object?
a) In
b) ix
c) ipy
d) none of the mentioned

Answer: b
Explanation: ix and reindex behave equivalently except in the case of integer indexing.

10. Pandas follows the NumPy convention of raising an error when you try to convert something to a bool.
a) True
b) False

Answer: a
Explanation: This happens in an if or when using the boolean operations, and, or, or not.
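
The following Python sketch shows the error and the explicit alternatives; the Series contents are arbitrary:

```python
import pandas as pd

s = pd.Series([True, False, True])

try:
    if s:                      # ambiguous: which element should decide?
        pass
    raised = False
except ValueError:
    raised = True

any_true = s.any()             # explicit reductions state the intent
all_true = s.all()

print(raised, any_true, all_true)
```

Using `any()` or `all()` makes it clear whether one true element or every element is required.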