300+ REAL TIME Data Science Objective Questions & Answers 2023

250+ TOP MCQs on Graphics Devices and Answers

Data Science Test focuses on “Graphics Devices”.

1. Every open graphics device is assigned an integer greater than 2.
a) True
b) False

Answer: b
Explanation: Every open graphics device is assigned an integer greater than or equal to 2.

2. Point out the correct statement.
a) Vector formats are good for line drawings and plots with solid colors using a modest number of points
b) Vector formats are good for plots with a large number of points, natural scenes or web based plots
c) The default graphics device is always the screen device
d) All of the mentioned

Answer: a
Explanation: Bitmap formats are good for plots with a large number of points, natural scenes or web based plots.

3. Which of the following will copy the plot from one device to another?
a) dev.copy
b) dev.copypdf
c) dev.device
d) all of the mentioned

Answer: a
Explanation: Copying a plot to another device can be useful because some plots require a lot of code and it can be a pain to type all that in again for a different device.

4. Which of the following is used to change the active graphics device?
a) dev.set
b) dev.int
c) dev.win
d) all of the mentioned

Answer: a
Explanation: You can change the active graphics device with dev.set(<integer>), where <integer> is the number associated with the graphics device you want to switch to.

5. Point out the wrong statement.
a) File devices are useful for creating plots that can be included in other documents or sent to other people
b) Plots must be created on a graphics device
c) For file devices, there are vector and bitmap formats
d) None of the mentioned

Answer: d
Explanation: For file devices, there are vector and bitmap formats.

6. Which of the following is the second goal of PCA?
a) data compression
b) statistical analysis
c) data dredging
d) all of the mentioned

Answer: a
Explanation: The principal components are equal to the right singular vectors if you first scale the variables (subtract the mean and divide by the standard deviation).
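
The relationship between PCA and the SVD can be checked numerically. The following is a minimal sketch using NumPy on made-up random data (the array X, the seed, and the variable names are assumptions for illustration only): after scaling, the eigenvectors of the covariance matrix match the right singular vectors up to sign, and keeping only the first component gives a compressed approximation of the data.

```python
import numpy as np

# Hypothetical data: 50 observations of 3 variables, two of them correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 1] += 2 * X[:, 0]
n = X.shape[0]

# Scale the variables: subtract the mean, divide by the standard deviation.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# SVD of the scaled data; the rows of Vt are the right singular vectors.
U, d, Vt = np.linalg.svd(Z, full_matrices=False)

# PCA via eigen-decomposition of the covariance matrix, largest variance first.
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# The two sets of directions agree up to sign.
same_directions = np.allclose(np.abs(Vt.T), np.abs(eigvecs))

# Data compression: rank-1 approximation from the first component only.
Z_rank1 = d[0] * np.outer(U[:, 0], Vt[0])
```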

7. dev.copy2pdf specifically copies a plot to a PDF file.
a) True
b) False

Answer: a
Explanation: Copying a plot is not an exact operation, so the result may not be identical to the original.

8. Which of the following is a vector file device?
a) png
b) svg
c) bmp
d) none of the mentioned

Answer: b
Explanation: svg stands for scalable vector graphics.

9. Which of the following is an alternative technique to principal component analysis?
a) Factor analysis
b) Independent components analysis
c) Latent semantic analysis
d) All of the mentioned

Answer: d
Explanation: PC’s may mix real patterns.

250+ TOP MCQs on caret and Answers

Data Science Multiple Choice Questions on “caret”.

1. Which of the following can be used to generate balanced cross–validation groupings from a set of data?
a) createFolds
b) createSample
c) createResample
d) none of the mentioned

Answer: a
Explanation: createResample can be used to make simple bootstrap samples.

2. Point out the wrong statement.
a) Simple random sampling of time series is probably the best way to resample time series data.
b) Three parameters are used for time series splitting
c) Horizon parameter is the number of consecutive values in test set sample
d) All of the mentioned

Answer: a
Explanation: Simple random sampling of time series is probably not the best way to resample time series data.

3. Which of the following functions can be used to maximize the minimum dissimilarities?
a) sumDiss
b) minDiss
c) avgDiss
d) all of the mentioned

Answer: b
Explanation: minDiss maximizes the minimum dissimilarities, while sumDiss can be used to maximize the total dissimilarities.

4. Which of the following functions can create the indices for time series type of splitting?
a) newTimeSlices
b) createTimeSlices
c) binTimeSlices
d) none of the mentioned

Answer: b
Explanation: Rolling forecasting origin techniques are associated with time series type of splitting.

5. Point out the correct statement.
a) Asymptotics are used for inference usually
b) Caret includes several functions to pre-process the predictor data
c) The function dummyVars can be used to generate a complete set of dummy variables from one or more factors
d) All of the mentioned

Answer: d
Explanation: The function dummyVars takes a formula and a data set and outputs an object that can be used to create the dummy variables using the predict method.

6. Which of the following can be used to create sub–samples using a maximum dissimilarity approach?
a) minDissim
b) maxDissim
c) inmaxDissim
d) all of the mentioned

Answer: b
Explanation: Splitting is based on the predictors.

7. caret does not use the proxy package.
a) True
b) False

Answer: b
Explanation: caret uses the proxy package.

8. Which of the following functions can be used to create balanced splits of the data?
a) newDataPartition
b) createDataPartition
c) renameDataPartition
d) none of the mentioned

Answer: b
Explanation: If the y argument to this function is a factor, the random sampling occurs within each class and should preserve the overall class distribution of the data.

9. Which of the following package tools are present in caret?
a) pre-processing
b) feature selection
c) model tuning
d) all of the mentioned

Answer: d
Explanation: There are many different modeling functions in R.

10. caret stands for classification and regression training.
a) True
b) False

Answer: a
Explanation: The caret package is a set of functions that attempt to streamline the process for creating predictive models.

250+ TOP MCQs on Analysis and Experimental Design and Answers

Data Science Multiple Choice Questions on “Analysis and Experimental Design”.

1. If X predicts Y, it does mean X causes Y.
a) True
b) False

Answer: b
Explanation: If X predicts Y, it does not mean X causes Y.

2. Point out the correct statement.
a) If equations are known but the parameters are not, they may be inferred with data analysis
b) If equations are not known but the parameters are, they may be inferred with data analysis
c) If equations and parameters are not known, they may be inferred with data analysis
d) None of the mentioned

Answer: a
Explanation: Usually the random component of data is measurement error.

3. Which of the following is the most important thing in data science?
a) answer
b) question
c) data
d) none of the mentioned

Answer: b
Explanation: The second most important is the data.

4. Which of the following approach should be used if you can’t fix the variable?
a) randomize it
b) non stratify it
c) generalize it
d) none of the mentioned

Answer: a
Explanation: If you don’t fix the variable, stratify it; if you can’t fix it, randomize it.

5. Point out the wrong statement.
a) Randomized studies are not used to identify causation
b) Complicated approaches exist for inferring causation
c) Causal relationships may not apply to every individual
d) All of the mentioned

Answer: a
Explanation: Randomized studies are usually used to identify causation.

6. Which of the following is a good way of performing experiments in data science?
a) Measure variability
b) Generalize to the problem
c) Have Replication
d) All of the mentioned

Answer: d
Explanation: Experiments on causal relationships investigate the effect of one or more variables on one or more outcome variables.

7. Which of the following is commonly referred to as ‘data fishing’?
a) Data bagging
b) Data booting
c) Data merging
d) None of the mentioned

Answer: d
Explanation: Data dredging is sometimes referred to as “data fishing”.

8. Which of the following data mining techniques is used to uncover patterns in data?
a) Data bagging
b) Data booting
c) Data merging
d) Data Dredging

Answer: d
Explanation: Data dredging, also called data snooping, refers to the practice of misusing data mining techniques to show misleading scientific ‘research’.

250+ TOP MCQs on Plotting Systems and Answers

Data Science Multiple Choice Questions on “Plotting Systems”.

1. How many stages commonly occur in the creation of a plot?
a) 2
b) 5
c) 8
d) All of the mentioned

Answer: a
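Explanation: In the base plotting system, a plot is typically created in two stages: initializing a new plot and then annotating it. The base plotting system is highly flexible.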

2. Base graphics are used most commonly for creating 2D graphics.
a) True
b) False

Answer: a
Explanation: Base graphics is a very powerful system for creating 2D graphics.

3. Which of the following annotation functions is used to add or modify text?
a) word
b) graph
c) lines
d) all of the mentioned

Answer: d
Explanation: points and axis are other well-known annotation functions.

4. The lattice plotting system is implemented using which of the following packages?
a) grDevices
b) grid
c) graphics
d) all of the mentioned

Answer: b
Explanation: The lattice plotting system is built on top of the grid package.

5. Point out the wrong statement.
a) Plots are created with multiple functions only
b) Plots are created with both single and multiple function calls
c) Annotation in plot is not especially intuitive
d) None of the mentioned

Answer: a
Explanation: Plots can also be created with a single function call.

6. Which of the following parameters defines the line type, such as dashed or dotted?
a) lty
b) pch
c) lwd
d) all of the mentioned

Answer: a
Explanation: lwd is used for line width.

7. The core plotting engine is encapsulated in the graphics package.
a) True
b) False

Answer: a
Explanation: The graphics package contains the plotting functions.

8. Which of the following arguments to the par function specifies the margin size?
a) las
b) bg
c) mar
d) all of the mentioned

Answer: c
Explanation: par function is used to specify global parameters.

250+ TOP MCQs on caret and Answers

Data Science MCQs focus on “Caret”.

1. Which of the following functions is a wrapper for different lattice plots to visualize the data?
a) levelplot
b) featurePlot
c) plotsample
d) none of the mentioned

Answer: b
Explanation: featurePlot is used for data visualization in caret.

2. Point out the wrong statement.
a) In every situation, the data generating mechanism can create predictors that only have a single unique value
b) Predictors might have only a handful of unique values that occur with very low frequencies
c) The function findLinearCombos uses the QR decomposition of a matrix to enumerate sets of linear combinations
d) All of the mentioned

Answer: a
Explanation: In some situations, the data generating mechanism can create predictors that only have a single unique value.

3. Which of the following functions can be used to identify near zero-variance variables?
a) zeroVar
b) nearVar
c) nearZeroVar
d) all of the mentioned

Answer: c
Explanation: The saveMetrics argument can be used to show the details and usually defaults to FALSE.

4. Which of the following functions can be used to flag predictors for removal?
a) searchCorrelation
b) findCausation
c) findCorrelation
d) none of the mentioned

Answer: c
Explanation: Some models thrive on correlated predictors.

5. Point out the correct statement.
a) findLinearColumns will also return a vector of column positions that can be removed to eliminate the linear dependencies
b) findLinearCombos will return a list that enumerates dependencies
c) the function findLinearRows can be used to generate a complete set of row variables from one factor
d) none of the mentioned

Answer: b
Explanation: For each linear combination, it will incrementally remove columns from the matrix and test to see if the dependencies have been resolved.

6. Which of the following can be used to impute data sets based only on information in the training set?
a) postProcess
b) preProcess
c) process
d) all of the mentioned

Answer: b
Explanation: This can be done with K-nearest neighbors.

7. The function preProcess estimates the required parameters for each operation.
a) True
b) False

Answer: a
Explanation: predict.preProcess is used to apply them to specific data sets.

8. Which of the following can also be used to find new variables that are linear combinations of the original set with independent components?
a) ICA
b) SCA
c) PCA
d) None of the mentioned

Answer: a
Explanation: ICA stands for independent component analysis.

9. Which of the following functions is used to generate the class distances?
a) preprocess.classDist
b) predict.classDist
c) predict.classDistance
d) all of the mentioned

Answer: b
Explanation: By default, the distances are logged.

10. The preProcess class can be used for many operations on predictors.
a) True
b) False

Answer: a
Explanation: Operations include centering and scaling.

250+ TOP MCQs on Time Deltas and Answers

Data Science Multiple Choice Questions on “Time Deltas”.

1. Which of the following operations are supported on Time Frames?
a) idxmax
b) ixmax
c) ixmin
d) none of the mentioned

Answer: a
Explanation: Operands can also appear in a reversed order.
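
As a brief illustration of the reductions named in this question, the sketch below builds a small DataFrame with timedelta columns and calls max, idxmax, and idxmin on it. The column names A and B and the values are made up for the example.

```python
import pandas as pd

# Hypothetical frame with two timedelta64 columns.
df = pd.DataFrame(
    {
        "A": pd.to_timedelta(["1 days", "3 days", "2 days"]),
        "B": pd.to_timedelta([5, 1, 2], unit="h"),
    }
)

col_max = df.max()        # largest Timedelta in each column
col_idxmax = df.idxmax()  # row label at which each column is largest
col_idxmin = df.idxmin()  # row label at which each column is smallest
```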

2. Point out the correct statement.
a) Timedeltas are differences in times, expressed in different units
b) You can construct a Timedelta scalar through various arguments
c) DateOffsets cannot be used in construction
d) All of the mentioned

Answer: a
Explanation: Timedeltas can be both positive and negative.
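
The options above touch on how Timedelta scalars are built. A minimal sketch, with values chosen only for illustration, of a few construction routes (a string, keyword arguments, an integer with a unit, and a DateOffset), including a negative Timedelta:

```python
import pandas as pd

one_day = pd.Timedelta("1 days")                   # from a string
also_one_day = pd.Timedelta(days=1)                # from keyword arguments
negative = pd.Timedelta(-90, unit="min")           # negative values are allowed
from_offset = pd.Timedelta(pd.offsets.Second(2))   # from a DateOffset
```

Note that the DateOffset route works here, which is why option c) is not a correct statement.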

3. Numeric reduction operation for timedelta64[ns] will return _________ objects.
a) Timeseries
b) Timeplus
c) Timedelta
d) None of the mentioned

Answer: c
Explanation: NaT are skipped during evaluation.
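
A short sketch of this behaviour, using a made-up Series that contains one missing value: the reductions return Timedelta objects and the NaT entry is skipped.

```python
import pandas as pd

# Hypothetical timedelta64 Series with a missing value in the middle.
s = pd.Series(
    [pd.Timedelta(days=1), pd.NaT, pd.Timedelta(days=3)],
    dtype="timedelta64[ns]",
)

mean_delta = s.mean()  # NaT is skipped: (1 day + 3 days) / 2
total = s.sum()        # NaT is skipped: 1 day + 3 days
```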

4. Which of the following can be converted to other ‘frequencies’ by astyping to a specific timedelta type?
a) Timedelta Series
b) TimedeltaIndex
c) Timedelta
d) All of the mentioned

Answer: d
Explanation: These operations yield Series and propagate NaT -> nan.
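
The NaT-to-nan propagation can be seen in a small sketch. It converts to hours by dividing by another timedelta rather than by astype, on the assumption that division behaves the same across pandas versions, whereas the result of astype to a timedelta unit has differed between releases. The values are made up.

```python
import pandas as pd

# Scalar conversion: how many hours are in 1 day 12 hours?
td = pd.Timedelta("1 days 12 hours")
hours = td / pd.Timedelta(hours=1)

# Series conversion: the result is a float Series and NaT becomes nan.
s = pd.Series([pd.Timedelta(days=1), pd.NaT], dtype="timedelta64[ns]")
s_hours = s / pd.Timedelta(hours=1)
```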

5. Point out the wrong statement.
a) min, max, idxmin, idxmax operations are supported on Series
b) You cannot pass a timedelta to get a particular value
c) Division by the numpy scalar is true division
d) None of the mentioned

Answer: b
Explanation: Dividing or multiplying a timedelta64[ns] Series by an integer or integer Series yields another timedelta64[ns] dtypes Series.
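
A minimal sketch of the arithmetic described here, with illustrative values: scaling a timedelta Series by an integer keeps a timedelta dtype, while dividing by a NumPy timedelta scalar is true division and yields a plain number.

```python
import numpy as np
import pandas as pd

s = pd.Series([pd.Timedelta(days=2), pd.Timedelta(days=4)])

halved = s / 2    # still a timedelta Series
doubled = s * 2   # still a timedelta Series

# True division by a NumPy timedelta scalar gives a float ratio.
ratio = pd.Timedelta(days=1) / np.timedelta64(12, "h")
```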

6. Which of the following is used to generate an index with time delta?
a) TimeIndex
b) TimedeltaIndex
c) LeadIndex
d) None of the mentioned

Answer: b
Explanation: Using TimedeltaIndex you can pass string-like, Timedelta, timedelta, or np.timedelta64 objects.
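
For illustration, the sketch below builds a TimedeltaIndex from a mix of the input types listed above and then uses it as a Series index, where a Timedelta can be passed to look up a particular value. The values and the Series contents are made up.

```python
import datetime

import numpy as np
import pandas as pd

# Mixed inputs: strings, an np.timedelta64, and a datetime.timedelta.
idx = pd.TimedeltaIndex(
    [
        "1 days",
        "1 days 06:00:00",
        np.timedelta64(2, "D"),
        datetime.timedelta(days=3),
    ]
)

# Hypothetical Series indexed by the timedeltas.
s = pd.Series([10, 20, 30, 40], index=idx)

# A Timedelta can be passed to select a particular value.
value = s.loc[pd.Timedelta("2 days")]
```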

7. Combining a TimedeltaIndex with a DatetimeIndex allows certain combination operations that are NaT preserving.
a) True
b) False

Answer: a
Explanation: You can also convert indices to yield another index.
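
A small sketch of a NaT-preserving combination, using illustrative dates: adding a TimedeltaIndex that contains NaT to a DatetimeIndex yields a DatetimeIndex with NaT in the same position.

```python
import pandas as pd

tdi = pd.TimedeltaIndex(["1 days", pd.NaT, "2 days"])
dti = pd.date_range("2013-01-01", periods=3)

# Element-wise addition; the missing offset stays missing in the result.
shifted = dti + tdi
```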

8. Using _________ on categorical data will produce similar output to a Series or DataFrame of type string.
a) .desc()
b) .describe()
c) .rank()
d) none of the mentioned

Answer: b
Explanation: Categorical data has a categories property and an ordered property.
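
To make the comparison concrete, the sketch below calls describe() on a made-up categorical Series and on the same values stored as plain strings. Both summaries report count, unique, top, and freq.

```python
import pandas as pd

values = ["a", "b", "a", None]

cat_summary = pd.Series(values, dtype="category").describe()
str_summary = pd.Series(values).describe()
```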

9. Which of the following methods can be used to rename categorical data?
a) Categorical.rename_categories()
b) Categorical.rename()
c) Categorical.mv_categories()
d) None of the mentioned

Answer: a
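Explanation: Categories are renamed with the rename_categories() method, available on a Categorical and through the Series.cat accessor. Older pandas versions also allowed assigning new values to the Series.cat.categories property.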
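
A minimal sketch of renaming through the Series.cat accessor, with made-up category labels. The call returns a new Series and leaves the original unchanged.

```python
import pandas as pd

s = pd.Series(["a", "b", "a"], dtype="category")

# Map old category names to new ones.
renamed = s.cat.rename_categories({"a": "x", "b": "y"})
```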

10. All values of categorical data are either in categories or np.nan.
a) True
b) False

Answer: a
Explanation: Categoricals are a pandas data type.
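
The statement in this question can be checked with a short sketch on made-up values: when the set of categories is narrowed, any value that is no longer a category becomes NaN rather than staying in the data.

```python
import pandas as pd

s = pd.Series(["a", "b", "c"], dtype="category")

# Keep only "a" and "b" as categories; "c" no longer fits and becomes NaN.
narrowed = s.cat.set_categories(["a", "b"])
missing_mask = narrowed.isna().tolist()
```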