250+ TOP MCQs on Linear Regression and Answers

R Programming Language Multiple Choice Questions on “Linear Regression”.

1. ________ is an incredibly powerful tool for analyzing data.
a) Linear regression
b) Logistic regression
c) Gradient Descent
d) Greedy algorithms

Answer: a
Clarification: Linear regression is an incredibly powerful tool for analysing data. It focuses on finding one of the simplest types of relationship: a linear one. This process is unsurprisingly called linear regression, and it has many applications.

2. The square of the correlation coefficient, r^2, is always non-negative and is called the ________
a) Regression
b) Coefficient of determination
c) KNN
d) Algorithm

Answer: b
Clarification: The square of the correlation coefficient, r^2, is always non-negative and is called the coefficient of determination. It is also equal to the proportion of the total variability that is explained by a linear model.
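The relationship between r and the coefficient of determination can be checked numerically. The following is a minimal sketch on simulated data (the data and variable names are made up for illustration); it assumes a simple linear regression with one predictor.

```r
# Simulated data (hypothetical): y depends linearly on x plus noise
set.seed(1)
x <- 1:20
y <- 2 + 3 * x + rnorm(20, sd = 4)

fit <- lm(y ~ x)

r <- cor(x, y)                       # correlation coefficient
r_squared <- summary(fit)$r.squared  # coefficient of determination from the model

# For simple linear regression the two quantities agree
all.equal(r^2, r_squared)
```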

3. Predicting y for a value of x that’s outside the range of values we actually saw for x in the original data is called ___________
a) Regression
b) Extrapolation
c) Interpolation
d) Polation

Answer: b
Clarification: Predicting y for a value of x that is within the interval of points that we saw in the original data is called interpolation. Predicting y for a value of x that’s outside the range of values we actually saw for x in the original data is called extrapolation.

4. What is predicting y for a value of x that is within the interval of points that we saw in the original data called?
a) Regression
b) Extrapolation
c) Interpolation
d) Polation

Answer: c
Clarification: Predicting y for a value of x that is within the interval of points that we saw in the original data is called interpolation. Predicting y for a value of x that’s outside the range of values we actually saw for x in the original data is called extrapolation.
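The difference between the two kinds of prediction can be sketched as follows. The data values are invented for illustration; predict() is applied to one x inside the observed range and one outside it.

```r
# Hypothetical observations with x ranging from 1 to 5
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
fit <- lm(y ~ x)

# One new x inside the observed range, one outside it
new_x <- data.frame(x = c(2.5, 8))
pred <- predict(fit, newdata = new_x)

# Label each prediction by whether x lies within the range of the original data
inside <- new_x$x >= min(x) & new_x$x <= max(x)
data.frame(x = new_x$x,
           prediction = pred,
           type = ifelse(inside, "interpolation", "extrapolation"))
```

Extrapolated predictions rely on the linear relationship continuing to hold outside the observed range, which the data cannot confirm.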

5. Analysis of variance in short form is?
a) ANOV
b) AVA
c) ANOVA
d) ANVA

Answer: c
Clarification: ANOVA stands for analysis of variance. If the ANOVA test determines that the model explains a significant portion of the variability in the data, then we can consider testing each of the hypotheses and correcting for multiple comparisons.

6. ________ is a simple approach to supervised learning. It assumes that the dependence of Y on X1, X2, . . . Xp is linear.
a) Linear regression
b) Logistic regression
c) Gradient Descent
d) Greedy algorithms

Answer: a
Clarification: Linear regression is a simple approach to supervised learning. It assumes that the dependence of Y on X1, X2, . . . Xp is linear. Linear regression is an incredibly powerful tool for analysing data.

7. Although it may seem overly simplistic, _______ is extremely useful both conceptually and practically.
a) Linear regression
b) Logistic regression
c) Gradient Descent
d) Greedy algorithms

Answer: a
Clarification: Linear regression is a simple approach to supervised learning. It assumes that the dependence of Y on X1, X2, . . . Xp is linear. Linear regression is an incredibly powerful tool for analysing data.

8. When there is more than one independent variable in the model, the linear model is termed a _______
a) Unimodal
b) Multiple model
c) Multiple Linear model
d) Multiple Logistic model

Answer: c
Clarification: When there is more than one independent variable in the model, the linear model is termed the multiple linear regression model.

9. The parameter β0 is termed the intercept term and the parameter β1 is termed the slope parameter. These parameters are usually called _________
a) Regressionists
b) Coefficients
c) Regressive
d) Regression coefficients

Answer: d
Clarification: The parameter β0 is termed the intercept term and the parameter β1 is termed the slope parameter. These parameters are usually called regression coefficients.
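As a small illustration, the estimated regression coefficients of a fitted model can be read off with coef(). The data below are invented; b0 and b1 are just illustrative names for the estimates of β0 and β1.

```r
# Hypothetical data
x <- c(1, 2, 3, 4, 5)
y <- c(3.1, 4.9, 7.2, 8.8, 11.0)

fit <- lm(y ~ x)
b <- coef(fit)            # named vector of regression coefficients

b0 <- b[["(Intercept)"]]  # estimate of the intercept term
b1 <- b[["x"]]            # estimate of the slope parameter
c(intercept = b0, slope = b1)
```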

10. Obtaining the estimates by minimizing the sum of squares of the differences between the observations and the line in the horizontal direction of the scatter diagram is generally called?
a) reverse regression method
b) formal regression
c) logistic regression
d) simple regression

Answer: a
Clarification: The sum of squares of the differences between the observations and the line in the horizontal direction of the scatter diagram can be minimized to obtain the estimates of β0 and β1. This is generally called a reverse or inverse regression method.

11. ______ regression method is also known as the ordinary least squares estimation.
a) Simple
b) Direct
c) Indirect
d) Mutual

Answer: b
Clarification: The direct regression method is also known as ordinary least squares estimation. It assumes that a set of n paired observations is available which satisfies the linear regression model.
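The contrast between direct and reverse regression can be sketched in R. This is only an illustration on invented data: direct regression is fitted with lm(y ~ x), and reverse regression is approximated here by regressing x on y and rewriting the fitted line in terms of y.

```r
# Hypothetical paired observations
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.2, 3.8, 6.1, 8.3, 9.6, 12.4)

# Direct regression (ordinary least squares): minimize vertical distances
direct <- lm(y ~ x)
slope_direct <- coef(direct)[["x"]]

# The same slope from the closed-form least squares formula Sxy / Sxx
slope_manual <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Reverse regression: minimize horizontal distances by regressing x on y,
# then express the fitted line as y in terms of x
reverse <- lm(x ~ y)
slope_reverse <- 1 / coef(reverse)[["y"]]

c(direct = slope_direct, manual = slope_manual, reverse = slope_reverse)
```

The two slopes coincide only when the points lie exactly on a line; in general their ratio equals the coefficient of determination.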

12. __________ refers to a group of techniques for fitting and studying the straight-line relationship between two variables.
a) Linear regression
b) Logistic regression
c) Gradient Descent
d) Greedy algorithms

Answer: a
Clarification: Linear regression is an incredibly powerful tool for analysing data. It focuses on finding one of the simplest types of relationship: a linear one. This process is unsurprisingly called linear regression, and it has many applications.

13. In order to calculate confidence intervals and hypothesis tests, it is assumed that the errors are independent and normally distributed with mean zero and constant _______
a) Mean
b) Variance
c) SD
d) KNN

Answer: b
Clarification: In order to calculate confidence intervals and hypothesis tests, it is assumed that the errors are independent and normally distributed with mean zero and constant variance.

14. What do we do with a curvilinear relationship in linear regression?
a) consider
b) ignore
c) may be considered
d) sometimes consider

Answer: b
Clarification: Linear regression models the straight-line relationship between Y and X. Any curvilinear relationship is ignored. This assumption is most easily evaluated by using a scatter plot.

15. When hypothesis tests and confidence limits are to be used, the residuals are assumed to follow the __________ distribution.
a) Formal
b) Mutual
c) Normal
d) Abnormal

Answer: c
Clarification: When hypothesis tests and confidence limits are to be used, the residuals are assumed to follow the normal distribution.

250+ TOP MCQs on R Programming Basics and Answers

R Programming Language Multiple Choice Questions on “Basics”.

1. Is it possible to inspect the source code of R?
a) Yes
b) No
c) Can’t say
d) Sometimes

Answer: a
Clarification: Anybody is free to download and install these packages and even inspect the source code. The instructions for obtaining R largely depend on the user’s hardware and operating system.

2. How do you install a package named foo and all of the other packages on which foo depends?
a) install.packages(foo, dependencies = TRUE)
b) R.install.packages("foo", dependencies = TRUE)
c) install.packages("foo", dependencies = TRUE)
d) install("foo", dependencies = FALSE)

Answer: c
Clarification: To install a package named foo, open up R and type install.packages("foo"). To install foo and additionally install all of the other packages on which foo depends, instead type install.packages("foo", dependencies = TRUE).

3. The __________ function is used to list all available packages in the library.
a) lib()
b) fun.lib()
c) libr()
d) library()

Answer: d
Clarification: Type library() at the command prompt to see a list of all available packages in the library. For total information about the installation of R and add-on packages, see the R Installation and Administration manual.

4. The longer programs are called ____________
a) Files
b) Structures
c) Scripts
d) Data

Answer: c
Clarification: Longer programs are called scripts. They contain too much code to write all at once at the command prompt. Furthermore, for longer scripts, it is convenient to be able to modify only a certain piece of the script and run it again in R.

5. Scripts will run on ___________________
a) Script Editors
b) Console
c) Terminal
d) GCC Compiler

Answer: a
Clarification: Script editors are designed to aid the communication and code writing process. They have all sorts of features including R syntax highlighting, automatic code completion, delimiter matching, and dynamic help on the R functions.

6. Which of the following is a “Recommended” package in R?
a) Util
b) Lang
c) Stats
d) Spatial

Answer: d
Clarification: “Recommended” packages also include boot, class, cluster, codetools, foreign, KernSmooth, lattice, mgcv, nlme, rpart, survival, MASS, nnet, Matrix. There are about ten thousand packages in R now.

7. Full Form of GUI is ___________________
a) Guided User Interface
b) Graphical User Interface
c) Guided Used Interface
d) Graphical User Interval

Answer: b
Clarification: GUI stands for Graphical User Interface. GUI elements are usually accessed through a device. Programs running a GUI use a consistent set of graphical elements, so that once the user learns a particular interface, the same elements are familiar in other programs.

8. ____________ provides a point-and-click interface to many basic statistical problems.
a) Commander
b) GUI
c) Console
d) Terminal

Answer: a
Clarification: R Commander provides a point-and-click interface to statistical problems. It is called the “Commander” because every time one makes a selection, the code corresponding to the task is listed in the output window.

9. What will be the output of the following R code?

options(digits = 16)
20/6

a) 3.33
b) 3.333
c) 3.3333333
d) 3.333333333333333

Answer: d
Clarification: 20/6 is a repeating decimal. The number of significant digits displayed can be changed with options(digits = ...); with digits = 16, R prints 16 significant digits of the result.
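A short sketch of how the digits option affects display is given below. It saves and restores the previous setting so that the rest of the session is unaffected; the variable names are illustrative.

```r
default_view <- format(20 / 6, digits = 7)  # 7 significant digits (R's default)

old <- options(digits = 16)  # options() returns the previous settings
print(20 / 6)                # now printed with up to 16 significant digits
wide_view <- format(20 / 6)  # format() follows the digits option
options(old)                 # restore the previous setting

c(default_view, wide_view)
```

The digits option counts significant digits, not digits after the decimal point, and it changes only how numbers are displayed, not how they are stored.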

10. In which IDE can we interact with R?
a) RStudio
b) Console
c) GCC
d) Power shell

Answer: a
Clarification: RStudio is an IDE tailored to the needs of interactive data analysis and statistical programming. In RStudio we can interact directly with R through its built-in functions and packages. We can also download new packages.

11. Which programming language is more based on the results?
a) R
b) C
c) C++
d) Java

Answer: a
Clarification: Compared to other programming languages, the R community tends to be more focussed on results instead of processes. Knowledge of software engineering best practices is patchy.

12. Why can learning R be tough?
a) Special files
b) Functions
c) Packages
d) Special Cases

Answer: d
Clarification: You are confronted with over 20 years of evolution every time you use R. Learning R can be hard because there are many special cases to remember. R can also be a profligate user of memory.

13. R is mostly used in ______________
a) Problem solving
b) Statistics
c) Probability
d) All of the mentioned

Answer: d
Clarification: R is used for statistics by relatively advanced users. It has thousands of packages, designed, maintained, and widely used by statisticians. We can write our own code if a command is not available.

14. Why does RStudio need to be updated regularly?
a) Bugs
b) More Functions
c) Methods
d) For more packages

Answer: a
Clarification: RStudio is very popular with a nice interface and well thought out, especially for more advanced usage. It can be a bit buggy, so make sure you update it regularly. Available on all platforms.

15. What is the meaning of "<-"?
a) Functions
b) Loops
c) Addition
d) Assignment

Answer: d
Clarification: The expression a <- 16 creates a variable called a and gives it the value 16; this is called assignment. The variable on the left is assigned the value on the right. The left side should be a single variable name.

16. In the expression x <- 4 in R, what is the class of x as determined by the class() function?
a) Character
b) Numeric
c) Integer
d) Word

Answer: b
Clarification: In R, a number typed without a suffix is stored as a double-precision value, so class(x) returns "numeric". To create an integer explicitly, use the L suffix, as in x <- 4L.
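The class of a plain number can be checked directly. A minimal sketch (variable names are illustrative):

```r
x <- 4
class_x <- class(x)    # "numeric"
type_x <- typeof(x)    # "double": the underlying storage type

y <- 4L                # the L suffix requests an integer
class_y <- class(y)    # "integer"

c(class_x, type_x, class_y)
```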

250+ TOP MCQs on Subsetting and Answers

R Programming Language Multiple Choice Questions on “Subsetting”.

1. Which of the following extracts the first element from the following R vector?

 > x <- c("a", "b", "c", "c", "d", "a")

a) x[10]
b) x[1]
c) x[0]
d) x[2]

Answer: b
Clarification: The element which we want to extract will be in the format of variable[index value of the element] in R script.

2. Point out the correct statement?
a) There are three operators that can be used to extract subsets of R objects
b) The [ operator is used to extract elements of a list or data frame by literal name
c) The [[ operator is used to extract elements of a list or data frame by string name
d) There are five operators that can be used to extract subsets of R objects

Answer: a
Clarification: The three operators are [, [[ and $.

3. Which of the following extracts the first four elements from the following R vector?

 > x <- c("a", "b", "c", "c", "d", "a")

a) x[0:4]
b) x[1:4]
c) x[0:3]
d) x[4:3]

Answer: b
Clarification: The multiple successive elements which we want to extract will be in the format of variable[index value of the start element:index value of the last element] in R script.

4. What will be the output of the following R code?

> x <- c("a", "b", "c", "c", "d", "a")
> x[c(1, 3, 4)]

a) "a" "b" "c"
b) "a" "c" "c"
c) "a" "c" "b"
d) "b" "c" "b"

Answer: b
Clarification: The sequence does not have to be in order; you can specify any arbitrary integer vector.
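The vector subsetting forms used in the questions above can be tried together in one short sketch:

```r
x <- c("a", "b", "c", "c", "d", "a")

first <- x[1]              # single element; R indices start at 1
first_four <- x[1:4]       # a run of successive elements
picked <- x[c(1, 3, 4)]    # any integer vector of positions
nothing <- x[0]            # index 0 selects no elements

first
first_four
picked
length(nothing)
```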

5. Point out the wrong statement?
a) $ operator semantics are similar to that of [[
b) The [ operator always returns an object of the same class as the original
c) The $ operator is used to extract elements of a list or a data frame
d) There are three operators that can be used to extract subsets of R objects

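Answer: b
Clarification: The [ operator usually returns an object of the same class as the original, but not always: subsetting a matrix with [ drops dimensions by default and returns a vector unless drop = FALSE is given. The $ operator does extract elements of a list or a data frame by name, and its semantics are similar to those of [[. The [[ operator can only be used to extract a single element, and the class of the returned object will not necessarily be a list or data frame.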

6. What will be the output of the following R code?

> x <- matrix(1:6, 2, 3)
> x[1, 2]

a) 3
b) 2
c) 1
d) 0

Answer: a
Clarification: Matrices can be subsetted in the usual way with (i,j) type indices where i is the row and j is the column numbers.

7. What will be the output of the following R code?

> x <- matrix(1:6, 2, 3)
> x[1, ]

a) 1 3 5
b) 2 3 5
c) 3 3 5
d) file

Answer: a
Clarification: Matrices can be subsetted in the usual way with (i,j) type indices where i is the row and j is the column numbers. If only row or only column number is specified, then the respective full row or column is printed.

8. Which of the following R code extracts the second column of a matrix x?

a) x[2, ]
b) x[1, 2]
c) x[, 2]
d) x[1 1 2]

Answer: c
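Clarification: Leaving an index empty selects everything along that dimension, so x[, 2] returns the entire second column. This behavior is used to access entire rows or columns of a matrix.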

9. What will be the output of the following R code?

> x <- matrix(1:6, 2, 3)
> x[1, , drop = FALSE]

a)

     [,1] [,2] [,3]
[1,]    1    3    5

b)

     [,1] [,2] [,3]
[1,]    2    3    5

c)

     [,1] [,2] [,3]
[1,]    1    2    5

d) Error

Answer: a
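Clarification: By default, when a single element of a matrix is retrieved, it is returned as a vector of length 1 rather than a 1 × 1 matrix, and a single row is returned as a plain vector. Setting drop = FALSE turns this behavior off, so the first row is returned as a 1 × 3 matrix.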
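The matrix subsetting cases from the last few questions can be compared side by side. A minimal sketch using the same matrix:

```r
x <- matrix(1:6, 2, 3)   # filled column by column: columns are (1,2), (3,4), (5,6)

elem <- x[1, 2]          # row 1, column 2
row1 <- x[1, ]           # whole first row, returned as a plain vector
col2 <- x[, 2]           # whole second column, returned as a plain vector
row1_matrix <- x[1, , drop = FALSE]  # first row kept as a 1 x 3 matrix

elem
row1
col2
dim(row1)         # NULL: dimensions were dropped
dim(row1_matrix)  # dimensions preserved
```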

10. What will be the output of the following R code?

> x <- list(foo = 1:4, bar = 0.6)
> x

a)

$foo
[1] 1 2 3 4
$bar
[1] 0.6

b)

$foo
[1] 0 1 2 3 4
$bar
[1] 0 0.6

c)

$foo
[1] 0 1 2 3 4
$bar
[1] 0.6

d) Error

Answer: a
Clarification: Printing a list shows each element under its name, prefixed by $. The [[ operator can be used to extract single elements from a list.
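The three subsetting operators behave differently on a list. A short sketch with the same list:

```r
x <- list(foo = 1:4, bar = 0.6)

by_double <- x[["foo"]]  # [[ extracts the element itself (an integer vector)
by_dollar <- x$bar       # $ extracts an element by literal name
by_single <- x["foo"]    # [ returns a sub-list containing the element

class(by_double)
by_dollar
class(by_single)
```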

250+ TOP MCQs on Functions and Answers

R Programming Language Multiple Choice Questions on “Functions”.

1. ________ function is usually used inside another function and throws a warning whenever a particular package is not found.
a) Dplyr
b) Require
c) Coin
d) Sample

Answer: b
Clarification: The require() function is usually used inside another function and gives a warning whenever a particular package is not found. The library() function gives an error message if the desired package cannot be loaded.

2. ___________ function gives an error message if the desired package cannot be loaded.
a) Dplyr
b) Require
c) Library
d) Sample

Answer: c
Clarification: The library() function gives an error message if the desired package cannot be loaded. The require() function is usually used inside a function and throws a warning whenever a particular package is not found.
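The difference between the two functions can be observed with a package that is not installed. In this sketch the package name is hypothetical and is assumed not to exist on the system.

```r
pkg <- "notARealPackage123"  # hypothetical name, assumed not to be installed

# require() returns FALSE (with a warning) when the package is missing
ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))

# library() signals an error instead, which is caught here
result <- tryCatch({
  library(pkg, character.only = TRUE)
  "loaded"
}, error = function(e) "error")

ok
result
```

Because require() reports failure through its return value, it suits conditional code inside functions, whereas library() stops a script early when a dependency is missing.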

3. A ________________ in the R programming language can contain letters and digits along with special characters like dot and underscore.
a) Variable name
b) Number
c) Integer
d) Character

Answer: a
Clarification: A variable name in the R programming language can contain letters and digits along with the special characters dot and underscore. Variable names can begin with a letter or with a dot, provided the dot is not followed by a digit.

4. The collection of current user-defined objects like lists, vectors, etc. is referred to as the __________ in the R language.
a) Work names
b) Workspace
c) Environment
d) Console

Answer: b
Clarification: The current R working environment of the user, which holds user-defined objects like lists, vectors, etc., is referred to as the workspace in the R language. The workspace of R is flexible to all functions of statistics.

5. Which function helps you perform sorting in R language?
a) Order
b) Inorder
c) Simple
d) Library

Answer: a
Clarification: order() returns a permutation which rearranges its first argument into ascending or, with decreasing = TRUE, descending order. The result is a vector of positions: each value gives the position in the original object of the element that belongs at that rank.
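How the permutation returned by order() is used for sorting can be sketched as follows. The scores and names are made up for illustration.

```r
scores <- c(70, 95, 82)

idx <- order(scores)      # positions of the values in ascending order
sorted_up <- scores[idx]
sorted_down <- scores[order(scores, decreasing = TRUE)]

# Sorting the rows of a data frame by one of its columns
df <- data.frame(name = c("Ann", "Bob", "Cy"), score = scores)
df_sorted <- df[order(df$score), ]

idx
sorted_up
sorted_down
df_sorted
```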

6. Which function is used to create a histogram for visualisation in R programming language?
a) Library
b) Hist
c) Data
d) Refer

Answer: b
Clarification: The hist() function is used to create a histogram for visualisation in the R programming language. The generic function hist() computes a histogram of the given data values. It takes a vector as input and accepts further parameters that control how the histogram is plotted.

7. Write the syntax to set the path of the current working directory in R environment?
a) Setwd(“dipath”)
b) Setwd(dir_path)
c) Setwd(“dir_path”)
d) Set(“dir_path”)

Answer: c
Clarification: The working directory is set with setwd("dir_path"), where the path is passed as a character string. Note that R is case-sensitive, so in actual code the function name is written in lowercase: setwd().

8. What will be the output of runif()?
a) Random number
b) Numbers
c) Character
d) Path generation

Answer: a
Clarification: Random numbers from a uniform distribution can be generated using the runif() function. We can specify the range of the uniform distribution with the min and max arguments. If they are not provided, the default range is between 0 and 1.

9. ________ function generates “n” normal random numbers based on the mean and standard deviation arguments passed to the function.
a) rnorm
b) vnorm
c) knorm
d) lnorm

Answer: a
Clarification: The rnorm() function generates n normal random numbers based on the mean and standard deviation arguments passed to it. Its arguments are rnorm(n, mean = 0, sd = 1), so the defaults give standard normal values.
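The two random number generators from the last two questions can be compared in a short sketch. The seed, ranges, and sample sizes are arbitrary choices for illustration.

```r
set.seed(42)  # make the random draws reproducible

u_default <- runif(5)                   # uniform on [0, 1] by default
u_range <- runif(5, min = 10, max = 20) # uniform on [10, 20]

z <- rnorm(1000, mean = 50, sd = 2)     # 1000 normal draws

range(u_range)
mean(z)  # close to 50
sd(z)    # close to 2
```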

10. Which function can be used to extract the first name from the string “Mrs. Jake Luther”?
a) Substring
b) Substr
c) Substi
d) Return

Answer: b
Clarification: The substr() function returns part of a string. It takes the string together with start and stop positions, and returns the characters from the start position up to and including the stop position.
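For the string in the question, the first name occupies character positions 6 to 9, so a minimal sketch looks like this:

```r
full_name <- "Mrs. Jake Luther"

# Characters 1-4 are "Mrs.", character 5 is a space, characters 6-9 are "Jake"
first_name <- substr(full_name, start = 6, stop = 9)
first_name
```

Hard-coded positions only work when the layout of the string is known in advance; splitting on spaces with strsplit() is a common alternative.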

11. _________ package provides basic functionalities in R environment like arithmetic calculations, input/output.
a) R base
b) R boost
c) R serve
d) R comm

Answer: a
Clarification: The R base package is loaded by default whenever the R programming environment is started. It provides basic functionalities in the R environment like arithmetic calculations and input/output.

12. Which function basically finds the intersection between two different sets of data?
a) Converge
b) Merge
c) Delegate
d) Swap

Answer: b
Clarification: The merge() function is used to combine two data frames by identifying common rows or columns between them. By default it keeps only the rows whose keys appear in both data frames, which is the intersection of the two sets of data.
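The intersection behavior of merge() can be seen on two small data frames. The data and column names below are invented for illustration.

```r
people <- data.frame(id = c(1, 2, 3), name = c("A", "B", "C"))
results <- data.frame(id = c(2, 3, 4), score = c(10, 20, 30))

# Default: keep only ids present in both data frames (the intersection)
m_inner <- merge(people, results, by = "id")

# all = TRUE keeps every id from either data frame, filling gaps with NA
m_all <- merge(people, results, by = "id", all = TRUE)

m_inner
m_all
```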

13. Which function calculates the count of each category of a categorical variable?
a) Table
b) Intact
c) Tables
d) Retabs

Answer: a
Clarification: The frequency distribution of a categorical variable can be checked using the table() function in the R language. The table() function calculates the count of each category of a categorical variable.

14. The cumulative frequency distribution of a categorical variable can be checked using the ________ function in R language.
a) Sum
b) Cumsum
c) Lumpsum
d) Resum

Answer: b
Clarification: The cumulative frequency distribution of a categorical variable can be checked by applying the cumsum() function to its frequency table. The frequency distribution itself can be checked using the table() function.
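Frequency and cumulative frequency can be computed together in a couple of lines. The categorical values below are made up for illustration.

```r
colours <- c("red", "blue", "red", "green", "blue", "red")

counts <- table(colours)      # count per category, in alphabetical order
cumulative <- cumsum(counts)  # running total over those counts

counts
cumulative
```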

15. A programmer builds a _________ to avoid repeating the same task or reduce complexity.
a) Function
b) Package
c) Code
d) Console

Answer: a
Clarification: A function, in a programming environment, is a set of instructions. A programmer builds a function to avoid repeating the same task or reduce complexity. A function should be written to carry out specified tasks and may or may not include arguments.

250+ TOP MCQs on Data Wrangling and Answers

Basic R Programming Interview questions and answers focus on “Data Wrangling”.

1. __________ is used when you have variables that form rows instead of columns.
a) tidy()
b) spread()
c) separate()
d) gather()

Answer: b
Clarification: spread() takes a key-value pair of columns and spreads them into multiple columns. You need spread() less frequently than gather() or separate().

2. Point out the correct statement?
a) tidyr and dplyr packages do not make use of the pipe operator
b) tidyr does less than reshape2
c) tidyr provides ability to string multiple functions together by incorporating %
d) tidyr does more than reshape2

Answer: b
Clarification: Just as reshape2 did less than reshape, tidyr does less than reshape2.

3. Which of the following merges two variables into one?
a) spread()
b) gather()
c) separate()
d) unite()

Answer: d
Clarification: The unite() function is a convenience function to paste together multiple variable values into one.

4. How many functions exist for wrangling the data with dplyr package?
a) one
b) seven
c) three
d) five

Answer: b
Clarification: dplyr provides seven main functions for tidying your messy data.

5. Point out the correct statement?
a) gather() makes “long” data wider
b) tidyr is a reframing of reshape designed to accompany the tidy data framework
c) there are two fundamental verbs of data tidying
d) tidyr and dplyr packages do not make use of the pipe operator

Answer: c
Clarification: The two fundamental verbs of data tidying are gather() and spread(). Note that tidyr's built-in methods only work for data frames, and tidyr provides no margins or aggregation.

6. ________ adds new variables/columns or transforms existing variables.
a) mutate
b) add
c) append
d) arrange

Answer: a
Clarification: mutate adds new variables/columns or transforms existing variables, whereas arrange is used to reorder rows of a data frame.

7. _________ extracts a subset of rows from a data frame based on logical conditions.
a) rename
b) filter
c) set
d) subset

Answer: b
Clarification: filter extracts a subset of rows from a data frame based on logical conditions, whereas rename is used to rename variables in a data frame.

8. Spread function is known as ___________ in spreadsheets.
a) pivot
b) unpivot
c) cast
d) order

Answer: b
Clarification: Spread is known by other names in other places: it’s cast in reshape2, unpivot in spreadsheets and unfold in databases.

250+ TOP MCQs on Predictive Analytics and Answers

R Programming Language Multiple Choice Questions on “Predictive Analytics”.

1. The IBM _________ analytics appliances combine high-capacity storage for Big Data with a massively-parallel processing platform for high-performance computing.
a) Watson
b) Netezza
c) InfoSight
d) LityxEQ

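Answer: b
Clarification: IBM Watson is a system based on cognitive computing, whereas the IBM Netezza analytics appliances combine high-capacity storage for Big Data with a massively-parallel processing platform. With the addition of Revolution R Enterprise for IBM Netezza, you can use the power of the R language to build predictive models on Big Data.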

2. ______ is an integrated hosted analytics platform for marketing insights, predictive models, and marketing optimization.
a) LityxEQ
b) Watson
c) LityxIQ
d) InfoSight

Answer: c
Clarification: LityxIQ allows marketers to automate the loading and managing of multiple data sources, automatically build and manage predictive models, and optimize marketing budget and media decisions.

3. ________ is rapidly being adopted for computing descriptive and query types of analytics on Big data.
a) EDR
b) Hadoop
c) Azure
d) InfoSight

Answer: b
Clarification: Hadoop is rapidly being adopted for descriptive and query types of analytics on Big Data. However, it has a reputation for not being a suitable environment for high performance complex iterative algorithms such as logistic regression, generalized linear models, and decision trees.

4. _________ involves predicting a response with meaningful magnitude, such as quantity sold, stock price, or return on investment.
a) Regression
b) Summarization
c) Clustering
d) Classification

Answer: a
Clarification: Regression and classification are two common types of predictive models.

5. Which of the following involves predicting a categorical response?
a) Regression
b) Summarization
c) Clustering
d) Classification

Answer: d
Clarification: Classification techniques are widely used in data mining to classify data.

6. Which of the following contains pre-built predictive tools?
a) alteryx
b) fossilx
c) paleoTS
d) ssas

Answer: a
Clarification: Alteryx Analytics, with deep integration of the R statistics and predictive language, offers a way to bridge these two worlds of ease of use and sophisticated predictive analytics.

7. __________ is a proprietary tool for predictive analytics.
a) R
b) SAS
c) SSAS
d) EDR

Answer: b
Clarification: SAS (Statistical Analysis System) is a software suite developed by SAS Institute for advanced analytics.

8. Which of the following is preferred for text analytics?
a) R
b) Python
c) S
d) EDR

Answer: b
Clarification: pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

9. ______ is the simplest class of analytics.
a) Descriptive
b) Predictive
c) Prescriptive
d) Summarization

Answer: a
Clarification: Descriptive is the simplest class of analytics. Predictive analytics can only forecast what might happen in the future because all predictive analytics are probabilistic in nature.

10. _________ is a JavaScript charting library and feature-rich API set that lets you build interactive Flash or HTML5 charts.
a) InstantAtlas
b) Alterian
c) ZingChart
d) paleoTS

Answer: c
Clarification: ZingChart lets you create HTML5 Canvas charts and more.