250+ TOP MCQs on Reading Datasets and Answers

R Programming Interview Questions and Answers for freshers, focusing on “Reading Datasets”.

1. What will be the Correct R code for the following output?

foo bar
1 1 TRUE
2 2 TRUE
3 3 FALSE
4 4 FALSE

a)

> x <- data.frame(foo = 1:4, bar = c(F, T, F, F))
> x

b)

> x <- data.frame(foo = 1:6, bar = c(F, T, F, F))
> x

c)

> x <- data.frame(foo = 1:4, bar = c(T, T, F, F))
> x

d)

> x <- data.frame(foo = 14:1, bar = c(F, T, F, F))
> x

Answer: c
Clarification: Data frames are used to store tabular data in R. Only option c pairs foo = 1:4 with bar = c(T, T, F, F), which prints as TRUE, TRUE, FALSE, FALSE.

2. Point out the wrong statement?
a) is.nan() is used to test objects if they are NA
b) is.nan() is used to test for NaN
c) NA values have a class
d) NA values have a class, so there are integer NA, character NA, etc

Answer: a
Clarification: A NaN value is also NA but the converse is not true.
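
The difference between the two tests can be checked directly at the console. This is a minimal sketch; the vector x is a made-up example, not part of the original question.

```r
# Hypothetical example vector containing one NA and one NaN
x <- c(1, NA, NaN)

na_flags  <- is.na(x)   # TRUE for both NA and NaN
nan_flags <- is.nan(x)  # TRUE only for NaN

print(na_flags)
print(nan_flags)

# NA values carry a class, so typed NAs exist
print(class(NA))             # logical
print(class(NA_integer_))    # integer
print(class(NA_character_))  # character
```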

3. Data frames can be converted to a matrix by calling _______
a) as.matr()
b) as.mat()
c) as.matrix()
d) as.max()

Answer: c
Clarification: as.matrix() function should be used to coerce a data frame to a matrix.
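
A short sketch of the coercion, assuming two small illustrative data frames (x and y are hypothetical). Because a matrix holds a single type, mixing in a character column forces every cell to character.

```r
# Integer and logical columns coerce to a numeric (integer) matrix
x <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))
m <- as.matrix(x)
print(dim(m))
print(is.numeric(m))

# A character column forces the whole matrix to character
y <- data.frame(id = 1:2, city = c("Boston", "London"))
my <- as.matrix(y)
print(is.character(my))
```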

4. What will be the output of the following R code?

> x <- data.frame(foo = 1:4, bar = c(T, T, F, F))
> ncol(x)

a) 2
b) 4
c) 7
d) 9

Answer: a
Clarification: x has two columns, foo and bar, so ncol(x) returns 2. Data frames are represented as a special type of list in which every element must have the same length.

5. Point out the correct statement?
a) Using factors with labels is better than using integers because factors are self-describing
b) Factors are used to represent categorical data and can be unordered or ordered
c) Factors are important in statistical modeling and are treated specially by modelling functions like lm() and glm()
d) All of the mentioned

Answer: d
Clarification: Having a variable that has values “Male” and “Female” is better than a variable that has values 1 and 2.
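
The points about factors can be illustrated with a small sketch. The variables sex and size are made-up examples; the level order is set explicitly here rather than left to the alphabetical default.

```r
# Unordered factor: labels are self-describing, integer codes sit underneath
sex <- factor(c("Male", "Female", "Female", "Male"),
              levels = c("Male", "Female"))
counts <- table(sex)
print(counts)
print(as.integer(sex))  # underlying integer codes

# Ordered factor: comparisons between levels are meaningful
size <- factor(c("low", "high", "medium"),
               levels = c("low", "medium", "high"),
               ordered = TRUE)
print(size[1] < size[2])
```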

6. Which of the following is invalid assignment?
a)

 > x <- list("Los Angeles" = 1, Boston = 2, London = 3)

b)

 > names(x) <- c("New York", "Seattle", "Los Angeles")

c)

 > name(x) <- c("New York", "Seattle", "Los Angeles")

d)

 > names(x) <- c("New York", "Los Angeles", "Los Angeles")

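Answer: c
Clarification: Lists can have names, which are set with names(). There is no name() function in base R, so the assignment in option c fails with an error. Duplicate names, as in option d, are allowed.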

7. What will be the output of the following R code?

a) NULL
b) 1
c) 2
d) 4.5

Answer: a
Clarification: R objects can have names, which is very useful for writing readable code and self-describing objects.
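
As a hedged illustration of the clarification (not necessarily the code this question originally showed), names() returns NULL for an object that has no names yet. The vector x is a hypothetical example.

```r
# An unnamed vector has no names attribute
x <- 1:3
before <- names(x)
print(before)  # NULL

# After assignment the names are stored with the object
names(x) <- c("New York", "Seattle", "Los Angeles")
after <- names(x)
print(x)
```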

8. Which of the following statement changes column name to h and f?
a) colnames(m) <- c("h", "f")
b) columnnames(m) <- c("h", "f")
c) rownames(m) <- c("h", "f")
d) rownames(m) <- c("f", "f")

Answer: a
Clarification: Column names and row names can be set separately using the colnames() and rownames() functions.
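
A minimal sketch of setting both sets of names, assuming a hypothetical 2 x 2 matrix m; the row names "x" and "z" are chosen only for illustration.

```r
# Hypothetical 2 x 2 matrix, filled column by column
m <- matrix(1:4, nrow = 2, ncol = 2)

colnames(m) <- c("h", "f")   # column names
rownames(m) <- c("x", "z")   # row names, set independently

print(m)
print(m["x", "f"])  # index by name: row "x", column "f"
```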

9. Which of the following is used for reading tabular data?
a) read.csv
b) dget
c) readLines
d) writeline

Answer: a
Clarification: read.csv reads comma-separated tabular data into a data frame. read.table is the more general function for reading tabular data.

250+ TOP MCQs on Functions and Answers

R Programming Language Multiple Choice Questions on “Functions”.

1. Which package can be integrated with dplyr for large fast tables?
a) Table
b) Data, dplyr
c) Data.table
d) Dplyr.table

Answer: c
Clarification: The data.table package (the name is all lowercase in R) can be used alongside dplyr to work with large tables quickly. dplyr is a package for data manipulation, written and maintained by Hadley Wickham, that streamlines and speeds up data frame management code.

2. In the base graphics system, which function is used to add elements to a plot?
a) Boxplot()
b) Text()
c) Boxplot() or Text()
d) Treat()

Answer: c
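Clarification: R function names are case-sensitive, so the actual functions are text() and boxplot(). text() adds labels to an existing plot; boxplot() normally starts a new plot but can draw onto the current one when called with add = TRUE.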

3. What are the different types of sorting algorithms available in R language?
a) Bubble
b) Selection
c) Merge
d) All sorts

Answer: d
Clarification: Bucket sort, selection sort, quick sort, bubble sort, and merge sort can all be implemented in R. The built-in sort() function itself offers the methods "auto", "shell", "quick", and "radix".

4. What is the command used to store R objects in a file?
a) save(x, file="x.Rdata")
b) save(x, file=x.Rdata)
c) save(x, file="x.Rdata");
d) save(x, file="x.data")

Answer: a
Clarification: The function save() writes one or more R objects to a specified file (conventionally with an .RData or .rda extension). The objects can be read back from the file with load(). Note that objects saved with save() are restored under their original names; load() cannot restore them under a different name.
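
A minimal sketch of the save()/load() round trip, using a temporary directory so no real file is overwritten. The object x and the file paths are illustrative assumptions. saveRDS()/readRDS() are shown as the base R alternative when a different name is wanted on restore.

```r
# Hypothetical object to store
x <- c(a = 1, b = 2)

# save() writes the object together with its name
path <- file.path(tempdir(), "x.RData")
save(x, file = path)
rm(x)

# load() restores it under the original name and returns that name
loaded <- load(path)
print(loaded)
print(x)

# saveRDS()/readRDS() store a single object without its name,
# so it can be assigned to any variable when read back
rds <- tempfile(fileext = ".rds")
saveRDS(x, rds)
y <- readRDS(rds)

unlink(c(path, rds))
```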

5. ___________ can be used for storing the data for long-term.
a) HDLS
b) HDFS
c) HDLSV
d) HSSLV

Answer: b
Clarification: The Hadoop Distributed File System (HDFS) is the primary data storage system used in Hadoop applications. It is a distributed file system designed to run on commodity hardware.

6. MapReduce jobs submitted from either Oozie, Pig or Hive can be used to encode, improve and sample the data sets from _________ into R.
a) HDLS
b) HDFS
c) HDLSV
d) HSSLV

Answer: b
Clarification: MapReduce jobs submitted from Oozie, Pig, or Hive can be used to encode, improve, and sample the data sets from HDFS into R. This makes it possible to run complex analysis tasks in R on the prepared subset of the data.

7. What will be the output of log (-5.8) when executed on R console?
a) NaN
b) NA
c) Error
d) 0.213

Answer: a
Clarification: log(-5.8) returns NaN (Not a Number) and issues a warning that NaNs were produced, because the logarithm of a negative number is not defined for real numbers.
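
The behaviour can be reproduced as follows. This is a small sketch; the warning is suppressed or captured here only so the result can be inspected, and the exact warning text may vary with the session's language settings.

```r
# The result is NaN; the warning is suppressed for inspection
res <- suppressWarnings(log(-5.8))
print(res)
print(is.nan(res))

# Capture the warning message instead of the value
msg <- tryCatch(log(-5.8), warning = function(w) conditionMessage(w))
print(msg)

# A positive argument is fine
print(log(5.8))
```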

8. How is a Date object represented internally in R language?
a) unclass(as.time("2018-12-28"))
b) unclass(as.dat("2018-12-28"))
c) unclass(as.D(2018-12-28))
d) unclass(as.Date("2018-12-28"))

Answer: d
Clarification: unclass() returns (a copy of) its argument with its class attribute removed. Applied to a Date, it reveals the internal representation: the number of days since 1970-01-01.
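
A brief sketch of what unclass() shows for the date used in the question. Dates in R are stored as the number of days since 1970-01-01.

```r
d <- as.Date("2018-12-28")
print(class(d))

# Strip the class to see the underlying day count
days <- unclass(d)
print(days)

# Converting the day count back gives the same date
back <- as.Date(days, origin = "1970-01-01")
print(back)
```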

9. Which package in R supports the exploratory analysis of genomic data?
a) Adegenat
b) Adegenet
c) Adegnet
d) Adezenet

Answer: b
Clarification: The adegenet package (the name is lowercase in R) supports the exploratory analysis of genetic and genomic data. R has a large number of built-in functions, and users can create their own. In R, a function is an object, so the R interpreter is able to pass control to the function.

10. __________ can contain heterogeneous inputs.
a) Matrix
b) Data Frames
c) Matrix and Data Frames
d) Does not exist

Answer: b
Clarification: Data frame can contain heterogeneous inputs while a matrix cannot. In the matrix, only similar data types can be stored whereas in a data frame there can be different data types like characters, integers or other data frames.

250+ TOP MCQs on ggplot2 and Answers

R Programming Language Multiple Choice Questions on “ggplot2”.

1. _______ grammar makes a clear distinction between your data and what gets displayed on the screen or page.
a) ggplot1
b) ggplot2
c) d3.js
d) ggplot3

Answer: b
Clarification: The emphasis in ggplot2 is reducing the amount of thinking time by making it easier to go from the plot in your brain to the plot on the page.

2. Point out the wrong statement?
a) mean_se is used to calculate mean and standard errors on either side
b) hmisc wraps up a selection of summary functions from Hmisc to make it easy to use
c) plot is used to create a scatterplot matrix (experimental)
d) translate_qplot_base is used for translating between qplot and base graphics

Answer: c
Clarification: plotmatrix is used to create a scatterplot matrix (experimental).

3. Which of the following cuts numeric vector into intervals of equal length?
a) cut_interval
b) cut_time
c) cut_number
d) cut_date

Answer: a
Clarification: cut_interval cuts a numeric vector into intervals of equal length, whereas cut_number cuts it into intervals containing an equal number of points.

4. Which of the following is a plot to investigate the order in which observations were recorded?
a) ggplot
b) ggsave
c) ggpcp
d) ggorder

Answer: d
Clarification: ggorder plots observations in the order in which they were recorded. ggsave, by contrast, saves a ggplot with sensible defaults.

5. Point out the wrong statement?
a) theme_minimal is minimalistic theme with no background annotations
b) theme_color is classic-looking theme, with x and y axis lines and no gridlines
c) theme_classic is a classic-looking theme
d) translate_qplot_base is used for translating between qplot and base graphics

Answer: b
Clarification: theme_classic is a classic-looking theme, with x and y axis lines and no gridlines.

6. ________ is used for translating between qplot and base graphics.
a) translate_qplot_base
b) translate_qplot_gpl
c) translate_qplot_lattice
d) translate_qplot_ggplot

Answer: a
Clarification: translate_qplot_gpl is used for translating between qplot and Graphics Production Library (GPL).

7. __________ modifies geom/stat aesthetic defaults for future plots.
a) translate_qplot_base
b) translate_qplot_gpl
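c) update_geom_defaults
d) translate_qplot_ggplot

Answer: c
Clarification: update_geom_defaults (with its companion update_stat_defaults) modifies geom/stat aesthetic defaults for future plots. The translate_qplot_* help topics only describe how qplot calls map to other graphics systems.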

8. Which of the following is the discrete scale constructor?
a) discrete_scale
b) ggpcp
c) ggfluctuation
d) ggmissing

Answer: a
Clarification: discrete_scale is the constructor for discrete scales. ggpcp is used to create a parallel coordinate plot.

9. Which of the following creates fluctuation plot?
a) ggmissplot
b) ggmissing
c) ggfluctuation
d) ggpcp

Answer: c
Clarification: ggfluctuation creates a fluctuation plot. ggmissing, by contrast, plots patterns of missing values.

10. ________ is used to create a plot to illustrate patterns of missing values.
a) ggmissplot
b) ggmissing
c) ggfluctuation
d) ggpcp

Answer: b
Clarification: The missing values plot is a useful tool to get a rapid overview of the number and pattern of missing values in a dataset.

250+ TOP MCQs on Visualizing Data and Answers

R Programming Language Multiple Choice Questions on “Visualizing Data”.

1. Which of the following adds marginal sums to an existing table?
a) par()
b) prop.table()
c) addmargins()
d) quantile()

Answer: c
Clarification: addmargins() adds marginal sums to an existing table. prop.table(), by contrast, computes proportions from a contingency table.
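
The two table helpers can be compared on a small example. This is a sketch with made-up counts; the dimension names group and outcome are hypothetical.

```r
# Hypothetical 2 x 2 contingency table
tab <- as.table(matrix(c(10, 20, 30, 40), nrow = 2,
                       dimnames = list(group = c("a", "b"),
                                       outcome = c("yes", "no"))))

# addmargins() appends row and column sums
with_sums <- addmargins(tab)
print(with_sums)

# prop.table() converts counts to proportions of the grand total
props <- prop.table(tab)
print(props)
```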

2. Which of the following lists names of variables in a data.frame?
a) quantile()
b) names()
c) barchart()
d) par()

Answer: b
Clarification: names() gets or sets the names of an object. For a data frame, these are the names of its variables (columns).

3. Which of the following is tool for chi-square distributions?
a) pchisq()
b) chisq()
c) pnorm
d) barchart()

Answer: a
Clarification: pchisq() is the distribution function for chi-square distributions. pnorm() is the corresponding tool for normal distributions.
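
A quick sketch of the two distribution functions, using the familiar 5% critical values as illustrative inputs.

```r
# Chi-square distribution, 1 degree of freedom: P(X <= 3.84) is about 0.95
p_chi <- pchisq(3.84, df = 1)
print(p_chi)

# Standard normal distribution: P(Z <= 1.96) is about 0.975
p_norm <- pnorm(1.96)
print(p_norm)

# The square of a standard normal variable is chi-square with 1 df,
# so the two functions agree on matching events
print(all.equal(pchisq(1.96^2, df = 1), 2 * pnorm(1.96) - 1))
```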

4. Which of the following groups values of a variable into larger bins?
a) cut
b) col.max(x)
c) stem
d) which.max(x)

Answer: a
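Clarification: cut() divides the range of a numeric vector into intervals and codes each value by the interval it falls in. stem() is used to make a stem-and-leaf plot.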
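
A minimal sketch of binning with cut(). The ages vector, break points, and labels are invented for illustration; intervals are closed on the right by default.

```r
# Hypothetical ages to be grouped into three bins
ages <- c(3, 17, 25, 42, 68)

# Intervals are (0,18], (18,40], (40,100]
bins <- cut(ages,
            breaks = c(0, 18, 40, 100),
            labels = c("child", "adult", "senior"))

print(bins)
print(table(bins))
```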

5. Which of the following determine the least-squares regression line?
a) histo()
b) lm
c) barlm()
d) col.max(x)

Answer: b
Clarification: lm() fits linear models by least squares. It calls the lower-level function lm.fit() to do the computation.
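
A small sketch of fitting a least-squares line with lm(). The data are synthetic: y is generated from the assumed line 2x + 1 with a little deterministic noise added, so the estimates land close to those values.

```r
# Synthetic data around the line y = 2x + 1
x <- 1:10
y <- 2 * x + 1 + rep(c(0.1, -0.1), times = 5)

fit <- lm(y ~ x)

# Estimated intercept and slope of the regression line
est <- coef(fit)
print(est)
```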

6. Which of the following is tool for checking normality?
a) qqline()
b) qline()
c) anova()
d) lm()

Answer: a
Clarification: qqline() adds a reference line to a normal Q-Q plot. qqnorm(), which draws the plot itself, is another tool for checking normality.

7. Which of the following is lattice command for producing boxplots?
a) plot()
b) bwplot()
c) xyplot()
d) barlm()

Answer: b
Clarification: The function bwplot() makes box-and-whisker plots for numerical variables.

8. Which of the following compute analysis of variance table for fitted model?
a) ecdf()
b) cum()
c) anova()
d) bwplot()

Answer: c
Clarification: anova() computes analysis of variance tables for fitted model objects. ecdf(), by contrast, builds an empirical cumulative distribution function.

9. Which of the following is used to find variance of all values?
a) var()
b) sd()
c) mean()
d) anova()

Answer: a
Clarification: sd() is used to calculate standard deviation.
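
The relationship between the two functions is easy to verify. This sketch uses a made-up vector v; both functions use the sample (n - 1) denominator.

```r
# Hypothetical sample: mean 5, squared deviations sum to 32
v <- c(2, 4, 4, 4, 5, 5, 7, 9)

variance <- var(v)  # 32 / (8 - 1)
std_dev  <- sd(v)   # square root of the variance

print(variance)
print(std_dev)
```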

10. The purpose of fisher.test() is _______ test for contingency table.
a) Chisq
b) Fisher
c) Prop
d) Stem

Answer: b
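Clarification: fisher.test() performs Fisher's exact test for count data in a contingency table. prop.test(), by contrast, is used for inference about proportions using a normal approximation.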
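
A minimal sketch of fisher.test() on a 2 x 2 table. The counts follow the classic "lady tasting tea" layout and are used here purely as illustrative data; the dimension names are hypothetical.

```r
# Hypothetical 2 x 2 table of counts
tea <- matrix(c(3, 1, 1, 3), nrow = 2,
              dimnames = list(guess = c("milk", "tea"),
                              truth = c("milk", "tea")))

res <- fisher.test(tea)

# Two-sided exact p-value
print(res$p.value)
```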

250+ TOP MCQs on R Programming Basics and Answers

R Programming Language Multiple Choice Questions on “Basics”.

1. The most convenient way to use R is at a graphics workstation running a ________ system.
a) windowing
b) running
c) interfacing
d) matrix

Answer: a
Clarification: Most classical statistics and much of the latest methodology is available for use with R.

2. Point out the wrong statement?
a) Setting up a workstation to take full advantage of the customizable features of R is a straightforward thing
b) q() is used to quit the R program
c) R has an inbuilt help facility similar to the man facility of UNIX
d) Windows versions of R have other optional help systems also

Answer: b
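Clarification: The help command is used to look up the details of a particular command in R. Note that q() does quit an R session, so statement b is accurate for current versions of R and this answer key is doubtful.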

3. Which of the following is default prompt for UNIX environment?
a) >
b) >>
c) <
d) <<

Answer: a
Clarification: When you use the R program it issues a prompt when it expects input commands.

4. Which of the following will start the R program?
a) $ R
b) > R
c) * R
d) @ R

Answer: a
Clarification: At this point R commands may be issued.

5. Point out the wrong statement?
a) Windows versions of R have other optional help system also
b) The help.search command (alternatively ??) allows searching for help in various ways
c) R is case insensitive as are most UNIX based packages, so A and a are different symbols and would refer to different variables
d) $ R is used to start the R program

Answer: c
Clarification: R is case sensitive, which is why A and a are different symbols and refer to different variables. R is an expression language with a very simple syntax.

6. Which of the following statements is an alternative to the following command?

 ?solve

a) help(solve)
b) print(solve)
c) bind(solve)
d) matrix(solve)

Answer: a
Clarification: help is used to get more information on any specific named function.

7. Elementary commands in R consist of either _______ or assignments.
a) utilstats
b) language
c) expressions
d) packages

Answer: c
Clarification: If an expression is given as a command, it is evaluated, printed (unless specifically made invisible), and the value is lost.

8. If a command is not complete at the end of a line, R will give a different prompt, by default it is ____________
a) *
b) –
c) +
d) /

Answer: c
Clarification: The continuation prompt is + by default. Comments can be put almost anywhere: starting with a hashmark (‘#’), everything to the end of the line is a comment.

9. Command lines entered at the console are limited to about ________ bytes.
a) 3000
b) 4095
c) 5000
d) 6000

Answer: b
Clarification: The limit is counted in bytes, not characters. Elementary commands can be grouped together into one compound expression by braces (‘{’ and ‘}’).

10. The _____ text editor provides more general support mechanisms via ESS for working interactively with R.
a) EAC
b) Emacs
c) Shell
d) ECAP

Answer: b
Clarification: The recall and editing capabilities under UNIX are highly customizable.

250+ TOP MCQs on Textual Data Formats and Answers

R Programming Language Multiple Choice Questions on “Textual Data Formats”.

1. Which of the following is used for reading in saved workspaces?
a) unserialize
b) load
c) get
d) set

Answer: b
Clarification: load is used for reading in saved workspaces. unserialize is used for reading single R objects in binary form, and get (or mget, for zero or more objects) looks up an object by name.

2. Point out the correct statement?
a) write.table is used for writing tabular data to text files (e.g. CSV) or connections
b) writeLines is used for writing character data line-by-line to a file or connection
c) dump is used for dumping a textual representation of multiple R objects
d) all of the mentioned

Answer: d
Clarification: These are the writing counterparts of the functions used for reading data: write.table for read.table, writeLines for readLines, and dump for source.

3. ________ is used for outputting a textual representation of an R object.
a) dput
b) dump
c) dget
d) dset

Answer: a
Clarification: dump is used for dumping a textual representation of multiple R objects.

4. Which of the following argument denotes if the file has a header line?
a) header
b) sep
c) file
d) footer

Answer: a
Clarification: sep is a string indicating how the columns are separated.

5. Point out the correct statement?
a) unserialize is used for converting an R object into a binary format for outputting to a connection
b) save is used for saving an arbitrary number of R objects in binary format to a file
c) The read.data() function is one of the most commonly used functions for reading data
d) save is not used for saving an arbitrary

Answer: b
Clarification: read.table reads a file in table format and creates a data frame from it.

6. Which of the following statement would read file “foo.txt”?
a) data <- read.table("foo.txt")
b) read.data <- read.table("foo.txt")
c) data <- read.data("foo.txt")
d) data <- data("foo.txt")

Answer: a
Clarification: R will automatically skip lines that begin with a #.

7. Which of the following functions is identical to read.table?
a) read.csv
b) read.data
c) read.tab
d) read.del

Answer: a
Clarification: The read.csv() function is identical to read.table except that some of the defaults are set differently (like the sep argument).
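
The difference in defaults can be seen by reading the same text both ways. This is a sketch with made-up inline data; the text argument is used in place of a file so the example is self-contained.

```r
# Hypothetical comma-separated data supplied inline
csv_text <- "foo,bar\n1,TRUE\n2,FALSE\n"

# read.csv() assumes a header line and a comma separator
from_csv <- read.csv(text = csv_text)

# read.table() needs both stated explicitly to give the same result
from_table <- read.table(text = csv_text, header = TRUE, sep = ",")

print(from_csv)
print(all.equal(from_csv, from_table))
```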

8. Which of the following code would read 100 rows?
a) initial <- read.table("datatable.txt", nrows = 100)
b) tabAll <- read.table("datatable.txt", colClasses = classes)
c) initial <- read.table("datatable.txt", nrows = 99)
d) initial <- read.table("datatable.txt", nrows = 101)

Answer: a
Clarification: You can use the Unix tool wc to calculate the number of lines in a file.
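
The options in this question hint at a common idiom: read the first 100 rows, record the column classes, then reuse them for the full read. The sketch below assumes a temporary file standing in for "datatable.txt" and a made-up two-column data set.

```r
# Temporary file standing in for "datatable.txt" (hypothetical data)
path <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:500, score = seq(0.5, 250, by = 0.5)),
            path, row.names = FALSE)

# Read only the first 100 rows and record each column's class
initial <- read.table(path, header = TRUE, nrows = 100)
classes <- sapply(initial, class)
print(classes)

# Reuse the classes so read.table() does not have to guess them
tabAll <- read.table(path, header = TRUE, colClasses = classes)
print(dim(tabAll))

unlink(path)
```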

9. What will be the output of the following R code?

> y <- data.frame(a = 1, b = "a")
> dput(y)

a)

structure(list(a = 1, b = list(1L, .Label = "a", class = "factor")),
.Names = c("a", "b"), row.names = c(NA, -1L), class = "data.frame")

b)

list(list(a = 1, b = list(1L, .Label = "a", class = "factor")),
.Names = c("a", "b"), row.names = c(NA, -1L), class = "data.frame")

c)

structure(list(a = 1, b = structure(1L, .Label = "a", class = "factor")),
.Names = c("a", "b"), row.names = c(NA, -1L), class = "data.frame")

d) Error

Answer: c
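Clarification: dput() output is in the form of R code, and it preserves metadata such as the class of the object, the row names, and the column names. The output shown corresponds to R versions before 4.0.0, where data.frame() converted strings to factors by default; in R 4.0.0 and later, b remains a character vector unless stringsAsFactors = TRUE is set.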

10. What will be the output of the following R code?

> y <- data.frame(a = 1, b = "a")
> dput(y, file = "y.R")
> new.y <- dget("y.R")
> new.y

a)

b)

c)

d)

Answer: a
Clarification: Multiple objects can be deparsed at once using the dump function and read back in using source.
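
Both round trips can be sketched as follows, using temporary files in place of "y.R". The extra object x and the file names are illustrative assumptions; source() is called with local = TRUE so the objects are restored in the calling environment.

```r
# Single object: dput() writes R code, dget() reads it back
y <- data.frame(a = 1, b = "a")
f <- tempfile(fileext = ".R")
dput(y, file = f)
new.y <- dget(f)
print(new.y)

# Several objects: dump() takes their names, source() restores them
x <- "foo"
g <- tempfile(fileext = ".R")
dump(c("x", "y"), file = g)
rm(x, y)
source(g, local = TRUE)
print(x)
print(y)

unlink(c(f, g))
```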