730+ Machine Learning (ML) Solved MCQs

Machine learning is a subset of artificial intelligence that involves the use of algorithms and statistical models to enable a system to improve its performance on a specific task over time. In other words, machine learning algorithms are designed to allow a computer to learn from data, without being explicitly programmed.

These multiple-choice questions (MCQs) are designed to enhance your knowledge and understanding in the following area: Computer Science Engineering (CSE).

601.	Simple regression assumes a __________ relationship between the input attribute and output attribute.
A.	linear
B.	quadratic
C.	reciprocal
D.	inverse
Answer» A. linear

602.	Regression trees are often used to model _______ data.
A.	linear
B.	nonlinear
C.	categorical
D.	symmetrical
Answer» B. nonlinear

603.	The leaf nodes of a model tree are
A.	averages of numeric output attribute values.
B.	nonlinear regression equations.
C.	linear regression equations.
D.	sums of numeric output attribute values.
Answer» C. linear regression equations.

604.	Logistic regression is a ________ regression technique that is used to model data having a _____outcome.
A.	linear, numeric
B.	linear, binary
C.	nonlinear, numeric
D.	nonlinear, binary
Answer» D. nonlinear, binary

605.	This technique associates a conditional probability value with each data instance.
A.	linear regression
B.	logistic regression
C.	simple regression
D.	multiple linear regression
Answer» B. logistic regression
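
As a rough illustration of the conditional probability in question 605, the sketch below passes a linear score through the logistic (sigmoid) function to estimate P(y = 1 | x) for one instance. The helper names, weights and bias are made-up values for illustration, not a fitted model.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real-valued score to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Estimate P(y = 1 | x) for a single data instance."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)

# Hypothetical coefficients, chosen only for illustration.
weights = [0.8, -0.5]
bias = 0.1

p = predict_proba([2.0, 1.0], weights, bias)
print(f"P(y=1 | x) = {p:.3f}")
```

Thresholding this probability (commonly at 0.5) turns it into a binary class prediction.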

606.	This supervised learning technique can process both numeric and categorical input attributes.
A.	linear regression
B.	bayes classifier
C.	logistic regression
D.	backpropagation learning
Answer» B. bayes classifier

607.	With Bayes classifier, missing data items are
A.	treated as equal compares.
B.	treated as unequal compares.
C.	replaced with a default value.
D.	ignored.
Answer» D. ignored.

608.	This clustering algorithm merges and splits nodes to help modify nonoptimal partitions.
A.	agglomerative clustering
B.	expectation maximization
C.	conceptual clustering
D.	k-means clustering
Answer» C. conceptual clustering

609.	This clustering algorithm initially assumes that each data instance represents a single cluster.
A.	agglomerative clustering
B.	conceptual clustering
C.	k-means clustering
D.	expectation maximization
Answer» A. agglomerative clustering

610.	This unsupervised clustering algorithm terminates when mean values computed for the current iteration of the algorithm are identical to the computed mean values for the previous iteration.
A.	agglomerative clustering
B.	conceptual clustering
C.	k-means clustering
D.	expectation maximization
Answer» C. k-means clustering
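
The termination rule in question 610 can be sketched in a few lines. This is a minimal one-dimensional k-means, assuming the caller supplies the initial centers; the function name `kmeans_1d` and the sample data are illustrative only.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Run k-means on 1-D data until the cluster means stop changing."""
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: recompute each mean (an empty cluster keeps its old center).
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        # Termination test: means identical to those of the previous iteration.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, centers=[1.0, 2.0])
print(centers)   # [2.0, 11.0]
print(clusters)  # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

Practical implementations usually stop when the means change by less than a small tolerance rather than requiring exact equality.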

611.	Machine learning techniques differ from statistical techniques in that machine learning methods
A.	typically assume an underlying distribution for the data.
B.	are better able to deal with missing and noisy data.
C.	are not able to explain their behavior.
D.	have trouble with large-sized datasets.
Answer» B. are better able to deal with missing and noisy data.

612.	In reinforcement learning, if the feedback is negative, it is defined as a ____.
A.	Penalty
B.	Overlearning
C.	Reward
D.	None of above
Answer» A. Penalty

613.	According to ____, it’s a key success factor for the survival and evolution of all species.
A.	Claude Shannon’s theory
B.	Gini Index
C.	Darwin’s theory
D.	None of above
Answer» C. Darwin’s theory

614.	What is ‘Training set’?
A.	Training set is used to test the accuracy of the hypotheses generated by the learner.
B.	A set of data used to discover the potentially predictive relationship.
C.	Both A & B
D.	None of above
Answer» B. A set of data used to discover the potentially predictive relationship.

615.	Common deep learning applications include____
A.	Image classification, Real-time visual tracking
B.	Autonomous car driving, Logistic optimization
C.	Bioinformatics, Speech recognition
D.	All above
Answer» D. All above

616.	Reinforcement learning is particularly efficient when______________.
A.	the environment is not completely deterministic
B.	it's often very dynamic
C.	it's impossible to have a precise error measure
D.	All above
Answer» D. All above

617.	If there is only a discrete number of possible outcomes (called categories), the process becomes a ______.
A.	Regression
B.	Classification.
C.	Modelfree
D.	Categories
Answer» B. Classification.

618.	Which of the following are supervised learning applications?
A.	Spam detection, Pattern detection, Natural Language Processing
B.	Image classification, Real-time visual tracking
C.	Autonomous car driving, Logistic optimization
D.	Bioinformatics, Speech recognition
Answer» A. Spam detection, Pattern detection, Natural Language Processing

619.	During the last few years, many ______ algorithms have been applied to deep neural networks to learn the best policy for playing Atari video games and to teach an agent how to associate the right action with an input representing the state.
A.	Logical
B.	Classical
C.	Classification
D.	None of above
Answer» D. None of above

620.	What is ‘Overfitting’ in Machine learning?
A.	when a statistical model describes random error or noise instead of underlying relationship ‘overfitting’ occurs.
B.	Robots are programmed so that they can perform the task based on data they gather from sensors.
C.	While involving the process of learning ‘overfitting’ occurs.
D.	a set of data is used to discover the potentially predictive relationship
Answer» A. when a statistical model describes random error or noise instead of underlying relationship ‘overfitting’ occurs.

621.	What is ‘Test set’?
A.	Test set is used to test the accuracy of the hypotheses generated by the learner.
B.	It is a set of data used to discover the potentially predictive relationship.
C.	Both A & B
D.	None of above
Answer» A. Test set is used to test the accuracy of the hypotheses generated by the learner.
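
To make the roles of the two sets concrete, here is a small sketch in which a hypothesis (a single decision threshold) is learned from a training set and its accuracy is then measured on a separate test set. The learner, the function names and the data are all invented for illustration.

```python
def learn_threshold(train):
    """Use the midpoint between the two class means as the decision threshold."""
    zeros = [x for x, y in train if y == 0]
    ones = [x for x, y in train if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def accuracy(threshold, data):
    """Fraction of instances whose predicted class matches the label."""
    correct = sum((x > threshold) == bool(y) for x, y in data)
    return correct / len(data)

# (feature value, class label) pairs; illustrative data only.
train = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
test = [(1.5, 0), (3.0, 0), (7.0, 1), (4.0, 1)]

threshold = learn_threshold(train)
print("train accuracy:", accuracy(threshold, train))  # 1.0
print("test accuracy:", accuracy(threshold, test))    # 0.75
```

The gap between the two scores is why accuracy is reported on data the learner has not seen.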

622.	________ is much more difficult because it's necessary to determine a supervised strategy to train a model for each feature and, finally, to predict their value.
A.	Removing the whole line
B.	Creating sub-model to predict those features
C.	Using an automatic strategy to input them according to the other known values
D.	All above
Answer» B. Creating sub-model to predict those features

623.	It's possible to use a different placeholder through the parameter _______.
A.	regression
B.	classification
C.	random_state
D.	missing_values
Answer» D. missing_values
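
The idea behind a configurable placeholder can be sketched without any library. The function `impute_mean` below is a hypothetical plain-Python stand-in that only borrows the parameter name `missing_values`; it is not scikit-learn's implementation.

```python
import math

def impute_mean(column, missing_values=None):
    """Replace every occurrence of the placeholder with the mean of the known values."""
    def is_missing(v):
        # NaN never equals itself, so it needs a dedicated check.
        if isinstance(missing_values, float) and math.isnan(missing_values):
            return isinstance(v, float) and math.isnan(v)
        return v == missing_values

    known = [v for v in column if not is_missing(v)]
    mean = sum(known) / len(known)
    return [mean if is_missing(v) else v for v in column]

# Two different placeholders for "missing": -1 and NaN.
print(impute_mean([1.0, -1, 3.0], missing_values=-1))
print(impute_mean([2.0, float("nan"), 4.0], missing_values=float("nan")))
```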

624.	If you need a more powerful scaling feature, with a superior control on outliers and the possibility to select a quantile range, there's also the class________.
A.	RobustScaler
B.	DictVectorizer
C.	LabelBinarizer
D.	FeatureHasher
Answer» A. RobustScaler
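
A plain-Python sketch of the idea behind robust scaling: center on the median and divide by the spread between two quantiles, so a single outlier does not dominate the scale. The helpers `percentile` and `robust_scale` are illustrative names, and the linear interpolation used here is an assumption, not a copy of the library's code.

```python
def percentile(sorted_vals, q):
    """Linearly interpolated percentile of pre-sorted values, q in [0, 100]."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def robust_scale(values, quantile_range=(25.0, 75.0)):
    """Subtract the median and divide by the chosen inter-quantile range."""
    ordered = sorted(values)
    median = percentile(ordered, 50.0)
    spread = percentile(ordered, quantile_range[1]) - percentile(ordered, quantile_range[0])
    return [(v - median) / spread for v in values]

data = [1.0, 2.0, 3.0, 4.0, 100.0]  # 100.0 is an outlier
scaled = robust_scale(data)
print(scaled)  # [-1.0, -0.5, 0.0, 0.5, 48.5]
```

The outlier stays far from the rest, but the four ordinary values keep a sensible scale, which mean/standard-deviation scaling would not guarantee.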

625.	scikit-learn also provides a class for per-sample normalization, Normalizer. It can apply________to each element of a dataset
A.	max, l0 and l1 norms
B.	max, l1 and l2 norms
C.	max, l2 and l3 norms
D.	max, l3 and l4 norms
Answer» B. max, l1 and l2 norms
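
The three norms differ only in the value each sample is divided by. A minimal sketch for a single sample follows; `normalize` is an illustrative helper, not the library class.

```python
import math

def normalize(sample, norm="l2"):
    """Scale one sample so that its chosen norm equals 1."""
    if norm == "l1":
        scale = sum(abs(v) for v in sample)
    elif norm == "l2":
        scale = math.sqrt(sum(v * v for v in sample))
    elif norm == "max":
        scale = max(abs(v) for v in sample)
    else:
        raise ValueError(f"unknown norm: {norm}")
    # An all-zero sample is returned unchanged to avoid dividing by zero.
    return [v / scale for v in sample] if scale else list(sample)

sample = [3.0, 4.0]
for name in ("max", "l1", "l2"):
    print(name, normalize(sample, name))
```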

626.	There are also many univariate methods that can be used in order to select the best features according to specific criteria based on________.
A.	F-tests and p-values
B.	chi-square
C.	ANOVA
D.	All above
Answer» A. F-tests and p-values

627.	________performs a PCA with non-linearly separable data sets.
A.	SparsePCA
B.	KernelPCA
C.	SVD
D.	None of the Mentioned
Answer» B. KernelPCA

628.	A feature F1 can take certain value: A, B, C, D, E, & F and represents grade of students from a college. Which of the following statement is true in following case?
A.	Feature F1 is an example of nominal variable.
B.	Feature F1 is an example of ordinal variable.
C.	It doesn’t belong to any of the above category.
D.	Both of these
Answer» B. Feature F1 is an example of ordinal variable.

629.	The parameter______ allows specifying the percentage of elements to put into the test/training set
A.	test_size
B.	train_size
C.	All above
D.	None of these
Answer» C. All above
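
What such a parameter controls can be sketched in plain Python. In scikit-learn's `train_test_split` the corresponding keyword arguments are `test_size` and `train_size`; the helper below is a hypothetical simplification that only handles a fractional `test_size` and uses its own rounding rule.

```python
import random

def split_train_test(items, test_size=0.25, seed=0):
    """Shuffle the items and hold out a fraction of them as the test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]

train, test = split_train_test(range(10), test_size=0.3)
print(len(train), len(test))  # 7 3
```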

630.	In many classification problems, the target ______ is made up of categorical labels which cannot immediately be processed by any algorithm.
A.	random_state
B.	dataset
C.	test_size
D.	All above
Answer» B. dataset

631.	_______adopts a dictionary-oriented approach, associating to each category label a progressive integer number.
A.	LabelEncoder class
B.	LabelBinarizer class
C.	DictVectorizer
D.	FeatureHasher
Answer» A. LabelEncoder class
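
The dictionary-oriented approach amounts to building a label-to-integer mapping. A minimal sketch, assuming labels are numbered in sorted order; `fit_label_encoder` is an illustrative name.

```python
def fit_label_encoder(labels):
    """Map each distinct label to a progressive integer, in sorted order."""
    return {label: index for index, label in enumerate(sorted(set(labels)))}

labels = ["cat", "dog", "cat", "bird"]
mapping = fit_label_encoder(labels)
encoded = [mapping[label] for label in labels]

# The inverse mapping recovers the original labels from the integers.
inverse = {index: label for label, index in mapping.items()}
decoded = [inverse[i] for i in encoded]

print(mapping)  # {'bird': 0, 'cat': 1, 'dog': 2}
print(encoded)  # [1, 2, 1, 0]
```

The integers imply an ordering that the categories may not have, which is why one-hot style encodings are often preferred for nominal data.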

632.	Function used for linear regression in R is __________
A.	lm(formula, data)
B.	lr(formula, data)
C.	lrm(formula, data)
D.	regression.linear(formula, data)
Answer» A. lm(formula, data)

633.	In syntax of linear model lm(formula,data,..), data refers to ______
A.	Matrix
B.	Vector
C.	Array
D.	List
Answer» D. List

634.	Which of the following methods do we use to find the best fit line for data in Linear Regression?
A.	Least Square Error
B.	Maximum Likelihood
C.	Logarithmic Loss
D.	Both A and B
Answer» A. Least Square Error
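
For a single input attribute, the least-squares line has a closed form: the slope is the covariance of x and y divided by the variance of x. The sketch below applies it to made-up points that lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # illustrative data on the line y = 2x + 1
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```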

635.	Which of the following evaluation metrics can be used to evaluate a model while modeling a continuous output variable?
A.	AUC-ROC
B.	Accuracy
C.	Logloss
D.	Mean-Squared-Error
Answer» D. Mean-Squared-Error
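
Mean squared error is simply the average of the squared differences between the true and predicted values. A short sketch with made-up numbers:

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Squared errors are 0.25, 0.0 and 1.0, so the mean is 1.25 / 3.
mse = mean_squared_error([3.0, 5.0, 7.0], [2.5, 5.0, 8.0])
print(round(mse, 4))  # 0.4167
```

The other options in the question (AUC-ROC, accuracy, log loss) score class predictions or class probabilities, so they do not apply to a continuous target.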