730+ Machine Learning (ML) Solved MCQs

Machine learning is a subset of artificial intelligence that involves the use of algorithms and statistical models to enable a system to improve its performance on a specific task over time. In other words, machine learning algorithms are designed to allow a computer to learn from data, without being explicitly programmed.

These multiple-choice questions (MCQs) are designed to enhance your knowledge and understanding of Computer Science Engineering (CSE).

651.	SVM is a ______ algorithm.
A.	Classification
B.	Clustering
C.	Regression
D.	All
Answer» A. Classification

652.	SVM is a ______ learning method.
A.	Supervised
B.	Unsupervised
C.	Both
D.	None
Answer» A. Supervised

653.	The linear SVM classifier works by drawing a straight line between two classes
A.	True
B.	False
Answer» A. True

654.	What is Model Selection in Machine Learning?
A.	The process of selecting models among different mathematical models, which are used to describe the same data set
B.	When a statistical model describes random error or noise instead of the underlying relationship
C.	Find interesting directions in data and find novel observations/ database cleaning
D.	All above
Answer» A. The process of selecting models among different mathematical models, which are used to describe the same data set

655.	Which are two techniques of Machine Learning?
A.	Genetic Programming and Inductive Learning
B.	Speech recognition and Regression
C.	Both A & B
D.	None of the Mentioned
Answer» A. Genetic Programming and Inductive Learning

656.	Even if there are no actual supervisors, ________ learning is also based on feedback provided by the environment.
A.	Supervised
B.	Reinforcement
C.	Unsupervised
D.	None of the above
Answer» B. Reinforcement

657.	It is necessary to allow the model to develop a generalization ability and avoid a common problem called ______.
A.	Overfitting
B.	Overlearning
C.	Classification
D.	Regression
Answer» A. Overfitting

658.	Techniques that involve the use of both labeled and unlabeled data are called ___.
A.	Supervised
B.	Semi-supervised
C.	Unsupervised
D.	None of the above
Answer» B. Semi-supervised

659.	A supervised scenario is characterized by the concept of a _____.
A.	Programmer
B.	Teacher
C.	Author
D.	Farmer
Answer» B. Teacher

660.	Overlearning is caused by an excessive ______.
A.	Capacity
B.	Regression
C.	Reinforcement
D.	Accuracy
Answer» A. Capacity

661.	Which of the following are models for feature extraction?
A.	regression
B.	classification
C.	None of the above
Answer» C. None of the above

662.	_____ provides some built-in datasets that can be used for testing purposes.
A.	scikit-learn
B.	classification
C.	regression
D.	None of the above
Answer» A. scikit-learn

663.	While using _____ all labels are turned into sequential numbers.
A.	LabelEncoder class
B.	LabelBinarizer class
C.	DictVectorizer
D.	FeatureHasher
Answer» A. LabelEncoder class
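
As an illustration of the answer above, here is a minimal sketch of how scikit-learn's `LabelEncoder` turns string labels into sequential integers. The colour labels are made-up sample data, not part of the question.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample labels, chosen only for illustration.
labels = ["red", "green", "blue", "green"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

# classes_ holds the sorted unique labels; each label is replaced
# by its index in that sorted array.
print(list(encoder.classes_))
print(encoded.tolist())

# inverse_transform maps the integers back to the original labels.
print(encoder.inverse_transform(encoded).tolist())
```

Because the integers follow the sorted order of the labels, they imply an ordering that the categories may not really have, which is one reason one-hot style encoders are often preferred for input features.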

664.	_______ produce sparse matrices of real numbers that can be fed into any machine learning model.
A.	DictVectorizer
B.	FeatureHasher
C.	Both A & B
D.	None of the Mentioned
Answer» C. Both A & B

665.	scikit-learn offers the class ______, which is responsible for filling missing values using a strategy based on the mean, median, or frequency.
A.	LabelEncoder
B.	LabelBinarizer
C.	DictVectorizer
D.	Imputer
Answer» D. Imputer
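
The following sketch shows mean-based filling of missing values. It assumes a recent scikit-learn release, where this functionality lives in `sklearn.impute.SimpleImputer`; the `Imputer` class named in the question belonged to `sklearn.preprocessing` in older releases and has since been removed. The small array is made-up sample data.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical data with two missing entries (np.nan).
X = np.array([
    [1.0, 2.0],
    [np.nan, 4.0],
    [5.0, np.nan],
])

# strategy may be "mean", "median", "most_frequent" or "constant".
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)

# Each NaN is replaced by the mean of the observed values in its column.
print(X_filled)
```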

666.	Which of the following scale data by removing elements that don't belong to a given range or by considering a maximum absolute value?
A.	MinMaxScaler
B.	MaxAbsScaler
C.	Both A & B
D.	None of the Mentioned
Answer» C. Both A & B
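
A small sketch of the two scalers may help here. It uses a made-up single-feature column and assumes the default settings of both classes, apart from the explicit `feature_range`.

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler, MinMaxScaler

# Hypothetical single-feature column.
X = np.array([[-2.0], [0.0], [4.0]])

# MinMaxScaler maps each feature linearly onto feature_range:
# (x - min) / (max - min) for the range (0, 1).
minmax = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)

# MaxAbsScaler divides each feature by its maximum absolute value,
# so the sign and any zeros are preserved.
maxabs = MaxAbsScaler().fit_transform(X)

print(minmax.ravel())
print(maxabs.ravel())
```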

667.	scikit-learn also provides a class for per-sample normalization, _____.
A.	Normalizer
B.	Imputer
C.	Classifier
D.	All above
Answer» A. Normalizer
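
To make "per-sample" concrete, the sketch below applies `Normalizer` with the L2 norm to two made-up rows. Each row is rescaled independently of the others, unlike the per-feature scalers.

```python
import numpy as np
from sklearn.preprocessing import Normalizer

# Hypothetical samples (one per row).
X = np.array([
    [3.0, 4.0],
    [1.0, 0.0],
])

# norm may be "l1", "l2" or "max"; each row is divided by its own norm.
X_unit = Normalizer(norm="l2").fit_transform(X)

print(X_unit)
print(np.linalg.norm(X_unit, axis=1))  # every row now has unit L2 length
```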

668.	A(n) ______ dataset with many features contains information proportional to the independence of all features and their variance.
A.	normalized
B.	unnormalized
C.	Both A & B
D.	None of the Mentioned
Answer» B. unnormalized

669.	In order to assess how much information is brought by each component, and the correlation among them, a useful tool is the _____.
A.	Concurrent matrix
B.	Convergence matrix
C.	Supportive matrix
D.	Covariance matrix
Answer» D. Covariance matrix
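
As a rough illustration, the covariance matrix of a small made-up dataset can be computed with NumPy. The diagonal holds each feature's variance and the off-diagonal entries show how the features vary together; the data is chosen so that the second feature is exactly twice the first.

```python
import numpy as np

# Hypothetical dataset: rows are samples, columns are features.
X = np.array([
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
])

# rowvar=False tells np.cov that each column is a variable.
cov = np.cov(X, rowvar=False)
print(cov)

# The correlation coefficient follows from the covariance entries.
corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
print(corr)
```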

670.	The _____ parameter can assume different values which determine how the data matrix is initially processed.
A.	run
B.	start
C.	init
D.	stop
Answer» C. init

671.	______ allows exploiting the natural sparsity of data while extracting principal components.
A.	SparsePCA
B.	KernelPCA
C.	SVD
D.	init parameter
Answer» A. SparsePCA

672.	Which of the following statements is true about outliers in linear regression?
A.	Linear regression is sensitive to outliers
B.	Linear regression is not sensitive to outliers
C.	Can’t say
D.	None of these
Answer» A. Linear regression is sensitive to outliers
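
The sensitivity can be seen in a tiny worked example. The sketch below fits an ordinary least-squares slope in plain Python, first on made-up points lying exactly on a line and then with a single point moved far away. The helper name `ols_slope` is hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


xs = [0, 1, 2, 3, 4]
clean = [0, 1, 2, 3, 4]          # points exactly on y = x
with_outlier = [0, 1, 2, 3, 14]  # last point replaced by an outlier

slope_clean = ols_slope(xs, clean)
slope_outlier = ols_slope(xs, with_outlier)

print(slope_clean)    # 1.0
print(slope_outlier)  # 3.0
```

Because the squared-error loss penalizes large residuals heavily, one distant point is enough to triple the fitted slope here.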

673.	Suppose you plotted a scatter plot between the residuals and predicted values in linear regression and you found that there is a relationship between them. Which of the following conclusions do you draw about this situation?
A.	Since there is a relationship, our model is not good
B.	Since there is a relationship, our model is good
C.	Can’t say
D.	None of these
Answer» A. Since there is a relationship, our model is not good

674.	Let’s say a “Linear regression” model perfectly fits the training data (train error is zero). Which of the following statements is true?
A.	You will always have test error zero
B.	You cannot have test error zero
C.	None of the above
Answer» C. None of the above

675.	In a linear regression problem, we are using “R-squared” to measure goodness-of-fit. We add a feature to the linear regression model and retrain the same model. Which of the following options is true?
A.	If R Squared increases, this variable is significant.
B.	If R Squared decreases, this variable is not significant.
C.	Individually R squared cannot tell about variable importance. We can’t say anything about it right now.
D.	None of these.
Answer» C. Individually R squared cannot tell about variable importance. We can’t say anything about it right now.

676.	To test the linear relationship between continuous variables y (dependent) and x (independent), which of the following plots is best suited?
A.	Scatter plot
B.	Barchart
C.	Histograms
D.	None of these
Answer» A. Scatter plot

677.	Which of the following steps / assumptions in regression modeling impacts the trade-off between under-fitting and over-fitting the most?
A.	The polynomial degree
B.	Whether we learn the weights by matrix inversion or gradient descent
C.	The use of a constant-term
Answer» A. The polynomial degree
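
A short sketch of why the polynomial degree matters: the code below fits polynomials of increasing degree to a small noisy sample and reports training and test error. The sine-shaped target, the noise level, and the helper name `mse` are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: noisy samples of one period of a sine wave.
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=10)
x_test = np.linspace(0.05, 0.95, 10)
y_test = np.sin(2 * np.pi * x_test) + rng.normal(scale=0.2, size=10)


def mse(degree, x, y):
    """Fit a polynomial of the given degree on the training set and
    return its mean squared error on (x, y)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


for degree in (1, 3, 9):
    print(degree, mse(degree, x_train, y_train), mse(degree, x_test, y_test))
```

Training error can only go down as the degree grows, because each higher-degree family contains the lower-degree ones. A low degree tends to under-fit, while a degree close to the number of points can chase the noise, so test error is the quantity to watch.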

678.	Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature selection?
A.	Ridge regression uses subset selection of features
B.	Lasso regression uses subset selection of features
C.	Both use subset selection of features
D.	None of the above
Answer» B. Lasso regression uses subset selection of features
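
The difference can be sketched with scikit-learn's `Lasso` and `Ridge`. The synthetic data below is an assumption for illustration: only the first two of five features influence the target, and `alpha=0.5` is an arbitrary regularization strength.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)

# Hypothetical data: 5 features, but only the first two affect y.
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

# The L1 penalty can set coefficients exactly to zero, which removes
# those features from the model; the L2 penalty only shrinks them.
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))

print(lasso.coef_, n_zero_lasso)
print(ridge.coef_, n_zero_ridge)
```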

679.	Which of the following statement(s) can be true after adding a variable to a linear regression model?
1. R-squared and adjusted R-squared both increase
2. R-squared increases and adjusted R-squared decreases
3. R-squared decreases and adjusted R-squared decreases
4. R-squared decreases and adjusted R-squared increases
A.	1 and 2
B.	1 and 3
C.	2 and 4
D.	None of the above
Answer» A. 1 and 2
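
The reasoning can be checked numerically. The sketch below fits ordinary least squares with NumPy before and after adding a feature that is pure noise; the data and the helper name `r2_scores` are assumptions for illustration. Adjusted R-squared uses the usual formula 1 - (1 - R²)(n - 1)/(n - p - 1), with n samples and p features.

```python
import numpy as np


def r2_scores(X, y):
    """Return (R-squared, adjusted R-squared) of an OLS fit with intercept."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    ss_res = np.sum((y - A @ beta) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return r2, adj


rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
noise_feature = rng.normal(size=n)  # unrelated to y by construction
y = 2.0 * x1 + rng.normal(size=n)

r2_before, adj_before = r2_scores(x1.reshape(-1, 1), y)
r2_after, adj_after = r2_scores(np.column_stack([x1, noise_feature]), y)

print(r2_before, adj_before)
print(r2_after, adj_after)
```

R-squared cannot decrease when a feature is added to a least-squares fit, which rules out statements 3 and 4. Adjusted R-squared can move in either direction because it charges a penalty for each extra feature.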

680.	What is/are true about the kernel in SVM?
1. The kernel function maps low-dimensional data to a high-dimensional space
2. It’s a similarity function
A.	1
B.	2
C.	1 and 2
D.	None of these
Answer» C. 1 and 2
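
Both statements can be seen in a small worked example. For 2-D inputs, the homogeneous degree-2 polynomial kernel k(x, z) = (x · z)² equals an ordinary dot product after mapping each point to 3-D with φ(x) = (x₁², √2·x₁x₂, x₂²). The function names below are hypothetical.

```python
import math


def poly2_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2 for 2-D inputs."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def phi(x):
    """Explicit feature map into 3-D that corresponds to poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


x = (1.0, 2.0)
z = (3.0, 4.0)

k = poly2_kernel(x, z)
explicit = sum(a * b for a, b in zip(phi(x), phi(z)))

print(k)         # 121.0
print(explicit)  # equal to k up to floating-point rounding
```

The kernel returns the similarity (dot product) in the higher-dimensional space without ever building φ(x), which is what makes non-linear SVMs practical.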

681.	Suppose you are building an SVM model on data X. The data X can be error-prone, which means that you should not trust any specific data point too much. Suppose you want to build an SVM model with a quadratic kernel (polynomial of degree 2) that uses the slack penalty C as one of its hyperparameters. What would happen when you use a very small C (C ~ 0)?
A.	Misclassification would happen
B.	Data will be correctly classified
C.	Can’t say
D.	None of these
Answer» A. Misclassification would happen

682.	The cost parameter in the SVM means:
A.	The number of cross-validations to be made
B.	The kernel to be used
C.	The tradeoff between misclassification and simplicity of the model
D.	None of the above
Answer» C. The tradeoff between misclassification and simplicity of the model

683.	How do you handle missing or corrupted data in a dataset?
A.	Drop missing rows or columns
B.	Replace missing values with mean/median/mode
C.	Assign a unique category to missing values
D.	All of the above
Answer» D. All of the above

684.	Which of the following statements about Naive Bayes is incorrect?
A.	Attributes are equally important.
B.	Attributes are statistically dependent on one another given the class value.
C.	Attributes are statistically independent of one another given the class value.
D.	Attributes can be nominal or numeric.
Answer» B. Attributes are statistically dependent on one another given the class value.

685.	SVMs are less effective when:
A.	The data is linearly separable
B.	The data is clean and ready to use
C.	The data is noisy and contains overlapping points
Answer» C. The data is noisy and contains overlapping points