730+ Machine Learning (ML) Solved MCQs

Machine learning is a subset of artificial intelligence that involves the use of algorithms and statistical models to enable a system to improve its performance on a specific task over time. In other words, machine learning algorithms are designed to allow a computer to learn from data, without being explicitly programmed.

These multiple-choice questions (MCQs) are designed to enhance your knowledge and understanding in the following area: Computer Science Engineering (CSE).

251.	True or False: Ensemble learning can only be applied to supervised learning methods.
A.	true
B.	false
Answer» B. false

252.	True or False: Ensembles will yield bad results when there is significant diversity among the models. Note: All individual models have meaningful and good predictions.
A.	true
B.	false
Answer» B. false

253.	Which of the following is / are true about weak learners used in an ensemble model? 1. They have low variance and they don’t usually overfit 2. They have high bias, so they cannot solve hard learning problems 3. They have high variance and they don’t usually overfit
A.	1 and 2
B.	1 and 3
C.	2 and 3
D.	none of these
Answer» A. 1 and 2

254.	True or False: An ensemble of classifiers may or may not be more accurate than any of its individual models.
A.	true
B.	false
Answer» A. true

255.	If you use an ensemble of different base models, is it necessary to tune the hyperparameters of all base models to improve the ensemble performance?
A.	yes
B.	no
C.	can’t say
Answer» B. no

256.	Generally, an ensemble method works better if the individual base models have ____________? Note: Suppose each individual base model has accuracy greater than 50%.
A.	less correlation among predictions
B.	high correlation among predictions
C.	correlation does not have any impact on ensemble output
D.	none of the above
Answer» A. less correlation among predictions

257.	In an election, N candidates are competing against each other and people are voting for one of the candidates. Voters don’t communicate with each other while casting their votes. Which of the following ensemble methods works similarly to the election procedure described above? Hint: Persons are like base models of an ensemble method.
A.	bagging
B.	boosting
C.	a or b
D.	none of these
Answer» A. bagging

258.	Suppose there are 25 base classifiers. Each classifier has an error rate of e = 0.35. Suppose you are using averaging as the ensemble technique. What is the probability that the ensemble of these 25 classifiers will make a wrong prediction? Note: All classifiers are independent of each other.
A.	0.05
B.	0.06
C.	0.07
D.	0.09
Answer» B. 0.06
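
As a rough check on this figure, and assuming that "averaging" here amounts to a simple majority vote over independent classifiers, the ensemble is wrong only when at least 13 of the 25 classifiers are wrong. The sketch below sums that binomial tail; the function name is illustrative, not part of any library.

```python
from math import comb


def majority_vote_error(n_classifiers: int, error_rate: float) -> float:
    """Probability that a majority of independent classifiers are wrong."""
    # Smallest number of wrong votes that forms a majority (13 out of 25).
    min_wrong = n_classifiers // 2 + 1
    return sum(
        comb(n_classifiers, k)
        * error_rate ** k
        * (1 - error_rate) ** (n_classifiers - k)
        for k in range(min_wrong, n_classifiers + 1)
    )


p_wrong = majority_vote_error(25, 0.35)
print(f"{p_wrong:.2f}")
```

The result is roughly 0.06, well below the 0.35 error rate of any single classifier, but only because the classifiers are assumed to be independent.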

259.	In machine learning, an algorithm (or learning algorithm) is said to be unstable if a small change in the training data causes a large change in the learned classifier. True or False: Bagging of unstable classifiers is a good idea.
A.	true
B.	false
Answer» A. true

260.	Which of the following parameters can be tuned to find a good ensemble model in bagging-based algorithms? 1. Max number of samples 2. Max features 3. Bootstrapping of samples 4. Bootstrapping of features
A.	1 and 3
B.	2 and 3
C.	1 and 2
D.	all of above
Answer» D. all of above
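
To make the four knobs concrete, here is a minimal pure-Python sketch of how one base model's training subset could be drawn. The helper `draw_bag` is hypothetical; its parameter names are chosen to mirror the `max_samples`, `max_features`, `bootstrap` and `bootstrap_features` options of scikit-learn's `BaggingClassifier`, but the logic is a simplified assumption rather than that library's implementation.

```python
import random


def draw_bag(n_samples, n_features, max_samples=1.0, max_features=1.0,
             bootstrap=True, bootstrap_features=False, seed=0):
    """Pick the row and column indices one base model would be trained on."""
    rng = random.Random(seed)
    n_rows = max(1, int(max_samples * n_samples))    # 1. max number of samples
    n_cols = max(1, int(max_features * n_features))  # 2. max features

    def pick(population_size, k, with_replacement):
        if with_replacement:
            return [rng.randrange(population_size) for _ in range(k)]
        return rng.sample(range(population_size), k)

    rows = pick(n_samples, n_rows, bootstrap)            # 3. bootstrapping of samples
    cols = pick(n_features, n_cols, bootstrap_features)  # 4. bootstrapping of features
    return rows, cols


rows, cols = draw_bag(n_samples=10, n_features=6, max_samples=0.8, max_features=0.5)
print(len(rows), len(cols))
```

Each base model would call this with a different seed, so every model sees a different slice of rows and columns.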

261.	How is the model capacity affected by the dropout rate (where model capacity means the ability of a neural network to approximate complex functions)?
A.	model capacity increases with an increase in dropout rate
B.	model capacity decreases with an increase in dropout rate
C.	model capacity is not affected by an increase in dropout rate
D.	none of these
Answer» B. model capacity decreases with an increase in dropout rate

264.	Suppose, you have 2000 different models with their predictions and want to ensemble predictions of best x models. Now, which of the following can be a possible method to select the best x models for an ensemble?
A.	step wise forward selection
B.	step wise backward elimination
C.	both
D.	none of above
Answer» C. both

265.	Below are two ensemble models: 1. E1(M1, M2, M3) and 2. E2(M4, M5, M6), where Mx denotes an individual base model. Which ensemble are you more likely to choose given the following conditions? E1: individual model accuracies are high, but the models are of the same type (less diverse). E2: individual model accuracies are high, and the models are of different types (highly diverse).
A.	e1
B.	e2
C.	any of e1 and e2
D.	none of these
Answer» B. e2

269.	Which of the following is the difference between stacking and blending?
A.	stacking has less stable cv compared to blending
B.	in blending, you create out of fold prediction
C.	stacking is simpler than blending
D.	none of these
Answer» D. none of these

273.	Which of the following is true about weighted majority votes? 1. We want to give higher weights to better performing models 2. Inferior models can overrule the best model if collective weighted votes for inferior models is higher than best model 3. Voting is special case of weighted voting
A.	1 and 3
B.	2 and 3
C.	1 and 2
D.	1, 2 and 3
Answer» D. 1, 2 and 3
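
The three statements can be illustrated with a small sketch. The function `weighted_vote`, the labels and the weights below are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions, weights=None):
    """Return the label with the largest total weight."""
    if weights is None:
        # Equal weights: weighted voting reduces to plain majority voting.
        weights = [1.0] * len(predictions)
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


predictions = ["cat", "dog", "dog"]  # the first model is assumed to be the best one
weights = [0.4, 0.35, 0.35]          # the best model gets the highest weight

overruled = weighted_vote(predictions, weights)          # two weaker models: 0.70 vs 0.40
dominant = weighted_vote(predictions, [0.8, 0.1, 0.1])   # best model outweighs the rest
plain = weighted_vote(predictions)                       # unweighted majority vote
print(overruled, dominant, plain)
```

With weights 0.4, 0.35 and 0.35 the two weaker models jointly overrule the best one, whereas with weights 0.8, 0.1 and 0.1 the best model decides alone.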

262.	True or False: Dropout is a computationally expensive technique compared with bagging.
A.	true
B.	false
Answer» B. false

263.	Suppose you want to apply a stepwise forward selection method to choose the best models for an ensemble. Which of the following is the correct order of the steps? Note: You have predictions from more than 1000 models. 1. Add the models' predictions (in other words, take the average) one by one to the ensemble, keeping those that improve the metric on the validation set. 2. Start with an empty ensemble. 3. Return the ensemble from the nested set of ensembles that has maximum performance on the validation set.
A.	1-2-3
B.	1-3-4
C.	2-1-3
D.	none of above
Answer» C. 2-1-3
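
The three steps listed in the question can be sketched as a greedy loop. This is a minimal illustration under stated assumptions: predictions are real-valued, models are combined by plain averaging, the validation metric is mean squared error, and all names and toy numbers are hypothetical.

```python
def mse(pred, truth):
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)


def average(members):
    """Element-wise mean of several prediction lists."""
    return [sum(col) / len(col) for col in zip(*members)]


def forward_select(model_preds, truth):
    selected, best_err = [], float("inf")   # start with an empty ensemble
    remaining = list(model_preds)
    while remaining:
        # Try adding each remaining model and score the averaged ensemble.
        errs = {
            name: mse(average([model_preds[m] for m in selected + [name]]), truth)
            for name in remaining
        }
        best_name = min(errs, key=errs.get)
        if errs[best_name] >= best_err:
            break                            # no candidate improves the validation metric
        selected.append(best_name)
        best_err = errs[best_name]
        remaining.remove(best_name)
    return selected, best_err                # best ensemble found on the validation set


truth = [1.0, 2.0, 3.0, 4.0]                 # validation targets
model_preds = {
    "a": [1.5, 2.5, 3.5, 4.5],               # biased upwards by 0.5
    "b": [0.5, 1.5, 2.5, 3.5],               # biased downwards by 0.5
    "c": [2.0, 3.0, 4.0, 5.0],               # biased upwards by 1.0
}
selected, best_err = forward_select(model_preds, truth)
print(selected, best_err)
```

In this toy data the opposite biases of models a and b cancel when averaged, so the loop keeps both and rejects c.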

266.	True or False: In boosting, individual base learners can be parallel.
A.	true
B.	false
Answer» B. false

267.	Which of the following is true about bagging? 1. Bagging can be parallel 2. The aim of bagging is to reduce bias not variance 3. Bagging helps in reducing overfitting
A.	1 and 2
B.	2 and 3
C.	1 and 3
D.	all of these
Answer» C. 1 and 3

268.	Suppose you are using stacking with n different machine learning algorithms with k folds on data. Which of the following is true about one-level (m base models + 1 stacker) stacking? Note: We are working on a binary classification problem; all base models are trained on all features; you are using k folds for the base models.
A.	you will have only k features after the first stage
B.	you will have only m features after the first stage
C.	you will have k+m features after the first stage
D.	you will have k*n features after the first stage
Answer» B. you will have only m features after the first stage

270.	Which of the following can be one of the steps in stacking? 1. Divide the training data into k folds 2. Train k models on each k-1 folds and get the out of fold predictions for remaining one fold 3. Divide the test data set in “k” folds and get individual fold predictions by different algorithms
A.	1 and 2
B.	2 and 3
C.	1 and 3
D.	all of above
Answer» A. 1 and 2
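
Steps 1 and 2 can be sketched in plain Python. The two toy "learners" and the helper `out_of_fold_features` are hypothetical stand-ins for real base models; the point of the sketch is that the table handed to the second-stage model has one column per base model, whatever the number of folds.

```python
def fit_mean(X, y):
    """Toy learner: always predicts the mean training label."""
    mean_y = sum(y) / len(y)
    return lambda x: mean_y


def fit_threshold(X, y):
    """Toy learner: predicts 1 when the first feature exceeds its training mean."""
    cut = sum(row[0] for row in X) / len(X)
    return lambda x: 1.0 if x[0] > cut else 0.0


def out_of_fold_features(X, y, learners, k):
    n = len(X)
    meta = [[None] * len(learners) for _ in range(n)]
    folds = [list(range(i, n, k)) for i in range(k)]      # divide the training data into k folds
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(n) if i not in held]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        for j, fit in enumerate(learners):
            model = fit(X_tr, y_tr)                        # train on the other k-1 folds
            for i in held_out:
                meta[i][j] = model(X[i])                   # out-of-fold prediction
    return meta


X = [[0.1], [0.4], [0.6], [0.9], [0.2], [0.8]]
y = [0, 0, 1, 1, 0, 1]
meta = out_of_fold_features(X, y, [fit_mean, fit_threshold], k=3)
print(len(meta), len(meta[0]))
```

The second-stage model would then be trained on `meta` and `y`.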

271.	Which of the following are advantages of stacking? 1. More robust model 2. Better prediction 3. Lower time of execution
A.	1 and 2
B.	2 and 3
C.	1 and 3
D.	all of the above
Answer» A. 1 and 2

272.	Which of the following are correct statement(s) about stacking? 1. A machine learning model is trained on predictions of multiple machine learning models 2. A logistic regression will definitely work better in the second stage as compared to other classification methods 3. First stage models are trained on full / partial feature space of training data
A.	1 and 2
B.	2 and 3
C.	1 and 3
D.	all of above
Answer» C. 1 and 3

274.	Which of the following is true about averaging ensemble?
A.	it can only be used in classification problem
B.	it can only be used in regression problem
C.	it can be used in both classification as well as regression
D.	none of these
Answer» C. it can be used in both classification as well as regression

275.	How can we assign the weights to output of different models in an ensemble? 1. Use an algorithm to return the optimal weights 2. Choose the weights using cross validation 3. Give high weights to more accurate models
A.	1 and 2
B.	1 and 3
C.	2 and 3
D.	all of above
Answer» D. all of above

276.	Suppose you are given ‘n’ predictions on test data by ‘n’ different models (M1, M2, …, Mn) respectively. Which of the following method(s) can be used to combine the predictions of these models? Note: We are working on a regression problem. 1. Median 2. Product 3. Average 4. Weighted sum 5. Minimum and Maximum 6. Generalized mean rule
A.	1, 3 and 4
B.	1,3 and 6
C.	1,3, 4 and 6
D.	all of above
Answer» D. all of above
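
Each of the listed rules is simply a function that maps the n predictions for one test row to a single number. The sketch below uses made-up predictions and weights; `generalized_mean` implements the power mean as one common reading of the "generalized mean rule", which is an assumption about what the question intends.

```python
from math import prod
from statistics import mean, median


def generalized_mean(values, p):
    """Power mean of positive values; p = 1 gives the arithmetic mean."""
    return (sum(v ** p for v in values) / len(values)) ** (1 / p)


def weighted_sum(values, weights):
    return sum(v * w for v, w in zip(values, weights))


preds = [2.0, 4.0, 4.0, 10.0]    # hypothetical predictions for one test row
weights = [0.1, 0.4, 0.4, 0.1]   # hypothetical weights that sum to 1

combined = {
    "median": median(preds),
    "product": prod(preds),
    "average": mean(preds),
    "weighted_sum": weighted_sum(preds, weights),
    "minimum": min(preds),
    "maximum": max(preds),
    "generalized_mean_p2": generalized_mean(preds, 2),
}
for rule, value in combined.items():
    print(f"{rule}: {value:.3f}")
```

Which rule is sensible depends on the problem; the median and weighted sum, for example, are less sensitive to the outlying prediction of 10.0 than the maximum or the product.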

277.	In an election, N candidates are competing against each other and people are voting for one of the candidates. Voters don’t communicate with each other while casting their votes. Which of the following ensemble methods works similarly to the election procedure described above? Hint: Persons are like base models of an ensemble method.
A.	bagging
B.	boosting
C.	a or b
D.	none of these
Answer» A. bagging

278.	If you use an ensemble of different base models, is it necessary to tune the hyperparameters of all base models to improve the ensemble performance?
A.	yes
B.	no
C.	can’t say
Answer» B. no

279.	Which of the following is NOT supervised learning?
A.	pca
B.	decision tree
C.	linear regression
D.	naive bayesian
Answer» A. pca

280.	According to ____________, it's a key success factor for the survival and evolution of all species.
A.	claude shannon's theory
B.	gini index
C.	darwin's theory
D.	none of above
Answer» C. darwin's theory

281.	How can you avoid overfitting?
A.	by using a lot of data
B.	by using inductive machine learning
C.	by using validation only
D.	none of above
Answer» A. by using a lot of data

282.	What are the popular algorithms of Machine Learning?
A.	decision trees and neural networks (back propagation)
B.	probabilistic networks and nearest neighbor
C.	support vector machines
D.	all
Answer» D. all

283.	What is Training set?
A.	training set is used to test the accuracy of the hypotheses generated by the learner.
B.	a set of data is used to discover the potentially predictive relationship.
C.	both a & b
D.	none of above
Answer» B. a set of data is used to discover the potentially predictive relationship.

284.	Common deep learning applications include
A.	image classification, real-time visual tracking
B.	autonomous car driving, logistic optimization
C.	bioinformatics, speech recognition
D.	all above
Answer» D. all above

285.	What is the function of Supervised Learning?
A.	classifications, predict time series, annotate strings
B.	speech recognition, regression
C.	both a & b
D.	none of above
Answer» C. both a & b