

McqMate
These multiple-choice questions (MCQs) are designed to enhance your knowledge and understanding of the following area: Computer Science Engineering (CSE).
251. |
True or False: Ensemble learning can only be applied to supervised learning methods. |
A. | true |
B. | false |
Answer» B. false |
252. |
True or False: Ensembles will yield bad results when there is significant diversity among the models. Note: all individual models make meaningful and accurate predictions. |
A. | true |
B. | false |
Answer» B. false |
253. |
Which of the following is/are true about the weak learners used in an ensemble model?
|
A. | 1 and 2 |
B. | 1 and 3 |
C. | 2 and 3 |
D. | none of these |
Answer» A. 1 and 2 |
254. |
True or False: An ensemble of classifiers may or may not be more accurate than any of its individual models. |
A. | true |
B. | false |
Answer» A. true |
255. |
If you use an ensemble of different base models, is it necessary to tune the hyperparameters of all base models to improve the ensemble performance? |
A. | yes |
B. | no |
C. | can’t say |
Answer» B. no |
256. |
Generally, an ensemble method works better if the individual base models have ____________. Note: assume each individual base model has accuracy greater than 50%. |
A. | less correlation among predictions |
B. | high correlation among predictions |
C. | correlation does not have any impact on ensemble output |
D. | none of the above |
Answer» A. less correlation among predictions |
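The effect behind this answer can be checked with a small simulation: three base models that each err 30% of the time gain nothing from majority voting when their mistakes are perfectly correlated, but beat any single model when the mistakes are independent. A minimal NumPy sketch (the sample size, model count, and error rate are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def majority_error(correlated, n_models=3, n_samples=10_000, eps=0.3):
    """Error rate of a majority vote when each model errs with probability eps."""
    if correlated:
        # every model makes exactly the same mistakes
        errs = np.tile(rng.random(n_samples) < eps, (n_models, 1))
    else:
        # mistakes are independent across models
        errs = rng.random((n_models, n_samples)) < eps
    return (errs.sum(axis=0) > n_models // 2).mean()

print(majority_error(correlated=True))   # stays near 0.30: voting gains nothing
print(majority_error(correlated=False))  # drops to about 0.22: voting helps
```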
257. |
In an election, N candidates are competing against each other and people vote for one of the candidates. Voters don’t communicate with each other while casting their votes. Which of the following ensemble methods works similarly to the above-described election procedure?
|
A. | bagging |
B. | boosting |
C. | a or b |
D. | none of these |
Answer» A. bagging |
258. |
Suppose there are 25 base classifiers, each with an error rate of e = 0.35. Assuming the classifiers are independent and the ensemble uses majority voting, what is the probability that the ensemble makes a wrong prediction?
|
A. | 0.05 |
B. | 0.06 |
C. | 0.07 |
D. | 0.09 |
Answer» B. 0.06 |
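The 0.06 follows from the binomial distribution: with 25 independent classifiers under majority voting, the ensemble errs only when 13 or more of them err. A quick check in Python:

```python
from math import comb

def ensemble_error(n, eps):
    """P(majority of n independent classifiers, each with error eps, is wrong)."""
    k = n // 2 + 1  # minimum number of wrong votes that flips the majority
    return sum(comb(n, i) * eps**i * (1 - eps)**(n - i) for i in range(k, n + 1))

print(round(ensemble_error(25, 0.35), 2))  # → 0.06, far below the base rate 0.35
```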
259. |
In machine learning, an algorithm (or learning algorithm) is said to be unstable if a small change in the training data causes a large change in the learned classifier. True or False: Bagging of unstable classifiers is a good idea. |
A. | true |
B. | false |
Answer» A. true |
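Bagging helps precisely because averaging bootstrap-trained copies reduces the variance of an unstable learner. A minimal sketch with scikit-learn (the dataset and estimator count are illustrative):

```python
# Bagging an unstable, high-variance base learner (a fully grown decision tree).
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)
tree = DecisionTreeClassifier(random_state=0)            # unstable base learner
bag = BaggingClassifier(tree, n_estimators=50, random_state=0)

single = cross_val_score(tree, X, y, cv=5).mean()
bagged = cross_val_score(bag, X, y, cv=5).mean()
print(single, bagged)  # bagging typically scores at least as well as one tree
```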
260. |
Which of the following parameters can be tuned for finding a good ensemble model in bagging-based algorithms?
|
A. | 1 and 3 |
B. | 2 and 3 |
C. | 1 and 2 |
D. | all of above |
Answer» D. all of above |
261. |
How is model capacity affected by the dropout rate (where model capacity means the ability of a neural network to approximate complex functions)? |
A. | model capacity increases with an increase in dropout rate |
B. | model capacity decreases with an increase in dropout rate |
C. | model capacity is not affected by an increase in dropout rate |
D. | none of these |
Answer» B. model capacity decreases with an increase in dropout rate |
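Dropout zeroes a random fraction of activations, so a higher rate leaves fewer effective units per forward pass, reducing capacity. A minimal NumPy sketch of inverted dropout (the rescaling keeps the expected activation unchanged):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, rate):
    """Inverted dropout: zero each unit with probability `rate`
    and rescale the survivors so the expected activation is unchanged."""
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

a = np.ones(10_000)
out = dropout(a, 0.5)
print((out == 0).mean())  # roughly half the units are dropped
print(out.mean())         # expected activation stays near 1.0
```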
262. |
True or False: Dropout is a computationally expensive technique compared with bagging. |
A. | true |
B. | false |
Answer» B. false |
263. |
Suppose you want to apply a stepwise forward selection method for choosing the best models for an ensemble. Which of the following is the correct order of the steps?
|
A. | 1-2-3 |
B. | 1-3-4 |
C. | 2-1-3 |
D. | none of above |
Answer» D. none of above |
264. |
Suppose you have 2000 different models with their predictions and want to ensemble the predictions of the best x models. Which of the following can be a possible method to select the best x models for an ensemble? |
A. | stepwise forward selection |
B. | stepwise backward elimination |
C. | both |
D. | none of above |
Answer» C. both |
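Stepwise forward selection can be sketched as a greedy loop: start from an empty ensemble and keep adding whichever model most improves the averaged prediction. A toy implementation, assuming binary 0/1 predictions (the helper name and data are illustrative):

```python
import numpy as np

def forward_select(preds, y, max_models=5):
    """Greedy stepwise forward selection over model predictions.
    `preds` is (n_models, n_samples) of 0/1 outputs; the ensemble is a
    majority vote (mean >= 0.5) over the chosen models."""
    chosen, best_acc = [], 0.0
    for _ in range(max_models):
        scores = []
        for i in range(len(preds)):
            trial = chosen + [i]
            avg = preds[trial].mean(axis=0) >= 0.5
            scores.append(((avg == y).mean(), i))
        acc, i = max(scores)
        if acc <= best_acc:      # stop when no model improves the ensemble
            break
        chosen.append(i)
        best_acc = acc
    return chosen, best_acc

# toy check: model 0 is perfect, so greedy selection should pick it first
y = np.array([0, 1, 0, 1, 1, 0])
preds = np.array([y, 1 - y, [0, 1, 1, 1, 1, 0]])
print(forward_select(preds, y))  # → ([0], 1.0)
```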
265. |
Below are the two ensemble models:
|
A. | e1 |
B. | e2 |
C. | any of e1 and e2 |
D. | none of these |
Answer» B. e2 |
266. |
True or False: In boosting, individual base learners can be trained in parallel. |
A. | true |
B. | false |
Answer» B. false |
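Boosting is sequential because each round's sample weights depend on the previous round's mistakes. A bare-bones AdaBoost-style loop makes the dependency explicit (labels in {-1, +1}; the toy data and use of scikit-learn decision stumps are illustrative):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

X = np.array([[0.], [1.], [2.], [3.], [4.], [5.]])
y = np.array([1, 1, -1, -1, 1, 1])
w = np.ones(len(y)) / len(y)          # uniform initial sample weights

for _ in range(3):
    # each stump is trained on weights produced by the PREVIOUS stump,
    # so the rounds cannot run in parallel
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    err = w[pred != y].sum()
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
    w *= np.exp(-alpha * y * pred)    # upweight the misclassified samples
    w /= w.sum()

print(w)  # weights concentrate on the hard-to-classify points
```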
267. |
Which of the following is true about bagging?
|
A. | 1 and 2 |
B. | 2 and 3 |
C. | 1 and 3 |
D. | all of these |
Answer» C. 1 and 3 |
268. |
Suppose you are using stacking with n different machine learning algorithms with k folds on the data. Which of the following is true about the features generated in the first stage?
|
A. | you will have only k features after the first stage |
B. | you will have only n features after the first stage |
C. | you will have k+n features after the first stage |
D. | you will have k*n features after the first stage |
Answer» B. you will have only n features after the first stage |
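The first stage of stacking turns each base model into one out-of-fold prediction column, so with n base models the meta-learner sees n new features regardless of k. A minimal sketch with scikit-learn (the three base models and data sizes are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=300, random_state=0)
base_models = [LogisticRegression(max_iter=1000),
               DecisionTreeClassifier(random_state=0),
               GaussianNB()]
k = 5
# one out-of-fold prediction vector per base model:
meta_features = np.column_stack(
    [cross_val_predict(m, X, y, cv=k) for m in base_models])
print(meta_features.shape)  # (300, 3): one column per base model, not per fold
```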
269. |
Which of the following is the difference between stacking and blending? |
A. | stacking has less stable cv compared to blending |
B. | in blending, you create out-of-fold predictions |
C. | stacking is simpler than blending |
D. | none of these |
Answer» D. none of these |
270. |
Which of the following can be one of the steps in stacking?
|
A. | 1 and 2 |
B. | 2 and 3 |
C. | 1 and 3 |
D. | all of above |
Answer» A. 1 and 2 |
271. |
Which of the following are advantages of stacking?
|
A. | 1 and 2 |
B. | 2 and 3 |
C. | 1 and 3 |
D. | all of the above |
Answer» A. 1 and 2 |
272. |
Which of the following are correct statement(s) about stacking?
|
A. | 1 and 2 |
B. | 2 and 3 |
C. | 1 and 3 |
D. | all of above |
Answer» C. 1 and 3 |
273. |
Which of the following is true about weighted majority votes?
|
A. | 1 and 3 |
B. | 2 and 3 |
C. | 1 and 2 |
D. | 1, 2 and 3 |
Answer» D. 1, 2 and 3 |
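A weighted majority vote sums the weights behind each class and picks the heavier side; the weights are often derived from validation accuracy. A tiny NumPy illustration (the votes and weights are made up):

```python
import numpy as np

votes   = np.array([1, 0, 1])        # class predicted by each of 3 models
weights = np.array([0.5, 0.3, 0.2])  # e.g. normalized validation accuracies
score_for_1 = weights[votes == 1].sum()
decision = 1 if score_for_1 > 0.5 else 0
print(score_for_1, decision)  # 0.7 of the weight backs class 1, so it wins
```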
274. |
Which of the following is true about averaging ensemble? |
A. | it can only be used in classification problem |
B. | it can only be used in regression problem |
C. | it can be used in both classification as well as regression |
D. | none of these |
Answer» C. it can be used in both classification as well as regression |
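Averaging applies to both settings: average raw outputs for regression, or average class probabilities and take the argmax for classification. A small NumPy illustration (the numbers are made up):

```python
import numpy as np

# Regression: 3 models' predictions for 2 samples, averaged element-wise.
reg_preds = np.array([[2.1, 3.9], [1.9, 4.2], [2.0, 4.0]])
ens_reg = reg_preds.mean(axis=0)
print(ens_reg)  # one averaged prediction per sample

# Classification: 3 models' class probabilities for 1 sample, 2 classes.
proba = np.array([[[0.7, 0.3]], [[0.6, 0.4]], [[0.1, 0.9]]])
ens_label = proba.mean(axis=0).argmax(axis=1)
print(ens_label)  # the class with the larger averaged probability wins
```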
275. |
How can we assign weights to the outputs of different models in an ensemble?
|
A. | 1 and 2 |
B. | 1 and 3 |
C. | 2 and 3 |
D. | all of above |
Answer» D. all of above |
276. |
Suppose you are given ‘n’ predictions on test data by ‘n’ different models (M1, M2, …, Mn) respectively. Which of the following method(s) can be used to combine the predictions of these models?
|
A. | 1, 3 and 4 |
B. | 1,3 and 6 |
C. | 1,3, 4 and 6 |
D. | all of above |
Answer» D. all of above |
277. |
In an election, N candidates are competing against each other and people vote for one of the candidates. Voters don’t communicate with each other while casting their votes. Which of the following ensemble methods works similarly to the above-described election procedure? Hint: voters act like the base models of an ensemble method. |
A. | bagging |
B. | boosting |
C. | a or b |
D. | none of these |
Answer» A. bagging |
278. |
If you use an ensemble of different base models, is it necessary to tune the hyperparameters of all base models to improve the ensemble performance? |
A. | yes |
B. | no |
C. | can’t say |
Answer» B. no |
279. |
Which of the following is NOT supervised learning? |
A. | pca |
B. | decision tree |
C. | linear regression |
D. | naive bayesian |
Answer» A. pca |
280. |
According to ________, it's a key success factor for the survival and evolution of all species. |
A. | claude shannon's theory |
B. | gini index |
C. | darwin's theory |
D. | none of above |
Answer» C. darwin's theory |
281. |
How can you avoid overfitting? |
A. | by using a lot of data |
B. | by using inductive machine learning |
C. | by using validation only |
D. | none of above |
Answer» A. by using a lot of data |
282. |
What are the popular algorithms of Machine Learning? |
A. | decision trees and neural networks (back propagation) |
B. | probabilistic networks and nearest neighbor |
C. | support vector machines |
D. | all |
Answer» D. all |
283. |
What is Training set? |
A. | training set is used to test the accuracy of the hypotheses generated by the learner. |
B. | a set of data used to discover the potentially predictive relationship. |
C. | both a & b |
D. | none of above |
Answer» B. a set of data used to discover the potentially predictive relationship. |
284. |
Common deep learning applications include |
A. | image classification, real-time visual tracking |
B. | autonomous car driving, logistic optimization |
C. | bioinformatics, speech recognition |
D. | all above |
Answer» D. all above |
285. |
What is the function of supervised learning? |
A. | classifications, predict time series, annotate strings |
B. | speech recognition, regression |
C. | both a & b |
D. | none of above |
Answer» C. both a & b |
286. |
Common unsupervised applications include |
A. | object segmentation |
B. | similarity detection |
C. | automatic labeling |
D. | all above |
Answer» D. all above |
287. |
Reinforcement learning is particularly efficient when ________. |
A. | the environment is not completely deterministic |
B. | it's often very dynamic |
C. | it's impossible to have a precise error measure |
D. | all above |
Answer» D. all above |
288. |
If there is only a discrete number of possible outcomes (called categories), the process becomes a ________. |
A. | regression |
B. | classification. |
C. | modelfree |
D. | categories |
Answer» B. classification. |
289. |
Which of the following are supervised learning applications |
A. | spam detection, pattern detection, natural language processing |
B. | image classification, real-time visual tracking |
C. | autonomous car driving, logistic optimization |
D. | bioinformatics, speech recognition |
Answer» A. spam detection, pattern detection, natural language processing |
290. |
During the last few years, many ________ algorithms have been applied to deep neural networks to learn the best policy for playing Atari video games and to teach an agent how to associate the right action with an input representing the state. |
A. | logical |
B. | classical |
C. | classification |
D. | none of above |
Answer» D. none of above |
291. |
Which of the following sentence is correct? |
A. | machine learning relates with the study, design and |
B. | data mining can be defined as the process in which the |
C. | both a & b |
D. | none of the above |
Answer» C. both a & b |
292. |
What is Overfitting in Machine learning? |
A. | overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship. |
B. | robots are programmed so that they can perform the task based on data they gather from sensors. |
C. | overfitting occurs while involving the process of learning. |
D. | a set of data is used to discover the potentially predictive relationship |
Answer» A. overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship. |
293. |
What is Test set? |
A. | test set is used to test the accuracy of the hypotheses generated by the learner. |
B. | it is a set of data used to discover the potentially predictive relationship. |
C. | both a & b |
D. | none of above |
Answer» A. test set is used to test the accuracy of the hypotheses generated by the learner. |
294. |
________ is much more difficult because it's necessary to determine a supervised strategy to train a model for each feature and, finally, to predict their values. |
A. | removing the whole line |
B. | creating sub-model to predict those features |
C. | using an automatic strategy to input them according to the other known values |
D. | all above |
Answer» B. creating sub-model to predict those features |
295. |
It's possible to use a different placeholder through the parameter ________. |
A. | regression |
B. | classification |
C. | random_state |
D. | missing_values |
Answer» D. missing_values |
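This question refers to scikit-learn's imputer, whose missing_values parameter declares which placeholder marks a missing entry (the modern class is SimpleImputer; the older Imputer class exposed the same parameter). A minimal sketch with illustrative data:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Use -1 (instead of the default np.nan) as the missing-value placeholder.
X = np.array([[1.0, 2.0],
              [-1.0, 4.0],
              [5.0, -1.0]])
imp = SimpleImputer(missing_values=-1, strategy='mean')
out = imp.fit_transform(X)
print(out)  # each -1 is replaced by its column mean
```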
296. |
If you need a more powerful scaling feature, with superior control over outliers and the possibility to select a quantile range, there's also the class ________. |
A. | robustscaler |
B. | dictvectorizer |
C. | labelbinarizer |
D. | featurehasher |
Answer» A. robustscaler |
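RobustScaler centers each feature on its median and scales by a quantile range (the interquartile range by default), so an outlier barely influences the scaling. A small sketch (the data is illustrative):

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

X = np.array([[1.0], [2.0], [3.0], [100.0]])  # 100.0 is an outlier
scaler = RobustScaler(quantile_range=(25.0, 75.0))
out = scaler.fit_transform(X).ravel()
print(out)  # 2 and 3 sit symmetrically around the median 2.5
```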
297. |
scikit-learn also provides a class for per-sample normalization, Normalizer. It can apply ________ to each element of a dataset
A. | max, l0 and l1 norms |
B. | max, l1 and l2 norms |
C. | max, l2 and l3 norms |
D. | max, l3 and l4 norms |
Answer» B. max, l1 and l2 norms |
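Normalizer rescales each sample (row) independently to unit norm, with norm='max', 'l1', or 'l2'. A quick check on a single row:

```python
import numpy as np
from sklearn.preprocessing import Normalizer

X = np.array([[3.0, 4.0]])
l2 = Normalizer(norm='l2').fit_transform(X)   # divide by sqrt(3^2 + 4^2) = 5
l1 = Normalizer(norm='l1').fit_transform(X)   # divide by |3| + |4| = 7
mx = Normalizer(norm='max').fit_transform(X)  # divide by max(|3|, |4|) = 4
print(l2, l1, mx)
```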
298. |
There are also many univariate methods that can be used in order to select the best features according to specific criteria based on ________. |
A. | f-tests and p-values |
B. | chi-square |
C. | anova |
D. | all above |
Answer» A. f-tests and p-values |
299. |
Which of the following selects only a subset of features belonging to a certain percentile?
A. | selectpercentile |
B. | featurehasher |
C. | selectkbest |
D. | all above |
Answer» A. selectpercentile |
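SelectPercentile keeps only the features whose univariate score (here the ANOVA F-test) falls in the requested top percentile. A minimal sketch with scikit-learn (the dataset sizes are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectPercentile, f_classif

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
selector = SelectPercentile(f_classif, percentile=25)
X_new = selector.fit_transform(X, y)
print(X_new.shape)  # (200, 5): the top 25% of 20 features
```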
300. |
________ performs a PCA with non-linearly separable datasets. |
A. | sparsepca |
B. | kernelpca |
C. | svd |
D. | none of the mentioned |
Answer» B. kernelpca |
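KernelPCA with a non-linear kernel can handle datasets that plain PCA cannot separate linearly, such as two concentric circles. A minimal sketch with scikit-learn (the kernel choice and gamma value are illustrative):

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

# Two concentric circles: not linearly separable in the original space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)
kpca = KernelPCA(n_components=2, kernel='rbf', gamma=10)
X_kpca = kpca.fit_transform(X)
print(X_kpca.shape)  # (200, 2): projection onto the top 2 kernel components
```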