400+ Data Mining and Data Warehouse Solved MCQs

201.	The ___________ is a long, single fibre that originates from the cell body.
A.	axon.
B.	neuron.
C.	dendrites.
D.	strands.
Answer» A. axon.

202.	A single axon makes ___________ of synapses with other neurons.
A.	ones.
B.	hundreds.
C.	thousands.
D.	millions.
Answer» C. thousands.

203.	_____________ is a complex chemical process in neural networks.
A.	receiving process.
B.	sending process.
C.	transmission process.
D.	switching process.
Answer» C. transmission process.

204.	_________ is the connectivity of the neuron that gives simple devices their real power.
A.	water.
B.	air.
C.	power.
D.	fire.
Answer» D. fire.

205.	__________ are highly simplified models of biological neurons.
A.	artificial neurons.
B.	computational neurons.
C.	biological neurons.
D.	technological neurons.
Answer» A. artificial neurons.

206.	The biological neuron's _________ is a continuous function rather than a step function.
A.	read.
B.	write.
C.	output.
D.	input.
Answer» C. output.

207.	The threshold function is replaced by continuous functions called ________ functions.
A.	activation.
B.	deactivation.
C.	dynamic.
D.	standard.
Answer» A. activation.

208.	The sigmoid function is also known as the __________ function.
A.	regression.
B.	logistic.
C.	probability.
D.	neural.
Answer» B. logistic.
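
Questions 207 and 208 contrast a hard threshold with a continuous activation. A minimal Python sketch of the two follows; the names `step` and `sigmoid` and the sample inputs are illustrative choices, not part of the question set.

```python
import math

def step(x, threshold=0.0):
    """Hard threshold unit: the output jumps from 0 to 1 at the threshold."""
    return 1.0 if x >= threshold else 0.0

def sigmoid(x):
    """Logistic (sigmoid) function: a smooth squashing of x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Compare the two activations on a few sample inputs.
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  step={step(x):.0f}  sigmoid={sigmoid(x):.4f}")
```

The step output changes abruptly at the threshold, while the logistic output changes gradually, which is what makes it usable with gradient-based training.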

209.	MLP stands for ______________________.
A.	mono layer perceptron.
B.	many layer perceptron.
C.	more layer perceptron.
D.	multi layer perceptron.
Answer» D. multi layer perceptron.

210.	In a feed-forward network, the connections between layers are ___________ from input to output.
A.	bidirectional.
B.	unidirectional.
C.	multidirectional.
D.	directional.
Answer» B. unidirectional.

211.	The network topology is constrained to be __________________.
A.	feedforward.
B.	feedbackward.
C.	feed free.
D.	feed busy.
Answer» A. feedforward.

212.	RBF stands for _____________.
A.	radial basis function.
B.	radial bio function.
C.	radial big function.
D.	radial bi function.
Answer» A. radial basis function.

213.	RBF networks have only _______________ hidden layer.
A.	four.
B.	three.
C.	two.
D.	one.
Answer» D. one.

214.	RBF hidden layer units have a receptive field which has a ____________; that is, a particular input value at which they have a maximal output.
A.	top.
B.	bottom.
C.	centre.
D.	border.
Answer» C. centre.

215.	___________ training may be used when a clear link between input data sets and target output values does not exist.
A.	competitive.
B.	perception.
C.	supervised.
D.	unsupervised.
Answer» D. unsupervised.

216.	___________ employs the supervised mode of learning.
A.	rbf.
B.	mlp.
C.	mlp & rbf.
D.	ann.
Answer» C. mlp & rbf.

217.	________________ design involves deciding on their centres and the sharpness of their Gaussians.
A.	dr.
B.	and.
C.	xor.
D.	rbf.
Answer» D. rbf.
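
Questions 214 and 217 describe an RBF hidden unit by its centre and the sharpness of its Gaussian. The sketch below assumes the common one-dimensional Gaussian form exp(-(x - c)^2 / (2 w^2)); the function name `gaussian_rbf` and the parameter `width` are illustrative, not taken from the questions.

```python
import math

def gaussian_rbf(x, centre, width):
    """Response of one Gaussian hidden unit; a smaller `width` gives a sharper peak."""
    return math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))

# The response is maximal (1.0) at the centre and falls off on either side.
for x in (1.0, 1.5, 2.0, 2.5, 3.0):
    print(f"x={x:.1f}  response={gaussian_rbf(x, centre=2.0, width=0.5):.4f}")
```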

218.	___________ is the most widely applied neural network technique.
A.	abc.
B.	plm.
C.	lmp.
D.	mlp.
Answer» D. mlp.

219.	SOM is an acronym of _______________.
A.	self-organizing map.
B.	self origin map.
C.	single organizing map.
D.	simple origin map.
Answer» A. self-organizing map.

220.	____________ is one of the most popular models in the unsupervised framework.
A.	som.
B.	sam.
C.	osm.
D.	mso.
Answer» A. som.

221.	The actual amount of reduction at each learning step may be guided by _________.
A.	learning cost.
B.	learning level.
C.	learning rate.
D.	learning time.
Answer» C. learning rate.

222.	The SOM was a neural network model developed by ________.
A.	simon king.
B.	teuvo kohonen.
C.	tomoki toda.
D.	julia.
Answer» B. teuvo kohonen.

223.	SOM was developed during ____________.
A.	1970-80.
B.	1980-90.
C.	1990 -60.
D.	1979-82.
Answer» D. 1979-82.

224.	In investment analysis, neural networks are used to predict the movement of _________ from previous data.
A.	engines.
B.	stock.
C.	patterns.
D.	models.
Answer» B. stock.

225.	SOMs are used to cluster a specific _____________ dataset containing information about the patient's drugs etc.
A.	physical.
B.	logical.
C.	medical.
D.	technical.
Answer» C. medical.

226.	GA stands for _______________.
A.	genetic algorithm
B.	gene algorithm.
C.	general algorithm.
D.	geo algorithm.
Answer» A. genetic algorithm

227.	GA was introduced in the year __________.
A.	1955.
B.	1965.
C.	1975.
D.	1985.
Answer» C. 1975.

228.	Genetic algorithms are search algorithms based on the mechanics of natural _______.
A.	systems.
B.	genetics.
C.	logistics.
D.	statistics.
Answer» B. genetics.

229.	GAs were developed in the early _____________.
A.	1970.
B.	1960.
C.	1950.
D.	1940.
Answer» A. 1970.

230.	The RSES system was developed in ___________.
A.	poland.
B.	italy.
C.	england.
D.	america.
Answer» A. poland.

231.	Crossover is used to _______.
A.	recombine the population's genetic material.
B.	introduce new genetic structures in the population.
C.	to modify the population's genetic material.
D.	all of the above.
Answer» A. recombine the population's genetic material.

232.	The mutation operator ______.
A.	recombine the population's genetic material.
B.	introduce new genetic structures in the population.
C.	to modify the population's genetic material.
D.	all of the above.
Answer» B. introduce new genetic structures in the population.
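
Questions 231 and 232 distinguish crossover (recombining existing material) from mutation (introducing new structures). The sketch below assumes bit-string chromosomes with one-point crossover and bit-flip mutation, which are common textbook choices rather than the only ones; the function names are illustrative.

```python
import random

def one_point_crossover(parent_a, parent_b, point):
    """Recombine two parents by swapping their tails at index `point`."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def bit_flip_mutation(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

rng = random.Random(0)  # seeded only so that runs are repeatable
a, b = [0] * 6, [1] * 6
child_a, child_b = one_point_crossover(a, b, point=2)
print(child_a, child_b)  # [0, 0, 1, 1, 1, 1] [1, 1, 0, 0, 0, 0]
print(bit_flip_mutation(child_a, rate=0.1, rng=rng))
```

Crossover alone can only rearrange bits that are already present in the parents; mutation is what lets a bit value absent from the whole population appear.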

233.	Which of the following is an operation in a genetic algorithm?
A.	inversion.
B.	dominance.
C.	genetic edge recombination.
D.	all of the above.
Answer» D. all of the above.

234.	___________ is a system created for rule induction.
A.	rbs.
B.	cbs.
C.	dbs.
D.	lers.
Answer» D. lers.

235.	NLP stands for _________.
A.	non language process.
B.	nature level program.
C.	natural language page.
D.	natural language processing.
Answer» D. natural language processing.

236.	Web content mining describes the discovery of useful information from the _______ contents.
A.	text.
B.	web.
C.	page.
D.	level.
Answer» B. web.

237.	Research on mining multiple types of data is termed _______ data mining.
A.	graphics.
B.	multimedia.
C.	meta.
D.	digital.
Answer» B. multimedia.

238.	_______ mining is concerned with discovering the model underlying the link structures of the web.
A.	data structure.
B.	web structure.
C.	text structure.
D.	image structure.
Answer» B. web structure.

239.	_________ is the way of studying the web link structure.
A.	computer network.
B.	physical network.
C.	social network.
D.	logical network.
Answer» C. social network.

240.	The ________ proposes a measure of the standing of a node based on path counting.
A.	open web.
B.	close web.
C.	link web.
D.	hidden web.
Answer» B. close web.

241.	In web mining, _______ is used to find natural groupings of users, pages, etc.
A.	clustering.
B.	associations.
C.	sequential analysis.
D.	classification.
Answer» A. clustering.

242.	In web mining, _________ is used to know the order in which URLs tend to be accessed.
A.	clustering.
B.	associations.
C.	sequential analysis.
D.	classification.
Answer» C. sequential analysis.

243.	In web mining, _________ is used to know which URLs tend to be requested together.
A.	clustering.
B.	associations.
C.	sequential analysis.
D.	classification.
Answer» B. associations.

244.	__________ describes the discovery of useful information from the web contents.
A.	web content mining.
B.	web structure mining.
C.	web usage mining.
D.	all of the above.
Answer» A. web content mining.

245.	_______ is concerned with discovering the model underlying the link structures of the web.
A.	web content mining.
B.	web structure mining.
C.	web usage mining.
D.	all of the above.
Answer» B. web structure mining.

246.	The ___________ engine for a data warehouse supports query-triggered usage of data
A.	nntp
B.	smtp
C.	olap
D.	pop
Answer» C. olap

247.	________ displays of data such as maps, charts and other graphical representation allow data to be presented compactly to the users.
A.	hidden
B.	visual
C.	obscured
D.	concealed
Answer» B. visual

248.	__________ is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management decisions.
A.	data mining.
B.	data warehousing.
C.	web mining.
D.	text mining.
Answer» B. data warehousing.

249.	The important aspect of the data warehouse environment is that data found within the data warehouse is ___________.
A.	subject-oriented.
B.	time-variant.
C.	integrated.
D.	all of the above.
Answer» D. all of the above.

250.	_________ maps the core warehouse metadata to business concepts that are familiar and useful to end users.
A.	application level metadata.
B.	user level metadata.
C.	enduser level metadata.
D.	core level metadata.
Answer» A. application level metadata.

251.	Data redundancy between the environments results in less than ____________ percent.
A.	one.
B.	two.
C.	three.
D.	four.
Answer» A. one.

252.	Bill Inmon has estimated that ___________ of the time required to build a data warehouse is consumed in the conversion process.
A.	10 percent.
B.	20 percent.
C.	40 percent
D.	80 percent.
Answer» D. 80 percent.

253.	The biggest drawback of the level indicator in the classic star schema is that it limits _________.
A.	quantify.
B.	qualify.
C.	flexibility.
D.	ability.
Answer» C. flexibility.

254.	Maintenance of cache consistency is the limitation of __________________.
A.	numa.
B.	unam.
C.	mpp.
D.	pmp.
Answer» C. mpp.

255.	___________ of data means that the attributes within a given entity are fully dependent on the entire primary key of the entity.
A.	additivity.
B.	granularity.
C.	functional dependency.
D.	dimensionality.
Answer» C. functional dependency.

256.	Non-additive measures can often be combined with additive measures to create new _________.
A.	additive measures.
B.	non-additive measures.
C.	partially additive.
D.	all of the above.
Answer» A. additive measures.

257.	____________ of data means that the attributes within a given entity are fully dependent on the entire primary key of the entity.
A.	additivity.
B.	granularity.
C.	functional dependency.
D.	dependency.
Answer» C. functional dependency.

258.	_____________ helps to uncover hidden information about the data.
A.	induction.
B.	compression.
C.	approximation.
D.	summarization.
Answer» C. approximation.

259.	If T consists of 500,000 transactions, of which 20,000 contain bread, 30,000 contain jam, and 10,000 contain both bread and jam, then the support of bread and jam is _______.
A.	2%
B.	20%
C.	3%
D.	30%
Answer» A. 2%

260.	If T consists of 500,000 transactions, of which 20,000 contain bread, 30,000 contain jam, and 10,000 contain both bread and jam, then the confidence of the rule bread → jam is _______.
A.	33.33%
B.	66.66%
C.	45%
D.	50%
Answer» D. 50%
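
The arithmetic behind questions 259 and 260 can be checked directly. The sketch below uses the standard definitions, support(X ∪ Y) = count(X and Y) / |T| and confidence(X → Y) = count(X and Y) / count(X); the variable names are illustrative.

```python
total = 500_000   # transactions in T
bread = 20_000    # transactions containing bread
jam = 30_000      # transactions containing jam
both = 10_000     # transactions containing both items

support_both = both / total        # share of all transactions with both items
conf_bread_to_jam = both / bread   # of the bread buyers, the share who also buy jam
conf_jam_to_bread = both / jam     # of the jam buyers, the share who also buy bread

print(f"support(bread, jam)     = {support_both:.2%}")
print(f"confidence(bread → jam) = {conf_bread_to_jam:.2%}")
print(f"confidence(jam → bread) = {conf_jam_to_bread:.2%}")
```

Confidence is direction-dependent: the keyed answer of 50% corresponds to bread → jam, while the reverse rule jam → bread gives about 33.33%.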

261.	The _______ step eliminates the extensions of (k-1)-itemsets which are not found to be frequent, from being considered for counting support.
A.	candidate generation.
B.	pruning.
C.	partitioning.
D.	itemset eliminations.
Answer» B. pruning.

262.	The transformed prefix paths of a node 'a', which form a truncated database of patterns that co-occur with 'a', are called the _______.
A.	suffix path.
B.	fp-tree.
C.	conditional pattern base.
D.	prefix path.
Answer» C. conditional pattern base.

263.	__________ clustering techniques start with all records in one cluster and then try to split that cluster into smaller pieces.
A.	agglomerative.
B.	divisive.
C.	partition.
D.	numeric.
Answer» B. divisive.

264.	BIRCH is a ________.
A.	agglomerative clustering algorithm.
B.	hierarchical algorithm.
C.	hierarchical-agglomerative algorithm.
D.	divisive.
Answer» C. hierarchical-agglomerative algorithm.

265.	The ________ algorithm is based on the observation that the frequent sets are normally very few in number compared to the set of all itemsets.
A.	a priori.
B.	clustering.
C.	association rule.
D.	partition.
Answer» D. partition.

266.	The basic idea of the apriori algorithm is to generate ________ item sets of a particular size and then scan the database.
A.	candidate.
B.	primary.
C.	secondary.
D.	superkey.
Answer» A. candidate.

267.	________ is the most well known association rule algorithm and is used in most commercial products.
A.	apriori algorithm.
B.	partition algorithm.
C.	distributed algorithm.
D.	pincer-search algorithm.
Answer» A. apriori algorithm.

268.	An algorithm called ________ is used to generate the candidate item sets for each pass after the first.
A.	apriori.
B.	apriori-gen.
C.	sampling.
D.	partition.
Answer» B. apriori-gen.
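
Questions 261, 266 and 268 refer to candidate generation and pruning in apriori. The sketch below is a simplified illustration: it joins by set union instead of the prefix-based join usually described for apriori-gen, then applies the same pruning rule (every (k-1)-subset of a candidate must be frequent). The function name `apriori_gen` and the sample itemsets are assumptions for the example.

```python
from itertools import combinations

def apriori_gen(frequent_prev):
    """Build candidate k-itemsets from frequent (k-1)-itemsets: join, then prune."""
    prev = {frozenset(s) for s in frequent_prev}
    if not prev:
        return set()
    k = len(next(iter(prev))) + 1
    # Join step: union pairs of (k-1)-itemsets that differ in exactly one item.
    joined = {a | b for a in prev for b in prev if len(a | b) == k}
    # Prune step: drop any candidate that has an infrequent (k-1)-subset.
    return {c for c in joined
            if all(frozenset(s) in prev for s in combinations(c, k - 1))}

frequent_2 = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"b", "d"}]
print(apriori_gen(frequent_2))  # only {a, b, c} survives pruning
```

Here {a, b, d} and {b, c, d} are joined but then pruned, because {a, d} and {c, d} are not frequent, so their support never needs to be counted.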

269.	___________ can be thought of as classifying an attribute value into one of a set of possible classes.
A.	estimation.
B.	prediction.
C.	identification.
D.	clarification.
Answer» B. prediction.

270.	____________ are a different paradigm for computing which draws its inspiration from neuroscience.
A.	computer networks.
B.	neural networks.
C.	mobile networks.
D.	artificial networks.
Answer» B. neural networks.

271.	In a feed-forward network, the connections between layers are ___________ from input to output.
A.	bidirectional.
B.	unidirectional.
C.	multidirectional.
D.	directional.
Answer» B. unidirectional.

272.	___________ training may be used when a clear link between input data sets and target output values does not exist.
A.	competitive.
B.	perception.
C.	supervised.
D.	unsupervised.
Answer» D. unsupervised.

273.	In investment analysis, neural networks are used to predict the movement of _________ from previous data.
A.	engines.
B.	stock.
C.	patterns.
D.	models.
Answer» B. stock.

274.	SOMs are used to cluster a specific _____________ dataset containing information about the patient's drugs etc.
A.	physical.
B.	logical.
C.	medical.
D.	technical.
Answer» C. medical.

275.	_______ is concerned with discovering the model underlying the link structures of the web.
A.	web content mining.
B.	web structure mining.
C.	web usage mining.
D.	all of the above.
Answer» B. web structure mining.

276.	A link is said to be _________ link if it is between pages with different domain names.
A.	intrinsic.
B.	transverse.
C.	direct.
D.	contrast.
Answer» B. transverse.

277.	A link is said to be _______ link if it is between pages with the same domain name.
A.	intrinsic.
B.	transverse.
C.	direct.
D.	contrast.
Answer» A. intrinsic.

278.	Patterns that can be discovered from a given database are of which type?
A.	more than one type
B.	multiple types always
C.	one type only
D.	no specific type
Answer» A. more than one type

279.	A snowflake schema contains which of the following types of tables?
A.	fact
B.	dimension
C.	helper
D.	all of the above
Answer» D. all of the above

280.	Which one manages both current and historic transactions?
A.	oltp
B.	olap
C.	spread sheet
D.	xml
Answer» B. olap

281.	In data warehousing, DSS stands for __________.
A.	decision support system
B.	decision single system
C.	data storable system
D.	data support system
Answer» A. decision support system

282.	__________ describes the data contained in the data warehouse.
A.	relational data
B.	operational data
C.	meta data
D.	informational data
Answer» C. meta data

283.	Converting data from different sources into a common format for processing is called ________.
A.	selection.
B.	preprocessing
C.	transformation
D.	interpretation
Answer» C. transformation

284.	Data warehousing is used in _______________.
A.	transaction system
B.	database management system
C.	decision support system
D.	expert system
Answer» C. decision support system

285.	A data warehouse is based on a _____________.
A.	two dimensional model
B.	three dimensional model
C.	multi dimensional model
D.	unidimensional model
Answer» C. multi dimensional model

286.	The multidimensional model of a data warehouse is called a _____.
A.	data structure
B.	table
C.	tree
D.	data cube
Answer» D. data cube

287.	In data warehousing what is time-variant data?
A.	data in the warehouse is only accurate and valid at some point in time or over time interval
B.	data in the warehouse is always accurate and valid
C.	data in the warehouse is not accurate
D.	data in the warehouse is only accurate sometimes
Answer» A. data in the warehouse is only accurate and valid at some point in time or over time interval

288.	What is a Star Schema?
A.	a star schema consists of a fact table with a single table for each dimension
B.	a star schema is a type of database system
C.	a star schema is used when exporting data from the database
D.	none of these
Answer» A. a star schema consists of a fact table with a single table for each dimension

289.	What does the acronym ETL stand for?
A.	explain, transfer and lead
B.	extract, transform and load
C.	extract, transfer and load
D.	effect, transfer and load
Answer» B. extract, transform and load

290.	In which small logical units do data warehouses hold large amounts of information?
A.	data storage
B.	data marts
C.	access layers
D.	data miners
Answer» B. data marts

291.	Which one is correct for data warehousing?
A.	it can be updated by end users
B.	it can solve all business questions
C.	it is designed for focus subject areas
D.	it contains only current data
Answer» C. it is designed for focus subject areas

292.	A fact table is related to a dimension table through a ___ relationship.
A.	1:m
B.	m:n
C.	m:1
D.	1:1
Answer» C. m:1

293.	Minkowski distance is a function used to find the distance between two _______.
A.	binary vectors
B.	boolean-valued vectors
C.	real-valued vectors
D.	categorical vectors
Answer» C. real-valued vectors

294.	The designation data set {Professor, Assistant Professor, Associate Professor} is an example of a __________ attribute.
A.	continuous
B.	ordinal
C.	numeric
D.	nominal
Answer» D. nominal

295.	Identify the correct example of Nominal Attributes.
A.	weight of person in kg
B.	income categories - high, medium, low
C.	mobile number
D.	all above
Answer» C. mobile number

296.	When objects are represented using a single attribute, a proximity value of 1 indicates:
A.	objects are similar
B.	objects are dissimilar
C.	not equal
D.	reflexive
Answer» A. objects are similar

297.	Identify the correct equation for the Jaccard coefficient:
A.	j = f11 / (f01 + f10 + f11)
B.	j = (f11 + f00) / (f01 + f10 + f11)
C.	j = (f11 + f00) / (f01 + f10)
D.	none of these
Answer» A. j = f11 / (f01 + f10 + f11)
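
The Jaccard coefficient in question 297 counts matching 1s and ignores 0-0 matches. A minimal sketch for binary vectors follows; the function name `jaccard` and the choice to return 0.0 when both vectors are all zeros are assumptions made for the example.

```python
def jaccard(x, y):
    """Jaccard coefficient f11 / (f01 + f10 + f11) for equal-length binary vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    f11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    f10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    f01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    denominator = f01 + f10 + f11
    # Assumed convention: two all-zero vectors get a coefficient of 0.0.
    return f11 / denominator if denominator else 0.0

x = [1, 0, 0, 1, 1]
y = [1, 1, 0, 0, 1]
print(jaccard(x, y))  # f11 = 2, f10 = 1, f01 = 1, so 2 / 4 = 0.5
```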

298.	Which distance do we get when the parameter r = 2 in the Minkowski distance formula?
A.	manhattan distance
B.	euclidean distance
C.	maximum distance
D.	all
Answer» B. euclidean distance

299.	________ is a generalization of the Manhattan, Euclidean and maximum distances.
A.	euclidean distance
B.	minkowski distance
C.	manhattan distance
D.	jaccard distance
Answer» B. minkowski distance

300.	_________ distance is based on L1 norm.
A.	euclidean distance
B.	minkowski distance
C.	manhattan distance
D.	jaccard distance
Answer» C. manhattan distance
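
Questions 293 and 298 to 300 all rest on the Minkowski distance, d(x, y) = (sum over i of |x_i - y_i|^r)^(1/r). The sketch below shows how r = 1, r = 2 and r → ∞ give the Manhattan, Euclidean and maximum distances; the function name `minkowski` and the sample points are illustrative.

```python
def minkowski(x, y, r):
    """Minkowski distance between two real-valued vectors for r >= 1 or r = inf."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if r == float("inf"):
        return max(diffs)  # limiting case: maximum (supremum) distance
    return sum(d ** r for d in diffs) ** (1.0 / r)

p, q = (0.0, 0.0), (3.0, 4.0)
print(minkowski(p, q, 1))             # Manhattan (L1 norm): 7.0
print(minkowski(p, q, 2))             # Euclidean (L2 norm): 5.0
print(minkowski(p, q, float("inf")))  # maximum distance: 4.0
```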
