McqMate
These multiple-choice questions (MCQs) are designed to enhance your knowledge and understanding in the following area: Computer Science Engineering (CSE).
1. |
A CUDA program is comprised of two primary components: a host and a _____. |
A. | gpu kernel |
B. | cpu kernel |
C. | os |
D. | none of the above |
Answer» A. gpu kernel |
2. |
The kernel code is identified by the ________ qualifier with void return type |
A. | __host__ |
B. | __global__ |
C. | __device__ |
D. | void |
Answer» B. __global__ |
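As a minimal sketch of the qualifier in question 2 (the kernel name `scale` and its parameters are illustrative, not from the source):

```cuda
// A kernel is marked __global__ and must have a void return type.
// It executes on the device but is launched from the host.
__global__ void scale(int *data, int factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // unique per-thread index
    data[i] *= factor;
}
```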
3. |
The kernel code is only callable by the host |
A. | true |
B. | false |
Answer» A. true |
4. |
The kernel code is executable on the device and host |
A. | true |
B. | false |
Answer» B. false |
5. |
Calling a kernel is typically referred to as _________. |
A. | kernel thread |
B. | kernel initialization |
C. | kernel termination |
D. | kernel invocation |
Answer» D. kernel invocation |
6. |
Host codes in a CUDA application can initialize a device |
A. | true |
B. | false |
Answer» A. true |
7. |
Host codes in a CUDA application can allocate GPU memory |
A. | true |
B. | false |
Answer» A. true |
8. |
Host codes in a CUDA application cannot invoke kernels |
A. | true |
B. | false |
Answer» B. false |
9. |
CUDA offers the Chevron Syntax to configure and execute a kernel. |
A. | true |
B. | false |
Answer» A. true |
10. |
The BlockPerGrid and ThreadPerBlock parameters are related to the ________ model supported by CUDA. |
A. | host |
B. | kernel |
C. | thread abstraction |
D. | none of the above |
Answer» C. thread abstraction |
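The chevron syntax from questions 9 and 10 can be sketched as follows (the values, the kernel name `scale`, and the device pointer `d_data` are illustrative; it assumes a `__global__` kernel named `scale` is defined elsewhere):

```cuda
// <<<blocksPerGrid, threadsPerBlock>>> configures the thread hierarchy
// for one kernel launch (illustrative values).
dim3 blocksPerGrid(4);       // 4 blocks in the grid
dim3 threadsPerBlock(256);   // 256 threads in each block
scale<<<blocksPerGrid, threadsPerBlock>>>(d_data, 2);  // d_data: device pointer
```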
11. |
_________ is Callable from the device only |
A. | __host__ |
B. | __global__ |
C. | __device__ |
D. | none of the above |
Answer» C. __device__ |
12. |
______ is Callable from the host |
A. | __host__ |
B. | __global__ |
C. | __device__ |
D. | none of the above |
Answer» B. __global__ |
13. |
______ is Callable from the host |
A. | __host__ |
B. | __global__ |
C. | __device__ |
D. | none of the above |
Answer» A. __host__ |
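Questions 11–13 can be summarized in one hedged sketch (the function names are illustrative):

```cuda
__device__ float square(float x) { return x * x; }   // callable from the device only

__global__ void kernel(float *out) {                 // callable from the host, runs on the device
    out[threadIdx.x] = square((float)threadIdx.x);
}

__host__ void setup(void) {                          // callable from (and runs on) the host;
    /* allocate memory, launch kernels, etc. */      // __host__ is the default for plain functions
}
```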
14. |
CUDA supports ____________ in which code in a single thread is executed by all other threads. |
A. | thread division |
B. | thread termination |
C. | thread abstraction |
D. | none of the above |
Answer» C. thread abstraction |
15. |
In CUDA, a single invoked kernel is referred to as a _____. |
A. | block |
B. | thread |
C. | grid |
D. | none of the above |
Answer» C. grid |
16. |
A grid is comprised of ________ of threads. |
A. | blocks |
B. | bunches |
C. | hosts |
D. | none of the above |
Answer» A. blocks |
17. |
A block is comprised of multiple _______. |
A. | threads |
B. | bunches |
C. | hosts |
D. | none of the above |
Answer» A. threads |
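The grid → block → thread hierarchy of questions 15–17 is what makes per-thread indexing work; a common sketch (the kernel name is illustrative):

```cuda
// Each thread computes a unique global index from its block and thread
// coordinates, so the grid's threads cover the output array exactly once.
__global__ void whoAmI(int *out) {
    int globalIdx = blockIdx.x * blockDim.x + threadIdx.x;
    out[globalIdx] = globalIdx;
}
// Launched as, e.g., whoAmI<<<8, 128>>>(d_out);  // 8 blocks x 128 threads each
```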
18. |
A solution to the problem of representing parallelism in an algorithm is |
A. | cud |
B. | pta |
C. | cda |
D. | cuda |
Answer» D. cuda |
19. |
Host codes in a CUDA application cannot reset a device |
A. | true |
B. | false |
Answer» B. false |
20. |
Host codes in a CUDA application can transfer data to and from the device |
A. | true |
B. | false |
Answer» A. true |
21. |
Host codes in a CUDA application cannot deallocate memory on the GPU |
A. | true |
B. | false |
Answer» B. false |
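Questions 6–8 and 19–21 together describe the host-side lifecycle. A self-contained sketch using the standard CUDA runtime calls (kernel name illustrative; error checking omitted for brevity):

```cuda
#include <stdio.h>

#define N 16

__global__ void doubleIt(int *a) { a[threadIdx.x] *= 2; }

int main(void) {
    int h_a[N];
    for (int i = 0; i < N; ++i) h_a[i] = i;

    int *d_a;
    cudaMalloc((void**)&d_a, N * sizeof(int));                      // allocate GPU memory
    cudaMemcpy(d_a, h_a, N * sizeof(int), cudaMemcpyHostToDevice);  // transfer to device
    doubleIt<<<1, N>>>(d_a);                                        // invoke a kernel
    cudaMemcpy(h_a, d_a, N * sizeof(int), cudaMemcpyDeviceToHost);  // transfer back
    cudaFree(d_a);                                                  // deallocate GPU memory
    cudaDeviceReset();                                              // reset the device

    printf("h_a[5] = %d\n", h_a[5]);  // 10 after doubling
    return 0;
}
```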
22. |
Any condition that causes a processor to stall is called a _____. |
A. | hazard |
B. | page fault |
C. | system error |
D. | none of the above |
Answer» A. hazard |
23. |
The time lost due to branch instruction is often referred to as _____. |
A. | latency |
B. | delay |
C. | branch penalty |
D. | none of the above |
Answer» C. branch penalty |
24. |
_____ method is used in centralized systems to perform out of order execution. |
A. | scorecard |
B. | scoreboarding |
C. | optimizing |
D. | redundancy |
Answer» B. scoreboarding |
25. |
The computer cluster architecture emerged as an alternative for ____. |
A. | isa |
B. | workstation |
C. | supercomputers |
D. | distributed systems |
Answer» C. supercomputers |
26. |
NVIDIA CUDA Warp is made up of how many threads? |
A. | 512 |
B. | 1024 |
C. | 312 |
D. | 32 |
Answer» D. 32 |
27. |
Out-of-order instruction execution is not possible on GPUs. |
A. | true |
B. | false |
C. | -- |
D. | -- |
Answer» B. false |
28. |
CUDA supports programming in .... |
A. | c or c++ only |
B. | java, python, and more |
C. | c, c++, third party wrappers for java, python, and more |
D. | pascal |
Answer» C. c, c++, third party wrappers for java, python, and more |
29. |
FADD, FMAD, FMIN, FMAX are ----- supported by Scalar Processors of NVIDIA GPU. |
A. | 32-bit ieee floating point instructions |
B. | 32-bit integer instructions |
C. | both |
D. | none of the above |
Answer» A. 32-bit ieee floating point instructions |
30. |
Each streaming multiprocessor (SM) of CUDA hardware has ------ scalar processors (SP). |
A. | 1024 |
B. | 128 |
C. | 512 |
D. | 8 |
Answer» D. 8 |
31. |
Each NVIDIA GPU has ------ Streaming Multiprocessors |
A. | 8 |
B. | 1024 |
C. | 512 |
D. | 16 |
Answer» D. 16 |
32. |
CUDA provides ------- warp and thread scheduling. Also, the overhead of thread creation is on the order of ----. |
A. | “programming-overhead”, 2 clock |
B. | “zero-overhead”, 1 clock |
C. | 64, 2 clock |
D. | 32, 1 clock |
Answer» B. “zero-overhead”, 1 clock |
33. |
Each warp of GPU receives a single instruction and “broadcasts” it to all of its threads. It is a ---- operation. |
A. | simd (single instruction multiple data) |
B. | simt (single instruction multiple thread) |
C. | sisd (single instruction single data) |
D. | sist (single instruction single thread) |
Answer» B. simt (single instruction multiple thread) |
34. |
Limitations of CUDA Kernel |
A. | recursion, call stack, static variable declaration |
B. | no recursion, no call stack, no static variable declarations |
C. | recursion, no call stack, static variable declaration |
D. | no recursion, call stack, no static variable declarations |
Answer» B. no recursion, no call stack, no static variable declarations |
35. |
What is a Unified Virtual Machine? |
A. | it is a technique that allows both the cpu and gpu to read from a single virtual machine, simultaneously. |
B. | it is a technique for managing separate host and device memory spaces. |
C. | it is a technique for executing device code on host and host code on device. |
D. | it is a technique for executing general purpose programs on device instead of host. |
Answer» A. it is a technique that allows both the cpu and gpu to read from a single virtual machine, simultaneously. |
36. |
_______ became the first language specifically designed by a GPU company to facilitate general-purpose computing on ____. |
A. | python, gpus. |
B. | c, cpus. |
C. | cuda c, gpus. |
D. | java, cpus. |
Answer» C. cuda c, gpus. |
37. |
The CUDA architecture consists of --------- for parallel computing kernels and functions. |
A. | risc instruction set architecture |
B. | cisc instruction set architecture |
C. | zisc instruction set architecture |
D. | ptx instruction set architecture |
Answer» D. ptx instruction set architecture |
38. |
CUDA stands for --------, designed by NVIDIA. |
A. | common union discrete architecture |
B. | complex unidentified device architecture |
C. | compute unified device architecture |
D. | complex unstructured distributed architecture |
Answer» C. compute unified device architecture |
39. |
The host processor spawns multithreaded tasks (or kernels, as they are known in CUDA) onto the GPU device. State true or false. |
A. | true |
B. | false |
C. | --- |
D. | --- |
Answer» A. true |
40. |
The NVIDIA G80 is a ---- CUDA core device, the NVIDIA G200 is a ---- CUDA core device, and the NVIDIA Fermi is a ---- CUDA core device. |
A. | 128, 256, 512 |
B. | 32, 64, 128 |
C. | 64, 128, 256 |
D. | 256, 512, 1024 |
Answer» A. 128, 256, 512 |
41. |
NVIDIA 8-series GPUs offer --------. |
A. | 50-200 gflops |
B. | 200-400 gflops |
C. | 400-800 gflops |
D. | 800-1000 gflops |
Answer» A. 50-200 gflops |
42. |
IADD, IMUL24, IMAD24, IMIN, IMAX are ----------- supported by Scalar Processors of NVIDIA GPU. |
A. | 32-bit ieee floating point instructions |
B. | 32-bit integer instructions |
C. | both |
D. | none of the above |
Answer» B. 32-bit integer instructions |
43. |
The CUDA hardware programming model supports: |
A. | a,c,d,f |
B. | b,c,d,e |
C. | a,d,e,f |
D. | a,b,c,d,e,f |
Answer» D. a,b,c,d,e,f |
44. |
In the CUDA memory model, the following memory types are available: |
A. | a, b, d, f |
B. | a, c, d, e, f |
C. | a, b, c, d, e, f |
D. | b, c, e, f |
Answer» C. a, b, c, d, e, f |
45. |
What is the equivalent of this general C program in CUDA C: int main(void) { printf("Hello, World!\n"); return 0; } |
A. | int main ( void ) { kernel <<<1,1>>>(); printf("hello, world!\n"); return 0; } |
B. | __global__ void kernel( void ) { } int main ( void ) { kernel <<<1,1>>>(); printf("hello, world!\n"); return 0; } |
C. | __global__ void kernel( void ) { kernel <<<1,1>>>(); printf("hello, world!\n"); return 0; } |
D. | __global__ int main ( void ) { kernel <<<1,1>>>(); printf("hello, world!\n"); return 0; } |
Answer» B. __global__ void kernel( void ) { } int main ( void ) { kernel <<<1,1>>>(); printf("hello, world!\n"); return 0; } |
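Option B above, reflowed for readability (this is the same code as the flattened answer):

```cuda
#include <stdio.h>

__global__ void kernel(void) { }   // empty kernel; runs on the device

int main(void) {
    kernel<<<1,1>>>();             // launch with 1 block of 1 thread
    printf("Hello, World!\n");     // printed by the host
    return 0;
}
```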
46. |
Which function runs on the device (i.e., the GPU): a) __global__ void kernel ( void ) { } b) int main ( void ) { ... return 0; } |
A. | a |
B. | b |
C. | both a,b |
D. | --- |
Answer» A. a |
47. |
A simple kernel for adding two integers: __global__ void add( int *a, int *b, int *c ) { *c = *a + *b; } where __global__ is a CUDA C keyword which indicates that: |
A. | add() will execute on device, add() will be called from host |
B. | add() will execute on host, add() will be called from device |
C. | add() will be called and executed on host |
D. | add() will be called and executed on device |
Answer» A. add() will execute on device, add() will be called from host |
48. |
If variable a is a host variable and dev_a is a device (GPU) variable, select the correct statement to allocate memory to dev_a: |
A. | cudaMalloc( &dev_a, sizeof( int ) ) |
B. | malloc( &dev_a, sizeof( int ) ) |
C. | cudaMalloc( (void**) &dev_a, sizeof( int ) ) |
D. | malloc( (void**) &dev_a, sizeof( int ) ) |
Answer» C. cudaMalloc( (void**) &dev_a, sizeof( int ) ) |
49. |
If variable a is a host variable and dev_a is a device (GPU) variable, select the correct statement to copy input from variable a to variable dev_a: |
A. | memcpy( dev_a, &a, size ); |
B. | cudaMemcpy( dev_a, &a, size, cudaMemcpyHostToDevice ); |
C. | memcpy( (void*) dev_a, &a, size ); |
D. | cudaMemcpy( (void*) &dev_a, &a, size, cudaMemcpyDeviceToHost ); |
Answer» B. cudaMemcpy( dev_a, &a, size, cudaMemcpyHostToDevice ); |
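Questions 47–49 fit together into one complete program; a sketch assuming the standard CUDA runtime calls (error checking omitted for brevity):

```cuda
#include <stdio.h>

__global__ void add(int *a, int *b, int *c) { *c = *a + *b; }

int main(void) {
    int a = 2, b = 7, c;
    int *dev_a, *dev_b, *dev_c;
    size_t size = sizeof(int);

    cudaMalloc((void**)&dev_a, size);   // allocate device memory (question 48)
    cudaMalloc((void**)&dev_b, size);
    cudaMalloc((void**)&dev_c, size);

    cudaMemcpy(dev_a, &a, size, cudaMemcpyHostToDevice);  // copy inputs in (question 49)
    cudaMemcpy(dev_b, &b, size, cudaMemcpyHostToDevice);

    add<<<1,1>>>(dev_a, dev_b, dev_c);  // invoke the kernel (question 47)

    cudaMemcpy(&c, dev_c, size, cudaMemcpyDeviceToHost);  // copy the result back
    printf("2 + 7 = %d\n", c);

    cudaFree(dev_a); cudaFree(dev_b); cudaFree(dev_c);
    return 0;
}
```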
50. |
What does the triple angle bracket mark in a statement inside the main function indicate? |
A. | a call from host code to device code |
B. | a call from device code to host code |
C. | less than comparison |
D. | greater than comparison |
Answer» A. a call from host code to device code |