Cloudera CCA-500 - Cloudera Certified Administrator for Apache Hadoop (CCAH) Exam
Total 60 questions
Question #6
A slave node in your cluster has four 2TB hard drives installed (4 x 2TB). The DataNode is
configured to store HDFS blocks on all disks. You set the value of the
dfs.datanode.du.reserved parameter to 100 GB. How does this alter HDFS block storage?
A. 25 GB on each hard drive may not be used to store HDFS blocks
B. 100 GB on each hard drive may not be used to store HDFS blocks
C. All hard drives may be used to store HDFS blocks as long as at least 100 GB in total is available on the node
D. A maximum of 100 GB on each hard drive may be used to store HDFS blocks
Answer: B (dfs.datanode.du.reserved reserves the configured amount of space on each volume, so 100 GB per disk is withheld from HDFS block storage)
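For reference, this parameter is set in hdfs-site.xml on the DataNodes; a minimal sketch follows (the value must be given in bytes, and it applies to every configured storage volume):

    <!-- hdfs-site.xml (DataNode): reserve 100 GB per volume for non-HDFS use -->
    <property>
      <name>dfs.datanode.du.reserved</name>
      <!-- 100 GB expressed in bytes: 100 * 1024^3 -->
      <value>107374182400</value>
    </property>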
Question #7
You want to understand more about how users browse your public website. For example,
you want to know which pages they visit prior to placing an order. You have a server farm
of 200 web servers hosting your website. Which is the most efficient process to gather
these web servers' logs into your Hadoop cluster for analysis?
A. Sample the web server logs from the web servers and copy them into HDFS using curl
B. Ingest the server web logs into HDFS using Flume
C. Channel these clickstreams into Hadoop using Hadoop Streaming
D. Import all user clicks from your OLTP databases into Hadoop using Sqoop
E. Write a MapReduce job with the web servers for mappers and the Hadoop cluster nodes for reducers
Answer: B
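As a sketch of the Flume approach (agent, host, and path names are illustrative), each web server runs a Flume agent that tails its access log and ships events to an HDFS sink:

    # flume.conf: ship web server access logs to HDFS (illustrative names and paths)
    agent.sources = weblog
    agent.channels = mem
    agent.sinks = hdfs-out

    # Tail the local access log
    agent.sources.weblog.type = exec
    agent.sources.weblog.command = tail -F /var/log/httpd/access_log
    agent.sources.weblog.channels = mem

    # Buffer events in memory
    agent.channels.mem.type = memory
    agent.channels.mem.capacity = 10000

    # Write plain-text events into date-partitioned HDFS directories
    agent.sinks.hdfs-out.type = hdfs
    agent.sinks.hdfs-out.hdfs.path = hdfs://namenode:8020/logs/web/%Y-%m-%d
    agent.sinks.hdfs-out.hdfs.fileType = DataStream
    agent.sinks.hdfs-out.hdfs.useLocalTimeStamp = true
    agent.sinks.hdfs-out.channel = mem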
Question #8
Your Hadoop cluster is configured with HDFS and MapReduce version 2 (MRv2) on
YARN. Can you configure a worker node to run a NodeManager daemon but not a
DataNode daemon and still have a functional cluster?
A. Yes. The daemon will receive data from the NameNode to run Map tasks
B. Yes. The daemon will get data from another (non-local) DataNode to run Map tasks
C. Yes. The daemon will receive Map tasks only
D. Yes. The daemon will receive Reducer tasks only
Answer: B
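To see which roles are running where, a quick sketch using the standard CLI:

    # NodeManagers registered with the ResourceManager
    yarn node -list
    # DataNodes registered with the NameNode
    hdfs dfsadmin -report

A host appearing in the first list but not the second can still run tasks; its Map tasks simply read their input blocks over the network from remote DataNodes, at the cost of data locality.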
Question #9
You have recently converted your Hadoop cluster from a MapReduce 1 (MRv1)
architecture to a MapReduce 2 (MRv2) on YARN architecture. Your developers are
accustomed to specifying the number of map and reduce tasks (resource allocation) when
they run jobs. A developer wants to know how to specify the number of reduce tasks when
a specific job runs. Which method should you tell that developer to implement?
A. MapReduce version 2 (MRv2) on YARN abstracts resource allocation away from the idea of tasks into memory and virtual cores, thus eliminating the need for a developer to specify the number of reduce tasks, and indeed preventing the developer from specifying the number of reduce tasks.
B. In YARN, resource allocation is a function of megabytes of memory in multiples of 1024 MB. Thus, they should specify the amount of memory they need by executing -D mapreduce.reduce.memory.mb=2048
C. In YARN, the ApplicationMaster is responsible for requesting the resources required for a specific job. Thus, executing -D yarn.applicationmaster.reduce.tasks=2 will specify that the ApplicationMaster launch two task containers on the worker nodes.
D. Developers specify reduce tasks in the exact same way for both MapReduce version 1 (MRv1) and MapReduce version 2 (MRv2) on YARN. Thus, executing -D mapreduce.job.reduces=2 will specify two reduce tasks.
E. In YARN, resource allocation is a function of virtual cores specified by the ApplicationMaster making requests to the NodeManager, where a reduce task is handled by a single container (and thus a single virtual core). Thus, the developer needs to specify the number of virtual cores to the NodeManager by executing -D yarn.nodemanager.cpu-vcores=2
Answer: D
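As a sketch (the jar, driver class, and paths are illustrative, and the command-line form assumes the driver uses ToolRunner/GenericOptionsParser):

    # Request two reduce tasks for this job run
    hadoop jar myapp.jar com.example.MyDriver -D mapreduce.job.reduces=2 /input /output

The same setting can be made in the driver code with job.setNumReduceTasks(2).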
Question #10
You have a 20-node Hadoop cluster, with 18 slave nodes and 2 master nodes running
HDFS High Availability (HA). You want to minimize the chance of data loss in your cluster.
What should you do?
A. Add another master node to increase the number of nodes running the JournalNode, which increases the number of machines available to form an HA quorum
B. Set an HDFS replication factor that provides data redundancy, protecting against node failure
C. Run a Secondary NameNode on a different master from the NameNode in order to provide automatic recovery from a NameNode failure.
D. Run the ResourceManager on a different master from the NameNode in order to load-share HDFS metadata processing
E. Configure the cluster's disk drives with an appropriate fault tolerant RAID level
Answer: B (HDFS block replication across DataNodes is the mechanism that protects the data itself against node failure; the ResourceManager plays no role in HDFS metadata)
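For reference, the cluster-wide default replication factor is set in hdfs-site.xml, and the replication of existing files can be changed from the shell (the path is illustrative):

    <!-- hdfs-site.xml: default replication factor for newly written files -->
    <property>
      <name>dfs.replication</name>
      <value>3</value>
    </property>

    # Raise replication on an existing directory and wait until it takes effect
    hdfs dfs -setrep -w 3 /data/important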