Microsoft 70-775 - Perform Data Engineering on Microsoft Azure HDInsight Exam

Question #6 (Topic: )
Note: This question is part of a series of questions that present the same scenario.
Each question in the series contains a unique solution that might meet the stated
goals. Some question sets might have more than one correct solution, while others
might not have a correct solution.
The cluster will have a custom Apache Ambari configuration.
The cluster will be joined to a domain and must perform the following:
* Fast data analytics and cluster computing by using in-memory processing.
* Interactive queries and micro-batch stream processing.
What should you do?
A. Use an Azure PowerShell script to create and configure a premium HDInsight cluster. Specify Apache Hadoop as the cluster type and use Linux as the operating system.
B. Use the Azure portal to create a standard HDInsight cluster. Specify Apache Spark as the cluster type and use Linux as the operating system.
C. Use an Azure PowerShell script to create a standard HDInsight cluster. Specify Apache HBase as the cluster type and use Windows as the operating system.
D. Use an Azure PowerShell script to create a standard HDInsight cluster. Specify Apache Storm as the cluster type and use Windows as the operating system.
E. Use an Azure PowerShell script to create a premium HDInsight cluster. Specify Apache HBase as the cluster type and use Windows as the operating system.
F. Use the Azure portal to create a standard HDInsight cluster. Specify Apache Interactive Hive as the cluster type and use Windows as the operating system.
G. Use the Azure portal to create a standard HDInsight cluster. Specify Apache HBase as the cluster type and use Windows as the operating system.
Answer: D
Question #7 (Topic: )
You are configuring the Hive views on an Azure HDInsight cluster that is configured to use
Kerberos.
You plan to use the YARN logs to troubleshoot a query that runs against Apache Hadoop.
You need to view the method, the service, and the authenticated account used to run the
query. Which method call should you view in the YARN logs?
A. HQL
B. WebHDFS
C. HDFS C* API
D. Ambari REST API
Answer: D
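The answer turns on the fact that calls against the Ambari REST API are what surface in the logs with the method, service, and authenticated account. As a minimal sketch of what such a call looks like (the host name, cluster name, and credentials below are illustrative placeholders, not values from the question):

```python
import base64
import urllib.request

def ambari_service_request(host, cluster, user, password):
    """Build an authenticated request against the Ambari REST API's
    YARN service endpoint (basic auth shown for brevity; an ESP
    cluster would authenticate the domain account instead)."""
    url = f"https://{host}/api/v1/clusters/{cluster}/services/YARN"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={
        "Authorization": f"Basic {token}",
        "X-Requested-By": "ambari",  # required by Ambari for non-GET calls
    })

# Placeholder cluster and credentials for illustration only.
req = ambari_service_request("hdicluster.azurehdinsight.net",
                             "hdicluster", "admin", "secret")
print(req.full_url)
```

Each such call is recorded server-side with the HTTP method, the target service, and the account that authenticated, which is the trail the question asks you to inspect.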
Question #8 (Topic: )
Note: This question is part of a series of questions that present the same scenario.
Each question in the series contains a unique solution that might meet the stated
goals. Some question sets might have more than one correct solution, while others
might not have a correct solution.
Start of repeated scenario:
You are planning a big data infrastructure by using an Apache Spark cluster in Azure
HDInsight. The cluster has 24 processor cores and 512 GB of memory.
The architecture of the infrastructure is shown in the exhibit:
[Microsoft-70-775-7.0/Microsoft-70-775-7_2.png]
The architecture will be used by the following users:
* Support analysts who run applications that will use REST to submit Spark jobs.
* Business analysts who use JDBC and ODBC client applications from a real-time view.
The business analysts run monitoring queries to access aggregated results for 15 minutes.
The results will be referenced by subsequent queries.
* Data analysts who publish notebooks drawn from batch layer, serving layer, and speed
layer queries. All of the notebooks must support native interpreters for data sources that
are batch processed. The serving layer queries are written in Apache Hive and must support
multiple sessions. Unique GUIDs are used across the data sources, which allows the data
analysts to use Spark SQL.
The data sources in the batch layer share a common storage container. The following data
sources are used:
* Hive for sales data
* Apache HBase for operations data
* HBase for logistics data by using a single region server.
End of repeated scenario.
You need to ensure that the analysts can query the logistics data by using JDBC APIs and
SQL APIs. Which technology should you implement?
A. Apache Phoenix
B. Apache Spark
C. Apache Storm
D. Apache Hive
Answer: D
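For context on the JDBC side of the answer: HDInsight exposes HiveServer2 over HTTPS on port 443, so JDBC and ODBC clients connect in HTTP transport mode. A minimal sketch of the connection URL a client would build (the cluster host name is an illustrative placeholder):

```python
def hive_jdbc_url(cluster_host, database="default"):
    """Assemble the JDBC URL for Hive on an HDInsight cluster.
    HDInsight publishes HiveServer2 behind its HTTPS gateway on
    port 443 with the /hive2 HTTP path."""
    return (f"jdbc:hive2://{cluster_host}:443/{database}"
            ";ssl=true;transportMode=http;httpPath=/hive2")

# Placeholder cluster host for illustration only.
print(hive_jdbc_url("hdicluster.azurehdinsight.net"))
```

A JDBC client such as Beeline or a BI tool would pass this URL together with the cluster login credentials.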
Question #9 (Topic: )
Note: This question is part of a series of questions that present the same scenario.
Each question in the series contains a unique solution that might meet the stated
goals. Some question sets might have more than one correct solution, while others
might not have a correct solution.
You are implementing a batch processing solution by using Azure HDInsight.
You plan to import 300 TB of data.
You plan to use one job that has many concurrent tasks to import the data in memory.
You need to maximize the number of concurrent tasks for the job.
What should you do?
A. Use a shuffle join in an Apache Hive query that stores the data in a JSON format.
B. Use a broadcast join in an Apache Hive query that stores the data in an ORC format.
C. Increase the number of spark.executor.cores in an Apache Spark job that stores the data in a text format.
D. Increase the number of spark.executor.instances in an Apache Spark job that stores the data in a text format.
E. Decrease the level of parallelism in an Apache Spark job that stores the data in a text format.
F. Use an action in an Apache Oozie workflow that stores the data in a text format.
G. Use an Azure Data Factory linked service that stores the data in Azure Data Lake.
H. Use an Azure DocumentDB database.
Answer: A
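Options C and D both touch the Spark settings that bound concurrency, so the arithmetic is worth spelling out. Each executor runs at most `spark.executor.cores` tasks at once, so the job-wide ceiling is the product of the two settings. A minimal sketch (the executor counts below are hypothetical settings chosen to fit the scenario's 24-core cluster, not values from the question):

```python
def max_concurrent_tasks(executor_instances, executor_cores):
    # Each executor can run at most `executor_cores` tasks at once,
    # so the job-wide ceiling is instances * cores per executor.
    return executor_instances * executor_cores

# Hypothetical baseline: 6 executors x 4 cores fills a 24-core cluster.
print(max_concurrent_tasks(6, 4))   # 24
# Raising spark.executor.instances raises the ceiling proportionally,
# capacity permitting.
print(max_concurrent_tasks(12, 4))  # 48
```

In practice either factor raises the ceiling only while the cluster still has cores and memory to grant, which is why the scenario states the cluster's totals.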
Question #10 (Topic: )
You have a domain-joined Apache Hadoop cluster in Azure HDInsight named hdicluster.
The Linux account for hdicluster is named Inxuser.
kam.com. You need to run Hadoop
SSH session.
Which credentials should you use?
[Microsoft-70-775-7.0/Microsoft-70-775-9_2.png]
Answer: [Microsoft-70-775-7.0/Microsoft-70-775-9_3.png]
Page: 2 / 7
Total 35 questions