Designing an Azure Data Solution v1.0 (DP-201)


Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Background -
Trey Research is a technology innovator. The company partners with regional transportation department offices to build solutions that improve traffic flow and safety.
The company is developing the following solutions:


Regional transportation departments installed traffic sensor systems on major highways across North America. Sensors record the following information each time a vehicle passes in front of a sensor:
Time
Location in latitude and longitude
Speed in kilometers per hour (kmph)
License plate number
Length of vehicle in meters
Sensors provide data by using the following structure:

Traffic sensors will occasionally capture an image of a vehicle for debugging purposes.
You must optimize performance of saving/storing vehicle images.

Traffic sensor data -
Sensors must have permission only to add items to the SensorData collection.
Traffic data insertion rate must be maximized.
Once every three months all traffic sensor data must be analyzed to look for data patterns that indicate sensor malfunctions.
Sensor data must be stored in a Cosmos DB database named treydata, in a collection named SensorData.
The impact of vehicle images on sensor data throughput must be minimized.
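
A minimal sketch of the insert path using the azure-cosmos (v4) Python SDK. Only the treydata database and SensorData collection names come from the scenario; the endpoint, key, document fields, and id scheme are placeholders for illustration.

```python
from azure.cosmos import CosmosClient

# Placeholders: substitute the real account endpoint and key (or a scoped
# resource token, since sensors may only add items to the collection).
client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("treydata").get_container_client("SensorData")

# One document per vehicle pass; field names are assumptions, not the real schema.
container.create_item(body={
    "id": "reading-0001",
    "time": "2021-01-01T08:30:00Z",
    "location": {"lat": 47.61, "lon": -122.33},
    "speedKmph": 72.5,
    "licensePlate": "ABC-1234",
    "vehicleLengthMeters": 4.2,
})
```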

Backtrack -
This solution reports on all data related to a specific vehicle license plate. The report must use data from the SensorData collection. Users must be able to filter vehicle data in the following ways:
vehicles on a specific road
vehicles driving above the speed limit

Planning Assistance -
Data used for Planning Assistance must be stored in a sharded Azure SQL Database.
Data from the Sensor Data collection will automatically be loaded into the Planning Assistance database once a week by using Azure Data Factory. You must be able to manually trigger the data load process.
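
Manually triggering the load translates to starting a pipeline run on demand. A hedged sketch with the azure-mgmt-datafactory SDK; the subscription, resource group, factory, and pipeline names are assumptions.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# All names below are hypothetical placeholders.
adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
run = adf.pipelines.create_run(
    resource_group_name="trey-rg",
    factory_name="trey-adf",
    pipeline_name="LoadPlanningAssistance",
)
print(run.run_id)  # track the on-demand run
```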

Privacy and security policy -
Azure Active Directory must be used for all services where it is available.
For privacy reasons, license plate number information must not be accessible in Planning Assistance.
Unauthorized usage of the Planning Assistance data must be detected as quickly as possible. Unauthorized usage is determined by looking for an unusual pattern of usage.
Data must only be stored for seven years.

Performance and availability -
The report for Backtrack must execute as quickly as possible.
The SLA for Planning Assistance is 70 percent, and multiday outages are permitted.
All data must be replicated to multiple geographic regions to prevent data loss.
You must maximize the performance of the Real Time Response system.

Financial requirements -
Azure resource costs must be minimized where possible.


HOTSPOT -
You need to ensure that emergency road response vehicles are dispatched automatically.
How should you design the processing system? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:



Answer :

Explanation:

Box 1: API App -


1. Events generated from the IoT data sources are sent to the stream ingestion layer through Azure HDInsight Kafka as a stream of messages. HDInsight Kafka stores streams of data in topics for a configurable period of time.
2. A Kafka consumer, Azure Databricks, picks up the messages in real time from the Kafka topic, processes the data based on the business logic, and can then send the results to the serving layer for storage (a minimal consumer is sketched after this list).
3. Downstream storage services, like Azure Cosmos DB, Azure Synapse Analytics, or Azure SQL DB, then serve as a data source for the presentation and action layer.
4. Business analysts can use Microsoft Power BI to analyze warehoused data. Other applications can be built upon the serving layer as well. For example, we can expose APIs based on the serving layer data for third-party use.
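
A minimal PySpark sketch of step 2, assuming a Databricks cluster with the Kafka connector available; the broker address and topic name are placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iot-stream").getOrCreate()

# Subscribe to the HDInsight Kafka topic (placeholder broker and topic name).
events = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "<hdinsight-broker>:9092")
    .option("subscribe", "vehicle-telemetry")
    .load())

# Kafka delivers the payload as bytes; cast it before applying business logic.
parsed = events.selectExpr("CAST(value AS STRING) AS json")

# Stand-in sink; in the architecture above this would write to the serving layer.
query = parsed.writeStream.format("console").start()
```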

Box 2: Cosmos DB Change Feed -
Change feed support in Azure Cosmos DB works by listening to an Azure Cosmos DB container for any changes. It then outputs the sorted list of documents that were changed in the order in which they were modified.
The change feed in Azure Cosmos DB enables you to build efficient and scalable solutions for each of these patterns.
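
For the dispatch scenario, a change feed consumer might look like the following sketch (azure-cosmos v4 SDK; the account details and the downstream handler are placeholders). Within a logical partition, documents arrive in modification order.

```python
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("treydata").get_container_client("SensorData")

def dispatch_if_emergency(doc):
    # Hypothetical handler: decide whether a response vehicle is needed.
    pass

# Read every change from the start of the container's change feed.
for doc in container.query_items_change_feed(is_start_from_beginning=True):
    dispatch_if_emergency(doc)
```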

Reference:
https://docs.microsoft.com/bs-cyrl-ba/azure/architecture/example-scenario/data/realtime-analytics-vehicle-iot?view=azurermps-4.4.1

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Background -
Trey Research is a technology innovator. The company partners with regional transportation department offices to build solutions that improve traffic flow and safety.
The company is developing the following solutions:


Regional transportation departments installed traffic sensor systems on major highways across North America. Sensors record the following information each time a vehicle passes in front of a sensor:
Time
Location in latitude and longitude
Speed in kilometers per hour (kmph)
License plate number
Length of vehicle in meters
Sensors provide data by using the following structure:

Traffic sensors will occasionally capture an image of a vehicle for debugging purposes.
You must optimize performance of saving/storing vehicle images.

Traffic sensor data -
Sensors must have permission only to add items to the SensorData collection.
Traffic data insertion rate must be maximized.
Once every three months all traffic sensor data must be analyzed to look for data patterns that indicate sensor malfunctions.
Sensor data must be stored in a Cosmos DB database named treydata, in a collection named SensorData.
The impact of vehicle images on sensor data throughput must be minimized.

Backtrack -
This solution reports on all data related to a specific vehicle license plate. The report must use data from the SensorData collection. Users must be able to filter vehicle data in the following ways:
vehicles on a specific road
vehicles driving above the speed limit

Planning Assistance -
Data used for Planning Assistance must be stored in a sharded Azure SQL Database.
Data from the Sensor Data collection will automatically be loaded into the Planning Assistance database once a week by using Azure Data Factory. You must be able to manually trigger the data load process.

Privacy and security policy -
Azure Active Directory must be used for all services where it is available.
For privacy reasons, license plate number information must not be accessible in Planning Assistance.
Unauthorized usage of the Planning Assistance data must be detected as quickly as possible. Unauthorized usage is determined by looking for an unusual pattern of usage.
Data must only be stored for seven years.

Performance and availability -
The report for Backtrack must execute as quickly as possible.
The SLA for Planning Assistance is 70 percent, and multiday outages are permitted.
All data must be replicated to multiple geographic regions to prevent data loss.
You must maximize the performance of the Real Time Response system.

Financial requirements -
Azure resource costs must be minimized where possible.


DRAG DROP -
You need to ensure that performance requirements for Backtrack reports are met.
What should you recommend? To answer, drag the appropriate technologies to the correct locations. Each technology may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:



Answer :

Explanation:

Box 1: Cosmos DB indexes -
The report for Backtrack must execute as quickly as possible.
You can override the default indexing policy on an Azure Cosmos container. This can be useful if you want to tune the indexing precision to improve query performance or to reduce the consumed storage.
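
A hedged sketch of such an override at container creation time; the partition key and indexed property paths are assumptions based on the Backtrack filters, not the real schema.

```python
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
database = client.get_database_client("treydata")

# Index only the paths the report filters on; exclude everything else.
database.create_container(
    id="SensorData",
    partition_key=PartitionKey(path="/licensePlate"),  # assumed key
    indexing_policy={
        "indexingMode": "consistent",
        "includedPaths": [
            {"path": "/licensePlate/?"},
            {"path": "/road/?"},
            {"path": "/speedKmph/?"},
        ],
        "excludedPaths": [{"path": "/*"}],
    },
)
```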

Box 2: Cosmos DB TTL -
This solution reports on all data related to a specific vehicle license plate. The report must use data from the SensorData collection. Users must be able to filter vehicle data in the following ways:
-> vehicles on a specific road
-> vehicles driving above the speed limit
Note: With Time to Live or TTL, Azure Cosmos DB provides the ability to delete items automatically from a container after a certain time period. By default, you can set time to live at the container level and override the value on a per-item basis. After you set the TTL at a container or at an item level, Azure Cosmos DB will automatically remove these items after the time period has elapsed since they were last modified.
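
The seven-year retention from the privacy policy could be expressed as a default TTL at container creation. A minimal sketch, with account details and partition key assumed.

```python
from azure.cosmos import CosmosClient, PartitionKey

SEVEN_YEARS_SECONDS = 7 * 365 * 24 * 60 * 60  # approximate; ignores leap days

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
client.get_database_client("treydata").create_container(
    id="SensorData",
    partition_key=PartitionKey(path="/licensePlate"),  # assumed key
    default_ttl=SEVEN_YEARS_SECONDS,  # items expire this long after last modification
)
```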
Incorrect Answers:
Cosmos DB stored procedures: Stored procedures are best suited for operations that are write heavy. When deciding where to use stored procedures, optimize around encapsulating the maximum amount of writes possible. Generally speaking, stored procedures are not the most efficient means for doing large numbers of read operations so using stored procedures to batch large numbers of reads to return to the client will not yield the desired benefit.
Reference:
https://docs.microsoft.com/en-us/azure/cosmos-db/index-policy https://docs.microsoft.com/en-us/azure/cosmos-db/time-to-live
Design data processing solutions

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Overview -
You develop data engineering solutions for Graphics Design Institute, a global media company with offices in New York City, Manchester, Singapore, and Melbourne.
The New York office hosts SQL Server databases that store massive amounts of customer data. The company also stores millions of images on a physical server located in the New York office. More than 2 TB of image data is added each day. The images are transferred from customer devices to the server in New York.
Many images have been placed on this server in an unorganized manner, making it difficult for editors to search images. Images should automatically have object and color tags generated. The tags must be stored in a document database, and be queried by SQL.
You are hired to design a solution that can store, transform, and visualize customer data.

Requirements -

Business -
The company identifies the following business requirements:
You must transfer all images and customer data to cloud storage and remove on-premises servers.
You must develop an analytical processing solution for transforming customer data.
You must develop an image object and color tagging solution.
Capital expenditures must be minimized.
Cloud resource costs must be minimized.

Technical -
The solution has the following technical requirements:
Tagging data must be uploaded to the cloud from the New York office location.
Tagging data must be replicated to regions that are geographically close to company office locations.
Image data must be stored in a single data store at minimum cost.
Customer data must be analyzed using managed Spark clusters.
Power BI must be used to visualize transformed customer data.
All data must be backed up in case disaster recovery is required.

Security and optimization -
All cloud data must be encrypted at rest and in transit. The solution must support:
parallel processing of customer data
hyper-scale storage of images
global region data replication of processed image data


DRAG DROP -
You need to design the image processing solution to meet the optimization requirements for image tag data.
What should you configure? To answer, drag the appropriate setting to the correct drop targets.
Each source may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:




Answer :

Explanation:
Tagging data must be uploaded to the cloud from the New York office location.
Tagging data must be replicated to regions that are geographically close to company office locations.

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Overview -
You develop data engineering solutions for Graphics Design Institute, a global media company with offices in New York City, Manchester, Singapore, and Melbourne.
The New York office hosts SQL Server databases that store massive amounts of customer data. The company also stores millions of images on a physical server located in the New York office. More than 2 TB of image data is added each day. The images are transferred from customer devices to the server in New York.
Many images have been placed on this server in an unorganized manner, making it difficult for editors to search images. Images should automatically have object and color tags generated. The tags must be stored in a document database, and be queried by SQL.
You are hired to design a solution that can store, transform, and visualize customer data.

Requirements -

Business -
The company identifies the following business requirements:
You must transfer all images and customer data to cloud storage and remove on-premises servers.
You must develop an analytical processing solution for transforming customer data.
You must develop an image object and color tagging solution.
Capital expenditures must be minimized.
Cloud resource costs must be minimized.

Technical -
The solution has the following technical requirements:
Tagging data must be uploaded to the cloud from the New York office location.
Tagging data must be replicated to regions that are geographically close to company office locations.
Image data must be stored in a single data store at minimum cost.
Customer data must be analyzed using managed Spark clusters.
Power BI must be used to visualize transformed customer data.
All data must be backed up in case disaster recovery is required.

Security and optimization -
All cloud data must be encrypted at rest and in transit. The solution must support:
parallel processing of customer data
hyper-scale storage of images
global region data replication of processed image data


HOTSPOT -
You need to design the image processing and storage solutions.
What should you recommend? To answer, select the appropriate configuration in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:




Answer :

Explanation:
From the scenario:
The company identifies the following business requirements:
-> You must transfer all images and customer data to cloud storage and remove on-premises servers.
-> You must develop an image object and color tagging solution.
The solution has the following technical requirements:
-> Image data must be stored in a single data store at minimum cost.
-> All data must be backed up in case disaster recovery is required.
All cloud data must be encrypted at rest and in transit. The solution must support:
-> hyper-scale storage of images
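
Blob storage is the usual choice for hyper-scale image data. A minimal upload sketch with the azure-storage-blob SDK, where the connection string, container, and blob path are placeholders.

```python
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
blob = service.get_blob_client(container="images", blob="2021/01/01/cam01.jpg")

# Stream the image up; overwrite any earlier upload of the same name.
with open("cam01.jpg", "rb") as data:
    blob.upload_blob(data, overwrite=True)
```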
Reference:
https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing https://docs.microsoft.com/en-us/azure/sql-database/sql-database-service-tier-hyperscale
Design data processing solutions

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Background -

Current environment -
The company has the following virtual machines (VMs):



Requirements -

Storage and processing -
You must be able to use a file system view of data stored in a blob.
You must build an architecture that will allow Contoso to use the Databricks File System (DBFS) layer over a blob store. The architecture will need to support data files, libraries, and images. Additionally, it must provide a web-based interface to documents that contain runnable commands, visualizations, and narrative text, such as a notebook.
CONT_SQL3 requires an initial scale of 35000 IOPS.
CONT_SQL1 and CONT_SQL2 must use the vCore model and should include replicas. The solution must support 8000 IOPS.
The storage should be configured to optimize storage for database OLTP workloads.

Migration -
You must be able to independently scale compute and storage resources.
You must migrate all SQL Server workloads to Azure. You must identify related machines in the on-premises environment and gather disk size and data usage information.
Data from SQL Server must include zone-redundant storage.
You need to ensure that app components can reside on-premises while interacting with components that run in the Azure public cloud.
SAP data must remain on-premises.
The Azure Site Recovery (ASR) results should contain per-machine data.

Business requirements -
You must design a regional disaster recovery topology.
The database backups have regulatory purposes and must be retained for seven years.
CONT_SQL1 stores customer sales data that requires ETL operations for data analysis. A solution is required that reads data from SQL, performs ETL, and outputs to Power BI. The solution should use managed clusters to minimize costs. To optimize logistics, Contoso needs to analyze customer sales data to see if certain products are tied to specific times in the year.
The analytics solution for customer sales data must be available during a regional outage.

Security and auditing -
Contoso requires all corporate computers to enable Windows Firewall.
Azure servers should be able to ping other Contoso Azure servers.
Employee PII must be encrypted in memory, in motion, and at rest. Any data encrypted by SQL Server must support equality searches, grouping, indexing, and joining on the encrypted data.
Keys must be secured by using hardware security modules (HSMs).
CONT_SQL3 must not communicate over the default ports.

Cost -
All solutions must minimize cost and resources.
The organization does not want any unexpected charges.
The data engineers must set the SQL Data Warehouse compute resources to consume 300 DWUs.
CONT_SQL2 is not fully utilized during non-peak hours. You must minimize resource costs during non-peak hours.

You need to optimize storage for CONT_SQL3.
What should you recommend?

  • A. AlwaysOn
  • B. Transactional processing
  • C. General
  • D. Data warehousing


Answer : B

Explanation:
CONT_SQL3 has the SQL Server role and a 100-GB database, and runs on a Hyper-V VM that is to be migrated to an Azure VM.
The storage should be configured to optimize storage for database OLTP workloads.
Azure SQL Database provides three basic in-memory based capabilities (built into the underlying database engine) that can contribute in a meaningful way to performance improvements:
In-Memory Online Transactional Processing (OLTP)
Clustered columnstore indexes intended primarily for Online Analytical Processing (OLAP) workloads
Nonclustered columnstore indexes geared towards Hybrid Transactional/Analytical Processing (HTAP) workloads
Reference:
https://www.databasejournal.com/features/mssql/overview-of-in-memory-technologies-of-azure-sql-database.html
Design data processing solutions

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Overview -

General Overview -
ADatum Corporation is a medical company that has 5,000 physicians located in more than 300 hospitals across the US. The company has a medical department, a sales department, a marketing department, a medical research department, and a human resources department.
You are redesigning the application environment of ADatum.

Physical Locations -
ADatum has three main offices in New York, Dallas, and Los Angeles. The offices connect to each other by using a WAN link. Each office connects directly to the Internet. The Los Angeles office also has a datacenter that hosts all the company's applications.

Existing Environment -

Health Review -
ADatum has a critical OLTP web application named Health Review that physicians use to track billing, patient care, and overall physician best practices.

Health Interface -
ADatum has a critical application named Health Interface that receives hospital messages related to patient care and status updates. The messages are sent in batches by each hospital's enterprise relationship management (ERM) system by using a VPN. The data sent from each hospital can have varying columns and formats.
Currently, a custom C# application is used to send the data to Health Interface. The application uses deprecated libraries and a new solution must be designed for this functionality.

Health Insights -
ADatum has a web-based reporting system named Health Insights that shows hospital and patient insights to physicians and business users. The data is created from the data in Health Review and Health Interface, as well as manual entries.

Database Platform -
Currently, the databases for all three applications are hosted on an out-of-date VMware cluster that has a single instance of Microsoft SQL Server 2012.

Problem Statements -
ADatum identifies the following issues in its current environment:
Over time, the data received by Health Interface from the hospitals has slowed, and the number of messages has increased.
When a new hospital joins ADatum, Health Interface requires a schema modification due to the lack of data standardization.
The speed of batch data processing is inconsistent.

Business Requirements -

Business Goals -
ADatum identifies the following business goals:
Migrate the applications to Azure whenever possible.
Minimize the development effort required to perform data movement.
Provide continuous integration and deployment for development, test, and production environments.
Provide faster access to the applications and the data and provide more consistent application performance.
Minimize the number of services required to perform data processing, development, scheduling, monitoring, and the operationalizing of pipelines.

Health Review Requirements -
ADatum identifies the following requirements for the Health Review application:
Ensure that sensitive health data is encrypted at rest and in transit.
Tag all the sensitive health data in Health Review. The data will be used for auditing.

Health Interface Requirements -
ADatum identifies the following requirements for the Health Interface application:
Upgrade to a data storage solution that will provide flexible schemas and increased throughput for writing data. Data must be regionally located close to each hospital, and reads must return the most recent committed version of an item.
Reduce the amount of time it takes to add data from new hospitals to Health Interface.
Support a more scalable batch processing solution in Azure.
Reduce the amount of development effort to rewrite existing SQL queries.

Health Insights Requirements -
ADatum identifies the following requirements for the Health Insights application:
The analysis of events must be performed over time by using an organizational date dimension table.
The data from Health Interface and Health Review must be available in Health Insights within 15 minutes of being committed.
The new Health Insights application must be built on a massively parallel processing (MPP) architecture that will support the high performance of joins on large fact tables.

What should you recommend as a batch processing solution for Health Interface?

  • A. Azure CycleCloud
  • B. Azure Stream Analytics
  • C. Azure Data Factory
  • D. Azure Databricks


Answer : C

Explanation:
Scenario: ADatum identifies the following requirements for the Health Interface application:
Support a more scalable batch processing solution in Azure.
Reduce the amount of time it takes to add data from new hospitals to Health Interface.
Data Factory integrates with the Azure Cosmos DB bulk executor library to provide the best performance when you write to Azure Cosmos DB.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-cosmos-db
Design data processing solutions

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Overview -
You are a data engineer for Trey Research. The company is close to completing a joint project with the government to build smart highways infrastructure across North America. This involves the placement of sensors and cameras to measure traffic flow, car speed, and vehicle details.
You have been asked to design a cloud solution that will meet the business and technical requirements of the smart highway.

Solution components -

Telemetry Capture -
The telemetry capture system records each time a vehicle passes in front of a sensor. The sensors run on a custom embedded operating system and record the following telemetry data:
Time
Location in latitude and longitude
Speed in kilometers per hour (kmph)
Length of vehicle in meters



Visual Monitoring -
The visual monitoring system is a network of approximately 1,000 cameras placed near highways that capture images of vehicle traffic every 2 seconds. The cameras record high resolution images. Each image is approximately 3 MB in size.

Requirements. Business -
The company identifies the following business requirements:
External vendors must be able to perform custom analysis of data using machine learning technologies.
You must display a dashboard on the operations status page that displays the following metrics: telemetry, volume, and processing latency.
Traffic data must be made available to the Government Planning Department for the purpose of modeling changes to the highway system. The traffic data will be used in conjunction with other data such as information about events such as sporting events, weather conditions, and population statistics. External data used during the modeling is stored in on-premises SQL Server 2016 databases and CSV files stored in an Azure Data Lake Storage Gen2 storage account.
Information about vehicles that have been detected as going over the speed limit during the last 30 minutes must be available to law enforcement officers.
Several law enforcement organizations may respond to speeding vehicles.
The solution must allow for searches of vehicle images by license plate to support law enforcement investigations. Searches must be able to be performed using a query language and must support fuzzy searches to compensate for license plate detection errors.

Requirements. Security -
The solution must meet the following security requirements:
External vendors must not have direct access to sensor data or images.
Images produced by the vehicle monitoring solution must be deleted after one month. You must minimize costs associated with deleting images from the data store.
Unauthorized usage of data must be detected in real time. Unauthorized usage is determined by looking for unusual usage patterns.
All changes to Azure resources used by the solution must be recorded and stored. Data must be provided to the security team for incident response purposes.

Requirements. Sensor data -
You must write all telemetry data to the closest Azure region. The sensors used for the telemetry capture system have a small amount of memory available and so must write data as quickly as possible to avoid losing telemetry data.


DRAG DROP -
You need to design the system for notifying law enforcement officers about speeding vehicles.
How should you design the pipeline? To answer, drag the appropriate services to the correct locations. Each service may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:



Answer :

Explanation:


Scenario:
Information about vehicles that have been detected as going over the speed limit during the last 30 minutes must be available to law enforcement officers. Several law enforcement organizations may respond to speeding vehicles.

Telemetry Capture -
The telemetry capture system records each time a vehicle passes in front of a sensor. The sensors run on a custom embedded operating system and record the following telemetry data:
-> Time
-> Location in latitude and longitude
-> Speed in kilometers per hour (kmph)
-> Length of vehicle in meters
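
On the capture side, a sensor could push each telemetry record to an ingestion service such as Azure Event Hubs. A hedged sketch with the azure-eventhub SDK; the connection string, hub name, and field names are placeholders.

```python
import json
from azure.eventhub import EventHubProducerClient, EventData

producer = EventHubProducerClient.from_connection_string(
    "<connection-string>", eventhub_name="vehicle-telemetry")

# Batch a single reading; real sensors would batch many to send quickly.
batch = producer.create_batch()
batch.add(EventData(json.dumps({
    "time": "2021-01-01T08:30:00Z",
    "lat": 47.61, "lon": -122.33,
    "speedKmph": 130.0,
    "vehicleLengthMeters": 4.2,
})))
producer.send_batch(batch)
producer.close()
```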
Reference:
https://docs.microsoft.com/en-us/azure/azure-databricks/what-is-azure-databricks
Design data processing solutions

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Overview -
Litware, Inc. owns and operates 300 convenience stores across the US. The company sells a variety of packaged foods and drinks, as well as a variety of prepared foods, such as sandwiches and pizzas.
Litware has a loyalty club whereby members can get daily discounts on specific items by providing their membership number at checkout.
Litware employs business analysts who prefer to analyze data by using Microsoft Power BI, and data scientists who prefer analyzing data in Azure Databricks notebooks.

Requirements. Business Goals -
Litware wants to create a new analytics environment in Azure to meet the following requirements:
See inventory levels across the stores. Data must be updated as close to real time as possible.
Execute ad hoc analytical queries on historical data to identify whether the loyalty club discounts increase sales of the discounted products.
Every four hours, notify store employees about how many prepared food items to produce based on historical demand from the sales data.
Requirements. Technical Requirements
Litware identifies the following technical requirements:
Minimize the number of different Azure services needed to achieve the business goals.
Use platform as a service (PaaS) offerings whenever possible and avoid having to provision virtual machines that must be managed by Litware.
Ensure that the analytical data store is accessible only to the company’s on-premises network and Azure services.
Use Azure Active Directory (Azure AD) authentication whenever possible.
Use the principle of least privilege when designing security.
Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store. Litware wants to remove transient data from Data Lake Storage once the data is no longer in use. Files that have a modified date that is older than 14 days must be removed (a lifecycle-policy sketch follows this list).
Limit the business analysts’ access to customer contact information, such as phone numbers, because this type of data is not analytically relevant.
Ensure that you can quickly restore a copy of the analytical data store within one hour in the event of corruption or accidental deletion.
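
The 14-day cleanup maps naturally to a Blob lifecycle-management rule. A hedged sketch with azure-mgmt-storage; the subscription, resource group, account name, and staging prefix are assumptions.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Delete staged blobs 14 days after their last modification.
client.management_policies.create_or_update(
    resource_group_name="litware-rg",   # hypothetical
    account_name="litwaredatalake",     # hypothetical
    management_policy_name="default",
    properties={"policy": {"rules": [{
        "name": "purge-transient-staging",
        "enabled": True,
        "type": "Lifecycle",
        "definition": {
            "filters": {"blobTypes": ["blockBlob"], "prefixMatch": ["staging/"]},
            "actions": {"baseBlob": {"delete": {"daysAfterModificationGreaterThan": 14}}},
        },
    }]}},
)
```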
Requirements. Planned Environment
Litware plans to implement the following environment:
The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure.
Customer data, including name, contact information, and loyalty number, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
Product data, including product ID, name, and category, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
Daily inventory data comes from a Microsoft SQL server located on a private network.
Litware currently has 5 TB of historical sales data and 100 GB of customer data. The company expects approximately 100 GB of new data per month for the next year.
Litware will build a custom application named FoodPrep to provide store employees with the calculation results of how many prepared food items to produce every four hours.
Litware does not plan to implement Azure ExpressRoute or a VPN between the on-premises network and Azure.

Inventory levels must be calculated by subtracting the current day's sales from the previous day's final inventory.
Which two options provide Litware with the ability to quickly calculate the current inventory levels by store and product? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.

  • A. Consume the output of the event hub by using Azure Stream Analytics and aggregate the data by store and product. Output the resulting data directly to Azure Synapse Analytics. Use Transact-SQL to calculate the inventory levels.
  • B. Output Event Hubs Avro files to Azure Blob storage. Use Transact-SQL to calculate the inventory levels by using PolyBase in Azure Synapse Analytics.
  • C. Consume the output of the event hub by using Databricks. Use Databricks to calculate the inventory levels and output the data to Azure Synapse Analytics.
  • D. Consume the output of the event hub by using Azure Stream Analytics and aggregate the data by store and product. Output the resulting data into Databricks. Calculate the inventory levels in Databricks and output the data to Azure Blob storage.
  • E. Output Event Hubs Avro files to Azure Blob storage. Trigger an Azure Data Factory copy activity to run every 10 minutes to load the data into Azure Synapse Analytics. Use Transact-SQL to aggregate the data by store and product.


Answer : AE

Explanation:
A: Azure Stream Analytics is a fully managed service providing low-latency, highly available, scalable complex event processing over streaming data in the cloud.
You can use your Azure Synapse Analytics (SQL Data warehouse) database as an output sink for your Stream Analytics jobs.
E: Event Hubs Capture is the easiest way to get data into Azure. Using Azure Data Lake, Azure Data Factory, and Azure HDInsight, you can perform batch processing and other analytics using familiar tools and platforms of your choosing, at any scale you need.
Note: Event Hubs Capture creates files in Avro format.
Captured data is written in Apache Avro format: a compact, fast, binary format that provides rich data structures with inline schema. This format is widely used in the Hadoop ecosystem, Stream Analytics, and Azure Data Factory.
Scenario: The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure.
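
For option E, the captured Avro blobs can be inspected client-side before the PolyBase load. A hedged sketch using azure-storage-blob and fastavro; the connection string and blob path (which follows the default Capture naming layout) are placeholders.

```python
import io
from azure.storage.blob import BlobServiceClient
from fastavro import reader

service = BlobServiceClient.from_connection_string("<connection-string>")
blob = service.get_blob_client(
    container="capture",
    blob="litware-ns/sales-hub/0/2021/01/01/08/30/00.avro")  # hypothetical path

# Capture wraps each original event in a record whose "Body" field holds the payload.
payload = io.BytesIO(blob.download_blob().readall())
for record in reader(payload):
    print(record["Body"])
```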
Reference:
https://docs.microsoft.com/bs-latn-ba/azure/sql-data-warehouse/sql-data-warehouse-integrate-azure-stream-analytics https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-capture-overview

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Overview -
Litware, Inc. owns and operates 300 convenience stores across the US. The company sells a variety of packaged foods and drinks, as well as a variety of prepared foods, such as sandwiches and pizzas.
Litware has a loyalty club whereby members can get daily discounts on specific items by providing their membership number at checkout.
Litware employs business analysts who prefer to analyze data by using Microsoft Power BI, and data scientists who prefer analyzing data in Azure Databricks notebooks.

Requirements. Business Goals -
Litware wants to create a new analytics environment in Azure to meet the following requirements:
See inventory levels across the stores. Data must be updated as close to real time as possible.
Execute ad hoc analytical queries on historical data to identify whether the loyalty club discounts increase sales of the discounted products.
Every four hours, notify store employees about how many prepared food items to produce based on historical demand from the sales data.
Requirements. Technical Requirements
Litware identifies the following technical requirements:
Minimize the number of different Azure services needed to achieve the business goals.
Use platform as a service (PaaS) offerings whenever possible and avoid having to provision virtual machines that must be managed by Litware.
Ensure that the analytical data store is accessible only to the company’s on-premises network and Azure services.
Use Azure Active Directory (Azure AD) authentication whenever possible.
Use the principle of least privilege when designing security.
Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store. Litware wants to remove transient data from Data Lake Storage once the data is no longer in use. Files that have a modified date that is older than 14 days must be removed.
Limit the business analysts’ access to customer contact information, such as phone numbers, because this type of data is not analytically relevant.
Ensure that you can quickly restore a copy of the analytical data store within one hour in the event of corruption or accidental deletion.
Requirements. Planned Environment
Litware plans to implement the following environment:
The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure.
Customer data, including name, contact information, and loyalty number, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
Product data, including product ID, name, and category, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
Daily inventory data comes from a Microsoft SQL server located on a private network.
Litware currently has 5 TB of historical sales data and 100 GB of customer data. The company expects approximately 100 GB of new data per month for the next year.
Litware will build a custom application named FoodPrep to provide store employees with the calculation results of how many prepared food items to produce every four hours.
Litware does not plan to implement Azure ExpressRoute or a VPN between the on-premises network and Azure.


HOTSPOT -
Which Azure Data Factory components should you recommend using together to import the customer data from Salesforce to Data Lake Storage? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:




Answer :

Explanation:
Box 1: Self-hosted integration runtime
A self-hosted IR is capable of running copy activities between a cloud data store and a data store in a private network.

Box 2: Schedule trigger -

Schedule every 8 hours -

Box 3: Copy activity -
Scenario:
-> Customer data, including name, contact information, and loyalty number, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
-> Product data, including product ID, name, and category, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
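
A hedged sketch of Boxes 2 and 3 wired together: an eight-hour schedule trigger that starts the copy pipeline, defined with azure-mgmt-datafactory models (all resource and pipeline names are assumptions).

```python
from datetime import datetime, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineReference, ScheduleTrigger, ScheduleTriggerRecurrence,
    TriggerPipelineReference, TriggerResource)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Fire every 8 hours and run the (hypothetical) Salesforce copy pipeline.
trigger = ScheduleTrigger(
    recurrence=ScheduleTriggerRecurrence(
        frequency="Hour", interval=8,
        start_time=datetime(2021, 1, 1, tzinfo=timezone.utc)),
    pipelines=[TriggerPipelineReference(
        pipeline_reference=PipelineReference(reference_name="CopySalesforceCustomers"))],
)
adf.triggers.create_or_update(
    "litware-rg", "litware-adf", "Every8Hours",
    TriggerResource(properties=trigger))
```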

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Overview -
Litware, Inc. owns and operates 300 convenience stores across the US. The company sells a variety of packaged foods and drinks, as well as a variety of prepared foods, such as sandwiches and pizzas.
Litware has a loyalty club whereby members can get daily discounts on specific items by providing their membership number at checkout.
Litware employs business analysts who prefer to analyze data by using Microsoft Power BI, and data scientists who prefer analyzing data in Azure Databricks notebooks.

Requirements. Business Goals -
Litware wants to create a new analytics environment in Azure to meet the following requirements:
See inventory levels across the stores. Data must be updated as close to real time as possible.
Execute ad hoc analytical queries on historical data to identify whether the loyalty club discounts increase sales of the discounted products.
Every four hours, notify store employees about how many prepared food items to produce based on historical demand from the sales data.
Requirements. Technical Requirements
Litware identifies the following technical requirements:
Minimize the number of different Azure services needed to achieve the business goals.
Use platform as a service (PaaS) offerings whenever possible and avoid having to provision virtual machines that must be managed by Litware.
Ensure that the analytical data store is accessible only to the company’s on-premises network and Azure services.
Use Azure Active Directory (Azure AD) authentication whenever possible.
Use the principle of least privilege when designing security.
Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store. Litware wants to remove transient data from Data Lake Storage once the data is no longer in use. Files that have a modified date that is older than 14 days must be removed.
Limit the business analysts’ access to customer contact information, such as phone numbers, because this type of data is not analytically relevant.
Ensure that you can quickly restore a copy of the analytical data store within one hour in the event of corruption or accidental deletion.
Requirements. Planned Environment
Litware plans to implement the following environment:
The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure.
Customer data, including name, contact information, and loyalty number, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
Product data, including product ID, name, and category, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
Daily inventory data comes from a Microsoft SQL server located on a private network.
Litware currently has 5 TB of historical sales data and 100 GB of customer data. The company expects approximately 100 GB of new data per month for the next year.
Litware will build a custom application named FoodPrep to provide store employees with the calculation results of how many prepared food items to produce every four hours.
Litware does not plan to implement Azure ExpressRoute or a VPN between the on-premises network and Azure.


HOTSPOT -
Which Azure Data Factory components should you recommend using together to import the daily inventory data from SQL to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:




Answer :

Explanation:
Box 1: Self-hosted integration runtime
A self-hosted IR is capable of running copy activities between a cloud data store and a data store in a private network.
Scenario: Daily inventory data comes from a Microsoft SQL server located on a private network.

Box 2: Schedule trigger -
The inventory data arrives once a day, so a schedule trigger with a daily recurrence runs the load on that cadence.

Box 3: Copy activity -
Scenario:
Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store. Litware wants to remove transient data from Data Lake Storage once the data is no longer in use. Files that have a modified date that is older than 14 days must be removed.
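
As a rough illustration of how the three components fit together, the sketch below expresses trimmed Azure Data Factory definitions as Python dictionaries mirroring the underlying JSON. All resource names (SelfHostedIR, InventorySqlTable, InventoryStaging, and so on) are hypothetical, and the property sets are pared down to the essentials rather than being complete, deployable definitions.

```python
# Hypothetical, trimmed ADF definitions expressed as Python dicts that
# mirror the JSON Data Factory uses.

# Box 1: a SQL Server linked service bound to a self-hosted integration
# runtime so the service can reach the private network.
sql_linked_service = {
    "name": "OnPremSqlServer",
    "properties": {
        "type": "SqlServer",
        "connectVia": {
            "referenceName": "SelfHostedIR",
            "type": "IntegrationRuntimeReference",
        },
        "typeProperties": {"connectionString": "<connection string>"},
    },
}

# Box 3: a Copy activity that moves rows from the SQL dataset into an
# Azure Data Lake Storage Gen2 staging dataset.
pipeline = {
    "name": "CopyInventoryToDataLake",
    "properties": {
        "activities": [
            {
                "name": "CopyInventory",
                "type": "Copy",
                "inputs": [
                    {"referenceName": "InventorySqlTable", "type": "DatasetReference"}
                ],
                "outputs": [
                    {"referenceName": "InventoryStaging", "type": "DatasetReference"}
                ],
                "typeProperties": {
                    "source": {"type": "SqlServerSource"},
                    "sink": {"type": "ParquetSink"},
                },
            }
        ]
    },
}

# Box 2: a schedule trigger that starts the pipeline once a day.
trigger = {
    "name": "DailyInventoryTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {"frequency": "Day", "interval": 1}
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "CopyInventoryToDataLake",
                    "type": "PipelineReference",
                }
            }
        ],
    },
}
```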

Case study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case. However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study -
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Overview -
Litware, Inc. owns and operates 300 convenience stores across the US. The company sells a variety of packaged foods and drinks, as well as a variety of prepared foods, such as sandwiches and pizzas.
Litware has a loyalty club whereby members can get daily discounts on specific items by providing their membership number at checkout.
Litware employs business analysts who prefer to analyze data by using Microsoft Power BI, and data scientists who prefer analyzing data in Azure Databricks notebooks.

Requirements. Business Goals -
Litware wants to create a new analytics environment in Azure to meet the following requirements:
See inventory levels across the stores. Data must be updated as close to real time as possible.
Execute ad hoc analytical queries on historical data to identify whether the loyalty club discounts increase sales of the discounted products.
Every four hours, notify store employees about how many prepared food items to produce based on historical demand from the sales data.
Requirements. Technical Requirements -
Litware identifies the following technical requirements:
Minimize the number of different Azure services needed to achieve the business goals.
Use platform as a service (PaaS) offerings whenever possible and avoid having to provision virtual machines that must be managed by Litware.
Ensure that the analytical data store is accessible only to the company’s on-premises network and Azure services.
Use Azure Active Directory (Azure AD) authentication whenever possible.
Use the principle of least privilege when designing security.
Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store. Litware wants to remove transient data from Data Lake Storage once the data is no longer in use. Files that have a modified date that is older than 14 days must be removed.
Limit the business analysts’ access to customer contact information, such as phone numbers, because this type of data is not analytically relevant.
Ensure that you can quickly restore a copy of the analytical data store within one hour in the event of corruption or accidental deletion.
Requirements. Planned Environment -
Litware plans to implement the following environment:
The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure.
Customer data, including name, contact information, and loyalty number, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
Product data, including product ID, name, and category, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
Daily inventory data comes from a Microsoft SQL Server database located on a private network.
Litware currently has 5 TB of historical sales data and 100 GB of customer data. The company expects approximately 100 GB of new data per month for the next year.
Litware will build a custom application named FoodPrep to provide store employees with the calculation results of how many prepared food items to produce every four hours.
Litware does not plan to implement Azure ExpressRoute or a VPN between the on-premises network and Azure.


HOTSPOT -
Which Azure service and feature should you recommend using to manage the transient data for Data Lake Storage? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:




Answer :

Explanation:
Scenario: Stage inventory data in Azure Data Lake Storage Gen2 before loading the data into the analytical data store. Litware wants to remove transient data from Data Lake Storage once the data is no longer in use. Files that have a modified date that is older than 14 days must be removed.

Service: Azure Data Factory -
Clean up files by using the built-in Delete activity in Azure Data Factory (ADF).
The ADF built-in Delete activity can be part of an ETL workflow and deletes undesired files without requiring code. You can use ADF to delete folders or files from Azure Blob Storage, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, File System, FTP Server, SFTP Server, and Amazon S3.
You can delete only expired files rather than deleting all the files in a folder. For example, you may want to delete only the files that were last modified more than 14 days ago, matching the scenario's 14-day requirement.

Feature: Delete Activity -
Reference:
https://azure.microsoft.com/sv-se/blog/clean-up-files-by-built-in-delete-activity-in-azure-data-factory/
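
To make the idea concrete, here is a hedged sketch of such a Delete activity, again as a Python dict mirroring the ADF JSON. The dataset name is hypothetical; the 14-day cutoff is expressed through modifiedDatetimeEnd, which the Delete activity's store settings accept for filtering on last-modified time.

```python
from datetime import datetime, timedelta, timezone

# Files last modified before this timestamp (i.e., older than 14 days)
# are eligible for deletion.
cutoff = (datetime.now(timezone.utc) - timedelta(days=14)).isoformat()

# Hypothetical, trimmed Delete activity definition (ADF JSON as a Python dict).
delete_activity = {
    "name": "DeleteExpiredStagingFiles",
    "type": "Delete",
    "typeProperties": {
        "dataset": {
            "referenceName": "InventoryStaging",  # hypothetical ADLS Gen2 dataset
            "type": "DatasetReference",
        },
        "storeSettings": {
            "type": "AzureBlobFSReadSettings",  # ADLS Gen2 read settings
            "recursive": True,
            "modifiedDatetimeEnd": cutoff,  # delete only files older than the cutoff
        },
    },
}
```

In a deployed pipeline the cutoff would normally be a Data Factory expression such as @{adddays(utcnow(), -14)} so that it is evaluated at run time rather than fixed when the definition is authored.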
Design data processing solutions

A company has an application that uses Azure SQL Database as the data store.
The application experiences a large increase in activity during the last month of each year.
You need to manually scale the Azure SQL Database instance to account for the increase in data write operations.
Which scaling method should you recommend?

  • A. Scale up by using elastic pools to distribute resources.
  • B. Scale out by sharding the data across databases.
  • C. Scale up by increasing the database throughput units.


Answer : C

Explanation:
Under the DTU-based purchasing model, the cost of running an Azure SQL Database instance is based on the number of database throughput units (DTUs) allocated to the database. When determining the number of units to allocate, a major contributing factor is identifying what processing power is needed to handle the volume of expected requests.
Running the statement to upgrade/downgrade your database takes a matter of seconds.
Incorrect Answers:
A: Elastic pools are used to distribute resources across two or more databases; the scenario involves only a single database.
Reference:
https://www.skylinetechnologies.com/Blog/Skyline-Blog/August_2017/dynamically-scale-azure-sql-database
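
The statement the explanation refers to is a plain ALTER DATABASE command. Below is a minimal sketch of issuing it from Python with pyodbc; the server, database, credentials, and target service objective are all hypothetical placeholders.

```python
import pyodbc

# Hypothetical connection details; replace with your own.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver.database.windows.net;DATABASE=master;"
    "UID=scaling_admin;PWD=<password>",
    autocommit=True,  # ALTER DATABASE cannot run inside a user transaction
)

# Scale up to a larger DTU service objective before the busy month, then run
# the same statement with a smaller objective to scale back down afterwards.
conn.cursor().execute("ALTER DATABASE [salesdb] MODIFY (SERVICE_OBJECTIVE = 'P2');")
```

The statement itself returns quickly; the service completes the scale operation in the background.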

You are designing an Azure Data Factory pipeline for processing data. The pipeline will process data that is stored in general-purpose standard Azure storage.
You need to ensure that the compute environment is created on-demand and removed when the process is completed.
Which type of activity should you recommend?

  • A. Databricks Python activity
  • B. Data Lake Analytics U-SQL activity
  • C. HDInsight Pig activity
  • D. Databricks Jar activity


Answer : C

Explanation:
The HDInsight Pig activity in a Data Factory pipeline executes Pig queries on your own or on-demand HDInsight cluster.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/transform-data-using-hadoop-pig
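
The on-demand behavior comes from the linked service rather than the activity itself: with an on-demand HDInsight linked service, Data Factory creates the cluster just before the activity runs and deletes it once the time-to-live expires. A trimmed, hypothetical sketch of the two definitions, as Python dicts mirroring the ADF JSON (required properties such as the subscription and service principal are omitted):

```python
# Hypothetical, trimmed on-demand HDInsight linked service. ADF creates the
# cluster when an activity needs it and deletes it after timeToLive expires.
hdi_on_demand = {
    "name": "OnDemandHdiCluster",
    "properties": {
        "type": "HDInsightOnDemand",
        "typeProperties": {
            "clusterType": "hadoop",
            "clusterSize": 4,
            "timeToLive": "00:05:00",  # idle time before the cluster is deleted
            "linkedServiceName": {     # storage the transient cluster uses
                "referenceName": "GeneralPurposeStorage",
                "type": "LinkedServiceReference",
            },
        },
    },
}

# A Pig activity that runs a script on the on-demand cluster.
pig_activity = {
    "name": "RunPigScript",
    "type": "HDInsightPig",
    "linkedServiceName": {
        "referenceName": "OnDemandHdiCluster",
        "type": "LinkedServiceReference",
    },
    "typeProperties": {
        "scriptPath": "scripts/process.pig",  # hypothetical script location
        "scriptLinkedService": {
            "referenceName": "GeneralPurposeStorage",
            "type": "LinkedServiceReference",
        },
    },
}
```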

A company installs IoT devices to monitor its fleet of delivery vehicles. Data from the devices is collected by using Azure Event Hubs.
The data must be transmitted to Power BI for real-time data visualizations.
You need to recommend a solution.
What should you recommend?

  • A. Azure HDInsight with Spark Streaming
  • B. Apache Spark in Azure Databricks
  • C. Azure Stream Analytics
  • D. Azure HDInsight with Storm


Answer : C

Explanation:
Step 1: Get the event hub ready for data access by adding a consumer group.
Step 2: Create, configure, and run a Stream Analytics job that transfers data from the event hub to your Power BI account.
Step 3: Create and publish a Power BI report to visualize the data.
(The referenced walkthrough describes this pattern for IoT Hub; an event hub works the same way as a Stream Analytics input.)
Reference:
https://docs.microsoft.com/en-us/azure/iot-hub/iot-hub-live-data-visualization-in-power-bi
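
Stream Analytics jobs are authored in a SQL-like query language. A hedged sketch of the kind of query such a job might run, shown here as a Python string; VehicleTelemetry (the event hub input alias) and PowerBiOutput (the Power BI output alias) are hypothetical names configured on the job:

```python
# Hypothetical Stream Analytics query: aggregate vehicle speed in 30-second
# tumbling windows and route the results to the Power BI output.
asa_query = """
SELECT
    vehicleId,
    AVG(speed) AS avgSpeed,
    System.Timestamp() AS windowEnd
INTO PowerBiOutput
FROM VehicleTelemetry TIMESTAMP BY eventTime
GROUP BY vehicleId, TumblingWindow(second, 30)
"""
```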

You have a Windows-based solution that analyzes scientific data. You are designing a cloud-based solution that performs real-time analysis of the data.
You need to design the logical flow for the solution.
Which two actions should you recommend? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

  • A. Send data from the application to an Azure Stream Analytics job.
  • B. Use an Azure Stream Analytics job on an edge device. Ingress data from an Azure Data Factory instance and build queries that output to Power BI.
  • C. Use an Azure Stream Analytics job in the cloud. Ingress data from the Azure Event Hub instance and build queries that output to Power BI.
  • D. Use an Azure Stream Analytics job in the cloud. Ingress data from an Azure Event Hub instance and build queries that output to Azure Data Lake Storage.
  • E. Send data from the application to Azure Data Lake Storage.
  • F. Send data from the application to an Azure Event Hub instance.


Answer : CF

Explanation:
Stream Analytics has first-class integration with Azure data streams as inputs from three kinds of resources:
-> Azure Event Hubs
-> Azure IoT Hub
-> Azure Blob storage
Reference:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-define-inputs
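
For answer F, the sending half of the flow, a minimal producer sketch using the azure-eventhub Python package (v5 API); the connection string, hub name, and payload are hypothetical:

```python
import json
from azure.eventhub import EventHubProducerClient, EventData

# Hypothetical connection details; replace with your namespace and hub.
CONN_STR = "Endpoint=sb://<namespace>.servicebus.windows.net/;..."
producer = EventHubProducerClient.from_connection_string(
    CONN_STR, eventhub_name="scientific-readings"
)

# Send one batch of readings; the Stream Analytics job (answer C) reads them
# from the event hub as its streaming input.
reading = {"sensorId": "s-001", "value": 42.7, "eventTime": "2021-01-01T00:00:00Z"}
with producer:
    batch = producer.create_batch()
    batch.add(EventData(json.dumps(reading)))
    producer.send_batch(batch)
```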
