Implement a Partition Strategy for Streaming Workloads – The Storage of Data

Chapter 7, “Design and Implement a Data Stream Processing Solution,” discusses partitioning data both within a single partition and across partitions. Exercise 7.5 provides a hands-on implementation of partitioning streaming workloads. Partitioning [...]
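
The following is a minimal PySpark Structured Streaming sketch, not the Exercise 7.5 code itself, that illustrates the idea: the source path, the schema, and the Scenario column used with partitionBy() are illustrative assumptions.

%%pyspark
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

# Illustrative schema for the incoming brain wave readings (assumed columns).
schema = StructType([
    StructField("Timestamp", StringType()),
    StructField("Scenario", StringType()),
    StructField("ReadingValue", DoubleType())])

# Read the incoming stream; a file-based source keeps the sketch simple.
streamingDF = spark.readStream \
    .schema(schema) \
    .option("header", "true") \
    .csv("abfss://<container>@<account>.dfs.core.windows.net/EMEA/brainjammer/in/")

# Partition the output so downstream readers can prune files by Scenario.
streamingQuery = streamingDF.writeStream \
    .format("parquet") \
    .option("checkpointLocation", "abfss://<container>@<account>.dfs.core.windows.net/checkpoints/") \
    .partitionBy("Scenario") \
    .outputMode("append") \
    .start("abfss://<container>@<account>.dfs.core.windows.net/EMEA/brainjammer/out/")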

Deliver Data in Parquet Files – The Storage of Data

In Exercise 4.7 you performed a conversion of brain waves stored in multiple CSV files using the following PySpark code snippet:

%%pyspark
df = spark.read.option("header", "true") \
    .csv('abfss://*@*.dfs.core.windows.net/EMEA/brainjammer/in/2022/04/01/18/*')
display(df.limit(10))

Then you wrote that [...]
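
A hedged sketch of that write step follows (the exact Exercise 4.7 code is elided above); it assumes the df DataFrame from the preceding snippet, and the output path is an illustrative placeholder.

%%pyspark
# Write the DataFrame loaded above to Parquet; the output path is a placeholder.
df.write \
    .mode("overwrite") \
    .parquet("abfss://<container>@<account>.dfs.core.windows.net/EMEA/brainjammer/out/2022/04/01/18/")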

Azure Synapse Analytics Develop Hub Notebook – The Storage of Data

df = spark.read.option("header", "true") \
    .csv('abfss://*@*.dfs.core.windows.net/EMEA/brainjammer/in/2022/04/01/18/*')
display(df.limit(10))

FIGURE 4.21 Azure Synapse Analytics Develop hub load Notebook

FIGURE 4.22 Azure Synapse Analytics Develop hub write Notebook Parquet files

In this exercise you [...]
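
To confirm the conversion, a brief sketch like the following could read the Parquet output back into a DataFrame; the path is again an illustrative placeholder, not the exercise's exact location.

%%pyspark
# Read the Parquet files written by the exercise and preview a few rows.
dfParquet = spark.read \
    .parquet("abfss://<container>@<account>.dfs.core.windows.net/EMEA/brainjammer/out/2022/04/01/18/")
display(dfParquet.limit(10))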