Design and Implement the Data Exploration Layer – The Storage of Data
The serving layer is a component of the lambda architecture (refer to Figure 3.13). It is the layer into which both the speed layer, which processes data incrementally, and the batch layer, which holds more refined data, feed. Data stored in the serving layer is then accessed by other applications, by individuals, or by tools such as Power BI. To implement a serving layer, you therefore need to know how the two upstream layers (i.e., speed and batch) deliver their data. In other words, what are the format and schema of the data being ingested, and likely transformed, on its way into the serving layer?

Remember that the speed layer receives data along the hot path, which typically flows from an IoT device through products like Event Hubs, IoT Hub, Stream Analytics, or Kafka. That streaming data is processed incrementally and can be transformed only lightly in real time before it is either sent live to a consumer or placed into a datastore like ADLS, a SQL pool, or Azure Cosmos DB. The data ingested into the batch layer is not streaming; it flows along a cold path into an ingestion product like Azure Data Factory, Azure Databricks, or Azure Synapse Analytics. Once ingested and stored, that data can be transformed and moved to the serving layer. Chapter 6 includes in-depth coverage of batching and batch processing, and Chapter 7 discusses stream processing.
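To make the relationship between the three layers more concrete, the following is a minimal Python sketch, not an Azure implementation. It assumes that both paths produce records with the same simple shape, a (device_id, reading) pair, and that the serving layer answers a query by combining a precomputed batch view with an incrementally updated speed view. The function names and the record shape are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_batch_view(historical_events):
    """Cold path: aggregate the full set of stored events in one pass."""
    totals = defaultdict(int)
    for device_id, reading in historical_events:
        totals[device_id] += reading
    return dict(totals)


def apply_speed_update(speed_view, event):
    """Hot path: fold one newly arrived event into the running totals."""
    device_id, reading = event
    speed_view[device_id] = speed_view.get(device_id, 0) + reading


def serve(batch_view, speed_view, device_id):
    """Serving layer: combine both views to answer a query for one device."""
    return batch_view.get(device_id, 0) + speed_view.get(device_id, 0)


# Hypothetical sample data: events already stored, plus one streaming event.
batch_view = build_batch_view([("d1", 10), ("d2", 5), ("d1", 7)])
speed_view = {}
apply_speed_update(speed_view, ("d1", 3))
total_for_d1 = serve(batch_view, speed_view, "d1")
```

The sketch only works because both views share the same key and value types, which is the reason the format and schema delivered by the speed and batch layers need to be known before the serving layer is designed.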
Deliver Data in a Relational Star Schema
When you implement the serving layer portion of a lambda architecture along the cold path, consider storing data in a star schema. (Refer to Figure 3.14 for an illustration of a star schema.) The benefit expected from data that flows through the batch layer and into the serving layer along the cold path is efficient querying. That is, data that has flowed through the batch layer to the serving layer is more refined, and can be queried with better performance, than data that reached the serving layer through the speed layer. Complete Exercise 4.13, where you will retrieve data from an Azure SQL database, transform it a bit, and then store it for consumption on a serving layer (i.e., a dedicated SQL pool).
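As a rough illustration of the shape of a star schema, the sketch below builds one fact table surrounded by two dimension tables and runs a typical serving-layer query against it. It uses Python's built-in sqlite3 module purely as a local stand-in for a dedicated SQL pool, and the table names, column names, and sample rows (FactReading, DimDevice, DimDate, and so on) are hypothetical; they are not taken from Exercise 4.13.

```python
import sqlite3


def build_star_schema(conn):
    """Create two dimension tables and one fact table that references them."""
    conn.executescript(
        """
        CREATE TABLE DimDevice (
            device_key  INTEGER PRIMARY KEY,
            device_name TEXT NOT NULL
        );
        CREATE TABLE DimDate (
            date_key      INTEGER PRIMARY KEY,
            calendar_date TEXT NOT NULL
        );
        CREATE TABLE FactReading (
            reading_id    INTEGER PRIMARY KEY,
            device_key    INTEGER NOT NULL REFERENCES DimDevice(device_key),
            date_key      INTEGER NOT NULL REFERENCES DimDate(date_key),
            reading_value REAL NOT NULL
        );
        """
    )


def load_sample_rows(conn):
    """Insert a few made-up rows so the query below has data to read."""
    conn.executemany(
        "INSERT INTO DimDevice VALUES (?, ?)",
        [(1, "sensor-a"), (2, "sensor-b")],
    )
    conn.executemany(
        "INSERT INTO DimDate VALUES (?, ?)",
        [(20240101, "2024-01-01"), (20240102, "2024-01-02")],
    )
    conn.executemany(
        "INSERT INTO FactReading (device_key, date_key, reading_value) "
        "VALUES (?, ?, ?)",
        [(1, 20240101, 10.0), (1, 20240102, 14.0), (2, 20240101, 7.0)],
    )


def average_by_device(conn):
    """Join the fact table to one dimension and aggregate the measure."""
    return conn.execute(
        """
        SELECT d.device_name, AVG(f.reading_value)
        FROM FactReading AS f
        JOIN DimDevice AS d ON d.device_key = f.device_key
        GROUP BY d.device_name
        ORDER BY d.device_name
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
load_sample_rows(conn)
averages = average_by_device(conn)
```

The fact table holds only keys and the numeric measure, while descriptive attributes live in the dimensions, so an analytical query needs a single join per dimension it filters or groups by.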