Implement a Dimensional Hierarchy – The Storage of Data


  1. Log in to the Azure portal at https://portal.azure.com ➢ navigate to the Azure Synapse Analytics workspace you created in Exercise 3.3 ➢ on the Overview blade, click the Open link in the Open Synapse Studio tile ➢ select the Develop hub ➢ click the + to the right of Develop ➢ click SQL Script ➢ select your dedicated SQL pool from the Connect To drop‐down list box ➢ and then execute the SQL syntax in the file createDimensionalHierarchy.sql, which creates and populates the tables. You can find the syntax in the Chapter04/Ch04Ex13 directory on GitHub.
  2. Execute the SQL queries in file queryDimensionalHierarchy.sql, located in the Chapter04/Ch04Ex13 directory on GitHub—for example:

-- MODE
SELECT * FROM [dimensional].[MODE] WHERE [MODE_ID] = 2

Approaching the data discovery from the top down indicates a desire to understand which mode is connected to which frequencies. Because there is no direct relationship between a mode and a frequency, navigation through the dimensional hierarchy must flow through the ELECTRODE table. Working from the bottom up, you can likewise determine which frequency is linked to which mode by navigating through the ELECTRODE table. For example, if you wanted to find the mode, electrode, and frequency of a specific brain wave reading, you could use a query like the following:

SELECT M.MODE, E.ELECTRODE, F.FREQUENCY
FROM [dimensional].[MODE] M,
     [dimensional].[ELECTRODE] E,
     [dimensional].[FREQUENCY] F
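Note that the query as printed has no join predicates, so it would return the cross product of the three tables. To navigate the hierarchy through ELECTRODE, you would add join conditions on the key columns. The following is a minimal sketch; the foreign key columns MODE_ID on ELECTRODE and ELECTRODE_ID on FREQUENCY are assumptions for illustration, not confirmed column names from the book's schema (check createDimensionalHierarchy.sql for the actual keys):

```sql
-- Hypothetical joins through the ELECTRODE bridge table.
-- E.MODE_ID and F.ELECTRODE_ID are assumed foreign key columns.
SELECT M.[MODE], E.[ELECTRODE], F.[FREQUENCY]
FROM [dimensional].[MODE] M
INNER JOIN [dimensional].[ELECTRODE] E ON E.[MODE_ID] = M.[MODE_ID]
INNER JOIN [dimensional].[FREQUENCY] F ON F.[ELECTRODE_ID] = E.[ELECTRODE_ID]
WHERE M.[MODE_ID] = 2;
```

Filtering on MODE_ID walks the hierarchy top down; filtering on a FREQUENCY key instead would walk it bottom up through the same joins.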

This kind of navigation and data discovery is useful when you encounter a data anomaly and want to find out where the value came from. Perhaps in this example the specific electrode is faulty, or the code that captures the POW mode readings has a bug.

Create and Execute Queries by Using a Compute Solution That Leverages SQL Serverless and Spark Cluster

When you select Built‐in from the Connect To drop‐down list box, as shown in Figure 4.20, it means that you are targeting a serverless SQL pool. When you run workloads on a serverless SQL pool, as you did in Exercise 4.6 and Exercise 4.11, the required CPU and memory are allocated as needed. Both serverless SQL pools and Apache Spark pools were introduced in detail in Chapter 3. Table 3.7 describes the differences between dedicated and serverless SQL pools, and Table 3.9 lists some of the node sizes available for Apache Spark pools. In Exercise 3.4, you created an Apache Spark pool and used it in Exercise 4.2, Exercise 4.7, and Exercise 4.12.
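Because a serverless SQL pool has no attached storage, its queries typically read files directly from the data lake with OPENROWSET. The following is a sketch of such a query; the storage account, container, and file path are placeholders for illustration and do not come from the exercises:

```sql
-- Illustrative serverless (Built-in) pool query.
-- The ADLS Gen2 URL below is a placeholder, not a real path from the book.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://<storageAccount>.dfs.core.windows.net/<container>/brainwaves/*.parquet',
    FORMAT = 'PARQUET'
) AS [rows];
```

Compute for a query like this is allocated on demand and billed by data processed, which is the key operational difference from the dedicated SQL pool used in the exercise above.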
