Schema Modifier – The Storage of Data
Consider adding a few additional transformations and viewing the configuration details. Add an Aggregate schema modifier or a Window schema modifier and see how they can be used to transform the sample data. An Aggregate schema modifier, as you know, enables you to use SQL aggregations like SUM, AVG, MIN, MAX, LAST, or FIRST. A Window Schema modifier simulates the OVER SQL statement. This is commonly used with Window frames for making a calculation from a subset of data to produce a result relevant to a single row. For a summary of all schema modifier transformations, see Table 4.4.
TABLE 4.4 Data flow schema modifiers
Modifier | Description |
Aggregate | Enables the usage of SUM, MIN, MAX, COUNT, etc. |
Derived Column | Creates new columns using the data flow expression language. |
External Call | Calls external endpoints, one row at a time. |
Pivot | An expansion of data from multiple rows and their single column to multiple columns. Groups the rows by a defined value using aggregation. |
Rank | Orders rows based on a sort condition. |
Select | Streams names and alias columns; reorders or drops columns. |
Surrogate Key | Adds a non‐business incrementing arbitrary key value. |
Unpivot | Transforms a row with multiple columns to multiple rows with a single column. |
Window | Uses window‐based aggregation of columns with a data stream. |
In Exercise 4.8 you used a Derived Column schema identifier, as shown in Figure 4.27.
FIGURE 4.27 Azure Synapse Analytics Develop hub, data flow Derived Column schema modifier
The Incoming Stream is the name of the previous source transformation. Notice the Open Expression Builder link. That link is used for creating, testing, and debugging the code snippet placed into the Expression text box. Click the Open Expression Builder link to view the capabilities on that blade (see Figure 4.28).
The Optimize tab provides the same capabilities as shown previously in Figure 4.26. In this case, however, the kind of data distribution is applied to the result of the additional column constructed by the execution of the code snippet in the Expression text box.