I am migrating an Alteryx workflow that contains an iterative macro to a Dataflow Gen2 equivalent, using a Delta Lake table from the Lakehouse as the data source.
This function carries out the required transformations on the table; however, the table's structure (column names, number of columns, column order, and row count) remains unchanged. The transformation only affects the underlying data within the table. Here is the function:
Suppose the name of this function is fxMyFunc
(tbl as table) =>
let
    Source = tbl,
    AddedCustom = Table.TransformColumnTypes(
        Table.AddColumn(
            Source,
            "On-Hand_Inventory_Quantity_Inter",
            each
                if [RecordID] = 1 then
                    [#"On-Hand_Inventory_Quantity"]
                else
                    [Transaction_Quantity__NOS_] + Source[#"On-Hand_Inventory_Quantity"]{[RecordID] - 2}
        ),
        {{"On-Hand_Inventory_Quantity_Inter", type number}}
    ),
    RemovedCols = Table.RemoveColumns(AddedCustom, {"On-Hand_Inventory_Quantity"}),
    AddedCustom1 = Table.TransformColumnTypes(
        Table.AddColumn(
            RemovedCols,
            "On-Hand_Inventory_Quantity",
            each
                if [RecordID] = 1 then
                    [#"On-Hand_Inventory_Quantity_Inter"]
                else
                    List.Sum(
                        List.Range(RemovedCols[#"On-Hand_Inventory_Quantity_Inter"], 1, [RecordID] - 1)
                    )
        ),
        {{"On-Hand_Inventory_Quantity", type number}}
    ),
    RemovedCols1 = Table.RemoveColumns(
        AddedCustom1,
        {"On-Hand_Inventory_Quantity_Inter", "Inbound/Outbound_Value_Adjustment"}
    ),
    AddedCustom2 = Table.TransformColumnTypes(
        Table.AddColumn(
            RemovedCols1,
            "Inbound/Outbound_Value_Adjustment",
            each
                if [Type_of_Entry__RECPT_SUB_TYP_CD_] = "TOUT" then
                    RemovedCols1[Unit_Cost]{[RecordID] - 2} * [Transaction_Quantity__NOS_]
                else
                    [#"Inbound/Outbound_Value"]
        ),
        {{"Inbound/Outbound_Value_Adjustment", type number}}
    ),
    AddedCustom3 = Table.TransformColumnTypes(
        Table.AddColumn(
            AddedCustom2,
            "On-Hand_Inventory_Value_Inter",
            each
                if [RecordID] = 1 then
                    [#"On-Hand_Inventory_Value"]
                else
                    [#"Inbound/Outbound_Value_Adjustment"]
                        + AddedCustom2[#"On-Hand_Inventory_Value"]{[RecordID] - 2}
        ),
        {{"On-Hand_Inventory_Value_Inter", type number}}
    ),
    RemovedCols2 = Table.RemoveColumns(AddedCustom3, {"On-Hand_Inventory_Value"}),
    #"Added custom 4" = Table.TransformColumnTypes(
        Table.AddColumn(
            RemovedCols2,
            "On-Hand_Inventory_Value",
            each
                if [RecordID] = 1 then
                    [#"On-Hand_Inventory_Value_Inter"]
                else
                    List.Sum(List.Range(RemovedCols2[#"On-Hand_Inventory_Value_Inter"], 1, [RecordID] - 1))
        ),
        {{"On-Hand_Inventory_Value", type number}}
    ),
    #"Removed columns 3" = Table.RemoveColumns(
        #"Added custom 4",
        {"On-Hand_Inventory_Value_Inter", "Unit_Cost"}
    ),
    #"Added custom" = Table.TransformColumnTypes(
        Table.AddColumn(
            #"Removed columns 3",
            "Unit_Cost",
            each
                if [#"On-Hand_Inventory_Quantity"] = 0 then
                    0
                else
                    [#"On-Hand_Inventory_Value"] / [#"On-Hand_Inventory_Quantity"]
        ),
        {{"Unit_Cost", type number}}
    ),
    #"Rounded off" = Table.TransformColumns(
        #"Added custom",
        {{"Unit_Cost", each Number.Round(_, 2), type number}}
    ),
    #"Reordered columns" = Table.ReorderColumns(
        #"Rounded off",
        {
            "RecordID",
            "Posted_Date",
            "Type_of_Entry__RECPT_SUB_TYP_CD_",
            "Transaction_Quantity__NOS_",
            "Inbound/Outbound_Value",
            "Inbound/Outbound_Value_Adjustment",
            "Unit_Cost",
            "On-Hand_Inventory_Quantity",
            "On-Hand_Inventory_Value"
        }
    )
in
    #"Reordered columns"
The iterative macro in Alteryx can repeat a transformation 57 times, corresponding to the maximum value of the [RecordID] column, and it does this within seconds.
Similarly, I attempted to replicate this process using the List.Generate() function in Power Query. I'm performing the transformation 5 times for testing purposes, although I need to execute it 57 times. Below is the M query for this operation:
Suppose the name of this function is OutputQuery
let
    Source = List.Generate(
        () => [x = 1, y = fxMyFunc(BaseTable)],
        each [x] <= 5,
        each [x = [x] + 1, y = fxMyFunc([y])],
        each [y]
    )
in
    Source
Accessing Source{0} in the Navigation step takes approximately 17 to 18 seconds. However, accessing Source{1} can take up to 10 minutes, after which Dataflow Gen2 indicates that the evaluation has stopped. It is unclear why there is such a significant time difference just to access the subsequent step, Source{1}.
Here is the query for BaseTable:
let
    Source = Lakehouse.Contents(null){[workspaceId = "xxxxxx-xxx-xxxx-xxx"]}[Data]{[lakehouseId = "xxxx-xxxx-xxx-xxx-xx-xx"]}[Data],
    Navigation = Source{[Id = "MyTableName", ItemKind = "Table"]}[Data]
in
    Navigation
To provide context regarding the dataset, it comprises only 57 rows and 9 columns. You can refer to fxMyFunc for the specific column names and the transformations being applied.
I have attempted recursion in addition to List.Generate(), but it also proved unhelpful. Could you assist me with this? I am uncertain whether I am making a mistake or facing optimization issues. Your guidance would be appreciated.
Thanks,
Udit
Thank you, @lbendlin, for your response to this topic.
I understand the pitfalls of recursion. Initially, I attempted it, but soon realized it was not efficient for even 57 rows of data, which led me to choose List.Generate(). Although I have used List.Buffer() in some instances, I never fully explored it because there was no necessity. However, it appears this is the situation where it might be required.
Here is a sample dataset with 57 rows for your reference (this is the Delta Lake table, the BaseTable):
I would greatly value your assistance.
Thanks,
Udit
Table.Buffer seems to work just fine
Recursion carries (literally carries) a penalty of having to lug your results around. Try to avoid if possible.
Read about the buffering functions like Table.Buffer and List.Buffer and evaluate if you can apply them to your scenario.
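Applied to the OutputQuery above, that suggestion might look like the following. This is a minimal sketch (not tested against the actual Lakehouse source) assuming the fxMyFunc and BaseTable names from this thread; the idea is that M evaluates lazily, so without buffering each [y] drags along the entire unevaluated chain of all previous passes, and reading Source{1} re-runs everything from the source onward. Table.Buffer materializes each pass in memory so the next pass starts from a concrete table:

```
let
    // Materialize the lakehouse table once, so it is not re-queried per iteration
    BufferedBase = Table.Buffer(BaseTable),
    Source = List.Generate(
        // Buffer the result of every pass so the next pass reads an
        // in-memory table instead of re-evaluating the whole chain
        () => [x = 1, y = Table.Buffer(fxMyFunc(BufferedBase))],
        each [x] <= 57,
        each [x = [x] + 1, y = Table.Buffer(fxMyFunc([y]))],
        each [y]
    )
in
    // Only the final iteration is needed; the intermediate tables
    // exist solely to feed the next pass
    List.Last(Source)
```

If only the 57th result matters, returning List.Last(Source) also avoids paying navigation cost for each intermediate element.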
Can you provide a sample of the Delta Lake table?