The following is a pretty common scenario that I’ve seen with flat files exported from legacy systems, but it also applies to multiple other scenarios. It uses the M language inside of Power BI Desktop / Power Query, but you can follow along without knowing that much about it.
The scenario: a table that contains multiple segments or partitions
Imagine the following table:
It has 3 columns, but within the rows of that table we noticed that there’s a repeating pattern. Every 3 rows there’s a first row that contains what it’s commonly called as a header row and then underneath we have 2 values.
In our specific case, the system will always export the file with a symmetrical number of rows for every “section” or “segment”.
At the end of the day, what we’re trying to reach is the following output:
The solution: Split functions to the rescue
You can download the sample file by clicking on the button below:
Don’t forget that we’ll start with a table that looks like this:
Step 1: Splitting your table into multiple smaller tables of the same size
Since our first step in our query is called “Source”, all wee need to do is apply the formula Table.Split( Source) as a new custom step (by clicking on the fx button in the formula bar) and that will look like this:
This essentially transforms our table into a List of tables where each value in that list is basically a table. The #3 that you see in that formula it’s because our pattern stated that every 3 rows there was a new “record” or “object”.
The reality is that in Power Query it’s always easier to target things at a much granular level rather than trying to apply a function that should take in consideration a large number of unknown variables. By dividing or splitting this into smaller pieces of tables, we’re able to create a much faster approach and way easier to understand.
Step 2: Using a custom Function against the tables in the list
I already have a function inside the file called myFx which will transform every table inside of that list into the table that we’re looking. To make that happen we create a new custom step and enter the formula List.Transform( Custom1, myFx) which will apply our custom function to every element in the list.
The result of that looks like this:
Step 3: Combine all tables
This is the last part and probably the easiest one to understand because it’s something that we’ve seen before in this blog. We have a list of tables and right now what we’re after is for a way to combine all of those tables into one single table.
The easiest way is by using the Table.Combine function as shown in the next image:
And just like that we have the output table that we’re looking for.
About other patterns for this scenario
You may have noticed that this scenario could also be solved with some patterns that Ken Puls and myself have published in our book ‘M is for Data Monkey’ where we use things like an Index Column, Module, Integer Divide and others that are out of the box buttons in the Power Query UI.
Those are still valid and are preferred for new to intermediate users, but if you’re going for the best performance possible, then using the Split functions would be the best way to go by far. Note that I’m saying Split functions because we just saw Table.Split, but there’s also List.Split.
I even jumped on a call with Ken about how to use List.Split and he was pretty excited about this function when I showed him what we could do with those, so expect a few more blog posts from myself and Ken about an update to our book patterns that are blazingly fast.
We will most likely be adding these new patterns to our Power Query Academy as videos and to the Power Query Recipes, so be on the lookout for those.