IMPORTANT: You can read the official documentation that Microsoft has on this topic from the following link (url).
I’ve previously done a series on Merge / JOIN operations (First Part here) and it’s now time to do one on Combine / Append operations.
so…How do you combine / append / stack tables with Power BI / Power Query?
There are multiple ways to accomplish this, but we’re going to start with the basics.
Scenario: Combine Data from 2 CSV files
Imagine that we have 2 CSV files.
- Sales data for the year 2017
- Sales data for the year 2018
In reality, our data sources can be any. It can be 1 CSV and 1 Excel file, maybe 1 table from a database and another one from an API.
To Power BI / Power Query, it’s a bit indifferent as they’ll go through the same process regardless of their data source, so what I’m about to show you applies to every single data source possible within Power BI / Power Query.
We want to append both of these into just 1 table. How do we do that?
Connecting to the Data
This is a pretty straightforward process where we can just select our data source from the “Get Data” window and go through the import process which will display a simple browse window so you can select your file and then give you a preview of your data like this:
To finish things off, you can just click on Edit so we can go straight to the Power Query Editor.
This is pretty simple and in our case our csv files have the correct format thus not requiring any transformation steps to be done to it other than the ones that were automatically done by Power BI / Power Query.
We repeat this process for the 2018 file and then we’ll end up with 2 queries:
Those are the queries or tables that we want to stack on top of each other. How do we do that?
Combining / Stacking / Appending Tables
This is truly the easiest part, now all you need to do is find the button that reads Append Queries and then a new window will appear where you can combine all the queries that you want.
It doesn’t need to be just 2 tables, it can be as many tables as you want, which means that next year when I get the data for 2019, I can just create a new query and add this to this list of tables that need to be combined and I’ll be set and done.
The output of this will be a table with all of the columns from all the tables that we combined and all of the data from both.
Considerations
Data Load
When you first create the queries, they’ll be, by default, set to load themselves to your Data Model or created as new tables inside of Excel if you’re using Power Query for Excel.
This might create some data duplication which is completely unnecessary. To stop this from happening, you can change the behavior of the Sales 2017 and Sales 2018 queries to be Connection Only.
You can simply right click the query and disable the Load by unchecking the Enable Load option.
Data Privacy and Query Folding
When combining your data it follows the sample guidelines for Data Privacy (here) and Query Folding (here) as showcased in the previously written articles. This means that you need to be aware of how these concepts play out in your specific scenario, so I highly recommend that you read the articles in this topic in case you’re hitting any errors or performance issues.
The Table.Combine Function
When you go through the process described in this post, Power Query will end up using a function called the Table.Combine function.
When Power Query uses this function, it resembles the same experience that happens when you do a SQL UNION ALL. This means that Power Query will try to find matching columns from all of the tables and stack data accordingly, but if one table has columns that the other doesn’t, then it’ll fill any empty spaces with null values.
This function is BY FAR the best method when it comes to appending data as it’s a function that was designed to make this happen.
Hello Miguel,
Thanks for this tutorial. I’ll take a chance sharing with you a problem with a later version of Power BI (Version: 2.75.5649.861 64-bit (November 2019)).
I follow you tutorial religiously but I do not get an Append1 query in the query pane. It is simply not there. I’m a bit new to all this and I’m learning for a pastime. I tried with other online Tuts to no avail.
If you’ve heard of this oddity maybe you will want to quickly share with me. I fact I’m a member of Power Query Academy and I love it.
Thanks
Daniel
Hey Daniel,
Would you mind posting your full scenario in the academy forum? We have quite a few videos in there, but I’m not sure what could be happening. Perhaps you’re doing the append within an existing query instead of append as a new query ?
Thanks for your reply. On the Microsoft Power BI Community some friends explained that I should have used “Append Query as New”.
I will use your suggestion since I subscribed to to the Power Query complete course.
Thanks for getting back to me…
Daniel
Thanks for the article Miguel. Very, very, helpful. I do have a question. When I use append with two excel spreadsheets for example, I see that it doubles the data for both spreadsheet. removing duplicates doesn’t work. Any idea how to append two different spreadsheets with the unique values of each spreadsheet?
Thanks!
hey!
I’m not sure I’m following. The Append operation should basically stack the rows from both data sources into just one “main” table.
I’d also recommend looking at the other articles in this series, specially this one:
https://www.thepoweruser.com/2019/04/23/combine-or-append-data-combining-excel-files/
But if you’re trying to implement something quite specific for your case, I’d highly recommend that you publish that dataset and scenario on the official Power Query forum:
https://social.technet.microsoft.com/Forums/en-US/home?forum=powerquery
Best!
I have two data environments that have the same data layouts however, the columns start with either EDI or MI then the rest is the same. Is there a way to do this to allow these columns to match?
You could demote the headers of the tables and then match by position, but then you’ll have to figure out what set of header names to use
Hello Miguel, great tutorial, i just have a question, when i try to append more than 2 tables the option “three or more tables” is not showed, i have excel 2016, do i have to abilitate it or something
Hey! Unfortunately I don’t have an excel 2016 with me so I wouldn’t be able to tell you. Have you tried perhaps updating your office ?
Thank you!Great article. I still have the problem of having `null` values after appending the queries. How can I change that function?
It depends! The function itself is not the one that creates the nulls, so there must be something in your source data or in a query that has those nulls.
Hi Dears,
I want to append 2 or more excel sheets in Power BI that 5 columns of them are similar but the other ones are not the same.
would you please help me how can I do it?
after appending, the result is not proper and correct.
thanks in advanced.
Hey! It’s pretty difficult to give you some guidance without understanding your dataset or what your output query should look like. I highly recommend that you post your full scenario with every single file detail and screenshot on the official power query forum:
https://social.technet.microsoft.com/Forums/en-US/home?forum=powerquery
Hi Miguel,
great tutorial! I have a monthly htm file, which always has at least three tables in it, but the number is different for every month. I always need to append the tables starting from table3 to n, and it works when I do it manually, but is there a way to tell the query “take the tables 3 to n” so that it works automatically each month?
I appreciate your help, thank you!
Hey!
When you connect to an Excel workbook you’ll see a list of all the objects inside that workbook. You can actually apply a filter to the “Name” field inside the workbook to apply that sort of condition. I think the name column is set to text, so you’ll need to change it to Numeric so you can apply a numeric filter and do something like “[Name] >= 3”
Hope this helps!
Hi, i have monthly sales data. Let’s say i start loading Jan data & do some modification to it. Add column, calculated field etc. Then, i want to auto load Feb data using Get Data from Folder. And append it with Jan data. I hit error. Is it not possible to append new data into old data that has been modified in query earlier? So i have to redo all the modification to my new data then only i can append it to my old data? Is this the case? Thanks..
Hey Hani!
I’m not sure I follow. If you use the Append feature then it should create a table with the data from both of your queries (Jan and Feb Data where one came from the Folder connector and ther other from the Excel connector). Again, I’m not sure I follow exactly what you mean by “old” data or what error message you might be getting when running your query.
I’d recommend that you publish your scenario with all details and copy of your queries if possible to the official Power BI Community forum (assuming that you’re using Power BI):
https://community.powerbi.com/t5/Desktop/bd-p/power-bi-designer
hi
how could we append to queries without same columns .
we should create sample columns in each queries to unified them ?
it depends on what your end table should be looking like. If you just want to combine tables based on the ordinal position of the columns and not really on the name of the columns, then demote the column headers from both tables before doing the append operation. Demoting the headers will change your columns to be “Column1”, “Column2”, etc etc which would make it easy to append based on position of the columns and not the names of the columns.
Hope this helps!
To Power BI / Power Query, it’s a bit indifferent as they’ll go through the same process regardless of their data source, so what I’m about to show you applies to every single data source possible within Power BI / Power Query.
I would like to add few things in the post for better aware about power bi like In short, when you have one or more columns that you’d like to add to another query, you merge the queries. When you have additional rows of data that you’d like to add to an existing query, you append the query.