With Power Query inside of Excel or Power BI, you can replace values by using the ‘Replace values’ button, but there are other approaches when you need to replace multiples values at the same time like the following ones:
The reality is that there are multiple ways to do it, but it often requires creating another Translation table in order to perform the needed replacement or translation.
Replacing Values automatically with no Translation Table
This can often be optimized with the functions that Power Query already has, but we need to find out what our text string is or what it looks like. In this blog post I’ll tackle a specific scenario where you get text strings that look like the ones in the following table
Text |
<p>I will display €</p> |
<p>I will display €</p> |
<p>I will display €</p> |
<p>* These are specific codes used in HTML (this is just a sample, but things look great with "Power Query")</p> |
and the output table should look like the following:
fnDecodeHTML |
I will display € |
I will display € |
I will display € |
* These are specific codes used in HTML (this is just a sample, but things look great with “Power Query”) |
What might have noticed is that these are text strings written in HTML code. Instead of creating a custom translation table for all the special characters and symbols of the HTML language/code, we can use a native feature/function that Power Query has to simply translate it automatically. Let’s find out how!
Click in the button below so you can download the samples used in this article:
A native M function available for decoding HTML – Web.Page
Power Query has multiple functions that let you connect to different file structures and decode their contents like XML, JSON, Excel files and even HTML files. For our specific page, we need to use the function that extract the contents and decode that HTML file. The function that we need is called Web.Page:
This function can use a text string of HTML and translate it into a simple text string that. This is exactly what we need!
One thing to take in consideration about this function is that its output is more a hierarchy, so in order to get what we need we’ll have to navigate through that hierarchy.
Creating a Custom Function based on the Native Function Web.Page (saving time!)
To make things simpler, and just to save us some time since we’ll be re-using that function multiple times, I was able to follow the proper hierarchy navigation and create a custom function out of it. The code for that function is the following:
(Text as any) => let Source = Text, Custom1 = Web.Page(Source), Data = Custom1{0}[Data], Children = Data{0}[Children], Children1 = Children{1}[Children], Children2 = Children1{0}[Children], Text1 = Children2{0}[Text] in Text1
which you can copy/paste to a blank query through the Advanced Editor and name that function fnDecodeHTML.
You can later Invoke that function using the Invoke button in the Add Column tab and the result of that operation will give you this:
Conclusion
Power Query can handle A LOT of replace values scenarios, but it is always a better idea to understand first what values you’re trying to find, how they are being created and if there’s any current Data Source or decoding functions available in Power Query that can help us. In this specific case, Web.Page is a function that comes into play and helps us by not having to create a custom translation table which would’ve taken hours or even days to properly research and come up with.
To learn more cool techniques and more advanced scenarios with the M language, go ahead and enroll to the FREE trial of the Power Query Academy where I talk more in detail about these type of techniques.
Getting all the HTML codes into a table in Power Query is not that much of a problem. You can even directly load them from a web page into a table. But: If you want to transform a lot of encoded characters, no just a few small demo fields, then looking up each occurence of an encoded character in a separate table is a real performance issue! So it’s good to have the Web.Page function.
That worked great except anything which had quotes “word” has returned extra characters. Can you tell me how to convert to quotes in your query above?
how it returns: mounts of the “keys†share
should be: mounts of the “keys” share
thank you for your help! Very simple and great solutino
Hey! Unfortunately I don’t think that the code is incorrect. It appears that the text string in question uses something that I’ve just found through google called “Mojibake”.
The function from this article uses the Web.Page function which translates the strings automatically. Since it couldn’t find that one, I’d say that you do a replace values operation as showcased here:
https://docs.microsoft.com/en-us/power-query/replace-values
Thank you, Great idea!