Read our blogs, tips and tutorials
Try our exercises or test your skills
Watch our tutorial videos or shorts
Take a self-paced course
Read our recent newsletters
License our courseware
Book expert consultancy
Buy our publications
Get help in using our site
537 attributed reviews in the last 3 years
Refreshingly small course sizes
Outstandingly good courseware
Whizzy online classrooms
Wise Owl trainers only (no freelancers)
Almost no cancellations
We have genuine integrity
We invoice after training
Review 30+ years of Wise Owl
View our top 100 clients
Search our website
We also send out useful tips in a monthly email newsletter ...
Importing web tables by example in Power BI |
---|
Not every set of data you want to import from a web page is considerate enough to exist as an HTML table. Fortunately, you can give examples of the information you want to import to have Power BI infer a table from the data on the page. |
In this blog
When you import data from a website in Power BI, the wizard automatically detects HTML tables and allows you to pick them from a list. Anyone who's attended a Wise Owl Power BI course will be familiar with the table of families we import to build a data model.
The HTML view of this page shows that the list of families is stored in an HTML table, as indicated by the <table> tag.
When we import data from the page, we can simply pick the table from a list.
It's easy to select the table of families to import.
Unfortunately, not every data set that you want to import from a web page is considerate enough to exist in an HTML table! Fortunately, Power BI lets you suggest your own tables and this blog shows you how.
Although this blog uses Power BI Desktop, you can use the same technique when importing data from a website in Excel.
Let's say we wanted to import a list of topics and subtopics from this page on the Wise Owl website which lists the contents of our Introduction to Power BI training course.
Some of the topics covered on the course.
Although the topics listed on the page are in a nice, structured layout, they aren't in an HTML table. You can see this by inspecting an item on the page to see its underlying HTML code. To do this, right-click on an item on the page and choose to inspect it. Here's what that looks like in Google Chrome:
I've right-clicked on the Basic Reports topic name and selected Inspect from the menu.
This opens a panel showing the HTML code of the page, revealing that the item is not an HTML table.
There's definitely a structure here, but it's not an HTML table.
Although the data is in structured layout, because it isn't in an HTML table, Power BI won't recognise it by default.
To help Power BI recognise the structure as a table, start importing from a web page in the usual way.
Choose to Get data from the Web.
Enter the URL of the page containing the information you want to import.
Enter the URL, click OK then provide any credentials you need to access the page.
You'll then be presented with a list of items you could import. The formal HTML Tables will be listed in their own section. Power BI will also try to detect other structures which might be considered tables and lists these in the Suggested Tables section.
You can click on a suggested table to see what it contains.
In this case, Power BI fails to find the topic list so we'll need to tell it what we want. To do this, click the Add Table Using Examples button at the bottom of the dialog box.
Click here to start suggesting your own table.
This opens a new dialog box with a preview of the web page and an area for you to start writing your suggestions.
You can begin typing your suggestions
Start typing your first suggestion and pick the item which matches it from the list which appears.
I've entered part of the name of the first main topic and can pick the matching item from the list.
If you make a mistake, you can double-click a cell to edit it.
Continue suggesting values in the same way.
I'm entering the name of the second main topic on the next row.
Eventually, Power BI will work out what it is you're trying to return and will infer the remaining values from your suggestion.
After entering the first two values, Power BI infers the rest.
At this point, it's worth renaming the column. You can do this by double-clicking the column header and entering a new name.
Enter the new name then press Enter.
You can add a new column by clicking the + button.
Click here to add a new column.
Now start adding suggestions to the column. In the example below, I've entered the name of the first sub-topic:
Power BI has detected some of the remaining values, but not all of them.
You can edit any missing value by selecting the cell and typing the suggestion.
Enter the value you think should appear in the cell.
Continue doing this until you have a table containing all the data you want. You can then click OK.
Click OK to return to the list of tables.
You can now either load the data into the data model, or transform the data in Power Query if you need to tweak it some more.
The table is ready to be loaded into the data model.
If you need to make changes to your suggestions, you can edit the underlying query in Power Query. To do this,
In the APPLIED STEPS list on the right of the Power Query screen, click the "gear" icon next to the Extracted Table from Html step.
Click this icon to open the dialog box and edit your suggestions.
You can now edit your suggestions as previously. When you've finished, you can apply the changes you've made by selecting Home | Close & Apply from the Power Query ribbon.
If you're interested in seeing how Power BI interprets your suggestions, you can look at the code generated in Power Query.
Select this step on the right of the screen.
Look in the formula bar just below the Power Query ribbon to see the code.
The code Power Query uses to extract data from the web page.
Power Query uses a series of CSS selectors to identify the different elements to extract from the web page.
Some other pages relevant to the above blog include:
Kingsmoor House
Railway Street
GLOSSOP
SK13 2AA
Landmark Offices
99 Bishopsgate
LONDON
EC2M 3XD
Holiday Inn
25 Aytoun Street
MANCHESTER
M1 3AE
© Wise Owl Business Solutions Ltd 2024. All Rights Reserved.