On Thursday 19 February we were lucky enough to have Alex Gimson, the Community Evangelist at Import.io, host a webinar in the Tableau offices in London. Alex demonstrated each of import.io’s four data extraction tools (for an overview of each tool, check out Alex’s guest blog post for us, “Instantly Turn Web Pages into Data with Import.io”). He also answered some of your questions about data scraping with import.io.
Watch Alex’s webinar in full below. If you want to follow along with his examples, you can find links to all the websites here.
After his fantastic demonstration of the power of import.io, Alex had time to answer some questions.
Can you combine multiple APIs into a single dataset?
Yes you can! When you are logged into your account, click on ‘New’ -> ‘Data Set’ to get a blank dataset. In this new dataset you can add lots and lots of APIs by clicking the ‘+ Add Data’ button and selecting as many of your existing APIs as you want.
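If you have already exported the rows from several APIs and want to see what a combined dataset might look like outside the app, here is a minimal Python sketch. It is not how import.io builds datasets internally; the function name `combine_sources`, the `source` column and the sample rows are all made up for illustration. It simply stacks the rows, fills any missing columns with an empty string and records where each row came from.

```python
from typing import Dict, List

Row = Dict[str, str]


def combine_sources(sources: Dict[str, List[Row]]) -> List[Row]:
    """Stack rows from several sources, tagging each row with its origin."""
    # Collect every column name seen, keeping first-seen order.
    columns: List[str] = []
    for rows in sources.values():
        for row in rows:
            for key in row:
                if key not in columns:
                    columns.append(key)

    combined: List[Row] = []
    for name, rows in sources.items():
        for row in rows:
            # Columns a source does not provide are left as empty strings.
            merged = {col: row.get(col, "") for col in columns}
            merged["source"] = name  # hypothetical extra column
            combined.append(merged)
    return combined


# Hypothetical rows, standing in for the output of two separate APIs.
shop_a = [{"title": "Kettle", "price": "19.99"}]
shop_b = [{"title": "Toaster", "price": "24.50", "rating": "4"}]

dataset = combine_sources({"shop_a": shop_a, "shop_b": shop_b})
for row in dataset:
    print(row)
```

Keeping a `source` column is a design choice that makes it easy to filter or colour by origin once the data is in a viz.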
If your web page has hyperlinks, is there any way to get the hyperlink and the text that is displayed on the web page in two different columns?
When import.io extracts a hyperlink, it also extracts the text at the same time. This means that you simply create two columns from the same hyperlink when you are training the API, and change the second column to display ‘text’.
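To make the idea concrete, here is a small Python sketch of the same principle: one hyperlink yields two values, the URL and the visible text. It uses only the standard library’s `html.parser` and is not import.io’s own extraction code; the class name `LinkExtractor` and the sample markup are assumptions for illustration.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs, one pair per <a> element."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None  # None means "not inside a link"
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # An <a> without an href is recorded with an empty URL.
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical page fragment.
page = (
    '<p><a href="https://example.com/a">First</a> and '
    '<a href="/b"> Second <b>link</b></a></p>'
)

parser = LinkExtractor()
parser.feed(page)
parser.close()

for url, text in parser.links:
    print(url, "|", text)
```

Each tuple in `parser.links` corresponds to one row with two columns: the link itself and the text shown on the page.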
How do you extract data from multiple pages when they are formatted a little differently?
For this you will need to use the XPath feature in the advanced settings. XPaths are what the import.io app uses to pick out data from a webpage (when you highlight a part of the webpage, import.io is actually looking at the XPath underneath).
In the advanced settings, you can set the XPath you need to get exactly the data you want. There is a great webinar all about import.io’s advanced features, and how to use them, here.
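If XPath is new to you, the following Python sketch shows the general idea of handling pages that are laid out slightly differently: try one XPath, and fall back to another if it finds nothing. This is only an illustration, not the import.io app’s behaviour. The helper `first_match`, the sample pages and the XPath expressions are all hypothetical, and the standard library’s `xml.etree.ElementTree` supports only a subset of XPath and needs well-formed markup, so real-world HTML would probably call for a dedicated HTML parser.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Sequence


def first_match(markup: str, xpaths: Sequence[str]) -> Optional[str]:
    """Return the text of the first element matched by any XPath, in order."""
    root = ET.fromstring(markup)
    for xpath in xpaths:
        node = root.find(xpath)
        if node is not None and node.text:
            return node.text.strip()
    return None  # none of the candidate XPaths matched


# Two hypothetical pages that hold the same value in different places.
page_a = "<html><body><div class='price'>10.00</div></body></html>"
page_b = "<html><body><p><span class='cost'> 12.50 </span></p></body></html>"

# Candidate XPaths, tried in order until one matches.
PRICE_XPATHS = [".//div[@class='price']", ".//span[@class='cost']"]

for page in (page_a, page_b):
    print(first_match(page, PRICE_XPATHS))
```

Listing the candidate XPaths in priority order keeps the extraction logic in one place as more page variants turn up.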
What are some other use cases that companies use import.io for?
Alex’s personal favourite use case of import.io is ThatGift, which uses a mix of connectors to query multiple webpages from a single query box.
Another great use that Alex is a fan of is the mobile phone app NightCapp, where import.io is used to find the clubs with the latest closing hours. Genius!
Want to give import.io a go to find some #DataInTheWild for your next viz? Visit the site Import.io and get started with a click!