The data on a website might sometimes be presented in an inconvenient way.
You might want to extract the data on a website as an Excel spreadsheet.
You can scrape that data and save it to a CSV file (or Excel spreadsheet) for future review and use. This is just one simple example of what you can do with web scraping, but the general concept is always the same: find a site that has the information you need, scrape the content, and store it for later use.
This way, you can actually use the data and realize its full value.
In any case, web scraping tools can be incredibly helpful for turning a website into a simple spreadsheet. In this article, we’ll guide you on how to set up a free web scraper and how to quickly extract the data you need.
ParseHub: A Powerful and Free Web Scraper
To achieve our goal for this project, we will use ParseHub, a free and powerful web scraper that can turn any website into a handy spreadsheet or API for you to use.
Wondering what a web scraper is and how it works? Read our definitive guide on web scraping.
For this guide, we will only focus on the spreadsheet side of things.
So, before we get started, make sure to download and install ParseHub for free.
Example Web Scraping Project to Export Data to Excel
For the sake of example, let’s assume we own an imaginary company that sells napkins, paper plates, plastic utensils, straws, and other consumable restaurant items (All our items will be fully recyclable too since our imaginary company is ahead of the competition).
As a result, having an Excel spreadsheet with the contact information of every fast food restaurant in town would be incredibly valuable and a great way to build a leads database.
So, we will extract the data from a Yelp search result page into an Excel spreadsheet.
Getting Started
- Make sure you’ve downloaded ParseHub and have it up and running on your computer.
- Find the specific webpage(s) you’d like to scrape. In this example, we will use Yelp’s result page for Fast Food restaurants in Toronto.
Create a Project
- In ParseHub, click on “New Project” and enter the URL to scrape.
- Once submitted, the URL will load inside ParseHub and you will be able to start selecting the information you want to extract.
Identify and Select Data to Scrape
- Let’s start by selecting the business name of the first result on the page. Do this by clicking on it. It will then turn green.
- You will notice that all the business names on the page will turn yellow. Click on the next one to select all of them.
- You will notice that ParseHub is now set to extract the business name for every result on the page plus the URL it is linking to. All business names will now also be green.
- On the left sidebar, click on the selection you’ve just created and rename it to business.
- Then click on the PLUS(+) sign on the selection and choose relative select. This will allow us to extract more data, such as the address and phone number of each business.
- Using Relative Select, click on the first business name and then on the phone number next to it. Rename this Relative Select to phone.
- Using Relative Select again, do the same for the business address. Rename this Relative Select to address. We’ll do the same for the business category.
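For readers curious what this grouping looks like in code, here is a minimal Python sketch that does the same job as the main and relative selections: pairing each business name with its phone number and address. The markup below is invented for illustration; real Yelp HTML is far more complex and changes often.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified result-page markup -- NOT real Yelp HTML.
SNIPPET = """
<div>
  <div class="result">
    <a class="name" href="/biz/burger-spot">Burger Spot</a>
    <span class="phone">(416) 555-0101</span>
    <span class="address">1 Queen St W, Toronto</span>
  </div>
  <div class="result">
    <a class="name" href="/biz/taco-town">Taco Town</a>
    <span class="phone">(416) 555-0102</span>
    <span class="address">99 King St E, Toronto</span>
  </div>
</div>
"""

def extract_businesses(html: str) -> list[dict]:
    """Group each result's name, URL, phone, and address together,
    mirroring what a main selection plus relative selections produce."""
    root = ET.fromstring(html)
    rows = []
    for result in root.findall(".//div[@class='result']"):
        name_el = result.find("a[@class='name']")
        rows.append({
            "business": name_el.text,
            "business_url": name_el.get("href"),
            "phone": result.find("span[@class='phone']").text,
            "address": result.find("span[@class='address']").text,
        })
    return rows

print(extract_businesses(SNIPPET)[0]["business"])  # Burger Spot
```

The key idea is the same as ParseHub's relative select: fields are looked up *inside* each result container, so names, phones, and addresses never get mismatched across rows.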
Pagination
Now, you will notice that this method will only capture the first page of search results. We will now tell ParseHub to scrape the next 5 pages of results.
- Click on the PLUS(+) sign next to the “Select Page” item, choose the Select command and select the “Next” link at the bottom of the page you want to scrape.
- Rename this selection to Pagination.
- ParseHub will automatically pull the URL for this link into the spreadsheet. In this case, we will remove these URL’s since we do not need them. Click on the icon next to the selection name and delete the 2 extract commands.
- Now, click on the PLUS(+) sign next to your Pagination selection and use the click command.
- A window will pop up asking if this is a Next Page link. Click “Yes” and enter the number of times you’d like this cycle to repeat. For this example, we will do it 5 times. Then, click on Repeat Current Template.
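Conceptually, the pagination cycle you just configured is a simple loop: extract the current page, follow the “Next” link, and repeat up to a fixed limit. Here is a hedged Python sketch of that loop; the `PAGES` dictionary stands in for fetched HTML, since a real crawler would download each URL instead.

```python
# PAGES is a stand-in for the web: each "URL" maps to its extracted
# items and the URL behind its "Next" link (None on the last page).
PAGES = {
    "/search?page=1": {"items": ["Burger Spot", "Taco Town"], "next": "/search?page=2"},
    "/search?page=2": {"items": ["Pizza Palace"], "next": "/search?page=3"},
    "/search?page=3": {"items": ["Wrap World"], "next": None},
}

def scrape_all(start_url: str, max_pages: int = 5) -> list[str]:
    """Follow 'Next' links, extracting each page, up to max_pages pages."""
    results, url = [], start_url
    for _ in range(max_pages):
        page = PAGES.get(url)
        if page is None:              # dead link or fetch failure
            break
        results.extend(page["items"])  # extract this page's data
        url = page["next"]             # "click" the Next link
        if url is None:                # no more pages
            break
    return results

print(scrape_all("/search?page=1"))
```

Capping the loop at `max_pages` plays the same role as the repeat count you entered in ParseHub: it keeps the scrape from running forever on sites with very deep result sets.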
Scrape and Export Data
Now that you are all set up, it’s time to actually scrape the data and extract it.
- Click on the green Get Data button on the left sidebar
- Here you can either test your scrape run, schedule it for the future, or run it right away. In this case, we will run it right away, although we recommend always testing your scrapes before running them.
- Now ParseHub is off to scrape all the data you’ve selected. You can either wait on this screen or leave ParseHub; you will be notified once your scrape is complete. In this case, our scrape was completed in under 2 minutes!
- Once your data is ready to download, click on the CSV/Excel button. Now you can save and rename your file.
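If you later want to work with the exported file programmatically instead of in Excel, Python's standard `csv` module reads it directly. The column names below are assumed to match the selections made earlier (business, phone, address); your actual export headers may differ.

```python
import csv
import io

# A small stand-in for the exported file's contents. Note the quoted
# address field: CSV quoting protects commas inside values.
EXPORTED = """business,phone,address
Burger Spot,(416) 555-0101,"1 Queen St W, Toronto"
Taco Town,(416) 555-0102,"99 King St E, Toronto"
"""

def load_rows(csv_text: str) -> list[dict]:
    """Parse CSV text into a list of {column: value} dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))

rows = load_rows(EXPORTED)
print(rows[0]["address"])  # 1 Queen St W, Toronto
```

With a real file you would pass `open("export.csv", newline="")` to `csv.DictReader` instead of an in-memory string.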
Depending on the website you are scraping data from, your CSV file might not display correctly in Excel. In this case, apostrophes were not formatted correctly in our sheet.
If you run into these issues, you can quickly solve them by using the import feature on Excel.
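Another common fix, if you'd rather repair the file than re-import it: Excel often misreads plain UTF-8 CSVs (mangling characters like curly apostrophes), but detects the encoding correctly when the file starts with a UTF-8 byte-order mark. A small Python sketch of that repair:

```python
def add_bom(csv_bytes: bytes) -> bytes:
    """Re-encode UTF-8 CSV bytes with a BOM so Excel detects UTF-8."""
    text = csv_bytes.decode("utf-8")
    return text.encode("utf-8-sig")   # prepends the BOM b"\xef\xbb\xbf"

# A row containing a curly apostrophe, the kind Excel tends to mangle.
raw = "business\nTim’s Diner\n".encode("utf-8")
fixed = add_bom(raw)
print(fixed[:3])  # b'\xef\xbb\xbf'
```

For a file on disk, read it in binary mode, pass the bytes through `add_bom`, and write the result back out; the data itself is unchanged, only the three-byte BOM is added.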
Turning a Website into an Excel Spreadsheet
And that’s all there is to it.
You can now use the power of web scraping to collect info from any website just like we did in this example.
Will you use it to generate more business leads? Or maybe to scrape competitor pricing info? Or maybe you can use it to power up your next Fantasy Football bracket.