Intuitive Powerful Visual Web Scraper – WebHarvy can automatically scrape Text, Images, URLs & Emails from websites, and save the scraped content in various formats.
WebHarvy Web Scraper can be used to scrape data from www.yellowpages.com. Data fields such as name, address, phone number and website URL can be selected for extraction just by clicking on them!
- Point and Click Interface
WebHarvy is a visual web scraper. There is no need to write any scripts or code to scrape data. You use WebHarvy’s built-in browser to navigate web pages and select the data to be scraped with mouse clicks. It is that easy!
- Automatic Pattern Detection
WebHarvy automatically identifies patterns of data occurring in web pages. So if you need to scrape a list of items (name, address, email, price etc.) from a web page, you need not do any additional configuration. If data repeats, WebHarvy will scrape it automatically.
- Save to File or Database
You can save the data extracted from web pages in a variety of formats. The current version of WebHarvy Web Scraper allows you to export the scraped data as an XML, CSV, JSON or TSV file. You can also export the scraped data to an SQL database.
- Scrape from Multiple Pages
Websites often display data such as product listings across multiple pages. WebHarvy can automatically crawl and extract data from multiple pages. Just point out the ‘link to the next page’ and WebHarvy Web Scraper will automatically scrape data from all pages.
- Keyword-based Scraping
Keyword-based scraping allows you to capture data from search results pages for a list of input keywords. The configuration which you create will be automatically repeated for all given input keywords while mining data. Any number of input keywords can be specified.
- Proxy Servers
To scrape anonymously and to prevent the web scraping software from being blocked by web servers, you have the option to access target websites via proxy servers. Either a single proxy server address or a list of proxy server addresses may be used.
- Category Scraping
WebHarvy Web Scraper allows you to scrape data from a list of links which lead to similar pages within a website. This allows you to scrape categories or subsections within websites using a single configuration.
- Regular Expressions
WebHarvy allows you to apply Regular Expressions (RegEx) to the text or HTML source of web pages and scrape the matching portion. This powerful technique offers you more flexibility while scraping data.
- Technical Support
Once you purchase WebHarvy Web Scraper you will receive free updates and free support from us for a period of 1 year from the date of purchase. Bug fixes are free for life.
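To give a feel for what the Regular Expressions feature listed above does, here is a minimal sketch in Python. It is not WebHarvy's own code or syntax; the sample text, the pattern and the helper name `extract_first` are all made up for illustration. The idea is simply that a pattern with a capture group keeps only the matching portion of a larger block of text.

```python
import re

# Hypothetical page text; not taken from any real website.
page_text = (
    "Acme Plumbing, 12 Main St, Phone: (555) 010-2345, "
    "Email: info@acme.example"
)

# One capture group: keep only the number that follows "Phone:".
phone_pattern = re.compile(r"Phone:\s*(\(\d{3}\)\s*\d{3}-\d{4})")


def extract_first(pattern, text):
    """Return the first captured group, or None when nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


phone = extract_first(phone_pattern, page_text)
print(phone)
```

Returning `None` for a non-matching page keeps the missing value visible in the output instead of silently storing unrelated text.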
WebHarvy 7.3 – Keywords via Input-Text, Miner Options saved in Configuration etc.
Support for adding keywords via the ‘Input-Text’ option
On some websites, search is implemented in such a way that the keyword the user enters does not appear in the URL or POST data of the search results page. In these cases, if you use the ‘Input Text’ capture window option to enter the keyword in the search box and then perform the search (during configuration), keywords can be added to the configuration later using the ‘Add Keywords’ functionality.
Miner options saved in the configuration file
The following miner options are now saved in the configuration file, so that each configuration can have its own specific values for these settings. Previously, global settings were used for all configurations.
- Advanced Miner Options
- Page Load Timeout
- Script Load Wait Time
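The idea of per-configuration settings can be pictured as a small merge step. The sketch below is only an illustration in Python: the option names, the default values and the fallback to global defaults for values a configuration does not set are all assumptions, not WebHarvy's actual file format or behaviour.

```python
# Hypothetical global defaults; names and values are illustrative only.
GLOBAL_DEFAULTS = {
    "page_load_timeout_s": 30,
    "script_load_wait_s": 2,
}


def effective_options(config_options, defaults=GLOBAL_DEFAULTS):
    """Return the defaults overridden by the values stored in one configuration."""
    merged = dict(defaults)        # copy, so the shared defaults stay untouched
    merged.update(config_options)  # per-configuration values take precedence
    return merged


# A configuration for a slow website stores its own, longer timeout.
slow_site = effective_options({"page_load_timeout_s": 90})
print(slow_site)
```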
Category tagging is now done using the full category path for multi-level category scraping
For multi-level category scraping, the category tagging column is now filled with the full category path (example: main category, sub category 1, sub category 2, final category). Previously, the category column was filled with the final page URL instead of category names.
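As a rough illustration of full-path category tagging, the Python sketch below joins the category names from the top level down to the final level into a single column value. The function name, the row layout and the comma separator are assumptions made for this example; the exact delimiter WebHarvy writes is not stated here.

```python
def category_tag(path_levels, separator=", "):
    """Join category names, top level first, skipping empty entries."""
    cleaned = [level.strip() for level in path_levels]
    return separator.join(level for level in cleaned if level)


# Hypothetical scraped row tagged with the full path, not the page URL.
row = {
    "name": "Example item",
    "category": category_tag(
        ["main category", "sub category 1", "sub category 2", "final category"]
    ),
}
print(row["category"])
```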
Updated Browser
The Chromium browser that WebHarvy uses internally has been updated to the latest version. This resolves issues with Cloudflare protection reported on some websites.