What is MetaRover?
MetaRover is a Google Sheets Add-on that parses URLs from XML sitemaps and scrapes HTML tags and their text content from a list of URLs, using Google Apps Script's UrlFetchApp service.
The Add-on aims to be useful for those who might traditionally use other web scrapers, but would like the convenience of scraping straight into a Google Sheet.
Capabilities:
- Parse URLs from sitemap(s) and dump results into new sheet(s)
- Parse HTML tags from individual URLs and dump results into new columns
- Extract custom regex matches from individual URLs and dump results into new columns
- Check redirect status of URLs and dump results into new columns
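
To make the first two capabilities more concrete, here is a minimal sketch of how sitemap URLs and tag text could be pulled out of fetched content with regular expressions. It is not MetaRover's actual code: the function names `extractSitemapUrls` and `extractTagText` are hypothetical, and the sketch assumes simple, well-formed input (no CDATA sections in `<loc>`, no nested tags of the same name). In Apps Script the input strings would typically come from `UrlFetchApp.fetch(url).getContentText()`.

```javascript
// Hypothetical helpers for illustration only; not taken from MetaRover.
// They operate on plain strings, so they run in Apps Script (V8) or Node.

/** Collect every <loc> value from a sitemap XML string. */
function extractSitemapUrls(xml) {
  const urls = [];
  const locPattern = /<loc>\s*([^<]+?)\s*<\/loc>/gi;
  let match;
  while ((match = locPattern.exec(xml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

/**
 * Return the text of each occurrence of a tag, with inner markup stripped.
 * Assumes tagName is a plain tag name such as 'title' or 'h1'.
 */
function extractTagText(html, tagName) {
  const tagPattern = new RegExp(
    '<' + tagName + '\\b[^>]*>([\\s\\S]*?)<\\/' + tagName + '>',
    'gi'
  );
  const results = [];
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    // Drop any nested tags and surrounding whitespace.
    results.push(match[1].replace(/<[^>]+>/g, '').trim());
  }
  return results;
}
```

For example, `extractTagText('<h1 class="x">Hello <em>world</em></h1>', 'h1')` returns `['Hello world']`. Regex-based extraction is a pragmatic fit here because no DOM is available in this setting, but it is less robust than a real HTML parser.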
Limitations:
- Does not crawl through pages iteratively parsing links
- Does not utilise a headless browser in the background and hence cannot execute JavaScript
- Is constrained by Google Apps Script Quotas
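
The JavaScript limitation is easiest to see with a small example. The sketch below uses a made-up page (`rawHtml`) whose heading is filled in by an inline script, and a hypothetical helper `firstH1Text`. Because a plain fetch returns the markup as served and never runs the script, the scraped heading comes back empty.

```javascript
// Made-up example page: the visible heading is only added by the script.
const rawHtml = [
  '<html><body>',
  '<h1 id="headline"></h1>',
  '<script>document.getElementById("headline").textContent = "Hello";</script>',
  '</body></html>'
].join('\n');

/** Return the inner text of the first <h1> in an HTML string, or null if absent. */
function firstH1Text(html) {
  const match = /<h1\b[^>]*>([\s\S]*?)<\/h1>/i.exec(html);
  return match ? match[1].trim() : null;
}

// The script above is never executed, so the heading is an empty string.
const scrapedHeadline = firstH1Text(rawHtml);
```

Pages that render their content client-side will therefore tend to yield empty or placeholder values.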