Automated Scraping

Our scraping tools, Solvent and Crowbar, let you work with web pages at the level of the DOM (for example, evaluating XPath expressions and retrieving HTML attributes) rather than at the level of a raw character stream. This higher level of abstraction is easier to work with. Solvent and Crowbar can also wait for a page's dynamic JavaScript to finish running, so you can scrape script-driven Web 2.0 sites as well as static web pages.
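To make the DOM-level idea concrete, here is a minimal sketch in Python. It does not use Solvent or Crowbar and is not their API; it only illustrates evaluating an XPath expression and reading HTML attributes from a parsed tree, using the standard library's `xml.etree.ElementTree`. The page fragment, the URLs, and the function names `extract_links` and `extract_external` are hypothetical, and the fragment is assumed to be well-formed XHTML (real-world HTML usually needs a more forgiving parser).

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
PAGE = """
<html>
  <body>
    <ul id="links">
      <li><a href="http://example.org/a" class="ext">First</a></li>
      <li><a href="http://example.org/b">Second</a></li>
    </ul>
  </body>
</html>
""".strip()


def extract_links(markup):
    """Return (text, href) pairs for every <a> element in the parsed tree."""
    root = ET.fromstring(markup)
    # ".//a" is an XPath expression: every <a> descendant of the root.
    return [(a.text, a.get("href")) for a in root.findall(".//a")]


def extract_external(markup):
    """Return the href of every <a> element whose class attribute is 'ext'."""
    root = ET.fromstring(markup)
    # An attribute predicate narrows the match without any string searching.
    return [a.get("href") for a in root.findall(".//a[@class='ext']")]


if __name__ == "__main__":
    for text, href in extract_links(PAGE):
        print(text, href)
    print(extract_external(PAGE))
```

Because the queries run against a tree rather than a character stream, they keep working if whitespace or attribute order in the markup changes. Note that this sketch parses static markup only; it does not run JavaScript, which is the part a browser-based tool handles.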

Their DOM scraping has come a long way.
