Great news from Google: there is now an official headless Chrome library called Puppeteer. In the first headless Chrome blog post, we used the Chrome DevTools Protocol (CDP) interface library, which offers fairly low-level interaction with Chrome. In this post, we go through some of the pros and cons of using Puppeteer.
Puppeteer offers a higher-level way to control headless Chrome, with an API that is simpler and easier to understand. Installing the Puppeteer package also downloads a separate Chrome instance (~71 MB on Mac, ~90 MB on Linux, ~110 MB on Windows).
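To give a feel for the higher-level API, here is a minimal sketch (not from the original post) that loads a page and returns its rendered HTML. The helper name `getRenderedHtml`, the `main` wrapper, and the `https://example.com` URL are placeholders chosen for illustration. The `networkidle0` wait condition assumes a reasonably recent Puppeteer release.

```javascript
// Minimal sketch, assuming `npm install puppeteer` has been run.
// `getRenderedHtml` and `main` are hypothetical names used for illustration.

// Open a new tab, navigate to `url`, and return the HTML after scripts have run.
async function getRenderedHtml(browser, url) {
  const page = await browser.newPage();
  try {
    // Assumption: 'networkidle0' (no open network connections for a short
    // period) is a good enough signal that the page has finished rendering.
    await page.goto(url, { waitUntil: 'networkidle0' });
    return await page.content();
  } finally {
    // Always close the tab, even if navigation fails.
    await page.close();
  }
}

async function main() {
  // Required lazily so the helper above can be used without a real browser.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch(); // headless by default
  try {
    const html = await getRenderedHtml(browser, 'https://example.com');
    console.log(`Rendered HTML length: ${html.length}`);
  } finally {
    await browser.close();
  }
}

// Uncomment to run against the Chrome instance bundled with Puppeteer:
// main().catch(console.error);
```

Passing the browser in as a parameter is just one possible design. It lets several pages share a single Chrome instance and makes the helper easy to exercise with a stand-in object.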
Headless
- In recent years, JavaScript adoption has skyrocketed, and it is hard to find a web page that does not use a single line of JavaScript. Many sites have moved from the traditional model of server-side rendered pages to Single Page Applications (SPAs). As a result, traditional web data mining and scraping tools either do not work with SPAs or do not give the expected results, because the content is rendered dynamically in the browser.