Browser crashes on auto-scroll in Puppeteer
I have a scraper that collects the video URLs of all of a user's posts. I implemented auto-scroll so that every video is loaded before I scrape. I tested it on my laptop (macOS) and on a desktop PC (Windows 10), and it worked perfectly. I then zipped it and sent it to a friend to try, and it did not work for him (Windows 11). Whenever Puppeteer tries to scroll the page, the browser shows a message at the top saying "Something went wrong" and then closes. Here is the code:
const puppeteer = require('puppeteer');
const createCsvWriter = require('csv-writer').createObjectCsvWriter;
const Spinner = require('cli-spinner').Spinner;
const { exit } = require('process');
const csvWriter = createCsvWriter({
path: 'data.csv',
header: [
{ id: 'id', title: '#' },
{ id: 'URL', title: 'URL' },
],
});
(async function main() {
var spinner = new Spinner('Scraping data.... %s');
spinner.setSpinnerString('|Oo-\\');
spinner.start();
let data = [];
let counter = 1;
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
// setUserAgent returns a promise, so wait for it before navigating
await page.setUserAgent(
'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'
);
await page.goto(`https://www.tiktok.com/@${process.argv[2]}`, {
waitUntil: 'load',
timeout: 0,
});
await page.waitForSelector('.tiktok-yz6ijl-DivWrapper > a');
let lastHeight = await page.evaluate('document.body.scrollHeight');
while (true) {
await page.evaluate('window.scrollTo(0, document.body.scrollHeight)');
await page.waitForTimeout(2000);
let newHeight = await page.evaluate('document.body.scrollHeight');
if (newHeight === lastHeight) {
break;
}
lastHeight = newHeight;
}
let values = await page.evaluate((sel) => {
let elements = Array.from(document.querySelectorAll(sel));
let responses = elements.map((element) => {
return element.getAttribute('href');
});
return responses;
}, '.tiktok-yz6ijl-DivWrapper > a');
values.forEach((value) => {
data.push({
id: counter,
URL: value,
});
counter++;
});
csvWriter.writeRecords(data).then(() => {
console.log('...Done');
exit(0);
});
})();
Any help or suggestions would be appreciated.
Solution 1:[1]
I got the same error. As far as I can tell, TikTok detected that a bot rather than a person was visiting and stopped loading further videos: when I scrolled manually in the same browser, I got the same error. Using the stealth plugin for puppeteer-extra fixed it for me:
const puppeteer = require("puppeteer-extra");
const StealthPlugin = require("puppeteer-extra-plugin-stealth");
const createCsvWriter = require("csv-writer").createObjectCsvWriter;
const Spinner = require("cli-spinner").Spinner;
const { exit } = require("process");
puppeteer.use(StealthPlugin());
//below is your code without changes
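Separately from the stealth plugin, the scroll loop in the question has no upper bound, so it can spin forever if the page keeps growing. Below is a minimal sketch of a bounded version. `autoScroll`, `delayMs`, and `maxRounds` are hypothetical names introduced here, not part of Puppeteer; the helper only assumes that `page.evaluate` accepts a string expression, as the question's code already does. It also uses a plain `setTimeout` for the pause instead of `page.waitForTimeout`, which is not available in newer Puppeteer releases.

```javascript
// Hypothetical helper (not a Puppeteer API): scrolls to the bottom repeatedly
// until the page height stops growing or maxRounds is reached.
async function autoScroll(page, { delayMs = 2000, maxRounds = 50 } = {}) {
  // Plain timer-based pause; does not depend on any Puppeteer method.
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

  let lastHeight = await page.evaluate('document.body.scrollHeight');
  let rounds = 0;

  while (rounds < maxRounds) {
    await page.evaluate('window.scrollTo(0, document.body.scrollHeight)');
    await sleep(delayMs); // give the site time to load more posts
    rounds++;

    const newHeight = await page.evaluate('document.body.scrollHeight');
    if (newHeight === lastHeight) {
      break; // nothing new was loaded, assume we reached the end
    }
    lastHeight = newHeight;
  }

  // Report how many scrolls were performed and the final page height.
  return { rounds, height: lastHeight };
}
```

To try it, replace the `while (true)` loop in the question with `await autoScroll(page, { delayMs: 2000, maxRounds: 100 });`. The cap is an arbitrary choice and may need raising for profiles with many posts.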
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Mikhail Zub |