Node.js Puppeteer: How do I scrape values from a duplicate selector?
I want to scrape a value, but its selector is duplicated on the page and I don't know how to target the right one.
My code always scrapes the value of the first match, the one higher up on the page.
Here is my code:
const puppeteer = require('puppeteer')
async function scrape() {
  const browser = await puppeteer.launch({})
  const page = await browser.newPage()
  await page.goto('https://pantip.com/topic/34497907')
  var element = await page.waitForSelector("#comment-counter")
  var text = await page.evaluate(element => element.textContent, element)
  console.log(text)
  browser.close()
}
scrape()
This is the part I want to scrape.
This is the duplicate above it, which I don't need.
I have already tried the other methods I know, such as XPath, but they don't work because the part I want to scrape is loaded via AJAX. The approach above is the only one I know that works right now. If there is a better way to do this in Node.js, please recommend it :)
Solution 1:[1]
Here are two solutions to your problem. The first gets an array containing the text of every element that matches the selector:
const puppeteer = require("puppeteer");
async function scrape() {
  const browser = await puppeteer.launch({});
  const page = await browser.newPage();
  await page.goto("https://pantip.com/topic/34497907");
  // Resolves as soon as the first matching element is in the DOM.
  await page.waitForSelector("#comment-counter");
  // Collect the trimmed text of every element that shares the id.
  const text = await page.evaluate(() => {
    return Array.from(document.querySelectorAll("#comment-counter")).map((el) => el.textContent.trim());
  });
  console.log(text);
  await browser.close();
}
scrape();
Output:
[ '?????????????????', '169 ???????????' ]
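If only the counter is needed from that array, one option is to pick the entry that starts with a digit instead of relying on its position. The following is a minimal sketch, not part of the original answer: `pickCounter` and `scrapeCounter` are hypothetical names, and it assumes the wanted entry is the only one that begins with a number, as in the output above.

```javascript
// Hypothetical helper: return the first text that starts with a digit,
// or null when no entry matches.
function pickCounter(texts) {
  const match = texts.find((t) => /^\d/.test(t.trim()));
  return match === undefined ? null : match.trim();
}

// Sketch of how the helper could be combined with page.$$eval,
// which runs the callback over every element matching the selector.
async function scrapeCounter(url) {
  // Loaded lazily so pickCounter can be used without a browser installed.
  const puppeteer = require("puppeteer");
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    await page.waitForSelector("#comment-counter");
    const texts = await page.$$eval("#comment-counter", (els) =>
      els.map((el) => el.textContent.trim())
    );
    return pickCounter(texts);
  } finally {
    // Close the browser even if navigation or evaluation throws.
    await browser.close();
  }
}
```

Calling `scrapeCounter("https://pantip.com/topic/34497907")` would then resolve to the counter text, or to `null` if no entry starts with a digit.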
The second option is to chain selectors so that the one you are looking for is unique:
const puppeteer = require("puppeteer");
async function scrape() {
  const browser = await puppeteer.launch({});
  const page = await browser.newPage();
  await page.goto("https://pantip.com/topic/34497907");
  await page.waitForSelector("#comment-counter");
  // Scope the duplicated id to its parent so only the wanted element matches.
  const text = await page.$eval("#comments-counts #comment-counter", (el) => el.textContent);
  console.log(text);
  await browser.close();
}
scrape();
Output:
169 ???????????
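Because the question says the value is filled in by AJAX, waiting for the selector alone may return before the text has loaded. One possible way to handle that is `page.waitForFunction`, which polls a predicate inside the page. This is a sketch under assumptions rather than part of the original answer: `counterIsLoaded` and `waitForCounterText` are hypothetical names, and it assumes the loaded counter text contains a digit, as in the output above.

```javascript
// Runs inside the page: true once the element matched by `sel`
// exists and its text contains at least one digit.
function counterIsLoaded(sel) {
  const el = document.querySelector(sel);
  return el !== null && /\d/.test(el.textContent);
}

// Hypothetical helper: wait for the counter to be populated,
// then return its trimmed text. `page` is a Puppeteer Page.
async function waitForCounterText(page, selector, timeout = 30000) {
  // Extra arguments after the options object are passed to the predicate.
  await page.waitForFunction(counterIsLoaded, { timeout }, selector);
  return page.$eval(selector, (el) => el.textContent.trim());
}
```

Inside `scrape()`, this could replace the `waitForSelector` and `$eval` pair with `const text = await waitForCounterText(page, "#comments-counts #comment-counter");`.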
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Mikhail Zub |