Puppeteer, awaiting a selector, and returning data from within
I am loading a page, intercepting its requests, and when a certain element shows up I stop loading and extract the data I need...
Here is the problem that I am faced with.
Simplified, the code looks something like this:
async function loadPage()
{
    var contentLoaded = false;
    var content;

    // When the element shows up, do something.
    // In the real code this is page.waitForSelector; a timeout stands in
    // for it here because the problem is the same.
    // The element "shows up" after 10 seconds, at which point the content
    // I want is extracted (100 here) and stored.
    setTimeout(() => {
        content = 100;
        contentLoaded = true;
    }, 10000);

    // This handler intercepts requests and handles them
    // until the content is loaded.
    page.on('request', req => {
        if (!contentLoaded)
        {
            // keep loading page
        }
    });

    // This is the code I would like NOT to run until I either get the data
    // or a timeout error from page.waitForSelector...
    // but JavaScript runs it right away, since nothing above blocks.
    // The content shows up after 10 seconds and is stored in `content`,
    // but by then this code has already finished and returned false.
    if (contentLoaded)
    { return content; }
    else
    { return false; }
}

var x = loadPage();
x.then(console.log); // should log the data, or false if an error occurred
Thank you all for taking the time to read this and help out. I'm a novice, so any feedback or reading material is welcome if you think there is something I'm not fully understanding.
Solution 1:[1]
Solved
Simple explanation:
Here is what I was trying to accomplish:
- Intercept page requests so that I can decide what not to load, and speed up loading
- Once an element shows up on the page, I want to extract some data and return it.
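As a rough illustration of the first goal, request interception in Puppeteer could be sketched as below. The function name `blockResources` and the `blockedTypes` list are assumptions made for this example and are not taken from the original code; `page` is expected to be a Puppeteer Page.

```javascript
// Sketch: abort requests for resource types we do not need.
// `blockResources` and `blockedTypes` are hypothetical names for this example.
async function blockResources(page, blockedTypes = ['image', 'stylesheet', 'font']) {
  // Interception must be enabled before requests can be aborted or continued.
  await page.setRequestInterception(true);
  page.on('request', (req) => {
    if (blockedTypes.includes(req.resourceType())) {
      req.abort(); // skip this resource entirely
    } else {
      req.continue(); // let everything else load normally
    }
  });
}
```

Once interception is enabled, every request has to be either aborted or continued, otherwise it stays pending.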
I was trying to return it like this (note: all the browser setup and error handling is left out here, since it would just clutter the explanation):
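var data = loadPage(url);
async function loadPage(URL)
{
    var data;
    // `selector` stands for whatever element is being waited on
    page.waitForSelector(selector).then(async () => {
        var x = await page.evaluate(/* ... */); // page.evaluate returns data to x...
        data = x;
    });
    return data; // runs before the callback above, so data is still undefined
}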
This doesn't work because return runs immediately while the waitForSelector callback runs later, so the promise returned by loadPage always resolves to undefined...
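The timing issue can be reproduced without Puppeteer at all. In the sketch below, `delay` is a hypothetical stand-in for page.waitForSelector, and the two function names are made up for the comparison.

```javascript
// Hypothetical stand-in for page.waitForSelector: resolves after `ms` milliseconds.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withoutAwait() {
  let data;
  // The callback is only scheduled here; it runs after the return below.
  delay(50).then(() => { data = 100; });
  return data; // still undefined at this point
}

async function withAwait() {
  let data;
  // The function is suspended here until the promise settles.
  await delay(50);
  data = 100;
  return data;
}

withoutAwait().then((v) => console.log('without await:', v)); // logs: without await: undefined
withAwait().then((v) => console.log('with await:', v));       // logs: with await: 100
```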
The correct way of doing it, or rather the way that works for me, is to return the whole promise and then extract the data from it...
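var data = loadPage(url);
data.then(/* do what needs to be done with the data */);
async function loadPage(URL)
{
    // `selector` stands for whatever element is being waited on
    var data = page.waitForSelector(selector).then(async () => {
        var x = await page.evaluate(/* ... */); // page.evaluate returns data to x...
        return x;
    });
    return data; // we return data as a promise
}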
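An alternative that may read more naturally is to await each step inside the async function and turn a failure into false, which matches the "data or false" behaviour asked for in the question. This is only a sketch: passing `page` and `selector` as parameters, the 10 second timeout, and reading `textContent` are all assumptions made for the example.

```javascript
// Sketch: wait for the element, extract its text, resolve to false on failure.
// `page` is expected to be a Puppeteer Page; `selector` is a hypothetical parameter.
async function loadPage(page, selector) {
  try {
    // Suspends here until the element exists or the timeout elapses.
    await page.waitForSelector(selector, { timeout: 10000 });
    // Runs the callback in the browser context and returns its result.
    return await page.$eval(selector, (el) => el.textContent);
  } catch (err) {
    // waitForSelector rejects (for example on timeout) if the element never appears.
    return false;
  }
}
```

The caller still receives a promise, so it is consumed with loadPage(page, selector).then(console.log) or with await inside another async function.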
I hope it's a solid enough explanation. If someone needs to see the whole thing, I could edit the question and put the full code there...
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | JustBaneIsFine |