Returned null response from Unirest block
I am trying to scrape Google search results with Unirest and Cheerio, and I want to collect the scraped titles in an array. When I log the array outside the Unirest block it prints nothing, but when I log it inside the Unirest block it prints the data.
Here is my code:
const unirest = require('unirest')
const cheerio = require('cheerio')
var titles = []
unirest
  .get('https://www.google.com/search?q=oxylabs')
  .headers({'Accept': 'application/json', 'Content-Type': 'application/json'})
  .proxy(proxy) // hidden
  .then((response) => {
    const $ = cheerio.load(response.body)
    $('.uEierd').each((i, el) => {
      titles[i] = $(el)
        .find('.ZINbbc div.v5yQqb a.cz3goc div.CCgQ5 span')
        .text()
    })
  })
for (let i = 0; i < titles.length; i++) {
  console.log(titles[i]);
}
Solution 1:[1]
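Your code doesn't work because the unirest request is asynchronous: its then callback runs only after your for loop has already finished, so titles is still empty when the loop logs it. Code and full example in the online IDE: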
const unirest = require("unirest");
const cheerio = require("cheerio");
function getData() {
  return new Promise((resolve, reject) => {
    const titles = [];
    unirest
      .get("https://www.google.com/search?q=oxylabs")
      .headers({
        "Accept": "application/json",
        "Content-Type": "application/json",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36"
      })
      // .proxy(proxy) // hidden
      .then((response) => {
        const $ = cheerio.load(response.body);
        $(".uEierd").each((i, el) => {
          titles[i] = $(el).find(".v0nnCb span").text();
        });
        // Resolve only after the response has been parsed
        resolve(titles);
      })
      .catch((error) => {
        console.error(error);
        reject(error);
      });
  });
}
function logData() {
  // The titles are available only inside then, once the promise resolves
  getData().then((titles) => {
    for (let i = 0; i < titles.length; i++) {
      console.log(titles[i]);
    }
  });
}
logData();
Output:
Oxylabs Premium Proxies - Residential Proxy Network
72M+ Residential IPs Network - 7 Day-Free Trial - Join Now
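The same ordering problem, and the async/await form of the fix, can be sketched without any network access. This is only an illustration: fakeRequest and parseTitles are hypothetical stand-ins for the unirest request and the cheerio selectors, not part of either library.

```javascript
// Hypothetical stand-in for the unirest request: resolves with HTML after a short delay.
function fakeRequest() {
  return new Promise((resolve) => {
    setTimeout(
      () => resolve({ body: "<span>First title</span><span>Second title</span>" }),
      10
    );
  });
}

// Hypothetical stand-in for the cheerio parsing step.
function parseTitles(html) {
  return [...html.matchAll(/<span>(.*?)<\/span>/g)].map((match) => match[1]);
}

// Waits for the response before parsing, so callers receive a filled array.
async function getTitles() {
  const response = await fakeRequest();
  return parseTitles(response.body);
}

async function main() {
  // The pattern from the question: the array is filled later, inside the callback.
  const pending = [];
  fakeRequest().then((response) => pending.push(...parseTitles(response.body)));
  console.log(pending.length); // still 0 here, the callback has not run yet

  // The fix: wait for the promise before using the data.
  const titles = await getTitles();
  console.log(titles);
  return titles;
}

main();
```

With await, the code that uses titles sits after the point where the promise has resolved, which is the same guarantee the then callback in logData gives.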
Alternatively, you can use the Google Ad Results API from SerpApi. An API approach is easier if you don't want to figure out how to solve captchas, rotate proxies, or create a parser from scratch and maintain it. Check out the Playground for more.
Usage:
const SerpApi = require("google-search-results-nodejs");
const mySecret = process.env['API_KEY'] // your API key from serpapi.com
const search = new SerpApi.GoogleSearch(mySecret);
const params = {
  engine: "google", // search engine
  q: "oxylabs", // search query
  location: "Austin, Texas, United States", // location parameter
  google_domain: "google.com", // google domain of the search
  gl: "us", // country of the search
  hl: "en", // language of the search
};
const getAdTitles = function (data) {
  const titles = [];
  const adResults = data.ads;
  // data.ads may be missing when the search returns no ads
  adResults?.forEach((result) => {
    const { title } = result;
    titles.push(title);
  });
  console.log(titles);
};
search.json(params, getAdTitles);
Output:
[
'Oxylabs Premium Proxies - Residential Proxy Network',
'72M+ Residential IPs Network - 7 Day-Free Trial - Join Now'
]
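If the titles need to be returned to a caller rather than logged inside the callback, a callback-style call can be wrapped in a Promise. The following is a minimal sketch under stated assumptions: fakeSearchJson is a hypothetical stand-in that only mimics the (params, callback) shape used above, and getAdTitlesAsync is an illustrative name, not part of the SerpApi package.

```javascript
// Hypothetical stand-in for a callback-style client call such as search.json(params, callback).
function fakeSearchJson(params, callback) {
  setTimeout(() => callback({ ads: [{ title: `Ad for ${params.q}` }] }), 10);
}

// Wraps the callback API in a Promise so callers can await the titles.
function getAdTitlesAsync(params) {
  return new Promise((resolve) => {
    fakeSearchJson(params, (data) => {
      // Fall back to an empty list when the response has no ads.
      const titles = (data.ads ?? []).map((result) => result.title);
      resolve(titles);
    });
  });
}

getAdTitlesAsync({ q: "oxylabs" }).then((titles) => console.log(titles));
```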
Disclaimer: I work for SerpApi.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Mikhail Zub |