Category "web-scraping"

AWS Batch Jobs does not scale to maximum Vcpus

I'm trying to run a web scraper with 200 vCPUs on an AWS queue (queue A), but it's only using 40 even though the maximum and desired number of vCPUs is 200. What sho

When using the instant data scraper to grab the target clearance product list, the same data format is inconsistent

My goal is to use Instant Data Scraper to get the product name, product link, and price of all clearance products in the link. As shown in the picture below,

Edit 14 day weather forecast Excel VBA to include precipitation

I found the code below, which works nicely and which I think I can repurpose for my needs, but it does not include precipitation. I'm relatively new to HTML so hav

Scroll with Keys.PAGE_DOWN in Selenium Python

Hello everyone, can anyone help me with scrolling https://www.grainger.com/category/black-pipe-fittings/pipe-fittings/pipe-tubing-and-fittings/plumbing/ecatalog/

Multi-Threaded Python scraper does not execute functions

I am writing a multi-threaded Python scraper. I am facing an issue where my script quits after running for 0.39 seconds without any error. It seems that the pars

Need help parsing link from iframe using BeautifulSoup and Python3

I have this url here, and I'm trying to get the video's source link, but it's located within an iframe. The video url is https://ndisk.cizgifilmlerizle.com... i

Download bing image search results using python (custom url)

I want to download Bing image search results using Python code. Example URL: https://www.bing.com/images/search?q=sketch%2520using%20iphone%2520students My Python co

Unable to get correct CSS tag for webscraping in R using SelectorGadget

I am trying to web scrape data in R from this url but cannot seem to get the correct css tag. For now I just need help retrieving the professor's name. Any help

How to accept facebook cookies using python selenium?

I have a problem clicking the facebook accept cookies button on facebook creator studio website. Cookies are shown only when the page is opened by the program,

Proxy: Selenium + Python + Firefox

I'm trying to set up a proxy for Selenium and Firefox (using Python). I have seen many tutorials and posts on how to do it, but most are outdated or don't work for

Scrapy returns ValueError SelectorList is not supported

I think the problem is when I try to enter each spell URL with response.follow in the loop, but idk why; it passes around 500 links perfectly to extract_xpa

Scrapy: scraping large PDF files without keeping response body in memory

Let's say I want to scrape a 1GB PDF with Scrapy and then use the scraped PDF data in further Requests down the line. How do I do this without keeping the 1G

How do I save the authentication with puppeteer?

I need to talk to a Telegram bot with my web app, so I decided to use web scraping, though I do not know if it's the best strategy. When I try to access the telegram

How can I handle pagination with Scrapy and Splash, if the href of the button is javascript:void(0)

I am trying to scrape the names and links of universities from this website: https://www.topuniversities.com/university-rankings/world-university-rankings/2021,

Setting a default for nosuchelementexception for multiple variables in python

So I am scraping multiple rows of a table, and many of them are either available or not for different pages. What I want to do is to detect which field is not a

Scrape and change data in date in BeautifulSoup

I am scraping data from different web pages and there are several dates in this data. The code allowing me to have the information that I want looks like this,

Can I change a drop down item from a list in Jsoup and submit it?

I have a site I'm trying to scrape with Jsoup that has monthly and yearly selection boxes where the data changes when a different month or year is selected. Edi

Can't bypass cloudflare with python cloudscraper

I ran into a Cloudflare issue when I tried to parse the website. I have this code: import cloudscraper url = "https://author.today" scraper = cloudscraper.create

Not able to replicate AJAX using Python Requests

I am trying to replicate an AJAX request from a web page (https://droughtmonitor.unl.edu/Data/DataTables.aspx). AJAX is initiated when we select values from dropdo
