Category "web-scraping"

Why can't I scrape table data in order?

I'm trying to scrape table data from this website: https://www.nfl.com/standings/league/2019/REG. I have working code (below); however, it seems like the table

Python - BeautifulSoup - How to return two different elements or more, with different attributes?

HTML Example <html> <div book="blue" return="abc"> <h4 class="link">www.example.com</h4> <p class="author">RODRIGO</p> </

Python get string from an html page

I have to create an array which contains all the elements within title="", for example: title="xxxxx", title="xxx2", title='xxx4', etc. I need to get xxxx,
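The extraction described in this excerpt can be sketched with Python's standard-library `html.parser`. This is only a minimal illustration under assumptions: the class `TitleCollector`, the helper `extract_titles`, and the sample markup are hypothetical names and data, not taken from the original question.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collect the value of every title attribute found in a document."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with the quotes already removed,
        # so single- and double-quoted attributes are handled the same way.
        for name, value in attrs:
            if name == "title" and value is not None:
                self.titles.append(value)


def extract_titles(html):
    """Return the title attribute values in document order."""
    parser = TitleCollector()
    parser.feed(html)
    parser.close()
    return parser.titles


# Hypothetical sample markup mirroring the attribute styles in the excerpt.
sample = '<a title="xxxxx">a</a><img title="xxx2"><span title=\'xxx4\'>b</span>'
print(extract_titles(sample))
```

Using a parser rather than a regular expression means both quoting styles are covered without special cases.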

How can I download images on a page using puppeteer?

I'm new to web scraping and want to download all images on a webpage using puppeteer: const puppeteer = require('puppeteer'); let scrape = async () => {

Can't manipulate dataframe in pandas

Don't understand why I can't do even the most simple data manipulation with this data I've scraped. I've tried all sorts of methods to manipulate the data but

soup.find() function is not working, how do I find the ID value?

If I have the following HTML that was found with BeautifulSoup, can someone explain why print(soup.find(id="style")) or print(soup.find(id="id")) does not work

How to scrape all data from first page to last page using beautifulsoup

I have been trying to scrape all data from the first page to the last page, but it returns only the first page as the output. How can I solve this? Below is my

Web Scraping price AirBnB data with Python

I have been trying to web scrape an Airbnb website to obtain the price without much luck. I have successfully been able to bring in the other areas of interest

How to get text from a div span in soup?

Hi, I am trying to get the text within a span using Beautiful Soup; however, it doesn't return the 631. I want to get the 631 from this html. <div class="jsx-302

Scraping Yell with Python requests gives 403 error

I have this code from requests.sessions import Session url = "https://www.yell.com/s/launderettes-birmingham.html" s = Session() headers = { 'user-agent':"

Find the CSRF token from head tag in HTML using BeautifulSoup

HTML looks like this: <head csrf-token="eCUDIDdtOwAHTgR4WE9ZWydwIAYvKQYIFRtXKWw7Nn4=..."> I was trying to extract this way: token = soup.find('input', {'
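In this excerpt the token is an attribute of the `<head>` tag itself, while the attempted lookup targets an `input` element. A minimal sketch of reading that attribute follows, using the standard-library `html.parser` instead of BeautifulSoup so that it runs without extra packages. The class `HeadAttrParser`, the helper `get_csrf_token`, and the placeholder token value are assumptions made for illustration.

```python
from html.parser import HTMLParser


class HeadAttrParser(HTMLParser):
    """Remember the attributes of the first <head> tag encountered."""

    def __init__(self):
        super().__init__()
        self.head_attrs = None

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser.
        if tag == "head" and self.head_attrs is None:
            self.head_attrs = dict(attrs)


def get_csrf_token(html):
    """Return the csrf-token attribute of <head>, or None if it is absent."""
    parser = HeadAttrParser()
    parser.feed(html)
    parser.close()
    if parser.head_attrs is None:
        return None
    return parser.head_attrs.get("csrf-token")


# Placeholder value; the real token in the excerpt is truncated.
page = '<html><head csrf-token="PLACEHOLDER_TOKEN"><title>t</title></head></html>'
print(get_csrf_token(page))
```

With BeautifulSoup, the equivalent lookup would likely be `soup.find('head').get('csrf-token')`, which also returns `None` when the attribute is missing.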

Scraping data from oddsportal.com with Python

I have been trying to extract the data for each cell on a number from this AJAX website; the details for each cell only pop up when the mouse pointer is over the cell. I

Webscraping - unable to get the full content of the page with R

I'm trying to webscrape the job ads from this page: https://con.arbeitsagentur.de/prod/jobboerse/jobsuche-ui/?was=Soziologie%20(grundst%C3%A4ndig)%20(weiterf%C3

Scraping the first post from a phpbb3 forum with Python

I have a link like this: http://www.arabcomics.net/phpbb3/viewtopic.php?f=98&t=71718 The link has LINKS in the first post of a phpbb3 forum. How do I get the LINKS in the fir

Is it possible to download just part of a ZIP file using python zipfile library

I was wondering whether there is any way to download only a part of a .rar or .zip file without downloading the whole file. There is a zip file containing

How to get a collection of elements with playwright?

How to get all images on the page with playwright? I'm able to get only one (ElementHandle) with the following code, but not a collection. const { chromium } = req

Selenium - Get a list of all the elements between two h1 elements

I have a webpage with the following HTML snippet within it <h1> ... </h1> <p> ... </p> <p> ... </p> <h1> ... </h1&

How to fix "mapping values are not allowed in this context " error in yaml file?

I've browsed similar questions and believe I've applied all that I've been able to glean from answers. I have a .yml file where as far as I can tell each eleme

How to get the html of this website using python requests?

I am trying to download the HTML file from the following website: https://www.avto.net/Ads/results.asp?znamka=Audi&model=&modelID=&tip=katerikoli%20tip&