Category "beautifulsoup"

Scraping data with BeautifulSoup and Selenium

I am using BeautifulSoup and Selenium to extract web data (BeautifulSoup to parse the HTML page and Selenium to click Next to get to the next list of items on t

Dynamic (with mouseover/coordinates) web scraping in Python: unable to extract information

I'm trying to scrape data that only appears on mouseover (Selenium). It's a concert map and this is my entire code. I keep getting TypeError: 'ActionChains'

The scraped content is different from what I see in browser Inspector - Python scraper with Selenium

I want to scrape this website https://lens.zhihu.com/api/v4/videos/1123764263738900480 to get the play_url using Python. This website has a very quick redirect

Python Login to UPS.com returns 403

I had a script that would log in to my UPS.com account to receive all incoming packages. The following code was working for a while but not anymore: import reque

How to get product ID and UPC in page source in Target?

I am trying to scrape the product ID and UPC of products in Target using Selenium in Python. I cannot find the product ID and UPC on the product page so I go to the pa

Extracting contents of html into BeautifulSoup if search for script ID is a success

My html is below: <html> <body> <div> ... </div> <script id="1" ...> </script> </body> </ht

How to scrape data from Twitter without its API using BeautifulSoup

I'm currently trying to scrape some data from Twitter, like username, screen name, the content of the tweet etc. But I've run into some problems: I've been tryi

find_all() prints everything twice

I just started my first web scraping project and for some reason when I try to run this simple code, it prints all of the headlines twice. I have no idea why

How to scrape related searches on Google?

I'm trying to scrape Google for related searches when given a list of keywords, and then output these related searches into a CSV file. My problem is getting be

BS4 - 'NoneType' object has no attribute 'findAll' when scanning spans on Amazon page

I'm following a Udemy course on learning BS4 and it seems to be a bit outdated so I'm having trouble with this part. The objective is to scrape the price of thi

Is there any way other than status_code to determine whether the request is true or false?

I'm using Python 3 with BeautifulSoup. I want to scrape data for a few employees from a site, depending on their ID number. My code: for UID in range(201810000,2

Can't parse out text that is behind </span>text</a> in BeautifulSoup

I think I have tried it all: read crummy, read the documentation on the BeautifulSoup4 website. I can't wrap my head around this thing. So to the question: &

How to select and scrape specific texts out of a bunch of <ul> and <li>?

I need to scrape "2015" and "09/09/2015" from the below link: lacentrale.fr/auto-occasion-annonce-87102353714.html But since there are many li and ul, I can't sc

How to scrape positions from the sofifa website: text inside of a span with Beautiful Soup

So I am web scraping the sofifa website into a workable CSV. Each player gets a column. My main problem is the position section of the website is only exportin

Problem in fetching long URLs using BeautifulSoup

I am trying to fetch a URL from a webpage. Here is how the URL looks in the Inspect section: Here is how the URL looks in my Python code: How can I get the ac

Why is web scraping stock prices through Beautiful Soup returning a different price than the one on the Yahoo Finance page?

I am trying to write a program that will give me the stock price for a few different stocks, but when I run my program, it returns 116.71, while Yahoo Finance h

Python terminal closes when importing BeautifulSoup

I have a simple Python program that is supposed to scrape some information from the internet and do stuff with it. When I run the code in PyCharm (IDE) it work

Why am I getting "UnicodeEncodeError: 'charmap' codec can't encode character '\u25b2' in position 84811: character maps to <undefined>" error?

I'm getting UnicodeEncodeError: 'charmap' codec can't encode character '\u200b' in position 756: character maps to <undefined> error while running this code: from bs4 imp

How to ignore infobox when scraping title from Wikipedia anchor text?

I am trying to scrape the first 20 links on a Wikipedia page but I want to ignore the infobox on the right side. It has a 'table' tag. Here is what I have so fa

Webscraping Google Search Results Using Google API - Returns same result over and over again

My problem: Hi everyone, I am attempting to develop my very first web scraper using the Google API and Beautiful Soup in Python. The aim is for the scraper to