Unable to iterate through list using BeautifulSoup
I am experimenting with Python 3.6 and BeautifulSoup on a Mac. I am trying to build a simple program that scrapes song lyrics from a URL and stores them as plain text in a single variable, but I am unable to iterate through the HTML content.
This is the code that I am running:
import requests
import re
from bs4 import BeautifulSoup
r = requests.get("http://www.metrolyrics.com/juicy-lyrics-notorious-big.html")
c = r.content
soup = BeautifulSoup(c, "html.parser")  # parser choice assumed
all = soup.find_all("p",{"class":"verse"})
all[0:10]
for item in all:
    print(item.find_all("p",{"class":"verse"})[0].text)
The last two lines of code raise an "IndexError: list index out of range".
Also, if I try all = all.text, I get the following error:
AttributeError: ResultSet object has no attribute 'text'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
I imagine this should be something simple, but I don't know what else to try.
Thanks
Solution 1:[1]
The item in the loop is already a BeautifulSoup tag; you can check this with type(all[0]), which gives <class 'bs4.element.Tag'>.
So you can extract text directly from it:
for item in all:
    print(item.text)
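To get all the lyrics into a single variable, as the question asks, the per-tag text can be joined into one string. The following is only a minimal sketch: the HTML snippet is a hypothetical stand-in for the downloaded page, and the names html, verses and lyrics are illustrative, not from the original post.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for r.content; the real page markup may differ.
html = """
<div>
  <p class="verse">First line<br/>Second line</p>
  <p class="verse">Third line</p>
  <p class="other">Not a verse</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns a ResultSet (a list of Tag objects), so .text must be
# read from each element, not from the ResultSet itself.
verses = soup.find_all("p", {"class": "verse"})

# Join the text of every verse into one plain-text string.
# separator="\n" keeps <br/>-separated lines on separate lines.
lyrics = "\n\n".join(
    verse.get_text(separator="\n", strip=True) for verse in verses
)

print(lyrics)
```

Using a name such as verses instead of all also avoids shadowing Python's built-in all() function.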
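The "list index out of range" error does not come from the slice all[0:10]: slicing past the end of a list is safe in Python, even when all has fewer than 10 elements. It comes from the [0] applied to the nested item.find_all(...) call, which searches inside a tag that has no matching descendants and therefore returns an empty result.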
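The difference between slicing and indexing can be checked on a small example. This is a sketch under stated assumptions: the two-paragraph HTML string is made up for illustration, and the variable names are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical two-verse document, used only to reproduce the behaviour.
html = '<p class="verse">One</p><p class="verse">Two</p>'
soup = BeautifulSoup(html, "html.parser")
verses = soup.find_all("p", {"class": "verse"})

# Slicing beyond the end of a list does not raise; it returns what exists.
first_ten = verses[0:10]
print(len(first_ten))

# Calling find_all on a Tag searches only that tag's descendants.
# A verse paragraph contains no nested verse paragraphs, so this is empty.
nested = verses[0].find_all("p", {"class": "verse"})
print(len(nested))

# Indexing an empty result is what raises IndexError.
raised = False
try:
    nested[0]
except IndexError:
    raised = True
print(raised)
```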
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | user2314737 |