Can I extract HTML by date when the page is a continuous scroll and the date is in a separate div?
So I'm scraping different sites with Python, and the ones with continuous scrolling are getting me stuck. For example, this online newspaper has all the news on the same page and the same URL, so you can't add the date to the URL to extract date-specific news like with other newspapers. On top of that, the date separator is inside a <div> that opens and closes on its own, so the news for the desired date is not inside the date's div. The div has this structure:
<div class="first-col">
<p class="date-post">
<time datetime="2020-04-20">
</time> </p> </div>
The news items come after it, so how could I extract the news (their href and title) by date when they are not grouped together?
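One possible approach, sketched here under several assumptions: since the date div and the news items are siblings rather than parent and child, you can walk the page in document order, remember the most recent `<time datetime="...">` you passed, and file every link that follows under that date until the next date marker appears. The sketch below uses only the standard library's `html.parser`. The names `NewsByDateParser`, `news_for_date`, and the `sample` markup are made up for illustration; it assumes every news item is an `<a href>` that appears after its date block, and that you already have the fully scrolled HTML as a string (loading the extra content of an infinite-scroll page is a separate step not shown here).

```python
from collections import defaultdict
from html.parser import HTMLParser


class NewsByDateParser(HTMLParser):
    """Group <a> links under the most recent <time datetime="..."> seen."""

    def __init__(self):
        super().__init__()
        self.by_date = defaultdict(list)  # date string -> [(href, title), ...]
        self.current_date = None          # value of the last date marker seen
        self._href = None                 # href of the <a> being read, if any
        self._text = []                   # text pieces collected inside that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "time" and attrs.get("datetime"):
            # A new date marker: links from here on belong to this date.
            self.current_date = attrs["datetime"]
        elif tag == "a" and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace so multi-line titles become one line.
            title = " ".join("".join(self._text).split())
            if self.current_date is not None:
                self.by_date[self.current_date].append((self._href, title))
            self._href = None


def news_for_date(html, date):
    """Return the (href, title) pairs that follow the given date marker."""
    parser = NewsByDateParser()
    parser.feed(html)
    parser.close()
    return parser.by_date.get(date, [])


# Hypothetical markup: the <article> wrappers and URLs are assumptions,
# only the date div mirrors the structure from the question.
sample = """
<div class="first-col"><p class="date-post"><time datetime="2020-04-20"></time></p></div>
<article><a href="/news/a">First story</a></article>
<article><a href="/news/b">Second story</a></article>
<div class="first-col"><p class="date-post"><time datetime="2020-04-19"></time></p></div>
<article><a href="/news/c">Older story</a></article>
"""

print(news_for_date(sample, "2020-04-20"))
# [('/news/a', 'First story'), ('/news/b', 'Second story')]
```

On a real page you would probably need to narrow the `<a>` check (for example to links inside a particular container class), since navigation and footer links would otherwise be filed under the last date as well.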
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow