I have a Scrapy project and I want to modify it to use scrapy-redis. The main Scrapy file is below: class MySpider(RedisSpider): name = 'ScrapyBot' redis
I want to scrape data from multiple URLs and retrieve all the information, but I can only scrape from one URL; with more than one URL I get an error (list index
I have this piece of code, where I try to download these papers but the loop prints the first element only. import scrapy from urllib.parse import urljoin class
I'm trying to scrape several links containing information about events. I am rotating my paid proxies and user agents generated by the UserAgent library. Imperva, w
Is there any way to name a crawled image with other info (text) that we get with the spider? For example, in this case I want images with the article title and ar
I would like help on how to use Scrapy in Python to extract data from the following page https://fincaraiz.com.co/apartamentos/arriendos?ubicacion=cali I need t
I am trying to remove duplicate timestamps when I scrape the following site for data on BTC. I want to remove the duplicates after every time reque
Within the TERMINAL window, at the prompt PS C:\Rolf\py_scripts, I ran pip install scrapy and got the message: Successfully installed Automat-20.2.0 PyDisp
I would like to find the title bar icon with rel = 'icon' or 'shortcut icon'. So I'm trying to do something like this: response.xpath("head/link[@rel='icon' or 'sho
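A note on the predicate in that excerpt: in XPath, a bare non-empty string literal on one side of `or` is always true, so each side needs its own comparison, as in `@rel='icon' or @rel='shortcut icon'`. The sketch below applies the same filter using only the standard library's `html.parser` instead of Scrapy selectors, so it runs without Scrapy installed. The class name, function name, and sample HTML are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class IconLinkParser(HTMLParser):
    """Collect href values of <link> tags whose rel is 'icon' or 'shortcut icon'."""

    # Hypothetical constant: the two rel values named in the question.
    ICON_RELS = {"icon", "shortcut icon"}

    def __init__(self):
        super().__init__()
        self.icon_hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel = (attr_map.get("rel") or "").strip().lower()
        if rel in self.ICON_RELS and attr_map.get("href"):
            self.icon_hrefs.append(attr_map["href"])


def find_icon_hrefs(html):
    """Return icon hrefs in document order."""
    parser = IconLinkParser()
    parser.feed(html)
    return parser.icon_hrefs


# Made-up sample page, not taken from the question.
SAMPLE_HTML = """
<html><head>
  <link rel="stylesheet" href="/site.css">
  <link rel="icon" href="/favicon.ico">
  <link rel="shortcut icon" href="/legacy.ico" />
</head><body></body></html>
"""

if __name__ == "__main__":
    print(find_icon_hrefs(SAMPLE_HTML))
```

Inside a Scrapy callback, the equivalent selector would presumably be `response.xpath("//head/link[@rel='icon' or @rel='shortcut icon']/@href")`, with both comparisons written out.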
My middlewares settings: from w3lib.http import basic_auth_header class CustomProxyMiddleware(object): def process_request(self, request, spider):
I am new to Scrapy and VS Code, and my project was working perfectly fine until I decided to tidy up the folders before uploading to GitHub. After that, w
I am scraping 6 sites in 6 different spiders. But now, I have to scrape these sites in one single spider. Is there a way of scraping multiple links in the same
I'm scraping https://myanimelist.net/anime.php#/, where you can see there is a genres section. I want to return as a CSV only the first 18 pages and stop before explicit
I'm scraping a page, using Scrapy. I want the HTML contents of the TD with "text" class: <tr valign="top"> <td class="text" width="100%"> <
I am trying to deploy my Scrapy spider using Scrapyd and Scrapyd-client. I have managed to successfully create a project containing my spider, but when I try to
I am trying to scrape data such as price and some labels etc. from Amazon using Scrapy. I try to find elements by XPath or CSS and it always works fine when I u
I tried some website scraping. I successfully scraped data into my current db table. But I would like to INSERT into "new_table" only if records do not exist in "cur
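One common way to express "insert only if the record is not already in another table" is an `INSERT ... SELECT ... WHERE NOT EXISTS` statement. The sketch below is a minimal illustration under several assumptions: the excerpt does not say which database is in use, so SQLite (via the standard `sqlite3` module) is assumed; the second table's name is cut off in the excerpt, so `existing_table` is a hypothetical stand-in; and the `url` and `title` columns are invented for the example.

```python
import sqlite3


def insert_missing(conn, rows):
    """Insert (url, title) rows into new_table unless url is already in existing_table.

    Table and column names other than new_table are hypothetical.
    """
    sql = """
        INSERT INTO new_table (url, title)
        SELECT ?, ?
        WHERE NOT EXISTS (SELECT 1 FROM existing_table WHERE url = ?)
    """
    with conn:  # commit on success, roll back on error
        for url, title in rows:
            conn.execute(sql, (url, title, url))


def demo():
    """Build two in-memory tables and return the urls that ended up in new_table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE existing_table (url TEXT PRIMARY KEY, title TEXT)")
    conn.execute("CREATE TABLE new_table (url TEXT PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO existing_table VALUES ('https://example.com/a', 'A')")
    scraped = [("https://example.com/a", "A"), ("https://example.com/b", "B")]
    insert_missing(conn, scraped)
    return [row[0] for row in conn.execute("SELECT url FROM new_table ORDER BY url")]


if __name__ == "__main__":
    print(demo())
```

In a Scrapy project this kind of check would typically live in an item pipeline's `process_item` method; other databases use slightly different syntax for the same idea.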
I'm using Scrapy and Heroku for some spiders I have set up. It's worked fine for months, and I didn't change anything that I know would have any effect, and yet when I try
I need to create records in an Airtable base and have the following code in Scrapy: url = "https://api.airtable.com/v0/appuhKmlhLIIEszLm/Table%201" payload = j
I'm trying to build a site with Django and React, where you can create watchlists with companies. If you view the watchlist, you'll be able to see the price of