Deploy Scrapy Project with Streamlit
I have a Scrapy spider that scrapes product information from Amazon, given a product link.
I want to deploy this project with Streamlit, taking the product link as input on the web page and showing the product information as output.
I don't know a lot about deployment, so any help would be appreciated.
Solution 1:[1]
You can create a public repository on GitHub for your Streamlit app and connect your GitHub account to Streamlit through OAuth. After signing in on the Streamlit website, you can deploy the app on Streamlit's servers.
Solution 2:[2]
You can run Scrapy from a script using the scrapy.crawler.CrawlerProcess class.
Basically, you can run the spider, export the data to a temporary file, and use that file in your Streamlit app:
import scrapy
from scrapy.crawler import CrawlerProcess


class MySpider(scrapy.Spider):
    # Your spider definition
    ...


process = CrawlerProcess(settings={
    "FEEDS": {
        "items.json": {"format": "json"},
    },
})

process.crawl(MySpider)
process.start()  # the script will block here until the crawling is finished
Now you can save this script and run it using subprocess, which will export the data into items.json. Use that file in your app.
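The subprocess step could look roughly like the sketch below. It is not part of the original answer and rests on a few assumptions: the helper name run_crawl_script is made up, and the saved script is assumed to read the product link from sys.argv[1] and hand it to the spider (for example with process.crawl(MySpider, url=sys.argv[1]), where url is a hypothetical spider argument). The script runs in a fresh temporary directory so that the relative items.json path from the FEEDS setting lands there and is not appended to by a later crawl.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def run_crawl_script(script_path, product_url, timeout=120):
    """Run the saved crawl script in a child process and return its items.

    Assumption: the script takes the product link as its first command-line
    argument and exports to the relative path "items.json".
    """
    # Resolve first, because the child process runs in a different directory.
    script = Path(script_path).resolve()
    with tempfile.TemporaryDirectory() as work_dir:
        # A separate process per crawl sidesteps the fact that Scrapy's
        # reactor cannot be started twice inside one long-lived process.
        subprocess.run(
            [sys.executable, str(script), product_url],
            cwd=work_dir,   # "items.json" is written inside work_dir
            check=True,     # raise CalledProcessError if the script fails
            timeout=timeout,
        )
        feed = Path(work_dir) / "items.json"
        if not feed.exists():
            return []  # nothing was exported
        text = feed.read_text(encoding="utf-8")
        return json.loads(text) if text.strip() else []
```

In the Streamlit app, the value returned by st.text_input could be passed to this helper once it is non-empty (for instance after an st.button click), and the resulting list shown with st.json or st.dataframe.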
Here is a helpful Streamlit Cloud Scrapy thread, with a public streamlit-scrapy project GitHub repo.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Peker Celik |
Solution 2 | ahmedshahriar |