Deploy Scrapy Project with Streamlit

I have a Scrapy spider that scrapes product information from Amazon, given a product link.

I want to deploy this project with Streamlit, taking the product link as input on the web page and showing the scraped product information as output.

I don't know a lot about deployment. Can anyone help me with that?



Solution 1:[1]

You can create a public repository on GitHub for your Streamlit app and connect your GitHub account via OAuth. Then, after signing in on the Streamlit website, you can deploy the app on Streamlit's servers.

Solution 2:[2]

You can run Scrapy from a script using the scrapy.crawler.CrawlerProcess class.

Basically, you can run the spider, export the data to a temporary file, and use that file in your Streamlit app:

import scrapy
from scrapy.crawler import CrawlerProcess

class MySpider(scrapy.Spider):
    # Your spider definition
    ...

process = CrawlerProcess(settings={
    "FEEDS": {
        "items.json": {"format": "json"},
    },
})

process.crawl(MySpider)
process.start() # the script will block here until the crawling is finished

Now you can save this script and run it from your app with the subprocess module; it will export the data to items.json. Then load that file in your app.
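The subprocess step could look roughly like the sketch below. It is only an illustration under some assumptions: the helper name run_spider_script, its parameters, and the script name run_spider.py are hypothetical, and the crawler script is assumed to write items.json into its own directory. Running the crawl in a child process should also sidestep the problem that Twisted's reactor cannot be restarted inside the long-lived Streamlit process.

```python
import json
import subprocess
import sys
from pathlib import Path


def run_spider_script(script_path, feed_name="items.json", args=(), timeout=300):
    """Run a crawler script in a child process and return the items it exported.

    Hypothetical helper: assumes the script writes a JSON feed called
    `feed_name` into the directory it lives in.
    """
    script_path = Path(script_path).resolve()
    feed_path = script_path.parent / feed_name

    # Remove any stale feed first: by default Scrapy appends to an existing
    # local feed file, which would leave invalid JSON after a second run.
    feed_path.unlink(missing_ok=True)

    # A fresh interpreter per run, started in the script's directory so the
    # relative feed path resolves next to the script.
    subprocess.run(
        [sys.executable, str(script_path), *args],
        cwd=script_path.parent,
        check=True,
        timeout=timeout,
    )

    # The feed may be missing or empty if nothing was scraped.
    if not feed_path.exists() or feed_path.stat().st_size == 0:
        return []
    with feed_path.open(encoding="utf-8") as fh:
        return json.load(fh)


# Possible use inside the Streamlit app (assumes streamlit is installed and
# the crawler script is saved as run_spider.py):
#
#   import streamlit as st
#   url = st.text_input("Product link")
#   if st.button("Scrape") and url:
#       st.json(run_spider_script("run_spider.py", args=[url]))
```

For the link to reach the spider, the crawler script would need to read it, for example from sys.argv[1], and forward it as a spider argument such as process.crawl(MySpider, url=...); the argument name url is an assumption here and has to match what your spider expects.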

Here is a helpful Streamlit Cloud Scrapy thread, with a public streamlit-scrapy project GitHub repo.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Peker Celik
Solution 2 ahmedshahriar