Scraping Cloudflare Sites
I am trying to read data from a bunch of sites. Most of these sites use Cloudflare's anti-bot protection, which blocks my scrape attempts.
I don't know much about how the Cloudflare anti-bot works, but I was wondering if you could explain how I could do this without using an external library.
Thanks.
Solution 1:[1]
It works very well, and how it works is not fully disclosed, since it is more of a paid service than a free one.
By default, a passed Cloudflare challenge stays valid for 30 minutes. Cloudflare has also stopped serving captchas to identify real users and replaced them with a more UX-oriented method: the browser spends a few seconds evaluating a hash. Once the first challenge is solved, your browser receives a token (cf-something) that lets you query the underlying resource for half an hour, unless the site owner has lowered that value, so your scrape has to finish within those 30 minutes. The real question is: why is no API available for the services you want to scrape?
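To make the timing constraint concrete, here is a minimal sketch in Python (standard library only) of a budget check against the half-hour window. The class name `ClearanceWindow`, its methods, and the per-request timing are hypothetical and only for illustration. The 30-minute default is taken from the answer above and may be lower on a given site. The sketch does no networking and does not solve any challenge; it only does the arithmetic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: 30 minutes is the default lifetime mentioned in the answer;
# a site owner may have configured a shorter one.
DEFAULT_TOKEN_LIFETIME = timedelta(minutes=30)


@dataclass
class ClearanceWindow:
    """Hypothetical helper: tracks how long a token issued at `issued_at` stays usable."""

    issued_at: datetime
    lifetime: timedelta = DEFAULT_TOKEN_LIFETIME

    def expires_at(self) -> datetime:
        # Moment after which the token should be treated as stale.
        return self.issued_at + self.lifetime

    def remaining(self, now: datetime) -> timedelta:
        # Time left in the window, clamped at zero once it has passed.
        return max(self.expires_at() - now, timedelta(0))

    def can_finish(self, now: datetime, requests_left: int,
                   seconds_per_request: float) -> bool:
        # True if the estimated remaining work fits in the remaining window.
        needed = timedelta(seconds=requests_left * seconds_per_request)
        return needed <= self.remaining(now)
```

For example, with a token issued at 12:00 and checked at 12:20, ten minutes remain, so 100 requests at an estimated 5 seconds each would fit while 200 would not. When the check fails, the scrape would need a fresh token before continuing.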
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | fab23 |