
- Octoparse not working on infinite scroll manual
- Octoparse not working on infinite scroll software
- Octoparse not working on infinite scroll professional
Some was hidden mist! Most did not work at all.
Octoparse Web Scraper
Experience: I have been looking for a professional web scraper for about two months now.
Comments: The software is much easier to use and visually appealing, and the ongoing customer support and tutorials have been created with the user in mind.

For large Reddit scraping requirements, you must leverage automated scraping methodologies like custom scripts, API services, or “click and scrape” tools. If you have a high budget and your daily Reddit scraping requirements are well past a few million posts, prefer hiring web scraping developers and data testing, cleansing, and validation engineers. Remember, this is a resource-intensive option: you’ll need human, computing, and networking resources on top of web-scraping-specific resources such as proxy services and databases. If your daily scraping requirements are within a few million posts or rows of data, “click and scrape” tools would be more cost- and resource-efficient.

New Reddit has an infinite scroll feature, which makes it tricky to scrape. But scraping new Reddit is a cakewalk with Octoparse: it hardly takes 5 minutes to set up and start scraping data from Reddit.
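Infinite scroll is tricky because new posts load only after the page is scrolled, so a scraper has to keep scrolling until nothing more appears. The loop below is a minimal sketch of that idea, not how Octoparse does it internally. The names `scroll_to_end`, `get_height`, and `scroll_down` are hypothetical, and the two callables stand in for whatever browser automation you use.

```python
import time
from typing import Callable


def scroll_to_end(get_height: Callable[[], int],
                  scroll_down: Callable[[], None],
                  pause_s: float = 1.0,
                  max_rounds: int = 50) -> int:
    """Scroll until the page height stops growing.

    get_height and scroll_down are hypothetical hooks supplied by the caller.
    Returns the number of scroll actions performed.
    """
    last_height = get_height()
    rounds = 0
    while rounds < max_rounds:
        scroll_down()
        rounds += 1
        time.sleep(pause_s)  # give the page time to load more posts
        new_height = get_height()
        if new_height == last_height:
            break  # nothing new was loaded, so assume the end was reached
        last_height = new_height
    return rounds
```

With Selenium, for example, `get_height` could wrap `driver.execute_script("return document.body.scrollHeight")` and `scroll_down` could wrap `driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")`. The `max_rounds` guard matters on feeds that never run out of content.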
Which method should you choose for scraping Reddit? If your Reddit scraping requirements are small, go for manual scraping. Say you only need to scrape three or four Reddit threads on a particular topic; then manual scraping should be preferred.

With “click and scrape” tools, any added knowledge of XPath or RegEx is beneficial, though it is not required, as good tools have built-in functionality to extract XPath or generate RegEx.
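To give a feel for what XPath and RegEx do in a scraping job, here is a small sketch using only the Python standard library. The markup in `SAMPLE` is invented for illustration and is not Reddit's real page structure. `extract_posts` is a hypothetical helper. Real pages are rarely well-formed XML, so an HTML-tolerant parser would be needed in practice.

```python
import re
import xml.etree.ElementTree as ET

# Invented, well-formed markup standing in for a list of posts.
SAMPLE = """
<div>
  <div class="post">
    <a class="title" href="/r/example/comments/abc123/first_post/">First post</a>
    <span class="score">42 points</span>
  </div>
  <div class="post">
    <a class="title" href="/r/example/comments/def456/second_post/">Second post</a>
    <span class="score">7 points</span>
  </div>
</div>
"""


def extract_posts(markup: str) -> list:
    """Select post nodes with XPath, then pull fields out with RegEx."""
    root = ET.fromstring(markup.strip())
    posts = []
    # XPath: every <div class="post"> anywhere below the root.
    for post in root.findall(".//div[@class='post']"):
        link = post.find("./a[@class='title']")
        score_text = post.find("./span[@class='score']").text
        # RegEx: the post id sits between /comments/ and the next slash.
        post_id = re.search(r"/comments/([a-z0-9]+)/", link.get("href")).group(1)
        # RegEx: the first run of digits in the score label.
        score = int(re.search(r"\d+", score_text).group())
        posts.append({"id": post_id, "title": link.text, "score": score})
    return posts
```

XPath picks the elements, and RegEx cleans up the text inside them. “Click and scrape” tools typically generate both for you when you click on a field.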

Custom scraping scripts are highly customizable and scalable, but they are cost-intensive, just like third-party API services. “Click and scrape” tools are scalable and require only basic know-how of using a mouse.

Manually scraping Reddit is the easiest method but the least efficient in terms of speed and cost. However, manual scraping yields data with high consistency. Scraping with the Reddit API provides data easily, but you need at least basic coding skills to use it. Also, the Reddit API limits the number of posts in any Reddit thread to 1000, so it is not possible to scrape any post other than the top 1000. Third-party API services are an effective and scalable way to scrape Reddit, but they are not cost-efficient. Custom scraping scripts, again, require a high level of programming skill.
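The 1000-post cap mentioned above shapes how an API-based scraper has to page through results. The sketch below assumes the listing format I understand the Reddit API to return: a `data.children` array plus a `data.after` cursor, requested in pages of up to 100 items. Treat those details, and the hypothetical names `collect_posts` and `fetch_page`, as assumptions to check against the official documentation. The network call is left to the caller so the paging logic stays visible.

```python
from typing import Callable, Optional

LISTING_CAP = 1000  # cap described in the text above
PAGE_SIZE = 100     # assumed maximum page size per request


def collect_posts(fetch_page: Callable[[Optional[str], int], dict],
                  cap: int = LISTING_CAP) -> list:
    """Follow the `after` cursor until the listing ends or the cap is hit.

    fetch_page(after, limit) is a hypothetical caller-supplied function that
    returns one listing page as a dict.
    """
    posts = []
    after = None
    while len(posts) < cap:
        limit = min(PAGE_SIZE, cap - len(posts))
        listing = fetch_page(after, limit)
        children = listing["data"]["children"]
        if not children:
            break  # empty page: nothing more to read
        posts.extend(child["data"] for child in children)
        after = listing["data"]["after"]
        if after is None:
            break  # no cursor: the listing is exhausted
    return posts[:cap]
```

A real `fetch_page` would issue an authenticated HTTP request with the `after` and `limit` values as query parameters and return the decoded JSON. Anything beyond the cap has to come from another route, such as the third-party services or custom scripts discussed here.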

