Would Such a Grabber Tool Be Interesting to Anyone Here?
A new browser extension pauses for a human to solve captchas, then resumes scraping on restricted sites such as Gelbooru.
An independent developer posting under the username Fdx_dy has built a browser extension to tackle a common problem in data scraping: captchas designed to stop automated clients. Many popular image boards and repositories, such as Gelbooru and Rule34.us, have implemented captcha systems that effectively block fully automated 'grabber' or scraping tools. The developer's solution keeps a human in the process. When the extension hits a captcha wall during a scraping job, it pauses automation and displays the captcha for the user to solve manually. Once the user completes the verification, the tool resumes its automated data collection.
This 'human-in-the-loop' hybrid approach offers a practical workaround. The developer is currently testing it to build datasets, and reports being "pleasantly surprised by the speed gains." The tool handles the repetitive navigation, link-following, and downloading automatically and interrupts only for the one task that still requires human cognition, which the developer says makes the workflow faster than fully manual scraping. The post, made on a developer forum, asked whether such a tool had been built before and whether there was broader interest. It points to a niche but persistent need in the data collection and AI training community, where access to large, specific image datasets is valuable.
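The pause-and-resume flow described above can be sketched in a few lines. This is an illustrative outline only, not the developer's code: the post does not describe the extension's internals, and the names `Page`, `run_job`, `fetch`, `ask_human`, and `max_prompts_per_url` are all hypothetical. The sketch assumes the caller supplies a `fetch` function that reports whether a page is showing a verification challenge, and an `ask_human` function that returns once the user has dealt with it.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Page:
    """Hypothetical result of loading one URL."""
    url: str
    needs_human: bool = False  # True when the page shows a verification challenge


def run_job(
    urls: Iterable[str],
    fetch: Callable[[str], Page],
    ask_human: Callable[[str], bool],
    max_prompts_per_url: int = 3,
) -> Tuple[List[str], int]:
    """Process URLs in order, pausing for a person whenever a challenge appears.

    Returns the URLs collected and the number of times the job paused.
    """
    pending = deque(urls)
    collected: List[str] = []
    pauses = 0
    prompts_for_current = 0

    while pending:
        url = pending[0]  # peek; only remove the URL once it has succeeded
        page = fetch(url)

        if page.needs_human:
            # Stop if the user declines or the same URL keeps failing.
            if prompts_for_current >= max_prompts_per_url or not ask_human(url):
                break
            pauses += 1
            prompts_for_current += 1
            continue  # retry the same URL now that the user has responded

        collected.append(page.url)
        pending.popleft()
        prompts_for_current = 0

    return collected, pauses
```

The queue is only advanced after a page loads without a challenge, so a pause never loses the job's position. The retry cap is an added safeguard in this sketch so the loop cannot spin forever on a page that stays blocked.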
- Developer Fdx_dy built a browser extension that pauses scraping for human captcha solving, then resumes.
- Targets sites like Gelbooru and Rule34.us that use advanced captchas to block automated tools.
- Creator reports notable speed improvements for personal dataset creation using this hybrid method.
Why It Matters
It highlights a pragmatic, hybrid approach to data collection for AI training where full automation is blocked.