Tool for scraping ***booru image gallery sites.
Fake-Name e60ecff007 Fixing stuff.
Apparently I somehow completely fucked up the xbooru fetcher. Wat.
2018-04-22 20:49:13 -07:00
alembic Add image metadata to the file table, and fetch more then one db row per query, because the get_job() function query was somehow completely slamming my database. 2017-11-25 00:01:31 -08:00
scraper Fixing stuff. 2018-04-22 20:49:13 -07:00
util Fixing stuff. 2018-04-22 20:49:13 -07:00
.gitignore Move things about, update webrequest lib. 2017-11-20 21:02:57 -08:00
README.md Added moar sites. 2017-04-14 18:38:05 -07:00
alembic.ini I think this should bring everything up to functional. 2017-11-22 23:08:27 -08:00
main.py Fixing stuff. 2018-04-22 20:49:13 -07:00
settings.py Fixing stuff. 2018-04-22 20:49:13 -07:00

README.md

Minimalistic, high-volume, multi-threaded archive tool for image gallery sites.

Currently Supports:

Written because I needed an extremely large image database with tags to use for some experiments with training neural nets.

Requires:

  • SQLAlchemy
  • A database of some sort (currently only works with PostgreSQL)
  • Beautiful Soup 4
  • Python 3
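
A quick way to check the Python-side requirements before running the scraper is sketched below. It is not part of this repo: the `missing_modules` helper and the `REQUIRED_MODULES` list are made up for illustration. `sqlalchemy` and `bs4` are the import names of SQLAlchemy and Beautiful Soup 4. A PostgreSQL driver is probably needed as well, but which one the project uses isn't stated here, so it is left out.

```python
import importlib.util
import sys

# Import names for the Python packages listed under "Requires".
# This list is an illustration, not something defined by the project.
REQUIRED_MODULES = ["sqlalchemy", "bs4"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3,):
        sys.exit("Python 3 is required.")

    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All Python dependencies found.")
```

This only checks that the packages can be located; it does not verify versions or that a database is reachable.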

Potential ideas: