Tool for scraping ***booru image gallery sites.
Fake-Name fd41dbbd4c Add image metadata to the file table, and fetch more then one db row per query, because the get_job() function query was somehow completely slamming my database. 2017-11-25 00:01:31 -08:00
alembic Add image metadata to the file table, and fetch more then one db row per query, because the get_job() function query was somehow completely slamming my database. 2017-11-25 00:01:31 -08:00
scraper Add image metadata to the file table, and fetch more then one db row per query, because the get_job() function query was somehow completely slamming my database. 2017-11-25 00:01:31 -08:00
util Ok, most of the scrapers are go. 2017-11-23 20:16:10 -08:00
.gitignore Move things about, update webrequest lib. 2017-11-20 21:02:57 -08:00
README.md Added moar sites. 2017-04-14 18:38:05 -07:00
alembic.ini I think this should bring everything up to functional. 2017-11-22 23:08:27 -08:00
main.py Restructure mostly done, it at least starts running. 2017-11-20 22:57:18 -08:00
settings.py Bringing it baaaaack! 2017-02-06 22:01:57 -08:00

README.md

Minimalistic, high-volume, multi-threaded archive tool for image gallery sites.
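To give a rough idea of what "multi-threaded" and "high-volume" could mean in practice, here is a minimal sketch of several worker threads claiming jobs in batches from a shared table. It is not this project's actual code: `JobTable`, `claim_batch`, `fetch`, and `run` are hypothetical names, the in-memory list stands in for the real database, and `fetch` does no network I/O.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class JobTable:
    """Hypothetical in-memory stand-in for a database table of pending jobs."""

    def __init__(self, urls):
        self._pending = list(urls)
        self._lock = threading.Lock()

    def claim_batch(self, size):
        """Atomically hand out up to `size` pending jobs in a single call."""
        with self._lock:
            batch = self._pending[:size]
            self._pending = self._pending[size:]
            return batch


def fetch(url):
    # Placeholder for a real download; returns a fake result record.
    return {"url": url, "status": "done"}


def worker(table, results, batch_size=3):
    # Keep claiming batches until no jobs are left.
    while True:
        batch = table.claim_batch(batch_size)
        if not batch:
            return
        for url in batch:
            results.append(fetch(url))  # list.append is thread-safe in CPython


def run(urls, threads=4):
    table = JobTable(urls)
    results = []
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(worker, table, results) for _ in range(threads)]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return results
```

Claiming several jobs per call means each worker touches the shared table less often, which is the usual reason to batch when many threads compete for the same queue.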

Currently Supports:

Written because I needed an extremely large image database with tags to use for some experiments with training neural nets.

Requires:

  • SQLAlchemy
  • A database of some sort (currently only works with PostgreSQL)
  • Beautiful Soup 4
  • Python 3

Potential ideas: