tagger-showdown
an attempt at a more readable evaluation of anime tagger ai systems
idea: take some recent images from danbooru, also include your own
then run x tagger systems against each other
score formula, computed per post:
(len(tags in ground_truth) - len(tags not in ground_truth)) / len(ground_truth)
then averaged over all posts
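a minimal sketch of the score formula in python. it assumes that "tags in ground_truth" means the tags a tagger predicted that also appear in the post's ground truth, and "tags not in ground_truth" means the predicted tags that don't. the function names `score_post` and `average_score` are made up for illustration and are not taken from main.py.

```python
def score_post(predicted, ground_truth):
    """Score one post: (hits - misses) / len(ground_truth)."""
    predicted = set(predicted)
    ground_truth = set(ground_truth)
    if not ground_truth:
        raise ValueError("ground truth must contain at least one tag")
    hits = len(predicted & ground_truth)    # predicted tags found in ground truth
    misses = len(predicted - ground_truth)  # predicted tags absent from ground truth
    return (hits - misses) / len(ground_truth)


def average_score(posts):
    """Average the per-post score over (predicted, ground_truth) pairs."""
    scores = [score_post(predicted, truth) for predicted, truth in posts]
    return sum(scores) / len(scores)


# hypothetical example data, not real danbooru posts
posts = [
    ({"1girl", "solo", "smile", "hat"}, {"1girl", "solo", "smile", "long_hair"}),
    ({"1girl", "solo"}, {"1girl", "solo"}),
]
print(average_score(posts))  # 0.75
```

under this reading the score is at most 1.0 and can go negative when a tagger emits more wrong tags than right ones.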
system dependencies:
- python3
- stable-diffusion-webui with the tagger extension
- hydrus-dd
python3 -m venv env
env/bin/pip install -Ur ./requirements.txt
# by default, downloads 30 images at page 150 of the default empty query
env/bin/python3 ./main.py download_images
# gets 40 images at page 150 from tag 'rating:questionable'
# you should add more tags to diversify the dataset before calculating scores
env/bin/python3 ./main.py download_images 'rating:questionable' 40 150
# configure interrogators / tagger models
# set sd_webui_address to your stable diffusion webui's address
# set dd_address to hydrus-dd's address
# and set dd_model_name to something that identifies the model
# i set it to the md5sum output of my file, so that if the file
# changes on koto's end, differing numbers can be traced back to the model
cp config.example.json config.json
# fight mode -- run all interrogators against the dataset you've downloaded
env/bin/python3 ./main.py fight
# score mode -- crank the final numbers, generates graphs under plots/ folder
env/bin/python3 ./main.py scores
# keep in mind that you can download more images, run fight mode, and then
# run score mode! the commands are aware of work that's been already done and
# will only run the tagger models for the new files
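
for the dd_model_name setting mentioned above, here is a small sketch of getting an md5 hex digest of a model file from python instead of the md5sum command. the helper name `md5_of_file` and the example path are hypothetical; only the standard library's `hashlib` is used, and the digest alone is returned (md5sum also prints the file name).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # read fixed-size chunks so large model files don't need to fit in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# usage, with a placeholder path:
# print(md5_of_file("path/to/model-file"))
```

the resulting string can be pasted into config.json as dd_model_name, so results stay tied to one specific model file.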