an attempt at a more readable evaluation of anime tagger ai systems

Luna c213987859 add usage notes to readme		2023-06-10 18:36:18 -03:00
.gitignore	add histogram plotting of scores	2023-06-10 15:26:02 -03:00
config.example.json	add download_images function	2023-06-09 23:19:15 -03:00
LICENSE	Initial commit	2023-06-10 00:41:46 +00:00
main.py	add practical error column	2023-06-10 18:36:12 -03:00
README.md	add usage notes to readme	2023-06-10 18:36:18 -03:00
requirements.txt	add histogram plotting of scores	2023-06-10 15:26:02 -03:00

README.md

tagger-showdown

an attempt at a more readable evaluation of anime tagger ai systems

idea: take some recent images from danbooru, also include your own

then run x tagger systems against each other

score formula:

(len(predicted tags in ground_truth) - len(predicted tags not in ground_truth)) / len(ground_truth)

then average for all posts
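
a minimal sketch of the score formula above, assuming "tags in ground_truth" means the tags a tagger predicted that also appear in the post's ground truth tags. the names `score_post` and `score_tagger` are made up for illustration and are not taken from main.py, and the example tags are invented:

```python
def score_post(predicted_tags, ground_truth):
    """Score one post: (hits - misses) / len(ground_truth)."""
    predicted = set(predicted_tags)
    truth = set(ground_truth)
    if not truth:
        raise ValueError("ground truth must contain at least one tag")
    hits = len(predicted & truth)    # predicted tags that are in the ground truth
    misses = len(predicted - truth)  # predicted tags that are not in the ground truth
    return (hits - misses) / len(truth)


def score_tagger(posts):
    """Average the per-post score over (predicted_tags, ground_truth) pairs."""
    scores = [score_post(predicted, truth) for predicted, truth in posts]
    return sum(scores) / len(scores)


# hypothetical example data: two posts, one tagger
posts = [
    (["1girl", "solo", "hat"], ["1girl", "solo", "smile", "long_hair"]),  # (2 - 1) / 4
    (["1boy", "sword"], ["1boy", "sword"]),                               # (2 - 0) / 2
]
print(score_tagger(posts))  # prints 0.625
```

under this reading a tagger that emits many wrong tags can go negative, and a perfect tagger scores 1.0.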

system dependencies:

python3
stable-diffusion-webui with the tagger extension
hydrus-dd

python3 -m venv env
env/bin/pip install -Ur ./requirements.txt

# by default, downloads 30 images at page 150 of the default empty query
env/bin/python3 ./main.py download_images

# gets 40 images at page 150 from tag 'rating:questionable'
# you should add more tags to diversify the dataset before calculating scores
env/bin/python3 ./main.py download_images 'rating:questionable' 40 150

# configure interrogators / tagger models
# set sd_webui_address to your stable diffusion webui's address
# set dd_address to hydrus-dd's address
# and set dd_model_name to something identifiable about the model
# i set it to the md5sum output of my file, so that if the file
# changes on koto's end, it's clear why my numbers may be different
cp config.example.json config.json

# fight mode -- run all interrogators against the dataset you've downloaded
env/bin/python3 ./main.py fight

# score mode -- crank out the final numbers, generates graphs under the plots/ folder
# note: `scores` is an assumed subcommand name, check main.py if it fails
env/bin/python3 ./main.py scores