
tagger-showdown

an attempt at a more readable evaluation of anime tagger ai systems

idea: take some recent images from danbooru, optionally adding your own

then run x tagger systems against each other

score formula:

(len(predicted tags in ground_truth) - len(predicted tags not in ground_truth)) / len(ground_truth)

then average over all posts
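
a minimal sketch of the formula above, assuming tags are compared as exact strings and duplicates are ignored. the names post_score and average_score are made up for illustration and are not taken from main.py:

```python
def post_score(predicted_tags, ground_truth_tags):
    """Score one post: (correct - wrong) / number of ground truth tags."""
    predicted = set(predicted_tags)
    truth = set(ground_truth_tags)
    if not truth:
        # assumption: posts without ground truth tags are rejected
        raise ValueError("ground truth must contain at least one tag")
    correct = len(predicted & truth)  # predicted tags that are in the ground truth
    wrong = len(predicted - truth)    # predicted tags that are not in the ground truth
    return (correct - wrong) / len(truth)


def average_score(posts):
    """Average post_score over (predicted_tags, ground_truth_tags) pairs."""
    scores = [post_score(predicted, truth) for predicted, truth in posts]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    posts = [
        (["1girl", "solo", "hat"], ["1girl", "solo", "smile", "long_hair"]),
        (["1girl"], ["1girl", "solo"]),
    ]
    print(average_score(posts))
```

under this reading, a tagger that predicts exactly the ground truth scores 1.0 on that post, and the score goes negative once wrong tags outnumber correct ones.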

system dependencies:

python3 -m venv env
env/bin/pip install -Ur ./requirements.txt

# by default, downloads 30 images at page 150 of the default empty query
env/bin/python3 ./main.py download_images

# gets 40 images at page 150 from tag 'rating:questionable'
# you should add more tags to diversify the dataset before calculating scores
env/bin/python3 ./main.py download_images 'rating:questionable' 40 150

# configure interrogators / tagger models
# set sd_webui_address to your stable diffusion webui's address
# set dd_address to hydrus-dd's address
# and set dd_model_name to be something identifiable about the model
# i set it to the md5sum output of my file, so that if the file
# changes on koto's end, it's clear why my numbers may be different
cp config.example.json config.json

# fight mode -- run all interrogators against the dataset you've downloaded
env/bin/python3 ./main.py fight

# score mode -- crunch the final numbers and generate graphs under the plots/ folder
env/bin/python3 ./main.py scores

# keep in mind that you can download more images, run fight mode, and then
# run score mode! the commands are aware of work that's been already done and
# will only run the tagger models for the new files
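
the "only run the tagger models for the new files" behaviour can be pictured roughly like this. this is a sketch under assumptions, not how main.py does it: results are kept in a plain dict keyed by tagger name and then file name, and pending_files, run_tagger and tag_fn are hypothetical names:

```python
def pending_files(image_names, results, tagger_name):
    """Return the files this tagger has not processed yet, in input order."""
    done = results.get(tagger_name, {})
    return [name for name in image_names if name not in done]


def run_tagger(tagger_name, tag_fn, image_names, results):
    """Run tag_fn only on files with no stored result for this tagger."""
    for name in pending_files(image_names, results, tagger_name):
        # tag_fn is a stand-in for a real interrogator call
        results.setdefault(tagger_name, {})[name] = sorted(tag_fn(name))
    return results


if __name__ == "__main__":
    results = {}
    fake_tagger = lambda name: {"1girl", "solo"}
    run_tagger("fake", fake_tagger, ["a.png", "b.png"], results)
    # after downloading more images, only c.png is still pending
    print(pending_files(["a.png", "b.png", "c.png"], results, "fake"))
```

a real version would also need to persist the results between runs (a json file or a small database would both work); that part is left out here.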