# tagger-showdown

an attempt at a more readable evaluation of anime tagger ai systems

idea: take some recent images from danbooru (you can also include your own)

then run x tagger systems against each other

score formula, computed per post for each tagger:

`(len(predicted tags in ground_truth) - len(predicted tags not in ground_truth)) / len(ground_truth)`

then average over all posts
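
a minimal python sketch of the formula above, to make the arithmetic concrete. the names `score_post` and `average_score` are made up for this example and are not taken from `main.py`; it also assumes tags are compared as exact strings and that duplicates don't count twice.

```python
def score_post(predicted, ground_truth):
    """Score one post: correct tags minus wrong tags, over ground truth size."""
    predicted = set(predicted)
    ground_truth = set(ground_truth)
    if not ground_truth:
        # assumption: a post without ground truth tags can't be scored
        raise ValueError("ground_truth must not be empty")
    correct = len(predicted & ground_truth)  # predicted tags in ground_truth
    wrong = len(predicted - ground_truth)    # predicted tags not in ground_truth
    return (correct - wrong) / len(ground_truth)


def average_score(posts):
    """Average the per-post scores; posts is a list of (predicted, ground_truth)."""
    scores = [score_post(predicted, truth) for predicted, truth in posts]
    return sum(scores) / len(scores)


truth = ["1girl", "solo", "smile", "long_hair"]
# 2 correct tags, 1 wrong tag, 4 ground truth tags: (2 - 1) / 4
print(score_post(["1girl", "solo", "hat"], truth))  # 0.25
```

note that the score goes negative when a tagger emits more wrong tags than correct ones, and tops out at 1.0 when it predicts exactly the ground truth.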

system dependencies:
 - python3
 - [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) with the [tagger extension](https://github.com/toriato/stable-diffusion-webui-wd14-tagger)
 - [hydrus-dd](https://gitgud.io/koto/hydrus-dd)

```sh
python3 -m venv env
env/bin/pip install -Ur ./requirements.txt

# by default, downloads 30 images at page 150 of the default empty query
env/bin/python3 ./main.py download_images

# gets 40 images at page 150 from tag 'rating:questionable'
# you should add more tags to diversify the dataset before calculating scores
env/bin/python3 ./main.py download_images 'rating:questionable' 40 150

# configure interrogators / tagger models
# set sd_webui_address to your stable diffusion webui's address
# set dd_address to hydrus-dd's address
# and set dd_model_name to something that identifies the model
# i set it to the md5sum output of my model file, so that if the file
# changes on koto's end, it's clear why my numbers may be different
cp config.example.json config.json

# fight mode -- run all interrogators against the dataset you've downloaded
env/bin/python3 ./main.py fight

# score mode -- crank the final numbers, generates graphs under plots/ folder
env/bin/python3 ./main.py scores

# keep in mind that you can download more images, run fight mode, and then
# run score mode! the commands are aware of work that's already been done and
# will only run the tagger models for the new files
```