dreambooth-stable-diffusion

Dreambooth with Stable Diffusion

🖼 Dreambooth example trained on my own photos, based on the dotCSV video tutorial.

📄 Usage

  1. Generate a dataset like the one in data/input/[token_name], with around 20 square images of 512x512 pixels and the following mix of shot types:
     60% of foreground images of you
     30% of half body images of you
     10% of full body images of you
    

    In my case, I used photos of myself for my own token, and photos of my dog to test how it works with a non-person subject.

  2. Open the Google Colab notebook and follow its instructions to train your model (a copy is available at docs/fast_dreambooth__dotCSV__version.ipynb).

  3. Once the model is trained, download the generated weights from your Google Drive since you will use them locally.

  4. Using the amazing UI that AbdBarho put together (stable-diffusion-webui-docker), you can run the model locally and generate your own images.

    4.1. Clone the repo.

    git clone git@github.com:AbdBarho/stable-diffusion-webui-docker.git
    

    4.2. Download all needed dependencies:

    docker-compose --profile download up --build
    

    4.3. Copy your weights into the data folder, alongside the rest of the models.

    4.4. Edit docker-compose.yml and add the path to your custom model to the CLI_ARGS environment variable, using the following argument:

    --ckpt /data/[path_to_your_ckpt]
    

    4.5. Run the UI:

    docker-compose --profile hlky up -d --build
    

    4.6. The app will be available at http://localhost:7860/.

  5. Enjoy! 🎉
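
For step 1, the 60/30/10 mix can be turned into concrete image counts for any dataset size. The snippet below is only a minimal sketch: the function name `split_counts` and the category keys are hypothetical, and giving leftover images to the largest category is an assumption, not something the tutorial prescribes.

```python
# Percentages taken from step 1 of the usage section.
SPLIT_PERCENT = {"foreground": 60, "half_body": 30, "full_body": 10}


def split_counts(total_images, split_percent=SPLIT_PERCENT):
    """Return how many images of each shot type to collect."""
    # Integer arithmetic avoids floating-point rounding surprises.
    counts = {name: total_images * pct // 100 for name, pct in split_percent.items()}
    # Integer division can leave a few images unassigned;
    # assumption: hand them to the largest category.
    leftover = total_images - sum(counts.values())
    largest = max(split_percent, key=split_percent.get)
    counts[largest] += leftover
    return counts


print(split_counts(20))
```

With the suggested 20 images this gives 12 foreground, 6 half body and 2 full body photos.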

🚂 Training details

For the training, I used the default parameters from the dotCSV video tutorial, which are the following:

Training_Subject: Character
With_Prior_Preservation: Yes
Captionned_Instance_Images: False
Subject_Type: person
Instance_Name: suaresito
Number_Of_Subject_Images: 500
Dataset: person_ddim
fp16: True
Training_Steps: 1600
Seed: 75576

The only difference is that I used Subject_Type: dog for caosdelola, which is the token name of my dog.
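
To make the difference between the two training runs explicit, the parameters above can be written as a plain dictionary with a per-token override. This is an illustrative sketch only: the notebook exposes these values as form fields, the names `BASE_PARAMS` and `params_for` are hypothetical, and setting Instance_Name to the dog's token name is an assumption based on how the token is used in the prompts.

```python
# Values copied from the training details listed above.
BASE_PARAMS = {
    "Training_Subject": "Character",
    "With_Prior_Preservation": "Yes",
    "Captionned_Instance_Images": False,
    "Subject_Type": "person",
    "Instance_Name": "suaresito",
    "Number_Of_Subject_Images": 500,
    "Dataset": "person_ddim",
    "fp16": True,
    "Training_Steps": 1600,
    "Seed": 75576,
}


def params_for(instance_name, subject_type):
    """Copy the base parameters and override the token-specific ones."""
    params = dict(BASE_PARAMS)
    params.update(Instance_Name=instance_name, Subject_Type=subject_type)
    return params


# Assumption: the dog run also uses its own token as Instance_Name.
dog_params = params_for("caosdelola", "dog")
changed = sorted(k for k in BASE_PARAMS if BASE_PARAMS[k] != dog_params[k])
print(changed)
```

Everything else, including the step count and seed, stays the same between the two runs.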

☑️ Results

🧑🏻‍💻 My Personal Token

Here are some of the results I got from the model, along with the prompt and CFG scale used for each:

Output 1

Highly detailed portrait of suaresito, stephen bliss, unreal engine, fantasy art by greg rutkowski, loish, rhads, ferdinand knab, makoto shinkai and lois van baarle, ilya kuvshinov, rossdraws, tom bagshaw, alphonse mucha, global illumination, radiant light, detailed and intricate environment

CFG scale: 7

Output 2

suaresito god of the forest, 3 0 years old, rugged, male, gorgeous, detailed face, ottoman, amazing, thighs, flowers, muscular, intricate, highly detailed, digital painting, artstation, concept art, sharp focus, illustration, art by greg rutkowski and alphonse mucha

CFG scale: 10

Output 3

Portrait of suaresito, scarred! D&D, muscular, fantasy, intricate, elegant, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, art by artgerm and greg rutkowski and alphonse mucha

CFG scale: 13

Output 4

amazing lifelike award winning pencil illustration of suaresito 90210 trending on art station artgerm Greg rutkowski alphonse mucha cinematic

CFG scale: 7

Output 5

portrait of suaresito as postman 1 9 9 7. intricate abstract. intricate artwork. by tooth wu, wlop, beeple, dan mumford. octane render, trending on artstation, greg rutkowski very coherent symmetrical artwork. cinematic, hyper realism, high detail, octane render, 8 k, iridescent accents

CFG scale: 9

Output 6

a portrait of suaresito as a barbarian, detailed, centered, digital painting, artstation, concept art, donato giancola, joseph christian leyendecker, wlop, boris vallejo, breathtaking, 8 k resolution, extremely detailed, beautiful, establishing shot, artistic, hyperrealistic, beautiful face, octane render

CFG scale: 7

Output 7

Highly detailed portrait of suaresito, stephen bliss, unreal engine, fantasy art by greg rutkowski, loish, rhads, ferdinand knab, makoto shinkai and lois van baarle, ilya kuvshinov, rossdraws, tom bagshaw, alphonse mucha, global illumination, radiant light, detailed and intricate environment

CFG scale: 7.5

Output 8

Portrait of suaresito wearing futuristic power armor, fantasy, intricate, highly detailed, digital painting, trending on artstation, sharp focus, illustration, style of Stanley Artgerm and Dan Mumford

CFG scale: 7

Output 9

russian poet suaresito, portrait, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, cinematic lighting, art by artgerm and greg rutkowski and alphonse mucha

CFG scale: 10

Output 10 Output 11 Output 12

Output 13 Output 14 Output 17

Highly detailed portrait of suaresito, stephen bliss, unreal engine, fantasy art by greg rutkowski, loish, rhads, ferdinand knab, makoto shinkai and lois van baarle, ilya kuvshinov, rossdraws, tom bagshaw, alphonse mucha, global illumination, radiant light, detailed and intricate environment

CFG scale: 7.5

By far, my favourite prompt ⬆️

Output 15 Output 16

highly detailed portrait of suaresito, stephen bliss, unreal engine, greg rutkowski, ilya kuvshinov, ross draws, hyung tae and frank frazetta, tom bagshaw, tom whalen, nicoletta ceccoli, mark ryden, earl norem, global illumination, god rays

CFG scale: 7

🧑🏻‍💻 My dog’s token

Here are some of the results I got from the model, along with the prompt and CFG scale used for each. The prompts are much simpler than the previous ones:

Output 1

caosdelola drinking a tequila in mexico

CFG scale: 10

Output 2

caosdelola painting with pencil

CFG scale: 10

Output 3

caosdelola celebrating Christmas

CFG scale: 7

Output 4

caosdelola going to the space

CFG scale: 7

Output 5

caosdelola playing football

CFG scale: 7