Scenario 5: Avatars

In the Stock Photos chapter, I introduced the img2img method, which can solve some stock photo issues and is also useful in other scenarios.

This chapter focuses on avatar prompts using this method. We'll also start expanding our prompt structure to make it more complete.

Important Notes

Before introducing img2img applications, I want to emphasize:

  1. Don't upload your own photos to public Midjourney Discord servers! Doing so exposes them to all server members. I recommend messaging the Midjourney bot directly instead.
  2. Also, images generated by non-Pro members are viewable by all members, so you can delete the images after generating your avatar. See the Basics section if you don't know how to use the bot or delete images.
  3. I won't elaborate on the img2img process here. Check the Basics and Stock Photos chapters if you need a refresher.

3D Cartoon Avatars

First I'll cover cartoon avatars. Let me emphasize:

  1. I've looked at almost all Chinese and English avatar generation tutorials, tried them myself, and discussed with the Midjourney community. My understanding is that with V5's current capabilities, no matter how you tweak the prompt, randomly generating an image very similar to the original is mostly luck. Even using the techniques I'll share, you just increase the probability. If you know a reliable method to generate highly similar avatars, please share via the GitHub issues - I'll credit you by name and share it with everyone.
  2. During your learning process, if the generated images don't resemble the original, don't be discouraged. This is normal.
  3. You can use the methods I share to create avatars that capture the spirit of the original, but they definitely won't be highly similar.

In the prompt, include a link to the original image (ID photos or photos with simple backgrounds work best and have a higher success rate). Then design a prompt using my framework:

| Question | Prompt | Explanation |
| --- | --- | --- |
| What is the type? | Portraits / Avatar | For an ID photo you can include "portrait" or "avatar". |
| What is the subject? | smiling cute boy, undercut hairstyle | Optional. You can initially leave this blank, then add descriptive words if the output doesn't match: gender, appearance, hairstyle, accessories (glasses, earrings), expression. Focus on distinctive traits; getting these right means higher similarity. |
| What is the background? | white background | I kept the white background like an ID photo, but you could add real backgrounds like restaurants. |
| What is the composition? | null | We provided the image already, so there's no need to specify. |
| What lens? | soft focus | A soft focus lens gives a sharp yet soft, dream-like effect, often used for portraits. I added this to soften the image, but you can omit it. |
| What is the style? | 3d render, Pixar style | Since we want a 3D avatar, I added "3d render" and my preferred Pixar style. |
| Parameters | --iw 2 | iw sets the image vs. text weight; higher values favor the image. Details in the Advanced Parameters chapter. |
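
Assembled from the table above, a complete prompt might look like this ({img url} stands for the link to your uploaded photo):

{img url} avatar, smiling cute boy, undercut hairstyle, white background, soft focus, 3d render, Pixar style --iw 2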

Finally, three additional techniques:

  1. If the generated images don't match your provided image, pick the closest one, click V (Variation) to have the model generate more variations, pick the next closest, and repeat until you get a good match.
  2. Strangely, if the above doesn't work and the outputs still don't match, try adding "wear glasses" to the prompt. For me, this made the outputs match much better. If your original photo has glasses, you can try --no glasses to get a less similar result.
  3. Use multiple parameters together. I'll expand on this in Tip 7 below.

At the end of this chapter I have an example using my ID photo.

Tip 7: Using Multiple Parameters

When generating avatars with img2img, I found the issue was that the text weight outweighed the image weight, so the outputs didn't match the original. With iw, V5 caps the image weight at 2, so I tried the s parameter as well, and it improved results a lot.

If the image still doesn't match, you can try adding --s 200 alongside --iw 2. Note there is no comma between parameters. I've found that adding s makes the output much more similar; my guess is that using s and iw together reduces the text weight.
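
For example, appending both parameters to the 3D cartoon prompt from earlier gives this (the values are just a starting point):

{img url} avatar, smiling cute boy, undercut hairstyle, white background, soft focus, 3d render, Pixar style --iw 2 --s 200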

Higher s values make the output more stylized and abstract. So if it still doesn't match, try increasing s further - e.g. to 500.

I want to demonstrate that combining parameters can have a synergistic effect, further enhancing the model's capabilities. Consider potential combinations when new parameters are introduced.

Anime Avatars

The modifications are mainly to the image style:

| Question | Prompt | Explanation |
| --- | --- | --- |
| What is the type? | Portraits / Avatar | Same as 3D Cartoon Avatars. |
| What is the subject? | smiling cute boy, undercut hairstyle | Same as 3D Cartoon Avatars. |
| What is the background? | white background | Same as 3D Cartoon Avatars. |
| What is the composition? | null | Same as 3D Cartoon Avatars. |
| What lens? | null | Omitting the soft focus lens since this is anime style. |
| What is the style? | anime, Studio Ghibli | The target is an anime avatar, so I added anime and the Studio Ghibli style. |
| Parameters | --iw 2 --s 500 | Don't comma-separate when using multiple parameters. |
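
Assembled, the anime prompt might look like this:

{img url} avatar, smiling cute boy, undercut hairstyle, white background, anime, Studio Ghibli --iw 2 --s 500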

Cyberpunk Avatars

One of my favorite styles - this time we modify the subject, background, and style:

| Question | Prompt | Explanation |
| --- | --- | --- |
| What is the subject? | cyberpunk robot face, holographic VR glasses, holographic cyberpunk clothing | Added descriptors for the face, VR glasses, and cyberpunk clothes. |
| What is the background? | neon-lit cityscape background | Added a neon cityscape for a cyberpunk vibe. |
| What is the style? | Cyberpunk, by Josan Gonzalez | Added the cyberpunk style and one of my favorite cyberpunk artists, Josan Gonzalez. |
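
Assembled, the full prompt looks like this (we'll reuse it in Tip 8 below):

{img url} avatar, cyberpunk robot face, holographic VR glasses, holographic cyberpunk clothing, neon-lit cityscape background, Cyberpunk, by Josan Gonzalez --s 500 --iw 1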

Here are the avatars I generated. To be honest, I don't have many distinctive features or good looks, so the outputs all look vaguely Southeast Asian. Midjourney still doesn't seem great with Asian faces! 😂

MJ039.png

Tip 8: Modifying Images Using Seed

Note: I think this technique has potential but is currently unreliable in Midjourney. The official help docs also mention seeds are very unstable in V5. See my Midjourney FAQ chapter.

You may encounter situations like:

  1. You input a prompt and get 4 images
  2. One image looks decent but the rest are unsatisfactory, so you tweak the prompt
  3. But now all the new outputs are unsatisfactory, which is frustrating
  4. You wish you could modify one of the initial outputs

In theory, you should be able to provide a seed from the initial outputs for the model to modify further.

For example, with the cyberpunk avatar I first generate images using the original prompt. Then in Discord, click the emoji button in the top right of the message (Image 1 below), type "envelope" in the text box (Image 2), and click the envelope emoji (Image 3). This makes the bot send you the seed number.

MJ040.png

Then I modify the prompt to change the background to Chinatown. Note:

  1. The new prompt includes the original, only the background portion is changed.
  2. Add the seed parameter.

Original prompt:

{img url} avatar, cyberpunk robot face, holographic VR glasses, holographic cyberpunk clothing, neon-lit cityscape background, Cyberpunk, by Josan Gonzalez --s 500 --iw 1

New prompt (replace seed with your actual number):

{img url} avatar, cyberpunk robot face, holographic VR glasses, holographic cyberpunk clothing, China Town background, Cyberpunk, by Josan Gonzalez --s 500 --iw 1 --seed 758242567

Here are the results - the background did change, but the face also changed a bit! 😂

The effect isn't great, but I think it's worth exploring as this could improve iterative refinement:

MJ041.png

Tip 9: The Mysterious Blend Feature

To be honest, I hesitate to call this a technique - it's a very unstable Midjourney feature. But it's quite important so I'll introduce it.

It's simple to use: in Discord, type /blend, then click the blend menu item:

MJ057.png

Your input will change to this:

MJ058.png

Click the boxes to select two images from your computer, then hit enter to generate the blend.

MJ059.png

Then Midjourney generates this awesome result, with a fused Iron Man on the left and a fused Buzz Lightyear on the right:

MJ061.png

I was shocked when I first saw this - it's like fusing monsters together in Yu-Gi-Oh! Unfortunately it's very unstable and only works if multiple unknown conditions are met.

I wanted to use this for avatars but found blending my own photos with other styles rarely works well. It seems best with celebrity photos, likely because Midjourney has been trained extensively on those.

But I think it has great potential for easily creating unique avatars by blending your photo with something else. Unfortunately it's not very usable yet.

It also has many other applications which I'll cover later.