How to prompt GPT Image 1.5

Expert analysis from Linked Agency
January 6, 2026

OpenAI just shipped GPT Image 1.5, and it’s already topped the LM Arena leaderboard.

That sounds impressive.
More importantly, it’s earned.

This is a real step up from GPT Image 1. Not perfect. But finally usable.

You can even generate images from presets now, without knowing how to prompt at all.

Which tells you exactly where the bottleneck has moved.

It’s no longer the model.
It’s your prompting.

Not “make it cinematic.”
Not “in the style of.”

Real prompting.

The kind that locks identity, controls composition, and leaves the model nowhere to improvise.

That’s what this piece is about.

What GPT Image 1.5 actually changes, what it still gets wrong, and how to write prompts that produce images you’ll actually publish.

Because the rule hasn’t changed.

If your prompt leaves room for interpretation, the model will interpret it. Confidently. Incorrectly. With just enough polish to make you doubt yourself.

Let’s fix that.

A quick note before we get into it

First, make sure you’re subscribed to my newsletter.

MarTech AI is reader-supported. Subscriptions are what let me spend time pressure-testing tools properly, instead of rushing out shallow takes the moment a leaderboard updates.

Second, the AI Creators Club.

Doors close at the end of the month. This space moves fast, and the goal is to stay ahead of the curve, not chase updates on X. If you want to learn how to use these tools for real business use cases, not just play with them, that’s where we go deep.

Alright. Let’s get into it.

The anatomy of a GPT Image 1.5 prompt

A strong GPT Image 1.5 prompt is not creative writing.
It’s a stack of constraints, each doing a specific job.

If a line doesn’t enforce behaviour, it gets cut.

What follows is the exact structure behind the cover image: me slicing a banana in mid-air.
It maps directly to the prompt I used.

1) Subject reference

Use the uploaded image of me as the subject reference.

This anchors the entire generation.
You are not asking for a new person.

You are telling the model which pixels matter.

Without this, everything becomes negotiable.
With it, your face is treated as source material.

Not inspiration.

2) Identity lock

Preserve my facial features, proportions, age, skin texture, hairstyle, and expression exactly.

This closes the biggest failure mode.
Unwanted improvement.

No smoothing.
No upgrades.
No glow-up.

The model understands one thing clearly.
Deviation is failure.

3) Style exclusion

Do not stylise the face. Do not cartoonise. Do not anime.

This is defensive prompting.

Models take shortcuts.
Stylisation is the easiest one.

By banning it explicitly, realism stays intact.
Even under motion and dramatic lighting.

Skip this and plastic skin appears immediately.

4) Style directive

Style: Photorealistic, cinematic action photography. Real textures. Natural skin. Real fabric. Real motion. No illustration, no CGI look.

This defines how realism behaves.

Not vibes. Rules.

That is why the blade reflects correctly.
Why the banana tears naturally.
Why the skin holds under contrast.

You did not ask for realism.
You defined it.

5) Camera framing

Camera: Wide, full-body shot. Head to feet fully visible. Slight low angle to emphasise stance and authority. Subject-centred.

Camera logic locks the composition.

The low angle adds weight.
Without tipping into fantasy.

6) Pose definition

Pose: I am mid-swing, slicing cleanly through a banana with a sharp, realistic blade. The sword arc is visible across the frame, creating a strong diagonal line.

This describes a physical state.
Not a mood.

Mid-swing implies full-body engagement.
Force aligned with motion.

That is why the pose reads instantly.

7) Action moment

Action: One banana is split cleanly in half mid-air at torso height. Small, realistic juice droplets and motion fragments explode outward, Fruit Ninja energy without stylisation or exaggeration.

That specificity is why the banana reads as physics.
Not as a prop.

Miss this and action feels flat.

8) Wardrobe choice

Wardrobe: Modern, minimalist outfit suitable for movement. Dark fitted top or jacket, trousers, solid footwear. No robes. No fantasy armour. Practical and clean.

Wardrobe here is functional.

This keeps the image grounded in reality.

The outfit exists for one reason.
To support motion and silhouette.

9) Environment context

Environment: Minimal studio or dark neutral space with a subtle deep blue gradient. No scenery. No clutter. The action is the subject.

This removes noise.

The action becomes the subject.
Nothing competes with it.

10) Lighting control

Lighting: Dramatic but realistic studio lighting. Strong key light from one side. Soft fill. Controlled highlights on the blade. Natural shadows on the body.

Lighting is where realism usually breaks.

You define key direction.
You define fill behaviour.
You define highlight control.

That stops the blade glowing unnaturally.
It stops the face flattening under contrast.

That is why the metal reads as metal.

11) Composition rules

Composition: Vertical 1080×1350. Full body clearly visible. Strong negative space above for title text. Clear silhouette that reads at thumbnail size.

This makes it publishable.

Not just good-looking.

Silhouette clarity matters.
Negative space is intentional.
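One practical wrinkle: 1080×1350 is a 4:5 ratio, and an image tool will not necessarily return those exact pixels. The sketch below assumes the output arrives at some other size (1024×1536 is used purely as a hypothetical example) and works out the largest 4:5 crop you could then resize to 1080×1350. The function name `crop_box` and the `anchor` option are illustrative, not part of any tool.

```python
from fractions import Fraction


def crop_box(width, height, target_w=1080, target_h=1350, anchor="centre"):
    """Return (left, top, right, bottom) for the largest crop matching
    the target aspect ratio. Exact fractions avoid float rounding."""
    ratio = Fraction(target_w, target_h)  # 1080/1350 reduces to 4/5

    if Fraction(width, height) > ratio:
        # Image is too wide: keep the full height, trim the sides equally.
        new_w = int(height * ratio)
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)

    # Image is too tall (or already 4:5): keep the full width, trim height.
    new_h = int(width / ratio)
    # anchor="top" keeps everything above the subject (the title space);
    # any other value trims top and bottom equally.
    top = 0 if anchor == "top" else (height - new_h) // 2
    return (0, top, width, top + new_h)


# Hypothetical 1024x1536 output, centred crop and top-anchored crop.
print(crop_box(1024, 1536))
print(crop_box(1024, 1536, anchor="top"))
```

The tuple order matches what Pillow's `Image.crop` expects, if you use that library. Either anchor can cost you something: a top-anchored crop protects the title space but may clip the feet, so check the result against the "head to feet fully visible" rule before publishing.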

12) Mood signal

Mood: Precision. Control. No nonsense.

Mood comes last.

It resolves edge cases.
It does not drive the image.

This image is not good because GPT Image 1.5 is magical.

It is good because the prompt locks identity, defines physics, controls composition, and removes ambiguity.

The model did not understand my vision.
It did not need to.

It had no alternative interpretation left.

That is the standard now.
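If you want a quick way to check a draft against that standard, here is a minimal sketch. It assumes your prompts use the same labels as the structure above and flags the two vague phrases called out at the start of this piece. The names `audit_prompt`, `REQUIRED_LABELS`, and `VAGUE_PHRASES` are made up for illustration. This is a personal checklist, not something the model requires.

```python
import re

# Labels taken from the anatomy above; the three unlabelled opening
# lines (reference, identity lock, style exclusion) are not checked here.
REQUIRED_LABELS = [
    "Style", "Camera", "Pose", "Action", "Wardrobe",
    "Environment", "Lighting", "Composition", "Mood",
]

# Phrases that describe a vibe without enforcing any behaviour.
VAGUE_PHRASES = ["make it cinematic", "in the style of"]


def audit_prompt(prompt):
    """Report labelled sections that are missing and vague phrases present."""
    missing = [
        label for label in REQUIRED_LABELS
        # A section counts only if its label starts a line, e.g. "Camera:".
        if not re.search(rf"^{re.escape(label)}:", prompt, flags=re.MULTILINE)
    ]
    lowered = prompt.lower()
    vague = [phrase for phrase in VAGUE_PHRASES if phrase in lowered]
    return {"missing": missing, "vague": vague}


draft = (
    "Use the uploaded image of me as the subject reference.\n\n"
    "Style: Photorealistic.\n\n"
    "Camera: Wide, full-body shot.\n\n"
    "Make it cinematic."
)
print(audit_prompt(draft))
```

An empty `missing` list does not make a prompt good. It only tells you no block was forgotten.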

Here’s the full prompt structure for you to copy-paste:

Use the uploaded image of me as the subject reference.

Preserve my facial features, proportions, age, skin texture, hairstyle, and expression exactly.

Do not stylise the face. Do not cartoonise. Do not anime.

Style: Photorealistic, cinematic action photography. Real textures. Natural skin. Real fabric. Real motion. No illustration, no CGI look.

Camera: Wide, full-body shot. Head to feet fully visible. Slight low angle to emphasise stance and authority. Subject-centred.

Pose: I am mid-swing, slicing cleanly through a banana with a sharp, realistic blade. The sword arc is visible across the frame, creating a strong diagonal line.

Action: One banana is split cleanly in half mid-air at torso height. Small, realistic juice droplets and motion fragments explode outward, Fruit Ninja energy without stylisation or exaggeration.

Wardrobe: Modern, minimalist outfit suitable for movement. Dark fitted top or jacket, trousers, solid footwear. No robes. No fantasy armour. Practical and clean.

Environment: Minimal studio or dark neutral space with a subtle deep blue gradient. No scenery. No clutter. The action is the subject.

Lighting: Dramatic but realistic studio lighting. Strong key light from one side. Soft fill. Controlled highlights on the blade. Natural shadows on the body.

Composition: Vertical 1080×1350. Full body clearly visible. Strong negative space above for title text. Clear silhouette that reads at thumbnail size.

Mood: Precision. Control. No nonsense.
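If you reuse this structure often, it can help to keep the blocks as data rather than one long string, so you can swap a single constraint without retyping the rest. Below is a minimal sketch under that assumption. Only five of the twelve blocks are included to keep it short, and `BASE_SECTIONS` and `build_prompt` are illustrative names, not part of any tool.

```python
from typing import Optional

# (label, text) pairs in prompt order. A label of None means the line
# is written without a prefix, like the opening reference lines.
Section = tuple[Optional[str], str]

BASE_SECTIONS: list[Section] = [
    (None, "Use the uploaded image of me as the subject reference."),
    (None, "Preserve my facial features, proportions, age, skin texture, "
           "hairstyle, and expression exactly."),
    ("Camera", "Wide, full-body shot. Head to feet fully visible. Slight low "
               "angle to emphasise stance and authority. Subject-centred."),
    ("Environment", "Minimal studio or dark neutral space with a subtle deep "
                    "blue gradient. No scenery. No clutter. The action is "
                    "the subject."),
    ("Mood", "Precision. Control. No nonsense."),
]


def build_prompt(sections, overrides=None):
    """Join sections into one prompt, replacing any labelled block
    whose label appears in `overrides`."""
    overrides = overrides or {}
    known = {label for label, _ in sections if label}
    unknown = set(overrides) - known
    if unknown:
        # A misspelt label would otherwise be silently ignored.
        raise KeyError(f"No such section: {sorted(unknown)}")

    lines = []
    for label, text in sections:
        if label is None:
            lines.append(text)
        else:
            lines.append(f"{label}: {overrides.get(label, text)}")
    return "\n\n".join(lines)


# Same identity lock and camera, different environment.
print(build_prompt(
    BASE_SECTIONS,
    {"Environment": "Empty car park at night. No people. No signage."},
))
```

The order of the list is the order of the prompt, so mood stays last unless you move it on purpose.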

How I actually wrote this prompt

I’m not going to pretend I typed that perfectly on the first try.

I didn’t write this prompt by hand. I prompted the prompt.

I started with a simple goal. I needed a striking cover image for a carousel about GPT Image 1.5.

Then I pushed it to think bigger. Treat this like a cover image, not a filler visual.

I asked for variations. Zoomed in. Focus on my face. Bananas everywhere.

It was too busy.

So I simplified.

I told it to forget the lore and focus on one clear action. Me slicing through a banana. Fruit Ninja energy.

That’s when it clicked.

The image exists for one reason. The prompt left the model nowhere to hide.

You don’t need a perfect prompt to start. You need direction you can react to.

The model is better at exploring possibilities than you are. You’re better at judging what survives.

That division of labour is the whole game.

Have a vision. Let the LLM explore. Then lock it down with constraints.

That’s not cheating. That’s using the tool properly.

GPT Image 1.5 vs Nano Banana

Minus the fanboying.

I’ve tested GPT Image 1.5 and Nano Banana side by side. Same prompts. Same references. Same expectations.

GPT Image 1.5 is a massive jump from GPT Image 1. It’s the first time OpenAI’s image model feels genuinely usable for real work, especially when you’re iterating inside one chat.

But it still prioritises polish. It wants the image to look good, even if that means smoothing over parts of the brief.

Nano Banana does the opposite. It prioritises accuracy. It follows instructions more literally. Less flair. More discipline.

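That’s why leaderboards are misleading. A ranking tells you which image people preferred. It doesn’t tell you which model followed your brief.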

The leverage isn’t choosing sides.

It’s knowing which tool to reach for, and writing prompts that leave no room for interpretation.

That’s the work.

Stay curious, stay human, and keep creating.

— Charlie

About

Linked Agency

Linked Agency is the LinkedIn growth partner for brands and founders who want more than just likes. They want impact.
