Need help getting started with image generation

I'm totally new to all this AI stuff, so I'll get straight to the point.
I want to generate images locally, offline, on my personal desktop; I've got an AMD GPU.
I tried out Stable Diffusion on a website and was stunned by how good the results were.
I soon realized that to generate more detailed images I might want to use an LLM to enhance prompts.
So these are the two things I want to get running locally: a model for image generation (something like Stable Diffusion, correct me if I'm wrong) and a large language model to enhance image prompts. I haven't tried out any of the many AI models so far, since I avoided all the hype.
In short: if you can recommend which models best suit my purpose, that would be helpful.
Also, I prefer running only open-source AI.
I can't code in any programming language, so a simple setup, or at least a step-by-step guide, would be very welcome. I tried getting Stable Diffusion running but failed; PowerShell on Windows 10 kept throwing errors that I tried to solve but couldn't.


With open-source generative AI models there are still some limitations on AMD GPUs. While things have improved significantly in Linux and Windows 11 + WSL2 environments, options remain quite limited on Windows 10.


What you’re setting up (two separate local apps)

  • Image generation: Stable Diffusion 1.5 “weights” + a GUI that runs locally (you open it in your browser at 127.0.0.1).
  • Prompt enhancement: a small local text model that turns “an idea” into POSITIVE / NEGATIVE / SETTINGS you copy/paste into the image GUI.

Keeping them separate is the simplest “offline + no-coding” workflow.


The most realistic Windows 10 + AMD path (no WSL2)

Best first-success route

SD.Next + ONNX Runtime + DirectML (DmlExecutionProvider)
SD.Next explicitly supports ONNX Runtime and notes you can select DmlExecutionProvider by installing onnxruntime-directml, and that DirectX 12 is required. (GitHub)

Alternatives (only if you want them later)

  • AUTOMATIC1111 + Microsoft DirectML extension: uses ONNX Runtime + DirectML, but requires models optimized via Olive (more moving parts). (GitHub)
    AMD’s own guide for that extension calls it “preview” and (in that guide) states only SD 1.5 is supported. (AMD)
  • A1111 main repo on Windows+AMD: not officially supported; their wiki points to DirectML-focused forks/approaches instead. (GitHub)
  • SD.Next + ZLUDA: can be a speed/compatibility upgrade on some AMD cards, but it’s an “after you already work” option. SD.Next documents launching it with --use-zluda and notes HIP SDK version constraints. (GitHub)

Step-by-step: SD 1.5 image generation with SD.Next (Windows 10 + AMD)

0) Put it in an easy folder

Use something like:

  • C:\AI\sdnext\

Avoid OneDrive/Desktop/Program Files. (This prevents many permissions/path problems.)
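If you want to sanity-check a folder before installing, a tiny script like this (a hypothetical helper, not part of SD.Next) flags the locations that commonly cause trouble:

```python
# Folder names that commonly cause permission/sync/path problems for local AI tools.
RISKY_PARTS = ("onedrive", "program files", "desktop")

def install_path_warnings(path: str) -> list[str]:
    """Return a list of warnings for an install path; empty means it looks safe."""
    lowered = path.lower()
    warnings = [f"path contains '{p}'" for p in RISKY_PARTS if p in lowered]
    if " " in path:
        warnings.append("path contains spaces")
    return warnings

print(install_path_warnings(r"C:\AI\sdnext"))                 # []
print(install_path_warnings(r"C:\Users\me\OneDrive\sdnext"))  # ["path contains 'onedrive'"]
```

An empty list is what you want; anything else suggests picking a simpler folder like C:\AI\.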

1) Install the basics (one-time)

  • Latest AMD GPU driver + reboot
  • Git for Windows
  • Python (many SD Windows setups are happiest on Python 3.10.x)

2) Install + start SD.Next (use cmd.exe, not PowerShell)

Open Command Prompt and run:

cd C:\AI
git clone https://github.com/vladmandic/sdnext.git
cd sdnext
webui.bat --debug

SD.Next documents launching on Windows with webui.bat --debug. (GitHub)

When it finishes starting, it prints a local URL (often http://127.0.0.1:7860). Open that in your browser.

3) Add an SD 1.5 model file (the “weights”)

A common starter SD 1.5 checkpoint is:

  • v1-5-pruned-emaonly.safetensors (license shown as creativeml-openrail-m) (Hugging Face)

Place the .safetensors file into SD.Next’s model folder (SD.Next “Getting Started” covers the basic “generate with a few clicks” workflow and model handling). (GitHub)

4) Turn on AMD GPU acceleration (ONNX Runtime + DirectML)

In SD.Next, switch to the ONNX Runtime pipeline and choose DmlExecutionProvider (DirectML). SD.Next notes:

  • DML EP becomes available by installing onnxruntime-directml
  • DirectX 12 is required (GitHub)

Why this matters: ONNX Runtime’s DirectML EP has specific constraints (for example, it does not support memory-pattern optimizations or parallel execution in ORT sessions). (ONNX Runtime)

5) First “known-stable” test settings (prove it works)

Start conservative:

  • 512×512
  • Steps: 20
  • CFG: ~7
  • Batch size: 1

Test prompts:

  • Positive: portrait photo, soft studio lighting, sharp focus
  • Negative: lowres, blurry, watermark, text, bad anatomy, extra fingers

Once you can generate one image reliably, then raise resolution/complexity.
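For reference, these settings map onto the JSON payload used by the AUTOMATIC1111-style txt2img API that SD.Next can also expose (you don't need this for the GUI workflow; the sketch below just shows how the knobs fit together, and no request is actually sent):

```python
import json

def txt2img_payload(positive: str, negative: str,
                    width: int = 512, height: int = 512,
                    steps: int = 20, cfg: float = 7.0) -> str:
    """Build a JSON body for an A1111-compatible /sdapi/v1/txt2img endpoint."""
    return json.dumps({
        "prompt": positive,
        "negative_prompt": negative,
        "width": width,
        "height": height,
        "steps": steps,
        "cfg_scale": cfg,
        "batch_size": 1,   # keep at 1 until generation is stable
    })

body = txt2img_payload(
    "portrait photo, soft studio lighting, sharp focus",
    "lowres, blurry, watermark, text, bad anatomy, extra fingers",
)
print(body)
```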


Quick troubleshooting (the fastest fixes)

A) Start in “safe mode” to remove extension problems

webui.bat --debug --safe

--safe disables user extensions and is recommended for troubleshooting. (GitHub)

B) UI acts broken / buttons don’t work

SD.Next recommends deleting ui-config.json if it’s bloated (old settings can override new defaults and break the UI). (GitHub)

C) DirectML crashes / weird ORT errors

DirectML EP requires certain ORT options (mem-pattern + parallel execution) to be disabled; enabling them can cause errors. (ONNX Runtime)
If you see errors like 80070057, they’re commonly associated with those constraints; ONNX Runtime has issue reports in this area. (GitHub)
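For the curious, here is roughly what those constraints look like in raw ONNX Runtime Python code. SD.Next does the equivalent for you when you select DmlExecutionProvider, so this is illustrative only (the function and model path are hypothetical):

```python
def make_dml_session(model_path: str):
    """Create an ONNX Runtime session for the DirectML execution provider.

    The DirectML EP does not support ORT's memory-pattern optimization or
    parallel execution, so both must be disabled; leaving them on is a
    common source of session errors on DML.
    """
    import onnxruntime as ort  # provided by the onnxruntime-directml package

    opts = ort.SessionOptions()
    opts.enable_mem_pattern = False                         # required off for DML
    opts.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL  # no parallel execution
    return ort.InferenceSession(model_path, sess_options=opts,
                                providers=["DmlExecutionProvider"])
```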


Prompt enhancement (offline, GUI-first)

Pick one “local chat” app

Option 1: Jan (desktop GUI, open source, offline)

Jan is presented as an open-source ChatGPT-like app for running models locally. (GitHub)

Option 2: KoboldCpp (single EXE + browser UI; good AMD hint)

KoboldCpp releases explicitly recommend the Vulkan option in the nocuda build for AMD. (GitHub)

Option 3: Ollama (simple installer)

Ollama's Windows docs state it does not require Administrator and installs in your home directory by default. (Ollama docs)

Good beginner prompt-enhancer models (small + practical)

Specialized prompt optimizers (often best for SD prompting):

  • TIPO-200M (prompt optimization for text-to-image workflows). (Hugging Face)
  • DART v2 (generates Danbooru-style tags; useful if you like tag prompts). (Hugging Face)

General small instruct model (good at structured output):

  • SmolLM2-1.7B-Instruct (compact “run on-device” class model). (Hugging Face)

Copy/paste template for your prompt enhancer

Use this once as your “system prompt” (or first message).

You write prompts for Stable Diffusion 1.5.

Return exactly these sections:

POSITIVE:
NEGATIVE:
SETTINGS:
VARIATIONS:

Rules:
- POSITIVE: 1–2 lines. Include subject, environment, lighting, camera/framing, style/medium.
- NEGATIVE: comma-separated. Include common artifacts: lowres, blurry, watermark, text, deformed hands, extra fingers.
- SETTINGS: suggest resolution (start 512x512), steps (20–30), CFG (6–8).
- VARIATIONS: 5 short alternate POSITIVE prompts that keep the same idea but change lighting/camera/mood.

User idea: <paste your idea here>
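If you ever script this instead of pasting by hand, a small helper (hypothetical; any local chat app that accepts a system prompt can take the resulting string) can splice your idea into the template:

```python
TEMPLATE = """You write prompts for Stable Diffusion 1.5.

Return exactly these sections:

POSITIVE:
NEGATIVE:
SETTINGS:
VARIATIONS:

Rules:
- POSITIVE: 1-2 lines. Include subject, environment, lighting, camera/framing, style/medium.
- NEGATIVE: comma-separated. Include common artifacts: lowres, blurry, watermark, text, deformed hands, extra fingers.
- SETTINGS: suggest resolution (start 512x512), steps (20-30), CFG (6-8).
- VARIATIONS: 5 short alternate POSITIVE prompts that keep the same idea but change lighting/camera/mood.

User idea: {idea}"""

def enhancer_prompt(idea: str) -> str:
    """Fill the system-prompt template with a one-line user idea."""
    return TEMPLATE.format(idea=idea.strip())

print(enhancer_prompt("a lighthouse in a storm"))
```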

Workflow:

  1) Write your idea → 2) copy POSITIVE/NEGATIVE/SETTINGS → 3) paste into SD.Next → 4) generate.
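And if you automate the copy step too, a parser like this sketch (it assumes the model reproduced the section headers exactly as instructed) splits the reply into its parts:

```python
import re

SECTIONS = ("POSITIVE", "NEGATIVE", "SETTINGS", "VARIATIONS")

def parse_enhancer_reply(text: str) -> dict:
    """Split a reply into its POSITIVE/NEGATIVE/SETTINGS/VARIATIONS sections."""
    pattern = r"^(%s):\s*" % "|".join(SECTIONS)
    parts = re.split(pattern, text, flags=re.MULTILINE)
    # re.split yields [preamble, name, body, name, body, ...]; pair them up.
    it = iter(parts[1:])
    return {name: body.strip() for name, body in zip(it, it)}

reply = """POSITIVE: portrait photo, soft studio lighting
NEGATIVE: lowres, blurry, watermark
SETTINGS: 512x512, steps 20, CFG 7
VARIATIONS: 1) golden hour 2) candlelight"""
print(parse_enhancer_reply(reply)["NEGATIVE"])  # lowres, blurry, watermark
```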

Thanks for the fast reply!
Is there no way to install WSL 2 on Windows 10?
I also have CachyOS Linux, if you recommend that over Windows, but I am very new to it, so to be honest, without detailed instructions I probably won't be able to handle things.
Maybe I should also have stated that my hardware is relatively capable: 16 GB VRAM, 32 GB RAM.
I know this isn't enough for some AI models, but it could be worse.
I would like to use "high-end" AI models rather than beginner models that produce poor or limited output.