I'm totally new to all this AI stuff, so I'll get straight to the point.
I want to generate images locally (offline) on my personal desktop; I've got an AMD GPU.
I tried out Stable Diffusion on a website and was stunned by how good the results were.
I soon realized that to generate more detailed images I might want to use an LLM to enhance prompts.
So these are the two things I want to run locally: a model for image generation (something like Stable Diffusion, correct me if I'm wrong) and a large language model to enhance image prompts. I haven't tried any of the many AI models so far, since I avoided all the hype.
In short: if you can recommend which models suit my purpose best, that would be helpful.
Also, I prefer running only open-source AI.
I can't code in any programming language, so a simple setup, or at least a step-by-step guide to follow, would be very welcome. I tried getting Stable Diffusion running but failed; PowerShell on Windows 10 kept throwing errors that I tried to solve but couldn't.
When running open-source generative AI models on AMD GPUs, there are still some limitations. Things have improved significantly on Linux and in Windows 11 + WSL2 environments, but options remain quite limited on Windows 10…
What you’re setting up (two separate local apps)
- Image generation: Stable Diffusion 1.5 “weights” + a GUI that runs locally (you open it in your browser at 127.0.0.1).
- Prompt enhancement: a small local text model that turns “an idea” into POSITIVE / NEGATIVE / SETTINGS blocks you copy/paste into the image GUI.
Keeping them separate is the simplest “offline + no-coding” workflow.
The most realistic Windows 10 + AMD path (no WSL2)
Best first-success route
SD.Next + ONNX Runtime + DirectML (DmlExecutionProvider)
SD.Next explicitly supports ONNX Runtime and notes you can select DmlExecutionProvider by installing onnxruntime-directml, and that DirectX 12 is required. (GitHub)
Alternatives (only if you want them later)
- AUTOMATIC1111 + Microsoft DirectML extension: uses ONNX Runtime + DirectML, but requires models optimized via Olive (more moving parts). (GitHub) AMD’s own guide for that extension calls it “preview” and (in that guide) states only SD 1.5 is supported. (AMD)
- A1111 main repo on Windows + AMD: not officially supported; their wiki points to DirectML-focused forks/approaches instead. (GitHub)
- SD.Next + ZLUDA: can be a speed/compatibility upgrade on some AMD cards, but it’s an “after you already work” option. SD.Next documents launching it with --use-zluda and notes HIP SDK version constraints. (GitHub)
Step-by-step: SD 1.5 image generation with SD.Next (Windows 10 + AMD)
0) Put it in an easy folder
Use something like:
C:\AI\sdnext\
Avoid OneDrive/Desktop/Program Files. (This prevents many permissions/path problems.)
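If you want to double-check a folder before installing, here is a tiny standard-library Python sketch that flags the usual trouble spots. The folder names it checks are heuristics of mine, not an official SD.Next rule.

```python
# Heuristic check for install paths that commonly cause permission/sync
# problems on Windows. Illustrative only - adjust the markers to taste.
BAD_MARKERS = ("onedrive", "program files", "desktop")

def is_safe_install_path(path: str) -> bool:
    """Return False if the path contains a folder known to cause trouble."""
    lowered = path.lower()
    if any(marker in lowered for marker in BAD_MARKERS):
        return False
    if " " in path:  # spaces still trip up some launcher scripts
        return False
    return True

print(is_safe_install_path(r"C:\AI\sdnext"))            # True
print(is_safe_install_path(r"C:\Users\me\OneDrive\sd")) # False
```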
1) Install the basics (one-time)
- Latest AMD GPU driver + reboot
- Git for Windows
- Python (many SD Windows setups are happiest on Python 3.10.x)
2) Install + start SD.Next (use cmd.exe, not PowerShell)
Open Command Prompt and run:

```shell
cd C:\AI
git clone https://github.com/vladmandic/sdnext.git
cd sdnext
webui.bat --debug
```
SD.Next documents launching on Windows with webui.bat --debug. (GitHub)
When it finishes starting, it prints a local URL (often http://127.0.0.1:7860). Open that in your browser.
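The first launch can take a while because it installs its own dependencies. If you want a programmatic "is it up yet?" check, a small standard-library Python helper like the one below polls the URL until the UI answers; the URL and port are assumptions on my part, so match them to whatever webui.bat actually prints.

```python
# Poll a local URL until it responds, or give up after `timeout` seconds.
# Standard library only; nothing here is specific to SD.Next.
import time
import urllib.error
import urllib.request

def wait_for_ui(url: str = "http://127.0.0.1:7860", timeout: float = 300.0) -> bool:
    """Return True once `url` responds, False if `timeout` seconds pass first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True  # server answered; the UI is up
        except (urllib.error.URLError, OSError):
            time.sleep(2)  # not up yet; retry shortly
    return False
```

For example, running `wait_for_ui()` from a second terminal returns True as soon as the page starts loading.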
3) Add an SD 1.5 model file (the “weights”)
A common starter SD 1.5 checkpoint is:
v1-5-pruned-emaonly.safetensors (license shown as creativeml-openrail-m) (Hugging Face)
Place the .safetensors file into SD.Next’s model folder (SD.Next “Getting Started” covers the basic “generate with a few clicks” workflow and model handling). (GitHub)
4) Turn on AMD GPU acceleration (ONNX Runtime + DirectML)
In SD.Next, switch to the ONNX Runtime pipeline and choose DmlExecutionProvider (DirectML). SD.Next notes:
- the DML EP becomes available by installing onnxruntime-directml
- DirectX 12 is required (GitHub)
Why this matters: ONNX Runtime’s DirectML EP has specific constraints (for example, it does not support memory-pattern optimizations or parallel execution in ORT sessions). (ONNX Runtime)
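For the curious, this is roughly what those constraints look like in code: a hedged Python sketch of ONNX Runtime session options tuned for the DirectML EP. It assumes the onnxruntime-directml wheel is installed; SD.Next does this wiring for you, so this is illustration only.

```python
# Sketch of session options for the DirectML execution provider, honouring
# its documented constraints: memory-pattern optimisation and parallel
# execution must be off. Assumes `pip install onnxruntime-directml`.
def directml_session_options():
    """Return (SessionOptions, providers) suitable for DmlExecutionProvider."""
    import onnxruntime as ort  # imported lazily; Windows + DirectX 12 only
    opts = ort.SessionOptions()
    opts.enable_mem_pattern = False                         # required off for DML
    opts.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL  # no parallel execution
    providers = ["DmlExecutionProvider", "CPUExecutionProvider"]
    return opts, providers
```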
5) First “known-stable” test settings (prove it works)
Start conservative:
- 512×512
- Steps: 20
- CFG: ~7
- Batch size: 1
Test prompts:
- Positive: portrait photo, soft studio lighting, sharp focus
- Negative: lowres, blurry, watermark, text, bad anatomy, extra fingers
Once you can generate one image reliably, then raise resolution/complexity.
Quick troubleshooting (the fastest fixes)
A) Start in “safe mode” to remove extension problems
```shell
webui.bat --debug --safe
```
--safe disables user extensions and is recommended for troubleshooting. (GitHub)
B) UI acts broken / buttons don’t work
SD.Next recommends deleting ui-config.json if it’s bloated (old settings can override new defaults and break the UI). (GitHub)
C) DirectML crashes / weird ORT errors
DirectML EP requires certain ORT options (mem-pattern + parallel execution) to be disabled; enabling them can cause errors. (ONNX Runtime)
If you see errors like 80070057, they’re commonly associated with those constraints; ONNX Runtime has issue reports in this area. (GitHub)
Prompt enhancement (offline, GUI-first)
Pick one “local chat” app
Option 1: Jan (desktop GUI, open source, offline)
Jan is presented as an open-source ChatGPT-like app for running models locally. (GitHub)
Option 2: KoboldCpp (single EXE + browser UI; a good AMD option)
KoboldCpp’s releases explicitly recommend the Vulkan option in the nocuda build for AMD. (GitHub)
Option 3: Ollama (simple installer)
Ollama’s Windows docs state it does not require Administrator and installs in your home directory by default. (Ollama Official Document)
Good beginner prompt-enhancer models (small + practical)
Specialized prompt optimizers (often best for SD prompting):
- TIPO-200M (prompt optimization for text-to-image workflows). (Hugging Face)
- DART v2 (generates Danbooru-style tags; useful if you like tag prompts). (Hugging Face)
General small instruct model (good at structured output):
- SmolLM2-1.7B-Instruct (compact “run on-device” class model). (Hugging Face)
Copy/paste template for your prompt enhancer
Use this once as your “system prompt” (or first message).
```
You write prompts for Stable Diffusion 1.5.
Return exactly these sections:
POSITIVE:
NEGATIVE:
SETTINGS:
VARIATIONS:
Rules:
- POSITIVE: 1–2 lines. Include subject, environment, lighting, camera/framing, style/medium.
- NEGATIVE: comma-separated. Include common artifacts: lowres, blurry, watermark, text, deformed hands, extra fingers.
- SETTINGS: suggest resolution (start 512x512), steps (20–30), CFG (6–8).
- VARIATIONS: 5 short alternate POSITIVE prompts that keep the same idea but change lighting/camera/mood.
User idea: <paste your idea here>
```
Workflow: 1) write your idea → 2) copy POSITIVE/NEGATIVE/SETTINGS → 3) paste into SD.Next → 4) generate.
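If you ever script this hand-off instead of copy/pasting, the section split is easy to automate. A standard-library Python sketch (section names follow the template above; the sample reply is invented):

```python
# Split an enhancer reply into its POSITIVE / NEGATIVE / SETTINGS /
# VARIATIONS sections so each can be pasted into the image GUI.
def split_sections(reply: str) -> dict:
    headers = ("POSITIVE:", "NEGATIVE:", "SETTINGS:", "VARIATIONS:")
    sections, current = {}, None
    for line in reply.splitlines():
        stripped = line.strip()
        matched = next((h for h in headers if stripped.startswith(h)), None)
        if matched:
            current = matched[:-1]  # header name without the colon
            sections[current] = [stripped[len(matched):].strip()]
        elif current:
            sections[current].append(stripped)
        # lines before the first header are ignored
    return {k: "\n".join(v).strip() for k, v in sections.items()}

sample = """POSITIVE: portrait photo, soft studio lighting
NEGATIVE: lowres, blurry, watermark
SETTINGS: 512x512, steps 20, CFG 7
VARIATIONS:
- golden hour lighting
- overhead shot"""

parts = split_sections(sample)
print(parts["POSITIVE"])  # portrait photo, soft studio lighting
```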
Thanks for the fast reply!
Is there no way to install WSL 2 on Windows 10?
I also have CachyOS Linux, if you recommend that over Windows, but I'm very new to it; to be honest, without detailed instructions I probably won't be able to handle things.
Maybe I should have also mentioned that my hardware is relatively capable: 16 GB VRAM, 32 GB RAM.
I know this isn't enough for some AI models, but it could be worse.
I'd like to use “high-end” AI models rather than beginner models that produce poor or limited output.