
AImaging

Overview

AImaging is an open-source project designed to generate images from text prompts using artificial intelligence, entirely on your local machine. Built with Rust and leveraging the Stable Diffusion model, it provides a privacy-focused, flexible, and efficient solution for AI-powered image creation. The project is suitable for both end users and developers who want to integrate AI image generation into their workflows without relying on external services or cloud infrastructure.

Example

(See the sample .webp images in the repository root for example output.)

Description

The main Rust program is an image generator that uses a local Stable Diffusion model to create AI-generated images from a text prompt. It loads model weights and tokenizer files from a configurable local directory, processes the prompt with a BPE tokenizer, and generates an image by running the diffusion process on the GPU (if available) or on the CPU. The resulting image is decoded, post-processed (resized to 1024x768 with Lanczos3 filtering and sharpened), and saved in WebP format under a unique filename derived from the prompt and a timestamp. The program supports configuration via environment variables and is designed to run fully offline.
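The unique-filename step can be sketched as follows. This is an illustrative reconstruction using only the Rust standard library; the hashing scheme and filename layout are assumptions, and the actual code in rust/ may differ:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::{SystemTime, UNIX_EPOCH};

/// Build a filesystem-safe WebP filename from the prompt and the current time.
fn output_filename(prompt: &str) -> String {
    // Hash the prompt so arbitrary text (spaces, punctuation) stays out of the path.
    let mut hasher = DefaultHasher::new();
    prompt.hash(&mut hasher);
    let digest = hasher.finish();

    // A Unix timestamp keeps repeated runs of the same prompt from colliding.
    let ts = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before 1970")
        .as_secs();

    format!("{digest:016x}-{ts}.webp")
}

fn main() {
    println!("{}", output_filename("a red fox in the snow"));
}
```

Deriving the name from a hash rather than the raw prompt avoids path issues with long or punctuation-heavy prompts while still making output files traceable to their inputs.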

AI Stack

Core AI components

  • Model architecture: Stable Diffusion v1.5 pipeline (CLIP text encoder, UNet denoiser, VAE decoder).
  • Tokenizer: BPE tokenizer from tokenizers (Hugging Face tokenizer format).
  • Runtime: candle-core and candle-transformers for tensor ops and model execution.
  • GPU acceleration: CUDA via Candle (enabled in Cargo.toml).
  • Model formats: safetensors files for UNet, VAE, and text encoder weights.

Practical limits and behavior

  • Prompt length: limited by CLIP max position embeddings (typically 77 tokens for SD 1.5).
  • Resolution: generated at the configured height/width, then resized to 1024x768.
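The 77-token limit means overly long prompts are clamped before text encoding. A minimal sketch of that clamping logic, assuming CLIP's usual special-token IDs (49406 for start-of-text, 49407 for end-of-text); in practice the tokenizers library handles this internally:

```rust
/// CLIP start/end-of-text token IDs used by the SD 1.5 tokenizer.
const BOS: u32 = 49406;
const EOS: u32 = 49407;
/// CLIP's maximum position embeddings for SD 1.5.
const MAX_POSITION_EMBEDDINGS: usize = 77;

/// Clamp an encoded prompt to CLIP's context window, reserving one slot
/// for BOS at the front and one for EOS at the end, then padding with EOS
/// so every prompt yields exactly 77 positions.
fn clamp_to_context(mut ids: Vec<u32>) -> Vec<u32> {
    ids.truncate(MAX_POSITION_EMBEDDINGS - 2);
    let mut out = Vec::with_capacity(MAX_POSITION_EMBEDDINGS);
    out.push(BOS);
    out.extend(ids);
    out.push(EOS);
    out.resize(MAX_POSITION_EMBEDDINGS, EOS);
    out
}

fn main() {
    // A 200-token "prompt" is silently clamped: only the first 75 content
    // tokens survive, so anything past that point cannot affect the image.
    let long = vec![100u32; 200];
    assert_eq!(clamp_to_context(long).len(), MAX_POSITION_EMBEDDINGS);
    println!("ok");
}
```

The practical consequence is that trailing detail in very long prompts is dropped; important terms should appear early.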

Licensing notes

  • Model weights: Not distributed with this repository. You must download them yourself and comply with each model's license.
  • Stable Diffusion 1.5: Typically released under the CreativeML OpenRAIL-M license. Verify the exact license for the model you download.
  • Dreamlike Photoreal 2.0: Uses a Modified CreativeML Open RAIL-M license with additional restrictions (see highlights below).

Dreamlike Photoreal 2.0 license highlights (non-exhaustive)

  • Hosting/inference restriction: You may not host, finetune, or do inference with the model or derivatives on websites/apps without permission from the licensor.
  • Attribution requirements: If you host the model card/files (without inference), you must state the full model name, include the license, and link to the model card.
  • Commercial use of outputs: Allowed for teams of 10 or fewer people; additional restrictions apply in the event of breach or prohibited hosting.
  • Use-based restrictions: The license prohibits specific uses (see Attachment A), including generating NFTs and other restricted categories.
  • License changes: The licensor reserves the right to change terms or prohibit use at any time.

Full license text: https://huggingface.co/dreamlike-art/dreamlike-photoreal-2.0

Setup

  1. Download the Dreamlike Photoreal 2.0 model (a Stable Diffusion 1.5 derivative) using the Hugging Face CLI:

    pip install huggingface_hub
    huggingface-cli download dreamlike-art/dreamlike-photoreal-2.0 --local-dir models/dreamlike-photoreal-2.0
    

    Alternatively, you can manually download the model from huggingface.co and place the files in the models/dreamlike-photoreal-2.0 directory at the root of the project.

  2. Run the program:

    cd rust
    export PATH=/usr/local/cuda-12.9/bin:$PATH
    cargo run -- "Your prompt here"
    

Python scripts

The python/ directory contains auxiliary scripts that support the main Rust application. For example, sd.py allows you to generate images from text prompts using the Stable Diffusion model via Python and the Hugging Face diffusers library. This script loads a local model, processes the prompt, and saves the generated image as a PNG file. It is useful for model conversion, testing, or as a reference implementation for image generation workflows.

Before running the python/sd.py script, install the required Python packages:

pip install torch diffusers

To generate an image using the Python script, run:

cd python
python sd.py "Your prompt here"