Deep integration module for a Sauna Builder platform. Handles Sketch-to-Render pipelines and complex post-processing.
Sauna Builder → Render Engine → Multi-Mode Editor
This system acts as the 'Intelligence Layer' for a dedicated Sauna Builder. It bridges the gap between a rigid CAD sketch and a marketing-ready visualization. The core challenge is not just generating an image, but generating the correct image—preserving the dimensions, layout, and material choices defined in the builder.
Beyond one-shot generation, the system features a robust 'Pro Editor Module'—a 7-mode post-processing suite allowing users to continuously refine the image. From replacing textures (e.g., swapping alder for cedar via a reference image) to integrating new 3D objects, every action is guided by spatial inputs (arrows/pointers) and semantic intent.
The architecture prioritizes control over creativity. It doesn't just 'dream' a sauna; it operates directly on top of the builder's structural data, respecting the material choices and spatial constraints the user has already defined.
Builder inputs are messy ('dirty prompts'). The system uses a GPT-5 family model to analyze the sketch and material list, deciding what to keep, what to discard, and how to structure the technical prompt for the target editor model.
A unified state machine governs seven editing modes: General, Texture Swap, Background Replace, Style, 2D Insert, 3D Insert, and Multi-Pointer. Each mode routes to a specialized AI pipeline, treating the generated image as a mutable canvas.
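A minimal sketch of how such a mode machine could be wired up. The mode names come from the list above; the pipeline identifiers and the `needsSpatialInput` flags are illustrative assumptions, not the platform's actual configuration:

```typescript
// Sketch of the unified editor state machine. Pipeline ids are hypothetical.
type EditorMode =
  | "general"
  | "texture_swap"
  | "background_replace"
  | "style"
  | "insert_2d"
  | "insert_3d"
  | "multi_pointer";

interface ModeConfig {
  pipeline: string;           // downstream AI pipeline id (assumed)
  needsSpatialInput: boolean; // whether arrows/pointers are required (assumed)
}

const MODES: Record<EditorMode, ModeConfig> = {
  general:            { pipeline: "general-edit",   needsSpatialInput: false },
  texture_swap:       { pipeline: "texture-swap",   needsSpatialInput: true  },
  background_replace: { pipeline: "bg-replace",     needsSpatialInput: false },
  style:              { pipeline: "style-transfer", needsSpatialInput: false },
  insert_2d:          { pipeline: "inpaint-2d",     needsSpatialInput: true  },
  insert_3d:          { pipeline: "composite-3d",   needsSpatialInput: true  },
  multi_pointer:      { pipeline: "multi-edit",     needsSpatialInput: true  },
};

// The canvas is mutable: each edit consumes the previous output image.
function routeEdit(mode: EditorMode, hasPointers: boolean): string {
  const cfg = MODES[mode];
  if (cfg.needsSpatialInput && !hasPointers) {
    throw new Error(`Mode '${mode}' requires at least one spatial anchor`);
  }
  return cfg.pipeline;
}
```

Modeling modes as data rather than branches keeps adding an eighth mode a one-line change.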
To insert new 3D objects (lamps, heaters) into a 2D render, the system opens a client-side Three.js 'Photobooth'. It renders the 3D asset with the correct camera perspective to generate a precision mask and composite pass for the AI.
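The core of the 'Photobooth' step is projecting the 3D asset into the render's camera so the mask lands in the right place. The sketch below reproduces that projection math without Three.js itself: it pushes the asset's bounding-box corners through a simple pinhole camera (camera on the +z axis looking down −z) and takes the 2D bounds as the mask rectangle. The camera placement, field of view, and image size are assumptions for illustration:

```typescript
// Project 3D bounding-box corners to screen space to derive the mask rect.
type Vec3 = [number, number, number];

function projectPoint(
  p: Vec3, fovDeg: number, width: number, height: number, camZ: number
): [number, number] {
  // Focal length in pixels from the vertical field of view.
  const f = (height / 2) / Math.tan((fovDeg * Math.PI) / 360);
  const z = camZ - p[2]; // depth of the point in front of the camera
  const x = width / 2 + (p[0] * f) / z;
  const y = height / 2 - (p[1] * f) / z; // screen y grows downward
  return [x, y];
}

// Mask rectangle = screen-space bounds of the 8 projected corners.
function maskRect(corners: Vec3[], fov: number, w: number, h: number, camZ: number) {
  const pts = corners.map((c) => projectPoint(c, fov, w, h, camZ));
  const xs = pts.map((p) => p[0]);
  const ys = pts.map((p) => p[1]);
  return {
    x: Math.min(...xs),
    y: Math.min(...ys),
    w: Math.max(...xs) - Math.min(...xs),
    h: Math.max(...ys) - Math.min(...ys),
  };
}
```

In the real module, Three.js would render the full composite pass; this only shows where the precision mask comes from.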
The backend abstracts the model choice, selecting the best engine for the specific type of edit requested. Style transfers, inpainting, and object insertions are routed seamlessly to different optimized pipelines (Flux, Qwen, etc.).
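One common way to implement such abstraction is a preference table with fallbacks. The mapping below is an assumption for illustration, not the platform's actual routing logic; only the engine families (Flux, Qwen) come from the text:

```typescript
// Illustrative engine-selection table; entries are ordered by preference.
type EditKind = "style_transfer" | "inpaint" | "object_insert" | "texture";

const ENGINE_TABLE: Record<EditKind, string[]> = {
  style_transfer: ["flux-style", "qwen-image-edit"],
  inpaint:        ["flux-fill", "qwen-image-edit"],
  object_insert:  ["flux-fill"],
  texture:        ["qwen-image-edit", "flux-fill"],
};

// Pick the first preferred engine that is currently available.
function pickEngine(kind: EditKind, available: Set<string>): string {
  for (const engine of ENGINE_TABLE[kind]) {
    if (available.has(engine)) return engine;
  }
  throw new Error(`No engine available for edit kind '${kind}'`);
}
```

Keeping the table on the backend means new engines can be swapped in without touching the editor client.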
The pipeline ingests raw user text + builder JSON and normalizes it into a strict technical prompt, stripping marketing fluff and enforcing material terminology.
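A hedged sketch of that normalization step: builder JSON plus raw user text in, strict technical prompt out. The field names, the fluff word list, and the output template are all assumptions for illustration:

```typescript
// Normalize 'dirty' user text + builder spec into a strict technical prompt.
interface BuilderSpec {
  widthCm: number;
  depthCm: number;
  benchWood: string; // enforced material terminology, e.g. "alder", "cedar"
  heater: string;
}

// Marketing fluff to strip (illustrative list).
const FLUFF = /\b(stunning|luxurious|amazing|beautiful|dreamy)\b/gi;

function buildTechnicalPrompt(userText: string, spec: BuilderSpec): string {
  const cleaned = userText.replace(FLUFF, "").replace(/\s+/g, " ").trim();
  // Dimensions and materials come from the builder, never from the user text.
  return [
    `Interior sauna render, ${spec.widthCm}x${spec.depthCm} cm footprint.`,
    `Benches: ${spec.benchWood} wood. Heater: ${spec.heater}.`,
    cleaned ? `User intent: ${cleaned}.` : "",
  ].filter(Boolean).join(" ");
}
```

The key design point is the trust boundary: the builder spec is authoritative, the free text is advisory.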
Users place spatial anchors (arrows) on the image. These coordinates are mapped to the model's attention mechanism (or mask generation), allowing for precise 'Change THIS to THAT' instructions.
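One simple way to turn an anchor coordinate into model input is a soft attention mask around the arrow tip. The radius and linear falloff below are illustrative assumptions; the real mapping to the model's attention mechanism may differ:

```typescript
// Build a soft circular mask around a spatial anchor (arrow tip).
// 1.0 = fully editable, 0.0 = frozen; values fall off linearly to the radius.
function pointerMask(
  x: number, y: number, radius: number, w: number, h: number
): Float32Array {
  const mask = new Float32Array(w * h);
  for (let j = 0; j < h; j++) {
    for (let i = 0; i < w; i++) {
      const d = Math.hypot(i - x, j - y);
      mask[j * w + i] = Math.max(0, 1 - d / radius);
    }
  }
  return mask;
}
```

Pairing this mask with the anchor's text label yields the 'Change THIS to THAT' instruction: THIS is the masked region, THAT is the semantic intent.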
When a user edits a specific area (e.g., a window), the system isolates the context to prevent 'bleed' or unwanted changes to surrounding geometry.
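A common way to implement that isolation is to expand the edit region by a small margin, clamp it to the image, and feed only that crop to the editor model before pasting the result back. The margin value here is an assumption:

```typescript
// Expand an edit region by a margin and clamp to image bounds, so the model
// sees local context but cannot 'bleed' changes into the rest of the render.
interface Rect { x: number; y: number; w: number; h: number }

function isolationCrop(edit: Rect, imgW: number, imgH: number, margin = 32): Rect {
  const x0 = Math.max(0, edit.x - margin);
  const y0 = Math.max(0, edit.y - margin);
  const x1 = Math.min(imgW, edit.x + edit.w + margin);
  const y1 = Math.min(imgH, edit.y + edit.h + margin);
  return { x: x0, y: y0, w: x1 - x0, h: y1 - y0 };
}
```

The margin trades off context (the model needs some surroundings to blend the edit) against containment (everything outside the crop is untouchable by construction).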
Centralized management for uploading 2D objects, textures, and 3D models (GLB/GLTF). These assets become immediately available in the Pro Editor for users to drag-and-drop.
Automated checks for texture tiling and 3D model scale to ensure assets integrate correctly into the render pipeline without manual adjustment.
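Two such checks can be sketched cheaply. A texture tiles horizontally if its left and right edge columns are nearly identical, and a 3D asset's scale is plausible if its bounding box falls within real-world sauna-object sizes. The thresholds and bounds below are illustrative assumptions (grayscale pixels for simplicity):

```typescript
// Tileability check: mean difference between left and right pixel columns.
function tilesHorizontally(pixels: number[][], maxMeanDiff = 4): boolean {
  const h = pixels.length;
  const w = pixels[0].length;
  let diff = 0;
  for (let row = 0; row < h; row++) {
    diff += Math.abs(pixels[row][0] - pixels[row][w - 1]);
  }
  return diff / h <= maxMeanDiff;
}

// Scale sanity check: reject 3D assets whose bounding box (in meters) is
// implausible for a sauna object like a lamp or heater (bounds assumed).
function scalePlausible(sizeMeters: [number, number, number]): boolean {
  return sizeMeters.every((s) => s >= 0.01 && s <= 5);
}
```

Running these at upload time surfaces bad assets immediately, instead of letting them fail silently inside the render pipeline.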