SVG-Master: Fine-Tuned Model for SVG Code Generation

A 3B-parameter fine-tuned model that generates valid SVG code from natural language descriptions, built with LoRA on the Apple MLX framework over three training iterations.

SVG-Master converts natural language descriptions into valid SVG code. Teaching language models to generate syntactically correct vector graphics is challenging: SVG demands simultaneous precision in XML syntax, coordinate mathematics, and visual composition. Three iterative training cycles brought the model to stable, render-ready output.

πŸ” Problem

SVG generation sits at the intersection of structured code generation and visual reasoning. General-purpose models struggle because:

Visual Creativity β€” Aesthetic judgment and design principles must be encoded implicitly in training data

Mathematical Precision β€” Coordinates, paths, and geometry require exact numeric output with no tolerance for approximation

Syntax Strictness β€” One invalid attribute renders the entire graphic blank or broken with no partial rendering

Contextual Understanding β€” The same description requires different outputs at different scales and viewBox settings

A model that produces plausible-looking but syntactically invalid SVG is functionally useless. Output must render immediately without post-processing correction.
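The all-or-nothing nature of SVG parsing can be illustrated with Python's standard-library XML parser (an illustrative sketch, not part of the model's pipeline): a single malformed tag invalidates the entire document.

```python
# Illustration of strict XML parsing: one malformed tag (a missing "/>")
# makes the whole SVG unparseable, so nothing renders.
import xml.etree.ElementTree as ET

valid = '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100"><circle cx="50" cy="50" r="40"/></svg>'
broken = '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100"><circle cx="50" cy="50" r="40"</svg>'

ET.fromstring(valid)  # parses cleanly

try:
    ET.fromstring(broken)
except ET.ParseError as err:
    print(f"broken SVG rejected: {err}")
```

Browsers are similarly strict with standalone `.svg` files, which is why "plausible-looking" output is not enough.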

πŸ› οΈ Architecture & Training

Base Model: Llama 3.2 3B Instruct

Framework: Apple MLX on Apple Silicon

Technique: LoRA (Low-Rank Adaptation)

Training Data: Curated SVG-description pairs with three refinement cycles

Output Format: Valid XML with proper SVG namespace declarations, viewBox, and optimized syntax

Llama 3.2 3B was selected for instruction-following capability. The model had no visual pretraining, but three training cycles built SVG competency:

Cycle 1 β€” Generated syntactically invalid SVG. Path data contained incorrect coordinate formats. Output showed no visual composition awareness.

Cycle 2 β€” Extensive data cleaning and manual curation. Syntax validity improved substantially. Pattern logic remained inconsistent for multi-element scenes.

Cycle 3 β€” Manual validation layers added to training pipeline. Edge cases introduced. Complex compositions improved. Output reached stable syntactic validity.
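The validation layers used in Cycle 3 are not published; the sketch below shows the kind of check such a layer could apply to filter training samples (function name and criteria are hypothetical).

```python
# Hypothetical sketch of a training-data validation check, NOT the
# actual Cycle 3 pipeline: accept a sample only if it parses as XML,
# has an svg root in the SVG namespace, and declares a viewBox.
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def is_valid_svg(text: str) -> bool:
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    if root.tag != f"{{{SVG_NS}}}svg":
        return False
    return "viewBox" in root.attrib

sample = '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><rect width="24" height="24"/></svg>'
print(is_valid_svg(sample))
```

Filtering on checks like these keeps systematically invalid outputs out of the next cycle's training set.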

πŸ“Š Capabilities

The model generates complete, ready-to-render SVG code with proper structure:

Valid XML with correct SVG namespace declarations

Responsive Design with explicit viewBox and width/height attributes

Gradient Definitions in <defs> blocks where applicable

Multi-element Compositions with layered shapes and proper element ordering

Example input: β€œA minimalist sunset over a calm ocean with orange and purple gradients”

Example output: Complete SVG with layered rectangles, gradients, and proper viewBox scaling.
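An output in this style might look like the hand-written SVG below (illustrative only, not actual model output for this prompt); the parse call confirms the example is well-formed.

```python
# Hand-written illustration of the described output shape: layered
# rectangles, a gradient in <defs>, and an explicit viewBox.
# NOT actual model output.
import xml.etree.ElementTree as ET

sunset = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 120" width="200" height="120">
  <defs>
    <linearGradient id="sky" x1="0" y1="0" x2="0" y2="1">
      <stop offset="0%" stop-color="#6b2d8b"/>
      <stop offset="100%" stop-color="#f28c38"/>
    </linearGradient>
  </defs>
  <rect width="200" height="80" fill="url(#sky)"/>
  <circle cx="100" cy="78" r="18" fill="#ffb347"/>
  <rect y="80" width="200" height="40" fill="#2a3d66"/>
</svg>"""

ET.fromstring(sunset)  # raises ParseError if the markup were invalid
```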

πŸ›‘οΈ Limitations

  • Training data limited to common design patterns β€” specialized visual styles will underperform
  • Static SVG only β€” complex animations require manual adjustment
  • Text rendering occasionally needs font and positioning corrections
  • Highly detailed illustrations may require post-processing
  • Not a replacement for human design work in production contexts

See the official Hugging Face documentation for complete technical details.

πŸš€ Quick Start

Hugging Face

Access the model and documentation

Ollama

```shell
ollama pull fahidnasir/svg-master && ollama run fahidnasir/svg-master "Generate a blue glowing circuit board icon"
```

Python

```python
from mlx_lm import load, generate

model, tokenizer = load("fahidnasir/SVG-Master")
# generate() returns the completion text; print it to see the SVG
print(generate(model, tokenizer, prompt="Generate a blue glowing circuit board icon", max_tokens=512))
```

πŸ’‘ Key Takeaways

  1. Data quality is the primary determinant of output validity β€” clean training examples matter more than dataset size for structured code generation.
  2. Three training cycles were necessary, not exceptional β€” systematic output failures at each stage revealed specific gaps requiring targeted data additions.
  3. Base model choice shapes the ceiling β€” code-unspecialized models require more fine-tuning data to reach comparable syntax reliability.
  4. Visual tasks require both structural and semantic alignment β€” prompt descriptions must accurately match their paired SVG, or the model learns inconsistent mappings.
  5. Partial success is not success for SVG β€” 90% valid output is not deployable if the remaining 10% renders as blank or broken graphics.