# SVG-Master: Fine-Tuned Model for SVG Code Generation
A 3B-parameter fine-tuned model that generates valid SVG code from natural language descriptions, built with LoRA on the Apple MLX framework over three training iterations.
SVG-Master converts natural language descriptions into valid SVG code. Teaching language models to generate syntactically correct vector graphics is challenging: SVG requires simultaneous precision in XML syntax, coordinate mathematics, and visual composition. This model delivers production-ready SVG through three iterative training cycles.
## 🎯 Problem
SVG generation sits at the intersection of structured code generation and visual reasoning. General-purpose models struggle because:
- **Visual Creativity**: Aesthetic judgment and design principles must be encoded implicitly in training data
- **Mathematical Precision**: Coordinates, paths, and geometry require exact numeric output with no tolerance for approximation
- **Syntax Strictness**: One invalid attribute renders the entire graphic blank or broken, with no partial rendering
- **Contextual Understanding**: The same description requires different outputs at different scales and viewBox settings
A model that produces plausible-looking but syntactically invalid SVG is functionally useless. Output must render immediately without post-processing correction.
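The "renders or it doesn't" property is easy to check mechanically. A minimal validity gate can be sketched with Python's standard XML parser (this helper and its sample strings are illustrative, not part of the SVG-Master pipeline):

```python
import xml.etree.ElementTree as ET

def is_valid_svg(code: str) -> bool:
    """Return True only if the string parses as XML with an <svg> root."""
    try:
        root = ET.fromstring(code)
    except ET.ParseError:
        return False
    # The parsed tag carries the namespace, e.g. '{http://www.w3.org/2000/svg}svg'
    return root.tag.endswith("svg")

good = '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10"><circle cx="5" cy="5" r="4"/></svg>'
bad = '<svg viewBox="0 0 10 10"><circle cx="5" cy="5" r=4/></svg>'  # unquoted attribute: invalid XML

print(is_valid_svg(good))  # True
print(is_valid_svg(bad))   # False
```

A well-formedness check like this catches hard syntax failures but not semantic ones (wrong coordinates, missing fills), which is why later training cycles also needed manual curation.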
## 🛠️ Architecture & Training
- **Base Model:** Llama 3.2 3B Instruct
- **Framework:** Apple MLX on Apple Silicon
- **Technique:** LoRA (Low-Rank Adaptation)
- **Training Data:** Curated SVG-description pairs with three refinement cycles
- **Output Format:** Valid XML with proper SVG namespace declarations, viewBox, and optimized syntax
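With MLX, a LoRA run of this shape is typically launched through the `mlx_lm.lora` command-line entry point. The invocation below is a sketch: the data path and hyperparameters are illustrative placeholders, not the actual values used to train SVG-Master.

```shell
# Hypothetical MLX LoRA fine-tuning command; --data expects a directory
# containing train.jsonl / valid.jsonl prompt-completion pairs.
python -m mlx_lm.lora \
  --model meta-llama/Llama-3.2-3B-Instruct \
  --train \
  --data ./svg_pairs \
  --iters 1000 \
  --batch-size 4
```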
Llama 3.2 3B was selected for its instruction-following capability. The model had no visual pretraining, but three training cycles built SVG competency:
- **Cycle 1**: Generated syntactically invalid SVG. Path data contained incorrect coordinate formats. Output showed no visual composition awareness.
- **Cycle 2**: Extensive data cleaning and manual curation. Syntax validity improved substantially. Pattern logic remained inconsistent for multi-element scenes.
- **Cycle 3**: Manual validation layers added to the training pipeline. Edge cases introduced. Complex compositions improved. Output reached stable syntactic validity.
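A validation layer of the kind added in Cycle 3 can be sketched as a filter over training pairs. The function below is a hypothetical stand-in, not the project's actual pipeline code; the rejection criteria (parse failure, wrong root element, missing viewBox) are assumptions based on the requirements stated above:

```python
import xml.etree.ElementTree as ET

def filter_training_pairs(pairs):
    """Keep only (description, svg) pairs whose SVG parses and declares a viewBox."""
    kept = []
    for description, svg in pairs:
        try:
            root = ET.fromstring(svg)
        except ET.ParseError:
            continue  # drop syntactically invalid examples
        if not root.tag.endswith("svg") or "viewBox" not in root.attrib:
            continue  # drop examples missing attributes the model must learn
        kept.append((description, svg))
    return kept

pairs = [
    ("a red square", '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10"><rect width="10" height="10" fill="red"/></svg>'),
    ("broken", "<svg><rect></svg>"),  # mismatched tags: rejected
]
print(len(filter_training_pairs(pairs)))  # 1
```

Filtering at curation time means the model only ever sees renderable targets, which is cheaper than trying to correct invalid output at inference time.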
## ✨ Capabilities
The model generates complete, ready-to-render SVG code with proper structure:
- **Valid XML** with correct SVG namespace declarations
- **Responsive Design** with explicit viewBox and width/height attributes
- **Gradient Definitions** in `<defs>` blocks where applicable
- **Multi-element Compositions** with layered shapes and proper element ordering
**Example input:** "A minimalist sunset over a calm ocean with orange and purple gradients"

**Example output:** Complete SVG with layered rectangles, gradients, and proper viewBox scaling.
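Output for a prompt like this would follow the structure described above. The fragment below is an illustrative hand-written example of that structure (gradient in `<defs>`, layered shapes, explicit viewBox), not an actual sample generated by the model:

```xml
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 400 300" width="400" height="300">
  <defs>
    <linearGradient id="sky" x1="0" y1="0" x2="0" y2="1">
      <stop offset="0%" stop-color="#7b2d8b"/>
      <stop offset="100%" stop-color="#ff8c42"/>
    </linearGradient>
  </defs>
  <rect width="400" height="180" fill="url(#sky)"/>
  <circle cx="200" cy="180" r="40" fill="#ffb347"/>
  <rect y="180" width="400" height="120" fill="#2e4a7d"/>
</svg>
```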
## 🛡️ Limitations
- Training data is limited to common design patterns; specialized visual styles will underperform
- Static SVG only; complex animations require manual adjustment
- Text rendering occasionally needs font and positioning corrections
- Highly detailed illustrations may require post-processing
- Not a replacement for human design work in production contexts
See the official Hugging Face documentation for complete technical details.
## 🚀 Quick Start
### Hugging Face

Access the model and documentation.
### Ollama

```shell
ollama pull fahidnasir/svg-master && ollama run fahidnasir/svg-master "Generate a blue glowing circuit board icon"
```
### Python

```python
from mlx_lm import load, generate

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model, tokenizer = load("fahidnasir/SVG-Master")

# Generate SVG code from a natural language description
svg_code = generate(model, tokenizer, prompt="Generate a blue glowing circuit board icon")
print(svg_code)
```
## 💡 Key Takeaways
- **Data quality is the primary determinant of output validity.** Clean training examples matter more than dataset size for structured code generation.
- **Three training cycles were necessary, not exceptional.** Systematic output failures at each stage revealed specific gaps requiring targeted data additions.
- **Base model choice shapes the ceiling.** Code-unspecialized models require more fine-tuning data to reach comparable syntax reliability.
- **Visual tasks require both structural and semantic alignment.** Prompt descriptions must accurately match their paired SVG, or the model learns inconsistent mappings.
- **Partial success is not success for SVG.** 90% valid output is not deployable if the remaining 10% renders as blank or broken graphics.