Open Source

[Project] htmLLM-50M base: Can a tiny specialist actually code? + Weights & Code (124M v2 in training!)

A tiny 50M-parameter model, trained on a single T4 GPU, can generate basic HTML and CSS from instructions.

Deep Dive

Independent AI developer LH-Tech-AI has released htmLLM-v1, an experimental 50-million-parameter language model hyper-specialized for generating HTML and CSS. Built on Andrej Karpathy's nanoGPT architecture with 8 layers and 8 attention heads, the model was trained on roughly 150 million tokens of HTML from The Stack-Smol dataset, then instruction-tuned on Alpaca-cleaned data. Remarkably, the entire training run was completed on a single Kaggle T4 GPU, demonstrating how accessible specialized model creation has become.
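Since the model was instruction-tuned on Alpaca-cleaned data, prompts presumably need to follow the Alpaca format. A minimal sketch of that template is below; whether htmLLM uses this exact wording is an assumption, but it is the conventional format associated with the Alpaca-cleaned dataset:

```python
# Standard Alpaca prompt template. That htmLLM uses this exact template is an
# assumption; it is the conventional format for Alpaca-cleaned training data.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a user request in the instruction-tuning format the model expects."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

print(build_prompt("Create a login form with email and password fields."))
```

The model's generation would then be whatever it emits after the `### Response:` marker.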

The model, dubbed a "Pocket Coder," shows surprising competence for its minuscule size. It can follow basic instructions to generate semantic HTML tags, form structures, and simple styling, though it struggles with complex frameworks such as Bootstrap and tends to hallucinate CSS properties. The developer is already at work on htmLLM-v2, a 124-million-parameter version with 12 layers, 12 heads, and an expanded 1024-token context window, currently in early training.
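A quick sanity check shows the announced sizes are consistent with standard GPT shapes. The embedding dimensions below (512 for v1, 768 for v2) and the vocabulary size are assumptions not stated in the release, but they make a rough parameter count land near the advertised figures:

```python
def estimate_gpt_params(n_layer: int, n_embd: int, vocab_size: int = 50304) -> int:
    """Rough decoder-only parameter count: ~12 * L * d^2 for the transformer
    blocks plus vocab * d for a tied token-embedding/output matrix.
    The default vocab_size is nanoGPT's padded GPT-2 vocabulary (an assumption)."""
    return 12 * n_layer * n_embd ** 2 + vocab_size * n_embd

# v1: 8 layers / 8 heads per the release; n_embd=512 is assumed.
print(f"v1 ~ {estimate_gpt_params(8, 512) / 1e6:.1f}M")
# v2: 12 layers / 12 heads per the announcement; n_embd=768 is assumed
# (the classic GPT-2-small shape).
print(f"v2 ~ {estimate_gpt_params(12, 768) / 1e6:.1f}M")
```

Under these assumptions the estimates come out near 51M and 124M, matching the advertised sizes; the v2 configuration is essentially GPT-2 small retrained on a narrow domain.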

This project represents a fascinating exploration into the limits of model specialization. While htmLLM-50M is clearly not a production-ready coding assistant (its 512-token context and occasional nonsensical outputs rule that out), it successfully demonstrates that even tiny models can develop domain-specific understanding when trained on focused datasets. The open-source release of both weights and training code invites the community to experiment with ultra-efficient, single-purpose AI tools.

Key Points
  • 50M-parameter nanoGPT model trained exclusively on HTML/CSS data from The Stack-Smol
  • Entirely trained on a single Kaggle T4 GPU using 150M tokens
  • Can generate basic semantic HTML and styling but hallucinates on complex layouts like Bootstrap

Why It Matters

Shows how extreme specialization enables useful AI capabilities on minimal hardware, opening doors for efficient, single-purpose models.