Handling Tokenization and Structured Inputs in LLMs
Large Language Models don't "see" JSON, CSV, XML, or Markdown the way we do. At the API layer, you send something that looks nicely structured. Inside the model...
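To make the point concrete, here's a minimal sketch. The `toy_tokenize` function is a hypothetical stand-in, not a real BPE tokenizer; it only illustrates that the model receives a flat stream of tokens, with nothing marking braces or keys as "structure":

```python
import re

def toy_tokenize(text):
    # Toy splitter (NOT real BPE): word runs, whitespace runs, and
    # single punctuation characters, mimicking how subword tokenizers
    # fragment structured text into a flat sequence.
    return re.findall(r"\w+|\s+|[^\w\s]", text)

doc = '{"user": {"id": 42, "name": "Ada"}}'
print(toy_tokenize(doc))
# The quotes, braces, and colons are just more tokens in the stream;
# the JSON tree you sent at the API layer is invisible at this level.
```

A real tokenizer (tiktoken, SentencePiece) fragments things even less predictably, merging quotes with keys or splitting numbers mid-digit, but the underlying point is the same: structure survives only as a statistical pattern over tokens.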
Deploying Vesuvian Vesuvian is finally ready to be released into the wild. It's far from complete, and if it were up to me, I'd still need to spend anothe...
I've been trying to self-host some AI models on my Hetzner server, and it's been an experience. My first idea was to run big models like gpt-oss-120b or Kimi-De...
Terminal IDEs and coding agents like Claude Code are popping up like mushrooms lately. I don't know if it's because it's easier than building a full-fledged fat...
I'm building a system to experiment with local LLM models and host my apps. I'm using a dedicated Hetzner server with a Kubernetes cluster on it. I've always wa...
What do you do when you want to be part of the local AI gang but can't afford GPUs? You run CPU inference and hope to hit 10 tokens per second. Before I start w...
What do you do when you keep reading about open-source models, people getting excited about what comes out of the Alibaba labs, and z.ai releasing GLM 4.6, but yo...
There are maybe two things you can learn from this post: 1. The dangers of vibe coding and DevOps in a fairly complex environment. I'd say I know what I'm doing m...