AI Safety - Not an Afterthought

If you’re running a Large Language Model (LLM) in production, you know the truth: AI safety is not an optional feature, a nice-to-have, or something you can negotiate away. It is a key design consideration, an architectural layer that must be built in from the beginning. Relying solely on the base model’s pre-training is naive at best and, in some cases, a recipe for disaster. We need clear, technical, layered defenses. This article breaks down the inherent risks we must address when designing LLM-centric applications and systems, and provides an illustrative example of how to concretely detect and mitigate a threat using an open model. While we’ll illustrate this using the Gemma 3 family of models, the principles and tooling apply across the board; other powerful open-source guardrail models exist as well, such as Llama Guard and the recently published, highly capable Qwen3Guard.

1. Understanding the Fundamental LLM Risk Modes

LLMs are inherently p...