Posts

Showing posts from October, 2025

Structured Evaluations – One Building Block of AI Safety

TL;DR: Evaluating AI systems is essential for the safety and satisfaction of our users. Evaluating stochastic AI systems is far more complex than evaluating a typical deterministic enterprise application. It requires an understanding not only of user needs, non-functional system requirements, and model capabilities, but also of how they intersect, in order to pick the right evaluation approach. A sound approach rests on a clear blueprint, the right evaluation metrics, and automation for transparency. A structured evaluation approach is a must. Industry-grade LLM application development frameworks facilitate the implementation.

Introduction

AI assistants such as chatbots excel in use cases where knowledge needs to be conveyed, whether that is a cooking recipe, the explanation of a complex physics topic, or a piece of interesting trivia. In our current project we apply LLMs to disseminate travel information to people interested in the beautiful city of Cologne. You want to know something about thi...