
FACTS Benchmark Suite: a new way to systematically evaluate LLM factuality

Large language models (LLMs) are increasingly becoming a primary source of information across diverse use cases, so it’s important that their responses are factually accurate. To keep improving their performance on this industry-wide challenge, we have to better understand the types of use cases where models struggle to provide an accurate response…


Gemma Scope 2: Helping the AI Safety Community Deepen Understanding of Complex Language Model Behavior

Announcing a new, open suite of tools for language model interpretability

Large Language Models (LLMs) are capable of incredible feats of reasoning, yet their internal decision-making processes remain largely opaque. Should a system not behave as expected, a lack of visibility into its internal workings can make it difficult to pinpoint the exact reason for…


AlphaFold: Five Years of Impact

Increasing speed of discovery

Cyril Zipfel, professor of Molecular & Cellular Plant Physiology at the University of Zurich and Sainsbury Lab, saw research timelines shrink drastically. They used AlphaFold alongside comparative genomics to better understand how plants perceive changes in their environment, paving the way for more resilient crops. AlphaFold has been cited in more…
