(Voiceover) Building on evaluation quicksand

Interconnects - A podcast by Nathan Lambert

Read the full post here: https://www.interconnects.ai/p/building-on-evaluation-quicksand

Chapters
00:00 Building on evaluation quicksand
01:26 The causes of closed evaluation silos
06:35 The challenge facing open evaluation tools
10:47 Frontiers in evaluation
11:32 New types of synthetic data contamination
13:57 Building harder evaluations

Figures
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/manual/openai-predictions.webp

Get full access to Interconnects at www.interconnects.ai/subscribe
