Description:
In this episode, Pallavi Koppol, Research Scientist at Databricks, explores the importance of domain-specific intelligence in large language models (LLMs). She discusses how enterprises need models tailored to their unique jargon, data, and tasks rather than relying solely on general benchmarks.
Highlights include:
- Why benchmarking LLMs for domain-specific tasks is critical for enterprise AI.
- An introduction to the Databricks Intelligence Benchmarking Suite (DIBS).
- Evaluating models on real-world applications like RAG, text-to-JSON, and function calling.
- The evolving landscape of open-source vs. closed-source LLMs.
- How industry and academia can collaborate to improve AI benchmarking.