In episode 123, I spoke with Suhail Doshi about:

* Why benchmarks aren’t prepared for tomorrow’s AI models
* How he thinks about artists in a world with advanced AI tools
* Building a unified computer vision model that can generate, edit, and understand pixels

Suhail is a software engineer and entrepreneur known for founding Mixpanel, Mighty Computing, and Playground AI (they’re hiring!).

Reach me at editor@thegradient.pub for feedback, ideas, and guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS

Follow The Gradient on Twitter

Outline:

* (00:00) Intro
* (00:54) Ad read: MLOps conference
* (01:30) Suhail is *not* in pivot hell but he *is* all-in on 50% AI-generated music
* (03:45) AI and music, similarities to Playground
* (07:50) Skill vs. creative capacity in art
* (12:43) What we look for in music and art
* (15:30) Enabling creative expression
* (18:22) Building a unified computer vision model, underinvestment in computer vision
* (23:14) Enhancing the aesthetic quality of images: color and contrast, benchmarks vs. user desires
* (29:05) “Benchmarks are not prepared for how powerful these models will become”
* (31:56) Personalized models and personalized benchmarks
* (36:39) Engaging users and benchmark development
* (39:27) What a foundation model for graphics requires
* (45:33) Text-to-image is insufficient
* (46:38) DALL-E 2 and Imagen comparisons, FID
* (49:40) Compositionality
* (50:37) Why Playground focuses on images vs. 3D, video, etc.
* (54:11) Open source and Playground’s strategy
* (57:18) When to stop open-sourcing?
* (1:03:38) Suhail’s thoughts on AGI discourse
* (1:07:56) Outro

Links:

* Playground homepage
* Suhail on Twitter

Get full access to The Gradient at thegradientpub.substack.com/subscribe