Description:
In episode 25 of The Gradient Podcast, we talk to Greg Yang, a senior researcher at Microsoft Research. Greg Yang’s Tensor Programs framework recently received attention for its role in the µTransfer paradigm for tuning the hyperparameters of large neural networks.
Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter
Sections:
(00:00) Intro
(01:50) Start in AI / Research
(05:55) Fear of Math in ML
(08:00) Presentation of Research
(17:35) Path to MSR
(21:20) Origin of Tensor Programs
(26:05) Refining TP’s Presentation
(39:55) The Sea of Garbage (Initializations) and the Oasis
(47:44) Scaling Up Further
(55:53) On Theory and Practice in Deep Learning
(01:05:28) Outro
Episode Links:
Greg’s Homepage
Greg’s Twitter
µP GitHub
Visual Intro to Gaussian Processes (Distill)
Get full access to The Gradient at thegradientpub.substack.com/subscribe