Curated by: Programming Throwdown (374 videos)
Intro topic: Grills

News/Links:
* You can’t call yourself a senior until you’ve worked on a legacy project
  * https://www.infobip.com/developers/blog/seniors-working-on-a-legacy-project
* Recraft might be the most powerful AI image platform I’ve ever used - here’s why
  * https://www.tomsguide.com/ai/ai-image-video/recraft-might-be-the-most-powerful-ai-image-platform-ive-ever-used-heres-why
* NASA has a list of 10 rules for software development
  * https://www.cs.otago.ac.nz/cosc345/resources/nasa-10-rules.htm
* AMD Radeon RX 9070 XT performance estimates leaked: 42% to 66% faster than Radeon RX 7900 GRE
  * https://www.tomshardware.com/tech-industry/amd-estimates-of-radeon-rx-9070-xt-performance-leaked-42-percent-66-percent-faster-than-radeon-rx-7900-gre

Book of the Show
* Patrick:
  * The Player of Games (Iain M. Banks)
  * https://a.co/d/1ZpUhGl (non-affiliate)
* Jason:
  * Basic Roleplaying Universal Game Engine
  * https://amzn.to/3ES4p5i

Patreon Plug
https://www.patreon.com/programmingthrowdown?ty=h

Tool of the Show
* Patrick:
  * Pokémon Sword and Shield
* Jason:
  * Features and Labels (https://fal.ai)

Topic: Reinforcement Learning
* Three types of AI
  * Supervised Learning
  * Unsupervised Learning
  * Reinforcement Learning
* Online vs Offline RL
* Optimization algorithms
  * Value optimization
    * SARSA
    * Q-Learning
  * Policy optimization
    * Policy Gradients
    * Actor-Critic
    * Proximal Policy Optimization
* Value vs Policy Optimization
  * Value optimization is more intuitive (value loss)
  * Policy optimization is less intuitive at first (policy gradients)
  * Converting values to policies in deep learning is difficult
* Imitation Learning
  * Supervised policy learning
  * Often used to bootstrap reinforcement learning
* Policy Evaluation
  * Propensity scoring versus model-based
* Challenges in training RL models
  * Two optimization loops
    * Collecting feedback vs updating the model
  * Difficult optimization target
  * Policy evaluation
* RLHF & GRPO

Programming Throwdown Episode 180, March 17, 2025
★ Episode details: https://share.transistor.fm/s/3d34f87a
★ Additional episodes: https://www.programmingthrowdown.com/
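
The outline lists SARSA and Q-Learning together under value optimization. As a rough illustration of how the two differ, here is a minimal tabular sketch. It is not taken from the episode: the function names, the dictionary-based Q-table, and the `alpha` (learning rate) and `gamma` (discount) defaults are all assumptions chosen for the example.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Off-policy update: bootstrap from the best action available next."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])


def make_table():
    """Hypothetical toy Q-table: two actions already valued in state 's1'."""
    q = defaultdict(float)
    q[("s1", "left")] = 1.0
    q[("s1", "right")] = 2.0
    return q


# Same transition, two different learning targets.
q_off = make_table()
q_learning_update(q_off, "s0", "go", 0.0, "s1", ["left", "right"])

q_on = make_table()
sarsa_update(q_on, "s0", "go", 0.0, "s1", "left")

print(q_off[("s0", "go")], q_on[("s0", "go")])
```

With these toy numbers, Q-Learning moves its estimate toward the value of the better next action (`right`), while SARSA moves toward the value of the action the agent actually took (`left`), so the two estimates for the same transition diverge.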