Why AI Isn't Just More Software: A Guide to ML, MLOps, and Reinforcement Learning

Podcast | October 28, 2025

MACHINE LEARNING | MLOPS | REINFORCEMENT LEARNING | AI STRATEGY | SOFTWARE ENGINEERING

Why can't you apply Agile sprints to an AI project? This episode dives into why ML development is 'fuzzy' and non-linear, unlike traditional software. We explore the 'nothing, nothing, something' problem that frustrates engineers and managers alike. Discover the real-world challenges of MLOps, from testing non-deterministic models to deployment. The conversation also breaks down Reinforcement Learning (RL), explaining how it learns from exploration, the high-stakes risks, and its role in training LLMs.

Hosted by:

Deejay

Featuring:

Phil Winder, Winder.AI

Episode Highlights

• Discusses why AI projects are 'fuzzy' and non-linear, unlike the prescriptive, plannable nature of traditional software engineering.
• Explains that ML models need to be 'massaged and babied' and often show 'nothing, nothing, something' progress, which frustrates agile teams.
• Shows that testing AI models is probabilistic rather than pass/fail, and requires 'fuzzing' to find edge cases where the model lacks data.
• Defines Reinforcement Learning (RL) by its 'agency to explore' an environment, a key difference from other types of ML.
• Identifies the need for a safe simulation as the biggest challenge in RL, because live exploration in industrial settings 'could be catastrophic.'
• Presents 'Offline Reinforcement Learning' as a powerful alternative that can train effective agents purely from pre-existing logged data.
• Notes that modern LLMs are already fine-tuned with RL, using human feedback to improve their conversational behavior.
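
To make the point about probabilistic testing concrete, here is a minimal Python sketch. It is an illustration only, not anything shown in the episode: `toy_model`, `accuracy_under_noise`, the sample inputs, and the 0.95 threshold are all hypothetical names and values chosen for the example. It perturbs inputs with small random noise (a crude form of fuzzing) and checks a measured accuracy against a threshold, instead of asserting one exact output.

```python
import random


def toy_model(x: float) -> int:
    """Hypothetical stand-in for a trained classifier: predicts 1 when x > 0.5."""
    return 1 if x > 0.5 else 0


def accuracy_under_noise(model, examples, noise=0.05, trials=200, seed=0):
    """Estimate accuracy when each input is perturbed by small uniform noise."""
    rng = random.Random(seed)  # fixed seed so the measurement is repeatable
    correct = 0
    total = 0
    for _ in range(trials):
        for x, label in examples:
            perturbed = x + rng.uniform(-noise, noise)
            correct += int(model(perturbed) == label)
            total += 1
    return correct / total


# Inputs far from the decision boundary versus inputs close to it.
easy_cases = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]
edge_cases = [(0.48, 0), (0.49, 0), (0.51, 1), (0.52, 1)]

easy_acc = accuracy_under_noise(toy_model, easy_cases)
edge_acc = accuracy_under_noise(toy_model, edge_cases)

# The "test" is a threshold on a measured rate, not an exact expected output.
ACCURACY_THRESHOLD = 0.95  # assumed acceptance level for this illustration
easy_ok = easy_acc >= ACCURACY_THRESHOLD
edge_ok = edge_acc >= ACCURACY_THRESHOLD

print(f"easy cases: accuracy={easy_acc:.2f}, passes={easy_ok}")
print(f"edge cases: accuracy={edge_acc:.2f}, passes={edge_ok}")
```

In this toy setup the inputs far from the boundary clear the threshold, while the perturbed inputs near the boundary do not. That is the kind of weak spot fuzzing is meant to surface in a real model.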