
Beyond The Pilot: Enterprise AI in Action
VentureBeat·30 episodes
AI gets real here. On “Beyond the Pilot,” top business execs share what actually happens after the AI proof of concept — from infrastructure and org design to wins, failures, and ROI. Not theory, but deep dives into how they scaled AI that works.
Episodes
Pinterest's open-source AI stack costs 90% less than frontier models — and their custom-trained recommender outperforms off-the-shelf alternatives by 30% in accuracy. Pinterest CTO Matt Madrigal breaks down exactly how they did it, and what enterprise AI teams can actually replicate. Madrigal walks through the full architecture behind Navigator 1, Pinterest's conversational shopping assistant built on Qwen 3 VL — and the specific decision to rip out its native vision encoder and replace it with PinCLIP, Pinterest's proprietary multimodal embedding layer. That swap alone closes a 20x inference latency gap and makes the economics work at 620 million monthly active users. This is the clearest public explanation yet of how a scaled platform operationalizes the "core vs. context" principle for model selection: open-source and custom-built where it touches the user, frontier models where speed-to-prototype matters more than cost. The conversation also covers the Taste Graph — Pinterest's knowledge graph across hundreds of billions of pins and 15 billion boards — and how post-training on that proprietary data lets a smaller, fit-for-purpose model beat a larger frontier model on production metrics. Madrigal details their eval framework: gold set benchmarks, product-level evals tied to engagement and merchant click outcomes, and a structured A/B test pipeline that runs from engineer PRs through to live user signal. On the organizational side: how Pinterest manages a "default yes" multi-IDE policy (Cursor, Windsurf, Claude Code, Codex) without collapsing security posture, how they segment sandbox environments between ML engineers with Taste Graph access and general application developers, and why Madrigal measures AI coding ROI in token usage and experimentation velocity — not lines of code. 
🎙️ GUEST: Matt Madrigal | CTO, Pinterest
🎙️ HOSTS: Matt Marshall | VentureBeat, Sam Witteveen | VentureBeat
00:00 Show Intro and Guest
01:17 Open Source Cost Breakdown
02:20 Pinterest Multimodal Roots
02:37 PinClip and Embeddings
05:46 Core vs Context Models
07:43 Navigator 1 Assistant Stack
11:52 Benchmarking and Evals
13:29 Accuracy from Proprietary Data
17:16 Taste Graph Explained
18:29 Taste Graph in Training
22:22 Fighting AI Slop
25:16 Developer Tools and Velocity
27:57 Tool Choice and Governance
28:56 Security Sandboxes and CICD
30:57 Wrap Up
Subscribe to VentureBeat: https://www.youtube.com/@VentureBeat
Apple Podcasts: https://podcasts.apple.com/us/podcast/venturebeat/id1839285239
Spotify: https://open.spotify.com/show/4Zti73yb4hmiTNa7pEYls4
Website: https://venturebeat.com
LinkedIn: https://www.linkedin.com/company/venturebeat
Newsletter: https://venturebeat.com/newsletters
#EnterpriseAI #OpenSourceAI #AIInfrastructure #LLM #MachineLear
Enterprise GPU hoarding is over. LinkedIn CTO Erran Berger and VentureBeat analyst Rob Strechay break down what comes next — and the infrastructure math most enterprises are only now being forced to confront. VentureBeat's Q1 research shows GPU availability anxiety dropped from 20.8% to 15.4% among enterprise teams, while cost-per-inference and TCO concerns jumped from 34% to 41% — a number that's still climbing. The hoarding phase is giving way to an audit phase, and the companies that didn't build the instrumentation to understand their workloads are now paying for it. Erran Berger explains how LinkedIn runs one of the few remaining at-scale applied ML shops outside the hyperscalers — owning the full stack from bare metal GPU clusters to member-facing products. That means LinkedIn engineers can optimize custom CUDA kernels, compress embeddings, prune models for throughput, and adapt networking and storage per workload — trade-offs that are simply unavailable on public cloud instance menus. The result: a rigorous ROI framework that evaluates not just current traffic costs, but the traffic shape agents will drive in 2–3 years. On the market side, 72% of enterprises admit they lack sufficient control over their AI infrastructure. Open-source inference tools like vLLM and LLMD are seeing rapid adoption, while 17% of organizations have moved to full-stack ownership. Hyperscalers report 60–80% of workloads have already shifted from training to inference — and most enterprise teams are still figuring out how to staff and instrument for that reality. 
🎙️ GUEST: Erran Berger | CTO, LinkedIn
🎙️ ANALYST: Rob Strechay | VentureBeat
🎙️ HOST: Matt Marshall | CEO, VentureBeat
---
00:00 Intro: The GPU Hoarding Hangover
00:10 Guest Introductions
02:00 VentureBeat Q1 Data: GPU Panic Fades, TCO Concerns Rise
03:00 LinkedIn's Early Shift to Inference ROI Discipline
04:00 Budget Moving Into Inference Optimization and Control
07:00 LinkedIn's Full-Stack Advantage: Kernels, Pruning, Embedding Compression
08:00 Private AI and Sovereign Stacks: What the Q1 Data Shows
09:00 Open Source Inference Tooling: vLLM, LLMD, RDMA
10:00 Data Sovereignty at LinkedIn Scale: Member Data and Board-Level ROI Framing
12:00 Why Instrumentation Beats GPU Hoarding
13:00 Planning for Ambient Agent Traffic - Not Just Today's Workloads
14:00 Closing Advice for the Enterprise CTO Staring at 5% GPU Utilization
---
Subscribe to VentureBeat: https://www.youtube.com/@VentureBeat
Apple Podcasts: https://podcasts.apple.com/us/podcast/venturebeat/id1839285239
Spotify: https://open.spotify.com/show/4Zti73yb4hmiTNa7pEYls4
Website: https://venturebeat.com
LinkedIn: https://www.linkedin.com/company/venturebeat
Newsletter: https://venturebeat.com/newsletters
#EnterpriseAI #AIInfrastructure #MLOps #InferenceOptimizati
The CEO who built one of the most-starred RAG frameworks on GitHub (47,000 stars) just publicly declared that frameworks like his are becoming obsolete — and then pivoted his entire company around that conclusion. Jerry Liu, CEO and co-founder of LlamaIndex, joins Matt Marshall and Sam Witteveen to explain exactly what broke in the AI stack, why 95% of his team's code is now AI-generated, and where the real defensibility in enterprise AI infrastructure actually lives in 2026. The conversation covers the specific architectural shift that made RAG orchestration frameworks less central: agent reasoning has improved to the point where dumb tools plus smart agents outperform sophisticated retrieval pipelines, coding agents have collapsed the cost of custom integrations, and model providers like Anthropic are consolidating the harness layer around MCP, sandboxes, and session state. Jerry walks through Anthropic's managed agent diagram as a real architectural reference point and explains why engineering leaders should prioritize modular interfaces over implementation investment — because parts of your current stack will need to be thrown away in months, not years. On SaaS survival, Jerry argues the companies that retain value are those becoming systems of record — and that the real opportunity is building AI agents that automate labor on top of their platforms, not defending UI/UX that agents are now bypassing. On LlamaIndex's own bet: document understanding — parsing PDFs, tables, charts, and forms at higher accuracy and lower cost than frontier models — is the context layer every agent stack needs regardless of which model wins the next benchmark cycle. LlamaParse and the newly released open-source ParseBench (April 13) are the commercial expression of that thesis. If you're evaluating your AI stack architecture, deciding how much to build vs. buy, or trying to understand where horizontal tooling still has a moat, this episode is the conversation. 
🎙️ GUEST: Jerry Liu | CEO & Co-Founder, LlamaIndex
🎙️ HOSTS: Matt Marshall | VentureBeat, Sam Witteveen | VentureBeat
---
**CHAPTERS**
00:00 Intro - LlamaIndex's origin and RAG framework origins
02:00 How LlamaIndex started: GPT-3, 4K context windows, and GPT Index
04:00 Why AI frameworks are becoming less useful in the agentic era
07:00 What changed in the stack: agent reasoning, coding agents, and RAG's evolution
09:00 How Anthropic's managed agent diagram reframes enterprise architecture
13:00 The lock-in question: managed agents, session state, and stack modularity
16:00 Should you build horizontal tooling? Why Jerry says probably not
18:00 Open vs. closed: the Apple/Android analogy applied to frontier labs
21:00 The abstraction level is rising - English is the new programming language
24:00 SaaS market cap destruction: who survives agents eating software
28:00 The "full
Cisco's OutShift deployed a multi-agent network configuration system that raised error detection from 10–15% to 100% and cut full change validation from 2–3 weeks to 6–7 minutes. The reason it worked — and why most enterprise multi-agent deployments still fail — comes down to a single gap nobody is talking about: agents can connect, but they cannot think together. Vijoy Pandey, SVP and General Manager of OutShift by Cisco, joins Matt and Sam to explain why A2A, MCP, and existing agent protocols solve connectivity but leave out an entire layer: shared cognition. OutShift's research identifies this as a missing "Layer 9" — a semantic and cognitive communication stack above today's syntactic protocols — and they're already building it. The conversation covers the four pillars of enterprise-grade multi-agent infrastructure (discovery, identity/access, communication, observability), why standard IAM models break when agents enter the picture, and how OutShift extended OpenTelemetry with Microsoft to cover multi-agent evaluation. Vijoy introduces three new cognition-state protocols — SSTP (Semantic State Transfer), LSTP (Latent Space Transfer), and CSTP (Compressed State Transfer) — and explains the staged rollout path for each, including a published MIT collaboration called the Ripple Effect Protocol. The healthcare scheduling case study is particularly instructive: three independent third-party agents — insurance, diagnostics, scheduling — each with competing optimization functions and siloed context, and zero shared intent. That's the real multi-vendor, multi-org enterprise problem. Vijoy explains what an orchestrator can't fix, and what a cognitive fabric layer would. 
🎙️ GUEST: Vijoy Pandey | SVP & General Manager, OutShift by Cisco
🎙️ HOSTS: Matt Marshall | VentureBeat, Sam Witteveen | VentureBeat
---
**CHAPTERS**
00:00 Intro & Cold Open: Agents Connect But Can't Think Together
00:03 Welcome & Guest Introduction: Vijoy Pandey, OutShift by Cisco
00:04 Do Agents Work Outside Coding & Customer Support? Challenging Amjad Masad's Diagnosis
00:05 What's Wrong With A2A and MCP? The Four Pillars of AGNTCY
00:08 Identity & Access Management for Agents: Why IAM Breaks and What TBAC Fixes
00:12 The Network Digital Twin: How OutShift Achieved 100% Error Detection in Production
00:13 From 2–3 Weeks to 6–7 Minutes: Real Results From Deployed Multi-Agent Networking
00:15 Agents Can Connect But Can't Think Together: The Core Thesis
00:20 The Cognitive Revolution Analogy: Shared Intent, Shared Context, Collective Innovation
00:25 The Healthcare Scheduling Case Study: Three Competing Agents, Zero Shared Intent
00:31 Why Orchestrators Fail in Multi-Vendor, Multi-Org Environments
00:36 Introducing Layer 9: SSTP, LSTP, and CSTP - The Cognition-State Protocol Stack
00:41 What OutShift Is Building Now: Protocols, Fabric, and Cogn
A QuickBooks customer discovered significant fraud by asking their AI assistant follow-up questions about transaction amounts that didn't add up. This isn't a demo — it's one of 3 million customers now using Intuit's AI agents in production, with 80.5% returning to use them again. Marianna Tessel, EVP and GM of QuickBooks (formerly CTO of Intuit), walks through the architecture decisions behind one of the first enterprise AI deployments at true scale. Intuit's "done-for-you" agents now automate book closing, reconciliation, transaction categorization, and payroll — but the breakthrough came when they realized chatbots alone weren't enough. Businesses wanted human experts integrated directly into AI workflows, creating what Intuit calls the "AI + HI" model (artificial intelligence + human intelligence). The results: invoices paid 5 days faster, 90% more paid in full, 30% reduction in manual work, and 62% of users reporting bookkeeping is easier. Tessel reveals the technical evolution: moving from monolithic agents to a dynamic orchestration layer that routes queries across multiple LLMs (including Intuit's proprietary FinLM built on open-source), 24,000 bank connections, and 600,000 customer attributes. The system now handles proactive anomaly detection, benchmarking against similar businesses, and even nascent vibe coding — all without requiring users to understand they're essentially programming workflows through natural language. She also addresses the "SaaS apocalypse" narrative head-on, explaining why QuickBooks saw 18% growth last quarter while competitors faced market pressure: durable data advantages and customer trust in financial accuracy matter more than ever when AI enters the mix. For enterprise builders navigating agent architecture, data grounding, and human-in-the-loop design, this is a rare look inside a working system serving millions. 
🎙️ GUEST: Marianna Tessel | EVP & GM, QuickBooks (Intuit)
🎙️ HOSTS: Matt Marshall | VentureBeat, Sam Witteveen | VentureBeat
00:00 Intro - Customer discovers fraud using QuickBooks AI
03:26 Intuit Intelligence: Agents, BI, and human expertise integration
05:20 First-time AI users and going beyond chatbots
08:02 How Intuit decides which workflows to automate
10:16 Sponsor: Outshift by Cisco
10:38 Human-in-the-loop: When to insert experts vs. full automation
13:00 The AI + HI model: Why customers want human verification
15:24 Human expertise as confidence layer, not just AI check
16:14 Proprietary data advantage: 24K bank connections, 600K attributes
18:39 Benchmarking: "Businesses like me" - using aggregate data for competitive insights
19:52 First-party vs. third-party data strategy
21:38 Addressing the "SaaS apocalypse" narrative - why Intuit grew 18% last quarter
24:39 Proactive AI: Anomaly detection for marketing expense spikes
25:20 Builder perspective: Leaning
Major SaaS companies including Salesforce, Intuit, and ServiceNow saw stock drops of 45-50% as enterprises shift from bloated software suites to personalized AI agents that users can control directly. Microsoft just capitulated this week, opening Copilot to allow Claude Cowork-style functionality — a clear signal that the "build vs. buy" calculus for enterprise software has fundamentally changed. Matt Marshall and Sam Witteveen break down why personalization is no longer optional for enterprise products. Companies like Zoom now offer personalized workflows that access your conversation history and profile context. Infrastructure decisions are moving fast: token budgets must account for per-user context, identity management has become the biggest technical challenge for agent deployments, and "skills" (not just MCP) are emerging as the key abstraction layer. Zoom's Li Juan explains how their AI Companion moved beyond generic templates to user-controlled personalization: tracking opinion divergence in meetings, generating follow-up emails with specific context controls, and giving users explicit prompt examples instead of "good luck with your prompt." This is the new standard. If your product can't reason over which tools to use, which skills to apply, and which context to pull — all personalized to the individual user — you're competing with something that can be built in 10 days (Cowork's timeline). The agents-are-taking-over reality is here: multi-user agent architectures require thinking about context contamination, security postures for computer-use capabilities, and whether you're building internal agents or buying SaaS that will adapt. Sam's take: "AGI is agentic, and we're well along that continuum now." 
🎙️ HOSTS: Matt Marshall | CEO, VentureBeat & Sam Witteveen | VentureBeat
📺 CHAPTERS:
00:00 Intro - The SaaS Apocalypse
00:01:00 The Personalization Imperative
00:02:00 Microsoft Copilot Capitulates to Cowork
00:03:00 From Template Selection to Skill Generation
00:04:00 The Land Grab for User Context
00:05:00 Zoom's Li Juan on Personalized Meeting Intelligence
00:06:00 Why Context = Magic in Enterprise AI
00:07:00 Product-Market Fit in the Agent Era
00:08:00 Metrics That Matter: JP Morgan's 30,000 Agents
00:09:00 Build vs. Buy: The New Calculus
00:10:00 Why Slack Might Win on Agent Identity Management
00:11:00 Zoom's AI Companion: Control Over Randomness
00:13:00 Li Juan on Purposeful Prompts and Reference Control
00:15:00 Multi-Agent vs. Multi-User: The Critical Distinction
00:16:00 LinkedIn's GPU Optimization Strategy
00:17:00 AGI Is Agentic: Where
Presented by Outshift by Cisco
Outshift is Cisco’s emerging tech incubation engine and driver of Agentic AI, quantum, and next-gen infrastructure. Learn more at outshift.cisco.com
LangChain told employees they cannot install OpenClaw on company laptops due to "massive security risk" — yet this unhinged approach is exactly what makes it work. Harrison Chase unpacks why OpenClaw succeeds where AutoGPT failed, and why context engineering, not just smarter models, separates demo agents from production-ready systems. The shift is architectural: Modern agent harnesses like Claude Code now dump 40,000-token API responses to file systems instead of cramming them into message history. LangChain's Deep Agents framework emerged from reverse-engineering Claude Code, Codex, and Deep Research — discovering they all use planning via to-do lists, subagents for focused work, file systems for context control, and 2000-line system prompts. Harrison explains why coding agents make surprisingly good general-purpose agents, how prompt caching creates accuracy trade-offs, and why "context engineering" — bringing the right information in the right format to the LLM at the right time — matters more than framework choice. For enterprise teams: Harrison breaks down LangGraph (agent runtime with durable execution), LangChain (unopinionated agent framework), and Deep Agents (batteries-included harness). The conversation covers when to use graphs vs. loops, how skills differ from tools and subagents, and why nine months ago marked the inflection point where models could finally run reliably in autonomous loops. 
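The file-system pattern mentioned above, where a large tool response is written to disk and only a short pointer stays in the message history, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used by Claude Code or Deep Agents: the function name `offload_tool_result`, the character threshold, and the file-naming scheme are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path

# Hypothetical threshold: results longer than this are written to disk.
MAX_INLINE_CHARS = 2_000


def offload_tool_result(result: str, workdir: Path, preview_chars: int = 200) -> str:
    """Return the text that should be appended to the message history.

    Short results are returned unchanged. Long results are saved to a file
    in `workdir`, and only a pointer plus a short preview is returned.
    """
    if len(result) <= MAX_INLINE_CHARS:
        return result

    # Name the file after a content hash so identical results share a file.
    digest = hashlib.sha256(result.encode("utf-8")).hexdigest()[:12]
    path = workdir / f"tool_result_{digest}.txt"
    path.write_text(result, encoding="utf-8")

    preview = result[:preview_chars]
    return (
        f"[tool result saved to {path.name}, {len(result)} chars]\n"
        f"Preview: {preview}..."
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        history = [
            offload_tool_result("status: ok", workdir),
            offload_tool_result("x" * 50_000, workdir),
        ]
        for message in history:
            print(len(message), message[:60])
```

The design choice being illustrated is that the model sees a small, predictable message and can read the file later if it needs the detail, rather than carrying the full payload in every subsequent turn.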
🎙️ GUEST: Harrison Chase | Co-founder & CEO, LangChain
🎙️ HOSTS: Matt Marshall | CEO, VentureBeat | Sam Witteveen | VentureBeat
**CHAPTERS:**
00:00 Intro - OpenClaw security warning
01:00 LangChain's origin story: From open source library to company
03:00 Early LLM patterns: RAG and SQL agents before ChatGPT
05:00 Why OpenClaw works where AutoGPT failed
08:00 Step change in agent capability: The summer 2024 inflection
11:00 Deep Agents unpacked: Planning, subagents, file systems, prompting
14:00 Skills vs tools vs subagents
16:00 LangGraph, LangChain, and Deep Agents architecture
19:00 Context engineering: What the LLM sees vs what developers see
21:00 File systems for context management vs AutoGPT's approach
**LINKS:**
Subscribe to VentureBeat: https://www.youtube.com/@VentureBeat
Apple Podcasts: https://podcasts.apple.com/us/podcast/venturebeat/id1839285239
Spotify: https://open.spotify.com/show/4Zti73yb4hmiTNa7pEYls4
Website: https://venturebeat.com
LinkedIn: https://www.linkedin.com/company/venturebeat
Newsletter: https://venturebeat.com/newsletters
#EnterpriseAI #AIAgents #LangChain #AgenticAI #LLMInfrastructure
Learn more about your ad choices. Visit megaphone.fm/adchoices
On February 2nd, a single plugin wiped nearly $800 billion off the enterprise software market. Wall Street is terrified that AI agents are about to eat the legal industry's lunch. But LexisNexis isn't scared: they're building the moat. In this episode of Beyond the Pilot, Min Chen (Chief AI Officer, LexisNexis) reveals the sophisticated architecture they built to counter the "LLM wrapper" revolution. Moving beyond standard RAG, Min breaks down their move to "GraphRAG", their deployment of Agentic workflows (using Planner and Reflection agents), and why they created a proprietary "Usefulness Score" because standard accuracy metrics weren't good enough for lawyers.
AI Gets Real Here. No theory, just the execution roadmap for deploying AI in a zero-error environment.
In this episode, we cover:
The "Dangerous RAG" Problem: Why semantic search fails in professional domains (retrieving "relevant" but overruled cases) and how "Point of Law" knowledge graphs fix it.
The "Usefulness" Metric: The 8 sub-metrics LexisNexis uses (including Authority, Comprehensiveness, and Fluency) to grade AI quality.
Agentic ROI: How deploying a "Planner Agent" to break down complex questions increased answer usefulness by 20%.
The "Reflection Agent": Using a secondary agent to critique and refine drafts in real-time.
Hallucination Detection: Why you should never rely on an LLM to judge its own hallucinations (and the deterministic code they use instead).
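The episode description does not say what the deterministic hallucination check looks like. One possible shape, shown purely as an assumption, is to verify that every citation in a draft answer refers to a document that was actually retrieved. The bracketed citation format, the regular expression, and the function name `find_unsupported_citations` are hypothetical and are not taken from LexisNexis.

```python
import re

# Assumed citation format for this sketch: bracketed IDs such as [case:123].
CITATION_PATTERN = re.compile(r"\[([a-z]+:\d+)\]")


def find_unsupported_citations(answer: str, retrieved_ids: set[str]) -> list[str]:
    """Return citation IDs in `answer` that are absent from `retrieved_ids`.

    The check is a plain set lookup, so the same inputs always give the
    same verdict; no model is asked to grade its own output.
    """
    unsupported: list[str] = []
    for citation_id in CITATION_PATTERN.findall(answer):
        if citation_id not in retrieved_ids and citation_id not in unsupported:
            unsupported.append(citation_id)
    return unsupported


if __name__ == "__main__":
    retrieved = {"case:101", "case:202"}
    draft = "The rule was set in [case:101] and later narrowed in [case:999]."
    print(find_unsupported_citations(draft, retrieved))
```

A check like this only catches citations that point outside the retrieved set. It says nothing about whether a retrieved case has been overruled, which is the separate problem the knowledge-graph approach is described as addressing.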
⏱️ TIMESTAMPS
00:00 - Intro: The $800 Billion AI Threat to Legal Tech
02:18 - Min Chen’s Journey: From Feature Engineering to Chief AI Officer
05:55 - Why Standard RAG Fails in Law (and How GraphRAG Fixes It)
10:40 - "Accuracy" is a Vanity Metric: The 8-Point Usefulness Score
14:20 - The "Auto-Eval" Framework: Human-in-the-Loop at Scale
16:40 - The Secret Sauce: Don't Use LLMs to Detect Hallucinations
21:15 - Agentic AI: How "Planner Agents" Drove a 20% Gain
22:00 - The "Reflection Agent": Self-Critique Loops for Drafting
30:30 - Distillation: Balancing Cost, Speed, and Quality
32:45 - Min’s Advice: Don't Build the Product First (Build the Metrics)
Presented by Outshift by Cisco
Outshift is Cisco’s emerging tech incubation engine and driver of Agentic AI, quantum, and next-gen infrastructure. Learn more at outshift.cisco.com.
About VentureBeat: VentureBeat equips enterprise technology leaders with the clearest, expert guidance on AI – and on the data and security foundations that turn it into worki
While most of the world is still running GenAI pilots, Mastercard is running AI inference on 160 billion transactions a year, with a hard latency limit of 50 milliseconds per score. In this episode of Beyond the Pilot, Johan Gerber (EVP of Security Solutions) and Chris Merz (SVP of Data Science) open the hood on one of the world's largest production AI systems: Decision Intelligence Pro. They reveal how they moved beyond legacy rules engines to build Recurrent Neural Networks (RNNs) that act as "inverse recommenders", predicting legitimate behavior faster than the blink of an eye.
AI Gets Real Here. This isn't just about defense. Johan and Chris detail how they are taking the fight to criminals by leveraging Generative AI to engage scammers with "honeypots," expose mule accounts, and map fraud networks globally.
In this episode, we cover:
The 50ms Inference Challenge: How Mastercard optimized their RNNs to score transactions at a peak rate of 70,000 per second.
"Scamming the Scammers": How GenAI agents are being used to automate honeypot conversations and extract mule account data.
The "Inverse Recommender" Architecture: Why Mastercard treats fraud detection as a recommendation problem (predicting the next likely merchant).
Org Design for Scale: The "Data Science Engineering Requirements Document" (DSERD) Chris used to align four separate engineering teams.
The Hybrid Infrastructure: Why moving to Databricks and the cloud was necessary to cut innovation cycles from months to hours.
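The three figures quoted above (160 billion transactions a year, a 70,000-per-second peak, and a 50 ms budget) can be related with some quick arithmetic. The sketch below assumes a 365-day year and, for the concurrency estimate, that every score uses the full 50 ms budget, so the in-flight number is an upper bound rather than a measured value.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, ignoring leap years

transactions_per_year = 160e9   # from the episode description
peak_tps = 70_000               # from the episode description
latency_budget_s = 0.050        # 50 ms hard limit per score

average_tps = transactions_per_year / SECONDS_PER_YEAR
peak_to_average = peak_tps / average_tps

# Little's law: items in flight = arrival rate x time in system.
# Assumes every score uses the full budget, so this is an upper bound.
max_in_flight_at_peak = peak_tps * latency_budget_s

print(f"average TPS: {average_tps:,.0f}")
print(f"peak / average: {peak_to_average:.1f}x")
print(f"in-flight scores at peak: {max_in_flight_at_peak:,.0f}")
```

Under these assumptions the average load is roughly 5,000 transactions per second, the stated peak is about 14 times that, and at peak no more than about 3,500 scores would be in flight at once.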
🚀 CHAPTERS
00:00 - Intro: 160 Billion Transactions & 50ms Decisions
02:08 - Thinking Like a Criminal: Johan’s Law Enforcement Background
06:22 - Org Design: Why AI is the "Middle Lane" of Engineering
11:00 - The Scale: 70k Transactions Per Second
15:47 - Decision Intelligence Pro: The "Inverse Recommender" RNN
23:00 - The "Lego Block" Strategy: Aligning Data Science & Engineering
33:00 - Infrastructure: Why Cloud/Databricks was Non-Negotiable
37:00 - GenAI Offensive: Threat Hunting & "Scamming the Scammers"
46:40 - "Honeypots" and Detecting Mule Accounts
52:00 - Advice for Technical Leaders: Talent & Prioritization
Presented by Outshift by Cisco
Outshift is Cisco’s emerging tech incubation engine and driver of Agentic AI, quantum, and next-gen infrastructure. Learn more at outshift.cisco.com.
About VentureBeat: VentureBeat equips enterprise technology leaders with the clearest, expert guidance on AI – and on the data and security foundations that turn it into working reality.
🔗 CONNECT WITH US
Subscribe to our Newsletters for technical breakdowns: https://venturebeat.com/newsletters
Visit Ve
While the rest of the industry chases massive models, LinkedIn quietly achieved a major engineering breakthrough by going small. In this episode of Beyond the Pilot, Erran Berger (VP of Product Engineering, LinkedIn) opens the "cookbook" on how they distilled massive 7B parameter models down to ultra-efficient 600M parameter "student" models, scaling AI to 1.2 billion users without breaking the bank.
AI Gets Real Here. This isn't theory. Erran details the exact architecture, the "Multi-Teacher" distillation process, and the organizational shift that forced Product Managers to write evals instead of specs.
In this episode, we cover:
The Distillation Pipeline: How to train a 7B "Teacher" and distill it to a 1.7B intermediate and 0.6B "Student" for production.
Synthetic Data Strategy: Using GPT-4 to generate the "Golden Dataset" for training.
Multi-Teacher Architecture: Why they separated "Product Policy" and "Click Prediction" into different teacher models to solve alignment issues.
10x Efficiency Hacks: Specific techniques (Pruning, Quantization, Context Compression) that slashed latency.
Org Design: Why the "Eval First" culture is the new requirement for AI engineering teams.
🚀 CHAPTERS
00:00 - Intro: LinkedIn's Massive "Small Model" Feat
04:00 - Why Commercial Models Failed at LinkedIn Scale
08:00 - The "Product Policy" Funnel & Synthetic Data Generation
12:00 - The Pipeline: 7B → 1.7B → 600M Parameters
19:00 - The "Multi-Teacher" Breakthrough (Relevance vs. Clicks)
23:00 - How They Achieved 10x Latency Reduction (Pruning/Compression)
31:00 - Changing the Culture: Why PMs Must Write Evals
35:00 - The "Bright Green Matrix": Measuring Success & Future Roadmap
Presented by Outshift by Cisco
Outshift is Cisco’s emerging tech incubation engine and driver of Agentic AI, quantum, and next-gen infrastructure. Learn more at outshift.cisco.com.
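For readers unfamiliar with teacher-student distillation, the sketch below shows the standard soft-target loss (temperature-scaled softmax plus KL divergence) extended to more than one teacher. The description above says LinkedIn separated "Product Policy" and "Click Prediction" teachers but does not say how their signals are combined, so the weighted sum here is an assumption, and the function names and example logits are hypothetical.

```python
import math


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: list[float], q: list[float]) -> float:
    """KL(p || q) for two discrete distributions over the same classes."""
    # q is clamped away from zero so an underflowed probability cannot divide by zero.
    return sum(
        pi * math.log(pi / max(qi, 1e-12)) for pi, qi in zip(p, q) if pi > 0.0
    )


def multi_teacher_distillation_loss(
    student_logits: list[float],
    teacher_logits: list[list[float]],
    teacher_weights: list[float],
    temperature: float = 2.0,
) -> float:
    """Weighted sum of KL(teacher || student) over several teachers."""
    student_probs = softmax(student_logits, temperature)
    loss = 0.0
    for weight, logits in zip(teacher_weights, teacher_logits):
        teacher_probs = softmax(logits, temperature)
        loss += weight * kl_divergence(teacher_probs, student_probs)
    # Multiplying by T^2 is the usual scaling for soft-target losses.
    return loss * temperature ** 2


if __name__ == "__main__":
    policy_teacher = [2.0, 0.5, -1.0]   # hypothetical "product policy" teacher
    click_teacher = [1.0, 1.5, -0.5]    # hypothetical "click prediction" teacher
    student = [1.2, 0.9, -0.8]
    loss = multi_teacher_distillation_loss(
        student, [policy_teacher, click_teacher], teacher_weights=[0.5, 0.5]
    )
    print(f"distillation loss: {loss:.4f}")
```

In a staged pipeline such as 7B to 1.7B to 0.6B, a loss of this kind would typically be applied once per stage, with the previous stage's model acting as the teacher for the next.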
The "TAM" for AI Agents isn't software. And there is a $10 Trillion opportunity. In this episode, Replit CEO Amjad Masad reveals why 99% of today's enterprise AI agents are just "Slop": unreliable, generic toys that fail in production. We dive deep into the engineering reality of building autonomous agents that actually work, moving beyond simple chatbots to systems that can navigate the messy reality of enterprise infrastructure. Amjad breaks down Replit’s "Computer Use" hack that makes agents 10x cheaper than generic models, explains why "Vibe Coding" is the future of the C-Suite, and issues a warning to technical leaders: If you want to ship fast in the AI era, you need to kill your product roadmap.
In this episode, we cover:
The "Slop" Problem: Why most LLM outputs are generic and how to inject "taste" back into software.
The Computer Use "Hack": How Replit built a programmatic verifier loop that outperforms vision-based models.
Vibe Coding: Why non-technical domain experts (HR, Sales, Marketing) will build the next generation of enterprise software.
The $10T Market: Why the Junior Developer role is disappearing and being replaced by the "Manager of Agents."
🚀 CHAPTERS
0:00 - Intro: Why most AI Agents are "Toys"
03:02 - The only 2 AI use cases making money right now
06:00 - The "Crappy Product" Strategy (Shipping fast)
10:00 - What is "AI Slop"? (And how to fix it)
14:30 - The "Deleted Database" Incident: Solving Reliability
18:00 - The "Squishy" Divide: Why Marketing Agents fail
21:45 - Vibe Coding in the Enterprise
26:00 - Model Wars: Claude Opus vs. Gemini vs. OpenAI
28:10 - The "Computer Use" Hack (10x Cheaper, 3x Faster)
36:00 - Why Product Roadmaps are Dead
43:00 - Replit is the #1 Software Vendor (Ramp Data)
49:00 - The Unit Economics of Agents (Token Costs vs. Value)
53:00 - Open Source vs. Closed: The "Cathedral of Bazaars"
59:00 - The $10 Trillion Opportunity: Replacing Labor
🔗 CONNECT WITH US
Subscribe to our Newsletters for technical breakdowns: https://venturebeat.com/newsletters
Visit VentureBeat: Venturebeat.com
#AgenticAI #Replit #VibeCoding #EnterpriseAI #LLM #SoftwareEngineering #FutureOfWork #AmjadMasad #ArtificialIntelligence #DevOps
Subscribe to VentureBeat: @VentureBeat
Subscribe to the full podcast here:
Apple: https://podcasts.apple.com/us/podcast/venturebeat/id1839285239
Spotify: https://open.spotify.com/show/4Zti73yb4hmiTNa7pEYls4
YouTube: https://www.youtube.com/VentureBeat
Learn more about your ad choic
Inside the 'Agent Economy': How 30,000 AI Assistants Took Over JPMorgan
While most enterprises were scrambling after ChatGPT launched, JPMorgan Chase was already two years ahead. 🚀 In this episode of Beyond the Pilot, we sit down with Derek Waldron, Chief Analytics Officer at JPMorgan Chase, to reveal how the world’s largest bank built an internal AI platform that is now used by 1 in 2 employees daily. Derek shares the contrarian insight that drove their strategy: AI models are commodities; the real moat is connectivity. Learn how they scaled from zero to 250,000+ users, why they empowered employees to build 30,000+ of their own "Personal Agents," and how they are solving the data privacy challenge at an enterprise scale.
🔥 IN THIS EPISODE:
The "Super Intelligence" Thought Experiment: Why raw intelligence is useless without enterprise connectivity.
The Agent Economy: How JPM enabled non-technical staff to build 30,000 custom AI assistants.
The Adoption Playbook: How to break through the "30% wall" and get the majority of your workforce using AI.
Build vs. Buy: Why JPM built their own "LLM Suite" instead of waiting for vendors.
⏳ CHAPTERS:
00:00 - Introduction: The JPMorgan AI Story
01:45 - The 3 Core Principles Behind JPM’s Strategy
03:25 - The "Super Intelligence" Thought Experiment
05:00 - Data Privacy: Why JPM Doesn't Train Public Models
06:00 - Viral Adoption: From 0 to 250k Users
09:20 - Evolution of LLM Suite: From RAG to Ecosystem
14:00 - The "Moat" is Connectivity, Not the Model
23:00 - The Agent Economy: 30,000 Employee-Built Assistants
31:00 - Governance & Guardrails for AI Agents
33:00 - Crossing the Chasm: Getting to 60% Adoption
40:00 - The "Product" Mindset: Solving Business Problems First
42:30 - The Future: End-to-End Process Transformation
46:25 - The "Unsolved" Problem Derek Wants to Fix
🙏 SPECIAL THANKS TO OUR SPONSOR:
This episode is presented by Outshift by Cisco. Learn more about their work on the Internet of Agents and the open-source Linux Foundation project: 🔗 https://www.agentcy.org
🎙️ GUEST: Derek Waldron | Chief Analytics Officer, JPMorgan Chase
HOSTS: Matt Marshall | VentureBeat, Sam Witteveen | VentureBeat
#EnterpriseAI #JPMorgan #GenerativeAI #AgenticAI #FinTech #ArtificialIntelligence #Innovation #BeyondThePilot
Subscribe to VentureBeat: @VentureBeat
Subscribe to the full podcast here:
Apple: https://podcasts.apple.com/us/podcast/venturebeat/id1839285239
We built AI agents by accident... and it worked. 🤯 In this episode of VentureBeat’s Beyond the Pilot, we go inside the engineering brain of Booking.com with Pranav Pathak (Director of Product Machine Learning). Pranav reveals how they "stumbled" into agentic architectures before the term even existed, how a simple text box revealed a massive missed revenue opportunity (the "Hot Tub" story), and exactly how they stack LLMs, RAG, and Orchestrators to handle millions of travelers without breaking the bank. If you are building Enterprise AI, this is the blueprint for moving from "cool demo" to production scale. 🚀 In this episode, we cover: The "Hot Tub" Revelation: How free-text AI search exposed features customers desperately wanted but couldn't find. Real ROI Metrics: How LLMs drove a 2x increase in topic detection accuracy and freed up 1.5x of agent bandwidth. The Booking.com AI Stack: A full breakdown of their Orchestrator → Moderation → Agent → RAG architecture. Latency vs. Intelligence: Why they don't use GPT-5 for everything and how they decide between small models and big brains. The Memory Problem: How to build AI that remembers user preferences without being "creepy".
00:00 Introduction to Agentic Architectures 00:30 Meet Pranav Pathak from Booking.com 01:24 Evolution of Travel Recommendations 03:41 Impact of Gen AI on Customer Service 07:29 Building an Effective AI Stack 10:32 Agentic Systems and Best Practices 13:45 Choosing Between Building and Buying AI Solutions 18:51 Evaluating AI Models for Business Use 24:10 Challenges in Human Evaluation 25:06 Recommendation System and Data Utilization 27:26 Innovations in Travel Search 29:04 Journey and Challenges in ML Integration 32:08 Managing Memory and User Data 38:07 Future of Travel Assistance 41:33 Advice for New AI Integrations 43:57 Final Thoughts and Farewell 🔗 LINKS & RESOURCES: Outshift by Cisco (Sponsor): outshift.cisco.com VentureBeat: www.venturebeat.com #ArtificialIntelligence #GenAI #Bookingcom #MachineLearning #AgenticAI #LLM #TechPodcast #EnterpriseAI Subscribe to VentureBeat: @VentureBeat Subscribe to the full podcast here: Apple: https://podcasts.apple.com/us/podcast/venturebeat/id1839285239 Spotify: https://open.spotify.com/show/4Zti73yb4hmiTNa7pEYls4 YouTube: https://www.youtube.com/VentureBeat Learn more about your ad choices. Visit megaphone.fm/adchoi
In our inaugural episode, we sit down with Ryan Nystrom, leader of the AI team at Notion, to pull back the curtain on Notion 3.0. Ryan reveals the journey of integrating powerful AI agents into the productivity platform and draws fascinating parallels between the current AI era and the mobile revolution he witnessed at Instagram. He shares exclusive insights into the development challenges, the critical role of tools, context, and curation, and how custom agents are poised to reshape work. Plus, Ryan offers essential advice for any company diving into the AI space. Learn more about your ad choices. Visit megaphone.fm/adchoices
Lift-and-shift isn’t enough. MongoDB’s Vinod Bagal breaks down how to modernize your data for AI, and why waiting could cost you your competitive edge. Host: Sean Michael Kerner. Guest: Vinod Bagal. For more stories, visit venturebeat.com. Learn more about your ad choices. Visit megaphone.fm/adchoices
AI is accelerating the cyber arms race — and former FBI agents Paul Bingham and Mike Morris say most enterprises aren’t ready. In this VB in Conversation, they break down the real-world threats targeting critical infrastructure, how AI is changing the attack surface, and why smart, layered defense starts with training the next-gen cyber workforce. Learn more about your ad choices. Visit megaphone.fm/adchoices
Key Insights and Takeaways from VentureBeat Transform Event: AI and Enterprise Innovation In this episode, Matt and Sam recount their experiences and insights from the recent VentureBeat Transform event, an annual gathering focused on enterprise AI. They discuss the significant takeaways, including the increasing adoption of AI agents in production, the lack of dominance by any single hyperscaler in the AI model space, the focus on practical AI solutions over super-intelligence hype, and the evolving structure of teams in the AI-driven workplace. Highlights include insights from speakers at major companies like American Express, Google, IBM, and Zoom, as well as discussions on AI safety and the changing management dynamics with AI agents. Tune in to get a comprehensive overview of the current state and future of AI in enterprise settings. 00:00 Introduction and Event Overview 00:51 Key Takeaways from VentureBeat Transform 02:46 AI Deployment in Enterprises 04:37 Insights from Industry Leaders 08:48 Hyperscalers and Model Dominance 12:12 Real-World AI Applications 14:04 Focus on Practical AI Solutions 24:08 The Future of AI Teams 30:37 Conclusion and Final Thoughts Learn more about your ad choices. Visit megaphone.fm/adchoices
Visa’s SVP of Data & AI, Sam Hamilton, joins VentureBeat to break down the hidden costs, trade-offs, and infrastructure realities behind running over 400 AI solutions incorporating 300 AI models at global scale. Learn more about your ad choices. Visit megaphone.fm/adchoices
Google Cloud Next: In-Depth Discussion with Chief Evangelist Richard Seroter Join us for an exclusive interview with Richard Seroter, Chief Evangelist of Google Cloud, as he discusses the latest developments and insights from Google Cloud Next. Dive into conversations about AI advancements, the new agent development kit, and the multi-agent protocol, and how they are reshaping the future of cloud services and enterprise solutions. Learn about the balance between pre-built and custom agents, and Google's commitment to open-source and multi-cloud flexibility. Don't miss out on this insider look at the cutting edge of AI and cloud technology. 00:00 Introduction and Recap of Google Cloud Next 02:17 Interview with Richard Seroter Begins 02:51 Discussing the Developer Keynote and Agent Technology 03:31 The Evolution and Readiness of AI Agents 05:19 Google's Approach to AI and Agent Development 07:20 Comparing Google with Competitors in AI 09:02 Agent Development Kit and Industry Adoption 10:51 The Future of Multi-Agent Systems 16:04 Google's Open Source Strategy and Cloud Integration 21:11 Exploring Google's Interest in Agent Technology 21:45 The Future of Agent Marketplaces 22:27 Google's Role in Payment Processing for Agents 23:17 Community Adoption of Agent Protocols 26:20 Enterprise Applications of Agents 28:43 The Evolution of Agent Space 33:48 The Rise of Personal Agents 36:04 Balancing Innovation Across Google Cloud and Labs 38:00 The Impact of Pre-Built Agents 40:18 Conclusion and Final Thoughts Learn more about your ad choices. Visit megaphone.fm/adchoices
Cisco’s Anurag Dhingra joins VentureBeat’s Matt Marshall to unpack what it means to build a truly AI-ready network. From smart switching to agentic operations, Dhingra explains how enterprises must stay ahead of surging network demands — and why network intelligence is now foundational to scaling AI. Learn more about your ad choices. Visit megaphone.fm/adchoices
Exploring Napkin.ai: Revolutionizing Graphic Design with AI Agents Join Matt Marshall, founder and CEO of VentureBeat, and Sam Witteveen as they interview Pramod Sharma, CEO of Napkin.ai, and co-founder Jerome Scholler, about their innovative AI-powered graphic design tool. Discover how Napkin.ai has rapidly grown to 2 million beta users with its unique ability to transform text into compelling graphics effortlessly. Learn about the sophisticated backend structure involving multiple specialized AI agents and the recently introduced custom styles feature that allows users and companies to define and perfect their graphic outputs. Perfect for anyone interested in the future of AI in graphic design. 00:00 Introduction to Napkin AI 00:24 Interview with Pramod Sharma 00:44 How Napkin AI Works 00:59 The AI Agency Model 01:56 Deep Dive into Napkin AI's Features 04:34 User Experience and Customization 08:00 Future Plans and Innovations 15:11 Technical Insights and Agent Structure 28:09 Conclusion and Final Thoughts Learn more about your ad choices. Visit megaphone.fm/adchoices
Part 1: Inside the Cybersecurity-First AI Model LLMs are inherently non-deterministic — but more data isn’t always better. As Cisco’s Jeetu Patel explains, Cisco Foundation AI distilled 900 billion tokens down to the most relevant 5 billion to create the industry’s first AI model purpose-built for security. Part 2: AI Security: Built-in, Always-On Cisco’s Tom Gillis and Splunk’s Mike Horn reveal how distributed enforcement, self-upgrading firewalls, and AI-powered infrastructure are redefining security architecture and transforming what a SOC can do in the AI era. Learn more about your ad choices. Visit megaphone.fm/adchoices
Originally published in September 2024. As orgs race to progress from proof-of-concept to full, scalable production in AI, there are many lessons and resets. Key to these are choices made around infrastructure. Intuit’s Credit Karma, with over 100 million users, has been one of AI’s success stories. Vishnu Ram, Credit Karma’s VP of Engineering, spoke with VB about those lessons – and how the learning will continue. Learn more about your ad choices. Visit megaphone.fm/adchoices
NFL CISO Tomás Maldonado speaks with VentureBeat about defending Super Bowl LIX from adversarial attacks that potentially include weaponized AI, endpoint attacks, deepfakes, and finely tuned social engineering – and require collaboration with the FBI and Secret Service. Learn more about your ad choices. Visit megaphone.fm/adchoices
Originally published in October 2023. VentureBeat Editor-in-Chief Matt Marshall talks with Mazen Rawashdeh, SVP & CTO of eBay, about how one of the world’s largest online marketplaces is using innovative technologies to power its data centers and reduce its environmental impact while meeting the growing demand for data processing in the age of generative AI. Rawashdeh explores the importance of fungibility between CPUs and GPUs, hybrid cloud, open source, running their data centers at 85% utilization and more. Learn more about your ad choices. Visit megaphone.fm/adchoices
Originally published in October 2023. VentureBeat Editor-in-Chief Matt Marshall talks with Archana Deskus, EVP & CIO of PayPal, on how the payment giant, with more than 400 million users, is meeting the need for more computation, more power and more storage, particularly during peak periods, all while committing to sustainability and greater efficiency. Deskus dives into bursting capacity, data center consolidation, asset utilization, efficient code and more. Learn more about your ad choices. Visit megaphone.fm/adchoices
Originally Released in February 2024. In his role as Global Leader for Tech and Digital Advantage at the Boston Consulting Group, Vlad Lukic has overseen hundreds of AI deployments. Now working with scores of enterprise companies racing to embrace gen AI, he finds that critical considerations are often overlooked, such as readiness of data and pivotal financial factors. Missteps such as deploying AI where it’s not really needed and neglecting the essential task of managing organizational change for seamless AI adoption can also tank the best intentions to capitalize on gen AI. He also has lots of advice on how to do it right. Learn more about your ad choices. Visit megaphone.fm/adchoices
Originally published in July 2023. Kamran Ziaee, SVP, Technology Strategy & Global Infrastructure at Verizon, sits down with VB Editor-in-Chief Matt Marshall to discuss how data centers now must handle greater and greater demands. While high-performance computing used to be reserved for exceptionally data-intensive applications, like gene sequencing or self-driving cars, every industry is seeing real use cases for adopting HPC. Verizon is an example of a company that says HPC is now table stakes for many applications it runs in its data centers. While Verizon has consolidated down to three data centers from nine, it has also invested in multi-access edge computing (MEC) to ensure rapid response time for customers. We explore his unique approach to an optimized hybrid cloud, where many critical applications stay on-prem and highly elastic applications migrate to the public cloud to enjoy performance gains there. We touch on the Gen AI craze and the proofs of concept the company is running, which also point to a hybrid future. Learn more about your ad choices. Visit megaphone.fm/adchoices
Originally Published in July 2023. FedEx handles 100 billion daily transactions, from tracking packages to routing flights. To keep up with this massive data demand, the company has invested in data center performance, especially in high-performance computing. Ken Spangler, EVP of IT and CIO of Global Operations Technology, FedEx, talks with VB Editor-in-Chief Matt Marshall about how the company has adopted the co-location model for the data center, which allows it to leverage the benefits of cloud computing while maintaining control of the technology stack. He also reveals how FedEx is partnering with edge computing providers to deploy “mini data centers” that can reduce latency and enhance security for its customers, and of course covers AI and automation. Learn more about your ad choices. Visit megaphone.fm/adchoices
Originally Released in October 2023. How do companies identify the best use cases for gen AI, and then navigate the complexity of AI-first products and tools to drive that innovation at scale? VB’s editorial director, Michael Nuñez, speaks with Databricks co-founder Reynold Xin and Vijoy Pandey, SVP at Outshift by Cisco, about the many factors that will determine how organizations succeed, or don’t, during a time of, as Xin says, bottom-up innovation. Learn more about your ad choices. Visit megaphone.fm/adchoices