About this episode
Welcome to episode 333 of The Cloud Pod, where the forecast is always cloudy! Justin, Ryan, and Matt are taking a quick break from re:Invent festivities to bring you the latest and greatest in Cloud and AI news. This week, we discuss NORAD and Anthropic teaming up to bring you Christmas cheer. Wait, is that right? Huh. We also have undersea cables, some Turkish region delight, and a LOT of Opus 4.5 news. Let's get into it!

Titles we almost went with this week

- Boring Error Pages Not Found
- Claude Goes Native in Snowflake: Finally, AI That Stays Where Your Data Lives
- Cross-Cloud Romance: AWS and Google Make It Official with Interconnect
- Google Gemini Puts OpenAI in Code Red: The Tables Have Turned
- Azure NAT Gateway V2: Now With More Zones Than a Parking Lot
- From ChatGPT to Chat-Uh-Oh: OpenAI Sounds the Alarm as Gemini Steals 200 Million Users
- Azure Scheduled Actions: Because Your VMs Need a Work-Life Balance Too
- Finally, Your 500 Errors Can Look as Good as Your Homepage
- Foundry Model Router: Because Choosing Between 47 AI Models is Nobody's Idea of Fun
- Google Takes the Scenic Route: New Cable Avoids the Sunda Strait Traffic Jam
- Azure Application Gateway Gets Its TCP/IP Diploma
- Google Cloud Gets Its Türkiye Dinner: 2 Billion Dollar Cloud Feast Coming Soon
- Microsoft Foundry: Turning AI Chaos into Compliance Gold

AI Is Going Great, or How ML Makes Money

02:59 Nano Banana Pro available for enterprise

Google launches Nano Banana Pro (Gemini 3 Pro Image) in general availability on Vertex AI and Google Workspace, with Gemini Enterprise support coming soon. The model supports up to 14 reference images for style consistency and generates 4K resolution outputs with multilingual text rendering capabilities. It includes Google Search grounding for factual accuracy in generated infographics and diagrams, plus built-in SynthID watermarking for transparency. Copyright indemnification will be available at general availability under Google's shared responsibility framework.

Enterprise integrations are live with Adobe Firefly, Photoshop, Canva, and Figma, enabling production-grade creative workflows. Major retailers, including Klarna, Shopify, and Wayfair, report using the model for product visualization and marketing asset generation at scale.

Developers can access Nano Banana Pro through Vertex AI with Provisioned Throughput and Pay As You Go pricing options, plus advanced safety filters. Business users get access through Google Workspace apps, including Slides, Vids, and NotebookLM, starting today.

The model handles complex editing tasks like translating text within images while preserving visual elements, and maintains character and brand consistency across multiple generated assets. This addresses a key enterprise challenge: maintaining creative control when using AI for production assets.

03:59 Justin – "The thing that's the most important about this is when Nano Banana messes up the text (which it doesn't do as often), you can now edit it without generating a whole completely different image."

05:58 Introducing Claude Opus 4.5

Claude Opus 4.5 is now generally available across Anthropic's API, apps, and all three major cloud platforms at $5 per million input tokens and $25 per million output tokens. This represents a substantial price reduction that makes Opus-level capabilities more accessible. Developers can access it via the claude-opus-4-5-20251101 model identifier.
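For the API-curious, here is a minimal sketch of calling the new model through the Anthropic Python SDK. Only the model identifier comes from the announcement; the prompt and token limit are placeholders.

```python
# Minimal sketch, not official sample code: calling Claude Opus 4.5 with the
# Anthropic Python SDK (pip install anthropic). Assumes ANTHROPIC_API_KEY is
# set in the environment; prompt and max_tokens are placeholders.
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-5-20251101",  # identifier from the announcement
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Refactor this recursive parser to be iterative."},
    ],
)
print(message.content[0].text)
```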
The model achieves state-of-the-art performance on software engineering benchmarks, and scored higher than any human candidate on Anthropic's internal performance engineering exam within its 2-hour time limit. On SWE-bench Verified, it matches Sonnet 4.5's best score while using 76% fewer output tokens at medium effort, and exceeds it by 4.3 percentage points at highest effort while still using 48% fewer tokens.

Anthropic introduces a new effort parameter in the API that lets developers control the tradeoff between speed and capability, allowing optimization for either minimal time and cost or maximum performance depending on the task requirements. This combines with new context management and memory capabilities to boost performance on agentic tasks by nearly 15 percentage points in testing.

Claude Code gains Plan Mode, which builds a user-editable plan.md file before execution, and is now available in the desktop app for running multiple parallel sessions. The consumer apps remove message limits for Opus 4.5 through automatic context summarization, and Claude for Chrome and Claude for Excel expand to all Max, Team, and Enterprise users.

The model demonstrates improved robustness against prompt injection attacks compared to other frontier models and is described as the most robustly aligned model Anthropic has released. It shows better performance across vision, reasoning, and mathematics tasks while using dramatically fewer tokens than its predecessors to reach similar or better outcomes.

08:01 Justin – "The most important part of the whole announcement is the cheaper context input and output tokens."

09:58 Announcing Claude Opus 4.5 on Snowflake Cortex AI

Snowflake Cortex AI now offers Claude Opus 4.5 and Claude Sonnet 4.5 in general availability, bringing Anthropic's latest models directly into Snowflake's data platform. Users can access these models through SQL, Python, or REST APIs without moving data outside their Snowflake environment.

Claude Opus 4.5 delivers improved performance on complex reasoning tasks, coding, and multilingual capabilities compared to previous versions, while Claude Sonnet 4.5 provides a balanced option for speed and intelligence. Both models support 200K token context windows and can process text and images natively within Snowflake queries.

The integration enables enterprises to build AI applications using their Snowflake data with built-in governance and security controls, eliminating the need to export sensitive data to external AI services. Pricing follows Snowflake's credit-based model, with costs varying by model and token usage.

Developers can combine Claude models with other Cortex AI features like vector search, document understanding, and fine-tuning capabilities to create end-to-end AI workflows. This allows for use cases ranging from customer service automation to financial analysis and code generation, all within the Snowflake ecosystem.
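To make the "data never leaves Snowflake" point concrete, a Cortex call from Python might look something like the sketch below. The model name string and all connection values are assumptions, not taken from the announcement.

```python
# Hedged sketch: invoking Claude Opus 4.5 through Snowflake Cortex AI using
# snowflake-connector-python. The 'claude-opus-4-5' model string and the
# connection values are placeholders/assumptions; check Snowflake's docs.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",
    user="my_user",
    password="...",
    warehouse="my_wh",
    database="my_db",
    schema="PUBLIC",
)
cur = conn.cursor()
# CORTEX.COMPLETE runs the model inside Snowflake, next to the data;
# nothing is exported to an external AI service.
cur.execute(
    "SELECT SNOWFLAKE.CORTEX.COMPLETE("
    "'claude-opus-4-5', "
    "'Summarize the top three churn drivers in CUSTOMER_FEEDBACK.')"
)
print(cur.fetchone()[0])
```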
11:03 OpenAI CEO declares "code red" as Gemini gains 200 million users in 3 months

Oh, how the turntables have turned…

OpenAI CEO Sam Altman issued an internal code red memo to refocus the company on improving ChatGPT after Google's Gemini 3 model topped the LMArena leaderboard and gained 200 million users in three months. The directive delays planned features, including advertising integration, AI agents for health and shopping, and the Pulse personal assistant feature.

Google's Gemini 3 model, released in mid-November, has outperformed ChatGPT on industry benchmark tests and attracted high-profile users like Salesforce CEO Marc Benioff, who publicly announced switching from ChatGPT after three years. The model's performance represents a significant shift in the competitive landscape since OpenAI's initial ChatGPT launch in December 2022.

The situation mirrors December 2022, when Google declared its own code red after ChatGPT's rapid adoption, with CEO Sundar Pichai reassigning teams to develop competing AI products. This role reversal demonstrates how quickly competitive positions can shift in the AI model space, particularly around user experience and benchmark performance.

OpenAI is implementing daily calls for teams responsible for ChatGPT improvements and encouraging temporary team transfers to address the competitive pressure. The company's response indicates that maintaining market leadership in conversational AI requires continuous iteration, even for established products with large user bases.

13:11 Ryan – "I started on ChatGPT and tried to use it after adopting Claude, and I try to go back every once in a while – especially when they would announce a new model, but I always end up going back to one of the Anthropic models."

GCP

15:19 New Google Cloud region coming to Türkiye

Google Cloud is launching a new region in Türkiye as part of a $2 billion investment over 10 years, partnering with local telecom provider Turkcell, which will invest an additional $1 billion in data centers and cloud infrastructure. This brings Google Cloud's global footprint to 43 regions and 127 zones, with Türkiye serving as a strategic hub for EMEA customers.

The region targets three key verticals already committed as customers: financial services, with Garanti BBVA and Yapi Kredi Bank modernizing core banking systems; airlines, with Turkish Airlines improving flight operations and passenger systems; and government entities focused on digital sovereignty. The local presence addresses data residency requirements and provides low-latency access for organizations that need to keep data within national borders.

Technical capabilities include standard Google Cloud services for data analytics, AI, and cybersecurity, with data encryption at rest and in transit, granular access controls, and threat detection systems meeting international security standards. The region will serve both Türkiye and neighboring countries with reduced latency compared to existing European regions.

The announcement emphasizes digital sovereignty as a primary driver, with government officials highlighting the importance of local infrastructure for maintaining control over national data while accessing hyperscale cloud capabilities. This follows a pattern of Google Cloud expanding into regions where data localization requirements create demand for in-country infrastructure.

No specific pricing details were provided for the Türkiye region, though standard Google Cloud pricing models based on compute, storage, and network usage will apply once the region launches. The timeline for when the region will be operational was not disclosed in the announcement.

Show note editor Heather's note: If you enjoy history, you need to travel to Türkiye immediately!

17:03 Introducing BigQuery Agent Analytics

Google launches BigQuery Agent Analytics, a new plugin for their Agent Development Kit that streams AI agent interaction data directly to BigQuery with a single line of code.
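As a sketch of what that single line might look like inside an ADK agent, see below; the plugin's import path, class name, and constructor arguments are all assumptions based on the announcement, and the documented API at google.github.io/adk-docs may differ.

```python
# Hedged sketch only: the plugin import path, class name, and arguments are
# assumptions, not the documented API (see google.github.io/adk-docs).
from google.adk.agents import Agent
from google.adk.runners import InMemoryRunner
# Assumed location and name of the Agent Analytics plugin:
from google.adk.plugins.bigquery_agent_analytics import BigQueryAgentAnalyticsPlugin

agent = Agent(
    name="support_agent",
    model="gemini-2.0-flash",
    instruction="Answer customer support questions.",
)

# The advertised "single line": attach the plugin so latency, token usage,
# tool calls, and user interactions stream to BigQuery via the Storage
# Write API (the dataset/table follow the predefined schema).
runner = InMemoryRunner(
    agent=agent,
    plugins=[BigQueryAgentAnalyticsPlugin(project="my-project", dataset="agent_logs")],
)
```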
The plugin captures metrics like latency, token consumption, tool usage, and user interactions in real time using the BigQuery Storage Write API, enabling developers to analyze agent performance and optimize costs without complex instrumentation.

The integration allows developers to leverage BigQuery's advanced capabilities, including generative AI functions, vector search, and embedding generation, to perform sophisticated analysis on agent conversations. Teams can cluster similar interactions, identify failure patterns, and join agent data with business metrics like CSAT scores to measure real-world impact, going beyond basic operational metrics to quality analysis.

The plugin includes three core components: an ADK plugin that requires minimal code changes, a predefined optimized BigQuery schema for storing interaction data, and low-cost streaming via the BigQuery Storage Write API. Developers maintain full control over what data gets streamed and can customize pre-processing, such as redacting sensitive information before logging.

Currently available in preview for ADK users, with support for other agent frameworks like LangGraph coming soon. The feature addresses a critical gap in agentic AI development, where understanding user interaction patterns and agent performance is essential for refinement, particularly as organizations move from building agents to optimizing them at scale.

Pricing follows standard BigQuery costs for storage and queries, with the Storage Write API offering cost-effective real-time streaming compared to traditional batch loading methods. Documentation and a hands-on codelab are available at google.github.io/adk-docs for developers ready to implement agent analytics.

18:16 Ryan – "This is an interesting model; providing both the schema and the already instrumented integration. I feel like a lot of times with other types of development, you're left to your own devices, and so this is a neat thing. As you're developing an agent, everyone is instrumenting these things in odd ways, and it's very difficult to compile the data in a way where you get usable queries out of it. So it's kind of an interesting concept."

19:35 TalayLink subsea cable to connect Australia and Thailand

You know how much we love a good undersea cable…

Google announces TalayLink, a new subsea cable connecting Australia and Thailand via the Indian Ocean, taking a western route around the Sunda Strait to avoid congestion from existing cable paths. This cable extends the Interlink system from the Australia Connect initiative and will directly connect to Google's planned Thailand cloud region and data centers.

The project includes two new connectivity hubs, in Mandurah, Western Australia, and in South Thailand, providing diverse landing points away from existing cable concentrations in Perth and enabling cable switching, content caching, and colocation capabilities. Google is partnering with AIS for the South Thailand hub to leverage existing infrastructure.

TalayLink forms part of a broader Indian Ocean connectivity strategy, linking with previously announced hubs in the Maldives and Christmas Island to create redundant paths connecting Australia, Southeast Asia, Africa, and the Middle East. This routing diversity aims to improve network resilience across multiple regions.

The infrastructure supports Thailand's digital economy transformation goals and Western Australia's digital future roadmap, with the Thailand Board of Investment actively backing the project.
No pricing or specific completion timeline was disclosed in the announcement.

20:34 Matt – "It's amazing…subsea cable congestion. How many cables can be there that there's congestion?"

23:16 Claude Opus 4.5 on Vertex AI

Claude Opus 4.5 is now generally available on Vertex AI, delivering Anthropic's most advanced model at one-third the cost of its predecessor, Opus 4.1. The model excels in coding tasks that can compress multi-day development projects into hours, agentic workflows with dynamic tool discovery from hundreds of tools without context window bloat, and office productivity tasks with improved memory for maintaining consistency across documents.

Google is positioning Vertex AI as a unified platform for deploying Claude with enterprise features, including global endpoints for reduced latency, provisioned throughput for dedicated capacity at fixed costs, and prompt caching with a flexible Time To Live of up to one hour. The platform integrates with Google's Agent Builder stack, including the open Agent Development Kit, the Agent2Agent protocol, and the fully managed Agent Engine for moving multi-step workflows from prototype to production.

Security and governance capabilities include Google Cloud's foundational security controls, data residency options, and Model Armor protection against AI-specific threats like prompt injection and tool poisoning through Security Command Center. Customers like Palo Alto Networks report 20-30 percent increases in code development velocity when using Claude on Vertex AI.

The model supports a 1 million token context window, batch predictions for cost efficiency, and web search capabilities in preview. Regional availability and specific pricing details are available in the Vertex AI documentation, with the model accessible through both the Model Garden and Google Cloud Marketplace.
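Here's a minimal sketch of calling Opus 4.5 on Vertex AI through Anthropic's SDK; the region and the @-dated Vertex model string follow Anthropic's usual convention but are assumptions here.

```python
# Hedged sketch: Claude Opus 4.5 on Vertex AI via Anthropic's SDK
# (pip install "anthropic[vertex]"); authenticates with your gcloud
# application-default credentials. Region and model string are assumptions.
from anthropic import AnthropicVertex

client = AnthropicVertex(project_id="my-gcp-project", region="us-east5")

message = client.messages.create(
    model="claude-opus-4-5@20251101",  # assumed Vertex-style identifier
    max_tokens=1024,
    messages=[{"role": "user", "content": "Draft a zone-failover runbook outline."}],
)
print(message.content[0].text)
```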
23:58 Registration is live for Google Cloud Next 2026 in Las Vegas

Google Cloud Next 2026 takes place April 22-24 in Las Vegas, with registration now open at an early bird price of $999 for a limited time. This represents the standard pricing structure for Google's flagship annual conference, following record-breaking attendance in 2025.

The conference focuses heavily on AI agent development and implementation, featuring interactive demos, hackathons, and workshops designed to help attendees build intelligent agents. Organizations can learn from real-world case studies of companies deploying AI solutions at scale.

Next 2026 offers hands-on technical training through deep-dive sessions, keynotes, and practical labs aimed at developers and technical practitioners. The format emphasizes actionable learning with direct access to Google engineers and product experts.

The event serves as a networking hub for cloud practitioners to connect with peers facing similar technical challenges and to provide feedback that influences Google Cloud's product roadmap. This direct line to product teams can be valuable for organizations planning their cloud strategy.

Ready to register? You can do that here.

27:19 VPC Flow Logs for Cross-Cloud Network

VPC Flow Logs now support Cloud VPN tunnels and VLAN attachments for Cloud Interconnect and Cross-Cloud Interconnect, extending visibility beyond traditional VPC subnet traffic to hybrid and multi-cloud connections. This addresses a critical gap for organizations running Cross-Cloud Network architectures, which previously lacked detailed telemetry on traffic flowing between Google Cloud, on-premises infrastructure, and other cloud providers.

The feature provides 5-tuple granularity logging (source and destination IP, source and destination port, and protocol), with new gateway annotations that identify traffic direction and context through reporter and gateway object fields. Flow Analyzer integration eliminates the need for complex SQL queries, offering built-in analysis capabilities including Gemini-powered natural language queries and in-context Connectivity Tests to correlate flow data with firewall policies and network configurations.

Primary use cases include identifying elephant flows that congest specific tunnels or attachments, auditing Shared VPC bandwidth consumption by service projects, and troubleshooting connectivity issues by verifying whether traffic reaches Google Cloud gateways. Organizations can also validate DSCP markings for application-aware Cloud Interconnect policy configurations, which is particularly valuable for enterprises with quality-of-service requirements.

The feature is available now for both new and existing deployments through the Console, CLI, API, and Terraform, with Flow Analyzer providing no-cost analysis of logs stored in Cloud Logging. This capability is particularly relevant for financial services, healthcare, and other enterprises with strict compliance requirements that need comprehensive audit trails of cross-cloud and hybrid network traffic.

28:37 Ryan – "The controls say that you have to have logging, not what the logging is – and so very frequently it is sort of 'turn it on and forget it'. I do think this is great, but they say the five-tuple granularity will help you measure congestion, and I don't see them actually producing any sort of bandwidth or request size metrics. So it is sort of an interesting thing, but it's at least better than the nothing that we had before. So I'll take it."

30:35 AWS and Google Cloud collaborate on multicloud networking

AWS and Google Cloud jointly engineered a multicloud networking solution that eliminates the need for manual physical infrastructure setup between their platforms. Customers can now provision dedicated bandwidth and establish connectivity in minutes instead of weeks, through either cloud console or API.

The solution uses AWS Interconnect multicloud and Google Cloud Cross-Cloud Interconnect with quad-redundancy across physically separate facilities and MACsec encryption between edge routers. Both providers published open API specifications on GitHub for other cloud providers to adopt the same standard.

Previously, connecting AWS and Google Cloud required customers to manually coordinate physical connections, equipment, and multiple teams over weeks or months. This new managed service abstracts away physical connectivity, network addressing, and routing policy complexity into a cloud-native experience.

Salesforce is using this capability to connect its Data 360 platform across clouds using pre-built capacity pools and familiar AWS tooling. The integration allows them to ground AI and analytics in trusted data regardless of which cloud it resides in.

The collaboration represents a shift toward cloud provider interoperability through open standards rather than proprietary solutions. The published specifications enable any cloud provider or partner to implement compatible multicloud connectivity using the same framework.
31:38 Justin – "I do want you guys to check the weather. Do you see pigs flying or anything crazy?"

Azure

33:17 Generally Available: TLS and TCP termination on Azure Application Gateway

Azure Application Gateway now supports TLS and TCP protocol termination in general availability, expanding beyond its traditional HTTP/HTTPS load balancing capabilities. This allows customers to use Application Gateway for non-web workloads like database connections, message queuing systems, and other TCP-based applications that previously required separate load balancing solutions.

The feature consolidates infrastructure by letting organizations use a single gateway service for both web and non-web traffic, reducing the need to deploy and manage multiple load balancers. This is particularly useful for enterprises running mixed workloads that include legacy applications, databases like SQL Server or PostgreSQL, and custom TCP services alongside modern web applications.

Application Gateway's existing features, like Web Application Firewall, autoscaling, and zone redundancy, now extend to TCP and TLS traffic, providing consistent security and availability across all application types. The pricing model follows Application Gateway's standard consumption-based structure, with charges for gateway hours and data processing, though specific costs for TCP/TLS termination were not detailed in the announcement.

Common use cases include load balancing for database clusters, securing MQTT or AMQP message broker connections, and providing SSL offloading for legacy applications that don't natively support modern TLS versions. This positions Application Gateway as a more versatile Layer 4-7 load balancing solution, competing with dedicated TCP load balancers and third-party appliances.

33:38 Justin – "Thank you for developing network load balancers."

34:48 Generally Available: Azure Application Gateway mTLS passthrough support

Want to make your life even more complicated? Well, it's GOOD NEWS! Azure Application Gateway now supports mutual TLS passthrough in general availability, allowing backend applications to validate client certificates and authorization headers directly while still benefiting from Web Application Firewall inspection. This addresses a specific compliance requirement where organizations need end-to-end certificate validation but cannot terminate TLS at the gateway layer.

The feature enables scenarios where backend services must verify client identity through certificates for regulatory compliance or zero-trust architectures, which is particularly relevant for financial services, healthcare, and government workloads. Previously, customers had to choose between WAF protection and backend certificate validation, creating security or compliance gaps.

Application Gateway continues to inspect traffic through WAF rules even as the mTLS connection passes through to the backend, maintaining protection against common web exploits and OWASP vulnerabilities. This dual-layer approach means organizations can enforce both perimeter security policies and application-level authentication without architectural compromises.

The capability is available across all Azure regions where the Application Gateway v2 SKU operates, with standard Application Gateway pricing applying based on capacity units consumed. There is no additional charge specifically for the mTLS passthrough feature itself, though backend certificate validation may increase processing overhead slightly.

36:30 Matt – "I did stunnel and MongoDB because it didn't support encryption for the longest time…that was a fun one."
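To ground what "the backend validates the client certificate" looks like from the caller's side, here is a generic (not Azure-specific) mTLS client sketch; the URL and certificate paths are placeholders.

```python
# Generic mTLS client illustration; the URL and file paths are placeholders.
# With mTLS passthrough, the client certificate presented here is validated
# by the backend application rather than being terminated at the gateway.
import requests

resp = requests.get(
    "https://api.example.com/orders",
    cert=("client.crt", "client.key"),  # client identity for mutual TLS
    verify="ca-bundle.pem",             # verify the server's chain end to end
)
print(resp.status_code)
```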
36:50 Public Preview: Azure API Management adds support for A2A Agent APIs

Azure API Management now supports Agent-to-Agent (A2A) APIs in public preview, allowing organizations to manage AI agent APIs alongside traditional REST APIs, AI model APIs, and Model Context Protocol tools within a single governance framework. This addresses the growing need to standardize how autonomous agents communicate and interact across enterprise systems.

The feature enables centralized management of agent interactions, which is particularly relevant as organizations deploy multiple AI agents that need to coordinate tasks and share information. API Management can now apply consistent security policies, rate limiting, and monitoring across all agent communications, reducing the operational complexity of multi-agent architectures.

This capability positions Azure API Management as a unified control plane for the full spectrum of API types emerging in AI-driven applications. Organizations already using API Management for traditional APIs can extend their existing governance practices to cover agent-based workflows without deploying separate infrastructure.

The preview is available in Azure regions where API Management is currently supported, though specific pricing for A2A API features has not been disclosed separately from standard API Management tiers. Organizations should evaluate this against their existing API Management costs, which start at approximately $50 per month for the Developer tier.

38:13 Introducing Claude Opus 4.5 in Microsoft Foundry

Claude Opus 4.5 is now available in public preview on Microsoft Foundry, GitHub Copilot paid plans, and Microsoft Copilot Studio, expanding Azure's frontier model portfolio following the Microsoft-Anthropic partnership announced at Ignite. The model achieves 80.9% on the SWE-bench software engineering benchmark and is priced at one-third the cost of previous Opus-class models, making advanced AI capabilities more accessible for enterprise workloads.

The model introduces three key developer features on Foundry: an Effort Parameter in beta that lets teams control computational allocation across thinking and tool calls, Compaction Control for managing context in long-running agentic tasks, and enhanced programmatic tool calling with dynamic tool discovery that doesn't consume context window space. These capabilities enable sophisticated multi-tool workflows across cybersecurity, financial modeling, and full-stack development.

Opus 4.5 serves as Anthropic's strongest vision model and delivers improved computer use performance for automating desktop tasks, particularly for creating spreadsheets, presentations, and documents with professional polish. The model maintains context across complex projects using memory features, making it suitable for precision-critical verticals like finance and legal, where consistency matters.

Microsoft Foundry's rapid integration strategy gives Azure customers immediate access to the latest frontier models while maintaining centralized governance, security, and observability at scale. This positions Azure as offering the widest selection of advanced AI models among cloud providers, with Opus 4.5 available now through the Foundry portal and coming soon to Visual Studio Code via the Foundry extension.

38:37 Justin – "Cool, it's in Foundry – hooray!"
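As a rough sketch, a Foundry-hosted Claude call through the azure-ai-inference package might look like the following; the endpoint shape and model name are assumptions, so copy the real values from your Foundry deployment.

```python
# Hedged sketch: Claude Opus 4.5 on Microsoft Foundry via azure-ai-inference
# (pip install azure-ai-inference). Endpoint URL and model/deployment name
# are assumptions; check the Foundry portal for your actual values.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://my-resource.services.ai.azure.com/models",  # placeholder
    credential=AzureKeyCredential("<api-key>"),
)

response = client.complete(
    model="claude-opus-4-5",  # assumed deployment name
    messages=[
        SystemMessage("You are a careful code reviewer."),
        UserMessage("Review this Terraform module for security issues."),
    ],
)
print(response.choices[0].message.content)
```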
40:21 Generally Available: DNS security policy Threat Intelligence feed

Azure DNS security policy now includes a managed Threat Intelligence feed that blocks queries to known malicious domains. This feature addresses a common attack vector, since nearly all cyber attacks begin with a DNS query, providing an additional layer of protection at the DNS resolution level.

The service integrates with Azure's existing DNS infrastructure and uses Microsoft's threat intelligence data to automatically update the list of malicious domains. Organizations can enable this protection without managing their own threat feeds or maintaining blocklists, reducing operational overhead for security teams.

This capability is particularly relevant for enterprises looking to implement defense-in-depth strategies, as it stops threats before they can establish connections to command and control servers or phishing sites. The feature works alongside existing Azure Firewall and network security tools to provide comprehensive protection.

The general availability means the service is now production-ready with full SLA support across Azure regions. Pricing details were not specified in the announcement, so customers should check Azure pricing documentation for DNS security policy costs.

41:28 Ryan – "Being able to automatically take the results of a feed is something I will do any day, just because these things are updated by many more parties, and faster, than I could ever react with our own threat intelligence. So that's pretty great. I like it."

42:46 Public Preview: StandardV2 NAT Gateway and StandardV2 Public IPs

Azure introduces the StandardV2 NAT Gateway in public preview, adding zone-redundancy for high availability in regions with availability zones. This upgrade addresses a key limitation of the original NAT Gateway by ensuring outbound connectivity survives zone failures, which matters for enterprises running mission-critical workloads that require consistent internet egress.

The StandardV2 SKU includes matching StandardV2 Public IPs that work together with the new NAT Gateway tier. Organizations using the original Standard SKU will need to evaluate migration paths, since zone-redundancy represents a fundamental architectural change requiring new resource types rather than an in-place upgrade.

This release targets customers who previously had to architect complex workarounds for zone-resilient outbound connectivity, particularly those running multi-zone deployments of containerized applications or database clusters. The preview allows testing of failover scenarios before production deployment.

The announcement lacks specific pricing details for the StandardV2 tier, though NAT Gateway typically charges an hourly resource fee plus data processing costs. Customers should monitor Azure pricing pages as the preview progresses toward general availability for cost comparisons against the Standard SKU.

43:48 Justin – "The fact that this is not an upgrade that I can just check, and I have to redeploy a whole new thing, annoys the crap out of me."

46:51 Generally Available: Custom error pages on Azure App Service

Custom error pages on Azure App Service have moved to general availability, allowing developers to replace default HTTP error pages with branded or customized alternatives. This addresses a common requirement for production applications, where maintaining a consistent user experience during errors is important for brand identity and user trust.
The feature integrates directly into App Service configuration without requiring additional Azure services or third-party tools. Developers can specify custom HTML pages for different HTTP error codes like 404 or 500, which App Service will serve automatically when those errors occur.

This capability is particularly relevant for customer-facing web applications, e-commerce sites, and SaaS platforms where error handling needs to align with corporate branding guidelines. The feature works across all App Service tiers that support custom domains and SSL certificates.

No additional cost is associated with custom error pages beyond standard App Service hosting fees, which start at approximately $13 per month for the Basic tier. Implementation requires uploading error page files to the app's file system and updating configuration settings through the Azure Portal or deployment templates.

The general availability status means the feature is now production-ready with full support coverage, moving beyond the preview phase where it was available for testing. Documentation is available in the Azure App Service custom error pages guide.

48:17 Matt – "It's crazy that this wasn't already there. The workarounds you had to do to make your own error page were messy at best."

49:01 Generally Available: Streamline IT governance, security, and cost management experiences with Microsoft Foundry

Microsoft Foundry reaches general availability as an enterprise AI governance platform that consolidates security, compliance, and cost management controls for IT administrators deploying AI solutions. The platform addresses the growing need for centralized oversight as organizations scale their AI initiatives across Azure infrastructure.

The service integrates with existing Azure management tools to provide unified visibility and control over AI workloads, allowing IT teams to enforce policies and monitor resource usage from a single interface. This reduces the operational overhead of managing disparate AI projects while maintaining enterprise security standards.

Foundry targets large enterprises and regulated industries that require strict governance frameworks for AI deployment, particularly organizations balancing innovation speed with compliance requirements. The platform helps bridge the gap between data science teams pushing for rapid AI adoption and IT departments responsible for risk management.

The general availability announcement indicates Microsoft is positioning Azure as the enterprise-ready AI cloud, competing directly with AWS and Google Cloud for organizations prioritizing governance alongside AI capabilities. Specific pricing details were not disclosed in the announcement, suggesting costs likely vary based on usage and existing Azure commitments.

50:22 Justin – "It's like a combination of SageMaker and Vertex married Databricks and then had a baby – plus a report interface."

52:44 Generally Available: Model Router in Microsoft Foundry

Microsoft Foundry's Model Router is now generally available as an AI orchestration layer that automatically selects the optimal language model for each prompt based on factors like complexity, cost, and performance requirements. This eliminates the need for developers to manually choose between different AI models for each use case.

The service supports an expanded range of models, including the GPT-4 family, GPT-5 family, GPT-oss, and DeepSeek models, giving organizations flexibility to balance performance needs against cost considerations. The router can dynamically switch between models within a single application based on prompt characteristics.

This addresses a practical challenge for enterprises deploying multiple AI models, where different tasks require different model capabilities. For example, simple queries can route to smaller, less expensive models while complex reasoning tasks automatically use more capable models.

The orchestration layer integrates with Microsoft Foundry's broader AI infrastructure, allowing customers to manage multiple model deployments through a single interface rather than building custom routing logic. This reduces operational complexity for teams managing diverse AI workloads across their organization.

No specific pricing details are provided in the announcement, though costs will likely vary based on the underlying models selected by the router and usage patterns. Organizations should evaluate potential cost savings from routing simpler queries to less expensive models versus always using premium models.
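Since the router is exposed like any other chat deployment, calling it can be as simple as pointing an OpenAI-compatible client at the router; the endpoint, API version, and the "model-router" deployment name below are assumptions.

```python
# Hedged sketch: calling a Foundry Model Router deployment through the
# Azure OpenAI-compatible API (pip install openai). Endpoint, api_version,
# and the "model-router" deployment name are assumptions.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # placeholder
    api_key="<api-key>",
    api_version="2024-12-01-preview",  # assumed
)

resp = client.chat.completions.create(
    model="model-router",  # the router deployment, not a specific model
    messages=[{"role": "user", "content": "Summarize RFC 793 in two sentences."}],
)
# The router picks the underlying model per prompt; the response's model
# field reports which one actually served the request.
print(resp.model, "->", resp.choices[0].message.content)
```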
54:50 Generally Available: Scheduled Actions

Azure's Scheduled Actions feature is now generally available, providing automated VM lifecycle management at scale with built-in handling of subscription throttling and transient error retries. This eliminates the need for custom scripting or third-party tools to start, stop, or deallocate VMs on a recurring schedule.

The feature addresses common cost optimization scenarios where organizations need to automatically shut down development and test environments during off-hours, or scale down non-production workloads on weekends. This can reduce compute costs by 40-70% for environments that don't require 24/7 availability.

Scheduled Actions integrates directly with Azure Resource Manager and works across VM scale sets, making it suitable for both individual VMs and large-scale deployments. The automatic retry logic and throttling management mean operations complete reliably even when managing hundreds or thousands of VMs simultaneously.

The service is available in all Azure public cloud regions where VMs are supported, with no additional cost beyond standard VM compute charges. Organizations pay only for the time VMs are running, so automated shutdown schedules translate directly into reduced monthly bills.

55:31 Justin – "Thank you for copying every other cloud that's had this forever…"
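For a sense of what this retires, below is the kind of homegrown shutdown script teams have been running on cron; a minimal sketch using the standard azure-mgmt-compute SDK with placeholder resource names. The retry and throttling handling Scheduled Actions now provides is exactly what's missing here.

```python
# The kind of homegrown off-hours script Scheduled Actions replaces.
# Minimal sketch with azure-identity + azure-mgmt-compute; resource names
# are placeholders, and there is deliberately no retry/throttling logic,
# which is the part the managed feature now handles for you.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

client = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

for vm in client.virtual_machines.list(resource_group_name="dev-rg"):
    # Deallocate rather than power off so compute billing actually stops.
    client.virtual_machines.begin_deallocate("dev-rg", vm.name).result()
    print(f"Deallocated {vm.name}")
```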
After Show

51:46 OpenAI and NORAD team up to bring new magic to "NORAD Tracks Santa"

OpenAI partnered with NORAD to add AI-powered holiday tools to the annual Santa tracking tradition, creating three ChatGPT-based features that turn kids' photos into elf portraits, generate custom toy coloring pages, and build personalized Christmas stories. This represents a consumer-friendly application of generative AI that demonstrates how large language models can be packaged for mainstream family use during the holidays.

The collaboration shows OpenAI pursuing brand-building partnerships with trusted institutions like NORAD to normalize AI tools in everyday contexts. By embedding ChatGPT features into a 68-year-old military tradition that reaches millions of families, OpenAI gains exposure to non-technical users who might otherwise be hesitant about AI adoption.

From a technical perspective, these tools showcase practical implementations of image generation and text-to-image capabilities that parents can use without understanding the underlying models. The focus on simple, single-purpose GPTs rather than complex interfaces suggests OpenAI is testing how to make its technology more accessible to casual users.

The partnership raises interesting questions about AI companies seeking legitimacy through associations with government organizations and cultural traditions. While the tools are harmless holiday fun, they demonstrate how AI providers are moving beyond enterprise sales to embed their technology into cultural moments and family activities.

This is essentially a marketing play disguised as holiday cheer, but it does illustrate how cloud-based AI services are becoming infrastructure for consumer experiences rather than just backend business tools. The real story is about distribution strategy and making AI feel safe and familiar to mainstream audiences.

The Cloud Pod has one message: keep Skynet out of Christmas!

Closing

And that is the week in the cloud! Visit our website, the home of The Cloud Pod, where you can join our newsletter and Slack team, send feedback, or ask questions at theCloudPod.net, or tweet at us with the hashtag #theCloudPod.