AWS re:Invent 2025 - Designing local Generative AI inference with AWS IoT Greengrass (DEV316)
While running inference in the cloud is a common and effective approach for Generative AI, some use cases demand local execution. This session explores how to design and operate local inference architectures using AWS IoT Greengrass. Through a live demo using a robotic arm, attendees compare cloud-based and local inference in practice, examining trade-offs in latency, connectivity, and model update frequency.
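The latency and connectivity trade-offs mentioned above can be pictured as a simple routing decision. The following is a minimal sketch, not the session's implementation: the `Conditions` fields, the `choose_target` function, and the example thresholds are all hypothetical names chosen for illustration, and no AWS IoT Greengrass API is used. Model update frequency is left out of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Conditions:
    """Hypothetical snapshot of the device's current situation."""
    connected: bool             # is the cloud endpoint reachable right now?
    cloud_round_trip_ms: float  # measured round trip to the cloud endpoint
    latency_budget_ms: float    # longest response time the task tolerates


def choose_target(c: Conditions) -> str:
    """Return "local" or "cloud" for the next inference request."""
    # Without connectivity, local inference is the only option.
    if not c.connected:
        return "local"
    # If the cloud round trip alone exceeds the budget, stay local.
    if c.cloud_round_trip_ms > c.latency_budget_ms:
        return "local"
    # Otherwise the cloud is reachable and fast enough.
    return "cloud"


if __name__ == "__main__":
    # Example values are assumptions, not measurements from the demo.
    print(choose_target(Conditions(True, 50.0, 200.0)))
    print(choose_target(Conditions(False, 50.0, 200.0)))
```

A real design would also weigh model size, device hardware, and how often the model needs updating, which a simple rule like this does not capture.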