This is a hands-on, full-day workshop where you'll go from zero to running open-source models directly inside your Go applications — no cloud APIs, no external servers, no data leaving your machine.
You'll start by loading a model and running your first inference with the Kronk SDK. Then you'll learn how to configure models for your hardware — GPU layers, KV cache placement, batch sizes, and context windows — so you get the best performance out of whatever machine you're running on. With the model tuned, you'll take control of its output through sampling parameters: temperature, top-k, top-p, repetition penalties, and grammar constraints that guarantee structured JSON responses.
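To give a feel for what those sampling knobs actually do, here is a minimal Go sketch of top-k and top-p filtering over a token distribution. This illustrates the technique only; it is not Kronk's API, the function name is invented for the example, and real engines apply temperature to raw logits before filtering.

```go
package main

import (
	"fmt"
	"sort"
)

// filterTopKTopP applies top-k, then top-p (nucleus) filtering to a
// probability distribution over tokens and returns the surviving token
// indices. Illustrative only: real inference engines do this over raw
// logits, with temperature applied first.
func filterTopKTopP(probs []float64, topK int, topP float64) []int {
	idx := make([]int, len(probs))
	for i := range idx {
		idx[i] = i
	}
	// Order token indices by descending probability.
	sort.Slice(idx, func(a, b int) bool { return probs[idx[a]] > probs[idx[b]] })

	if topK > 0 && topK < len(idx) {
		idx = idx[:topK] // keep only the K most likely tokens
	}
	// Keep the smallest prefix whose cumulative probability reaches topP.
	cum := 0.0
	for i, t := range idx {
		cum += probs[t]
		if cum >= topP {
			idx = idx[:i+1]
			break
		}
	}
	return idx
}

func main() {
	probs := []float64{0.5, 0.2, 0.15, 0.1, 0.05}
	fmt.Println(filterTopKTopP(probs, 4, 0.8)) // [0 1 2]
}
```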
Next you'll see how Kronk's caching systems — System Prompt Cache (SPC) and Incremental Message Cache (IMC) — eliminate redundant computation and make multi-turn conversations fast. You'll watch a conversation go from full prefill on every request to only processing the newest message.
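The principle is easy to sketch. The real caches operate on the model's KV state rather than raw text, and the type and method names below are invented for illustration, but the core idea (prefill only the part of the prompt you haven't already processed) looks like this:

```go
package main

import (
	"fmt"
	"strings"
)

// promptCache remembers how much of the conversation prompt has already
// been prefilled, so only the new suffix needs processing. A toy stand-in
// for the idea behind SPC/IMC; the real caches work on KV state.
type promptCache struct {
	processed string // prefix already prefilled
}

// suffixToProcess returns only the part of prompt not yet processed.
// If the prompt no longer starts with the cached prefix (edited history,
// different system prompt), the whole prompt must be re-prefilled.
func (c *promptCache) suffixToProcess(prompt string) string {
	if strings.HasPrefix(prompt, c.processed) {
		suffix := prompt[len(c.processed):]
		c.processed = prompt
		return suffix
	}
	c.processed = prompt // cache miss: full prefill
	return prompt
}

func main() {
	var c promptCache
	fmt.Printf("turn 1 prefill: %q\n",
		c.suffixToProcess("SYS: be helpful\nUSER: hi\n"))
	fmt.Printf("turn 2 prefill: %q\n",
		c.suffixToProcess("SYS: be helpful\nUSER: hi\nASSISTANT: hello!\nUSER: what's Go?\n"))
}
```

On turn two, only the newest exchange comes back for processing; the shared prefix rides the cache.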
With the foundation solid, you'll build real applications: a Retrieval-Augmented Generation (RAG) pipeline that grounds model responses in your own documents using embeddings and vector search, and a natural-language-to-SQL system where the model generates database queries from plain English — with grammar constraints ensuring the output is always valid, executable SQL.
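The retrieval half of that pipeline boils down to nearest-neighbor search over embedding vectors. Here is a minimal, self-contained sketch using hand-made vectors; a real pipeline would obtain them from an embedding model and typically use a proper vector store rather than a linear scan.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity between two embedding vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

type doc struct {
	Text string
	Vec  []float64 // produced by an embedding model in a real pipeline
}

// topMatches ranks documents by similarity to the query vector and
// returns the k best. A linear scan is fine for a sketch; real systems
// use an index.
func topMatches(query []float64, docs []doc, k int) []doc {
	sort.Slice(docs, func(i, j int) bool {
		return cosine(query, docs[i].Vec) > cosine(query, docs[j].Vec)
	})
	if k > len(docs) {
		k = len(docs)
	}
	return docs[:k]
}

func main() {
	docs := []doc{
		{"Go was announced in 2009.", []float64{0.9, 0.1, 0.0}},
		{"Soufflés collapse when cold.", []float64{0.0, 0.2, 0.9}},
	}
	query := []float64{0.8, 0.2, 0.1} // pretend: embedding of "when was Go released?"
	for _, d := range topMatches(query, docs, 1) {
		fmt.Println("context:", d.Text)
	}
}
```

The retrieved text then gets prepended to the prompt, which is what grounds the model's answer in your documents.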
Each part builds on the last.
By the end of the day, you won't just understand how private AI works — you'll have built applications that load models, cache intelligently, retrieve context, and generate code, all running locally on your own hardware. You'll leave with working code, a deep understanding of local model inference in Go, and hands-on experience across the full stack: model configuration, performance tuning, intelligent caching, retrieval-augmented generation, and structured code generation.
Don't worry if you don't have the required hardware — the instructor will provide everything you need to follow along and run the examples.
William Kennedy is a managing partner at Ardan Labs in Miami, Florida. Ardan is a group of passionate engineers, artists and business professionals focused on building and delivering reliable, secure and scalable solutions. Bill is also the author of Go in Action and the Ultimate Go Notebook, plus the author of over 100 blog posts and articles about Go. Lastly, he is a founding member of GoBridge and the GDN, organisations working to increase Go adoption through diversity.
Claws — always-on, autonomous AI agents like OpenClaw and NanoClaw — are changing how people interact with AI. Unlike coding agents that wait for your commands, a claw runs continuously, remembers context across sessions, schedules its own tasks, and acts on your behalf. This hands-on workshop walks you through building your own claw from scratch in Go, covering everything from the core reasoning loop to inter-agent communication.
Starting from an empty main.go, we'll build a fully functional claw step by step: the agent reasoning loop, tool registration and execution, streaming responses, persistent memory, autonomous scheduling, and a web UI to interact with your agent. We'll use coding agents throughout the process — building an AI agent with the help of AI agents — while reviewing and understanding every layer of what gets produced.
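At its heart, the reasoning loop is small: call the model, run any requested tool, feed the result back, and repeat until you get a final answer. The sketch below stubs out the model and invents a reply type (real LLM APIs return structured tool-call objects), but the shape of the loop is the one you'll build:

```go
package main

import (
	"fmt"
	"strings"
)

// toolFunc is a registered tool the agent can invoke.
type toolFunc func(args string) string

// reply is a simplified model response: either a final answer or a
// tool invocation. Invented for this sketch; real APIs return
// structured tool-call objects.
type reply struct {
	Final string
	Tool  string
	Args  string
}

// runAgent is the core reasoning loop: call the model, execute any
// requested tool, append the result to the history, and repeat until
// the model produces a final answer or we hit the step limit.
func runAgent(llm func(history []string) reply, tools map[string]toolFunc, task string) string {
	history := []string{"user: " + task}
	for step := 0; step < 5; step++ {
		r := llm(history)
		if r.Final != "" {
			return r.Final
		}
		out := "error: unknown tool " + r.Tool
		if t, ok := tools[r.Tool]; ok {
			out = t(r.Args)
		}
		history = append(history, "tool "+r.Tool+": "+out)
	}
	return "step limit reached"
}

func main() {
	tools := map[string]toolFunc{
		"shout": func(args string) string { return strings.ToUpper(args) },
	}
	// Stub model: requests the tool once, then answers with its output.
	llm := func(history []string) reply {
		last := history[len(history)-1]
		if strings.HasPrefix(last, "tool ") {
			return reply{Final: last}
		}
		return reply{Tool: "shout", Args: "hello claw"}
	}
	fmt.Println(runAgent(llm, tools, "greet loudly"))
}
```

Everything else in the workshop — memory, scheduling, the web UI — hangs off this loop.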
Go's concurrency model maps directly to claw architecture. Goroutines handle parallel tool execution and background scheduling naturally. Channels deliver streaming LLM responses. The type system keeps tool schemas honest. And single-binary deployment means your claw runs anywhere — on your laptop, a Raspberry Pi, or a VPS — without runtime dependencies.
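For example, fanning tool calls out across goroutines and collecting the results over a channel takes only a few lines. This is a generic sketch of the pattern, not workshop code:

```go
package main

import (
	"fmt"
	"sync"
)

// callTools runs several tool invocations in parallel, one goroutine
// each, and gathers their results over a channel. A generic sketch of
// the concurrency pattern described above.
func callTools(calls map[string]func() string) map[string]string {
	type result struct{ name, out string }
	ch := make(chan result)
	var wg sync.WaitGroup
	for name, fn := range calls {
		wg.Add(1)
		go func(name string, fn func() string) {
			defer wg.Done()
			ch <- result{name, fn()}
		}(name, fn)
	}
	// Close the channel once every tool has reported in.
	go func() { wg.Wait(); close(ch) }()

	out := make(map[string]string)
	for r := range ch {
		out[r.name] = r.out
	}
	return out
}

func main() {
	results := callTools(map[string]func() string{
		"weather":  func() string { return "sunny" },
		"calendar": func() string { return "free after 3pm" },
	})
	fmt.Println(results)
}
```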
The workshop progresses through carefully designed demonstrations, each building on the previous. Along the way you'll learn to manage context windows, apply conversation compaction strategies, and evaluate security boundaries for agent systems. The final demonstration connects all participant-built claws via Google's A2A (Agent-to-Agent) protocol, where your agents discover peers on the network and begin collaborating autonomously — delegating tasks, sharing results, and solving problems together across the room.
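As a taste of compaction: one simple strategy keeps the system prompt plus as many of the newest messages as fit in a token budget. The sketch below approximates tokens by word count; a real implementation would use the model's tokenizer and might summarize dropped turns instead of discarding them.

```go
package main

import "fmt"

type message struct {
	Role, Text string
}

// compact trims conversation history to fit a token budget, always
// keeping the system prompt and preferring the most recent messages.
// Token counts are approximated by word count for this sketch.
func compact(history []message, budget int) []message {
	system := history[0]
	used := tokens(system.Text)
	var kept []message
	// Walk backwards so the newest messages survive.
	for i := len(history) - 1; i >= 1; i-- {
		t := tokens(history[i].Text)
		if used+t > budget {
			break
		}
		used += t
		kept = append([]message{history[i]}, kept...)
	}
	return append([]message{system}, kept...)
}

// tokens counts whitespace-separated words as a crude token estimate.
func tokens(s string) int {
	n, inWord := 0, false
	for _, r := range s {
		if r == ' ' {
			inWord = false
		} else if !inWord {
			n++
			inWord = true
		}
	}
	return n
}

func main() {
	h := []message{
		{"system", "you are a helpful claw"},
		{"user", "remember my name is Ada"},
		{"assistant", "noted, Ada"},
		{"user", "what time is my meeting tomorrow"},
	}
	for _, m := range compact(h, 12) {
		fmt.Printf("%s: %s\n", m.Role, m.Text)
	}
}
```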
While you're building a personal agent, the patterns transfer directly to production use cases: internal knowledge assistants, automated ops tools, LLM-powered APIs, and service-to-service coordination. The agent loop, tool system, and concurrency patterns you implement are the same foundations your team will need for any Go-based LLM integration.
Daniel Mahlow is managing partner and generative AI lead at Contiamo, a Berlin-based consultancy. He has worked on various software and data engineering projects and since 2020 has been driving generative AI projects from prototype to large-scale production. He is a generalist, a builder, and enjoys diving into new technology.