This new Docker extension lets you run Docker Model Runner as the inference service behind Open WebUI, powering a richer chat experience. This blog walks you through why Docker Model Runner and Open WebUI make a powerful duo, how to set up your own local AI assistant, and what’s happening under the hood.
Local LLMs are no longer just experimental toys. Thanks to rapid advances in model optimization and increasingly powerful consumer hardware, local large language models have gone from proof-of-concept curiosities to genuinely useful tools.
Even a MacBook with an M-series chip can now run models that deliver fast, meaningful responses offline, without an internet connection or API keys.
Docker Model Runner, accessible via Docker Desktop (and also available as a plugin for Docker CE and of course fully OSS), makes getting started easy: just pick a model in the UI or run a single docker model run from the CLI.
You’ll have a fully operational model up and running in seconds.
Docker Model Runner is designed as an inference service in Docker Desktop or the CLI, allowing developers to run models locally with the familiar workflows and commands they know and trust. From a design perspective, that means it provides only the bare minimum: a prompt box and a response field. There’s no memory. No file upload. No chat flow. No interface that feels like a real assistant. This is by design: there is no intention to replicate a full chat experience within Docker Desktop, since that is already well implemented by other offerings in the wider ecosystem.
That’s where Open WebUI comes in: a modern, self-hosted interface designed specifically for working with local LLMs.
It brings chat history, file uploads, prompt editing, and more. All local. All private.
That’s why an extension was created: to combine the two.
This Docker Extension launches Open WebUI and hooks it directly into your running model via Docker Model Runner. No configuration. No setup. Just open and go.
Let’s see how it works.
If you already have Docker Desktop installed, you’re almost there.
Head over to the Models tab and pick any model from the Docker Hub section: GPT-OSS, Gemma, LLaMA 3, Mistral or others.
One click, and Docker Model Runner will pull the container and start serving the model locally.
Prefer the CLI? A single docker model pull does the same job.
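For example, pulling a model and then chatting with it from the terminal looks roughly like this (the model name is illustrative; any model from the Docker Hub AI catalog works):
# Pull a model from the Docker Hub AI catalog
docker model pull ai/gemma3

# Start an interactive chat with it
docker model run ai/gemma3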
Next, you might want something more capable than a single input box.
Open the Extensions Marketplace inside Docker Desktop and install the Open WebUI extension, a feature-rich interface for local LLMs.
It automatically provisions the container and connects to your local Docker Model Runner.
All models you’ve downloaded will appear in the WebUI, ready to use; no manual config, no environment variables, no port mapping.
Once setup completes, you’ll see a screen confirming the extension is running, along with a button to launch the interface in your browser.
Alternatively, you can open it manually at http://localhost:8090 (default port) or bookmark it for future use.
Note: The first-time startup may take a couple of minutes as Open WebUI installs required components and configures integration.
Subsequent launches are much faster – nearly instant.
Once installed, Open WebUI feels instantly familiar, like using ChatGPT, but running entirely on your own machine.
You get a full chat experience, with persistent conversations, system prompt editing, and the ability to switch between models on the fly.
Drop in PDFs, Markdown files, presentations, spreadsheets, or even images.
Open WebUI extracts the content and makes it queryable through the chat.
Need a summary, quick answer, or content overview? Just ask: all processing happens locally.
With voice input turned on, you can talk to your assistant right from the browser.
This is great for hands-free tasks, quick prompts, or just demoing your local AI setup to a friend.
This requires granting microphone permission in your browser.
Open WebUI supports full control over system prompts with templates, variables, and chat presets.
Whether you’re drafting code, writing blog posts, or answering emails, you can fine-tune how the model thinks and responds.
Already downloaded multiple models using Docker Model Runner?
Open WebUI detects them automatically. Pick any model from the dropdown and start chatting; no restart required.
Want the model to remember something specific?
You can store facts or reminders manually in the personal memory panel and edit or remove them at any time.
Open WebUI goes beyond chat with advanced tools that power real workflows:
Note: Not all features are universally available. Some depend on the model’s capabilities (e.g., function calling, image understanding), your current Open WebUI settings (e.g., voice input, plugins), or the hardware you’re running on (e.g., GPU acceleration, local RAG performance).
Open WebUI aims to provide a flexible platform, but actual functionality may vary based on your setup.
Under the hood, the extension brings together two key components: integration between Open WebUI and Docker Model Runner, and a dynamic container provisioner built into the Docker extension.
Open WebUI supports Python-based “functions”, lightweight plugins that extend model behavior.
This extension includes a function that connects to Docker Model Runner via its local API, allowing the interface to list and access all downloaded models automatically.
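If you’re curious, you can hit that same local API yourself. A minimal check, assuming host-side TCP access to Model Runner is enabled on its default port 12434 (the same endpoint used later in this post), might look like:
# List the models Docker Model Runner currently serves via its OpenAI-compatible API
curl http://localhost:12434/engines/v1/models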
When you install the extension, Docker spins up the Open WebUI container on demand. It’s not a static setup: the container is configured dynamically based on your environment.
The extension handles all of this behind the scenes, but gives you full control when needed.
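If you want to peek at what was provisioned, ordinary Docker commands work; the container name below is illustrative, since the extension chooses its own naming:
# Find the Open WebUI container the extension started
docker ps --filter "name=open-webui"

# See how it was wired up (environment variables) at creation time
docker inspect open-webui --format '{{json .Config.Env}}'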
You’ve just seen how the Docker Open WebUI Extension turns Docker Model Runner from a simple model launcher into a fully-featured local AI assistant with memory, file uploads, multi-model support, and more.
What used to require custom configs, manual ports, or third-party scripts now works out of the box, with just a few clicks.
Just open http://localhost:8090 and start chatting locally.
The future of local AI is modular, private, and easy to use. This extension brings us one step closer to that vision, and it’s just getting started.
Enter the Docker MCP Toolkit.
The Model Context Protocol (MCP) Toolkit acts as a secure bridge, allowing AI models like Codex to safely discover and use any of the 200+ MCP servers from the trusted MCP catalog curated by Docker.
In this post, we’ll walk through an end-to-end demo, just like our Claude Code and Gemini CLI tutorials. But this time, we’re pairing Codex with Neo4j MCP servers.
First, we’ll connect Codex to the Neo4j server using the MCP Toolkit. Then, we’ll show a fun example: building a graph of Pokémon species and their types, and exploring the data visually. While playful, this example highlights how Codex + MCP can be applied to real-world, semi-structured data pipelines.
Read on to see how a generic AI assistant, when supercharged with Docker and MCP, can evolve into a specialized data engineering powerhouse!
While Codex provides powerful AI capabilities and MCP provides the protocol, Docker MCP Toolkit makes automated data modeling and graph engineering practical. Without containerization, building a knowledge graph means managing local Neo4j installations, dealing with database driver versions, writing boilerplate connection and authentication code, and manually scripting the entire data validation and loading pipeline. A setup that should take minutes can easily stretch into hours for each developer.
Docker MCP Toolkit eliminates this friction:
We built Docker MCP Toolkit to meet developers where they are. If you’re using Codex, you should be able to engineer a knowledge graph without wrestling with database infrastructure.
First, we need to give Codex access to the specialized Neo4j tools.
The Neo4j Cypher and Data Modeling servers are available out-of-the-box in the main MCP Toolkit catalog.
With our tools ready, we run a one-time command to make Codex aware of the MCP Toolkit:
docker mcp-client configure codex
We can also do that from the Docker Desktop UI: navigate to the Clients tab and smash that Connect button next to Codex and any other assistants you use:
Docker will edit the corresponding configuration files and next time Codex starts, it’ll connect to the MCP toolkit and you’ll have the tools at your disposal!
We still need to configure the Neo4j Cypher MCP server to connect to the Neo4j database, so we’ll set this up now. We’ll use Codex to start our Neo4j database and configure the connection. First, we ask Codex to create the container:
› Spin up a Neo4j container for me in Docker please.
Codex will run the necessary Docker command and get our Neo4j container running. You can of course do this manually, use a cloud service, or download the Neo4j Desktop application, but since we’re already having fun in Codex, why not let it handle these mundane things for us.
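The command Codex settles on will look roughly like this sketch (image tag and credentials are illustrative; they just need to match what you enter in the MCP server configuration later):
# Start a local Neo4j instance with the browser console (7474) and Bolt (7687) ports exposed
docker run -d --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/password \
  neo4j:latest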
With Neo4j available in the container, we now need to configure the Neo4j Cypher MCP server to connect to it. (The Neo4j Data Modeling MCP server works without any configuration.) To simplify things, you can take a screenshot of the configuration form and ask Codex how to fill it in to connect to the container we just started.
Next, we asked Codex for help, pointing it to a screenshot of the configuration page:
› There is a screenshot in this directory. It has the information we need to configure, MCP server to connect to the Neo4j database. Please look at it and give me the information to configure the MCP server to connect to our Neo4j container.
Codex inspected the running container and determined its authentication details (e.g., NEO4J_AUTH=neo4j/password). It then provided the correct configuration values. We just need to update the Neo4j Cypher MCP server configuration in the Docker Desktop UI with the neo4j-cypher.url (neo4j://localhost:7687) and the password.
Fill in the form in the Docker MCP Toolkit for the Neo4j Cypher MCP server and restart Codex.
Now we’re ready to tackle our original problem of building the graph of Pokémon species. First, enter YOLO mode by typing /approvals and selecting the mode where Codex has the freedom to do whatever it pleases:
Next, we can throw the full power of Codex models at our task. We give it the full problem statement:
› Here's the page with all pokemon in generation 1: https://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_by_index_number_in_Generation_I
and here's the types information: https://bulbapedia.bulbagarden.net/wiki/Type
we're going to create a graph with data about these pokemon (ignore MissingNo entries - they are a glitch and ignore other glitched pokemon).
you have a neo4j container running, and the MCP tools for modelling and cypher.
Codex will accept the task, break it down into a plan, and start cranking at it.
There are a few interesting moments in how it approaches the task, though much depends on the model you configure Codex with. For me it was gpt-5-codex, which tends to favor scripting over chaining curl commands, for example.
Here’s the plan you might see if you’re following along:
• I’m preparing to build a Pokemon graph by fetching data from a website using network access. Since this is a complex task, I’ll create a detailed plan involving data gathering, modeling
entities like Pokémon and Types, defining relationships such as evolutions and type effectiveness, and then using tools like MCP Docker to build the Neo4j graph. The first step is to draft
• Updated Plan
└ □ Inspect requirements and gather Generation I Pokémon data with types and evolution chains
□ Define Neo4j graph data model for Pokémon, Types, and relationships
□ Load data into Neo4j via Cypher ingestion and validate
Then it immediately started cranking out Python scripts with requests, BeautifulSoup, and other web-scraping friends to fetch the data from the provided wiki. Interestingly, the Pokémon data is not actually on a single page but spread across subpages for every Pokémon species, so scraping with Python makes a ton of sense over curling the pages and spending tokens on processing the data in the model. And it leaves the scripts behind, so you can verify and reproduce the results if needed.
Eventually the model proceeds to analyze the data, and here the MCP tools become really helpful. You can see below that it’s calling MCP_DOCKER.validate_data_model from the server we configured earlier.
I'm outlining a comprehensive data model for Pokemon, types, and their relationships with precise property types and units converted for better clarity. I plan to generate ingestion-ready
JSON files via a Python script to integrate cleanly with Neo4j ingestion tools, ensuring the data pipeline is robust and structured.
└ wrote data model to data/gen1_pokemon_model.json
• Called MCP_DOCKER.validate_data_model({"data_model":"data/gen1_pokemon_model.json","return_validated":true})
└ Input validation error: 'data/gen1_pokemon_model.json' is not of type 'object'
....
It took Codex a few attempts to figure out the data model JSON schema and produce one for the Pokémon that satisfied the Neo4j Data Modeling MCP server.
Then it returned to Python for creating the data ingestion script and loaded the data into the Neo4j instance.
A few MCP tool calls later, it was querying the data with Cypher (the query language for graph databases), which it can do because it has access to the Neo4j Cypher MCP server. With that, Codex and the MCP servers can answer analytical questions about our data.
- Greedy type-coverage search suggests trios such as (Rhydon, Parasect, Dragonite) or (Rhydon, Parasect, Jynx) hit 13 of the 15 defending types super-effectively; no trio can cover Normal/Rock simultaneously because Normal has no offensive 2× matchup.
Now what’s really fun about Neo4j is that it comes with a terrific console where you can explore the data.
While our Neo4j container with the Pokémon data is still running, we can go to http://localhost:7474, enter the neo4j/password credentials, and explore the data visually.
Here for example is a subset of the Pokémon and their type relationships.
And if you know Cypher or have an AI assistant that can generate Cypher queries (and verify they work with an MCP tool call), you can generate more complex projections of your data, for example this (generated by Codex) shows all Pokémon, their evolution relationships and primary/secondary types.
MATCH (p:Pokemon)
CALL {
WITH p
OPTIONAL MATCH (p)-[:EVOLVES_TO*1..]->(evo:Pokemon)
WITH collect(DISTINCT evo) AS evos
RETURN [e IN evos WHERE e IS NOT NULL | {node: e, relType: 'EVOLVES_TO'}] AS evolutionConnections
}
CALL {
WITH p
OPTIONAL MATCH (p)-[:HAS_TYPE]->(type:Type)
WITH type
ORDER BY type.name // ensures a stable primary/secondary ordering
RETURN collect(type) AS orderedTypes
}
WITH p, evolutionConnections, orderedTypes,
CASE WHEN size(orderedTypes) >= 1 THEN orderedTypes[0] END AS primaryType,
CASE WHEN size(orderedTypes) >= 2 THEN orderedTypes[1] END AS secondaryType
WITH p,
evolutionConnections +
CASE WHEN primaryType IS NULL THEN [] ELSE [{node: primaryType, relType: 'HAS_PRIMARY_TYPE'}] END +
CASE WHEN secondaryType IS NULL THEN [] ELSE [{node: secondaryType, relType: 'HAS_SECONDARY_TYPE'}] END AS connections
UNWIND connections AS connection
RETURN p AS pokemon,
connection.node AS related,
connection.relType AS relationship
ORDER BY pokemon.name, relationship, related.name;
While this Pokémon demo is a fun example, it’s also a practical blueprint for working with real-world, semi-structured data. Graph databases like Neo4j are especially well-suited for this kind of work. Their relationship-first model makes it easier to represent the complexity of real-world systems.
In this walkthrough, we showed how to connect OpenAI’s Codex to the Neo4j MCP Servers via Docker MCP Toolkit, enabling it to take on multiple specialized roles:
In your own projects, you might ask your AI assistant to “Analyze production logs and identify the cause of performance spikes,” “Migrate the user database schema to a new microservice,” or “Model our product catalog from a set of messy CSVs.”
The Docker MCP Toolkit bridges the gap between powerful AI coding agents and the specialized tools they need to be truly useful. By providing secure, one-click access to a curated catalog of over 200 MCP servers, it enables AI agents to interact with real infrastructure, including databases, APIs, command-line tools, and more. Whether you’re automating data workflows, querying complex systems, or orchestrating services, the MCP Toolkit equips your assistant to work like a real developer. If you’re building with AI coding assistants and want it to go beyond code generation, it’s time to start integrating with the tools your stack already relies on!
jQuery is a household name among web developers who have been around the block. Initially released in 2006, it took the web development world by storm with its easy and intuitive syntax for navigating a document, selecting DOM elements, handling events, and making AJAX requests. At its peak in 2015, jQuery featured on 62.7 percent of the top one million websites and 17 percent of all Internet websites.
A decade later, jQuery is not the shiny new kid on the block anymore. Most of the original pain points jQuery solved, such as DOM manipulation and inconsistent browser behavior, are gone thanks to modern browser APIs.
But jQuery is still widely used. According to SimilarWeb, as of August 11, 2025, nearly 195 million websites use it. That means many developers, like me, still use it every day. And like me, you might prefer it in certain cases.
So, in this article, I’ll share when it still makes sense to use jQuery and when not. Don’t worry: I’m not arguing we should replace React with jQuery. And I’m not here to romanticize 2008. In 2025, I simply still find myself reaching for jQuery because it’s the right tool for the job.
To determine when it makes sense to use jQuery and when not, it helps to know why it was created in the first place and what problems it aimed to solve.
When John Resig launched jQuery at BarCamp NYC in January 2006, the web was a different place. Features we take for granted today were absent from most browsers:
querySelectorAll: Selecting DOM elements across browsers was messy. In the mid-2000s, none of the available element selectors, like getElementById or getElementsByClassName, could select elements using complex CSS queries.
addEventListener wasn’t universal. While browsers like Firefox, Safari, and Chrome supported the W3C event model with addEventListener, Internet Explorer (before IE9) used Microsoft’s proprietary model with attachEvent. These two models differed from each other in almost all functional aspects.
XMLHttpRequest wasn’t standardized. While browsers like Firefox and Safari offered the familiar XMLHttpRequest, Internet Explorer (before IE7) used ActiveX objects to give JavaScript network capabilities. This meant you had to use a bunch of if-else blocks to make an AJAX request.
jQuery solved all of this with a simple, chainable syntax and consistent cross-browser behavior. It offered a streamlined, chainable API for DOM traversal, event handling, and AJAX—far simpler than cross-browser native JavaScript at the time. These features made jQuery the go-to JavaScript library in the 2010s, powering everything from personal blogs to Fortune 500 sites. In 2012, a W3Techs survey found that jQuery was running on 50 percent of all websites, and at its peak 62.7 percent of the top one million websites used it.
Although jQuery’s glory days are clearly behind us, it still works well in some situations. Here are the scenarios where I still choose jQuery:
Even now, a W3Techs survey shows that jQuery is used on 77.8 percent of the top 10 million websites in 2025. This is mostly legacy usage: old apps that use jQuery because switching to a more modern framework is a costly endeavour. This becomes clear when you look at the version statistics: in a 2023 survey across 500 organizations, only 44 percent use maintained versions (3.6.0 or newer), while 59 percent run older versions (1.x to 3.5.1).
I maintain a few legacy projects like these that were written with jQuery, and I can tell you why they’re still around: they just work. So as the adage goes, “If it ain’t broke, don’t fix it.”
Many large enterprises, government sites, corporate intranets, and many WordPress plugins and themes still rely on jQuery. Rewriting these sites to pure JavaScript or a modern framework is a time-consuming, expensive endeavour that can also introduce new challenges and bugs. Most of the time, all that effort and risk aren’t worth the relatively small benefits in the short term.
The truth is this: the codebase I inherited, built in the jQuery era, works. The business logic is robust, the profit margins are healthy, and—most surprisingly—shipping new features feels like slipping into a worn leather jacket: unfashionable, but comfortable. - Marc Boisvert-Duprs
Yes, most jQuery plugins are no longer actively maintained or have been deprecated, so depending on them is a security risk. Abandoned plugins may become incompatible or insecure as browsers continue to evolve. So, legacy projects that use jQuery and jQuery plugins should eventually migrate away from jQuery.
Developers often need to prototype very simple frontend apps, be it for throwaway demos, internal tools, or proof-of-concept pages. Sometimes the spec may even require a very basic frontend with minimal interactivity (for example, a static page with a simple form and a button).
jQuery is a perfect choice for these situations. Simply drop in a <script> tag from a CDN and get animations, DOM manipulation, and AJAX in minutes—no need for npm, bundlers, transpilers, or complicated frameworks with hundreds of dependencies. It’s also great for running quick commands from the DevTools console, especially if you want to experiment with an app.
But why not use a more modern but lightweight framework like Alpine.js? Personally, I’m intimately familiar with jQuery: I’ve used it since the beginning of my web development journey. I love its simplicity and ease of use. The minor improvements a new framework can make in this scenario don’t offset the time spent learning a new tool.
Hopefully, you don’t have to support older browsers that lack the standard querySelector, or browsers like Internet Explorer, notorious for their non-standard behavior. Unfortunately, some of us still need to maintain apps that run on these browsers.
While native JS is perfectly fine for modern browsers, if you’re building something that has to run on older embedded browsers (think: kiosk software, older enterprise or university intranets, or web apps inside legacy desktop apps), jQuery’s normalization saves you from manual polyfilling, and its CSS selector lets you perform complex DOM manipulations easily.
As someone who primarily works with backend apps, I don’t often need to code animations for the frontend. But when I do need to create basic chained animations (fading, sliding, sequencing multiple elements, etc.), jQuery’s .animate() is simpler (and more lightweight) to write than juggling CSS animations and JS event callbacks.
I was recently tasked with making some upgrades to an ancient app with a PHP backend. Imagine my surprise when I discovered that the server returns HTML fragments, not JSON APIs. In this case, jQuery’s .load() and .html() methods can be simpler and more efficient than writing fetch() boilerplate with DOM parsing.
For example, I can extract a DOM element from the results of an AJAX request, and load it into an element like so:
// Replace #comments with just the #comments-list from the server response
$('#comments').load('/article/1 #comments-list');
Whereas the same thing in native JS would be:
fetch('/article/1')
.then(res => res.text())
.then(html => {
const doc = new DOMParser().parseFromString(html, 'text/html');
const comments = doc.querySelector('#comments-list');
document.querySelector('#comments').innerHTML = comments.outerHTML;
})
Yes, while the jQuery syntax is more straightforward, both approaches are doing the same thing under the hood, so there’s not a huge performance gain. In the jQuery version, you also have the added overhead of bundling the jQuery library. So, it’s a tradeoff between simplicity and bundle size.
While jQuery still makes sense in some situations, there are some cases where I would never use jQuery.
If I’m building a modern frontend app with lots of reactivity and reusable components, I’d use a modern framework like React or Vue with native features for DOM manipulation.
Frameworks like React, Vue, Svelte, or Angular handle DOM rendering in a virtualised way. Direct DOM manipulation with jQuery conflicts with their data-binding approach, causing state mismatches and bugs.
For example, in React, calling $('#el').html('...') bypasses React’s virtual DOM, and React won’t know about the change. This will inevitably lead to bugs that are difficult to diagnose.
Most of jQuery’s once-killer features, such as selectors, AJAX, events, and animations, are now native in JavaScript:
document.querySelectorAll() replaces $().
fetch() replaces $.ajax().
element.classList replaces .addClass() / .removeClass().
element.animate() handles animations.
If I’m just toggling classes or making a fetch call, adding jQuery is wasteful.
jQuery’s major draw between 2008 and 2015 was its cross-browser compatibility, which was necessary due to quirks in IE6–IE9. It simply wasn’t practical to write browser-specific JS for all the different versions of IE. With jQuery, the quirks were abstracted away.
Now that IE has been discontinued, this advantage is no longer relevant.
So if the app I’m working on needs to support only modern browsers, I don’t need most of jQuery’s compatibility layer.
Mixing jQuery and framework code leads to a “hybrid monster” that’s difficult to maintain.
jQuery can conflict with existing frameworks, which can cause hard-to-fix bugs. If my project is already written in another framework, I avoid including jQuery.
Sometimes, I need to use some features of jQuery, but I can’t justify including it in its entirety. Here are some libraries I use in cases like these.
* document.querySelector() and document.querySelectorAll()
* The fetch() API
* element.addEventListener()
* element.animate() instead of .animate() in jQuery
* Lodash: collection iteration, object/array utilities, throttling, debouncing
* Day.js: date manipulation in a tiny package (instead of jQuery’s date plugins)
If you still like a single API but want it lighter than jQuery:
In 2025, jQuery isn’t the cutting-edge choice for building complex, highly interactive single-page applications that it was in the 2010s, and that’s perfectly fine. While modern frameworks dominate the headlines, jQuery remains a reliable, well-understood tool that solves the problems it was designed for, simply and effectively.
In the end, the “right” tool is the one that meets your project’s needs, and for countless developers and businesses, jQuery continues to be that.
What was the value of that ARG? Which files were copied into the image?
Recently, we have been making updates to the Docker Build clients (Buildx) and our VS Code extension (Docker DX) to improve the developer experience when using Docker. Today, we are sharing the next stage of that process with the introduction of Build Debugging in VS Code and Docker Build.
With the new debugging feature in Buildx from Docker, you will be able to reduce the time you spend fixing your Docker builds. In this post, you’ll learn how to configure the Buildx debugger in Visual Studio Code, step through a build and inspect variables and the image’s file system, and open a shell inside the image being built. Finally, you will learn a little about the debugger’s implementation and how it can be integrated into other editors.
To start debugging Dockerfiles in Visual Studio Code:
Run docker buildx version and verify that your Buildx is at least version 0.29.x.
Open up your Dockerfile and open the Run and Debug view in Visual Studio Code. If you do not have any launch configurations, you will see something like the following.
Figure 1: Run and Debug view opened in Visual Studio Code with no launch configurations defined.
Click on the “create a launch.json file” hyperlink. If you have launch configurations, open up your launch.json file by clicking on the cog icon in the top right-hand corner of the Run and Debug view.
In your launch.json file, create a new launch configuration for debugging your Docker build. You can use the sample below to get started. For a full description of the various attributes in a launch configuration, see here.
{
"name": "Docker: Build",
"type": "dockerfile",
"request": "launch",
"dockerfile": "Dockerfile",
"contextPath": "${workspaceFolder}"
}
Now that you have completed setting up your launch configuration, let’s add a breakpoint to our Dockerfile. Place a breakpoint next to one of your RUN instructions by clicking in the editor’s left margin or by pressing F9. A circle should appear to indicate that a breakpoint has been added.
We are now ready to start the debugger. Select the launch configuration you created and then hit F5. The build should pause at the RUN line where you placed the breakpoint.
Figure 2: Docker build suspended by a breakpoint in Visual Studio Code.
We will now walk you through the three different features that the Buildx Debugger provides.
When a build is in a suspended state, you can look at any variables that may have been defined. In this example, by looking at the executed command’s workdir value on the left-hand side, we can now see that the command is not being run in the right folder, as we had copied the contents into /app. We can fix this by adding WORKDIR /app before the RUN line. Also note that we can view variables that have been defined by our image and the base image, as seen by VAR and NODE_VERSION.
Figure 3: Docker build encounters an error and is suspended by the debugger instead of terminating.
In addition to inspecting variables, you can also look at the structure of the file system to see what is already there and what you have copied in. For text files, you can also see the file content, as shown in the file’s data field.
Figure 4: View the file system of the Docker image being built.
Creating the right Dockerfile is often an iterative process. Part of this is usually because the host system you are developing on shares few similarities with the image you are building. Consider the differences between running Ubuntu locally but trying to build an Alpine Linux image. The small differences in package names create a lot of back and forth between your editor and your browser as you search for the right name. You add a line here and then maybe comment out another line somewhere else before running docker build again and hoping for the best.
This iterative process can now be streamlined with the help of the debugger. When your build is in a suspended state, open the Debug Console view and place your cursor in the input field at the bottom. Type in exec and then hit the enter key. The Terminal view should now open with a shell that is attached to the image that is being built.
Figure 5: Use the Debug Console to open a shell into the Docker image being built by running exec.
Figure 6: The Docker image that is being built can now be accessed and inspected with a terminal.
This feature is a game changer: you can now easily open the image of a Dockerfile at any given step, inspect its contents, and run commands for testing. Previously, we would have to comment out everything after the buggy line, build the Docker image, and then manually run and open a shell into the image. All of that is now condensed into adding a breakpoint in your editor and starting a debug session!
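For instance, once the shell is attached you might poke around like this (illustrative commands, assuming an Alpine-based image as in the scenario above):
# Check which files actually made it into the image so far
ls -la /app

# Confirm which distribution the base image provides
cat /etc/os-release

# Look up the package name Alpine actually uses
apk update && apk search nodejs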
Keep in mind that none of the changes you make in the terminal are persisted so this is purely for experimentation. In the figure below, we can see that a file was created when the debugger was paused at line 3. When the debugger was advanced to line 4, the file disappeared.
Figure 7: Changes to the Docker image inside the exec terminal will be reset when the debugger steps to another line.
Just like our work with the Docker Language Server that implements the Language Server Protocol, the Buildx debugger is built on open standards as it implements the Debug Adapter Protocol which means that you can debug Dockerfile builds with any editor that supports the protocol. Besides Visual Studio Code, we also provide an official plugin for Neovim. For the JetBrains users out there, we have verified that it integrates well with the LSP4IJ plugin. If your favourite editor supports the Debug Adapter Protocol, there should be a way for the Buildx debugger to integrate with it.
We want to take this opportunity to thank Kohei Tokunaga (ktock) for his ideas and initial work around this feature. The contributions he provided to Buildx gave us a great foundation for us to build out and complete this feature. This release would not have been possible without his help. Thank you, Kohei!
At Docker, we are building our hardened images to exacting standards. That means carefully crafting by hand, because humans are still the best security architects. That said, we understand the value of AI and deploy it as an extra set of eyes at critical junctures in our Docker Hardened Image (DHI) build process. With this dual approach, our users get better coverage and more secure products. Humans aren’t perfect, but AI can help them become better. Here’s a recent example of how the AI tools we use for DHI caught a bug, made a product more secure, and contributed back to the community.
Our upstream release automation opened a routine version bump for nginx-exporter. As part of our release process, the DHI AI guardrail fetched the upstream diffs that the bump would pull in and scanned them with language-aware checks. It spotted a classic logic inversion in the exporter’s new proxy-protocol path and immediately blocked our PR from auto-merging. I reproduced the issue, confirmed the diagnosis, and submitted a small fix upstream. Because the guardrail held the line, customers never saw the bug.
That sequence matters. A normal dependency update would have sailed through and shipped a regression. Instead, Docker’s AI guardrail acted like a release bouncer at the door of a bar. The AI guardrail looked at what was about to enter, recognized a risk pattern, and stopped it from coming in and causing trouble. A human Docker engineer then read the flagged issue, proved the failure and sent the fix. The fix was then accepted by the project, improving their code base. So, this fix not only improved DHI but also improved the project for everyone.
While the fix pull request was being reviewed and approved by the upstream maintainers, our DHI build pipeline applied the patch and shipped the new, patched version to our customers, making sure that customers never saw the regression in their environments.
Ironically, standard AI coding assistants didn’t spot the flaw. This is why having our own internal AI guardrails is so critical. They provide the extra layer of support and the specialization that is hard to get from general-purpose coding assistants.
This is how we want AI to show up in our pipeline. It is not a replacement for engineering judgment, but a force multiplier for safety. The guardrail focuses on a narrow set of high-leverage issues that can cause big problems such as inverted error checks, ignored failures, and resource mishandling or suspicious contributor activity.
The example also shows the value of layered safeguards. We don’t just accept an upstream version and hope for the best. The AI guardrail scrutinizes what changed. Our policy treats high-confidence findings as a hard stop. Humans then verify, reproduce the behavior, and apply the smallest correct patch. Only then does the release move forward. The best security is proactive, not reactive.
As I alluded to above, there’s an important broader open source benefit to our AI guardrails. DHI depends on hundreds of community projects, some of which are downloaded millions or even billions of times per month. When the DHI AI guardrail surfaces an issue, our default is to fix it upstream rather than carry a private patch. That keeps our images clean, reduces long-term maintenance, and gives every downstream user a better baseline. It also helps the upstream projects, which benefit from our AI anomaly detection, our human judgment, and our subsequent code fixes.
So back to the title. DHI is crafted by engineers, but it is protected by AI. Our guardrail is an active and continuously improving AI that reads upstream diffs with context, recognizes risky patterns, scores confidence, and blocks merges that don’t pass muster. Well-applied AI helps the human work better with faster reviews, tighter patches, and fewer regressions.
This is the partnership we’re optimizing for. Humans set intent, exercise design judgment, and ship features. AI enforces discipline at critical gates. Best of all, every catch feeds the model signals for future scans, so protections improve as our ecosystem evolves. Collectively, this adds to the security of the entire open source ecosystem.
It’s a win for Docker, for our customers, and the community.
Enter the powerhouse duo: Gemini CLI and Docker MCP Toolkit. This isn’t just another tool combination — it’s a paradigm shift that delivers developer AI assistance with zero complexity overhead. A one-time setup of just 5 minutes can save you 20 minutes per test scenario — that’s a 97% time reduction that compounds with every test you run.
Let’s look at a concrete example: browser testing and performance analysis that developers do regularly can be simplified and completely automated. What used to require opening browsers, clicking through flows, analyzing DevTools, taking screenshots, and manually documenting bugs can now happen in a single 30-second conversation.
In this guide, you’ll learn how to:
With 220+ pre-built MCP servers, browser automation becomes as simple as having a conversation. No Selenium WebDriver configuration, no CI/CD pipeline complexity, no manual screenshot management — just natural language instructions that execute real browser tests.
Manual browser testing and performance analysis are broken. You open Chrome DevTools, click through pages, check network requests, analyze performance metrics, take screenshots, write bug reports, and then manually create GitHub issues. This context-switching nightmare wastes hours every sprint.
Traditional automation tools don’t solve the real problem. Selenium requires brittle selectors and a complex setup. Playwright needs JavaScript knowledge and test frameworks. Both require maintaining test scripts that break with every UI change. The “solution” often takes more time than manual testing.
While Gemini provides powerful AI capabilities and MCP provides the protocol, Docker MCP Toolkit makes browser automation practical. Without containerization, setting up browser testing means managing Chrome/Firefox installations, dealing with WebDriver versions, configuring Node.js dependencies, handling screenshot directories manually, and different configurations for every developer’s machine. The setup that should take 2 minutes takes 2-6 hours per developer.
Docker MCP Toolkit eliminates this friction:
We built Docker MCP Toolkit to meet developers where they are. If you’re using Gemini CLI, you should be able to automate browser testing without wrestling with infrastructure.
Your tests run securely on your machine. Everything executes in isolated Docker containers on your local system. Test data, screenshots, and application access never leave your computer. No cloud uploads, no third-party services, no compliance concerns. You get enterprise-grade browser automation with complete privacy.
Install via npm:
npm install -g @google/gemini-cli
Once installed, just type gemini in your terminal window.
gemini
Follow the setup wizard:
In case you need higher rate limits or enterprise access, I suggest using an API key from Google AI Studio. You can easily set it as an environment variable:
export GEMINI_API_KEY="YOUR_API_KEY"
After selecting your sign-in method, a browser window will open. Simply log in with your Google account.
Just type “gemini” in your terminal window to start chatting with Gemini and enter your prompt.
Option 1: One-Click Connection (Recommended)
Docker Desktop automatically configures the MCP Gateway connection – the underlying infrastructure that routes requests between Gemini CLI and your MCP servers, handling authentication, containerisation, and secure communication seamlessly.
Option 2: Manual Command Line Setup
If you prefer a command-line setup or need to configure a specific project:
docker mcp client connect gemini --global
You’ll see output like this:
=== System-wide MCP Configurations ===
● gemini: connected
MCP_DOCKER: Docker MCP Catalog (gateway server) (stdio)
● gordon: connected
MCP_DOCKER: Docker MCP Catalog (gateway server) (stdio)
You might have to restart 'gemini'.
The connected status confirms Gemini CLI is linked to the Docker MCP Gateway.
The Gemini CLI uses the mcpServers configuration in your settings.json file to locate and connect to MCP servers. This configuration supports multiple servers with different transport mechanisms. The mcpServers object is where you define each MCP server you want the CLI to connect to.
Whenever you hit the “Connect” button under the Docker MCP Client for Gemini CLI, it adds the following Docker MCP Gateway configuration to the ~/.gemini/settings.json file.
{
"theme": "Default",
"selectedAuthType": "oauth-personal",
"mcpServers": {
"MCP_DOCKER": {
"command": "docker",
"args": ["mcp", "gateway", "run"],
"env": {}
}
}
}
Learn more about MCP and Gemini CLI interaction through this link.
# Exit Gemini CLI if running, then restart
gemini
Inside Gemini CLI, type /mcp to see available MCP servers.
You should see the Docker MCP Gateway listed, which provides access to all enabled MCP servers. The /MCP_DOCKER tools indicate a successful connection. As you enable more MCP servers in Docker Desktop, they’ll appear here automatically.
When you start Gemini CLI for the first time after connecting to Docker MCP Toolkit, you’ll see a prompt about the new MCP server:
Choose Option 1 (recommended). This configures your project to automatically use Docker MCP Toolkit and any MCP servers you enable in Docker Desktop. You won’t need to approve MCP servers individually each time.
You’re now ready to use Gemini with MCP servers from Docker Desktop.
Now that you’ve connected Gemini CLI to Docker MCP Toolkit, let’s see it in action with a practical example. We’ll automatically discover real bugs through browser testing and identify performance bottlenecks through detailed analysis — the kind that would take 20 minutes of manual testing, DevTools monitoring, and performance profiling.
This isn’t a trivial “Hello World” demo. We’re performing comprehensive browser testing and performance analysis on a real e-commerce application with the kinds of issues you encounter in production:
That’s a 97% time reduction, but more importantly, it’s consistent, thorough, and documented every time.
The catalog-service-node application is a realistic e-commerce catalog with intentional issues that mirror common production problems: performance issues (for example, /api/products is requested twice unnecessarily), accessibility issues, and browser issues.
Let’s see if Gemini CLI can discover all of these automatically through intelligent browser testing and performance analysis, then create a comprehensive GitHub issue.
For this demo, we’ll use a real e-commerce catalog application. This gives us realistic performance and accessibility issues to discover.
Clone the repository:
git clone https://github.com/ajeetraina/catalog-service-node
cd catalog-service-node
Start all services:
# Start Docker services (database, S3, Kafka)
docker compose up -d
# Install dependencies
npm install --omit=optional
# Start the application
npm run dev
Verify it’s running:
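A quick sanity check along these lines should confirm both the API and the frontend are up (the API port matches the seed script below, and the frontend URL is the one the app prints):
# The API should return a JSON array of products (empty for now)
curl -s http://localhost:3000/api/products

# The frontend should answer with HTTP 200
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:5173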
To make testing realistic, create sample products:
# Create seed script
cat > seed-data.sh << 'EOF'
#!/bin/bash
API_URL="http://localhost:3000/api"
echo "Seeding test products..."
curl -s -X POST "$API_URL/products" \
-H "Content-Type: application/json" \
-d '{"name":"Vintage Camera","description":"Classic 35mm film camera","price":299.99,"upc":"CAM001"}' \
> /dev/null && echo "✅ Vintage Camera"
curl -s -X POST "$API_URL/products" \
-H "Content-Type: application/json" \
-d '{"name":"Rare Vinyl Record - LAST ONE!","description":"Limited edition. Only 1 left!","price":149.99,"upc":"VINYL001"}' \
> /dev/null && echo "✅ Rare Vinyl Record"
curl -s -X POST "$API_URL/products" \
-H "Content-Type: application/json" \
-d '{"name":"Professional DSLR Camera","description":"50MP camera with 8K video","price":2499.99,"upc":"CAMPRO001"}' \
> /dev/null && echo "✅ Professional DSLR"
# Add bulk test products
for i in {4..15}; do
curl -s -X POST "$API_URL/products" \
-H "Content-Type: application/json" \
-d "{\"name\":\"Test Product $i\",\"description\":\"Bulk test product $i\",\"price\":$((50 + RANDOM % 450)).99,\"upc\":\"BULK$(printf '%03d' $i)\"}" \
> /dev/null && echo "✅ Test Product $i"
done
echo ""
TOTAL=$(curl -s "$API_URL/products" | jq '. | length')
echo "Total products: $TOTAL"
echo "Ready! Visit http://localhost:5173"
EOF
chmod +x seed-data.sh
./seed-data.sh
Expected output:
Seeding test products...
✅ Vintage Camera
✅ Rare Vinyl Record
✅ Professional DSLR
✅ Test Product 4
✅ Test Product 5
...
✅ Test Product 15
Total products: 15
Ready! Visit http://localhost:5173
Now you have a realistic environment with 15 products to analyze.
For browser testing and performance analysis automation, you’ll orchestrate three MCP servers: Playwright MCP for browser control, GitHub MCP for issue creation, and Filesystem MCP for saving screenshots and test artifacts.
Let’s configure each one.
The Playwright MCP server gives Gemini the ability to control real browsers, Chrome, Firefox, and WebKit, just like a human would.
In Docker Desktop:
What you get:
browser_navigate – Navigate to URLs
browser_snapshot – Capture page state for analysis
browser_take_screenshot – Save visual evidence
browser_click, browser_type – Interact with elements
browser_console_messages – Get console errors
browser_network_requests – Analyze HTTP requests
The Playwright MCP runs in a secure Docker container with browsers pre-installed. No manual ChromeDriver setup, no WebDriver conflicts, no OS-specific browser installations.
The GitHub MCP enables Gemini to create issues, PRs, and manage repositories on your behalf.
Option 1: OAuth Authentication (Recommended – Easiest)
Advantage: No manual token creation. Authorization happens through GitHub’s secure OAuth flow with automatic token refresh.
Option 2: Personal Access Token (For Granular Control)
If you prefer manual control or need specific scopes:
Step 1: Create a GitHub Personal Access Token with the following scopes:
repo (Full control of repositories)
workflow (Update GitHub Actions workflows)
Step 2: Configure in Docker Desktop
Or via CLI:
docker mcp secret set GITHUB.PERSONAL_ACCESS_TOKEN=github_pat_YOUR_TOKEN_HERE
The Filesystem MCP allows Gemini to save screenshots and test artifacts to your local machine.
In Docker Desktop:
Important Security Note: Only grant access to directories you’re comfortable with Gemini reading and writing to. The Filesystem MCP is scoped to these specific paths for your protection.
Change directory to the root of the catalog-service-node project repository and open Gemini CLI:
gemini
Paste this prompt:
Navigate to http://host.docker.internal:5173 and perform a detailed
performance and accessibility analysis:
1. Take a screenshot of the full page
2. Count how many products are displayed
3. Open browser DevTools and check:
- Console for any JavaScript errors
- Network tab: how many HTTP requests are made?
- Performance: how long does the page take to load?
4. Identify performance issues:
- Are all products loading at once with no pagination?
- Are images optimized?
- Any unnecessary network requests?
5. Check for accessibility issues:
- Missing alt text on images
- Color contrast problems
- Vague button labels
Create a GitHub issue titled "Product catalog performance and
accessibility issues" with:
- Screenshots attached
- List of specific problems found
- Severity: Medium
- Labels: performance, accessibility, enhancement
Here’s what happens behind the scenes as Gemini orchestrates multiple MCP servers:
Step 1: Browser Navigation
The web server is running. I'll navigate to http://host.docker.internal:5173, take a screenshot, and then proceed with the rest of the performance analysis.
Gemini launches a real Chromium browser in the Docker container and navigates to your application.
Step 2: Console Analysis
✦ I can see 15 products listed on the page: * Vintage Camera * Rare Vinyl Record - LAST ONE! * Professional DSLR Camera * Test Product 4-15
Step 3: Performance Check
✦ Checking browser console and network requests... Opening DevTools to analyze performance metrics...
Step 4: Issue Creation
✦ Creating GitHub issue with findings...
create_issue (MCP_DOCKER MCP Server) {"labels":["performance","accessibility","enhancement"],"body":"### Summary..."}
✦ I have successfully created the issue. You can find it at: https://github.com/ajeetraina/catalog-service-node/issues/1
Gemini automatically created this comprehensive issue with structured information:
| Step | Before MCP (Manual Testing) | Time | After MCP (Gemini CLI + MCP) | Time |
|---|---|---|---|---|
| 1 | Open browser manually | 1 min | Paste prompt in Gemini CLI | 5 sec |
| 2 | Navigate to application | 30 sec | Gemini analyzes automatically | 25 sec |
| 3 | Click through pages | 3 min | GitHub issue created | auto |
| 4 | Open DevTools manually | 30 sec | Screenshots attached | auto |
| 5 | Take screenshots | 2 min | ✓ Complete | – |
| 6 | Check browser console | 1 min | – | – |
| 7 | Analyze network requests | 2 min | – | – |
| 8 | Document findings | 3 min | – | – |
| 9 | Write detailed bug report | 5 min | – | – |
| 10 | Create GitHub issue | 2 min | – | – |
| Total | – | ~20 minutes per test | – | 30 sec per test |
Time saved per test: 19.5 minutes (97% faster!)
Impact over time:
You’ve just witnessed how Docker MCP Toolkit transforms Gemini CLI from a chat assistant into a complete browser testing and performance analysis platform. What used to require opening browsers, clicking through flows, analyzing DevTools, documenting bugs, and creating issues manually now happens in one 30-second conversation.
The combination of Gemini CLI and Docker MCP Toolkit represents a paradigm shift in AI-assisted development. By leveraging terminal-native tools and containerized services, you get:
This setup isn’t just about convenience — it’s about building a development environment that adapts to your workflow rather than forcing you to adapt to it. The developer productivity revolution is here. The question isn’t whether you’ll adopt AI-assisted development — it’s whether you’ll lead with the best tools available or play catch-up later.
Ready to try it? Enable Docker MCP Toolkit in Docker Desktop and start building your own Gemini-powered development workflow today.
Models are advancing quickly. GPT-5, Claude Sonnet, Gemini. Each release gives us more capabilities. But most real work isn’t solved by a single model.
Developers are realizing they need a system of agents: different types of agents working together to accomplish more complex tasks. For example, a researcher to find information, a writer to summarize, a planner to coordinate, and a reviewer to check accuracy.
The challenge is that today, building a multi-agent system is harder than it should be. Context doesn’t flow cleanly between agents. Tools require custom integration. Sharing with a teammate means sending instructions and hoping they can re-create your setup.
That’s the problem cagent solves.
In this blog, we’ll walk you through the basics, how to create a multi-agent AI system in minutes, and how cagent makes this possible.
A multi-agent system is a coordinated group of AI agents that collaborate to complete complex tasks. Using cagent, you can build and run these systems declaratively, no complex wiring or reconfiguration needed.
Figure 1: cagent workflow for multi-agent orchestration.
cagent is an open-source tool for building agents and a part of Docker’s growing ecosystem of AI tools.
Instead of writing glue code to wire up models, tools, and workflows, describe an agent (or a team of agents) in a single YAML file:
This turns agents into portable, reproducible artifacts you can run anywhere and share with anyone.
Create, run, and share multi-agent AI systems more easily with cagent.
Here’s what that looks like in practice.
version: "2"
agents:
root:
model: anthropic/claude-sonnet-4-0
instruction: |
Break down a user request.
Ask the researcher to gather facts, then pass them to the writer.
sub_agents: ["researcher", "writer"]
researcher:
model: openai/gpt-5-mini
description: Agent to research and gather information.
instruction: Collect sources and return bullet points with links.
toolsets:
- type: mcp
ref: docker:duckduckgo
writer:
model: dmr/ai/qwen3
description: Agent to summarize notes.
instruction: Write a concise, clear summary from the researcher’s notes.
cagent run team.yaml
The coordinator delegates, the researcher gathers, and the writer drafts. You now have a functioning team of agents.
cagent push ./team.yaml org/research-writer
Now, anyone on your team can run the exact same setup with:
cagent run docker.io/org/research-writer
That’s a full multi-agent workflow, built and shared in under 5 minutes.
These principles keep cagent an easy-to-use and customizable multi-agent runtime to orchestrate AI agents.
Developers are already exploring different multi-agent use cases with cagent. Here are some examples:
1. PR and issue triaging
2. Research summarizing
3. Knowledge routing
Each one starts the same way: a YAML file and an idea. And they can be pushed to a registry and run by anyone.
cagent gives you the fastest path forward to build multi-agent systems. It’s open-source, easy to use, and built for the way developers already work. Define your agents, run them locally, and share them, all in a few lines of YAML.
YAML in, agents out.
Run the following to get started:
brew install cagent
cagent new
cagent run agent.yaml
When we first announced Docker Model Runner, it was in its beta phase, exclusively available on Docker Desktop and limited to Apple and Nvidia hardware. We received a ton of valuable feedback, and we’ve been hard at work making it more accessible and powerful.
Today, we’re proud to say that Docker Model Runner is now Generally Available (GA) and can be used in all versions of Docker! But that’s not all. We’ve added Vulkan support, which means you can now run your models on virtually any GPU. This is a huge leap forward, and it’s all thanks to the incredible potential we see in this project and the community that surrounds it.
We’ve listened to your feedback about the contribution process, and we’ve made some significant changes to make it as smooth as possible.
To start, we’ve consolidated all the repositories into a single, unified home. This makes it much easier to find everything you need in one place.
We have also invested a lot of effort in updating our documentation for contributors. Whether you’re a seasoned open-source veteran or a first-time contributor, you’ll find the information you need to get started.
The success of Docker Model Runner depends on you, our amazing community. We’re calling on you to help us make this project the best it can be. Here’s how you can get involved:
We’re incredibly excited about this new chapter for Docker Model Runner, and we can’t wait to see what we can build together. Let’s get to work!
In this post, we’ll show how DGX Spark and Docker Model Runner work together to make local model development faster and simpler, covering the unboxing experience, how to set up Model Runner, and how to use it in real-world developer workflows.
NVIDIA DGX Spark is the newest member of the DGX family: a compact, workstation-class AI system, powered by the Grace Blackwell GB10 Superchip that delivers incredible performance for local model development. Designed for researchers and developers, it makes prototyping, fine-tuning, and serving large models fast and effortless, all without relying on the cloud.
Here at Docker, we were fortunate to get a preproduction version of DGX Spark. And yes, it’s every bit as impressive in person as it looks in NVIDIA’s launch materials.
Many of us at Docker and across the broader developer community are experimenting with local AI models. Running locally has clear advantages:
But there are also familiar tradeoffs:
This is where DGX Spark and Docker Model Runner (DMR) shine. DMR provides an easy and secure way to run AI models in a sandboxed environment, fully integrated with Docker Desktop or Docker Engine. When combined with the DGX Spark’s NVIDIA AI software stack and large 128GB unified memory, you get the best of both worlds: plug-and-play GPU acceleration and Docker-level simplicity.
The device arrived well-packaged, sleek, and surprisingly small, resembling more a mini-workstation than a server.
Setup was refreshingly straightforward: plug in power, network, and peripherals, then boot into NVIDIA DGX OS, which comes with NVIDIA drivers, CUDA, and the NVIDIA AI software stack preinstalled.
Once on the network, enabling SSH access makes it easy to integrate the Spark into your existing workflow.
This way, the DGX Spark becomes an AI co-processor for your everyday development environment, augmenting, not replacing, your primary machine.
Installing Docker Model Runner on the DGX Spark is simple and can be done in a matter of minutes.
DGX OS comes with Docker Engine (CE) preinstalled. Confirm you have it:
docker version
If it’s missing or outdated, install it by following the regular Ubuntu installation instructions.
The Model Runner CLI is distributed as a Debian package via Docker’s apt repository. Once the repository is configured (see the installation instructions linked above), install it with the following commands:
sudo apt-get update
sudo apt-get install docker-model-plugin
Or use Docker’s handy installation script:
curl -fsSL https://get.docker.com | sudo bash
You can confirm it’s installed with:
docker model version
Now that the plugin is installed, let’s pull a model from the Docker Hub AI Catalog. For example, the Qwen 3 Coder model:
docker model pull ai/qwen3-coder
The Model Runner container will automatically expose an OpenAI-compatible endpoint at:
http://localhost:12434/engines/v1
You can verify it’s live with a quick test:
# Test via API
curl http://localhost:12434/engines/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"ai/qwen3-coder","messages":[{"role":"user","content":"Hello!"}]}'
# Or via CLI
docker model run ai/qwen3-coder
GPUs are allocated to the Model Runner container via nvidia-container-runtime, and Model Runner takes advantage of any available GPUs automatically. To see GPU usage:
nvidia-smi
Here’s what’s happening under the hood:
[ DGX Spark Hardware (GPU + Grace CPU) ]
│
(NVIDIA Container Runtime)
│
[ Docker Engine (CE) ]
│
[ Docker Model Runner Container ]
│
OpenAI-compatible API :12434
The NVIDIA Container Runtime bridges the NVIDIA GB10 Grace Blackwell Superchip drivers and Docker Engine, so containers can access CUDA directly. Docker Model Runner then runs inside its own container, managing the model lifecycle and providing the standard OpenAI API endpoint. (For more info on Model Runner architecture, see this blog).
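If you want to sanity-check that containers on the Spark can reach the GPU through the NVIDIA Container Runtime, one quick way is to run nvidia-smi inside a throwaway CUDA container (the image tag below is just an example; any recent nvidia/cuda base image with arm64 support should work):
# Run nvidia-smi inside a disposable CUDA container to confirm GPU access
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi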
From a developer’s perspective, you interact with models much like any other Dockerized service: docker model pull, list, inspect, and run all work out of the box.
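For example, a typical session on the Spark might look like this (using the model pulled earlier in this post):
# Everyday Model Runner commands
docker model pull ai/qwen3-coder      # download (or update) the model
docker model list                     # show models available locally
docker model inspect ai/qwen3-coder   # print model metadata
docker model run ai/qwen3-coder       # start an interactive chat session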
Using Local Models in Your Daily Workflows
If you’re using a laptop or desktop as your primary machine, the DGX Spark can act as your remote model host. With a few SSH tunnels, you can both access the Model Runner API and monitor GPU utilization via the DGX dashboard, all from your local workstation.
To access the DGX Spark via SSH, first set up an SSH server on the Spark:
sudo apt install openssh-server
sudo systemctl enable --now ssh
Run the following command to access Model Runner from your local machine. Replace user with the username you configured when you first booted the DGX Spark, and replace dgx-spark.local with the Spark’s IP address on your local network or a hostname configured in /etc/hosts.
ssh -N -L localhost:12435:localhost:12434 user@dgx-spark.local
This forwards the Model Runner API from the DGX Spark to your local machine.
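From your laptop, a quick way to confirm the tunnel is working is to list the models exposed through the forwarded port:
# List the models reachable through the SSH tunnel
curl http://localhost:12435/engines/v1/models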
Now, in your IDE, CLI tool, or app that expects an OpenAI-compatible API, just point it to:
http://localhost:12435/engines/v1
Set the model name (e.g. ai/qwen3-coder) and you’re ready to use local inference seamlessly.
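As one illustration, many OpenAI-compatible SDKs and CLI tools read their endpoint and key from environment variables; the names below follow the common OpenAI SDK convention rather than anything Model Runner-specific, and the key is only a placeholder since the local endpoint doesn’t require one:
# Point OpenAI-compatible tooling at the forwarded Model Runner endpoint
export OPENAI_BASE_URL=http://localhost:12435/engines/v1
export OPENAI_API_KEY=placeholder   # no real key needed for local inference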
The DGX Spark exposes a lightweight browser dashboard showing real-time GPU, memory, and thermal stats, usually served locally at:
http://localhost:11000
You can forward it through the same SSH session or a separate one:
ssh -N -L localhost:11000:localhost:11000 user@dgx-spark.local
Then open http://localhost:11000 in your browser on your main workstation to monitor the DGX Spark performance while running your models.
This combination makes the DGX Spark feel like a remote, GPU-powered extension of your development environment. Your IDE or tools still live on your laptop, while model execution and resource-heavy workloads happen securely on the Spark.
Let’s make this concrete.
Suppose you use OpenCode, an open-source, terminal-based AI coding agent.
Once your DGX Spark is running Docker Model Runner with ai/qwen3-coder pulled and the port is forwarded, you can configure OpenCode by adding the following to ~/.config/opencode/opencode.json:
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"dmr": {
"npm": "@ai-sdk/openai-compatible",
"name": "Docker Model Runner",
"options": {
"baseURL": "http://localhost:12435/engines/v1" // DMR’s OpenAI-compatible base
},
"models": {
"ai/qwen3-coder": { "name": "Qwen3 Coder" }
}
}
},
"model": "ai/qwen3-coder"
}
Now run opencode and select Qwen3 Coder with the /models command.
That’s it! Completions and chat requests will be routed through Docker Model Runner on your DGX Spark, meaning Qwen3-Coder now powers your agentic development experience locally.
You can verify that the model is running by opening http://localhost:11000 (the DGX dashboard) to watch GPU utilization in real time while coding.
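If you prefer staying in the terminal, a simple alternative is to watch GPU usage over the same SSH connection details used above:
# Refresh nvidia-smi on the Spark every 2 seconds (-t allocates the TTY that watch needs)
ssh -t user@dgx-spark.local watch -n 2 nvidia-smi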
This setup lets you:
Running Docker Model Runner on the NVIDIA DGX Spark makes it remarkably easy to turn powerful local hardware into a seamless extension of your everyday Docker workflow.
You install one plugin and use familiar Docker commands (docker model pull, docker model run).
You get full GPU acceleration through NVIDIA’s container runtime.
You can forward both the model API and monitoring dashboard to your main workstation for effortless development and visibility.
This setup bridges the gap between developer productivity and AI infrastructure, giving you the speed, privacy, and flexibility of local execution with the reliability and simplicity Docker provides.
As local model workloads continue to grow, the DGX Spark + Docker Model Runner combo represents a practical, developer-friendly way to bring serious AI compute to your desk — no data center or cloud dependency required.
Learn more:
Container security is understandably a hot topic these days, with more and more workloads running atop this mainstay of the cloud native landscape. While I might be biased because I work at Docker, it is safe to say that containers are the dominant form factor for running applications today. Equally important, the next generation of applications focused on AI are already running on containers. Because the world runs on containers, getting container security right is of paramount importance.
I am sad to say that most organizations who claim to be delivering container security are not. Particularly troubling are the growing ranks of hardened image providers who claim to be providing highly secure containers but are missing important components of what makes a container secure. Granted, we have a strong opinion on container security. We run the world’s largest repository and infrastructure for container hosting and management. And to be clear, our company’s future fate depends on the continued perception that containers are secure. So we have real skin in this game.
All of this being said, as the lead security engineer at Docker, and someone with a very long history with containers, I want to lay down our vision for container security. That vision is actually uncomplicated. There are five essential ingredients of maximum container security and hardened images, bound together by radical transparency. Those ingredients are:
Minimal Attack Surface: A proper hardened image only includes absolutely necessary software in the container. This means stripping out the majority of libraries, agents, and modules that may deliver useful functionality but are put into software distributions by default and add both complexity and CVE exposure. Our hardening process on average eliminates over 98% of the CVE exposure of a container.
A 100% Complete Software Bill of Materials: This is the baseline and must be 100% complete (as per CISA guidance) with no minimum depth. It provides an accurate inventory including direct dependencies, transitive dependencies, and explicit relationships. SBOMs must be fully verifiable back to source, through open standards like SPDX or CycloneDX, standard component identifiers like PURLs, and honest gap disclosure.
Verifiable Build Provenance establishes chain of custody from source code to deployed artifact. SLSA Build Level 3 provenance provides non-falsifiable attestations about what was built, where, and by what process. If you don’t know how or where it was built and who built it, you can’t be sure it’s not tainted.
Standardized Exploitability Assessment clarifies which vulnerabilities affect specific deployment contexts. OpenVEX provides machine-readable statements about vulnerability status, enabling auditors and compliance tools to process assessments independently and properly leverage SBOMs. VEX statement transparency and interoperability make container security viable and allow teams to focus only on real risks.
Cryptographic Verification proves authenticity and integrity. Modern approaches like Sigstore and Cosign enable signing with public verification, allowing anyone to verify signatures without proprietary infrastructure. The signature and provenance chain must be transparent and easy to produce or query; a brief verification sketch follows this list.
100% Transparency to Bind These Pillars Together. All of the above five elements must be transparent, not just in what they produce but in how they produce attestations, evidence, and any data or statements. This means using public sources for vulnerability intelligence (National Vulnerability Database or NVD, distribution security feeds, language ecosystem advisories, GitHub Security Advisories) with visible synchronization cadence. When CVEs listed in the KEV (Known Exploited Vulnerabilities) catalog appear, transparency ensures alignment without negotiation. This means making the CVE selection process and criteria public and allowing users to see the process. This means making the SBOM creation process transparent so users can understand how the manifests are built. Ultimately, radical transparency transforms security from a trust exercise into a verification process where you can prove your posture, auditors can validate your evidence, and customers can independently assess your claims.
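To make the SBOM and cryptographic verification pillars concrete, here is a minimal sketch of the kind of independent checks anyone can run; the image name and key file are hypothetical placeholders, and docker sbom assumes the Docker SBOM CLI plugin is installed:
# Inspect the SBOM shipped with an image (hypothetical image name)
docker sbom docker.io/example/hardened-app:1.0
# Verify the image signature with Cosign against the publisher's public key (hypothetical key file)
cosign verify --key cosign.pub docker.io/example/hardened-app:1.0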
Of course container security also extends into the container runtimes to execute containers with highest security standards as well as continuous observability and enforcement of organizational policies across the entire container lifecycle. I’ll cover Docker’s activities in this area in a later post.
For enterprises looking to better secure containers, I want to be very, very clear. Any “hardened” container image that cannot meet these requirements is a lie. Unfortunately, a number of hardened image vendors cannot meet these requirements. Here are some of the problems we have seen with competitors’ hardened images that our users and customers have brought us for comparison:
It’s not news to say that supply chain attacks on the open source ecosystem are out of control. The smartest Black Hat minds in the world at the most advanced APTs are laser-focused on compromising supply chains because these are among the best ways to compromise entire ecosystems. Supply chain attacks can expose a huge swath of organizations to critical breaches leading to data exfiltration, ransomware and extortion, and espionage. Because we sit at a central position in the container ecosystem, we are also exposed any time the container supply chain is compromised.
That’s why I’m writing this post. Docker has designed our hardened images explicitly to deliver on all five of the core pillars while also providing 100% transparency into process, inputs and outputs. I want to make it very easy for any platform, team, security team, CISO, or even CEO or business leader to be able to ask the right questions to determine whether their container security posture is valid, and whether the hardened images they are buying are actually delivering on their promise. (As a side note, container security is so important that we also think hardened images should be affordable to all. That’s why we’re now offering them at extremely reasonable prices, making them accessible even to two-person startups.)
Container security is not hard. Container security is not rocket science. Container security is about radical transparency, honesty, and doing right for your users. In a perfect world, everyone would be doing container security the right way, and every organization would have easy access to rock-solid containers that are properly hardened by default and completely transparent.
In this perfect world, Docker as a company is better off, the users are better off, the enterprises are better off, and the world is better off. Frankly, our competitors are also better off and their products are better. That’s a good thing. This is more than a sales pitch or an engineering rant. I guess you can call it a mission. Making the technology world safer is of fundamental importance and that’s the outcome we seek.