Giving Your AI Agent a Disposable Workspace with Alibaba Cloud AgentRun

This article introduces Alibaba Cloud AgentRun, a serverless platform that provides AI agents with secure, disposable execution workspaces for safe and scalable task automation.

Written by Rizky Andriawan, Solution Architect, Alibaba Cloud Indonesia

TL;DR — An AI agent doesn't just answer; it acts. It runs code, drives browsers, touches files. So it needs a workspace. The shift nearly everyone building agent infrastructure landed on over the last two years isn't that "agents are stateless." It's that agents flip the default. Humans get persistence by default; agents get isolation by default, persistence by exception. The agent's execution environment should be disposable, while its memory, identity, and artifacts are deliberately kept somewhere else and durable. This piece is about why that flip is the right call, and how Alibaba Cloud's AgentRun turns it into a workspace you can use without standing up a single server.

Disposable where it works, durable where it remembers.

Start with What an Agent Is Actually Doing

A chatbot answers. An agent acts.

Ask a chatbot a question and it writes text back, and nothing changes in the world. Give an agent a task, like "analyze this spreadsheet and chart the outliers," "fix the failing test in this repo," or "open this site and pull the three cheapest listings," and it has to take actions: run code, write files, drive a browser, read the result, decide what to do next. Over and over, in a loop.

That single difference, acting instead of answering, creates a need a chatbot never had: somewhere to do the work. A place to run that code. A scratch disk. A browser. A shell. Call it a workspace.

So the real question is this: what kind of workspace do you give something like this? The intuitive answer, "the same kind we give a developer," turns out to be exactly backwards. And seeing why is the fastest way to understand what a platform like AgentRun is for.

A chatbot answers in a single step. An agent runs a loop inside a sealed workspace, then acts on the world.

The Default Flip

Think about your own dev environment, your laptop or a VM you SSH into. You set it up once. You install your tools, leave files lying around, and come back tomorrow to find it all still there. That's the human default: persistence by default, isolation by exception. You only reach for a sandbox or a clean VM in the rare case you're handling something you don't trust.

An agent wants the inverse: isolation by default, persistence by exception. Every task runs in its own clean, sealed environment, and anything that needs to survive the task is deliberately pushed outside that environment.

The word doing a lot of quiet work here is "workspace," so let's pin it down. The thing that should be disposable is the agent's execution environment, where it runs code, drives a browser, and writes temp files. The things that should not be disposable, like its memory ("Acme is a stalled deal"), its identity and permissions, the artifacts it produces, and the audit trail of what it did, don't live in the workspace at all. They live outside it, durable, on purpose, so that the workspace can be thrown away without taking them along.
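The split described above can be sketched in a few lines of Python. This is only an illustration of the idea, not AgentRun's API: the function name `run_task`, the file names, and the use of a local temporary directory are all assumptions, and a temporary directory is a stand-in for a real sandbox rather than a security boundary.

```python
import json
import tempfile
from pathlib import Path


def run_task(durable_dir: Path, task_id: str) -> tuple[Path, Path]:
    """Run one task in a throwaway directory; keep only what is written outside it.

    Returns (path of the durable record, path of the now-deleted workspace).
    """
    durable_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory(prefix="agent-task-") as tmp:
        workspace = Path(tmp)
        # Scratch work: lives and dies with the workspace.
        (workspace / "scratch.tmp").write_text("intermediate output")
        # Persistence by exception: the memory is written outside the workspace.
        record = durable_dir / f"{task_id}.json"
        record.write_text(
            json.dumps({"task": task_id, "memory": "Acme is a stalled deal"})
        )
    # Leaving the `with` block deletes the workspace and everything in it.
    return record, workspace
```

After `run_task` returns, the scratch file no longer exists, but the JSON record does, because it was deliberately written somewhere the cleanup never touches.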

That separation is the whole trick. Hold the comparison at the level of the execution environment, the place the agent actually does things, and the contrast is stark:

Aspect	A human's dev environment	An agent's execution workspace
Lifespan	Months. You return to it.	One task, seconds to minutes, then gone.
What persists in it	Everything, by default.	Nothing, by default. Durable state lives outside it.
Identity	The machine is yours.	The workspace is anonymous; identity and permissions are attached per task, from outside.
Provisioning	Set up once, cost amortized.	Materialize on demand, vanish when done.
Concurrency	One human, one machine.	Thousands firing at once, then none.
Trust	You trust yourself.	Untrusted by construction. It runs code it wrote, on inputs (web pages, documents) that can hide instructions.

This is the exact gap AgentRun is built to fill. Instead of you provisioning and maintaining servers for your agents to act in, it hands each agent a fresh, isolated, disposable execution workspace on demand, and bills you only while the agent is actually working. The rest of this piece is really just why that's the right design, one property at a time.

Why Disposable Execution Is Actually Better, and How AgentRun Delivers It

"Disposable" sounds like a compromise. For execution, it's an upgrade, and each reason maps onto something AgentRun does:

A guaranteed clean slate, so behavior is reproducible. Every task starts from the same known-good baseline. No drift, no leftovers from the last run quietly changing the outcome. AgentRun creates each sandbox fresh per task and releases it when it goes idle, so an agent never inherits yesterday's mess. (This is also why "works on Tuesday, breaks on Friday" mostly disappears, because Friday starts identical to Tuesday.)
A bounded blast radius, in space and in time. The agent fills the disk, runs a sketchy command it picked up from a poisoned web page, or deletes its own files? It's sealed off from everything else and getting destroyed shortly regardless. AgentRun runs each sandbox in its own isolated environment with its own filesystem and process space, and caps how long any instance can live (a single sandbox lasts at most a few hours, and idle ones are reclaimed far sooner). Disposability shrinks a potential incident into a non-event.
No contamination between tasks or users. If a workspace persisted and got reused, one task's leftover credentials or data could surface in the next, maybe someone else's. Per-task sandboxes make that whole class of leak structurally impossible, not just unlikely.
Cost follows the work, not the calendar. A human's idle laptop is fine, because there's one human. A million idle agent machines is a serious bill. Agents are spiky and mostly waiting, whether for a model to reply or a page to load. Because AgentRun sits on Alibaba Cloud's serverless platform (Function Compute), there's nothing to pre-provision. Workspaces appear on demand and you pay per use, not for idle capacity.
Elasticity with no fleet to maintain. A thousand agents wake at once, a thousand workspaces spin up, and a minute later they're gone. No capacity planning, no machines to patch and clean up. The serverless substrate scales with the workload.
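The "bounded in time" property in the list above comes down to two rules: a hard cap on total lifetime and a much shorter idle timeout. The sketch below shows that reclaim decision in isolation. The class name `SandboxLease`, its fields, and any numeric values passed to it are hypothetical; AgentRun's actual limits and internals are not shown here.

```python
from dataclasses import dataclass


@dataclass
class SandboxLease:
    """Timing state for one sandbox (all times in seconds on the same clock)."""

    created_at: float
    last_active_at: float
    max_lifetime_s: float  # hard cap on total life (hypothetical value)
    idle_timeout_s: float  # reclaim after this much inactivity (hypothetical value)

    def should_reclaim(self, now: float) -> bool:
        """Return True if the sandbox is past its lifetime cap or has gone idle."""
        too_old = now - self.created_at >= self.max_lifetime_s
        idle = now - self.last_active_at >= self.idle_timeout_s
        return too_old or idle
```

Either condition alone is enough to reclaim the sandbox, which is what keeps a misbehaving task from holding a workspace open indefinitely.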

Notice the threat model and the design line up almost by definition. The danger is untrusted code running on untrusted inputs. The answer, isolated so it can't reach out, disposable so nothing it does sticks, is practically the negation of that danger.

So That's How Agents Work

Here's the payoff for anyone who's found agents a little mysterious: the workspace gives away the secret.

An agent isn't an oracle that knows the answer. It's a loop that tries things in a workspace, reads what happened, and tries again. The only reason it needs a disposable sandbox at all is that it will make a mess, running code that errors, installing the wrong thing, taking a wrong turn, and you want that mess sealed off and discarded. The reason its memory lives outside the workspace is the same reason: so the messy part can be thrown away while the learned part is kept. Trial-and-error happens in a cell you discard; the lessons are filed in a store you keep.

Picture an AI agent less as a genius and more as a very fast, tireless intern you've handed a sealed room, a computer, and exactly one task. You rebuild that room from scratch for the next job, but you keep its notes. That's not just a metaphor. It's roughly the architecture.

An agent works by trial and error: plan, run, fail, retry, succeed.
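The loop can be sketched as follows, under some loud assumptions. The list `attempts` stands in for the model's successive proposals (a real agent would generate each new attempt from the previous error), the `lessons` list stands in for durable memory kept outside the workspace, and a local temporary directory plus a subprocess stands in for the sandbox. None of these names come from AgentRun.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_workspace(code: str, workdir: Path) -> subprocess.CompletedProcess:
    """Write the candidate code into the workspace and run it there."""
    script = workdir / "attempt.py"
    script.write_text(code)
    return subprocess.run(
        [sys.executable, str(script)],
        cwd=workdir, capture_output=True, text=True, timeout=30,
    )


def agent_loop(attempts: list[str]) -> tuple[str, list[str]]:
    """Try each candidate until one succeeds; return its output and the lessons kept."""
    lessons: list[str] = []  # kept after the workspace is gone
    with tempfile.TemporaryDirectory() as tmp:
        for i, code in enumerate(attempts, start=1):
            result = run_in_workspace(code, Path(tmp))
            if result.returncode == 0:
                return result.stdout.strip(), lessons
            # Read what happened: keep the last line of the error as a lesson.
            lines = result.stderr.strip().splitlines()
            lessons.append(f"attempt {i} failed: {lines[-1] if lines else 'unknown error'}")
    raise RuntimeError("no attempt succeeded")
```

Calling `agent_loop(["print(1/0)", "print(6*7)"])` fails once, records the error, and succeeds on the second try. The failed script and its directory are discarded; only the output and the lesson survive.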

One Concrete Task, Start to Finish

Make it real. A sales-ops agent gets told: "pull last week's pipeline report and flag the stalled deals."

It spins up a fresh workspace. It opens a browser, logs into your CRM, and downloads the report. It runs a few lines of Python to find deals untouched for 14 days. It writes a short summary. Then it exits, and the workspace is destroyed.
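The "few lines of Python" in that step might look something like this. It is a sketch only: the CSV layout, the column names `deal` and `last_activity`, and the function name `stalled_deals` are assumptions made for illustration, since the article does not say what the CRM report looks like.

```python
import csv
import io
from datetime import date, timedelta


def stalled_deals(report_csv: str, today: date, stall_days: int = 14) -> list[str]:
    """Return the names of deals with no activity for at least `stall_days` days."""
    cutoff = today - timedelta(days=stall_days)
    stalled = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        last_activity = date.fromisoformat(row["last_activity"])
        if last_activity <= cutoff:
            stalled.append(row["deal"])
    return stalled


# Hypothetical report contents, standing in for the downloaded file.
REPORT = "deal,last_activity\nAcme,2024-05-20\nGlobex,2024-06-10\nInitech,2024-06-01\n"
print(stalled_deals(REPORT, today=date(2024, 6, 15)))  # ['Acme', 'Initech']
```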

Now look at what didn't happen:

No browser left logged in for the next run to inherit.
No half-installed Python package left behind on a long-lived machine.
No CRM credentials sitting inside a VM that's powered on 24/7, waiting to be stolen.
No "it worked last Tuesday" drift, because this Friday's run started from the identical clean slate Tuesday's did.

And the things you do want to keep, like the summary it produced, the audit log of every click, and its memory that "Acme has gone quiet," were written outside the workspace, on purpose, so the workspace could vanish without dragging them down with it.

That's the default flip in a single task: the execution is disposable, the results and the memory are not. Once you see it in one workflow, you see it everywhere.

Meet AgentRun: The Flip, as a Managed Service

This is what AgentRun packages. It's tempting to call it an "all-in-one agent box," but that undersells the design, which keeps distinct things distinct rather than lumping them together:

Easy to start: a single form to define your agent's model, prompt, and tools — no infrastructure to manage.

The execution sandbox, the disposable workspace itself. AgentRun's AIO Sandbox ("All-In-One") bundles the three capabilities an agent actually needs into one isolated environment: a headless browser, a code interpreter, and an interactive terminal plus filesystem. Alibaba aptly calls this the agent's "eyes, brain, and hands."

AgentRun offers five sandbox templates — each purpose-built for a different class of agent work.

A dedicated browser plane, the Browser Sandbox: a cloud headless browser drivable with standard tools (Playwright or Puppeteer over the DevTools protocol), with a built-in live view over VNC so you can literally watch the agent click through a site in real time. When you're debugging "what exactly did my agent just do," that's the difference between a black box and a glass one.

Creating an AIO Sandbox: pick your resources, your browser, your runtime — the workspace materializes on demand and vanishes when the task is done.

Model access and governance, which models an agent may call, and under what limits.

Model governance: control which models an agent may call, and under what limits.

A memory and state plane, the durable half of the flip, deliberately outside the disposable workspace.

Memory lives outside the disposable workspace — durable, searchable, and deliberately separate.

An observability and control plane, the trace of what the agent did and why.

The observability layer: every invocation tracked, every resource unit measured.

The disposable workspace is the piece this whole article is about. The other layers exist precisely so that piece can be disposable. AgentRun's job is to wire them together so you don't have to, while keeping the boundaries clean enough that you still know which layer is doing what.

Disposable in the middle, durable on the edges, all on one serverless platform.

Extend agent capabilities with a marketplace of MCP tools and cloud-native skills — from Playwright to RDS Copilot.

Easy to manage: every agent runtime visible and controllable from one place.

But Is It Safe to Run AI-Written Code in the Cloud?

The obvious worry: if an agent runs untrusted code and thousands share hardware, can't one break out and reach another customer? It's the right question, and the honest answer is neither "it's perfectly sealed" nor "anything goes."

These workspaces don't sit on a bare shared machine. They ride on the same lightweight-VM-class isolation (secure containers) that already separates millions of multi-tenant serverless workloads. That's a strong, battle-tested boundary, not a fresh and fragile one. But "strong" isn't "perfect." No isolation is unbreakable, researchers do occasionally find cracks, and agents add a genuinely new attack surface that no sandbox closes: prompt injection, where the untrusted input is the agent's instructions, smuggled in through a web page or a document it reads.

So the real posture is defense in depth: strong isolation, plus disposability (nothing persists to be stolen later), plus controlled egress (it can't call out to arbitrary destinations), plus least-privilege credentials. Disposability isn't the entire security story, but it's the part that turns a reckless move by the agent from a standing liability into a thirty-second problem.
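Controlled egress is the layer that is easiest to picture in code. The sketch below is an application-level illustration of an allow-list check, assuming hypothetical host names in `ALLOWED_HOSTS`; in practice egress is usually enforced at the network layer, and nothing here describes how AgentRun itself configures it.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: the only hosts this task is permitted to reach.
ALLOWED_HOSTS = frozenset({"crm.example.com", "api.example.com"})


def egress_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests whose exact host is on the allow-list."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    # `hostname` excludes any userinfo and port, so look-alike URLs such as
    # https://crm.example.com@evil.example.net/ resolve to the real host.
    host = (parts.hostname or "").lower()
    return host in allowed_hosts
```

Matching on the exact host, rather than on a substring or suffix, is the design choice that matters: it rejects names like `crm.example.com.evil.example.net` that an injected instruction might try.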

Why "on Function Compute" Matters

Because it all sits on a serverless platform, there's nothing to reserve, scale, or remember to switch off. Workspaces are created on demand, released automatically when idle, capped at a few hours of life, and billed per use. The spiky, mostly-waiting shape of agent work, which would be ruinously expensive as a fleet of always-on VMs, is exactly what serverless was built to absorb. You bring the agent logic, and AgentRun provides infrastructure that is there only when it's working.
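A little arithmetic shows why the "mostly waiting" shape matters. The sketch below sums the time a workspace is actually in use within a window. The intervals are invented for illustration, no real prices are assumed, and the exact billing granularity of AgentRun or Function Compute is not modeled.

```python
def busy_seconds(intervals: list[tuple[float, float]]) -> float:
    """Sum the seconds a workspace was actually in use."""
    return sum(end - start for start, end in intervals)


def utilization(intervals: list[tuple[float, float]], window_s: float) -> float:
    """Fraction of the window during which the workspace was in use."""
    return busy_seconds(intervals) / window_s


# Hypothetical hour: three short bursts of work, as (start, end) in seconds.
bursts = [(0, 30), (600, 640), (3000, 3020)]
print(busy_seconds(bursts))       # 90
print(utilization(bursts, 3600))  # 0.025
```

In this made-up hour the workspace is busy for 90 seconds out of 3,600. An always-on VM would be paid for across the whole hour, while a per-use model tracks something much closer to the 90 seconds.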

What to Take from This

If you're building with agents, stop thinking about servers and start thinking about two things kept apart on purpose: disposable execution and durable everything-else. Don't run agent-written code next to your app, and don't stand up a VM fleet to avoid it either. With AgentRun the disposable execution workspace is a managed thing you can just call, with the memory, governance, and audit trail wired in around it.

And notice the timing. This shape barely existed as a product category two years ago, and now it's everywhere, arrived at independently by teams that never compared notes. When that happens, it's usually the problem telling you the answer's shape is right, and that the disposable execution cell is becoming the default ground agent workloads run on, the way containers quietly became the default for everything before them. AgentRun is Alibaba Cloud's bet on being that ground.

The agents get the headlines. But the unglamorous box they work inside, built for one task, thrown away after, with the important things kept safely outside it, might be the part that actually made them work.

Click here to build a tiny agent on AgentRun!
