Run serverless GPU workloads with fast cold starts on bare-metal servers, anywhere in the world
- Run serverless workloads using a friendly Python interface
- Autoscaling and automatic scale-to-zero
- Read large files at the edge using distributed, cross-region storage
- Connect bare-metal nodes to your cluster with a single cURL command
- Manage your fleet of servers using a Tailscale-powered service mesh
- Securely run workloads with end-to-end encryption through WireGuard
Add an endpoint decorator to your code, and you'll get a load-balanced HTTP endpoint (with auth!) to invoke your code.
You can also run long-running functions with @function, deploy task queues using @task_queue, and schedule jobs with @schedule:
from beta9 import endpoint

# This will run on a remote A100-40 in your cluster
@endpoint(cpu=1, memory=128, gpu="A100-40")
def square(i: int):
    return i**2
Deploy with a single command:
$ beta9 deploy app.py:square --name inference
=> Building image
=> Using cached image
=> Deployed 🎉
curl -X POST 'https://inference.beam.cloud/v1' \
-H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
-H 'Content-Type: application/json' \
-d '{}'
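The same invocation can be made from Python. The following is a minimal sketch using only the standard library, mirroring the cURL call above (same URL, headers, and empty JSON body). The helper names `build_invoke_request` and `invoke` are hypothetical, and treating the response as JSON is an assumption; the source does not state which payload keys or response format the endpoint uses.

```python
import json
import urllib.request
from typing import Any, Dict


def build_invoke_request(
    url: str, auth_token: str, payload: Dict[str, Any]
) -> urllib.request.Request:
    """Build a POST request equivalent to the cURL command (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
    )


def invoke(url: str, auth_token: str, payload: Dict[str, Any], timeout: float = 30.0) -> Any:
    """Send the request and decode the reply, assuming the endpoint returns JSON."""
    req = build_invoke_request(url, auth_token, payload)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network; call invoke() to send it.
request = build_invoke_request("https://inference.beam.cloud/v1", "[YOUR_AUTH_TOKEN]", {})
```

Separating request construction from sending makes it easy to inspect the headers and body before anything goes over the wire.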
Connect any GPU to your cluster with one CLI command and one cURL command.
$ beta9 machine create --pool lambda-a100-40
=> Created machine with ID: '9541cbd2'. Use the following command to set up the node:
#!/bin/bash
sudo curl -L -o agent https://release.beam.cloud/agent/agent && \
sudo chmod +x agent && \
sudo ./agent --token "AUTH_TOKEN" \
--machine-id "9541cbd2" \
--tailscale-url "" \
--tailscale-auth "AUTH_TOKEN" \
--pool-name "lambda-a100-40" \
--provider-name "lambda"
You can run this install script on your VM to connect it to your cluster.
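When provisioning several nodes, it can help to generate the agent invocation rather than edit it by hand. The sketch below assembles the same flags shown in the install script and quotes them for the shell. The function name `agent_command` is hypothetical, and the flag set is simply the one printed above; check the script generated by `beta9 machine create` for the values your cluster expects.

```python
import shlex
from typing import List


def agent_command(
    token: str,
    machine_id: str,
    pool_name: str,
    provider_name: str,
    tailscale_url: str = "",
    tailscale_auth: str = "",
) -> List[str]:
    """Return the agent invocation as an argument list (hypothetical helper)."""
    return [
        "sudo", "./agent",
        "--token", token,
        "--machine-id", machine_id,
        "--tailscale-url", tailscale_url,
        "--tailscale-auth", tailscale_auth,
        "--pool-name", pool_name,
        "--provider-name", provider_name,
    ]


# Values copied from the example output above; replace them with your own.
cmd = agent_command(
    token="AUTH_TOKEN",
    machine_id="9541cbd2",
    pool_name="lambda-a100-40",
    provider_name="lambda",
    tailscale_auth="AUTH_TOKEN",
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```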
Manage your distributed cross-region cluster using a centralized control plane.
$ beta9 machine list
| ID | CPU | Memory | GPU | Status | Pool |
|----------|---------|------------|---------|------------|-------------|
| edc9c2d2 | 30,000m | 222.16 GiB | A10G | registered | lambda-a10g |
| d87ad026 | 30,000m | 216.25 GiB | A100-40 | registered | gcp-a100-40 |
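For scripting against this output, the table can be parsed into dictionaries. This is a rough sketch that assumes the CLI prints a pipe-delimited table exactly like the one above; the function name `parse_machine_table` is hypothetical, and the CLI may offer other output formats that are easier to consume.

```python
from typing import Dict, List


def parse_machine_table(text: str) -> List[Dict[str, str]]:
    """Parse a pipe-delimited table into one dict per row (hypothetical helper)."""
    header: List[str] = []
    rows: List[Dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the command line and any other non-table text
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if all(set(cell) <= {"-"} for cell in cells):
            continue  # skip the |-----|-----| separator row
        if not header:
            header = cells  # the first table row holds the column names
            continue
        rows.append(dict(zip(header, cells)))
    return rows


sample = """
| ID       | CPU     | Memory     | GPU     | Status     | Pool        |
|----------|---------|------------|---------|------------|-------------|
| edc9c2d2 | 30,000m | 222.16 GiB | A10G    | registered | lambda-a10g |
| d87ad026 | 30,000m | 216.25 GiB | A100-40 | registered | gcp-a100-40 |
"""

machines = parse_machine_table(sample)
# Select the IDs of all machines that expose an A100-40 GPU.
a100_ids = [m["ID"] for m in machines if m["GPU"] == "A100-40"]
print(a100_ids)
```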
You can run Beta9 locally, or in an existing Kubernetes cluster using our Helm chart.
k3d is used for local development. You'll need Docker to get started.
To use our fully automated setup, run the setup make target.
make setup
The SDK is written in Python. You'll need Python 3.8 or higher. Use the setup-sdk make target to get started.
make setup-sdk
After you've set up the server and SDK, check out the SDK readme.
We welcome contributions big or small. These are the most helpful things for us:
- Submit a feature request or bug report
- Open a PR with a new feature or improvement
If you need support, you can reach out through any of these channels:
- Slack (Chat live with maintainers and community members)
- GitHub issues (Bug reports, feature requests, and anything roadmap related)
- Twitter (Updates on releases and more)