On-demand GPU instances with SSH access let you run, test, and experiment with any GPU-accelerated AI workload seamlessly.

1. Select a base image

Spin up a compute instance by choosing one of the available base images:
  • PyTorch — the latest NVIDIA NGC PyTorch image, preloaded with PyTorch and CUDA libraries.
  • Ubuntu — a vanilla Ubuntu image for a minimal starting point. On full GPU instances, NVIDIA drivers are included via driver passthrough. On MIG instances, NVIDIA drivers are not available yet.
Enter your SSH public key to configure access to the instance, select a GPU instance type, and click Deploy.
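If you don't already have an SSH key pair, you can generate one locally before deploying. A minimal sketch (the key file name and comment below are just examples):

```shell
# Create an Ed25519 key pair; the file name "centml_gpu" is an example.
# -N "" skips the passphrase for brevity; consider setting one in practice.
mkdir -p ~/.ssh
ssh-keygen -t ed25519 -f ~/.ssh/centml_gpu -N "" -C "centml-gpu"

# Print the public key; paste this into the deploy form.
cat ~/.ssh/centml_gpu.pub
```

The printed line (starting with `ssh-ed25519`) is what goes into the SSH public key field.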

2. SSH into the instance

Once the instance is ready, navigate to the deployment details page. The Endpoint Configuration section displays:
  • Endpoint URL — the hostname for your instance. Next to it are the copy button (copies the URL) and the SSH button (copies ssh root@<endpoint_url> so you can paste it directly into your terminal).
To connect, use the SSH command with the root user:
ssh root@<endpoint_url>
The instance comes preloaded with the libraries included in your selected base image. For PyTorch instances, CUDA libraries are bundled in the NGC image. For Ubuntu instances on full GPU hardware, NVIDIA drivers are available; on MIG instances, NVIDIA drivers are not available. Additional packages and libraries can be installed with your preferred package manager.
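To avoid retyping the connection details, you can add a host alias to your local SSH configuration. A sketch of a `~/.ssh/config` entry (the alias name and key path are examples; substitute your instance's endpoint for the placeholder):

```
Host centml-gpu
    HostName <endpoint_url>
    User root
    IdentityFile ~/.ssh/centml_gpu
```

With this entry in place, `ssh centml-gpu` connects directly.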

What’s Next

LLM Serving

Explore dedicated public and private endpoints for production model deployments.

Clients

Learn how to interact with the NVIDIA CCluster programmatically.

Resources and Pricing

Learn more about the NVIDIA CCluster’s pricing.

Private Inference Endpoints

Learn how to create private inference endpoints.

Submit a Support Request

Learn how to submit a support request.

Agents on CentML

Learn how agents can interact with CentML services.