CentML provides dedicated LLM endpoints, general inference, and compute deployments — all backed by optimized infrastructure. No need to worry about GPU provisioning, scaling, or maintenance — just log in and deploy.

1. Log in to the NVIDIA CCluster

To get started, navigate to https://app.centml.com in your browser and log in with your credentials.
If you need guidance on creating an account, please visit the Creating an Account documentation. Once logged in, you will see the NVIDIA CCluster console home page.

2. Create a Bearer Token

To interact with CentML endpoints programmatically, you need a Bearer Token. Follow the Managing Vault Objects documentation to generate one.
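As a minimal sketch of how the token is used, the snippet below attaches it to a request via the standard `Authorization: Bearer` header. The endpoint URL and the `CENTML_API_KEY` environment variable name are illustrative assumptions, not values defined by CentML; substitute the URL and token from your own console.

```python
import os
import urllib.request

# Hypothetical endpoint URL -- replace with the URL shown for your deployment.
API_URL = "https://api.centml.example/v1/chat/completions"

# Read the Bearer Token from an environment variable rather than hard-coding it.
token = os.environ.get("CENTML_API_KEY", "demo-token")

def build_request(url: str, token: str) -> urllib.request.Request:
    """Build a POST request carrying the Bearer Token in the Authorization header."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request(API_URL, token)
```

Sending the request (e.g. with `urllib.request.urlopen(req, data=...)`) then authenticates against the endpoint with your token.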

3. Deploy Your First Model

Choose the deployment type that fits your use case:
  • LLM Serving — Deploy dedicated public or private LLM endpoints tailored to your performance requirements and budget.
  • General Inference — Deploy custom containerized models on CentML infrastructure.
  • Compute — Provision GPU compute for training, fine-tuning, or batch workloads.
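For the LLM Serving option, a request body can be built as in the sketch below. This assumes the endpoint exposes an OpenAI-style chat-completions schema; the model name and field values are placeholders, and the exact schema for your deployment is shown in the console.

```python
import json

# Illustrative request body for a dedicated LLM endpoint (OpenAI-style schema
# assumed; model name is a placeholder, not a CentML-specific identifier).
payload = {
    "model": "example-org/example-model",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

# Serialize to UTF-8 bytes, ready to send as the POST body.
body = json.dumps(payload).encode("utf-8")
```

The resulting `body` is what you would pass as the request data alongside the Bearer Token header from step 2.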

Additional Support: Billing, Sales, and Technical

For billing or sales support, reach out to sales@centml.ai. You can also file a support request by following our Requesting Support guide; requests are not limited to billing and sales and can cover technical issues as well. Please don't hesitate to reach out. We're here to help!

What’s Next