Plan, optimize, and deploy LLMs effortlessly
Deploy dedicated LLM endpoints that fit your performance requirements and budget in just three steps.
Select or enter the Hugging Face model name of your choice and provide your Hugging Face token. Also provide a name for the dedicated endpoint you are going to deploy.
Make sure you have been granted access to the model you selected. If not, go to https://huggingface.co/ and request access.
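If you are unsure whether your token can reach a gated model, one quick way to check is with the huggingface_hub Python package. This is a minimal sketch; the model name and token below are placeholders, not values from your deployment:

```python
from huggingface_hub import model_info
from huggingface_hub.utils import GatedRepoError, RepositoryNotFoundError

# Example model name -- replace with the model you plan to deploy.
MODEL = "meta-llama/Llama-3.1-8B-Instruct"

try:
    info = model_info(MODEL, token="<your_hf_token>")
    print(f"Access OK: {info.id}")
except GatedRepoError:
    print("Model is gated -- request access on huggingface.co first.")
except RepositoryNotFoundError:
    print("Model not found, or your token cannot see it.")
```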
Choose the cluster or region where you want to deploy the model. Based on your choice, CentML presents three pre-configured deployment configurations to suit different requirements:
Each configuration is accompanied by a detailed analysis of:
These insights help you choose the configuration that best meets your needs.
For advanced users, the CentML Platform also offers the option to customize the model’s performance configuration. Simply click the “Custom” configuration to gain full control over several tunable parameters.
Finally, click “Deploy”. Once the deployment is ready, typically within a few minutes, copy the endpoint URL and go to https://<endpoint_url>/docs to find the list of API endpoints for your LLM deployment. We offer API compatibility with CServe, OpenAI, and Cortex, making integration with other applications seamless.
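As a sketch of what that integration can look like, here is a minimal example that calls an OpenAI-compatible chat completions route with the official openai Python client. The base URL path, API key, and model name are assumptions; substitute the values from your own deployment:

```python
from openai import OpenAI

# Point the OpenAI client at your dedicated endpoint.
# <endpoint_url>, the API key, and the model name are placeholders.
client = OpenAI(
    base_url="https://<endpoint_url>/v1",
    api_key="<your_api_key>",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # the model you deployed
    messages=[{"role": "user", "content": "Hello! What can you do?"}],
    max_tokens=128,
)

print(response.choices[0].message.content)
```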
For more details on how to use the LLM deployment, please refer to the examples we’ve prepared.
Dive into how CentML can help optimize your Model Integration Lifecycle (MILC).
Learn how to interact with the CentML platform programmatically.
Learn more about the CentML platform’s pricing.
Learn how to create private inference endpoints.
Submit a Support Request.
Learn how agents can interact with CentML services.