Deploying Custom Models
How to build your own containerized inference engine and deploy it
The CentML Platform simplifies deploying your custom models and AI applications. Package your model or application as a containerized HTTP server, then deploy it on CentML using the General Inference deployment option.
Here’s an example that serves a stable diffusion model with a FastAPI-based inference server: https://github.com/CentML/codex/tree/main/general-apps/general-inference/stable-diffusion/centml-endpoint
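For reference, a minimal sketch of such a server might look like the following (the route name, request schema, and placeholder response are illustrative; the linked repository contains a complete implementation):

```python
# main.py — minimal sketch of a FastAPI inference server.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class GenerateRequest(BaseModel):
    prompt: str

@app.post("/generate")  # hypothetical route; expose whatever your app needs
def generate(req: GenerateRequest):
    # Placeholder for the actual inference call; the linked example
    # runs stable diffusion here.
    return {"output": f"result for: {req.prompt}"}
```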
By default, Docker containers run as the root user. For security reasons, however, the CentML Platform does not allow containers to run as root. To address this, build your container image to run as a non-root user with an explicit numeric UID.
Here’s an example of how you can configure this in a Dockerfile:
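(A minimal sketch — the base image, file names, and port are placeholders for your own application.)

```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# Create a non-root user and switch to it by numeric UID, so the
# platform can verify the container does not run as root.
RUN useradd --uid 1000 --create-home appuser
USER 1000

EXPOSE 8080
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
```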
If you are building the container on macOS, make sure to set the image platform to linux/amd64.
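For example (the image name and tag are placeholders):

```bash
docker build --platform linux/amd64 -t your-dockerhub-username/my-inference-server:latest .
```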
After building the container, push it to a public container registry, such as Docker Hub. Then, navigate to the General Inference option in the CentML Platform and enter the image name, tag, and container port to deploy your application.
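For example, to push to Docker Hub (again with placeholder names):

```bash
# Authenticate first with `docker login`, then push the image.
docker push your-dockerhub-username/my-inference-server:latest
```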
What’s Next
Agents on CentML
Learn how agents can interact with CentML services.
Clients
Learn how to interact with the platform programmatically.
Resources and Pricing
Learn more about CentML Pricing
Get Support
Submit a Support Request
CentML Serverless Endpoints
Dive deeper into advanced Serverless configurations and patterns.
LLM Serving
Explore dedicated public and private endpoints for managing production models and model infrastructure.