Tag: multi-model

3 articles

docker llm cost-optimization ai multi-model gpu-containers model-deployment inference

Containerized AI Workloads: Multi-Model Management with Docker

May 21, 2026 · 12 min read

Discover how Docker enables efficient multi-model AI workload management with GPU acceleration and automatic scaling.

docker multi-model dockerai gpu-containers ai model-deployment

Atlas Engine: Sub-2-Minute Cold Start for Multi-Model Orchestration on DGX Spark

May 10, 2026 · 7 min read

Cold-start 3 specialised LLMs on a single DGX Spark in under 2 minutes, with 100+ tok/s throughput. Covers the production orchestration patterns behind it.

atlas nvidia multi-model llm inference qwen

Unified LLM Power: Integrating Public and Private APIs with LiteLLM for GraphWiz.AI

March 20, 2026 · 4 min read

Professional guide to implementing LiteLLM proxy for multi-provider LLM integration in GraphWiz.AI, featuring production deployment, cost optimization, and advanced routing strategies.

llm ai api-proxy multi-model cost-optimization ai-infrastructure