Your Central Control Plane for Every AI System.
The Manage console is your complete dashboard for every AI system in your Prediction Guard deployment. Monitor health, manage models, configure API keys, connect MCP servers, and control advanced settings all from one place.
Everything About a System, One Click Away
Click Manage on any system card to open its full management dashboard. Every system card shows you health status, active API keys, deployed models, and connected MCP servers at a glance.
- System Status: Real-time health state per system (Healthy, Never Connected, or Degraded), with last-update and created timestamps
- API Keys: Create and manage scoped API keys for secure access to your system's endpoints, directly from the dashboard
- MCP Servers: Configure Model Context Protocol server connections for tool use and external integrations, per system
- Advanced Settings: Update resource limits, networking, cluster-specific options, and public API endpoint configuration
Every Model Type. One Governed Namespace.
From the Models section of any system dashboard, add and manage all three model types. Regardless of where a model is hosted, it automatically inherits your system's governance policies, PII safeguards, injection protection, and MCP integrations.
- Private Models: Run curated, scanned, and safety-tested model weights inside your own infrastructure, with full control over data residency, replicas, GPU allocation, and token limits
- Managed Models: Hosted by Prediction Guard in SOC 2 compliant infrastructure. No setup required; available instantly via your system's API endpoint
- External Models: Connect to OpenAI, Anthropic, Google, AWS Bedrock, Azure, and GCP through their native APIs, all governed centrally through your system
- Every model in the Prediction Guard catalog is curated, scanned, and tested before it becomes available; no arbitrary weights from public registries
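
To make the inheritance idea above concrete, here is a minimal conceptual sketch in Python. It is not Prediction Guard's API or data model: `ModelType`, `SystemPolicies`, `System`, `add_model`, and `effective_policies` are hypothetical names chosen for illustration. It only shows the rule described above, that a model's hosting type does not change which system-level policies apply to it.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class ModelType(Enum):
    """The three model types named above."""
    PRIVATE = "private"
    MANAGED = "managed"
    EXTERNAL = "external"


@dataclass(frozen=True)
class SystemPolicies:
    """Hypothetical stand-in for a system's governance settings."""
    pii_safeguards: bool
    injection_protection: bool


@dataclass
class System:
    """Hypothetical system that owns one namespace of models."""
    name: str
    policies: SystemPolicies
    models: Dict[str, ModelType] = field(default_factory=dict)

    def add_model(self, model_name: str, model_type: ModelType) -> None:
        # Register the model in the system's namespace.
        self.models[model_name] = model_type

    def effective_policies(self, model_name: str) -> SystemPolicies:
        # Policies come from the system, never from the model type.
        if model_name not in self.models:
            raise KeyError(f"unknown model: {model_name}")
        return self.policies
```

In this sketch, adding a private, a managed, and an external model to the same `System` and calling `effective_policies` on each returns the same `SystemPolicies` object.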

Full Infrastructure Control for Private Models
When adding a private model, browse the curated Prediction Guard catalog and configure exactly how it runs in your infrastructure, from replicas and GPU allocation down to token limits and supported capabilities.
- General Settings: Configure model name, number of replicas, Kubernetes runtime class, and model image (use the default Prediction Guard image or one from your own custom registry)
- Resource Parameters: Set CPU millicores (2000–16000), memory in GB, accelerator card count and type, and hugepages for memory optimization
- Token & Batch Limits: Define min/max input tokens, max total tokens per request, and max concurrent client batches
- Capabilities & Aliases: Toggle streaming, tool use, reasoning, and image input; set alternative model names for API access
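
The settings listed above can be pictured as a small configuration record with a few sanity checks. The sketch below is illustrative only: `PrivateModelConfig` and its field names are hypothetical, the default values are placeholders, and the token-ordering checks are assumptions about what a sensible configuration looks like. Only the 2000–16000 CPU millicore range comes from the list above.

```python
from dataclasses import dataclass, field
from typing import List

# CPU range stated in the Resource Parameters item above.
CPU_MILLICORES_MIN = 2000
CPU_MILLICORES_MAX = 16000


@dataclass
class PrivateModelConfig:
    """Hypothetical record of private-model settings (defaults are placeholders)."""
    name: str
    replicas: int = 1
    cpu_millicores: int = 2000
    memory_gb: int = 16
    accelerator_count: int = 1
    min_input_tokens: int = 1
    max_input_tokens: int = 4096
    max_total_tokens: int = 8192
    streaming: bool = True
    tool_use: bool = False
    aliases: List[str] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means no issues found."""
        errors = []
        if self.replicas < 1:
            errors.append("replicas must be at least 1")
        if not CPU_MILLICORES_MIN <= self.cpu_millicores <= CPU_MILLICORES_MAX:
            errors.append(
                f"cpu_millicores must be between {CPU_MILLICORES_MIN} "
                f"and {CPU_MILLICORES_MAX}"
            )
        # Assumed constraints: min <= max input, and input fits within the total.
        if self.min_input_tokens > self.max_input_tokens:
            errors.append("min_input_tokens exceeds max_input_tokens")
        if self.max_input_tokens > self.max_total_tokens:
            errors.append("max_input_tokens exceeds max_total_tokens")
        return errors
```

For example, `PrivateModelConfig(name="example-model", cpu_millicores=1000).validate()` would report the out-of-range CPU value, while the placeholder defaults pass every check.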

Complete Control Plane for Your AI Infrastructure
Ready to Own Your AI Stack?
See how Prediction Guard gives you full sovereignty over every model, agent, and API key in your organization.