Users interact via Web/Slack/Teams
SSO or API Keys
Manages auth (OAuth2/OIDC); routes chat payloads
Parses chat into Canonical Spec (LLM services: Vertex/OpenAI/Bedrock)
Coordinates agents & enforces SLOs (Workflows/Step Functions)
Decouples async tasks (Pub/Sub, Event Grid, EventBridge)
Constraints & guardrails (Firestore/Cosmos/Dynamo)
Fetches feasible GPU/region combos; checks quotas & residency
Selects micro/macro benchmarks; sets batch size/precision configs
Runs tests on GKE/AKS/EKS or VMs; GPUs: H100/A100/L4 etc.
Pulls live data from billing APIs: Spot/RI pricing, quotas
Store KPIs, artifacts, cost curves (GCS/BigQuery, S3/Athena, Blob/Synapse)
Pareto analysis: cost vs performance; applies policies & sensitivity bands
Packs chosen plan + alternatives; evidence bundle included
Terraform, Bicep, CFN templates; deployment-ready configs
IAM, KMS, logs & approvals; ensures compliance
Targets K8s or VM fleets; supports canary & rollback
Golden signals: latency, errors, throughput (CloudWatch/Azure Monitor/Cloud Monitoring, formerly Stackdriver)
Watches price, quota, performance; triggers re-benchmark
Manages keys, images, registries (KMS, Sigstore, ECR/ACR)
IAM, network policy, SBOM; audits & image signing
by rahul
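
Two of the stages above, the feasibility check (quotas & residency) and the Pareto analysis (cost vs performance), can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Candidate` fields, the region names, the prices, the throughput figures, and the quota table are hypothetical and do not come from the diagram or from any real billing or benchmark data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One GPU/region option (hypothetical shape, for illustration only)."""
    gpu: str
    region: str
    cost_per_hour: float  # USD, made-up number
    throughput: float     # requests/s, made-up benchmark KPI


def is_feasible(c, allowed_regions, quota):
    """Keep a candidate only if residency allows its region and quota remains."""
    return c.region in allowed_regions and quota.get((c.gpu, c.region), 0) > 0


def pareto_front(candidates):
    """Return candidates that no other candidate beats on both cost and throughput."""
    front = []
    for c in candidates:
        dominated = any(
            o.cost_per_hour <= c.cost_per_hour
            and o.throughput >= c.throughput
            and (o.cost_per_hour < c.cost_per_hour or o.throughput > c.throughput)
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c.cost_per_hour)


# Hypothetical inputs: residency policy, remaining quota, benchmark results.
allowed_regions = {"us-east", "eu-west"}
quota = {
    ("H100", "us-east"): 8,
    ("A100", "us-east"): 16,
    ("L4", "us-east"): 32,
    ("A100", "eu-west"): 4,
}
candidates = [
    Candidate("H100", "us-east", 10.0, 1000.0),
    Candidate("A100", "us-east", 5.0, 600.0),
    Candidate("L4", "us-east", 1.0, 150.0),
    Candidate("A100", "eu-west", 6.0, 580.0),    # costlier and slower than A100 us-east
    Candidate("H100", "ap-south", 9.0, 1000.0),  # region not allowed by residency
]

feasible = [c for c in candidates if is_feasible(c, allowed_regions, quota)]
front = pareto_front(feasible)

for c in front:
    print(f"{c.gpu:5s} {c.region:8s} ${c.cost_per_hour:.2f}/h  {c.throughput:.0f} req/s")
```

The surviving options are the ones a recommender could present as the chosen plan plus alternatives. Policies and sensitivity bands would be applied on top of this frontier and are not modeled here.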