Handling [STARTED] my-application
... ← minutes of silence
preparing upgrade for my-application
Two compounding bottlenecks within the RTFD (Runtime Fabric Deployment daemon) container could be contributing to the delays in bulk app deployments:
RTFD processes deployment commands serially — one at a time. If the RTFD container CPU limit is set too low for the number of applications the cluster manages, the Linux kernel CFS scheduler will throttle RTFD whenever it reaches that ceiling. A single deployment that should take 5-8 seconds may take several minutes when RTFD is CPU-starved. Every subsequently queued deployment waits behind it, causing backlogs that compound with each additional concurrent deployment request.
This is the primary cause of the queue wait delay observed in RTFD logs. A CPU limit of 200m, whether left at the default or set manually, is insufficient for clusters managing 50 or more applications under load.
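One way to check whether throttling is actually occurring is to read the CFS counters in the container's cgroup `cpu.stat` file. The following is a minimal sketch, not an official RTF diagnostic. It assumes you can obtain the file contents from inside the RTFD container (commonly `/sys/fs/cgroup/cpu.stat` on cgroup v2; the path differs on cgroup v1). The sample values are invented for illustration.

```python
def parse_cpu_stat(text: str) -> dict:
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_fraction(stats: dict) -> float:
    """Fraction of CFS enforcement periods in which the cgroup hit its CPU limit."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


# Illustrative sample only; these numbers do not come from a real RTFD container.
sample = """\
usage_usec 120000000
nr_periods 1000
nr_throttled 650
throttled_usec 48000000
"""

stats = parse_cpu_stat(sample)
print(f"throttled in {throttled_fraction(stats):.0%} of periods")
```

A fraction that stays well above zero while deployments are queued suggests the CPU limit is the constraint. Compare two readings taken a few minutes apart, since the counters are cumulative.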
When HTTPRoutes are not enabled, RTFD performs 4 sequential Kubernetes API calls per application to patch Ingress objects on every deployment operation. Each call takes up to approximately 7 seconds, adding approximately 22-28 seconds of overhead per app per deploy. This overhead is applied to every managed application on every deployment and accumulates significantly as cluster size grows.
Under CPU throttling conditions, Ingress patching takes even longer due to resource contention, further compounding the overall delay.
These two issues interact: the CPU throttle slows down every phase of each deployment, and the Ingress patching adds a fixed overhead on top, resulting in total RTFD processing times that can exceed 10 minutes per application under load.
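The compounding effect can be illustrated with a rough back-of-the-envelope model. This sketch assumes a strictly serial queue with a constant per-app processing time, and it ignores pod startup time and rate limiting. The function names are hypothetical; the input figures (4 calls, about 7 seconds per call, 90 seconds and 6 seconds per app) are taken from this article.

```python
INGRESS_CALLS_PER_APP = 4  # sequential Ingress patch calls per app, per this article


def ingress_overhead(seconds_per_call: float) -> float:
    """Total Ingress patching overhead for one app on one deployment."""
    return INGRESS_CALLS_PER_APP * seconds_per_call


def queue_wait(position: int, seconds_per_app: float) -> float:
    """Seconds the app at 1-based `position` waits before RTFD starts on it."""
    return (position - 1) * seconds_per_app


def batch_time(app_count: int, seconds_per_app: float) -> float:
    """Seconds for RTFD to work through `app_count` queued deployments."""
    return app_count * seconds_per_app


print(f"Ingress overhead at 7 s per call: {ingress_overhead(7):.0f} s")
for label, per_app in (("throttled, 90 s per app", 90), ("optimized, 6 s per app", 6)):
    wait_min = queue_wait(40, per_app) / 60
    total_min = batch_time(40, per_app) / 60
    print(f"{label}: app 40 waits {wait_min:.1f} min, batch takes {total_min:.1f} min")
```

Because the queue is serial, the wait for the last app grows linearly with both the batch size and the per-app time, which is why a modest per-app slowdown turns into a very long bulk deployment.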
Both steps below are recommended, but step 1 is the simpler and faster way to improve deployment times.
Step 1 eliminates the CPU throttle and queue backlog. We suggest implementing this change first and testing whether it reduces the delays.
Step 2 eliminates the Ingress patching overhead. Applying both has been validated to reduce total RTFD processing time by approximately 99% under load.
Step 1: Increase the resource limits on the RTFD container. The following values are validated for clusters managing 100 or more applications. Adjust proportionally for smaller clusters, but ensure the CPU limit is sufficient to prevent CFS throttling under concurrent deployment load.
containers:
  - name: rtfd
    resources:
      limits:
        cpu: "1"  # minimum recommended for large clusters
        memory: "1Gi"
      requests:
        cpu: "300m"
        memory: "512Mi"
For the agent container on large clusters, also verify:
Agent CPU limit: 1500m (minimum recommended for 100+ app clusters)
Agent Memory limit: 1Gi
Apply the changes by editing or patching the deployment directly, depending on your RTF installation method. Restart the agent pod after applying.
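If you choose to patch rather than edit, the patch body can be generated instead of written by hand. The sketch below builds a strategic-merge patch, which matches containers by name, using the values quoted above. It is an illustration under assumptions: the container names `rtfd` and `agent` follow this article, the agent container's requests are not stated here and are therefore left untouched, and the name of the Deployment and its namespace depend on your installation.

```python
import json

# Values quoted in this article. Agent requests are not specified in the
# article, so only the agent limits are included.
RESOURCES = {
    "rtfd": {
        "limits": {"cpu": "1", "memory": "1Gi"},
        "requests": {"cpu": "300m", "memory": "512Mi"},
    },
    "agent": {
        "limits": {"cpu": "1500m", "memory": "1Gi"},
    },
}


def build_patch(resources: dict) -> dict:
    """Build a Deployment patch that sets resources on the named containers."""
    containers = [
        {"name": name, "resources": res} for name, res in resources.items()
    ]
    return {"spec": {"template": {"spec": {"containers": containers}}}}


# Print the patch as JSON so it can be saved to a file and reviewed.
print(json.dumps(build_patch(RESOURCES), indent=2))
```

The saved output could then be applied with `kubectl patch deployment <deployment-name> -n <namespace> --patch-file <file>`, substituting the names used by your installation. Review the generated JSON first, and confirm that a direct patch is appropriate for your installation method.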
Step 2: Enable HTTPRoutes mode on the RTF cluster. When enabled, RTFD skips all Ingress object patching during deployments, eliminating the 22-28 seconds of per-app overhead entirely. The change applies immediately to all managed applications in the namespace.
Confirm HTTPRoutes is active by checking RTFD logs after the next deployment:
Ingress resources creation from service watcher is skipped when HTTPRoutes is Enabled
For prerequisites and enablement steps, refer to the Runtime Fabric network configuration documentation: docs.mulesoft.com/runtime-fabric/latest/install-self-managed-network-configuration
Run the following after the agent pod stabilizes:
rtfctl cluster status
Note: if agent-version-consistency shows as unhealthy immediately after an agent pod restart, this is a cosmetic artifact — the probe returns agentVersion: "0.0.0" until the agent fully stabilizes. Recheck after 5 minutes; it resolves on its own with no action required.
Monitor RTFD logs during a test deployment. Expected behavior after the fix:
Handling [STARTED] my-application
preparing upgrade for my-application ← within 1-2 seconds
Ingress creation SKIPPED ← HTTPRoutes active
Completed request [6s]
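To measure the queue wait rather than eyeball it, the gap between the two messages can be computed from log timestamps. This is a sketch under assumptions: it expects each line to begin with an RFC 3339 UTC timestamp, as added by `kubectl logs --timestamps`, followed by the message text shown in this article. Real RTFD lines may carry additional fields, and the sample timestamps are invented.

```python
import re
from datetime import datetime, timezone

# Assumed line shape: "<RFC 3339 UTC timestamp> <message>".
# Fractional seconds are dropped, so results have one-second resolution.
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.\d+)?Z\s+(.*)$")
STARTED = re.compile(r"Handling \[STARTED\] (\S+)")
PREPARING = re.compile(r"preparing upgrade for (\S+)")


def queue_gaps(lines):
    """Map app name to seconds between 'Handling [STARTED]' and 'preparing upgrade'."""
    started, gaps = {}, {}
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # skip lines that do not match the assumed shape
        ts = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
        ts = ts.replace(tzinfo=timezone.utc)
        msg = m.group(2)
        if s := STARTED.search(msg):
            started[s.group(1)] = ts
        elif (p := PREPARING.search(msg)) and p.group(1) in started:
            app = p.group(1)
            gaps[app] = (ts - started.pop(app)).total_seconds()
    return gaps


# Illustrative lines only; the timestamps are made up.
sample = [
    "2024-01-01T10:00:00.120Z Handling [STARTED] my-application",
    "2024-01-01T10:07:30.480Z preparing upgrade for my-application",
]
print(queue_gaps(sample))
```

Gaps of a second or two are consistent with the expected behavior above; gaps of several minutes indicate that deployments are still queuing behind a throttled RTFD.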
Important: to properly validate the fix, test with a bulk deployment of 30-50 applications simultaneously. A small test (5-10 apps on an idle cluster) will show similar results under both old and new settings because it does not generate enough queue depth to expose the CPU throttle bottleneck. The true benefit is visible under production load conditions.
EXPECTED RESULTS AFTER REMEDIATION
| Metric | Before (Under-Provisioned) | After (Optimized) |
|---|---|---|
| Queue wait before RTFD processing | Up to 8+ minutes per app | 1-2 seconds |
| RTFD processing time per app | 60-90+ seconds | 5-8 seconds |
| Ingress patching overhead per app | ~28 seconds (4 sequential calls) | 1-2 seconds |
| Pod Ready time (post-RTFD) | Blocked by queue | 30-45 seconds (Mule JVM-bound) |
| Bulk deployment (40+ apps) | Hours | ~5-15 minutes |
DEPLOYMENT_RATELIMITPERSECOND=1 (default) is not a bottleneck when RTFD processes each app in ~6 seconds. Only consider raising it to 2 in the app-config ConfigMap if queue depth consistently exceeds 50 apps simultaneously.
RTF Self-Managed Network Configuration and HTTPRoutes: docs.mulesoft.com/runtime-fabric/latest/install-self-managed-network-configuration
MuleSoft Documentation - Limitations on Third-Party Software in RTF Clusters: docs.mulesoft.com/runtime-fabric/latest/limitations-self#use-of-third-party-software-within-a-runtime-fabric-k8s-cluster
KB - Enable debug logging for RTF agent deployment troubleshooting: help.salesforce.com/s/articleView?id=005166841&type=1
005385645
