@alebedev87
Contributor

Manual cherry-pick of #1323.

This commit implements the `HTTPKeepAliveTimeout` tuning option of the IngressController API, allowing customers to configure HAProxy's `timeout http-keep-alive`.

In OCP versions prior to 4.16, this timeout was not respected (see haproxy/haproxy#2334). This implementation brings the ability to adjust the behavior to match pre-4.16 configurations.

API PR: openshift/api#2638.

@openshift-ci-robot added the jira/severity-critical, jira/valid-reference, and jira/valid-bug labels on Jan 5, 2026
@openshift-ci-robot
Contributor

@alebedev87: This pull request references Jira Issue OCPBUGS-70315, which is valid.

7 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.18.z) matches configured target version for branch (4.18.z)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
  • release note text is set and does not match the template
  • dependent bug Jira Issue OCPBUGS-68378 is in the state Verified, which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA))
  • dependent Jira Issue OCPBUGS-68378 targets the "4.19.z" version, which is one of the valid target versions: 4.19.0, 4.19.z
  • bug has dependents

Requesting review from QA contact:
/cc @ShudiLi

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci bot requested review from Miciah, ShudiLi, and candita on January 5, 2026, 12:09
@alebedev87 changed the title from "[release-4.18] OCPBUGS-70315: Implement HTTPKeepAliveTimeout tuning option" to "[WIP] [release-4.18] OCPBUGS-70315: Implement HTTPKeepAliveTimeout tuning option" on Jan 5, 2026
@openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Jan 5, 2026
…option

This commit updates `openshift/api` to include the new
IngressController tuning option field: `HTTPKeepAliveTimeout`.

This commit implements `HTTPKeepAliveTimeout` tuning option of the IngressController API
allowing customers to configure `timeout http-keep-alive`.

In OCP versions prior to 4.16, this timeout was not respected (see haproxy/haproxy#2334).
This implementation brings the ability to adjust the behavior to match pre-4.16 configurations.

Move `resolveIngressControllerAddress` and `canaryImageReference`
functions from the idle connection test file to `util_test.go`
because they are used in two places now.

Add `checkRouteConnectivity` function to poll a route with GET requests.
@alebedev87 force-pushed the release-4.18-http-keep-alive branch from e47e464 to c126df7 on January 5, 2026, 16:11
@alebedev87 changed the title from "[WIP] [release-4.18] OCPBUGS-70315: Implement HTTPKeepAliveTimeout tuning option" to "[release-4.18] OCPBUGS-70315: Implement HTTPKeepAliveTimeout tuning option" on Jan 5, 2026
@openshift-ci bot removed the do-not-merge/work-in-progress label on Jan 5, 2026
@ShudiLi
Member

ShudiLi commented Jan 6, 2026

Tested it with 4.18.0-0-2026-01-06-015149-test-ci-ln-ywhwtg2-latest

1.
% oc get clusterversion
NAME      VERSION                                                AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.18.0-0-2026-01-06-015149-test-ci-ln-ywhwtg2-latest   True        False         29m     Cluster version is 4.18.0-0-2026-01-06-015149-test-ci-ln-ywhwtg2-latest

2. Configure httpKeepAliveTimeout with the valid values 50ms, 50s, and 15m
% oc -n openshift-ingress-operator get ingresscontroller default -oyaml | yq ".spec.tuningOptions"
httpKeepAliveTimeout: 50ms

% oc -n openshift-ingress rsh router-default-75fd58df47-7mhld
sh-5.1$ env | grep -i live
ROUTER_SLOWLORIS_HTTP_KEEPALIVE=50ms


% oc -n openshift-ingress-operator get ingresscontroller default -oyaml | yq ".spec.tuningOptions"
httpKeepAliveTimeout: 50s

% oc -n openshift-ingress rsh router-default-57b98f6cf6-h6tfp
sh-5.1$ env | grep -i live
ROUTER_SLOWLORIS_HTTP_KEEPALIVE=50s

% oc -n openshift-ingress-operator get ingresscontroller default -oyaml | yq ".spec.tuningOptions"
httpKeepAliveTimeout: 15m

% oc -n openshift-ingress rsh router-default-7545c5f69f-66khc
sh-5.1$ env | grep -i live
ROUTER_SLOWLORIS_HTTP_KEEPALIVE=15m
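
The mapping checked in step 2 can be sketched as follows. This is only an illustration in Python, not the operator's actual code: the helper name `router_env` and the dictionary shape are hypothetical, and only the `httpKeepAliveTimeout` field and the `ROUTER_SLOWLORIS_HTTP_KEEPALIVE` variable come from the output above. It assumes the variable is left unset when the field is absent.

```python
from typing import Optional

# Environment variable name as observed in the router pod above.
KEEPALIVE_ENV = "ROUTER_SLOWLORIS_HTTP_KEEPALIVE"


def router_env(tuning_options: Optional[dict]) -> dict:
    """Hypothetical helper: env vars derived from spec.tuningOptions."""
    env = {}
    timeout = (tuning_options or {}).get("httpKeepAliveTimeout")
    # Assumption: the variable is only set for an explicit, non-nil value.
    if timeout is not None:
        env[KEEPALIVE_ENV] = timeout
    return env


print(router_env({"httpKeepAliveTimeout": "50ms"}))
print(router_env({}))
```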

3. Configure httpKeepAliveTimeout with the invalid values 50m and 1h
# ingresscontrollers.operator.openshift.io "default" was not valid:
# * spec.tuningOptions.httpKeepAliveTimeout: Invalid value: "string": httpKeepAliveTimeout must be less than or equal to 15 minutes

# ingresscontrollers.operator.openshift.io "default" was not valid:
# * spec.tuningOptions.httpKeepAliveTimeout: Invalid value: "string": httpKeepAliveTimeout must be a valid duration string composed of an unsigned integer value, optionally followed by a decimal fraction and a unit suffix (ms, s, m)
#
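
The two rejections above can be reproduced with a small Python sketch of the rules stated in the error messages: only the `ms`, `s`, and `m` suffixes, and a maximum of 15 minutes. It is not the API's real validation. The function name `check_keepalive_timeout` is hypothetical, and accepting concatenated segments such as `1m30s` is an assumption. Any lower bound the real API may enforce is not modeled.

```python
import re
from decimal import Decimal

# Seconds per unit; only the suffixes named in the error message are accepted.
UNIT_SECONDS = {"ms": Decimal("0.001"), "s": Decimal(1), "m": Decimal(60)}
# One or more "<integer>[.<fraction>]<unit>" segments (assumed grammar).
DURATION_RE = re.compile(r"(?:\d+(?:\.\d+)?(?:ms|s|m))+")
PART_RE = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m)")
MAX_SECONDS = Decimal(15 * 60)


def check_keepalive_timeout(value: str) -> Decimal:
    """Return the duration in seconds, or raise ValueError if it is rejected."""
    if not DURATION_RE.fullmatch(value):
        raise ValueError("must be a valid duration string (units: ms, s, m)")
    total = sum(
        (Decimal(num) * UNIT_SECONDS[unit] for num, unit in PART_RE.findall(value)),
        Decimal(0),
    )
    if total > MAX_SECONDS:
        raise ValueError("must be less than or equal to 15 minutes")
    return total


for candidate in ("50ms", "50s", "15m", "50m", "1h"):
    try:
        print(candidate, "accepted:", check_keepalive_timeout(candidate), "seconds")
    except ValueError as err:
        print(candidate, "rejected:", err)
```

In this sketch `50m` parses but exceeds the limit, while `1h` fails the format check because `h` is not an accepted suffix, matching the two different messages above.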

4. Functional test of httpKeepAliveTimeout set to 50s (create a pod, service, and route, then send traffic and check the captured packets)
sh-4.4# tcpdump -i any port 80 -s 0 -n
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
03:08:40.984766 IP 10.131.0.24.45022 > 104.198.49.92.http: Flags [S], seq 3646407093, win 64680, options [mss 1320,sackOK,TS val 3424488488 ecr 0,nop,wscale 7], length 0
03:08:40.987243 IP 104.198.49.92.http > 10.131.0.24.45022: Flags [S.], seq 3308724120, ack 3646407094, win 65400, options [mss 1320,sackOK,TS val 452781324 ecr 3424488488,nop,wscale 7], length 0
03:08:40.987297 IP 10.131.0.24.45022 > 104.198.49.92.http: Flags [.], ack 1, win 506, options [nop,nop,TS val 3424488490 ecr 452781324], length 0
03:08:40.987504 IP 10.131.0.24.45022 > 104.198.49.92.http: Flags [P.], seq 1:97, ack 1, win 506, options [nop,nop,TS val 3424488491 ecr 452781324], length 96: HTTP: GET /a1.txt  HTTP/1.1
03:08:40.991249 IP 104.198.49.92.http > 10.131.0.24.45022: Flags [P.], seq 1:417, ack 97, win 511, options [nop,nop,TS val 452781328 ecr 3424488491], length 416: HTTP: HTTP/1.1 200 OK
03:08:40.991288 IP 10.131.0.24.45022 > 104.198.49.92.http: Flags [.], ack 417, win 503, options [nop,nop,TS val 3424488494 ecr 452781328], length 0
03:09:30.995315 IP 104.198.49.92.http > 10.131.0.24.45022: Flags [F.], seq 417, ack 97, win 511, options [nop,nop,TS val 452831331 ecr 3424488494], length 0
03:09:31.036214 IP 10.131.0.24.45022 > 104.198.49.92.http: Flags [.], ack 418, win 503, options [nop,nop,TS val 3424538539 ecr 452831331], length 0
03:09:40.961758 IP 10.131.0.24.45022 > 104.198.49.92.http: Flags [P.], seq 97:192, ack 418, win 503, options [nop,nop,TS val 3424548465 ecr 452831331], length 95: HTTP: GET /a2.txt HTTP/1.1
03:09:40.961855 IP 104.198.49.92.http > 10.131.0.24.45022: Flags [R], seq 3308724538, win 0, length 0
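
The timing in the capture can be checked with a short Python sketch. The timestamps are copied from the tcpdump output above, and the helper name `seconds_between` is hypothetical. It measures the idle time between the `200 OK` response and the server's FIN, and the gap between that FIN and the client's second request, which is answered with an RST.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"  # tcpdump's default time-of-day format


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two same-day tcpdump timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


response_sent = "03:08:40.991249"   # HTTP/1.1 200 OK
server_fin = "03:09:30.995315"      # Flags [F.] from the router side
second_request = "03:09:40.961758"  # GET /a2.txt, answered with RST

idle = seconds_between(response_sent, server_fin)
late_by = seconds_between(server_fin, second_request)
print(f"idle before FIN: {idle:.3f}s")
print(f"second request sent {late_by:.3f}s after FIN")
```

The idle period comes out at roughly 50.004 seconds, consistent with the configured 50s keep-alive timeout. The second request arrives about 10 seconds after the connection was closed.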

@ShudiLi
Member

ShudiLi commented Jan 6, 2026

/retest-required

@ShudiLi
Member

ShudiLi commented Jan 6, 2026

/label qe-approved
/verified by @ShudiLi

@openshift-ci bot added the qe-approved label on Jan 6, 2026
@openshift-ci-robot added the verified label on Jan 6, 2026
@openshift-ci-robot
Contributor

@ShudiLi: This PR has been marked as verified by @ShudiLi.

@openshift-ci
Contributor

openshift-ci bot commented Jan 6, 2026

@alebedev87: all tests passed!

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@Miciah
Contributor

Miciah commented Jan 6, 2026

/approve
/lgtm

This is a low-risk change. The openshift/api bump adds a new field that is nullable with no default value, the changes to the operator logic only take effect if the cluster-admin sets the new field to an explicit, non-nil value, and the rest of the changes are test refactoring and coverage for the new API field.

/label backport-risk-approved

@openshift-ci
Contributor

openshift-ci bot commented Jan 6, 2026

@Miciah: The label(s) /label backport-risk-approved cannot be applied. These labels are supported: acknowledge-critical-fixes-only, platform/aws, platform/azure, platform/baremetal, platform/google, platform/libvirt, platform/openstack, ga, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, px-approved, docs-approved, qe-approved, ux-approved, no-qe, downstream-change-needed, rebase/manual, cluster-config-api-changed, run-integration-tests, approved, backport-risk-assessed, bugzilla/valid-bug, cherry-pick-approved, jira/valid-bug, ok-to-test, stability-fix-approved, staff-eng-approved. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

@openshift-ci bot added the lgtm label on Jan 6, 2026
@openshift-ci
Contributor

openshift-ci bot commented Jan 6, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Miciah

@openshift-ci bot added the approved label on Jan 6, 2026
@Miciah
Contributor

Miciah commented Jan 6, 2026

/label backport-risk-assessed

@openshift-ci bot added the backport-risk-assessed label on Jan 6, 2026
@openshift-merge-bot bot merged commit 7ec6093 into openshift:release-4.18 on Jan 6, 2026
12 checks passed
@openshift-ci-robot
Contributor

@alebedev87: Jira Issue Verification Checks: Jira Issue OCPBUGS-70315
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-70315 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

@openshift-merge-robot
Contributor

Fix included in accepted release 4.18.0-0.nightly-2026-01-06-195439

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • backport-risk-assessed: Indicates a PR to a release branch has been evaluated and considered safe to accept.
  • jira/severity-critical: Referenced Jira bug's severity is critical for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • qe-approved: Signifies that QE has signed off on this PR.
  • verified: Signifies that the PR passed pre-merge verification criteria.
