Skip to content
This repository has been archived by the owner on May 6, 2022. It is now read-only.

Increase apiserver resource requests and limits #2581

Merged
merged 1 commit into from
Mar 13, 2019

Conversation

jkbschmid
Copy link
Contributor

This PR is a

  • Bug Fix

What this PR does / why we need it:
We are using the service catalog with a catalog with two brokers and a catalog comprising a total of ~150 plans and ~5 services. With the default configuration, the apiserver is forcibly restarted every 30 mins by k8s due to an OOM error. This happens because the apiserver requires around ~35MB, but only has a resource memory limit of 30.
This PR increases the default memory resource request from 20 to 40 and the memory resource limit from 30 to 50 MB.

Merge Checklist:

  • New feature
    • Tests
    • Documentation
  • SVCat CLI flag
  • Server Flag for config
    • Chart changes
    • removing a flag by marking deprecated and hiding to avoid
      breaking the chart release and existing clients who provide a
      flag that will get an error when they try to update

@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please sign in with your organization's credentials at https://identity.linuxfoundation.org/projects/cncf to be authorized.
  • If you have done the above and are still having issues with the CLA being reported as unsigned, please email the CNCF helpdesk: [email protected]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 13, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @jkbschmid. Thanks for your PR.

I'm waiting for a kubernetes-incubator or kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 13, 2019
@jberkhahn
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 13, 2019
@jboyd01
Copy link
Contributor

jboyd01 commented Mar 13, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 13, 2019
@jberkhahn
Copy link
Contributor

@jboyd01
Do you think this is a good idea? I'm not saying it isn't, but these values are already configurable in the chart, what makes the new ones any better? I imagine they'll still cause problems with enough stuff in the backend.

@jimmidyson
Copy link
Contributor

Unless I've missed it, I think that documentation could be clearer to show how to set and update resource configurations, along with some example scenarios.

@jberkhahn
Copy link
Contributor

Documentation on how to configure helm charts seems to be a bit beyond the scope of our project. Or do you mean a list of the possible things you can set in our helm chart?

@jboyd01
Copy link
Contributor

jboyd01 commented Mar 13, 2019

I'm sure the doc could be clearer, but I also think the limits are pretty tight. I'm all for bumping them up a bit by default.

@c0d1ngm0nk3y
Copy link

c0d1ngm0nk3y commented Mar 13, 2019

@jboyd01
Do you think this is a good idea? I'm not saying it isn't, but these values are already configurable in the chart, what makes the new ones any better? I imagine they'll still cause problems with enough stuff in the backend.

Do you expect much problems? We are talking about 10 MB we ask the scheduler in addition, right? Yes, it is configurable, but this exactly what you get per default if you just call helm install. I think it would be better to work out of the box for simple scenarios.

We installed the service catalog in our development landscape and it ended up to restart every 30mins. The problem was that it lost all ServiceClusterBroker in the process. In our case, those defaults would have saved us quite some time.

But I understand that it is hard/impossible to find the perfect default.

@jkbschmid
Copy link
Contributor Author

Thanks for the quick replies!
Once we had identified the resource limits as the source of our issue, we found the option in the helm charts right away. So, in our case, the documentation was not an issue.
The goal of this PR was just to adjust the defaults so that other folks that try out the service catalog don't stumble over the same issue.:)

@jberkhahn
Copy link
Contributor

/approve
my concern is really how do we know this is high enough to be a better default - maybe it should be even higher? we can try this for now, i guess

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jberkhahn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 13, 2019
@jberkhahn
Copy link
Contributor

weird, the travis tests passed but the git robot isn't pulling the status for some reason.
/retest

@k8s-ci-robot k8s-ci-robot merged commit c000c23 into kubernetes-retired:master Mar 13, 2019
@jboyd01
Copy link
Contributor

jboyd01 commented Mar 13, 2019

@jkbschmid as I reread this and discussed with @jberkhahn I feel obligated to reply with a clarification / elaboration. Any real use of service catalog probably deserves a more advanced deployment & configuration of etcd. As you have seen, if your Service Catalog API Server container is restarted for any reason you are going to loose your etcd storage. For anything other then "play" I'd encourage the setup of a HA etcd deployment with persistent storage and then configuring the Service Catalog API Server to use it. You may have already deduced this, but I wanted to be sure it was understood and I'll try to revisit the doc we have around that.

viviyww pushed a commit to viviyww/service-catalog that referenced this pull request May 10, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants