NumPy/pathplannerlib memory issues #149
Currently, we recommend disabling the niwebserver via:
I guess a different option is to build numpy without OpenBLAS support, which would save 28MB. It's unknown whether this would hurt performance; in theory, OpenBLAS accelerates matrix multiplication. Another alternative would be to reduce the number of threads that OpenBLAS uses:
This reduces the amount of memory used by about 10MB. Unknown performance impact.
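The thread-count setting referenced above was lost in scraping; presumably it is the `OPENBLAS_NUM_THREADS` environment variable (an assumption). A minimal sketch, noting that it must be set before numpy loads the BLAS library:

```python
import os

# Cap OpenBLAS's thread pool BEFORE numpy (and thus OpenBLAS) is imported;
# setting it afterwards has no effect, since the pool is created at load time.
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import numpy as np  # noqa: E402

a = np.ones((64, 64))
b = a @ a  # matrix multiply, now single-threaded under OpenBLAS
print(b[0, 0])  # 64.0
```

The same cap can also be set in the robot program's environment before launch rather than in code.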
I built a version of numpy that doesn't use OpenBLAS. For now, you need to explicitly enable it in your pyproject.toml:
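The exact version specifier was not preserved above. As a hypothetical sketch of where the pin goes (the `[tool.robotpy]` table and its `requires` list follow the RobotPy pyproject format; both version strings below are placeholders, not the real values):

```toml
# Hypothetical pyproject.toml fragment -- version strings are placeholders;
# use the ones given in this thread / the RobotPy docs.
[tool.robotpy]
robotpy_version = "2024.0.0"   # placeholder: your RobotPy version
requires = [
    "numpy==1.2.3",            # placeholder: the custom non-OpenBLAS numpy build
]
```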
Run robotpy sync, and on the next deploy it should install that specific version of numpy on your robot. With this version of numpy, it seems to save around 15MB of RAM.
Chiming in here ... 2881 has been tracking memory issues across CD (Chief Delphi) and GitHub issue threads.
Data point: with all of the config updates above ... we have been able to sync/deploy and run full match sessions on a "fresh" roboRIO (rebooted/power-cycled) with no memory allocation exceptions/crashes. So far, so good. However, around our 5th or so deploy with a code update during dev/testing, we hit the memory limit and the deploy crashes mid-flight. We then simply reboot or power-cycle the roboRIO, deploy fresh, and all is good again for the next ~5 run/deploy cycles. Not a showstopper, of course, as long as we can ultimately run a match on a clean boot-up at our events. It seems that something is not being GC'd and/or is leaking on each deploy, and with our current footprint it takes about 5 deploys to hit the wall. Just a dev/test nuisance at this point. Hope this helps.
That's really interesting; there shouldn't be any residual memory usage across deploys. When deploying, there are outputs like "RoboRIO disk usage 238.2M/386.3M (62% full)" and "RoboRIO memory 203.6M/250.2M (19% full)"... are those values going up or staying the same? If it happens reliably, I feel like that sort of thing has been reported previously.
Will try to repro and grab both the exception details and the disk/mem usage data points to share here this weekend. There will be lots of code updates and deploys for sure heading toward a week 1 event.
Following up for 2881 ... after all of the memory optimization updates, the team worked all weekend over many (100s of) code deployments and was not able to repro the memory allocation fault observed earlier. Heading into a week 1 event with no time to methodically reverse each optimization and/or RobotPy build to trace the issue, but all seems good now. Also, no noticeable/measurable performance issues with the custom (non-OpenBLAS) numpy version ... as expected, since real parallelism is not a thing in this environment. 😉
RoboRIO 1 with current config. Here's python via pmap:
Same process, now I import numpy. Here's the diff.
Same process, import wpimath:
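For reference, the per-import cost shown with pmap above can also be estimated from inside the process by diffing VmRSS before and after an import. A minimal Linux-only sketch (not the method used above, just an in-process alternative):

```python
# Linux-only sketch: read this process's resident set size from /proc,
# then measure roughly how much RSS an import adds.
def rss_kib() -> int:
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])  # value is reported in KiB
    raise RuntimeError("VmRSS not found in /proc/self/status")

before = rss_kib()
import numpy  # noqa: E402,F401
after = rss_kib()
print(f"numpy import added ~{(after - before) / 1024:.1f} MiB RSS")
```

This only approximates what pmap shows (shared pages are counted differently), but it is handy for quick comparisons between builds.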