Improve memory efficiency of seen cache #1073
Conversation
The `seen` cache is currently a significant memory usage hotspot due to its inefficient implementation: for every entry, two copies of the message id + timing data + `seq` overhead cause it to use much more memory than it has to. In addition, each check involves several layers of allocations as the computed message id gets salted.

This PR improves the situation by:

* using a hash of the message id with the salt instead of joining strings
* computing the salted id only once per message
* storing one digest instead of two message ids
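The digest-instead-of-strings idea is language-agnostic. As a rough illustration (Python here, with hypothetical names and salt; the PR itself is Nim), a keyed BLAKE2 hash folds the salt into the message id in a single pass, so the cache can store one fixed-size digest per entry instead of two full copies of the id:

```python
import hashlib

# Hypothetical salt for illustration; the real salt is internal to the router.
SALT = b"example-salt"

def salted_digest(message_id: bytes) -> bytes:
    # One keyed hash instead of allocating a joined salt + message-id string:
    # the result is a fixed 16 bytes, however long the message id is.
    return hashlib.blake2b(message_id, key=SALT, digest_size=16).digest()

# The seen cache then keys on the digest once, not on string copies of the id.
seen: dict[bytes, float] = {}
```

Because the digest is computed once per message and is fixed-size, per-entry memory no longer scales with message-id length.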
Codecov Report

Attention: Patch coverage is

Additional details and impacted files:

```
@@ Coverage Diff @@
##           master    #1073   +/- ##
=========================================
  Coverage        ?   84.53%
=========================================
  Files           ?       91
  Lines           ?    15517
  Branches        ?        0
=========================================
  Hits            ?    13118
  Misses          ?     2399
  Partials        ?        0
=========================================
```
On Holesky, this PR reduces memory usage of the seen cache by ~100 MB.
```nim
let
  previous = t.del(k) # Refresh existing item
  addedAt = if previous.isSome():
    previous[].addedAt
```
We had a long PR in the past to remove this pattern from the codebase and decrease the risk of raising defects. You can use https://github.com/vacp2p/nim-libp2p/blob/unstable/libp2p/utility.nim#L125
`valueOr` is not applicable in this case because we're accessing a field of `previous[]`, not `previous` itself.
True, but `withValue` can be used in this case.
`withValue` doesn't work in generic code, due to similar problems as arnetheduck/nim-results#34
this seems to work fine:

```nim
addedAt = block:
  previous.withValue(p):
    p[].addedAt
  else:
    now
```
LGTM