Performance Enhancements #150
Merged: auto-differentiation-dev merged 10 commits into auto-differentiation:main from auto-differentiation-dev:feature/performance-improvements on Nov 29, 2024
- Removed std::fma in rollback, enhancing iteration speed
- Refactored tape container for efficient joint operations
- Combined pushing full statements to tape at once
- Added pre-checks for tape existence, reducing overhead

Co-authored-by: Zakaria Farini <[email protected]>
Co-authored-by: Abdelhay Bouramdane <[email protected]>
Co-authored-by: Zakaria Sabri <[email protected]>
Co-authored-by: Mouad El Mekaoui <[email protected]>
Co-authored-by: El Mokhtar Ouhrich <[email protected]>
Co-authored-by: Abderrahim Indjaren <[email protected]>
Co-authored-by: Salaheddine Ouahidi <[email protected]>
Co-authored-by: Yasser Ait Nasser <[email protected]>
Co-authored-by: Ayoub Ben hamou <[email protected]>
Co-authored-by: Oussama Khiar <[email protected]>
Co-authored-by: Omar Belaizi <[email protected]>
Co-authored-by: Reda Ghouzraf <[email protected]>
Co-authored-by: Yahya Rhiba <[email protected]>
Co-authored-by: Abdelouahed Rabiai <[email protected]>
Co-authored-by: Othmane Rekabe <[email protected]>
Co-authored-by: Othmane Chouikhane <[email protected]>
Co-authored-by: Mohamed Khames <[email protected]>
Co-authored-by: Mouad El Asri <[email protected]>
Co-authored-by: Hasnaa Et-Taleby <[email protected]>
Co-authored-by: Jawad Chakir <[email protected]>
Co-authored-by: Hamza Barrak <[email protected]>
Co-authored-by: Mohammed Berroukham <[email protected]>
Co-authored-by: Soufiane essarhir <[email protected]>
Co-authored-by: Ibrahim Esseddyq <[email protected]>
Co-authored-by: Rida Rhnizar <[email protected]>
Co-authored-by: Mohammed Zeroual <[email protected]>
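To make the first item in the commit message concrete, here is a minimal sketch of a reverse sweep ("rollback") that accumulates adjoints with a plain multiply-add rather than an explicit `std::fma` call. The `Statement` struct, the flat `slots`/`multipliers` arrays and the `rollback` signature are hypothetical and chosen only for illustration; they are not taken from the library's sources. The general motivation, as far as it can be stated safely, is that `std::fma` may fall back to a slow library routine when the build does not target hardware FMA, while a plain `a * b + c` leaves the compiler free to pick whichever instruction sequence suits the target.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tape layout, for illustration only (not the library's types).
struct Statement {
    std::size_t lhs;      // adjoint slot of the statement's result
    std::size_t opBegin;  // index of the first operand of this statement
    std::size_t opEnd;    // one past the last operand of this statement
};

// Reverse sweep: walk the statements backwards and propagate each result's
// adjoint to the operands, weighted by the recorded partial derivatives.
void rollback(const std::vector<Statement>& statements,
              const std::vector<std::size_t>& slots,
              const std::vector<double>& multipliers,
              std::vector<double>& adjoints)
{
    for (std::size_t s = statements.size(); s-- > 0;) {
        const Statement& st = statements[s];
        const double a = adjoints[st.lhs];
        for (std::size_t i = st.opBegin; i < st.opEnd; ++i) {
            // Plain multiply-add instead of
            //   adjoints[slots[i]] = std::fma(multipliers[i], a, adjoints[slots[i]]);
            adjoints[slots[i]] += multipliers[i] * a;
        }
    }
}
```

For example, recording `y = x0 * x1` followed by `z = y + x0` at `x0 = 3`, `x1 = 4`, seeding the adjoint of `z` with 1 and calling `rollback` leaves the derivatives with respect to `x0` and `x1` in the first two adjoint slots.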
5tirner approved these changes Nov 25, 2024
xcelerit-team approved these changes Nov 25, 2024
Pull Request Test Coverage Report for Build 12087374967
💛 - Coveralls
9f40e9e to 59515e3
59515e3 to 0e8e885
Test Results
29 files ±0, 29 suites ±0, 17m 12s ⏱️ +2m 21s
Results for commit 6f2df65. ± Comparison against base commit aaf5626.
This pull request removes 1 and adds 70 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results.
f095bb5 to f33b1d7
f33b1d7 to 80415d1
1acb310 to ba068fc
ba068fc to 6f2df65
bbee306
into
auto-differentiation:main
109 checks passed
This update introduces significant performance improvements, with some applications now achieving up to 2x faster execution.

Key Enhancements
- `OperationsContainer`: Unified handling of slots and multipliers to reduce capacity and indexing overheads
- `computeAdjoints`: Removed redundant chunk indexing for faster computation
- `std::fma` overhead: Replaced unnecessary calls to `std::fma` for multiply-adds
- `XAD_LIKELY` and `XAD_UNLIKELY` macros for better compiler-level branch prediction
- `OperationsContainerPaired` for faster access, trading slight memory increase for speed
- `XAD_REDUCED_MEMORY` CMake flag, allowing users to toggle between faster paired storage (default) and memory-efficient separated arrays

Contributors
The following contributors collaborated on this pull request:
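
As an illustration of the paired versus separated storage trade-off listed under Key Enhancements, the sketch below contrasts the two layouts. Everything in it is an assumption made for the example: the class names `PairedOps` and `SeparatedOps`, the 32-bit slot type, and the `SKETCH_REDUCED_MEMORY` macro, which only mimics the role the description gives to the `XAD_REDUCED_MEMORY` flag. It is not the library's `OperationsContainer` or `OperationsContainerPaired` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Separated layout: multipliers and slots live in two parallel arrays.
// Compact, but every operand access touches two different memory regions.
class SeparatedOps {
public:
    void push(double multiplier, std::uint32_t slot) {
        multipliers_.push_back(multiplier);
        slots_.push_back(slot);
    }
    double multiplier(std::size_t i) const { return multipliers_[i]; }
    std::uint32_t slot(std::size_t i) const { return slots_[i]; }
    std::size_t size() const { return slots_.size(); }

private:
    std::vector<double> multipliers_;
    std::vector<std::uint32_t> slots_;
};

// Paired layout: one array of (multiplier, slot) records, so both values of
// an operand sit next to each other. Alignment padding usually makes each
// record larger than the two fields combined, hence the extra memory.
class PairedOps {
    struct Op {
        double multiplier;
        std::uint32_t slot;
    };

public:
    void push(double multiplier, std::uint32_t slot) {
        ops_.push_back({multiplier, slot});
    }
    double multiplier(std::size_t i) const { return ops_[i].multiplier; }
    std::uint32_t slot(std::size_t i) const { return ops_[i].slot; }
    std::size_t size() const { return ops_.size(); }

private:
    std::vector<Op> ops_;
};

// Hypothetical compile-time switch; paired storage is the default here.
#ifdef SKETCH_REDUCED_MEMORY
using Ops = SeparatedOps;
#else
using Ops = PairedOps;
#endif

// Works with either layout because both expose the same accessors.
template <class Container>
double weightedSum(const Container& ops, const std::vector<double>& values) {
    double sum = 0.0;
    for (std::size_t i = 0; i < ops.size(); ++i) {
        sum += ops.multiplier(i) * values[ops.slot(i)];
    }
    return sum;
}
```

Because both classes share one interface, code such as `weightedSum` does not need to know which layout was selected at build time; only the memory footprint and access pattern differ.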