Describe the issue
Hello All,
I am trying to figure out how to adjust the parameters of the default CPU arena allocator in ONNX Runtime. The only example code I found explains how to create a shared allocator, but since I am running on one core and one thread, I think that is overkill.
My thought process is as follows:
The default arena allocator already works well.
My model tensors continuously grow in size, so arena expansion is frequent and expensive.
I would like to keep the default allocator but adjust its initial chunk size (initial_chunk_size_bytes) to reduce arena growth during inference.
Can anyone provide some starter C++ code or point me to any tutorial that does something similar?
Thanks in advance.
To reproduce
Using C/C++ API on Linux
Urgency
No response
Platform
Linux
OS Version
CentOS 7
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.21.0
ONNX Runtime API
C++
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
The arena allocator was primarily created for GPU memory allocations. It is not very performant in multi-threaded scenarios on CPU.
For CPU, you can first try disabling the arena altogether and see how that changes your performance. This makes tensor allocations go directly to the OS heap, and freed memory is returned to the OS rather than to the arena.
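For reference, a minimal sketch of that first experiment (the model path is a placeholder):

#include <onnxruntime_cxx_api.h>

Ort::Env env(ORT_LOGGING_LEVEL_WARNING);
Ort::SessionOptions sess_opts;
sess_opts.DisableCpuMemArena();  // tensor allocations now go straight to the OS heap
Ort::Session session(env, "model.onnx", sess_opts);  // "model.onnx" is a placeholder path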
If disabling the arena hurts performance in your scenario, you have the option to change the way the arena allocates memory. The default extend strategy is power-of-two (value 0), which doubles the amount of memory each time more is requested.
The other strategy, same-as-requested (value 1), extends the arena by only the amount actually asked for, which is more conservative: if your tensors grow by a few megabytes per inference, the arena grows by those few megabytes instead of doubling.
However, there is no direct way to change this on the InferenceSession itself.
To accomplish it, you need to create a shared allocator within the environment and configure it with an arena config (OrtArenaCfg) instance.
The arena config exposes several useful fields you can explore and adjust, including the initial_chunk_size_bytes you are after (see the CreateArenaCfgV2 sketch after the example below).
You can set arena_extend_strategy to 1, along with anything else you wish to experiment with.
The last step is to set session options to use the shared allocators from the environment, and then create the inference session instance.
To put this all together, the code below illustrates the approach (I have not tried to compile it).
#include <onnxruntime_cxx_api.h>
#include <onnxruntime_session_options_config_keys.h>

Ort::Env env(ORT_LOGGING_LEVEL_VERBOSE);  // Verbose logging for awareness
auto arena_memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
// Args: max_mem, arena_extend_strategy, initial_chunk_size_bytes, max_dead_bytes_per_chunk
// (0 and -1 keep the defaults; strategy 1 = same-as-requested)
Ort::ArenaCfg arena_cfg(0, 1, -1, -1);
env.CreateAndRegisterAllocator(arena_memory_info, arena_cfg);
// Make the session use the environment-level allocators
Ort::SessionOptions sess_opts;
sess_opts.AddConfigEntry(kOrtSessionOptionsConfigUseEnvAllocators, "1");
// Use the above sess_opts to create your Ort::Session.
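To set initial_chunk_size_bytes directly, the C API's CreateArenaCfgV2 takes key/value pairs; the C++ ArenaCfg wrapper only exposes the four-argument constructor used above. An untested sketch reusing env and arena_memory_info from the example, with the 64 MB figure chosen purely for illustration:

const char* keys[] = {"arena_extend_strategy", "initial_chunk_size_bytes"};
const size_t values[] = {1, 64 * 1024 * 1024};  // strategy 1, 64 MB initial chunk (illustrative)
OrtArenaCfg* raw_cfg = nullptr;
Ort::ThrowOnError(Ort::GetApi().CreateArenaCfgV2(keys, values, 2, &raw_cfg));
env.CreateAndRegisterAllocator(arena_memory_info, raw_cfg);
Ort::GetApi().ReleaseArenaCfg(raw_cfg);  // the registered allocator was created from it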