How Do I Adjust the Threshold of memcpy in x86_64?
Background
The size at which glibc's memcpy switches to non-temporal stores is determined by the parameter x86_non_temporal_threshold. This threshold has a great impact on memory bandwidth for large copies. You can adjust it as needed to achieve better memory copy performance.
Method
The following setting is recommended by the glibc community:
export GLIBC_TUNABLES=glibc.cpu.x86_non_temporal_threshold=$(($(getconf LEVEL3_CACHE_SIZE) * 3 / 4))
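For readers who want to check the number this command produces, the following C fragment is a minimal sketch of the same arithmetic. The helper names l3_cache_size_bytes and non_temporal_threshold_from_l3 are hypothetical and exist only for this illustration. The sketch assumes a glibc system, where sysconf accepts the _SC_LEVEL3_CACHE_SIZE extension that backs getconf LEVEL3_CACHE_SIZE.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: L3 cache size in bytes, or 0 if it cannot be
   determined. _SC_LEVEL3_CACHE_SIZE is a glibc extension, so it is
   guarded for other C libraries. */
static size_t l3_cache_size_bytes(void)
{
#ifdef _SC_LEVEL3_CACHE_SIZE
    long v = sysconf(_SC_LEVEL3_CACHE_SIZE);
    return v > 0 ? (size_t)v : 0;
#else
    return 0;
#endif
}

/* Hypothetical helper: three quarters of the L3 size, matching the
   shell expression $((L3 * 3 / 4)). */
static size_t non_temporal_threshold_from_l3(size_t l3_bytes)
{
    return l3_bytes * 3 / 4;
}
```

Printing non_temporal_threshold_from_l3(l3_cache_size_bytes()) with %zu gives the number to place after glibc.cpu.x86_non_temporal_threshold=. A result of 0 means the cache size could not be determined, in which case it is probably safer to leave the tunable unset.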
memcpy
In glibc 2.34, memcpy and memmove are implemented in a similar way, which is described in the following comment in the glibc source code:
sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S
/* memmove/memcpy/mempcpy is implemented as:
   1. Use overlapping load and store to avoid branch.
   2. Load all sources into registers and store them together to avoid
      possible address overlap between source and destination.
   3. If size is 8 * VEC_SIZE or less, load all sources into registers
      and store them together.
   4. If address of destination > address of source, backward copy
      4 * VEC_SIZE at a time with unaligned load and aligned store.
      Load the first 4 * VEC and last VEC before the loop and store
      them after the loop to support overlapping addresses.
   5. Otherwise, forward copy 4 * VEC_SIZE at a time with unaligned
      load and aligned store.  Load the last 4 * VEC and first VEC
      before the loop and store them after the loop to support
      overlapping addresses.
   6. If size >= __x86_shared_non_temporal_threshold and there is no
      overlap between destination and source, use non-temporal store
      instead of aligned store.  */
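The size and address tests in items 3 to 6 of this comment can be sketched as a small decision function. This is only an illustration of the comment, not the glibc code: the enum labels and the functions ranges_overlap and choose_strategy are hypothetical names, and the order in which the tests are applied is an assumption, because the real assembly contains further branches that the comment does not describe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative labels for the paths named in the comment; these are
   not glibc identifiers. */
enum copy_strategy {
    COPY_SMALL_REGISTERS,  /* item 3: size is 8 * VEC_SIZE or less */
    COPY_NON_TEMPORAL,     /* item 6: large copy, no overlap */
    COPY_BACKWARD_LOOP,    /* item 4: destination > source */
    COPY_FORWARD_LOOP      /* item 5: otherwise */
};

/* True if [a, a + n) and [b, b + n) share at least one byte. */
static bool ranges_overlap(uintptr_t a, uintptr_t b, size_t n)
{
    return a < b ? (b - a) < n : (a - b) < n;
}

/* Assumed ordering: small sizes first, then the non-temporal test,
   then the direction of the 4 * VEC_SIZE loop. */
static enum copy_strategy choose_strategy(uintptr_t dst, uintptr_t src,
                                          size_t n, size_t vec_size,
                                          size_t nt_threshold)
{
    if (n <= 8 * vec_size)
        return COPY_SMALL_REGISTERS;
    if (n >= nt_threshold && !ranges_overlap(dst, src, n))
        return COPY_NON_TEMPORAL;
    if (dst > src)
        return COPY_BACKWARD_LOOP;
    return COPY_FORWARD_LOOP;
}
```

With vec_size set to 32 and nt_threshold set to the tuned value, the function shows that only large copies between non-overlapping buffers reach the non-temporal path, which is why raising or lowering the threshold changes the behavior of large copies only.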
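As described in item 6 of the comment, if the copy size reaches __x86_shared_non_temporal_threshold (the internal variable set by the x86_non_temporal_threshold parameter) and the source and destination do not overlap, non-temporal stores are used instead of aligned stores. Non-temporal stores use the movntdq instruction to bypass the CPU caches and write directly to memory. For a large copy, the destination is unlikely to be in the cache, so an ordinary store would first read each cache line into the cache and then write it. Non-temporal stores omit this cache read and write, and are therefore more suitable for large memory copies than aligned stores.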
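To make the idea of a non-temporal store concrete, here is a minimal sketch in C that uses the SSE2 intrinsic _mm_stream_si128, which compilers translate to movntdq. The function copy_non_temporal is a hypothetical helper written for this illustration and is much simpler than the glibc assembly: it assumes the two buffers do not overlap and falls back to plain memcpy when SSE2 is not available.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper, not a glibc function: copy n bytes using
   non-temporal stores where possible. The buffers must not overlap. */
static void *copy_non_temporal(void *dst, const void *src, size_t n)
{
#if defined(__SSE2__)
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* movntdq needs a 16-byte-aligned destination, so copy a short
       head with ordinary stores until d is aligned. */
    size_t head = (16 - ((uintptr_t)d & 15)) & 15;
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;

    /* Main loop: unaligned load, non-temporal store. */
    for (; n >= 16; d += 16, s += 16, n -= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_stream_si128((__m128i *)d, v);
    }

    /* Non-temporal stores are weakly ordered; fence so that later
       stores are not observed before the streamed data. */
    _mm_sfence();

    /* Copy the remaining tail (fewer than 16 bytes). */
    memcpy(d, s, n);
    return dst;
#else
    return memcpy(dst, src, n);
#endif
}
```

A loop like this only tends to pay off for buffers much larger than the cache. For small copies the data is usually needed in the cache soon afterwards, which is the reason glibc applies a threshold before taking this path.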