Hello,

I attach a patch that uses memcpy instead of strcpy for strings of length 8 or more. The four attached scripts show performance gains of 10.5%, 22.4%, 8.5% and 14.7%.

I have not yet run tests that would, for example, search for a better threshold than 8. I thought I would first ask for ideas for such tests: when is ztrdup() called often, and when rarely? I would use the ideas and run the tests tomorrow (probably for the whole day).

I struggled for a long time with a bug in the LLVM compiler: memcpy can slow down seemingly at random there. I documented it here:
https://www.youtube.com/watch?v=1HVDbU7Sbnw

The gains exceed my expectations. They were rather lower on OS X, but the bug prevented me from obtaining consistent results there, so I only warn that the gains might be lower. I am now running the tests on a FreeBSD machine. It is worth noting that FreeBSD's memcpy isn't the fastest one; I would say it is slow (http://www.embedded.com/design/configurable-systems/4024961/Optimizing-Memcpy-improves-speed). Tomorrow I will also run the tests on a Linux machine.

It might be that the faster the CPU, the lower the gain. If someone could repeat the tests on a machine with 1.5 GHz or more, it would be of value.

- OS X 10.9.2, clang-500.2.79, zsh version used in the video: 377e240, Core i5 2.3 GHz
- FreeBSD 10.1, gcc 4.8.4, zsh version used in the tests: 64061e5, Pentium M 600 MHz

Best regards,
Sebastian Gniazdowski
Attachments:
- copymemory.patch
- testopt1.zsh
- testopt2.zsh
- testopt3.zsh
- testopt4.zsh
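
For readers without the patch at hand, the idea described in the message can be sketched roughly as below. This is only an illustration under assumptions, not the contents of copymemory.patch: dup_string() is a hypothetical stand-in for ztrdup(), it uses plain malloc() rather than zsh's own allocator, and COPY_THRESHOLD is a made-up name for the value 8 mentioned above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical name; 8 is the untuned value mentioned in the message. */
#define COPY_THRESHOLD 8

/*
 * Hypothetical stand-in for ztrdup(): duplicate a NUL-terminated string.
 * The length is needed for the allocation anyway, so for longer strings
 * it can be reused to copy with memcpy instead of strcpy.
 */
char *dup_string(const char *s)
{
    size_t len;
    char *t;

    if (!s)
        return NULL;
    len = strlen(s);
    t = malloc(len + 1);            /* assumption: plain malloc, not zsh's allocator */
    if (!t)
        return NULL;
    if (len >= COPY_THRESHOLD)
        memcpy(t, s, len + 1);      /* len + 1 also copies the terminating NUL */
    else
        strcpy(t, s);               /* short strings keep the strcpy path */
    return t;
}
```

Both branches produce the same result; only the copy routine differs, so a call such as dup_string("a considerably longer string") goes through memcpy while dup_string("short") still goes through strcpy. The caller releases the result with free().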