Zsh Mailing List Archive

[PATCH] Optimization for mb_metastrlenend()



Hello,
mb_metastrlenend() can count a character quickly when it is ASCII
(0..127) and follows a complete character. I found a good test for this:
a syntax-highlighting parser working on 823 lines of Zsh code. The
parser comes from my project HSMW and is a modified and optimized
version of the zsh-syntax-highlighting parser. Running time before the
optimization: 2237 ms, after: 2027 ms, so this is a speedup of roughly
9% for long buffers. I repeated the test many times and it is a clear
win. For short buffers (calling the parser line by line on different,
hard input) the gain is ~30 ms on run times of ~1450 ms (about 2%), so
no real win there. Zprof results for long buffers and instructions for
repeating the test are attached. I checked that all Zsh tests pass.
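
To show the idea outside of utils.c, here is a minimal standalone sketch
of the same fast path. It is not the patched function: count_chars() and
its use_fast_path switch are hypothetical names made up for this
illustration, and the sketch leaves out Meta decoding, the width
argument, eptr, and the rewind to laststart that the real code performs
on invalid input. It only shows that an ASCII byte on a character
boundary can be counted without calling mbrtowc().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Count characters in a NUL-terminated multibyte string, feeding
 * mbrtowc() one byte at a time.  Hypothetical helper for illustration
 * only; use_fast_path toggles the ASCII shortcut so that both variants
 * can be compared.
 */
size_t count_chars(const char *s, int use_fast_path)
{
    mbstate_t state;
    size_t num = 0;
    int complete = 1;   /* 1 when the previous byte ended a character */

    assert(s != NULL);
    memset(&state, 0, sizeof(state));

    while (*s) {
        char inchar = *s++;
        wchar_t wc;
        size_t ret;

        /* Fast path: an ASCII byte on a character boundary is one character. */
        if (use_fast_path && complete && (unsigned char)inchar <= 0x7f) {
            num++;
            continue;
        }

        ret = mbrtowc(&wc, &inchar, 1, &state);
        if (ret == (size_t)-2) {
            /* Incomplete sequence: keep feeding bytes, count nothing yet. */
            complete = 0;
            continue;
        }
        if (ret == (size_t)-1) {
            /* Invalid byte: reset the state and count it as one character. */
            memset(&state, 0, sizeof(state));
        }
        num++;
        complete = 1;
    }
    return num;
}
```

Without a setlocale() call the program runs in the "C" locale; to see
multibyte sequences counted as single characters, call
setlocale(LC_ALL, "") under a UTF-8 locale first. In either case
count_chars(s, 1) and count_chars(s, 0) should return the same value,
which is the property the patch relies on.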



diff --git a/Src/utils.c b/Src/utils.c
index db43529..5bc9ef4 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -5323,7 +5323,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
     char inchar, *laststart;
     size_t ret;
     wchar_t wc;
-    int num, num_in_char;
+    int num, num_in_char, complete;

     if (!isset(MULTIBYTE))
        return ztrlen(ptr);
@@ -5331,6 +5331,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
     laststart = ptr;
     ret = MB_INVALID;
     num = num_in_char = 0;
+    complete = 1;

     memset(&mb_shiftstate, 0, sizeof(mb_shiftstate));
     while (*ptr && !(eptr && ptr >= eptr)) {
@@ -5339,6 +5340,14 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
        else
            inchar = *ptr;
        ptr++;
+
+        if ( complete && ( inchar >= 0 && inchar <= 0x7f ) ) {
+            num ++;
+            laststart = ptr;
+            num_in_char = 0;
+            continue;
+        }
+
        ret = mbrtowc(&wc, &inchar, 1, &mb_shiftstate);

        if (ret == MB_INCOMPLETE) {
@@ -5358,6 +5367,7 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
             * so we don't count characters twice.
             */
            num_in_char++;
+            complete = 0;
        } else {
            if (ret == MB_INVALID) {
                /* Reset, treat as single character */
@@ -5378,8 +5388,10 @@ mb_metastrlenend(char *ptr, int width, char *eptr)
                }
            } else
                num++;
+
            laststart = ptr;
            num_in_char = 0;
+            complete = 1;
        }
     }

-- 
  Sebastian Gniazdowski
  psprint@xxxxxxxxxxxx


