Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: [PATCH 3/3] Constify two local variables.



Peter Stephenson wrote on Mon, Nov 30, 2015 at 09:38:32 +0000:
> On Mon, 30 Nov 2015 03:21:53 +0000
> Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx> wrote:
> > I assume that v->pm->gsu.a->getfn() has no access to v->start and
> > v->end, hence changing the order is safe.
> 
> Yes, the getfn() is very low level, dealing only with storage for that
> parameter and having no knowledge at all of any expansion/substitution
> happening above it.
> 

Thanks, that's what I expected.  What are the lifetime interfaces of
getfn/setfn of arrays?  As far as I can tell, getfn returns a "borrowed
reference" (i.e., something whose lifetime is managed by the parameter)
and setfn "steals" a reference to its second parameter (assumes
responsibility for freeing it).

I'm asking because I found that $region_highlights' getfn returns
a heap-allocated array, while its setfn calls free() on the array passed
to it, so
.
    Param pm = $region_highlights;
    values_a = pm->gsu.a->getfn();
    pm->gsu.a->setfn(values_a);
.
is undefined behaviour (calls free() on a heap pointer, triggering SIGABRT).

That's a reduced example of what the "don't reallocate if it's the same
size" patch does, and I'm not sure how to make it work: should I change
get_re, the setfn, or the callsite which tries to pipe them into each
other?

(The getfn is get_region_highlight.  The setfn is set_region_highlight.
I'm attaching that patch in its current state, in case it's relevant;
however, it's not ready to be committed, since it triggers the SIGABRT
whenever I invoke backward-kill-word.)

> > Are there any compilers that choke on 'char **const' (const pointer to pointer
> > to char)?
> 
> I suspect we ditched those some years ago when we started making more
> use of const, but there might be a few oddballs lying around.
> 

Ack.  I'll hold this patch (37253) until the release too then (unless
there's another test version first).  No point in breaking people's
builds if I can avoid it.

> > See for example the number of uses of mkarry()
> 
> that should have been "mkarray()".  I see 19 uses, then there's
> also hmkarray() which comes from the heap and is only used 3 times.

*nod*, however, I haven't looked into these yet; I have been looking
into the "don't reallocate if it's the same size" part first.

Thanks,

Daniel
From c4cba933db3e972a5539e489e45d5908918e3b59 Mon Sep 17 00:00:00 2001
From: Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx>
Date: Mon, 30 Nov 2015 09:18:12 +0000
Subject: [PATCH] Optimize non-resizing array assignments.

Before:
	% zsh -fc 'local -a a; N=5000; time ( a[N]=bar; integer i; while (( i++ < N )) { a[i]=foo } )'
	2.07s user 0.00s system 99% cpu 2.077 total

After:
	% zsh -fc 'local -a a; N=5000; time ( a[N]=bar; integer i; while (( i++ < N )) { a[i]=foo } )'
	0.11s user 0.00s system 98% cpu 0.114 total

Also add a test demonstrating that this codepath must call setfn(), and
another triggering the 'v->end > arrlen(old)' case.
---
To be applied on top of 37253.

 Src/params.c        | 70 +++++++++++++++++++++++++++++++++++++----------------
 Test/A06assign.ztst | 12 +++++++++
 2 files changed, 61 insertions(+), 21 deletions(-)

diff --git a/Src/params.c b/Src/params.c
index 142697f..94c996a 100644
--- a/Src/params.c
+++ b/Src/params.c
@@ -2553,13 +2553,9 @@ setarrvalue(Value v, char **val)
 	return;
     } else {
 	char **const old = v->pm->gsu.a->getfn(v->pm);
-	char **new;
-	char **p, **q, **r; /* index variables */
 	const int pre_assignment_length = arrlen(old);
+	const int slice_length = arrlen(val);
 	int post_assignment_length;
-	int i;
-
-	q = old;
 
 	if ((v->flags & VALFLAG_INV) && unset(KSHARRAYS)) {
 	    if (v->start > 0)
@@ -2579,24 +2575,56 @@ setarrvalue(Value v, char **val)
 	if (v->end < v->start)
 	    v->end = v->start;
 
-	post_assignment_length = v->start + arrlen(val);
+	post_assignment_length = v->start + slice_length;
 	if (v->end <= pre_assignment_length)
-	    post_assignment_length += pre_assignment_length - v->end + 1;
-
-	p = new = (char **) zshcalloc(sizeof(char *)
-		                      * (post_assignment_length + 1));
-
-	for (i = 0; i < v->start; i++)
-	    *p++ = i < pre_assignment_length ? ztrdup(*q++) : ztrdup("");
-	for (r = val; *r;)
-	    *p++ = ztrdup(*r++);
-	if (v->end < pre_assignment_length)
-	    for (q = old + v->end; *q;)
-		*p++ = ztrdup(*q++);
-	*p = NULL;
+	    post_assignment_length += pre_assignment_length - v->end;
+
+	if (post_assignment_length == pre_assignment_length) {
+	    /* Optimization: avoid reallocating. */
+	    const int effective_end_index = MIN(v->end, pre_assignment_length);
+	    int i;
+
+	    DPUTS3(v->start + slice_length != effective_end_index,
+		   "BUG: %s: start=%d end=%d",
+		    v->pm->node.nam, v->start, v->end);
+	    DPUTS3(v->start + slice_length != effective_end_index,
+		   "BUG: %s: slice#=%d arr#=%d",
+		    v->pm->node.nam, slice_length, pre_assignment_length);
+	   
+	    /* Shallow copy 'val' into 'old'. */
+	    for (i = v->start; i < effective_end_index; i++)
+		zsfree(old[i]);
+	    memcpy(old + v->start, val, (i - v->start) * sizeof(char *));
+
+	    v->pm->gsu.a->setfn(v->pm, old);
+	    free(val);
+	} else {
+	    /* Why allocate N+2 pointers?  The first N are the elements, the 
+	     * (N+1)th is the NULL sentinel, but what is the (N+2)th pointer?
+	     */
+	    char **const new =
+		(char**) zshcalloc(sizeof(char *)
+				   * (post_assignment_length + 2));
+	    char **p, **q, **r; /* index variables */
+	    int i;
+
+	    q = old;
+	    p = new;
+	    r = val;
+
+	    for (i = 0; i < v->start; i++)
+		*p++ = i < pre_assignment_length ? ztrdup(*q++) : ztrdup("");
+	    for (r = val; *r;)
+		*p++ = ztrdup(*r++);
+	    if (v->end < pre_assignment_length)
+		for (q = old + v->end; *q;)
+		    *p++ = ztrdup(*q++);
+	    *p = NULL;
+
+	    v->pm->gsu.a->setfn(v->pm, new);
+	    freearray(val);
+	}
 
-	v->pm->gsu.a->setfn(v->pm, new);
-	freearray(val);
     }
 }
 
diff --git a/Test/A06assign.ztst b/Test/A06assign.ztst
index 1e3d2ed..17f7c63 100644
--- a/Test/A06assign.ztst
+++ b/Test/A06assign.ztst
@@ -478,3 +478,15 @@
 >first second
 >first
 >second
+
+ typeset -aU unique_array=(first second)
+ unique_array[1]=second
+ print $unique_array
+0:assignment to unique array
+>second
+
+ typeset -a array=(first)
+ array[1,3]=(FIRST)
+ print $array
+0:slice beyond length of array
+>FIRST
-- 
2.1.4



Messages sorted by: Reverse Date, Date, Thread, Author