Arc Forumnew | comments | leaders | submitlogin
Quasiquote in Arc
5 points by fallintothis 5660 days ago | 14 comments
In response to the discussion in a previous thread (http://arclanguage.org/item?id=9912), I used the patch on arc3 to let quasiquote be written in Arc itself.

This initial version is simply a direct port of the backquote implementation found in GNU clisp 2.47, which had a lot of unit tests that I also ported -- and all of the applicable ones pass. If you value your vision, you should probably avert your gaze from qq-test.arc (or at least squint): I rolled my own ad-hoc, feature-lacking test harness. But hey, it generates a nice-looking report.

As I said, this is just an initial implementation. The most glaring omission is that the Arc port doesn't handle mismatched unquotes. In the clisp version, these are handled by the reader macros for ` , ,@ etc., but the Arc version sticks with Scheme's reader, which does not protect from these errors. Of course, it's not too hard to fix this: it's just a matter of adding the 'level argument as in ac.scm.

While perhaps not the most opportune way to publish code (would rather just have plaintext + maybe a tar, for something this small), and though people around here seem to prefer git, I've uploaded the hg repository here: http://bitbucket.org/fallintothis/qq/



1 point by CatDancer 5660 days ago | link

Thanks! A quasiquote implementation that handles nesting correctly is important for us macro writers, so this is a great contribution.

What's a mismatched unquote?

-----

1 point by fallintothis 5659 days ago | link

I meant an unquote that wasn't balanced by an enclosing quasiquote. e.g., `(,,x) has one too many unquotes. This is also true of unquote-splicing, like `(,,@x). clisp handles this in the reader:

  $ clisp
  [1]> `(,,x)
  
  *** - READ: more commas out than backquotes in, is illegal
  The following restarts are available:
  ABORT          :R1      Abort main loop
  Break 1 [2]>
The Arc implementation can handle some cases, because the top-level 'unquote and 'unquote-splicing (outside of quasiquotes) will gripe, but it happens at macroexpansion:

  $ rlwrap mzscheme -m -f as.scm
  Use (quit) to quit, (tl) to return here after an interrupt.
  arc> (load "qq.arc")
  nil
  arc> `(,,x)
  Error: "reference to undefined identifier: _x"
  arc> (= x 1)
  1
  arc> `(,,x)
  Error: "unquote not allowed outside of a quasiquote:  1"
...and it now occurs to me that I'm not sure of any test cases that fail to signal an error. The error message isn't very nice, but as long as it does signal one, it's livable.

Extra input on this would be great. I'll try to hunt for ways to break it. Even if errors are signaled properly, it's better to have them at read-time, I think.

-----

1 point by CatDancer 5659 days ago | link

Extra input on this would be great. I'll try to hunt for ways to break it. Even if errors are signaled properly, it's better to have them at read-time, I think.

Hmm, you'd be rather ahead of much of Arc then. As you noticed with unquote, there are parts of Arc which currently don't catch errors at all, much less as early as possible.

I'm wondering if any read-time checks should be able to be turned off? Maybe I might want to be defining my own unusual quasiquote macro which does something with (quasiquote ((unquote (unquote x))))? I can't think of what that might be though.

What quasiquote expression expands into a dotted list so that you had to make your own append instead of using Arc's join? Is (join '(a) 'b) the kind of expression that would need to work? Do you see any disadvantage to fixing join so that it could do that?

-----

1 point by fallintothis 5659 days ago | link

What quasiquote expression expands into a dotted list so that you had to make your own append instead of using Arc's join?

It's predominantly because that's how the clisp version was implemented -- it's basically a straight port without much design consideration. But 'append does have some sort of impact (though probably rare in practice, and with debatable semantics):

  $ rlwrap mzscheme -m -f as.scm
  Use (quit) to quit, (tl) to return here after an interrupt.
  arc> (load "qq.arc")
  nil
  arc> (toggle-optimize)
  nil
  arc> (defs foo () 'x bar () 'y)
  #<procedure: bar>
  arc> (macex1 '`(,(foo) ,@(bar)))
  (append (list (foo)) (bar))
  arc> `(,(foo) ,@(bar))
  (x . y)
  arc> (toggle-optimize)
  t
  arc> (macex1 '`(,(foo) ,@(bar)))
  (cons (foo) (bar))
  arc> `(,(foo) ,@(bar))
  (x . y)
Optimizations or no, the expression should still mean the same thing, but as you can see, without optimizations it would break if 'join was used instead.

Incidentally, these are the same results you get from clisp. I don't have the interaction here because I just loaded the file and waded through repeated "continue"s until it finally yielded the damned result; if anyone's interested, I can reproduce it. Also, it's the result you get after fixing 'toggle-optimize, which I added but didn't use, so didn't have the chance to see it was broken. It should be:

  (mac toggle-optimize ()
    `(do ,@(map1 (fn (option) `(= ,option (no ,option)))
                 '(optimize-cons* optimize-list* optimize-append*))))
The change will be pushed to the repo.

Is (join '(a) 'b) the kind of expression that would need to work? Do you see any disadvantage to fixing join so that it could do that?

Well, arc seems to pitch on dotted lists a lot, since they aren't used too often. But it seems an odd case to miss, if only for the sake of completeness. Even functions like 'len are broken on dotted lists.

-----

1 point by CatDancer 5659 days ago | link

Have you ever written a program where you used append with a non-list in the last argument to create a dotted list? Or called length on a dotted list? Aside from unit testing?

The reason why I ask is that there's lots of places where Arc isn't "complete" (notice there's a caar, a cadr, and a cddr, but no cdar function in arc.arc, for example), but the question is what would actually get used in a program? And what will help programs be written more succinctly?

-----

1 point by fallintothis 5659 days ago | link

Ah, here's where the "debatable semantics" part comes in. ;)

Have you ever written a program where you used append with a non-list in the last argument to create a dotted list?

Nope.

Or called length on a dotted list?

Not personally: http://github.com/nex3/arc/blob/dca901c0cb59ebb533a511df3a84...

Perhaps 'len is a bad example, since some languages (e.g., Common Lisp) like to consider it an error to ask for the length of an improper list. There's a difference between incidental errors ("oh, right, didn't bother with dotted lists because I don't really use them") and intentional errors ("dotted lists are flat-out wrong to use here"). But the point is orthogonal to the one example, I think.

Inasmuch as Arc isn't "complete", you wind up needing to re-invent the wheel, losing succinctness in the process. Your example of car/cdr compositions is such an example, albeit an uncompelling case since (= cdar cdr:car) is easy to write. The reimplementation of 'len for ppr.arc is another, more compelling instance.

That's not to say that a language should be designed to cater to every possibility that might crop up in code. That leads to bloat. I'm just saying that things like handling dotted lists seem more like incidental "didn't happen to get around to it" bugs than "this should be an error". Indeed, the most striking example of this distinction I can think of is the entire reason that I wound up porting clisp's backquote: "never really used ,,@(...) before" vs. "you shouldn't use ,,@(...)". It's not so much a matter of commonality as it is about what makes sense.

Does that mean that 'append is right to use here? I don't know. I was just copying Common Lisp. If there's no logical reason to include it, going with 'join would be cleaner. On the other hand, if there's no logical reason to exclude it, which do we go for? I'd opt for the one that gives the most freedom: since 'append can be used just like 'join except in an edge case that might be useful, 'append offers you more utility -- rather than giving you an error.

(This sentiment could be phrased much more strongly if I actually had a useful case for choosing 'append over 'join. Note that I didn't try to provide a "real" example, just a demonstrative one.)

-----

1 point by CatDancer 5659 days ago | link

Here's what I suggest: publish a fixed 'join that is able to correctly create dotted lists. Now anyone who wants that doesn't have to reinvent the wheel.

Next, publish a version of your qq port that uses 'join. Now qq is simpler because it doesn't have to implement its own 'append; and you can point out that if we've using the old join then some qq expressions won't produce dotted lists, but if we use the new join they will.

-----

2 points by fallintothis 5658 days ago | link

I'm certainly open to using a fixed 'join, though my aim was to keep qq working as closely to vanilla Arc as possible, instead of modifying arc.arc functionality. Using Arc's current 'join gives way to some incorrect optimizations, as ported from the Common Lisp code -- e.g., (join nil form2) is not necessarily equivalent to just form2. So, I'll have to inspect that code, if there's the possibility of using arc.arc 'join (though I wonder if giving both options would be any "cleaner" than just keeping 'append around until/if 'join is "fixed").

On another note, a more appropriate place for the code and especially an altered 'join would probably be the Anarki repository (if only for keeping it a little more centralized). But I'm not sure I understand the protocol there, since I don't see any 'lib' directory in the arc3 branches. Unless this counts as a fix, I guess. Anyone care to give an executive summary / crash course? (The last I read about it was http://arclanguage.org/item?id=9777) Does it matter to anyone if this gets pushed to the git repo?

-----

1 point by CatDancer 5658 days ago | link

Using Arc's current 'join gives way to some incorrect optimizations

For any expression that doesn't have a dotted list as its final result?

my aim was to keep qq working as closely to vanilla Arc as possible, instead of modifying arc.arc functionality

The goal of Arc is to make it easy to write succinct programs. So, if:

A) someone finds that being able to use `(1 ,@2) helps them write a succinct program; or,

B) you have some natural optimizations that are easy to write if intermediate forms can use dotted lists, but would need more difficult or cumbersome code to work around a join / append that didn't work with (join ... atom), or even be conceptually more difficult to think about its correctness,

then I'd say let's fix Arc so that 'join is able to produce dotted lists.

On the other hand, if the only reason that you're implementing your own 'append is so that the code will pendantically pass Common Lisp unit tests for expressions that no one actually uses in their programs, then I'd question why we need that functionality :-)

What does 'list* do?

  ;; arc doesn't like macro definitions inside of 'do blocks (bug?), so resort to
  ;; defining them here.
What didn't work? In arc3:

  arc> (do (mac a () 33))
  #3(tagged mac #<procedure: a>)
  arc> (a)
  33

-----

1 point by fallintothis 5657 days ago | link

For any expression that doesn't have a dotted list as its final result?

Yes.

The way the optimizations work is to merely streamline the things that quasiquote itself introduces. It doesn't require or even expect user code to use 'append. e.g.,

  arc> (load "qq.arc")
  nil
  arc> (toggle-optimize)
  nil
  arc> (macex1 '`(append ,a ,b))
  (append (list (quote append)) (append (list a) (list b)))
Note that the top-level 'append wasn't touched -- it gets turned into (list (quote append)). But also note that the naive macroexpansion without optimizations produces (append (list a) (list b)), which is obviously excessive. So,

  arc> (toggle-optimize)
  t
  arc> (macex1 '`(append ,a ,b))
  (list (quote append) a b)
We still keep the top-level 'append, but a and b are no longer nested in a bunch of conses; the optimizer only changed the 'appends that quasiquote introduced haphazardly.

Then, if the optimizer tries to get rid of the extra 'appends, it must know that the final result will be the same. e.g., (append nil form2) == form2. Meaning that

  arc> (toggle-optimize)
  nil
  arc> (macex1 '`(append ,@nil ,@(f1)))
  (append (list (quote append)) (append nil (f1)))
  arc> (toggle-optimize)
  t
  arc> (macex1 '`(append ,@nil ,@(f1)))
  (cons (quote append) (f1))
The (append nil (f1)) that quasiquote introduced in the first case is turned into just (f1) in the second. Note again that the top-level 'append isn't touched; it doesn't matter if the programmer uses 'append.

Now, if quasiquote was changed to use 'join, so that any 'join it introduced might be optimized, there are fewer invariants because, for instance, (join nil form2) != form2 if form2 is not a list.

This wouldn't matter in optimized code, because the meaning of (join nil form2) could be taken to mean the same as (append nil form2). That is, we could use any ol' sentinel in lieu of 'append and treat its rules the same, like

  fake-arc> (toggle-optimize)
  nil
  fake-arc> (macex1 '`(append ,@nil ,@(f1)))
  (OPTIMIZE-ME (list (quote append)) (OPTIMIZE-ME nil (f1)))
  fake-arc> (toggle-optimize)
  t
  fake-arc> (macex1 '`(append ,@nil ,@(f1)))
  (cons (quote append) (f1))
thus leading to the same optimized code. However, when optimizations are off, if quasiquote produces (join nil form2) it might lead to an error where (append nil form2) would not. That is, if the above OPTIMIZE-ME were 'join, evaluating it as such dies when (f1) is not a list. Moreover,

  (iso (join (list 'append) (join nil (f1)))
       (cons 'append (f1)))
wouldn't necessarily be true.

So! If 'join were used instead of 'append, optimized code wouldn't care. But the invariant is supposed to be that optimized code results == non-optimized code results. (join nil form2) != form2, so the optimizer shouldn't touch 'join in this case, like it does with 'append currently.

All this means is that, if we s/append/join/g, the optimizer would need to use a different set of rules. This isn't so bad; it's not an insurmountable task by any stretch. But in so doing, it would change quasiquote semantics vs. using an append-style join (as discussed previously). So, giving both options would need to cover these altered semantics. Again, doable, just a note of concern when it doesn't seem a bad thing that 'append can be used (or 'join could be modified to work like CL 'append -- not against that, except when working with vanilla Arc).

In a similar vein, 'list* is introduced by the optimizer in attempts to cons less. Quoth CLTL:

  list* is like list except that the last cons of the constructed list is
  "dotted." The last argument to list* is used as the cdr of the last cons
  constructed; this need not be an atom. If it is not an atom, then the effect
  is to add several new elements to the front of a list. For example:
  (list* 'a 'b 'c 'd) => (a b c . d)
  This is like
  (cons 'a (cons 'b (cons 'c 'd)))
  Also:
  (list* 'a 'b 'c '(d e f)) => (a b c d e f) 
  (list* x) == x
The most glaring issue with this being that it relies on the efficiency of the implementation, of course. So, I'm not even sure that mine conses any less than using just 'cons. This is fixed by a not-dumb implementation, of course!

What didn't work?

  arc> (let b nil (mac b () 33) (b))
  Error: "Function call on inappropriate object #3(tagged mac #<procedure: b>) ()"
  arc> (do (mac b () 33) (b))
  Error: "Function call on inappropriate object #3(tagged mac #<procedure: b>) ()"
(By the way, is this right-margin crowded enough for you? Yay, nesting!)

-----

1 point by CatDancer 5657 days ago | link

thus leading to the same optimized code. However, when optimizations are off, if quasiquote produces (join nil form2) it might lead to an error where (append nil form2) would not. That is, if the above OPTIMIZE-ME were 'join, evaluating it as such dies when (f1) is not a list

Yes, but your example was ",@(f1)", so naturally that wouldn't work without 'append when (f1) isn't a list.

I can think of three cases:

Case A: we want to use ",@X" when X is not a list. This isn't supported by Bawden's implementation, so it's a new feature. Clearly we need to use 'append then.

Case B: we require X to be a list when we use ",@X", but the qq expander (optimizing or not) relies on intermediate forms that call 'append with non-list arguments, even though neither the input nor the eventual final output has dotted pairs. Then we'd either need to use 'append, or rewrite the expander not to produce those intermediate forms.

Case C: When X is a list in any use of ",@X", the expander calls append and produces code that calls append with only lists. Then we could use Arc's current 'join instead of 'append.

Maybe I'm just up too late, but I'm not seeing anything in your examples that would indicate that B is the case?

If I always only use ",@" with lists, does the current code (with optimization turned on or off) ever produce the wrong result if 'join is used instead of 'append? (It's OK if the answer is you wouldn't know without looking into the question further).

  (do (mac b () 33) (b))
oh, were you hoping that the macro would be staticly scoped?

-----

1 point by fallintothis 5657 days ago | link

I seem to have misunderstood your question. Apologies. The cases will make it easier for me to not talk past you:

Case A is what I've been defending (and is currently how the port works). I don't have any direct examples of its usefulness, but think that it's semantically clearer (a point which I don't think I've made clear):

  $ rlwrap mzscheme -m -f as.scm
  Use (quit) to quit, (tl) to return here after an interrupt.
  arc> (load "qq.arc")
  nil
  arc> (= tail '(b c d))
  (b c d)
  arc> `(a ,@tail)
  (a b c d)
  arc> (cons 'a tail)
  (a b c d)
  arc> (= tail 'b)
  b
  arc> `(a ,@tail)
  (a . b)
  arc> (cons 'a tail)
  (a . b)
I also consider it an added bonus: dotted lists exist, so it seems presumptuous to think that macros might not stand to gain from using them, splicing into them, whatever. (Just as it seems presumptuous to think that ,,@(...) should be an error.)

Case B is incorrect, so far as I can tell. Indeed, the clisp code was very careful about this -- making sure not to pass the cdr of a dotted list as the first argument to 'append, for instance.

Case C is correct, I think, but would run against case A, since using 'join when the splice is not a list will result in an error, as in the examples I've belabored.

Note, though, that case A needn't be the "correct" solution, not even for Common Lisp (quoth http://www.supelec.fr/docs/cltl/clm/node190.html#BACKQUOTE):

  As an example, the above definition implies that

  `((,a b) ,c ,@d)

  will be interpreted as if it were

  (append (list (append (list a) (list 'b) 'nil)) (list c) d 'nil)

  but it could also be legitimately interpreted to mean any of the following.

  (append (list (append (list a) (list 'b))) (list c) d)
  (append (list (append (list a) '(b))) (list c) d)
  (append (list (cons a '(b))) (list c) d)
  (list* (cons a '(b)) c d)
  (list* (cons a (list 'b)) c d)
  (list* (cons a '(b)) c (copy-list d))

  (There is no good reason why copy-list should be performed, but it is not prohibited.)
I just happen to like the idea, is all. And, admittedly, for wholly impractical reasons!

If I always only use ",@" with lists, does the current code (with optimization turned on or off) ever produce the wrong result if 'join is used instead of 'append? (It's OK if the answer is you wouldn't know without looking into the question further).

Sorry if I come off like that. I don't mean to seem like I'm trying to answer without knowing anything ("going from the gut" or whatever).

For what it's worth, I strongly think that, if unquote-splicing is used strictly with lists, 'join will do the same job as 'append.

Of course, I wouldn't know without looking into the question further. ;)

So, have I finally managed to address what you asked?

oh, were you hoping that the macro would be staticly scoped?

Not even that. I just wanted it to be done in a single sexp on the last unit test (since it's 1 sexp per test), as I note in the source:

"local" macros don't work in arc:

  arc> (let a 12
         (let b nil
           (mac b ()
             (let c 19
               ``(,a ,@',@(list c))))
           (b)))
  Error: "Function call on inappropriate object #3(tagged mac #<procedure: b>) ()"
Even using a 'do block doesn't work (for some reason):

  arc> (do
         (mac m ()
           (let c 19
             ``(,a ,@',@(list c))))
         (let a 12
           (m)))
  Error: "Function call on inappropriate object #3(tagged mac #<procedure: m>) ()"
  arc> (do
         (mac m ()
           (let c 19
             ``(,a ,@',@(list c))))
         (let a 12
           (m)))
  *** redefining m
  (12 . 19)
Unsure whether this is an Arc bug, the problem was mitigated by using a (mac ...) outside of the 'do block.

-----

1 point by CatDancer 5657 days ago | link

So, have I finally managed to address what you asked?

Yup!

OK, here's what I think.

One principle of Arc is to find the minimum combination of code that implements the language. So we shouldn't end up with both a 'join and a 'append; either we decide we want 'join to be able to produce a dotted list, or else we should just use the old 'join.

One way of getting succinct code is not to implement features that no one uses. So if the only reason for using 'append (or extending 'join to be able to produce a dotted list) was to pendantically pass some Common Lisp unit tests, then I be inclined to go with case (A) and just use Arc's current 'join.

However, another principle of Arc is not to forbid things to programmers unless there's no way to provide them with logical consistency. I can't think of any reason to tell you not to use

  arc> `(a ,@'b)
  (a . b)
if you want to... so why should I advocate for restricting splicing to lists?

So here's a patch to arc.arc that extends 'join to accept a non-list in the last argument: http://hacks.catdancer.ws/dotted-join0.patch

It's shorter than your append2/append, at the cost of not providing an explicit error message if a non-list is passed in some argument not the last.

I just wanted it to be done in a single sexp... Unsure whether this is an Arc bug

In Arc, the entire expression is macro expanded and compiled, and then evaluated. So in

  arc> (do (mac a () 33)
           (a))
  Error: "Function call on inappropriate object #3(tagged mac #<procedure: a>) ()"
what's happening is that the 'mac form, when evaluated, will create a new global macro 'a. However, when the 'do form is being macro expanded / compiled, the 'a macro hasn't been defined yet. So "(a)" compiles into a function call on 'a. Then, when the 'do form is evaluated, 'a is set to be a macro, and then 'a is function called. Which produces an error, since a macro object isn't callable.

To keep it all in one sexp you could say

  arc> (do (mac a () 33)
           (eval '(a)))
  33

-----

1 point by shader 5659 days ago | link

I've called length on a dotted list in ppr.arc, and to do that I needed to redefine 'len to support it. I can't really think of an application outside of messing with argument lists.

-----