Arc Forumnew | comments | leaders | submit | Pauan's commentslogin

It is true that symbols are unhygienic, but if you use boxes instead, then you get perfect hygiene. Unlike function/vau values, boxes can exist entirely at compile-time, so they're suitable for compilation to JavaScript.

For a working example, check out my language Nulan, which compiles to JavaScript and has hygienic macros that use boxes:

https://github.com/Pauan/nulan

http://pauan.github.io/nulan/doc/tutorial.html

I can provide more details if desired. Though... you're exploring vau based semantics with wat, so I'm not sure how useful boxes would be for you.

In any case, I like what you've done with wat, though I've leaned away from vau recently.

-----

2 points by manuelsimoni 4581 days ago | link

So the idea behind boxes is that you can create boxes at compile-time for runtime stuff that doesn't exist yet?

Note that the same idea can work for Vau first-class values: at compile-time you simply create a faux value for each non-compile-time top-level (or other) definition. Macros can then insert them into generated code, even though they are not the real runtime object.

-----

2 points by Pauan 4581 days ago | link

Yes, at least in Nulan the boxes only exist at compile-time: at runtime they're normal JS variables. Of course, if you were writing a VM, the boxes could also exist at runtime, which would be faster than having to lookup a symbol: you would just unbox it!

And yes, I suppose it would be possible to use boxes in some way with vaus, especially because modern vau-based languages use lexical scope. This doesn't completely solve the "vau are slow" problem, but it at least avoids a symbol lookup for every single variable.

In fact, come to think of it... I had been playing around with this idea, but I had forgotten about it. The basic idea is to create a first-class environment/namespace, but rather than mapping symbols to values, it would map symbols to boxes.

The idea is that, this namespace exists both at compile-time and run-time, so that at compile-time it uses the namespace to replace symbols with boxes (thus gaining speed and the ability to statically verify things at compile-time), but using "eval" at runtime would first lookup the symbol in the namespace (which returns a box), and would then unbox it.

If done properly, I think such a system would appear to behave similarly to the normal environment system as used by Kernel, but it could potentially be faster, because most symbols can be replaced at compile-time with boxes. In any case, it's an implementation detail, unless the language decides to expose the boxes underneath.

-----

3 points by manuelsimoni 4581 days ago | link

Exactly.

My plan is to use these ideas not with fexprs, but with procedural macros, thereby gaining fully static expansion at compile-time, because I want a type checker, and type checking fexprs is currently not possible to my knowledge.

The macroexpansion pass would register fake bindings (very similar in spirit to your boxes) for all runtime definitions in the compile-time environment.

Macros then include these fake objects at compile-time (just as fexprs would with the real objects at runtime) into the generated code.

Thereby, I hope to maintain the pleasurable hygiene properties of fexprs in a language with preprocessing macros.

-----

2 points by Pauan 4581 days ago | link

Well then, that sounds exactly like what Nulan does. I'm glad to see these ideas spreading, whether independently discovered or not.

Since I've already explored these ideas, I would like to mention a couple things.

---

What you call "fake bindings", Nulan calls boxes. I like this name because it's short, unique, and intuitive. A box is simply a data structure that can hold a single item and supports getting that item and setting that item.

In other words, putting something into the box, and taking something out of the box. But changing the contents of the box doesn't change the box itself, obviously. And because they're mutable, boxes are equal only to themself. So even if two boxes hold the same value, they might not be equal to eachother.

Interestingly, boxes are exactly equivalent to gensyms:

  (define foo (gensym))  ; Create the box
  (eval foo)             ; Get the value of the box
  (eval `(set! ,foo 1))  ; Set the value of the box
And gensyms are equal only to themself. So they satisfy all the conditions of boxes, and in Nulan, gensyms actually are boxes! So another way of thinking about it is that Nulan replaces all symbols with gensyms.

Yet another way to look at it is that boxes are equivalent to pointers in C, but now we're straying a bit far away from Lisp...

Because I wanted Nulan to be hygienic by default, I decided to use boxes for everything. In addition, they're a first-class entity. That is, Nulan lets you run arbitrary code at compile-time, and also gives you access to boxes at compile-time.

And the way that Nulan implements macros is... basically, a "macro" is simply a box that has a &macro property. When Nulan sees a box as the first element of a list, it checks if the box has a &macro property (which should be a function), and if so, it calls it.

But boxes can have other properties as well, for instance &pattern lets you define custom pattern-matching behavior, &get is called when the box isn't a macro, and &set is called when assigning to the box:

  foo 1 2 3     # &macro
  foo           # &get
  foo <= 1      # &set
  -> (foo 1) 2  # &pattern
This means the same box can have different behavior depending on its context. And because these things are tied to the box rather than the symbol, everything is perfectly hygienic and behaves as you expect.

Boxes also have other properties as well, like whether they're local or global, and whether they were defined at compile/run-time. And the Nulan IDE (http://pauan.github.io/nulan/doc/tutorial.html) understands boxes as well, so that if you click on a box, it will highlight it. So even though two variables may have the same name, the Nulan IDE knows they're two different boxes.

By the way, in Nulan, the correct way to write the "when" macro is like this:

  $mac when -> test @body
    'if test
       | ,@body
Basically, Nulan doesn't have "quasiquote". Instead, "quote" supports "unquote" and "unquote-splicing". In addition, "quote" returns boxes rather than symbols, so it is by default hygienic.

If you want to write an unhygienic macro, you can use the "sym" function, which takes a string and converts it into a symbol:

  $mac aif -> test @rest
    w/box it = sym "it"
      'w/box it = test
         if it ,@rest
I prefer this way of doing things because it reduces the amount of syntax (no quasiquote), it means everything is hygienic by default, and it allows you to break hygiene, but you have to explicitly use the "sym" function. Because unhygienic macros are rare, I think this is a good tradeoff.

Also, you might have noticed that the above two macros don't use "unquote". As a convenience, when using "quote", if the box is local (bound inside a function), it will automatically "unquote" it, meaning that these two are equivalent:

  w/box foo = 5
    'foo

  w/box foo = 5
    ',foo
The reason for having "unquote" at all is for splicing in expressions:

  # returns (5 + 10)
  w/box foo = 5
    '(foo + 10)

  # returns 15
  w/box foo = 5
    ',(foo + 10)
---

Actually, I don't mind unhygienic macros that much, so "hygienic macros" isn't the primary reason I love boxes. The main reason I like them is because they make it really easy to implement hyper-static scope. And once your language has hyper-static scope, it's very easy to make an amazingly simple, concise, powerful, and flexible namespace system.

So boxes really kill multiple birds with one stone, in a really simple to understand and simple to implement way. They may not have the generality of vau or functions, but they're a practical data structure that does a lot for little cost.

-----

3 points by shader 4580 days ago | link

I can understand the logic for not using quotes in favor of simpler techniques for hygenic macros, but I still like having them as short hand for really common list manipulations. Or even just an easy way to make a symbol. Of course, I've also always wanted something like python's list and *dict for splicing a list onto the end of an argument set without having to do something ugly like (apply f (join (list a b) c)). (f a b @c) looks so much nicer to me...

Anyway, aside from aesthetics, boxes really do seem useful. Another thought that just occurred to me is that they should enable trivial implementation of setforms and "out args".

I.e., if (car (cdr (obj key))) returned a box, you would be able to just call '= on it, with no thought for how you found the value.

And for out args, all you need to do is have double-boxes for function arguments. If you access the outer box, 'get is passed through to the inner box and you just receive the value. If you just call '= on the outer box, the value gets shadowed like normal, because 'set would not be passed through by default. If you want to modify the external variable, all you have to do is unbox it and call 'set on the original box.

I feel like there might be a similar way to enable dynamically scoped functions with first class environments, but I'm not sure.

-----

3 points by shader 4580 days ago | link

I've been thinking about this some more, and it seems to me that boxes in the way Nulan uses them make for a completely different evaluation paradigm than the traditional metacircular evaluation.

In the traditional evaluation scheme, using environments, symbols, read, eval, and apply, the process is as follows:

  1) read turns text into code (lists of symbols and values)
  2) eval looks up a symbol in the current environment, or expands a macro, and calls apply to get the value of a function
  3) apply calls eval to get the value of the arguments, and then returns the value of the function
In a compile-time based box scenario, the environments go away and read/eval change to something like this:

  1) parse the new code into lists of appropriately boxed values, and expand macros
  2) recursively apply functions to values to get the final result.
More of a traditional compile/run separation. The main difference to note in the box formulation is that there is no such thing as an environment. The 'hyperstatic scope' that Pauan keeps mentioning is another way of saying, 'there is no scope', only variables.

Thus, there is no real way to make variations on the scoping scheme using hyperstatic scope, because that's all a read/compile time feature. Dynamic scope is impossible without tracking environments, which are otherwise unnecessary.

Now, if one were to use both first-class environments and boxes, one can theoretically have the best of both worlds, with only a little extra overhead, and most of that at compile time. Flexible scoping becomes possible by specifically referencing the variable in the environment instead of using the default boxed values. They would still be boxes, just referenced by symbol in the environment table.

Now what I'm wondering is whether there is any useful equivalence to be found between a box and an environment. Heck, just make a box an object that has a few hard coded slots (that can be accessed by array offset) and a hash table for the rest, and voila! It can replace cons cells and regular hash tables too :P All we need is a reason for making everything use the same core structure...

-----

2 points by Pauan 4580 days ago | link

Yes, that's intentional. Because I wanted to compile to fast JavaScript, I chose an evaluation model that has a strict separation between compile-time and run-time.

---

"The 'hyperstatic scope' that Pauan keeps mentioning is another way of saying, 'there is no scope', only variables."

Actually, there is still scope. After all, functions still create a new scope. It's more correct to say that, with hyper-static scope, every time you create a variable, it creates a new scope:

  box foo = 1     # set foo to 1
  def bar -> foo  # a function that returns foo
  box foo = 2     # set foo to 2
  bar;            # call the function
In Arc, the call to "bar" would return 2. In Nulan, it returns 1. The reason is because the function "bar" is still referring to the old variable "foo". The new variable "foo" shadowed the old variable, rather than replacing it like it would in Arc. That's what hyper-static scope means.

From the compiler's perspective, every time you call "box", it creates a new box. Thus, even though both variables are called "foo", they're separate boxes. In Arc, the two variables "foo" would be the same box.

---

"Thus, there is no real way to make variations on the scoping scheme using hyperstatic scope, because that's all a read/compile time feature. Dynamic scope is impossible without tracking environments, which are otherwise unnecessary."

If by "dynamic scope" you mean like Arc where globals are overwritten, then that's really easy to do. As an example of that, check out Arc/Nu, which also uses boxes:

https://github.com/Pauan/ar

In the Arc/Nu compiler, it literally only takes a single line of code to switch between Arc's dynamic scope and hyper-static scope.

If by "dynamic scope" you mean like Emacs Lisp, then... actually that should be really easy as well. You would just replace the same symbol with the same box, and then use box mutation at run-time. Of course, at that point I don't think there'd be any benefit over run-time environments... In any case, I like lexical scope, and boxes work well for lexical scope.

Also, although Nulan has hyper-static scope, it does have dynamic variables. The way that it works in Nulan is, you create a box like normal...

  box foo = 1
...and then you can dynamically change that variable within a code block:

  w/box! foo = 2
    ...
Within the "w/box!" block, the variable "foo" is 2, but outside, it is 1. You can do this with any variable in Nulan.

---

"Now what I'm wondering is whether there is any useful equivalence to be found between a box and an environment."

No. And that's a good thing. I've found one of the major benefits of boxes is that they're a single stand-alone entity: each variable corresponds to a single box. In the environment model, you have a hash table which maps symbols to values, so you have a single data structure representing many variables.

The reason I prefer each variable being represented by a separate box is that it makes namespaces really really easy to design and implement. For instance, in Arc/Nu you can selectively import only certain variables:

  (w/include (foo bar qux)
    (import some-file))
This is really easy with boxes: you simply grab the "foo", "bar", and "qux" boxes and import them. But with environments, it's a lot harder, because the environment for the file "some-file" may contain all kinds of variables that you don't want to import.

You could take the values from the environment and copy them into the current environment, but now any changes made won't show up. With boxes, the changes show up. I don't think environments work well for namespace systems. But boxes work wonderfully well.

---

"It can replace cons cells and regular hash tables too :P"

I've thought about ways to replace hash tables with boxes, but I don't think it'd be especially helpful. Hash tables serve a different purpose from boxes, so it makes sense for a language to have both data structures. If you want a box, just use a box. If you want a hash table, just use a hash table.

Interestingly enough, both Nulan and Arc/Nu have a giant hash table at compile-time that maps symbols to boxes. This is basically the compile-time equivalent of the run-time environments that languages like Kernel have. Creating a new scope or changing the existing scope is as easy as changing this hash table, which makes it really easy to play around with different namespace systems.

-----

2 points by rocketnia 4579 days ago | link

"The main difference to note in the box formulation is that there is no such thing as an environment."

Doesn't step 1 need to use environment(s)? How else would it turn text into a structure that contains references to previously defined values?

---

"The 'hyperstatic scope' that Pauan keeps mentioning is another way of saying, 'there is no scope', only variables."

The hyper-static global environment (http://c2.com/cgi/wiki?HyperStaticGlobalEnvironment) is essentially a chain of local scopes, each one starting at a variable declaration and continuing for the rest of the commands in the program (or just the file). I think the statement "there is no scope" does a better job of describing languages like Arc, where all global variable references using the same name refer to the same variable.

---

"Flexible scoping becomes possible by specifically referencing the variable in the environment instead of using the default boxed values. They would still be boxes, just referenced by symbol in the environment table."

I don't understand. If we're generating an s-expression and we insert a symbol, that's because we want that symbol to be looked up in the evaluation environment. If we insert a box, that's because we want to look up the box's element during evaluation. Are you suggesting a third thing we could insert here?

Perhaps if the goal is to make this as dynamic as possible, the inserted value should be an object that takes the evaluation environment as a parameter, so that it can do either of the other behaviors as special cases. I did something as generalized as this during Penknife's compilation phase (involving the compilation environment), and this kind of environment-passing is also used by Kernel-like fexprs (vau-calculus) and Christiansen grammars.

---

"Now what I'm wondering is whether there is any useful equivalence to be found between a box and an environment."

I would say yes, but I don't think this is going to be as profound as you're expecting, and it depends on what we mean by this terminology.

I call something an environment when it's commonly used with operations that look vaguely like this:

  String -> VariableName
  (Environment, VariableName) -> Value
  (Environment, Ast) -> Program
Meanwhile, I call something a box when it's primarily used with an operation that looks vaguely like this:

  Box -> Value
This might look meaningless, but it provides a clean slate so we can isolate some impurity in the notion of "operation" itself. When a mutable box is unboxed, it may return different values at different times, depending on the most recent value assigned to that box. Other kinds of boxes include Racket parameters, Racket continuation marks, Racket promises, JVM ThreadLocals, and JVM WeakReferences, each with its own impure interface.

When each entry of an environment needs to have its own self-contained impure behavior, I typically model the environment as a table which maps variable names to individual boxes. The table's get operation is pure (or at least impure in a simpler way), and the boxes' get operation is is impure.

You were wondering about equivalences, and I have two in mind: For any environment, (Environment, VariableName) is a box. For any box, we can consider that box to be a degenerate environment where the notion of "variable name" includes only a single name.

-----

1 point by Pauan 4579 days ago | link

"Doesn't step 1 need to use environment(s)?"

I think he's talking about run-time environments a la Kernel, Emacs Lisp, etc.

Nulan and Arc/Nu use a compile-time environment to replace symbols with boxes at compile-time. But that feels quite a bit different in practice from run-time environments (it's faster too).

---

"Are you suggesting a third thing we could insert here?"

Once again, I think he's referring to run-time environments. Basically, what he's saying is that you would use boxes at compile-time (like Nulan), but you would also have first-class environments with vau. The benefit of this system is that it's faster than a naive Kernel implementation ('cause of boxes), but you still have the full dynamicism of first-class run-time environments. I suspect there'll be all kinds of strange interactions and corner cases though.

---

Slightly off-topic, but... I would like to point out that if the run-time environments are immutable hash tables of boxes, you effectively create hyper-static scope, even if everything runs at run-time (no compile-time).

On the other hand, if you create the boxes at compile-time, then you can create hyper-static scope even if the hash table is mutable (the hash table in Nulan is mutable, for instance).

-----

2 points by Pauan 4580 days ago | link

I'm not sure I understand your argument... aside from macros, how often do you use symbols? The only other case I can think of is using symbols as the keys of hash tables:

  (= foo (obj))
  (foo 'bar)
But in Nulan, hash tables use strings as keys, so that's no problem. As for your "really common list manipulation"... my system is actually better for that. Compare:

  (cons a b)                  ; Arc 3.1
  `(,a ,@b)                   ; Arc 3.1

  (list a b c)                ; Arc 3.1
  `(,a ,b ,c)                 ; Arc 3.1

  (join (list a) b (list c))  ; Arc 3.1
  `(,a ,@b ,c)                ; Arc 3.1

  'a ,@b                      # Nulan
  {a @b}                      # Nulan

  'a b c                      # Nulan
  {a b c}                     # Nulan

  'a ,@b c                    # Nulan
  {a @b c}                    # Nulan
---

"(f a b @c) looks so much nicer to me..."

I agree. Nulan supports @ splicing for both pattern matching and creating lists:

  # a is the first argument
  # b is everything except a and c
  # c is the last argument
  -> a @b c ...

  # works for list destructuring too!
  # this function accepts a single argument, which is a list
  # a is the first element of the list
  # b is everything in the list except a and c
  # c is the last element of the list
  -> {a @b c} ...

  # in addition to working for function arguments, it also works for assignment
  # a is 1
  # b is {2 3 4}
  # c is 5
  box {a @b c} = {1 2 3 4 5}

  # and of course you can nest it as much as you like
  box {a {b c {d} @e}} = {1 {2 3 {4} 5 6 7}}

  # equivalent to (join (list a) b (list c)) in Arc
  {a @b c}

  # equivalent to (apply foo bar qux) in Arc
  foo bar @qux
  
  # equivalent to (apply foo (join bar (list qux)))
  foo @bar qux
---

"I.e., if (car (cdr (obj key))) returned a box, you would be able to just call '= on it, with no thought for how you found the value."

In Nulan, you could do what you're talking about, except it would all have to be done at compile-time, because boxes only exist at compile-time. In addition, you wouldn't be able to set the box directly, you would instead generate code that sets the box at run-time, i.e. macros.

Nulan has the restrictions that it does because I wanted to compile to really fast JavaScript. Other languages (like Arc) don't have that restriction, so it should be possible to design a language that has boxes at run-time, in which case your idea should work.

-----

3 points by shader 4580 days ago | link

I actually really like symbols, and the existence of a symbol type in lisp is one of my favorite features. Technically in Nulan a 'symbol' is replaced by a 'box', and if your box had a string property called "name" that held the name of the variable they would probably be interchangeable.

Either way, I agree that your list tools are generally more useful than quote/unquote. The main things I use it for are 1) to get a literal symbol and 2) to get something like list splicing. It looks rather arcane and adds clutter though, so in most other cases I wouldn't use it.

If there was another way to just splice in a list in the middle of the code the way Nulan seems to, I might not feel as attached to it.

-----

2 points by Pauan 4580 days ago | link

"I actually really like symbols, and the existence of a symbol type in lisp is one of my favorite features."

I too like symbols. But I think if you examine why you like symbols, you'll realize that you like them because... most other languages don't have a first-class way to refer to variables. But in Lisp you can, using symbols.

Well, Nulan has both symbols (representing unhygienic variables), and boxes (representing hygienic variables). It's just that hygienic variables are so much better in so many situations that there's not much reason to make it easy to create symbols in Nulan.

---

"Technically in Nulan a 'symbol' is replaced by a 'box', and if your box had a string property called "name" that held the name of the variable they would probably be interchangeable."

Yes boxes have a name property. This is currently only used when printing the box. Yes you could convert from a box to a symbol, but Nulan doesn't do this, because I haven't found a reason to.

-----

1 point by shader 4579 days ago | link

Yes, that is part of the reason I like them so much. The other part is that they're a way to legally use bare words as part of the syntax.

If json had a symbol type, Wat would be a lot less ugly.

-----

1 point by Pauan 4580 days ago | link

Actually, there is ONE situation I've encountered where I would have liked to use symbols... in my playlist program, I have a list of file extensions which are considered "audio". Here's how it looks in Arc:

  '(flac flv mid mkv mp3 mp4 ogg ogm wav webm wma)
In Nulan, that would have to be written like this:

  {"flac" "flv" "mid" "mkv" "mp3" "mp4" "ogg" "ogm" "wav" "webm" "wma"}
Or perhaps like this:

  "flac flv mid mkv mp3 mp4 ogg ogm wav webm wma".split " "
To work around that, I wrote a short and simple "words" macro:

  $mac words -> @args
    '{,@(args.map -> x "@x")}

  words flac flv mid mkv mp3 mp4 ogg ogm wav webm wma

-----

2 points by akkartik 4580 days ago | link

Wart uses the @splice notation. I've sung its praises before: http://arclanguage.org/item?id=17281

-----

3 points by shader 4579 days ago | link

This relates to yet another idea I think I've actually mentioned before.

Namely, making the lists that form the code/ast have reverse links, so you can mutate above the macro call level, instead of just insert arbitrary code in place. This wouldn't be feasible for general lists, as it is possible for a sub-list to be referenced in more than one place, but for code, each piece is generally considered unique even if it looks the same.

Anyway, this would allow for affects ranging from splicing to arithmetic and other much more evil and nefarious but possibly useful effects. I haven't thought through all of the implications, and I bet most of them are negative, but it would still be interesting to consider.

An implementation of intermediate splicing would be something like:

  (mac (list)
    (= (cdr list) (cdr (parent)))
    (= (cdr (parent)) list))
Where you replace (parent) with whatever technique would get the parent cons cell whose car is the macro call.

-----

3 points by rocketnia 4577 days ago | link

"An implementation of intermediate splicing would be something like[...]"

Which of these interpretations do you mean?

  (list 1 2 (splice 3 4) 5)
  -->
  (list 1 2 3 4 5)
  
  
  (list 1 2 (splice (reverse (list 4 3))) 5)
  -->
  (list 1 2 3 4 5)
I wrote the rest of this post thinking you were talking about the first one, but right at the end I realized I wasn't so sure. :)

---

"Namely, making the lists that form the code/ast have reverse links, so you can mutate above the macro call level, instead of just insert arbitrary code in place."

I'll make an observation so you can see if it agrees with what you're thinking of: The expression "above the macro call level" will always be a function call or a special form, never a macro call. If it were a macro call, we'd be expanding that call instead of this one.

---

For these purposes, it would be fun to have a cons-cell-like data structure with three accessors: (car x), (cdr x), and (parent x). The parent of x is the most recent cons-with-parent to have been constructed or mutated to have x as its car. If this construction or mutation has never happened, the parent is nil.

Then we can have macros take cons-with-parent values as their argument lists, and your macro would look like this:

  (mac splice list
    (= (cdr list) (cdr (parent list)))
    (= (cdr (parent list)) list))
Unfortunately, if we call (list 1 2 (splice 3 4) 5), then when the splice macro calls (parent list), it'll only see ((splice 3 4) 5). If it calls (parent (parent list)), it'll see nil.

---

Suppose we have a more comprehensive alternative that lets us manipulate the entire surrounding expression. I'll formulate it without the need to use conses-with-parents or mutation:

  ; We're defining a macro called "splice".
  ; The original code we're replacing is expr.
  ; We affect 1 level of code, and our macro call is at location (i).
  (mac-deep splice expr (i)
    (let (before ((_ . args) . after)) (cut expr i)
      (join before args after)))
If I were to implement an Arc-like language that supported this, it would have some amusingly disappointing consequences:

  (mac-deep subquote expr (i)
    `(quote ,expr))
  
  
  (list (subquote))
  -->
  (quote (list (subquote)))
  
  
  (do (subquote))
  -->
  ((fn () (subquote)))
  -->
  ((quote (fn () (subquote))))
  
  
  (fn (subquote) (+ subquote subquote))
  -->
  (fn (subquote) (+ subquote subquote))

-----

2 points by Pauan 4585 days ago | link | parent | on: Regular expressions in Arc

I would like to point out that although it's a Pratt parser, it's been specifically modified to work better with Lisps, meaning it operates on lists of symbols rather than on a single expression. I have not seen another parser like it.

This makes it much more powerful while also being much easier to use. Using the Nulan parser, adding in new syntax is as easy as writing a macro!

For instance, in Nulan, the "->" syntax is used for functions:

  foo 1 2 3 -> a b c
    a + b + c
The above is equivalent to this Arc code:

  (foo 1 2 3 (fn (a b c)
    (+ a b c)))
And here's how you can implement the "->" syntax in Nulan:

  $syntax-rule "->" [
    priority 10
    order "right"
    parse -> l s {@args body}
      ',@l (s args body)
  ]
As you can see, it's very short, though it might seem cryptic if you don't understand Nulan. Translating it into Arc, it might look like this:

  (syntax-rule "->"
    priority 10
    order "right"
    parse (fn (l s r)
            (with (args (cut r 0 -1)
                   body (last r))
              `(,@l (,s ,args ,body)))))
The way that it works is, the parser starts with a flat list of symbols. It then traverses the list looking for symbols that have a "parse" function.

It then calls the "parse" function with three arguments: everything to the left of the symbol, the symbol, and everything to the right of the symbol. It then continues parsing with the list that the function returns.

So basically the parser is all about manipulating a list of symbols, which is uh... pretty much exactly what macros do.

Going back to the first example, the "parse" function for the "->" syntax would be called with these three arguments:

  1  (foo 1 2 3)
  2  ->
  3  (a b c (a + b + c))
It then destructures the 3rd argument so that everything but the last element is in the variable "args", and the last element is in the variable "body":

  args  (a b c)
  body  (a + b + c)
It then generates the list using "quote", which is then returned.

Basically, it transforms this:

  foo 1 2 3 -> a b c (a + b + c)
Into this:

  foo 1 2 3 (-> (a b c) (a + b + c))
As another example, this implements Arc's ":" ssyntax, but at the parser level:

  $syntax-rule ":" [
    priority 100
    order "right"
    delimiter %t
    parse -> l _ r
      ',@l r
  ]
So now this code here:

  foo:bar:qux 1 2 3
Will get transformed into this code here:

  foo (bar (qux 1 2 3))
I've never seen another syntax system that's as easy and as powerful as this.

Oh yeah, and there's also two convenience macros:

  $syntax-unary "foo" 20
  $syntax-infix "bar" 10
The above defines "foo" to be a unary operator with priority 20, and "bar" to be an infix operator with priority 10.

Which basically means that...

  1 2 foo 3 4  =>  1 2 (foo 3) 4
  1 2 bar 3 4  =>  1 (bar 2 3) 4
Here's a more in-depth explanation of the parser:

https://github.com/Pauan/nulan/blob/780a8f46cb4ff90e849c03ea...

Nulan's parser is powerful enough that almost all of Nulan's syntax can be written in Nulan itself. The only thing that can't be is significant whitespace.

Even the string syntax (using "), whitespace ( ), and the various braces ({[]}) can be changed from within Nulan.

-----

2 points by Pauan 4585 days ago | link | parent | on: Regular expressions in Arc

I've used that library, and I think it's really awesome. It feels very Arc-like while still supporting regexp stuff.

-----

3 points by Pauan 4585 days ago | link | parent | on: Arc DB

Because it's the simplest solution for Arc. Arc does not have any kind of SQL library, and it doesn't even have an officially supported way to drop down to Racket libraries, nor does it have any kind of FFI whatsoever. So flat files are indeed the fastest way to get up and running.

Perhaps in the long run it might be better to have some sort of way to connect to an SQL DB, but keep in mind that Arc is still a work in progress, and unfortunately has not been updated in a long time. For a while, Arc didn't even have Unicode support, because pg was working on more important things.

If you think of Arc as being a prototype, then it all makes sense. pg probably intended to eventually flesh it out into a full language, but I hear he hasn't had the time.

-----

3 points by shader 4585 days ago | link

If arc had support for something like DBA or SQLAlchemy built in, I might just have used it with either postgres or sqlite. However, neither of those databases really fit the arc data model very well, imo, because arc is very hash table and list oriented. Objects have very little in the way of a set schema, and hash tables map pretty well to... hash tables.

Anyway, I mostly want to leave all the objects in memory and use direct references between them; my data relations aren't that complicated, and explicit relations where necessary are actually fairly efficient. In fact, that's what most orm's a la SQLAlchemy seem to do; whenever an object is loaded, you can specify desired relations that also get loaded in memory, so you don't have to explicitly query the database each time.

Memory is cheap these days, and I was hoping for something that allowed versioning and perhaps graph-db features.

-----

2 points by akkartik 4585 days ago | link

Hmm, do you care about threading and consistency at all? If not, you could probably do everything with just arc macros over the existing flat file approach..

-----

3 points by shader 4583 days ago | link

I think that some form of scalability would be valuable, but that could easily be achieved with some sort of single threaded worker for each db 'server', and then have multiple instances running to provide the scalability. In order to make the single threaded semantics work well even in a multi-threaded application, I already have a short library for erlang-style pattern matched message passing.

Given the data volumes I've been planning on working with, I mostly want to use the permanent storage for history and fault tolerance, as opposed to live access. That could probably be handled in-memory for the most part. So maybe some form of flat file system would work without causing too many problems.

I originally started using git to effectively achieve that design without having to manage the trees, history, and diff calculation myself, but I've discovered that storing thousands of tiny objects in the git index may not be very efficient. I still think something similar is a good idea, but I would want to separate 'local' version states for each object from the 'global' version, so that it doesn't take forever to save the state of a single object. Maybe storing each object in a git 'branch' with the guid of the object as the branch name would work, since only one object would be in each index. The overhead for saving each object would be slightly higher, but it should be constant, rather than linear with the total number of objects.

Any obvious flaws with that idea that I'm missing? Have any better ideas or foundations to build off of?

-----

1 point by akkartik 4583 days ago | link

Building atop git is an interesting idea, and you clearly have more experience with it. Do you have any pointers to code?

-----

3 points by shader 4583 days ago | link

Here's the code I had written before, using the shell git interface to interact with the repo: https://github.com/shader/metagame/blob/master/git-db.arc

That code is pretty rudimentary, but allows low level access to git commands from arc, plus storage and retrieval of arc objects. After my previous comment though, I'll probably change it so that each object gets a separate branch, with 'meta branches' listing which branches to load if necessary.

-----

1 point by akkartik 4585 days ago | link

Let's build this for the LISP contest! http://arclanguage.org/item?id=17640

-----

4 points by dido 4585 days ago | link

So I guess that makes a consistent foreign function interface something very important for Arcueid to start having then. I think I've built up a foreign function API (sorta) and am now working out the details for dynamic loading of shared libraries so you can do something like (load "mysql.so") and have it dynamically link into Arcueid, as well as a way to compile C sources into such an extension shared object.

-----


"[...] though how it could do that without also transforming into Lisp is another matter."

My experiments with Nulan have convinced me that it's easier (and better) to start with a Lisp and then add Ruby syntax which desugars to S-expressions, rather than trying to turn Ruby into a Lisp.

-----

2 points by rocketnia 4598 days ago | link

I assume by "s-expressions" we mean nested lists whose first elements are often symbols which represent operators. Apparently, Ruby already has built-in support for that code representation: http://stackoverflow.com/questions/6894763/extract-the-ast-f...

-----

1 point by Pauan 4598 days ago | link

Well then, using that it'd be possible to make Ruby macros.

-----

2 points by rocketnia 4598 days ago | link

Hmm, even if the package for doing this comes with Ruby MRI, I'm not sure there's a corresponding out-of-the-box way to take these s-expressions and get running code. But here's a five-year-old article talking about Ruby2Ruby, a library that converts these s-expressions to strings: http://www.igvita.com/2008/12/11/ruby-ast-for-fun-and-profit...

I don't think a third-party tool should count as language support, unless (and inasmuch as) the people maintaining the language regularly go out of their way to avoid breaking that tool.[1][2] This is where I draw the line, because of how it affects backwards compatibility.

Everyday language users tend to think of language updates as backwards-compatible as long as their existing code has the same behavior. But tools like eval(), JavaScript minifiers, and Ruby2Ruby place the bar higher, since their input is arbitrary language code, and it might be written in a newer language version than the tool itself. Incidentally, even a program that calls these tools will continue to work after a language update, as long as the input of these calls is all generated programmatically, rather than directly taken from the program's input.

[1] Well, I guess the term "third-party" might not apply in that case!

[2] I have no idea how much I'm actually talking about Ruby2Ruby, and this Ruby s-expression format in general, when I make this comment.

-----


If I had to give a reason for making Arc/Nu... well, that's really complicated and involves a lot of history and backstory.

Basically, before Arc/Nu, there was ar, which was created by awwx. It was a rewrite of Arc in Racket, where the compiler was intentionally exposed to Arc. This meant you could not only access the compiler from Arc, but could actually change it!

So rather than writing the compiler in Racket, you'd write the minimum necessary to get things going, then write the rest of the compiler in Arc. This naturally made the compiler very easy to hack for Arc users.

I really liked that, but I also disliked certain things in ar. One of those is that all the compiler stuff is prefixed with "racket-", which is really verbose.

Another dissatisfaction is that although awwx did add in a module system for Arc, it was very... uh... basically you could create multiple Racket namespaces, which are completely separate from each other, and load Arc into each one.

The problem is that because they're separate, it's really difficult for them to communicate with each other. Which kinda defeats the point.

Another problem is that you had to reload Arc every time you created a new namespace, which meant ~1-2 second pauses every single time. Which is pretty ridiculous.

I felt I could do better, so I made a fork of ar and hacked away on it. But having to use the "racket-" prefix really bothered me, so I experimented with ways to get rid of it.

Eventually I decided to start over with pg's vanilla Arc 3.1, rather than ar. I fixed it so it would work in Racket rather than mzscheme, and then I did significant refactoring, making things faster, using new techniques I had discovered, etc.

I also added in a module system which was exactly the same as ar's module system, except that you could make a module that inherits from another module. This solved the "have to reload Arc all the time" problem, and also made it a bit easier to communicate between modules.

The end result was Arc/Nu. It had all the benefits of ar, all the benefits of Arc 3.1, and none of the downsides of either one.

In the end, though, Arc/Nu's module system was terrible. Yes, you could now make modules that inherit from other modules, but I kept running into all kinds of problems, both theoretical and practical. But it was the best module system I knew of.

I wrote some programs in Arc/Nu, and after many months passed, I discovered hyper-static scope.

Before I continue with this story, I'll need to explain Nulan. Nulan is a Lisp I created from scratch. It has hyper-static scope, hygienic macros, and it compiles to JavaScript and can thus be used in web browsers.

I decided to make Nulan because I wanted to experiment with cool new things like hyper-static scope, but I couldn't do that in Arc/Nu because that would break backwards compatibility with Arc 3.1. I also wanted a language for web browsers that would be better than JavaScript.

While implementing hyper-static scope in Nulan, I discovered how amazing boxes are. Seriously, they're great. Using them, you can create a module system which is easily thousands of times better than any other module system I had ever seen.

Dido posted a topic about a module system he was designing for Arc, and I replied with this: http://arclanguage.org/item?id=17408

After that, I realized that I had to change Arc/Nu to use boxes. And so I did. The module system in Arc/Nu is now amazing, and no longer based upon Racket namespaces.

As for whether I'm still using Arc/Nu, the answer is yes. I like using it for writing shell scripts, since I find Arc vastly better at that task than Python or Bash.

-----

3 points by lark 4593 days ago | link

It would be great if one could easily write in Arc as opposed to Javascript. Nulan and RainbowJS are in a position to infiltrate web-based programming through the browser.

-----

1 point by kinleyd 4599 days ago | link

Thanks Pauan, and more power to you too!

-----

1 point by Pauan 4605 days ago | link | parent | on: Install hackernews Problem

Please post the file permissions for arc/admins:

  ls -l -d arc/admins

-----

1 point by cassano 4605 days ago | link

Did not have the file arc/admins No such file or directory

I am a tiro for linux and my english is also bad I am chinese:)

-----

1 point by Pauan 4604 days ago | link

Try this:

  mkdir arc
  echo "yourname" > arc/admins
You shouldn't be running Arc as root. Using sudo means you're running as root, so leave off the sudo.

-----

2 points by cassano 4604 days ago | link

Tks a lot ! it's ok thk U very much ! Welcome to Beijing ! but the air is not good

-----

2 points by Pauan 4606 days ago | link | parent | on: How do you http redirect in Arc?

The reason it does that is so that things like [+ _ 1] work. Under your suggestion, you would have to write [(+ _ 1)] which looks ugly.

Rather than using [(prn ...)] you can just use [prn ...]

---

Incidentally, using Arc/Nu, it's easy to make the square brackets behave the way you want them to:

  (mac square-brackets args
    `(fn (_) (do ,@args)))
I believe Anarki has something similar as well.

-----

3 points by Pauan 4606 days ago | link | parent | on: New version of Arc/Nu

New version:

https://github.com/Pauan/ar/commit/30aa79bbd1082bf182e07d86f...

Now files only export the boxes they create. Which means...

  ; foo
  (def foo ())

  ; bar
  (import foo)
  (def bar ())

  ; qux
  (import bar)
  (prn bar) ; works
  (prn foo) ; error
Basically, "qux" imported "bar" which imported "foo", but "bar" only exported the boxes it defined, thus it didn't export "foo".

This is not only cleaner (and faster), but it opens up the possibility of having two separate Arc versions operating side by side, in the same namespace, able to communicate with each other.

That's important because I really want to have Arc/Nu use hygienic macros and hyper-static scope by default, but that would break Arc 3.1 compatibility. But now I can have Arc/Nu and Arc 3.1 operating at the same time. This was possible before, but was very difficult. Now it's easy.

-----

2 points by Pauan 4606 days ago | link

https://github.com/Pauan/ar/commit/d3971088cdf3160124d9affa7...

Arc/Nu now uses hyper-static scope and hygienic macros by default. This means I had to change "02 arc" in the following ways:

* Rearranged things so that they are always defined before they are used.

* Fixed some code duplication (e.g. alist now uses acons).

* Fixed some places where macros were supposed to use gensyms but didn't.

* Fixed unhygienic macros (like aif) so they work in the new hygienic macro system.

* The various defining macros (def, mac, etc.) all use "var" now rather than "assign".

* The "defs" macro has been changed a bit and now has an actual use.

* Added in a few new macros, like redef, remac, w/sym.

For the most part, though, I tried to change it as little as possible.

Thanks to this, I also found a few typos in the Arc/Nu libraries.

By the way, I made sure to change it in such a way that it works correctly whether hyper-static/hygienic-macros is turned on or off.

And of course, the Arc/Nu compiler can still load Arc 3.1's "arc.arc" unmodified, if you want to do that.

-----

2 points by Pauan 4606 days ago | link

By the way, if you want to write macros that work correctly in both Arc 3.1 and Arc/Nu, here's how.

Non-anaphoric macros should Just Work(tm).

Anaphoric macros (aif, awhen, etc.) need to be tweaked a bit. First, let's look at the definition in Arc 3.1:

  (mac awhen (expr . body)
    `(let it ,expr (if it (do ,@body))))
Now, there's two ways to make this work in Arc/Nu. First, you can unquote+quote:

  (mac awhen (expr . body)
    `(let ,'it ,expr (if ,'it (do ,@body))))
Secondly, you can use "w/sym":

  (mac awhen (expr . body)
    (w/sym it
      `(let it ,expr (if it (do ,@body)))))
I personally recommend "w/sym", since you can just slap it onto the front and be done, but it's up to you.

And that's it. Now your macros work in both unhygienic Arc 3.1 and hygienic Arc/Nu.

-----

2 points by Pauan 4605 days ago | link

After some thought, I decided to stop half-assing it. What I've been trying to do is make a new implementation of Arc that is backwards compatible with Arc 3.1 but adds new features and fixes bugs.

The problem is that eventually I'll run into features I want to add, or bugs I want to fix, and I won't be able to do that without breaking compat with Arc 3.1.

This is very restricting, and part of Arc is about unbridled freedom. I thought about going the pg route and just saying "screw compat!" but if I did that I'd just end up with Nulan, so what's the point?

So instead, here's what I'm gonna do. There will be two separate languages implemented by the Arc/Nu compiler: Arc/Nu and Arc/3.1.

Arc/3.1 is the Arc you know and love. Unhygienic macros, dynamically scoped globals, and most of the warts and bugs and missing features.

Arc/Nu is a language similar to Arc, but with new features, bug fixes, and general cleanup. It has hyper-static scope and hygienic macros. It doesn't try to maintain backwards compat with Arc 3.1.

Now, you might be wondering, what's the point? I mean, we already have vanilla Arc 3.1 as implemented by pg, so why should Arc/Nu support Arc 3.1 at all?

The twist is that thanks to the magical power of booooxes, I can have Arc/3.1 and Arc/Nu running in the same namespace. This means Arc/3.1 can import Arc/Nu stuff, and Arc/Nu can import Arc/3.1 stuff.

And it all works correctly, so that if you define an unhygienic macro in Arc/3.1 and import it into Arc/Nu, it will behave unhygienically. And vice versa, if you define a hygienic macro in Arc/Nu and import it into Arc/3.1, it will behave hygienically.

And you can do things like import an Arc/3.1 library and mutate its stuff, or do the same to an Arc/Nu library. There's no restrictions between the two, because it all runs in a single namespace.

This means you can write new code with all the shiny features, yet still use old Arc/3.1 libraries. And if you just plain like using Arc/3.1 but there's some Arc/Nu library you want to use? You can.

And if pg ever releases Arc 4, I can just add it as a third language, which will let it interact with both Arc/Nu and Arc/3.1. This is all very very easy to do thanks to boxes.

-----

2 points by Pauan 4603 days ago | link

https://github.com/Pauan/ar/commit/2c7c5f8f0ea2f313a47fadb2a...

After some major refactoring, I reverted it back to use vanilla Arc 3.1.

Now, Arc/Nu has a rudimentary "multiple language" feature. When using the "arc" executable, you can now specify the "--lang" property.

For instance, when using "arc --lang foo", it will look for a folder called "foo" in the "lang" subdirectory. This folder should contain a file called "main" which should be written in Racket. Using this file, you can implement custom languages using the Arc/Nu compiler.

Currently, only two languages are supported: "arc/nu" and "arc/3.1"

https://github.com/Pauan/ar/tree/2c7c5f8f0ea2f313a47fadb2ae6...

https://github.com/Pauan/ar/tree/2c7c5f8f0ea2f313a47fadb2ae6...

As you can see, "nu.nu" is quite sparse at the moment, but it does load correctly if you use "arc --lang arc/nu". With some more refactoring, I'll be able to make it so that arc/nu and arc/3.1 can use libraries written in the other language.

-----

1 point by Pauan 4603 days ago | link

https://github.com/Pauan/ar/commit/6352cbffc8866fac7b19ed3b3...

Now multiple languages are fully supported. The way it works is that, within a file, you can use "w/lang" to temporarily change the language. For instance, suppose you had the following "arc/3.1" program:

  (w/lang arc/nu
    (var foo 1))

  (= bar (+ foo 2))
Within the "w/lang" block, it's using "arc/nu", but outside, it's using "arc/3.1"! Using this, it's easy to import libraries written in "arc/nu":

  (w/lang arc/nu
    (import foo))
And vice versa, if you're writing a program in "arc/nu", you can use "w/lang" to import things from "arc/3.1":

  (w/lang arc/3.1
    (import foo))
By the way, because all of this is using boxes at compile-time, there's zero runtime cost. The only downside is that if you use two languages at once, it has to load both of them, which increases the startup time.

-----

4 points by Pauan 4603 days ago | link

Well! I just learned something new! Racket's "procedure-rename" is ridiculously slow. In case you don't know what I'm talking about, it's just a function that lets you rename functions:

  (procedure-rename (lambda () 1) 'foo)
Anyways, here's the timing tests I did for Arc/Nu using "procedure-rename":

  > (+ 1 2)
  Arc/Nu   iter: 45,419,082   gc:   0   diff:  0.00%
  Arc 3.1  iter: 52,867,613   gc: 188   diff: 16.40%

  > (no ())
  Arc/Nu   iter: 26,594,507   gc:   0   diff:   0.00%
  Arc 3.1  iter: 54,532,505   gc: 152   diff: 105.05%
And here's the results when I removed "procedure-rename":

  > (+ 1 2)
  Arc/Nu   iter: 88,933,497   gc:   0   diff: 71.88%
  Arc 3.1  iter: 51,741,236   gc: 116   diff:  0.00%

  > (no ())
  Arc/Nu   iter: 78,026,418   gc:   0   diff: 37.11%
  Arc 3.1  iter: 56,909,869   gc: 156   diff:  0.00%
What a ginormous difference. Removing it doubled the speed of Arc/Nu! Given that the sole purpose of "procedure-rename" is to, well, rename functions, I wouldn't expect it to have such a huge runtime performance penalty, but apparently it does...

-----

3 points by lark 4602 days ago | link

I sometimes wonder if garbage collection in dynamic languages is necessary.

-----

2 points by rocketnia 4602 days ago | link

Sounds like a good topic, and I'd be interested in hearing to hear your thoughts on this. I've been thinking about the same kind of thing, but I'm looking to use it in a weakly typed subset of a statically typed language. My motivation is mostly to see if we can make it convenient to externally manage the memory needs of otherwise encapsulated programs.

This is probably a discussion for another thread. ^_^;

-----

1 point by akkartik 4602 days ago | link

My arc variant uses ref-counting, for what it's worth: https://github.com/akkartik/wart/blob/adf058706b/010memory

-----

1 point by rocketnia 4603 days ago | link

I wonder if the compiler can't see through a 'procedure-rename call to realize it can still apply optimizations to the code. Like, maybe it can optimize ((lambda (x) ...) 2) but not ((procedure-rename (lambda (x) ...) 'foo) 2).

-----

2 points by lark 4604 days ago | link

Does Arc/Nu support file uploads?

-----

2 points by Pauan 4604 days ago | link

Are you talking about this? http://www.arclanguage.com/item?id=16355

Arc/Nu runs Arc 3.1 unmodified, so it behaves almost identically to vanilla Arc 3.1.

It would probably be possible to add it in, but I don't use the server portion of Arc, so that's out of my league. Patches welcome.

-----

1 point by lark 4603 days ago | link

Yes, that's what I'm talking about.

I use the server portion of Arc to write web-based applications, and would like to upload files.

Anarki hadn't worked very well or very fast when uploading. I might just bite the bullet and use web.py alongside Arc, and have Python calling Arc after a file has been saved. I might also resort to using Anarki, although breaking backward compatibility in a few ways can become a bit of a concern.

I wish there were a language that offered what I need.

-----

3 points by Pauan 4607 days ago | link | parent | on: Install hackernews Problem

Looks like you installed Arc as root, and then tried to use it when not as root. I would suggest installing and running it as a normal user.

-----

1 point by cassano 4607 days ago | link

gtalk?

-----

1 point by akkartik 4606 days ago | link

Send me your id (my email is in my profile).

-----

More