Arc Forum | I've been thinking about this some more, and it seems to me that boxes in the wa...

Arc Forum

3 points by shader 4433 days ago | link | parent

I've been thinking about this some more, and it seems to me that boxes in the way Nulan uses them make for a completely different evaluation paradigm than the traditional metacircular evaluation.

In the traditional evaluation scheme, using environments, symbols, read, eval, and apply, the process is as follows:

  1) read turns text into code (lists of symbols and values)
  2) eval looks up a symbol in the current environment, or expands a macro, and calls apply to get the value of a function
  3) apply calls eval to get the value of the arguments, and then returns the value of the function

In a compile-time based box scenario, the environments go away and read/eval change to something like this:

  1) parse the new code into lists of appropriately boxed values, and expand macros
  2) recursively apply functions to values to get the final result.

More of a traditional compile/run separation. The main difference to note in the box formulation is that there is no such thing as an environment. The 'hyperstatic scope' that Pauan keeps mentioning is another way of saying, 'there is no scope', only variables.

Thus, there is no real way to make variations on the scoping scheme using hyperstatic scope, because that's all a read/compile time feature. Dynamic scope is impossible without tracking environments, which are otherwise unnecessary.

Now, if one were to use both first-class environments and boxes, one can theoretically have the best of both worlds, with only a little extra overhead, and most of that at compile time. Flexible scoping becomes possible by specifically referencing the variable in the environment instead of using the default boxed values. They would still be boxes, just referenced by symbol in the environment table.

Now what I'm wondering is whether there is any useful equivalence to be found between a box and an environment. Heck, just make a box an object that has a few hard coded slots (that can be accessed by array offset) and a hash table for the rest, and voila! It can replace cons cells and regular hash tables too :P All we need is a reason for making everything use the same core structure...

2 points by Pauan 4433 days ago | link

Yes, that's intentional. Because I wanted to compile to fast JavaScript, I chose an evaluation model that has a strict separation between compile-time and run-time.

---

"The 'hyperstatic scope' that Pauan keeps mentioning is another way of saying, 'there is no scope', only variables."

Actually, there is still scope. After all, functions still create a new scope. It's more correct to say that, with hyper-static scope, every time you create a variable, it creates a new scope:

  box foo = 1     # set foo to 1
  def bar -> foo  # a function that returns foo
  box foo = 2     # set foo to 2
  bar;            # call the function

In Arc, the call to "bar" would return 2. In Nulan, it returns 1. The reason is because the function "bar" is still referring to the old variable "foo". The new variable "foo" shadowed the old variable, rather than replacing it like it would in Arc. That's what hyper-static scope means.

From the compiler's perspective, every time you call "box", it creates a new box. Thus, even though both variables are called "foo", they're separate boxes. In Arc, the two variables "foo" would be the same box.

---

"Thus, there is no real way to make variations on the scoping scheme using hyperstatic scope, because that's all a read/compile time feature. Dynamic scope is impossible without tracking environments, which are otherwise unnecessary."

If by "dynamic scope" you mean like Arc where globals are overwritten, then that's really easy to do. As an example of that, check out Arc/Nu, which also uses boxes:

https://github.com/Pauan/ar

In the Arc/Nu compiler, it literally only takes a single line of code to switch between Arc's dynamic scope and hyper-static scope.

If by "dynamic scope" you mean like Emacs Lisp, then... actually that should be really easy as well. You would just replace the same symbol with the same box, and then use box mutation at run-time. Of course, at that point I don't think there'd be any benefit over run-time environments... In any case, I like lexical scope, and boxes work well for lexical scope.

Also, although Nulan has hyper-static scope, it does have dynamic variables. The way that it works in Nulan is, you create a box like normal...

  box foo = 1

...and then you can dynamically change that variable within a code block:

  w/box! foo = 2
    ...

Within the "w/box!" block, the variable "foo" is 2, but outside, it is 1. You can do this with any variable in Nulan.

---

"Now what I'm wondering is whether there is any useful equivalence to be found between a box and an environment."

No. And that's a good thing. I've found one of the major benefits of boxes is that they're a single stand-alone entity: each variable corresponds to a single box. In the environment model, you have a hash table which maps symbols to values, so you have a single data structure representing many variables.

The reason I prefer each variable being represented by a separate box is that it makes namespaces really really easy to design and implement. For instance, in Arc/Nu you can selectively import only certain variables:

  (w/include (foo bar qux)
    (import some-file))

This is really easy with boxes: you simply grab the "foo", "bar", and "qux" boxes and import them. But with environments, it's a lot harder, because the environment for the file "some-file" may contain all kinds of variables that you don't want to import.

You could take the values from the environment and copy them into the current environment, but now any changes made won't show up. With boxes, the changes show up. I don't think environments work well for namespace systems. But boxes work wonderfully well.

---

"It can replace cons cells and regular hash tables too :P"

I've thought about ways to replace hash tables with boxes, but I don't think it'd be especially helpful. Hash tables serve a different purpose from boxes, so it makes sense for a language to have both data structures. If you want a box, just use a box. If you want a hash table, just use a hash table.

Interestingly enough, both Nulan and Arc/Nu have a giant hash table at compile-time that maps symbols to boxes. This is basically the compile-time equivalent of the run-time environments that languages like Kernel have. Creating a new scope or changing the existing scope is as easy as changing this hash table, which makes it really easy to play around with different namespace systems.

-----

2 points by rocketnia 4433 days ago | link

"The main difference to note in the box formulation is that there is no such thing as an environment."

Doesn't step 1 need to use environment(s)? How else would it turn text into a structure that contains references to previously defined values?

---

"The 'hyperstatic scope' that Pauan keeps mentioning is another way of saying, 'there is no scope', only variables."

The hyper-static global environment (http://c2.com/cgi/wiki?HyperStaticGlobalEnvironment) is essentially a chain of local scopes, each one starting at a variable declaration and continuing for the rest of the commands in the program (or just the file). I think the statement "there is no scope" does a better job of describing languages like Arc, where all global variable references using the same name refer to the same variable.

---

"Flexible scoping becomes possible by specifically referencing the variable in the environment instead of using the default boxed values. They would still be boxes, just referenced by symbol in the environment table."

I don't understand. If we're generating an s-expression and we insert a symbol, that's because we want that symbol to be looked up in the evaluation environment. If we insert a box, that's because we want to look up the box's element during evaluation. Are you suggesting a third thing we could insert here?

Perhaps if the goal is to make this as dynamic as possible, the inserted value should be an object that takes the evaluation environment as a parameter, so that it can do either of the other behaviors as special cases. I did something as generalized as this during Penknife's compilation phase (involving the compilation environment), and this kind of environment-passing is also used by Kernel-like fexprs (vau-calculus) and Christiansen grammars.

---

"Now what I'm wondering is whether there is any useful equivalence to be found between a box and an environment."

I would say yes, but I don't think this is going to be as profound as you're expecting, and it depends on what we mean by this terminology.

I call something an environment when it's commonly used with operations that look vaguely like this:

  String -> VariableName
  (Environment, VariableName) -> Value
  (Environment, Ast) -> Program

Meanwhile, I call something a box when it's primarily used with an operation that looks vaguely like this:

  Box -> Value

This might look meaningless, but it provides a clean slate so we can isolate some impurity in the notion of "operation" itself. When a mutable box is unboxed, it may return different values at different times, depending on the most recent value assigned to that box. Other kinds of boxes include Racket parameters, Racket continuation marks, Racket promises, JVM ThreadLocals, and JVM WeakReferences, each with its own impure interface.

When each entry of an environment needs to have its own self-contained impure behavior, I typically model the environment as a table which maps variable names to individual boxes. The table's get operation is pure (or at least impure in a simpler way), and the boxes' get operation is is impure.

You were wondering about equivalences, and I have two in mind: For any environment, (Environment, VariableName) is a box. For any box, we can consider that box to be a degenerate environment where the notion of "variable name" includes only a single name.

-----

1 point by Pauan 4433 days ago | link

"Doesn't step 1 need to use environment(s)?"

I think he's talking about run-time environments a la Kernel, Emacs Lisp, etc.

Nulan and Arc/Nu use a compile-time environment to replace symbols with boxes at compile-time. But that feels quite a bit different in practice from run-time environments (it's faster too).

---

"Are you suggesting a third thing we could insert here?"

Once again, I think he's referring to run-time environments. Basically, what he's saying is that you would use boxes at compile-time (like Nulan), but you would also have first-class environments with vau. The benefit of this system is that it's faster than a naive Kernel implementation ('cause of boxes), but you still have the full dynamicism of first-class run-time environments. I suspect there'll be all kinds of strange interactions and corner cases though.

---

Slightly off-topic, but... I would like to point out that if the run-time environments are immutable hash tables of boxes, you effectively create hyper-static scope, even if everything runs at run-time (no compile-time).

On the other hand, if you create the boxes at compile-time, then you can create hyper-static scope even if the hash table is mutable (the hash table in Nulan is mutable, for instance).

-----