Arc Forumnew | comments | leaders | submitlogin
Buckets vs. Alists?
3 points by evanrmurphy 5083 days ago | 5 comments
I was reading The Art of the Interpreter [1] and was surprised by the implementation of symbol tables/environments:

A symbol table is represented as a list of buckets; each bucket is a list whose car is a list of names and whose cdr is a list of corresponding values. {Note This ain't A-lists} (page 11)

I had never heard of buckets before, though I understand their structure now [2]. What I don't understand is the tradeoffs: why would you want to use a bucket over an alist? I can't see any performance / algorithmic difference yet except exchanging some cars for cdrs, but I'm probably missing something.

Update: Ah, I didn't realize the "{Note This ain't A-lists}" was an endnote. Still reading it, but their choice seems to be about both aesthetics and efficiency.

To summarize, they say the routines they use to act on buckets (bind and lookup) have nicer names than the ones Lisp 1.5 uses to act on alists (pairlis and assoc). More importantly, "that the number of conses use to bind a given number of variables is usually smaller."

---

[1] http://www.scribd.com/doc/2416804/Lisp-Art-of-the-interpreter-sussman

[2] A simple example in case anyone else was unfamiliar with buckets. First, an alist that maps some symbols to values:

  (= al '((a 1) (b 2) (c 3)))
Or, if you subscribe to http://arclanguage.org/item?id=13387:

  (= al '((a . 1) (b . 2) (c . 3)))
In either case, the corresponding bucket would be:

  (= bucket '((a b c) . (1 2 3)))
Alternatively written without dot notation as:

  (= bucket '((a b c) 1 2 3))


2 points by rocketnia 5081 days ago | link

Hmm, an interesting feature of a bucket is that there's a definite root node you can mutate, whereas in order to mutate something onto the front of an alist, you need to handle the alist by way of another box: (nil (a . 1) (b . 2) (c . 3)). Since buckets don't need to be wrapped in boxes, they provide less of a separation of concepts, which could be good or bad. (Edit: Oh, I'm duplicating zck's comment a little here.)

On the same note, two buckets can't share a tail. However, they can share key tails and value tails individually.

Another advantage of buckets is that they require no dots to write, as you've shown.

Just giving some random thoughts. ^_^

-----

2 points by evanrmurphy 5081 days ago | link

> Another advantage of buckets is that they require no dots to write, as you've shown.

Exactly, I like this one a lot. Fewer conses / greater efficiency whilst being more palatable to read and write.

-----

1 point by rocketnia 5081 days ago | link

Well, there are definitely fewer conses in ((a b) 1 2) than in ((a 1) (b 2)), but ((a . 1) (b . 2)) has them both beat. (Maybe I'm missing something.)

As for being palatable to read and write, I actually disagree, even with the decrease in punctuation.

When writing a literal bucket, to add or remove a key-value pair, I'd have to edit in two places. Besides that, I'd encounter horizontal layout annoyances: I'd have to word-wrap the keys and the values in the same way in order to see the bindings clearly. And every time I added or removed a binding, I'd have to rewrap.

Reading is a bit more of a wash. When reading a bucket in debug output, it's harder to look up specific bindings but a lot easier to see what keys exist. I think I do those two things in about equal proportions (when I'm working on Penknife, at least).

-----

1 point by evanrmurphy 5081 days ago | link

> Well, there are definitely fewer conses in ((a b) 1 2) than in ((a 1) (b 2)), but ((a . 1) (b . 2)) has them both beat. (Maybe I'm missing something.)

I resorted to cons counting to confirm it for my own simple mind and... you're correct! ^_^

  ; bucket (5 conses)
  ((a . (b . nil))  . (1 . (2 . nil)) )

  ; "frugal" alist (4 conses)
  ((a . b) . ((b . 2) . nil))
Note, however, that it's a constant difference of 1 cons. The difference doesn't grow with the size of the data, so it's rather insignificant.

Sorry I don't have time to respond to the rest of your comment right now.

-----

2 points by zck 5082 days ago | link

Their names being better isn't really a good reason -- you could make bind and lookup work on alists too.

They have a good argument about the state being able to be shared -- that does make things quicker. I'm not sure how much, but being as that it's Steele and Sussman, I'm sure they did their homework.

-----