Arc Forum
8 points by joesurfer 6172 days ago | link | parent

One way to provide vectors without bloating the language is to have the hash table implementation check if the hash table is being used as an array, and then switch to an array implementation.

Lua does this (its tables use a hybrid array-plus-hash representation internally) and gets good results.
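
A rough user-level sketch of the check involved (dense-keys? is an invented name, and the real test and representation switch would live inside the runtime, not in Arc code):

  ; t when every key of tab is an integer in 0..(len tab)-1, i.e. the
  ; table is being used like an array and could be backed by
  ; contiguous storage
  (def dense-keys? (tab)
    (let n (len tab)
      (all [and (isa _ 'int) (>= _ 0) (< _ n)]
           (keys tab))))

  (let v (table)
    (for i 0 9 (= (v i) (* i i)))
    (dense-keys? v))   ; -> t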



4 points by nlavine 6172 days ago | link

I disagree, actually, but not because it wouldn't work. It seems to me that the simplest way to think about data storage is to give the programmer two options: store data in noncontiguous memory with pointers, with O(n) access but O(1) insertion, reordering, and so on, or store it in contiguous memory in an array, with O(1) access but potentially O(n) insertion. It's an appropriate Arc concept because it gets at the core of what is going on, and lets the programmer do whatever they want, while keeping the language simple and elegant (only two options!). I believe everything else can be built out of that, if you really want to do it (not that I think hash tables should go away; they're far too useful for that. But arrays and pointers give you what you need to build whatever you want).
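
In Arc as it stands, only the first of these exists natively (lists); a table is used below as a stand-in for the O(1)-access option, since there is no raw array type:

  (= xs (range 1 10000))
  (= h (table))
  (for i 0 9999 (= (h i) (+ i 1)))

  (xs 9999)      ; walks 10000 cons cells: O(n) access
  (h 9999)       ; one hash lookup: roughly O(1) access
  (push 0 xs)    ; consing onto the front of a list: O(1) insertion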

-----

3 points by reitzensteinm 6172 days ago | link

But surely what matters is the freedom and power to express ideas, not to dictate their implementation to the compiler/runtime?

If the two systems behave exactly the same (mapping keys to values), and the only penalty is execution time (which, depending on how the automatic switching works, shouldn't be much), why should they be separate concepts in the language itself?
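
At the surface level Arc already treats them uniformly: lists, strings, and tables all answer to the same indexing and assignment syntax, so the split is really about what happens underneath. A small illustration:

  (= l '(a b c)
     s "abc"
     h (table))
  (= (h 0) 'a (h 1) 'b (h 2) 'c)

  (l 1)          ; -> b
  (s 1)          ; -> #\b
  (h 1)          ; -> b
  (= (h 3) 'd)   ; the same assignment form works on all three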

-----

1 point by nlavine 6171 days ago | link

I see your point, and you may be right. But I could respond to that: if all we're abstracting over is key/value mappings, then why do we have lists?

I think we're looking at this from different points of view. I'm thinking of a basis for the space of all data layouts in memory (so to speak), and you're thinking of having a generic interface to mappings, if I understand you right.

I have a thought that might shed some light on this, and I'd like to hear what you and everyone else think about it. It seems like we're talking about two types of maps: maps whose keys are arbitrary values (hash tables), and maps whose keys are nonnegative integers (arrays, but also some uses of lists).

The difference between them is the way you think about them. Hash tables are pure associative maps, which you use in an application server to store user data for cookies, but which you could also use as record types, because they're really the same thing. Maps indexed by integers, on the other hand, hold data in a sequence, like the temperatures in Durham, NC for the past month, and they tend to be variable-sized and ordered, because the integers have a natural order and because the most natural set of integers is all of them.
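
Concretely, the two uses look something like this in Arc (the field names and numbers are made up for illustration):

  ; keyed by arbitrary values: a record / associative map
  (= session (obj user "joesurfer" logged-in t))
  (session 'user)    ; -> "joesurfer"

  ; keyed implicitly by 0, 1, 2, ...: a sequence
  (= temps '(41 43 39 45 50))
  (temps 2)          ; -> 39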

Are we really looking for ways to represent these two concepts?

-----

1 point by reitzensteinm 6169 days ago | link

But do we think about arrays differently because there's a legitimate reason to, or because that's just the way it has worked in other languages?

I mean, the fact that something as simple as:

  (def array (elements) (table))

would bridge that conceptual gap suggests that although it will feel weird using a table as an array, including arrays solely for familiarity is probably the kind of backwards compatibility that Arc is looking to avoid.
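
For what it's worth, that one-liner already behaves like an array at the interface level; it just ignores the size hint. A quick usage example:

  (= a (array 10))
  (for i 0 9 (= (a i) (* i i)))
  (a 3)   ; -> 9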

You're right, we are definitely approaching this from different directions. I do think that a generic interface to mappings is the way to go, so long as the abstraction doesn't leak too much (i.e. a reasonably small performance loss).

-----

1 point by reitzensteinm 6172 days ago | link

I think that's an excellent idea.

But for particularly performance-intensive situations, it would be nice to have a function that creates a hash table that is internally already an n-element array.

There are probably only a few cases where this would actually be necessary (and it should only ever be done after profiling), but when you run into one of those cases, you're in deep trouble if there's no way around it.

An example would be writing a bucket system for a 2D or 3D set of objects and still having it perform acceptably. Reusing the buckets between calls would turn everything into an array quickly, but the thought of x*y*z hash maps realizing they should be arrays at runtime is a bit scary. Allocating arrays directly, on the other hand, should still be more than fast enough for use in real-time applications.
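
A rough sketch of what that might look like (cell, grid, bucket-key, and insert-obj are all invented names here; each bucket is a table used with dense integer keys 0, 1, 2, ..., which is exactly the case that would either get promoted at runtime or, with the constructor described above, be allocated as an array up front):

  (= cell 32          ; bucket width in world units
     grid (table))    ; maps bucket-key -> bucket

  (def bucket-key (x y)
    (string (trunc (/ x cell)) "," (trunc (/ y cell))))

  (def insert-obj (o x y)
    (let b (or (grid (bucket-key x y))
               (= (grid (bucket-key x y)) (table)))
      (= (b (len b)) o)))     ; append at the next integer index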

-----

2 points by zachbeane 6171 days ago | link

Does that mean that Arc must be unsuitable for implementing itself? Why?

-----