points by gruseom 14 years ago

I still think there's room for experimenting with a Lisp that has the hashmap (a.k.a. dictionary, associative array, property bag) as its organizing principle rather than the list. There's nothing better than hashmaps for exploratory programming. If there is something to be gained from building a Lisp in terms of them (from the ground up; I don't mean support for object literals), I'd like to know what it is. This has little to do with syntax.

lisper 14 years ago

This presents a very thorny language design problem. Lists have this very nice property that they are ordered, which lets you attach implied semantics to the position of an element, e.g.:

(defun foo (a b c) ...)

We know this is a function defining because the symbol DEFUN is in the FIRST position of the list.

Associative arrays are unordered, so you have to explicitly label everything:

{ top-level-form : defun, arguments : { 1 : a, 2 : b, 3 : c }, ... }

Or something like that. You can see where that would get annoying.

Exercise for the reader: what happens if you base a Lisp-like language on vectors instead of cons cells?

  • gruseom 14 years ago

    Could you allow (a b c) as a synonym for { 0 a 1 b 2 c }? That's more or less how JS deals with this, though JS feels messy under the hood.

    what happens if you base a Lisp-like language on vectors instead of cons cells?

    I want to know that too! One thing is that you lose cons and cdr (except by copying the vector), so you lose the ability to do most of the classic Lisp things efficiently. You must have something in mind other than this?

    • lisper 14 years ago

      > Could you allow (a b c) as a synonym for { 0 a 1 b 2 c }?

      Yes, of course, but all you've done is re-invent vectors.

      > you lose cdr (except by copying the rest of the vector)

      Copying the rest of the vector is not semantically equivalent to CDR unless you are being purely functional. But you can actually implement CDR using displaced vectors.

      > You must have something in mind other than this?

      Could be :-)

      Try writing EVAL for a vector-based Lisp and see what happens. In particular, when you're done, make a list of all the primitives you needed to employ in order to do it. There's a deep insight at the end of this process. (No, I'm not going to tell you what it is.)

      • gruseom 14 years ago

        Yes, of course, but all you've done is re-invent vectors.

        True, but only as a device to keep the playing field open for any new benefits that might spring from universal hashmaps. What are they? I don't know, that's the question. Maybe there aren't any, but that itself would be interesting as a negative result (the L in Lisp is there for a reason).

        What I'd really like to know is what would complete this sequence: list:Lisp, array:APL, stack:Forth, hashmap:?. That would be really interesting, no? And no one seems to be working on it, which is surprising given the ubiquity of hashmaps in modern languages (especially JS, but also Python, Lua, etc).

        • jes5199 14 years ago

          when I had a similar thought about hashmaps, I noticed that a hashmap is basically a function that takes one argument and returns a constant. And then I fell into a Turing tar pit about Church encoding. ... I'd love to try again, though.

      • munificent 14 years ago

        > In particular, when you're done, make a list of all the primitives you needed to employ in order to do it. There's a deep insight at the end of this process. (No, I'm not going to tell you what it is.)

        Having done this before (https://github.com/munificent/lark), all it meant for me was that numbers and basic integer arithmetic became primitive. The clever bit about McCarthy's original metacircular interpreter is that it didn't need any types at all beyond atoms and cons cells: no bools, no ints, no nothing.

        I found it interesting, but I don't know if I reached any deep insight. Maybe that's just me, or maybe I missed something.

        • lisper 14 years ago

          I suppose depth is in the eye of the beholder. That's pretty much it: to build a Lisp out of vectors you need numbers as primitives. To build a Lisp out of cons cells you don't.

          Follow-up exercise: do you need numbers to build a Lisp out of associative arrays? (Hint: this is not a trivial question to answer.)

          • vorg 14 years ago

            In Clojure an associative array auto-decomposes to a sequence of pairs when necessary.

          • akkartik 14 years ago

            Rereading McCarthy's paper yesterday I noticed for the first time that it didn't need numbers. But I hadn't connected all the dots yet.

  • akkartik 14 years ago

    I'll bite: cdr gets expensive? You could represent a list as a naked array of lisp pointers, but then cons gets expensive. And good luck garbage collecting the vectors.

    • lisper 14 years ago

      > cdr gets expensive

      Yes, that's one thing. But do you really need CDR?

      > You could represent a list as a naked array of lisp pointers, but then cons gets expensive.

      Actually, it's much worse than that. CONS actually becomes impossible. (Think about it.) But maybe there's something else that you can use instead of CONS?

      > And good luck garbage collecting the vectors

      Why would that be a problem? Extant Lisps collect vectors just fine.

      • akkartik 14 years ago

        "Extant Lisps collect vectors just fine."

        But they don't permit pointers inside the vectors, do they? I think if you permit internal pointers so you can have copyless cdr, computing reachability would get really expensive.

        • lisper 14 years ago

          The phrase "permit pointers inside the vectors" doesn't really make sense. Vectors are just ordered collections of objects. They don't have insides and outsides, except insofar as the objects contained by the vector could be (but usually aren't) considered "inside" the vector. Lisp vectors are usually implemented as arrays of pointers, so in that sense pointers are "permitted" inside vectors. But I'm guessing that's not what you meant. (I have no idea what you actually meant.)

          • akkartik 14 years ago
              (a (b (c . nil))
              ^  ^
              |  |
              |  |
              A  B
            

            If you implement lists as vectors with car/cdr (I'm still thinking about your other questions) and choose to have cdr not copy results out of the original cell, you can end up in the (rope-like) situation above with pointer B (value (b c)) in addition to pointer A (value (a b c)). But that makes GC inefficient. For a vector to be GC'd you have to know what internal pointers it has, and then you have to make the choice to copy them out if they're 'toward the end', or to leave the vector uncollected.

            (ignore pointers to a, b, and c.)

            • lisper 14 years ago

              Ah. What you are calling a "pointer to the inside of a vector" is actually called a displaced vector. It's a standard Common Lisp feature, and no problem for modern GCs.

              • akkartik 14 years ago

                Interesting! Thanks.

              • ori_b 14 years ago

                It does mean that you are potentially keeping around masses of data when you only care about one or two elements. Normally not an issue, but normally internal pointers ("displaced arrays") aren't a central organizational concept in a language.

  • akkartik 14 years ago

    The point of json, I think, is that you'd only use associative arrays when you need them.

  • akkartik 14 years ago

    Nobody seems to have mentioned the advantage of table literals: free syntax for ruby-style keyword arts.

  • jes5199 14 years ago
      (defun foo (a b c))

    could accept a map of variable names

      (foo {a: 1, b: 2, c: 3})
  • baddox 14 years ago

    > Associative arrays are unordered

    That's not true in general, is it? It's just hash maps that are unordered. You can use a tree to represent an associative array (so the order of keys is found by a tree traversal), or a "worse" method like an association list (linked list). You can also just store a linked list of keys along with your hash map.

    • lisper 14 years ago

      You are conflating the abstract data type with its implementation. A hash-map is one implementation of an associative array. A tree is another implementation of an associative array. A tree happens to depend on an ordering of the keys in order to do efficient lookup and a hash-map doesn't, but this is an implementation detail. It's not part of the definition of an associative array. And this is a good thing because not all data types have well defined orderings but you still might want to use them as keys, with symbols being the most notable example.

      • wanorris 14 years ago

        I think the mistake the parent post has made is conflating the kind of ordering you're referring to -- ordering in a tree implementation to allow for O(log n) resolution -- and ordering to preserve the position of each key-value pair as it was in the source structure. These are only the same thing if the source structure already happens to be sorted by some kind of ordering, something that obviously cannot be extended to the general case of preserving positionality.

        AFAIK, the only two ways to preserve positionality as part of an associative array are (1) to store as a postional list of key-value pairs and use linear search to find matching keys, or (2) to use an additional data structure as an index, either by storing as a positional list and having a additional map from key value into the positions, or by storing as a map and having an additional list of keys stored in positional ordering.

        • ori_b 14 years ago

          Or by having the map nodes have a "next" pointer.

          • wanorris 14 years ago

            Duh, I should have thought of that. Thank you.

  • ekiru 14 years ago

    Smalltalk requires explicit labeling of all but the first argument (though they are ordered) for keyword messages. Good naming conventions can make this enjoyable (at least subjectively).

    In a hashmap based lisp, I'd expect defun to look something like

    (define-function: (foo-with-a: a b: b c: c) with-body: ...)

    Of course, since you're using an unordered representation, one could instead say (with-body: ... define-function: (c: c b: b foo-with-a: a)), which is significantly more confusing. This also means you probably can't do implicit progn.

    progn in general will probably be unpleasant, unfortunately, unless one includes syntactic sugar for ordered sequences, which might defeat the purpose of an associative-array-based Lisp.

    The first thing I've noticed about a vector-based Lisp is that you actually need primitive integers so that you can do indexing, which isn't the case with cons-cell based Lisps.

  • sedachv 14 years ago

    > Exercise for the reader: what happens if you base a Lisp-like language on vectors instead of cons cells?

    I've been thinking about this in terms of a self-hosted Parenscript. What I see is a lot more pattern matching and map/reducing. Not altogether a bad thing.

premchai21 14 years ago

Lua tables are a general-purpose associative structure that are used for both map and list functions in that language: there's syntactic sugar for both consecutive integer keys and simple string keys, and the former are used to make sequences (with internal, semantics-preserving optimizations in the Lua runtime for storing them). They strike me as one of the best options for a single core data structure if one were limited to one.

  • tikhonj 14 years ago

    I haven't use Lua, but that sounds exactly like JavaScript objects. JavaScript does get on very well with basically just these objects.

    • premchai21 14 years ago

      They're similar, but the Lua form is much cleaner. In particular, JS objects only permit string keys, which makes Arrays sort of preposterous; they also have other baggage related to the prototype system. Lua tables can have any type except nil (which is not truly first-class in Lua) as keys, and can also have "metatables", which are of similar or greater power to the latter baggage but semantically somewhat simpler. To me this makes a world of difference.

cyrus_ 14 years ago

You could argue that Javascript is a language with the hashmap as its central data structure. It doesn't have a macro system/quotations/other LISPy things, of course, but you can certainly frame your argument as "would JS benefit from <LISP feature>?"

  • gruseom 14 years ago

    Yes, JS certainly is that (or comes close), which is the thing I like the most about JS and miss the most in Lisp - so much so that I wrote macros to allow me to use JS-style objects wherever possible. (I don't mean JS-style OO, which I avoid at all costs. Hate "this" and "prototype", love "a.b" and "a[b]".) The benefit of the hashmap for exploratory programming is so great that it raises the question of whether a fully regular/metaprogrammable Lisp could be evolved from it. Not clear if it would be practical, or add anything of value, but if it did it would be interesting.