The immediate context for this is the post https://v8.dev/blog/v8-release-80 on the V8 blog, announcing that in V8 version 8.0, pointer compression saved an average of 40% of memory and (unlike the usual memory-time tradeoff) also brought performance improvements. (Design doc: https://docs.google.com/document/d/10qh2-b4C5OtSg-xLwyZpEI5Z...) So pointer compression is clearly a good thing. I don't know much about the history of pointer compression, but I know the following.
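V8's actual scheme is more involved (it keeps the whole heap in one reserved region and tags low bits), but the core trick — store a 32-bit offset, decompress against a base on use — can be sketched roughly like this (the `Heap` type and method names here are my own, not V8's):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy sketch of pointer compression: objects live in one contiguous
// region, so a "pointer" can be stored as a 32-bit offset from the
// region's base and widened back to a full pointer on dereference.
struct Heap {
    std::vector<uint8_t> storage;  // stands in for a reserved <=4 GB region
    uint8_t* base() { return storage.data(); }

    uint32_t compress(void* p) {
        return static_cast<uint32_t>(static_cast<uint8_t*>(p) - base());
    }
    void* decompress(uint32_t offset) {
        return base() + offset;
    }
};
```

Structs full of such offsets are half the size of structs full of raw 64-bit pointers, which is where the memory and cache wins come from.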
In 2008, Don Knuth posted on his then "news" page (https://cs.stanford.edu/~knuth/news08.html):
> A Flame About 64-bit Pointers
> It is absolutely idiotic to have 64-bit pointers when I compile a program that uses less than 4 gigabytes of RAM. When such pointer values appear inside a struct, they not only waste half the memory, they effectively throw away half of the cache.
> The gcc manpage advertises an option "-mlong32" that sounds like what I want. Namely, I think it would compile code for my x86-64 architecture, taking advantage of the extra registers etc., but it would also know that my program is going to live inside a 32-bit virtual address space.
> Unfortunately, the -mlong32 option was introduced only for MIPS computers, years ago. Nobody has yet adopted such conventions for today's most popular architecture. Probably that happens because programs compiled with this convention will need to be loaded with a special version of libc.
> Please, somebody, make that possible.
Presumably Knuth was not the only person asking for it, and in 2011 there was work on this: see "Making Knuth's wish come true: the x32 ABI" (http://blog.reverberate.org/2011/09/making-knuth-wish-come-t...) and Wikipedia/LWN coverage (https://en.wikipedia.org/w/index.php?title=X32_ABI&oldid=921... https://lwn.net/Articles/456731/)
Unfortunately, no one was using it (not sure why, maybe not many people who care about performance write the kinds of programs that would hugely benefit from this?), and the x32 ABI got sort of deprecated by late 2018 (https://www.phoronix.com/scan.php?page=news_item&px=Linux-Po... etc).
Now, a personal story. Recently, while searching for something on Stack Exchange, I came across a question related to Bentley's June 1986 Programming Pearls column that featured an invited literate program by Knuth and a review by Doug McIlroy, about which there is a lot of misinformation and misunderstanding on the internet (e.g. calling it an "interview question" and what not!). Anyway, this question on codegolf.SE (https://codegolf.stackexchange.com/questions/188133/bentleys...) was about implementing a fast solution to the same problem, and the "winner" was an elegant Rust program. I was curious about Knuth's original Pascal (WEB) program from 1986, so I studied it, translated it to C++, and found to my surprise that it ran faster than the fastest program that had been posted on the site! Looking closer into why, experimenting with this and that, it turned out AFAICT that the probable reason, ultimately, was that where the Rust program used (64-bit) pointers, the translation of Knuth's program (which had been targeting "common denominator" Pascal, without pointer types) used (32-bit) array indices, so it was able to fit twice as many struct values in the cache.
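To make the size difference concrete, here is a hypothetical node layout (illustrative only — these are not Knuth's or the Rust program's actual structures): swapping 64-bit pointers for 32-bit indices into a node array halves the node size, so twice as many nodes fit in cache.

```cpp
#include <cstdint>

// A linked/trie node holding a count and two links.
// With 64-bit pointers: 4 + 4 (padding) + 8 + 8 = 24 bytes on x86-64.
struct PtrNode {
    uint32_t count;
    PtrNode* next;
    PtrNode* child;
};

// Same node with 32-bit indices into a nodes[] array: 12 bytes,
// no alignment padding — twice the nodes per cache line.
struct IdxNode {
    uint32_t count;
    uint32_t next;   // index into nodes[], 0 as a sentinel for "none"
    uint32_t child;
};
```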
In fact, taking just this one idea (cache-friendliness) and using a regular trie data structure (as we're no longer operating under the memory and language constraints Knuth was) gives something even faster. (https://codegolf.stackexchange.com/a/197870) I'd been planning to write a blog post explaining all this -- the clever data structures used (tries, trie-packing, and hash tries), how they're used in the TeX program for hyphenation, the context in 1986 and the misunderstandings today, and my experiments with the programs -- but got distracted by other things; this post has reminded me to try again. :-)
Doesn't having 32-bit pointers make ASLR much less effective? I suppose for V8 it's not a big problem, because they use compressed pointers only where necessary, but if it were a compiler directive (like Knuth wanted) it would affect the whole program. I would not use that option for any program that has to process untrusted input.
It does: pointers in the V8 heap all lie within the same 4 GB region. But then again, Spectre also makes ASLR much less effective.
This is really interesting, and I think 32-bit pointers make sense for a significant class of problems. At the very least, having the option could be useful. But...
One time in Node, building a react-native project, the build failed with `FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory`. (See the issue here: https://github.com/expo/expo-cli/issues/94) I ended up fixing it with `export NODE_OPTIONS=--max_old_space_size=8000`, but it took a while to find that. In an era where developer experience is king, maybe the default should be 64-bit pointers with an option for 32-bit.
Also if a 32-bit pointer type option existed in say C, I worry that programmers would abuse it, chasing performance at the expense of bugs.
IMHO the best alternative to pointers is "tagged index handles" as described here:
https://floooh.github.io/2018/06/17/handles-vs-pointers.html
This approach drastically reduces the risk of memory corruption in unsafe languages, and at the same time keeps data structures compact, because handles rarely need to be 64 bits. It's a bit similar to the compressed pointers described in the original post: basically, split pointers into a "private" base pointer and a public offset/index, and use some bits of the public value for dangling-reference protection.
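A rough sketch of that idea (the names and bit widths here are my own choices, not the article's): pack an array index into the low bits of a 32-bit handle and a generation counter into the high bits. Freeing a slot bumps its generation, so a stale handle is detected rather than silently dereferencing reused memory.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// A 32-bit handle: [generation:12 | index:20].
struct Handle {
    uint32_t bits;
    uint32_t index() const { return bits & 0xFFFFF; }
    uint32_t generation() const { return bits >> 20; }
};

template <typename T>
class Pool {
    struct Slot { T value{}; uint32_t generation = 0; bool alive = false; };
    std::vector<Slot> slots;
public:
    Handle alloc(T v) {
        // Linear scan for a free slot; a real pool would keep a free list.
        for (uint32_t i = 0; i < slots.size(); ++i) {
            if (!slots[i].alive) {
                slots[i].value = v;
                slots[i].alive = true;
                return Handle{(slots[i].generation << 20) | i};
            }
        }
        slots.push_back({v, 0, true});
        return Handle{static_cast<uint32_t>(slots.size() - 1)};
    }
    void free(Handle h) {
        Slot& s = slots[h.index()];
        s.alive = false;
        s.generation = (s.generation + 1) & 0xFFF;  // invalidates old handles
    }
    // Returns nullptr for freed or stale handles instead of crashing/UB.
    T* get(Handle h) {
        Slot& s = slots[h.index()];
        if (!s.alive || s.generation != h.generation()) return nullptr;
        return &s.value;
    }
};
```

Note that `get` returning a raw pointer is only safe to use immediately; the article discusses why you should re-resolve the handle rather than cache the pointer.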