Untangling Lifetimes: The Arena Allocator

nine_k 3 days ago

This article is quite long, and spends many kilobytes to make the following points (AFAICT):

- Pure malloc/free allocation is error-prone and expensive; it's too granular in many cases.

- Stack allocation has obvious limitations due to its LIFO nature. RAII has similar limitations.

- Let's use a bunch o separate, independent allocators / memory arenas instead. We can free them more quickly in one go when needed. We can group objects by lifetime using them. Having thread-local arenas naturally separates thread-local allocations from program-global allocations.

This sounds pretty reasonable, and, AFAIK, Zig leans heavily on this concept. I wonder if Rust can reap some of the benefits of arena-based allocation by leveraging its lifetime tracking.

alextingle 3 days ago

This approach to memory management is completely at odds with the whole point of Rust.
Object orientated programming has conditioned programmers into believing that having a hairy nest of small allocations, all with pointers to each other, is the normal, unavoidable situation.
In fact, it creates all sorts of problems. First, and most obviously, it's really hard to keep track of all those allocations, so you get leaks, and use after free, and all the other familiar memory bugs. But you also get bloated memory use, with both your user code, and the allocator having to keep track of all those chunks of memory. You get poor cache utilisation. You incur often ridiculous CPU overhead constructing and tearing down these massive, intricate structures.
Rust makes it harder to trip over the memory bugs, but that makes it easier to keep on using the lots-of-tiny-allocations paradigm, which is a much bigger problem overall.
- kibwen 2 days ago
  
  > This approach to memory management is completely at odds with the whole point of Rust.
  No, not in the slightest. Rust works extremely well with arenas.
  > In fact, it creates all sorts of problems. First, and most obviously, it's really hard to keep track of all those allocations, so you get leaks, and use after free, and all the other familiar memory bugs.
  Given that the context of this subthread is Rust, I'm not sure why you bring this up. Rust doesn't exhibit any of these.
  > But you also get bloated memory use, with both your user code, and the allocator having to keep track of all those chunks of memory.
  No, Rust often uses less heap memory than the comparable C or C++ program, because it's so much easier to safely pass around pointers to the stack and thereby avoid the need to use the heap at all. Defensive copying isn't a thing in Rust.
  > You incur often ridiculous CPU overhead constructing and tearing down these massive, intricate structures.
  No, there is no ridiculous CPU overhead here. Most objects in Rust have trivial drop implementations and simply recursively free their children, who also have trivial drop implementations. Freeing memory does not show up on the list of performance bottlenecks for any ordinary Rust program.
- nextaccountic 2 days ago
  
  Many Rust programs lean heavily on arenas though.
pornel 3 days ago

Stack-allocated objects are used very often in Rust, and they're not limited to a strict LIFO usage.
Rust doesn't have C++'s RAII with destructors running unconditionally at the end of scope. In Rust, destructors run only once after the last use of a value, even if the value moves between scopes.
In the article's terms, Rust uses stack allocation wherever possible, but can also pass objects by value to make the complex non-LIFO uses possible (which is easier in Rust, because it doesn't identify objects by their address, doesn't need a pointer indirection to ensure uniqueness, and lifetimes ensure there are no dangling pointers to the old address).
Thread-local arenas are possible, and Rust can enforce that the objects never leave the thread (!Send types).
It's also possible to create memory arenas in any scope and use lifetimes to ensure that their objects can't leave the scope.
However, placing objects with destructors in arenas is tricky. Since the whole point is to let objects easily reference each other, there's nothing stopping them from trying to access the other objects in the same arena while the arena is being destroyed and runs the destructors. That's an inherent issue with arenas, in C too if the arena supports some kind of per-object cleanup.
- whytevuhuni 2 days ago
  
  I like your response the most among your siblings.
  A couple more points, based on how the bumpalo crate does it:
  - If you don't really need destructors, then it's fine to just not call them. This works for things like strings, lists of strings, objects with references to each other, etc.
  - If you do need a destructor, you can create a stack object that references and allocates into the arena, and it will be the stack object whose purpose is to run that destructor (but deallocation happens later). This preserves drop ordering while still getting some benefit of arena allocation.
SeanAnderson 3 days ago

> I wonder if Rust can reap some of the benefits of arena-based allocation by leveraging its lifetime tracking.
I'm reminded of one positive remark made here: https://loglog.games/blog/leaving-rust-gamedev/#ecs-solves-t...
> The key point being, this allows a language like Rust to completely side-step the borrow checker and allow us to do "manual memory management with arenas" without actually touching any hairy pointers, and while remaining 100% safe. If there was one thing to point at that I like about Rust, it'd be this. Especially with a library like thunderdome it really feels that this is a great match, and that this data structure very well fits the language as it was intended.
sirwhinesalot 3 days ago

Yes, you can have memory safe arenas in Rust, as with Bumpalo.
hinkley 3 days ago

> We can free them more quickly in one go when needed. We can group objects by lifetime using them. Having thread-local arenas naturally separates thread-local allocations from program-global allocations.
These are at odds due to concurrency and object lifetimes. Reaching into memory allocated by another thread on another core was never free but has only gotten more costly.
You can either pay more at allocation time, use time, or have a copying collector. But you have to deal with at least one.
So for a cache or a lazily loaded lookup table, you want an arena allocator. But the same data structure used within the scope of say a single request should be a thread local arena.

HarHarVeryFunny 3 days ago

I don't see this as very well conceived article, since the two concepts being discussed, manual memory management, and arena allocators are really orthogonal.

Arena allocators aren't going to save you from memory management bugs, and aren't intended to. They are just an efficient may to allocate and free a bunch of chunks of memory that have the same lifetime, for example due to belonging to the same data structure. The idea is that you just sequentially allocate from a large chunk of memory, then free up the entire large chunk in one go at the end of the collective lifetime of the allocated pieces.

The things that make memory management in C error-prone are that:

1) C doesn't have objects/destructors, so all freeing of memory is manual, thereby creating the possibility that you mess it up.

2) C's pointers don't have ownership semantics, and therefore also doesn't have niceties like shared ownership (cf C++'s shared vs unique pointers). All ownership tracking is manual, giving you another chance to mess up.

The only think that is going to save you from memory management errors in C is programmer discipline and attention to detail.

gorjusborg 3 days ago

It seems like the choice of a stacklike Arena API makes the examples a little more confusing than needed. An arena doesn't necessarily mean allocation 2 must be freed before allocation 1.

If this seems cool to you, check out Zig. The libraries use a similar convention where code that might allocate requires passing an allocator, which may be an arena, or something cool we don't even know about yet.

ykonstant 3 days ago

The author's defense of C reminds me of this classic youtube video: https://www.youtube.com/watch?v=443UNeGrFoM&pp=ygUPaG93IGkgc...

I am sure the video above will cause immediate disagreement (I think it goes too far on some topics), but I urge people to consider the ideas contained within.

(I seem to have mis-posted this to another thread?)

dataangel 3 days ago

Ryan is one of the people who really badly want arenas to be a borrow checker substitute, but they're not. You still need to make sure references to objects in the arena don't outlive the arena itself.

williamcotton 3 days ago

Another use case is a per-request arena for a web server.

nine_k 3 days ago

If PHP did one thing right, it is this: allocate resources while handling a request, free them all unconditionally when the response has been sent.