Discussion:
Strategies for using several GB of memory
Geoff Leyland
2014-09-07 01:59:55 UTC
Permalink
Hi,

I’m working with OpenStreetmap data, specifically in this case, road centrelines, and am running into LuaJIT’s memory limits.

I store centrelines of road segments, each of which is an x, y polyline, and each segment centreline is stored as a Lua table of points, where a point is also a table with two elements: {{1,2},{3,4},{4,5},...}.

The code analyses the segments, looking for segment crossings, near misses and so on, and then breaks the segments at network nodes (and joins them where there’s a break, but no actual node) and builds a routable network. It’s convenient, but not essential, to have the polylines stored in tables and garbage collected because they go through quite a lot of transformation.

This all works fine for smaller areas (I live in New Zealand), but with larger areas, I run into memory limitations. Can anybody suggest good strategies for dealing with the problem?

I could, for example, malloc the memory for each segment’s polyline, which would alleviate the problem, partly because the storage would be more compact, and also because, with a bit of luck, malloc might return addresses above 4G. Unfortunately, there’s no guarantee that malloc won’t just take up space in the precious heap below 4G. I don’t know of a way of hinting to malloc that I’d like higher memory.

For other work, where I know how much memory I need in advance, I just mmap what I need, hinting to mmap that I want the memory above 4G (and then checking and repeating until it does what I want). Since each individual polyline is fairly small and there are plenty of them, using mmap as a malloc substitute would be a fairly poor idea.

Is there perhaps a way I can induce LuaJIT to grab all the memory it can on startup, and then not release it, so that subsequent calls to malloc are forced above the memory LuaJIT wants?

Thanks for any ideas,
Geoff
demetri
2014-09-07 02:12:56 UTC
Permalink
This may not be a good fit for you but if your objects are
of small enough variation in size it might not be terrible to
use mmap and a very dumb manual allocation strategy
(consume slots from an array on alloc, add the slot id to
a free list on release). I did this once with reasonable
success but it was a simple research project and I didn't
take it far enough to feel confident that I've considered
all the possible failure modes :)

I too would be interested in any general solutions to this
problem!

Demetri
Hi,
I’m working with OpenStreetmap data, specifically in this case, road
centrelines, and am running into LuaJIT’s memory limits.
I store centrelines of road segments, each of which is an x, y polyline,
and each segment centreline is stored as a Lua table of points, where a
point is also a table with two elements: {{1,2},{3,4},{4,5},...}.
The code analyses the segments, looking for segment crossings, near misses
and so on, and then breaks the segments at network nodes (and joins them
where there’s a break, but no actual node) and builds a routable network.
It’s convenient, but not essential, to have the polylines stored in tables
and garbage collected because they go through quite a lot of transformation.
This all works fine for smaller areas (I live in New Zealand), but with
larger areas, I run into memory limitations. Can anybody suggest good
strategies for dealing with the problem?
I could, for example, malloc the memory for each segment’s polyline, which
would alleviate the problem, partly because the storage would be more
compact, and also because, with a bit of luck, malloc might return
addresses above 4G. Unfortunately, there’s no guarantee that malloc won’t
just take up space in the precious heap below 4G. I don’t know of a way of
hinting to malloc that I’d like higher memory.
For other work, where I know how much memory I need in advance, I just
mmap what I need, hinting to mmap that I want the memory above 4G (and then
checking and repeating until it does what I want). Since each individual
polyline is fairly small and there are plenty of them, using mmap as a
malloc substitute would be a fairly poor idea.
Is there perhaps a way I can induce LuaJIT to grab all the memory it can
on startup, and then not release it, so that subsequent calls to malloc are
forced above the memory LuaJIT wants?
Thanks for any ideas,
Geoff
Geoff Leyland
2014-09-07 02:28:39 UTC
Permalink
Post by demetri
This may not be a good fit for you but if your objects are
of small enough variation in size it might not be terrible to
use mmap and a very dumb manual allocation strategy
(consume slots from an array on alloc, add the slot id to
a free list on release). I did this once with reasonable
success but it was a simple research project and I didn't
take it far enough to feel confident that I've considered
all the possible failure modes :)
Unfortunately some polylines have two points and others have hundreds. I could just mmap far too much and just use it linearly (never free anything), which would be fairly simple I guess, but as soon as you start trying to be clever you're on the road to your own version of malloc, and someone smarter already did that better.

At the moment, I’m just working on mallocing the memory and hoping that the more efficient storage and chance allocations above 4G will get me through Australia at least.
Justin Cormack
2014-09-07 09:59:09 UTC
Permalink
Post by Geoff Leyland
Unfortunately some polylines have two points and others have hundreds. I could just mmap far too much and just use it linearly (never free anything), which would be fairly simple I guess, but as soon as you start trying to be clever you're on the road to your own version of malloc, and someone smarter already did that better.
At the moment, I’m just working on mallocing the memory and hoping that the more efficient storage and chance allocations above 4G will get me through Australia at least.
You could just integrate jemalloc, and fix it to use mmap over 4G for
allocations.

Justin
Karel Tuma
2014-09-07 11:08:58 UTC
Permalink
Post by Geoff Leyland
Is there perhaps a way I can induce LuaJIT to grab all the memory it can on startup, and then not release it, so that subsequent calls to malloc are forced above the memory LuaJIT wants?
Chances are you are still running out of LuaJIT limit (1GB on Linux/x64). On Linux/x64,
LuaJIT uses 0x40000000-0x80000000 range, without ever looking elsewhere. libc
malloc uses 0x0???????-0x40000000 at first, then brk fails as it encounters mmaped,
segment and continues to mmap way above the 32bit lowmem. If LuaJIT yells at you it
ran out of memory, most likely the 0x40000000-0x80000000 is full only with
data from LuaJIT allocator.

If you really insist on eating up all memory in 32bit space at startup, I'm using
code like [1] for that (and the resulting chunks are fed to LuaJIT allocator,
instead of mmap).

[1] https://gist.github.com/katlogic/4721e91ff5a845d2b939
ivan starkov
2014-09-07 11:35:43 UTC
Permalink
Post by Geoff Leyland
Is there perhaps a way I can induce LuaJIT to grab all the memory it can
on startup, and then not release it, so that subsequent calls to malloc are
forced above the memory LuaJIT wants?

I've rewrote mmap for similar task and patch luajit code,
https://github.com/istarkov/lua_osx_allocator

Patch replaces calls to mmap in luajit code to ice_alloc.
ice_alloc - at first call preallocates a big chunk of low memory addresses
for luajit.
Post by Geoff Leyland
Post by Geoff Leyland
Is there perhaps a way I can induce LuaJIT to grab all the memory it can
on startup, and then not release it, so that subsequent calls to malloc are
forced above the memory LuaJIT wants?
Chances are you are still running out of LuaJIT limit (1GB on Linux/x64). On Linux/x64,
LuaJIT uses 0x40000000-0x80000000 range, without ever looking elsewhere. libc
malloc uses 0x0???????-0x40000000 at first, then brk fails as it encounters mmaped,
segment and continues to mmap way above the 32bit lowmem. If LuaJIT yells at you it
ran out of memory, most likely the 0x40000000-0x80000000 is full only with
data from LuaJIT allocator.
If you really insist on eating up all memory in 32bit space at startup, I'm using
code like [1] for that (and the resulting chunks are fed to LuaJIT allocator,
instead of mmap).
[1] https://gist.github.com/katlogic/4721e91ff5a845d2b939
Karel Tuma
2014-09-07 11:45:13 UTC
Permalink
Post by ivan starkov
I've rewrote mmap for similar task and patch luajit code,
https://github.com/istarkov/lua_osx_allocator
You might want to consider PROT_NONE+mprotect so it does not explode when
memory is constrained.
Tony Finch
2014-09-10 09:29:53 UTC
Permalink
Post by Geoff Leyland
I store centrelines of road segments, each of which is an x, y polyline,
and each segment centreline is stored as a Lua table of points, where a
point is also a table with two elements: {{1,2},{3,4},{4,5},...}.
Do you save enough memory by representing the polylines as
{x,x,x,x},{y,y,y,y} instead of {{x,y},{x,y},{x,y},{x,y}} ?

Tony.
--
f.anthony.n.finch <dot-***@public.gmane.org> http://dotat.at/
Trafalgar: Cyclonic in northwest, otherwise mainly northerly or northwesterly
5 or 6. Slight or moderate. Showers in northwest. Good.
Geoff Leyland
2014-09-10 09:47:51 UTC
Permalink
Post by Tony Finch
Post by Geoff Leyland
I store centrelines of road segments, each of which is an x, y polyline,
and each segment centreline is stored as a Lua table of points, where a
point is also a table with two elements: {{1,2},{3,4},{4,5},...}.
Do you save enough memory by representing the polylines as
{x,x,x,x},{y,y,y,y} instead of {{x,y},{x,y},{x,y},{x,y}} ?
Good question. I didn’t actually try it, but I suspect not. I went straight on to mallocing arrays of “struct point { double x, y; };”. That wasn’t quite enough, so I moved a few more members of the road segments into cdata storage. A road segment is now:

{
id = “a string”,
road = <table 0x…>,
cdata = struct { double min[2], max[2]; point *points; unsigned point_count; }
}

Min and max are the bounding box of the polyline. Moving them into cdata made a big difference because previously bounds was three tables: {min={0,1}, max={3,4}}

That gets me part of the way, but I now run out of memory when I build a kdtree of the ends of the polylines.
Continue reading on narkive:
Loading...