Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
ugexe MasterDuke: any reason to not also hook up lchown and fchown similar to node.js? 00:07
lizmat more memory gc oddities: 09:24
for ^100 { my $a = $io.slurp.substr(0,1) } on a 4MB file ends up with a max-rss of 1296336 aka about 1.2GB 09:25
prefix this with a `start { }`
and the max-rss winds up at 832080 aka about 800MB
with *less* CPU usage (5.73 -> 5.70) 09:26
If I do a force_gc for every 10 supervisor runs, max-rss goes down to 209056 (about 200MB) and time down to 5.65 09:30
if I do it for *every* supervisor run, it goes down to 187824 09:31
with wallclock / cpu slightly up
jnthn ^^ anything that could explain that? 09:32
it really looks like: 09:38
1. having the supervisor running has a remarkable positive effect on memory usage
nine Is that start { } effect with or without the force_gc in the supervisor? 09:39
lizmat 2. calling nqp::force_gc for every supervisor run has the potential of reducing memory usage to 20% of the original usage with minor CPU benefits
well, if I don't put the start {} in, there is no supervisor 09:40
but the effect is there even without the force_gc in
nine That's curious. But I'd say it's not conclusive. It could just be that the different memory usage profile just triggers the GC at more opportune times. 09:48
lizmat I'm creating a PR with support for a RAKUDO_GC_EVERY environment variable 09:49
that would allow you to force a nqp::force_gc for every N supervisor ticks
so we can experiment with values and see if it helps... 09:50
to give an example: the IRC logs server, when it is running for about a week
grows to about 20GB 09:51
yet MacOS is able to compress 7GB+ of that
so the actual usage is only 13GB
nine Side note: a tight loop dealing with strings that size is a bit of a worst case. MoarVM will start a GC run when the nursery is full. The nursery is 4MB by default. String objects are only 40 bytes large, with the string data itself getting allocated on the heap. So the nursery can hold 100K of those 4MB strings before a GC gets triggered.
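The arithmetic behind the "100K" figure can be checked with a small sketch. It uses only the numbers quoted above (a 4MB nursery, 40 bytes per string object, 4MB of data per string) and treats "MB" as MiB, which is an assumption. The variable names are made up for illustration.

```raku
# Numbers taken from the message above; MB is read as MiB (assumption).
my $nursery-bytes = 4 * 1024 * 1024;   # nursery size
my $object-bytes  = 40;                # one string object in the nursery
my $payload-bytes = 4 * 1024 * 1024;   # heap data behind each string

# How many string objects fit before the nursery is full.
my $objects = $nursery-bytes div $object-bytes;

# Upper bound on the heap data those objects could reference.
my $unmanaged-gib = ($objects * $payload-bytes / 2**30).round;

say "$objects string objects fit in the nursery";
say "they could reference roughly $unmanaged-gib GiB of string data";
```

This is only an upper bound for a loop that allocates nothing else in the nursery; real code allocates other objects too, so a GC would normally be triggered earlier.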
lizmat aha! 09:52
I guess parsing logs is such a worst case then, generally :-( 09:53
nine Parsing logs, no. Typical log parsing would be: process a line (typical size in the 100 bytes range), get some information out of there, store the information somewhere (database, disk), then move on to the next line. I wonder what you keep that information around for? 09:55
Oh, of course one major use case doesn't even store the information. It's the one where we just print it to stderr, e.g. grep some logs
lizmat right as in "rak" :-)
github.com/rakudo/rakudo/pull/5097 10:00
nine FWIW I really don't like that PR and don't think it should be merged. Triggering a GC run every x seconds is downright trivial without modifications to the core and without depending on the supervisor thread: start { $*VM.request-garbage-collection; sleep 0.01 } 10:03
lizmat and a loop I assume :-) 10:04
nine Thread.run({ loop { $*VM.request-garbage-collection; sleep 0.01 } }) 10:05
You get the idea :D
lizmat yeah, but that would take a thread, and in the supervisor it wouldn't
nine So?
If your application is multi threaded, use the start { } version which runs on the thread pool, thereby not needing its own thread at all. If your app is not multi threaded use Thread.run to avoid having the supervisor. 10:06
If you absolutely want to avoid threading for this (for some obscure reason), put that $*VM.request-garbage-collection into your main loop 10:07
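A minimal sketch of the thread pool variant described above, with the loop included. `periodic-gc` and `$done` are hypothetical names added for illustration so the loop can be stopped cleanly; the only VM call used is `VM.request-garbage-collection`, as mentioned in this log.

```raku
# Hypothetical helper: request a GC run every $interval seconds
# on the thread pool until $done is kept.
sub periodic-gc(Real $interval, Promise $done) {
    start {
        my $requests = 0;
        until $done {                       # a Promise is true once kept or broken
            VM.request-garbage-collection;
            ++$requests;
            sleep $interval;
        }
        $requests                           # result of the start block
    }
}

my $done   = Promise.new;
my $worker = periodic-gc(0.01, $done);

# Stand-in for the real workload.
my @kept = (^1000).map({ "x" x 1000 });
sleep 0.1;

$done.keep;                                 # ask the helper to stop
my $count = await $worker;
say "GC requested $count times";
```

For an application that is not multi threaded, the same loop body could run on a dedicated thread instead. As far as I know, the class-method form in Rakudo for creating and starting a thread is `Thread.start({ ... })`, but treat that as an assumption and check the `Thread` documentation.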
lizmat ok, noted... 10:15
going afk for most of the day&
lizmat thoughts about 100K 4MB strings in a nursery: 12:58
perhaps each item in the nursery should get a weight associated with it 12:59
and as soon as the entry is stale, the weight gets added to some counter
and if that counter exceeds X, a GC run would be initiated ? 13:00
I realize this is extra work in a very hot code path, but wanted to throw out this idea anyway, so it can be shot down properly (or not :-) 13:01
nine LOL I just noticed that I have got some 2.7 TiB of rr recordings on my disk... Thanks to (presumably) sparse files, they only took ~120 GiB of actual space though. 13:15
lizmat nice 13:16
reminds me of a situation in the early 2000's where a mod_perl server would need about 30 seconds to startup, because it would preload a *lot* of stuff 13:17
which would grow up to 700 MB or so
and then spawn about 100 children, and see memory usage go up to 70GB on a machine with 8GB of memory :-) 13:18
it worked because each child only had about 10MB unique memory, and as soon as that exceeded 30MB or so, it would commit suicide :-) 13:19
nine MoarVM already keeps track of unmanaged sizes and uses that for making decisions about full GC runs. I'm somewhat sceptical about using it also during allocation from the nursery as, like you mentioned, this is an incredibly hot path. We're allocating an unbelievable number of objects.
lizmat I understand
but I feel for these particular workflows, it's not making the right decisions 13:20
also: I'd be more worried about the extra memory for each nursery entry 13:21
anyways, going afk again&
nine It's always going to be a compromise. If it fits your workload very badly and more collection runs would help, there's always $*VM.request-garbage-collection
Worried how?
lizmat overflowing CPU caches earlier 13:24
actually VM.request-garbage-collection also works, no need to look up a dynamic variable
yes, it works... but it requires code changes, which my PR would not: 13:25
it would allow you to tune *your* application externally
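For comparison, a sketch of what external tuning could look like when done in application code, following the earlier "put it into your main loop" suggestion. `APP_GC_EVERY` is a hypothetical variable name chosen for this example; it is not the `RAKUDO_GC_EVERY` from the PR, and the loop body is only a stand-in workload.

```raku
# APP_GC_EVERY is hypothetical: request a GC every N iterations, 0 or unset disables it.
my $gc-every = (%*ENV<APP_GC_EVERY> // 0).Int;

my $requests = 0;
for 1..100 -> $i {
    # Stand-in workload: build a large string, keep only its first character.
    my $first = ("x" x 4_000_000).substr(0, 1);

    if $gc-every && $i %% $gc-every {
        VM.request-garbage-collection;
        ++$requests;
    }
}
say "requested $requests GC runs";
```

Unlike the PR, this still needs a one-time code change in the application, which is the trade-off being discussed here.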
nine Expanding a bit more: even if we would want to take the performance hit and took unmanaged size into consideration, what would we do exactly? The thing is, we don't know e.g. how long those strings are going to stay around. If we start a collection, just because someone allocated 100 MiB of strings and they are actually still in use, we've done a completely unnecessary GC run which is not exactly free.
Your PR only works in applications which make use of the thread pool. So it's not exactly a generic solution either. 13:26
lizmat ah, duh, we don't know whether an entry in the nursery is stale *until* we do a GC run, right ? 13:27
was thinking with a reference counting mind :-(
nine: in the PR I mentioned that maybe specifying that env variable should start the supervisor 13:28
nine GC tuning is a hard problem and an area of ongoing research.
lizmat completely understood 13:29
that's why I think it should be easy to tweak a setting, such as in my PR
afk& 13:31
nine That's just one knob that happens to work nicely in one specific use case. If we decide to tackle this problem, we ought to do it in a well designed way that covers a large range of use cases and provides all the knobs required for this at different layers, e.g. as APIs for applications themselves and configurables for users. 13:37
nine Oh boy.... I finally came across my old adversary: BEGIN time EVAL 20:36
MasterDuke the jvm famously has very tunable GCs, might want to look there 22:22
ugexe: mostly because moarvm doesn't expose fchmod. but i guess lchown might be good to have 22:28
MasterDuke why can i never remember how to bump a submodule? 23:05