Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
ugexe MasterDuke: any reason to not also hook up lchown and fchown similar to node.js? 00:07
lizmat more memory gc oddities: 09:24
for ^100 { my $a = $io.slurp.substr(0,1) } on a 4MB file ends up with a max-rss of 1296336 aka about 1.2GB 09:25
prefix this with a `start { }`
and the max-rss winds up at 832080 aka about 800MB
with *less* CPU usage (5.73 -> 5.70) 09:26
If I do a force_gc for every 10 supervisor runs, max-rss goes down to 209056 (about 200MB) and time down to 5.65 09:30
if I do it for *every* supervisor run, it goes down to 187824 09:31
with wallclock / cpu slightly up
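Reconstructed as a minimal standalone sketch (the file name is hypothetical; any ~4MB file will do):

    my $io = "4mb-file.txt".IO;          # hypothetical ~4MB input file
    # start { }                          # uncommenting this line starts the thread pool supervisor
    for ^100 {
        my $a = $io.slurp.substr(0, 1);  # each iteration slurps a fresh ~4MB string
    }

The max-rss figures above are in KB (1296336 KB ≈ 1.2GB), so they can be reproduced with a tool that reports maximum resident set size in kilobytes, e.g. GNU time's `-v` output.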
jnthn ^^ anything that could explain that? 09:32
it really looks like: 09:38
1. having the supervisor running has a remarkably positive effect on memory usage
nine Is that start { } effect with or without the force_gc in the supervisor? 09:39
lizmat 2. calling nqp::force_gc for every supervisor run has the potential of reducing memory usage to 20% of the original usage with minor CPU benefits
well, if I don't put the start {} in, there is no supervisor 09:40
but the effect is there even without the force_gc in
nine That's curious. But I'd say it's not conclusive. It could just be that the different memory usage profile triggers the GC at more opportune times. 09:48
lizmat I'm creating a PR with support for a RAKUDO_GC_EVERY environment variable 09:49
that would allow you to force an nqp::force_gc for every N supervisor ticks
so we can experiment with values and see if it helps... 09:50
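A minimal sketch of the logic such a knob implies (illustrative only, not the PR's actual code; the variable names and the 10ms tick are assumptions):

    use nqp;
    my $gc-every = +(%*ENV<RAKUDO_GC_EVERY> // 0);  # N = force a GC every N ticks
    my int $ticks = 0;
    loop {                        # stand-in for the supervisor loop
        sleep 0.01;               # the supervisor wakes up roughly every 10ms
        nqp::force_gc() if $gc-every && ++$ticks %% $gc-every;
    }

With such a knob, tuning would happen entirely from the outside, e.g. RAKUDO_GC_EVERY=10 raku yourapp.raku.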
to give an example: the IRC logs server, after it has been running for about a week
grows to about 20GB 09:51
yet macOS is able to compress 7GB+ of that
so the actual usage is only 13GB
nine Side note: a tight loop dealing with strings that size is a bit of a worst case. MoarVM will start a GC run when the nursery is full. The nursery is 4MB by default. String objects are only 40 bytes large, with the string data itself allocated on the heap. So the nursery can hold 100K of those 4MB strings before a GC gets triggered.
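nine's arithmetic, spelled out:

    my $nursery-size  = 4 * 1024 * 1024;   # default nursery size: 4MB
    my $string-object = 40;                 # bytes per string object; the character data lives on the heap
    say $nursery-size div $string-object;   # 104857, i.e. ~100K string objects per nursery fill

Which is why heap usage can balloon: in the worst case ~100K × 4MB of string data piles up before the 4MB nursery even fills.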
lizmat aha! 09:52
I guess parsing logs is such a worst case then, generally :-( 09:53
nine Parsing logs, no. Typical log parsing would be: process a line (typical size in the 100 bytes range), get some information out of there, store the information somewhere (database, disk), then move on to the next line. I wonder what you keep that information around for? 09:55
Oh, of course one major use case doesn't even store the information. It's the one where we just print it to stdout, e.g. when grepping some logs
lizmat right as in "rak" :-)
github.com/rakudo/rakudo/pull/5097 10:00
nine FWIW I really don't like that PR and don't think it should be merged. Triggering a GC run every x seconds is downright trivial without modifications to the core and without depending on the supervisor thread: start { $*VM.request-garbage-collection; sleep 0.01 } 10:03
lizmat and a loop I assume :-) 10:04
nine Thread.start({ loop { $*VM.request-garbage-collection; sleep 0.01 } }) 10:05
You get the idea :D
lizmat yeah, but that would take a thread, and in the supervisor it wouldn't
nine So?
If your application is multi-threaded, use the start { } version, which runs on the thread pool and thereby doesn't need its own thread at all. If your app is not multi-threaded, use Thread.start to avoid having the supervisor. 10:06
If you absolutely want to avoid threading for this (for some obscure reason), put that $*VM.request-garbage-collection into your main loop 10:07
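nine's three alternatives, collected in runnable form (the 0.01s interval comes from the examples above; Thread.start is the documented class method for spawning a dedicated thread):

    # 1. on the thread pool -- this also starts the supervisor:
    start { loop { $*VM.request-garbage-collection; sleep 0.01 } }

    # 2. on a dedicated thread -- no thread pool, no supervisor:
    Thread.start({ loop { $*VM.request-garbage-collection; sleep 0.01 } });

    # 3. no extra thread at all -- call it from your own main loop:
    loop {
        # ... application work ...
        $*VM.request-garbage-collection;
    }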
lizmat ok, noted... 10:15
going afk for most of the day&
lizmat thoughts about 100K 4MB strings in a nursery: 12:58
perhaps each item in the nursery should get a weight associated with it 12:59
and as soon as the entry is stale, the weight gets added to some counter
and if that counter exceeds X, a GC run would be initiated ? 13:00
I realize this is extra work in a very hot code path, but wanted to throw out this idea anyway, so it can be shot down properly (or not :-) 13:01
nine LOL I just noticed that I have got some 2.7 TiB of rr recordings on my disk... Thanks to (presumably) sparse files, they only took ~120 GiB of actual space though. 13:15
lizmat nice 13:16
reminds me of a situation in the early 2000s where a mod_perl server would need about 30 seconds to start up, because it would preload a *lot* of stuff 13:17
which would grow to 700 MB or so
and then spawn about 100 children, and see memory usage go up to 70GB on a machine with 8GB of memory :-) 13:18
it worked because each child only had about 10MB unique memory, and as soon as that exceeded 30MB or so, it would commit suicide :-) 13:19
nine MoarVM already keeps track of unmanaged sizes and uses that for making decisions about full GC runs. I'm somewhat sceptical about also using it during allocation from the nursery since, as you mentioned, this is an incredibly hot path. We're allocating an unbelievable number of objects.
lizmat I understand
but I feel that for these particular workflows, it's not making the right decisions 13:20
also: I'd be more worried about the extra memory for each nursery entry 13:21
anyways, going afk again&
nine It's always going to be a compromise. If it fits your workload very badly and more collection runs would help, there's always $*VM.request-garbage-collection
Worried how?
lizmat overflowing CPU caches earlier 13:24
actually VM.request-garbage-collection also works, no need to look up a dynamic variable
yes, it works... but it requires code changes, which my PR would not: 13:25
it would allow you to tune *your* application externally
nine Expanding a bit more: even if we wanted to take the performance hit and took unmanaged size into consideration, what would we do exactly? The thing is, we don't know e.g. how long those strings are going to stay around. If we start a collection just because someone allocated 100 MiB of strings that are actually still in use, we've done a completely unnecessary GC run, which is not exactly free.
Your PR only works in applications which make use of the thread pool. So it's not exactly a generic solution either. 13:26
lizmat ah, duh, we don't know whether an entry in the nursery is stale *until* we do a GC run, right ? 13:27
was thinking with a reference counting mind :-(
nine: in the PR I mentioned that maybe specifying that env variable should start the supervisor 13:28
nine GC tuning is a hard problem and an area of ongoing research.
lizmat completely understood 13:29
that's why I think it should be easy to tweak a setting, such as in my PR
afk& 13:31
nine That's just one knob that happens to work nicely in one specific use case. If we decide to tackle this problem, we ought to do it in a well designed way that covers a large range of use cases and provides all the knobs required for this at different layers, e.g. as APIs for applications themselves and configurables for users. 13:37
nine Oh boy.... I finally came across my old adversary: BEGIN time EVAL 20:36
MasterDuke the jvm famously has very tunable GCs, might want to look there 22:22
ugexe: mostly because moarvm doesn't expose fchown. but i guess lchown might be good to have 22:28
MasterDuke why can i never remember how to bump a submodule? 23:05