Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021. |
00:02
reportable6 left
00:04
reportable6 joined
01:04
linkable6 left,
evalable6 left,
evalable6 joined
02:05
linkable6 joined
timo | in the inline cache we only free inline cache entries by freeing them from the FSA, but we don't look into the entries to get at the dispatch programs themselves to free them | 03:15 | |
so that is where the fix for the leak probably lies | |||
oh, that's not what cleanup entry does, I guess | 03:28 | ||
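(A minimal sketch of the kind of cleanup timo is describing, matching the Geth commit just below. The entry-kind predicates and accessor helpers are hypothetical names for illustration; even the destructor call is an assumption based on the commit message, not quoted source.)

    /* Sketch: when a static frame's inline cache is torn down, destroy the
     * dispatch programs each entry points at, rather than only returning
     * the entry's own memory to the FSA (which leaked the programs). */
    static void destroy_entry_dispatch_programs(MVMThreadContext *tc,
            MVMDispInlineCacheEntry *entry) {
        if (is_monomorphic_dispatch(entry)) {        /* hypothetical check */
            MVM_disp_program_destroy(tc, mono_dp(entry));
        }
        else if (is_polymorphic_dispatch(entry)) {   /* hypothetical check */
            MVMuint32 i;
            for (i = 0; i < num_dps(entry); i++)
                MVM_disp_program_destroy(tc, dp_at(entry, i));
        }
        /* ...then free the entry itself via the FSA, as before. */
    }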
Geth | MoarVM/new-disp: 88081523a4 | (Timo Paulssen)++ | src/disp/inline_cache.c
destroy dispatch programs from inline entries when destroying static frames |
03:44 | |
timo | this reduces the amount of leaked allocations from dispatch programs | 03:46 | |
04:40
linkable6 left,
evalable6 left
04:41
linkable6 joined
05:08
frost joined
06:02
reportable6 left
06:42
evalable6 joined
07:04
reportable6 joined
Nicholas | good *, #moarvm | 07:47 | |
jnthnwrthngtn: you suggested that new-disp could be more performant if the dispatch program "dispatcher" could use computed goto (where available). I had a quick look, but it's not obvious to me *which* switch statement in which file is the main dispatch loop that this would be for. Could you drop a hint? | 07:49 | ||
(also, not promising that I'll get to this, but I figured that if I can't spot it, someone else possibly can't either) | |||
MasterDuke | Nicholas: i think it's here github.com/MoarVM/MoarVM/blob/new-...am.c#L2633 but don't take my word for it | 07:52 | |
timo++ | 07:53 | ||
before: definitely lost: 355,776 bytes in 6,286 blocks and indirectly lost: 1,267,760 bytes in 19,160 blocks | |||
after: definitely lost: 26,832 bytes in 412 blocks and indirectly lost: 84,112 bytes in 1,256 blocks | 07:54 | ||
invalid frees are still there | |||
Geth | MoarVM: bb14f42376 | (Stefan Seifert)++ | src/gc/finalize.c
Fix finalizers of gen2 objects rarely getting run
After garbage collection we go through the list of objects with finalizers (the finalize list) and move objects that were collected to the list of finalizers to run. We rewrite the finalize list as we walk through it, in order to only keep live objects. However, during a nursery-only collection we only considered nursery objects. Objects that were already in gen2 were not moved to the collapsed finalize list. Thus we never ran those objects' finalizers. Fix by indiscriminately moving objects we don't have a closer look at to the collapsed list. |
08:24 | |
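(A sketch of the fixed rewrite loop the commit message describes. The list and helper names are illustrative stand-ins for what src/gc/finalize.c actually uses.)

    /* Rewrite the finalize list after a collection: dead objects move to
     * the to-run list, everything else stays. The fix: objects we did not
     * examine this run (gen2 objects during a nursery-only collection)
     * must be kept indiscriminately, not silently dropped. */
    MVMuint32 in, kept = 0;
    for (in = 0; in < num_finalize; in++) {
        MVMObject *obj = finalize_list[in];
        if (nursery_only && in_gen2(obj))          /* not examined: keep */
            finalize_list[kept++] = obj;
        else if (survived_gc(obj))                 /* live: keep updated ref */
            finalize_list[kept++] = forwarded_addr(obj);
        else
            queue_finalizer_to_run(tc, obj);       /* collected: run later */
    }
    num_finalize = kept;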
08:42
linkable6 left,
evalable6 left,
evalable6 joined
08:57
Altai-man left
Geth | MoarVM/fixkey-zero-static: 104d32c16c | (Nicholas Clark)++ | 5 files
Change `permroot_descriptions` and related functions to `const char *`. |
09:22 | |
MoarVM/fixkey-zero-static: 3283717532 | (Nicholas Clark)++ | 3 files
`entry_size == 0` enables static storage in MVMFixKeyHashTable.
The hashtable internally stores just pointers, which indirect to the stored structure, so that the stored structure remains at a fixed address in memory. The default was always to allocate storage with the Fixed Size Allocator. ... (5 more lines) |
MoarVM/fixkey-zero-static: 277e3928c9 | (Nicholas Clark)++ | 4 files
Convert the system calls hash to MVMFixKeyHashTable.
MVMFixKeyHashTable can now handle indirecting to static storage. Hence add `MVMString *name` to the start of `struct MVMDispSysCall` and store this in an MVMFixKeyHashTable, instead of having a `struct MVMDispSysCallHashEntr` which indirected to it, and was stored in an MVMStrHashTable. As the key is now *in* the struct with a fixed storage address, use `MVM_gc_root_add_permanent_desc` to mark it for GC. Hence we no longer need to iterate over the hash on each GC run adding the keys to the worklist. |
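(The shape of that change, sketched. MVM_gc_root_add_permanent_desc is the rooting call named in the commit; the struct body, the `syscall` variable, and the description string are condensed placeholders.)

    /* The key now lives at a fixed address at the start of the record: */
    struct MVMDispSysCall {
        MVMString *name;    /* hash key; its storage never moves */
        /* ... rest of the syscall description ... */
    };

    /* Rooted once at setup, instead of walking the hash every GC run: */
    MVM_gc_root_add_permanent_desc(tc,
        (MVMCollectable **)&syscall->name,
        "dispatcher syscall name");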
09:31
Altai-man joined
Geth | MoarVM: nwc10++ created pull request #1531: Convert the system calls hash to MVMFixKeyHashTable |
09:35 | |
10:07
Kaiepi left
jnthnwrthngtn | Nicholas: The one MasterDuke pointed to is indeed the correct one | 10:17 | |
timo | i see some good commits | 10:20 | |
nine | If anyone else is wondering what computed gotos actually are and why they are faster than switch statements despite all the magic compilers can do, this is an excellent read: eli.thegreenplace.net/2012/07/12/c...tch-tables | 10:27 | |
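(The one-screen version of that article, as a generic sketch rather than MoarVM's interpreter: a switch loop funnels every opcode through a single indirect branch, while the GCC/Clang labels-as-values extension gives each opcode its own branch site, so the predictor can learn per-opcode patterns.)

    /* Generic computed-goto dispatch loop (GCC/Clang `&&label` extension). */
    int run(const unsigned char *code) {
        static void *dispatch[] = { &&op_halt, &&op_incr, &&op_decr };
        int acc = 0;
    #define NEXT goto *dispatch[*code++]
        NEXT;                  /* dispatch the first opcode */
    op_incr: acc++; NEXT;      /* every NEXT is a distinct indirect jump */
    op_decr: acc--; NEXT;
    op_halt: return acc;
    #undef NEXT
    }

For example, `unsigned char prog[] = { 1, 1, 2, 0 };` makes `run(prog)` return 1 (two increments, one decrement, halt).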
10:32
Kaiepi joined
timo | ah yes, having a separate entry in the branch prediction table for every instruction in the interpreter can be a big win | 10:33 | |
and jitting can be an even bigger win here :p | 10:34 | ||
jnthnwrthngtn | It can, though I'm not sure if it's worth JITting dispatch programs "alone"; need to see if there is significant time spent in them post-spesh | 10:44 | |
10:44
linkable6 joined
jnthnwrthngtn | Sigh, the anaesthetic from this morning's dentist trip seems to have numbed my brain too... | 10:47 | |
Or maybe that's just because I didn't have any coffee today | |||
nine | The coffee situation sounds fixable, so I guess it's worth a try :) | 10:53 | |
jnthnwrthngtn | Nicholas: I think the hash PR for syscalls is alright, though I left a note about a potential downside | 10:57 | |
(That wasn't one that you mentioned) | |||
Though I'm still inclined to think the upsides are better | |||
The other mildly "amusing" thing that happened this morning is that I figured I'd sit on the terrace for a bit to try and relax before going to the dentist (I get really anxious about such things). Except there was some construction work happening across the street, so all I heard the whole time was drilling. | 11:01 | ||
nine | Well, compared to those drills... | 11:02 | |
timo | at least there wasn't a jackhammer going | 11:03 | |
jnthnwrthngtn | Uff, yes. | ||
There's like 3 buildings near mine all undergoing reconstruction at the moment (2 new roofs, 1 is gaining an extra floor and thus a new roof also). | 11:04 | ||
timo | sometimes teeth get new roofs as well | 11:05 | |
jnthnwrthngtn | Sigh, just about everything that needs doing with new-disp right now requires thinking hard. :) | 11:07 | |
1) spesh of dispatch programs that set resumption init state, 2) Figure out istype's future (it'll stay, but what's behind it wants to change), 3) the remaining spectest fails, but aside from the code-gen problem that causes the hang, none of it looks like easy debugging. | 11:08 | ||
nine | So, attack that code-gen bug? | 11:09 | |
jnthnwrthngtn | I was sorta hoping somebody else might pick up that relatively LHF-ish one :) | ||
timo | difficult to figure out how to free the dispatch programs properly? might need to put them in the fsa as well so the safepoint mechanism can be used for their bodies and ops | ||
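(What timo is gesturing at, sketched: the FSA supports frees that are deferred until every thread has passed a safepoint, so concurrent readers never see memory vanish under them. MVM_fixed_size_alloc and MVM_fixed_size_free_at_safepoint are the FSA entry points as I understand them; how dispatch program bodies and ops would actually be carved into FSA blocks is left out.)

    /* Allocate from the FSA so the body can later be released with a
     * safepoint-deferred free: */
    MVMDispProgram *dp = MVM_fixed_size_alloc(tc, tc->instance->fsa,
        sizeof(MVMDispProgram));
    /* ... later, when the owning inline cache entry is replaced: */
    MVM_fixed_size_free_at_safepoint(tc, tc->instance->fsa,
        sizeof(MVMDispProgram), dp);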
nine | It is a bit bothersome anyway. I removed that test file locally, to get rid of the hangs in spectest | ||
jnthnwrthngtn: oh, in that case, what do you know about the bug? | |||
jnthnwrthngtn | nine: The QAST compiler seems to fail to spit out a coercion op | 11:10 | |
nine: So we read an uninitialized register | |||
If you dump the bytecode of the thing that hangs, the location of the missing instruction is obvious enough. | 11:11 | ||
lunch, bbiab | 11:12 | ||
11:18
frost left
nine | jnthnwrthngtn: ha! If I had known that the missing instruction is _that_ obvious | 11:36 | |
Geth | MoarVM/new-disp: 104d32c16c | (Nicholas Clark)++ | 5 files
Change `permroot_descriptions` and related functions to `const char *`. |
11:41 | |
MoarVM/new-disp: 3283717532 | (Nicholas Clark)++ | 3 files
`entry_size == 0` enables static storage in MVMFixKeyHashTable.
The hashtable internally stores just pointers, which indirect to the stored structure, so that the stored structure remains at a fixed address in memory. The default was always to allocate storage with the Fixed Size Allocator. ... (5 more lines) |
MoarVM/new-disp: 277e3928c9 | (Nicholas Clark)++ | 4 files
Convert the system calls hash to MVMFixKeyHashTable.
MVMFixKeyHashTable can now handle indirecting to static storage. Hence add `MVMString *name` to the start of `struct MVMDispSysCall` and store this in an MVMFixKeyHashTable, instead of having a `struct MVMDispSysCallHashEntr` which indirected to it, and was stored in an MVMStrHashTable. As the key is now *in* the struct with a fixed storage address, use `MVM_gc_root_add_permanent_desc` to mark it for GC. Hence we no longer need to iterate over the hash on each GC run adding the keys to the worklist. |
nine | Wow....can it really be that all this time we missed an MVMROOT of the result register during nativecall callbacks? | 11:53 | |
12:02
reportable6 left
12:04
reportable6 joined
lizmat would not be surprised ? | 12:06 | ||
nine | Of course, when I try to provoke it so I can test my fix with confidence, nothing explodes | 12:09 | |
lizmat | sneaky | 12:13 | |
nine | Ok, I can reproduce it in the full-blown application. Just not in any reduced test case | 12:15 | ||
jnthnwrthngtn back | 12:32 | ||
With coffee, now I can feel my mouth again :) | 12:33 | ||
nine: I've no idea how obvious it would be to fix said missing instruction... :) | 12:34 | ||
12:36
sena_kun joined
nine | I think is_full_collection still needs some more tuning. As it is, any long-running process with varying objects in the working set that live long enough to make it into gen2 will grow without bound, even if the working set does have a limited size. | 12:46 | |
That's why, ironically, sprinkling nqp::force_gc into your loops can make memory consumption worse rather than better. | 12:47 | ||
jnthnwrthngtn | What do you mean by "varying objects"? | 12:48 | |
nine | What happens is: we have a steady flow of objects that make it into gen2 and could die a while later. On every GC run we look at promoted size. If it exceeds 20M we look at the process' RSS. If that has grown by more than 20 % since the last full collection, we trigger another one. | ||
Next time around our baseline RSS to compare with is larger by ~ 20 %. So we wait longer till we trigger a full collection. Next time 20 % more again, and so on. | 12:49 | ||
What I mean is if (part of) the objects your application allocates usually live long enough to make it into gen2, but could be collected at some point. | 12:50 | ||
We're fine if all objects are very short-lived and those which make it into gen2 live for the whole lifetime of the process anyway. Objects with medium lifetimes will lead to unbounded process growth. | 12:51 | ||
And there's an even worse case: RSS only counts physical memory. It doesn't include swap. So precisely in those situations where you've already gobbled up all available RAM and are swapping, we don't see any RSS growth at all and will never trigger a full collection anymore. | 12:52 | ||
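(The heuristic nine describes, in sketch form, with its failure modes marked. The 20M/20% thresholds are the ones quoted above; the field and helper names are illustrative, not the actual source.)

    /* Should this nursery run be upgraded to a full collection? */
    int is_full_collection(MVMThreadContext *tc) {
        MVMInstance *i = tc->instance;
        if (i->promoted_since_last_full < 20 * 1024 * 1024)
            return 0;                  /* not enough promoted into gen2 yet */
        size_t rss = process_rss();    /* physical memory only: pages that
                                          were swapped out are invisible */
        if (rss <= i->rss_at_last_full * 12 / 10)
            return 0;                  /* RSS grew less than 20%: wait */
        i->rss_at_last_full = rss;     /* baseline ratchets up ~20% each
                                          time, so every full collection is
                                          harder to trigger than the last */
        return 1;
    }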
lizmat | that would explain the memory growth I'm seeing on the irc logs server | ||
lots of big objects that possibly didn't make it to the nursery to begin with | 12:53 | ||
that are potentially short lived | |||
nine | Yesterday's trivial example I used to find the finalizer issue demonstrated it quite well. It quickly grew to multiple gigabytes until I added a nursery-runs counter to the GC and had it do a full collection after at most 100 nursery collections. | 12:59 | |
jnthnwrthngtn | nine: Yeah, sounds like it could do with a different approach to avoid that kind of effect. It *is* a good strategy for programs that build up an increasingly large data structure as they work (compilation is such an example) | ||
lizmat | reminds me of a "use less 'memory'" pragma :-) | 13:00 | |
afk for a few hours& | |||
jnthnwrthngtn | Wonder if we can cheaply factor in how much the GC collected | ||
(in a full collection) | 13:01 | ||
And use the discrepancy between promoted and collected to decide on whether to increase the ceiling | |||
nine | Yes, I figured we could use something along those lines. But after the 13-hour session yesterday and today's continued bug hunting, I'm not smart enough anymore to put those thoughts into an algorithm | 13:03 | |
But then it has been an incredibly productive week so far :) With the finalizer fix, the callback GC issue fix and github.com/niner/Inline-Perl5/comm...251d9f0d6d our application has become stable and memory usage is fine | 13:05 | ||
That Inline::Perl5 issue was unexpected reentrancy caused by finalizers and has plagued us for 2 years as it defied every attempt to reproduce. Until yesterday, when I worked on the memory leak. | 13:06 | ||
Geth | MoarVM: 7a385a3506 | (Stefan Seifert)++ | 2 files
Fix possible access to fromspace in NativeCall callbacks
An untimely garbage collection between setting the result object in the callback and unmarshalling of that result object could lead to an outdated pointer in res.o, and further to segfaults and other nastiness. Fix by adding res.o to the roots. Callbacks always return objects, so no special handling of primitive types is necessary. |
13:07 | |
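(The shape of that fix, condensed. MVMROOT is MoarVM's macro for keeping a pointer GC-safe across allocating code; the two helper calls are stand-ins for the surrounding callback machinery.)

    MVMRegister res;
    res.o = run_callback_and_get_result(tc);   /* stand-in for the callback */
    /* Anything below that allocates may trigger GC and move res.o; rooting
     * it keeps the pointer updated instead of dangling into fromspace: */
    MVMROOT(tc, res.o, {
        unmarshal_callback_result(tc, &res);   /* stand-in; may allocate */
    });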
jnthnwrthngtn | nine++ # wish I could say the same for my week... | 13:23 | |
dogbert11 | jnthnwrthngtn: given that there's still work to be done on new-disp, is it reasonable that this tiny program, gist.github.com/dogbert17/9cb16b7a...ab72b274de, takes three times as long to run on new-disp as on master? | 13:24 | |
jnthnwrthngtn | Yes | 13:25 | |
Inlining is very very very very very important. :) | 13:26 | ||
dogbert11 | thanks :) | ||
what could this be - lang-call cannot invoke object of type 'VMNull' belonging to no language | 13:41 | ||
jnthnwrthngtn | Something tried to invoke a null object. A golf of that could be interesting. | 13:44 | |
dogbert11 | so far it only appears with a small (64k) nursery when running /spec/S32-exceptions/misc.t | 13:45 | |
jnthnwrthngtn | Goodness, the profiler producing the SQL is slow | 13:46 | |
13:46
rakuUser joined
15:48
nebuchadnezzar left
Geth | MoarVM/new-disp: 074a3c43e8 | (Jonathan Worthington)++ | 8 files
Prepare for translating dispatch resume inits
These are cases where a dispatch wishes to keep state around in case there is a resumption later - that is, every Rakudo method and multi dispatch. The aim is to minimize runtime cost in the case that a resumption never happens, which is the overwhelmingly common case. ... (15 more lines) |
16:22 | |
dogbert11 | here's another golf, for me it fails with a compilation error | 16:33 | |
MVM_SPESH_NODELAY=1 MVM_SPESH_BLOCKING=1 ./rakudo-m -e 'my $foo is default(Nil) = 42; for 41,42,43 -> $a, $b, $c { my $foo is default(Nil) = 42; }' | 16:34 | ||
hopefully someone can repro | |||
jnthnwrthngtn | yes, can repro it | 16:43 | |
Is this relatively new, or around for a while? | 16:44 | ||
The specialized frame breaking it is src/core.c/Variable.pm6:53 | 16:46 | ||
Which fits the error | |||
dogbert11 | jnthnwrthngtn: atm I don't know how new it is, I can try to find out though. | 16:50 | |
I have seen several errors like this for a few days but this is the first golf | 16:51 | ||
it *might* have something to do with nine's lang-hllize dispatch work but that is only a guess | 16:53 | ||
16:54
linkable6 left,
evalable6 left
16:55
evalable6 joined
16:56
linkable6 joined
jnthnwrthngtn | "Malformed UTF-8 near bytes 20 4c 8b at line 1 col 53" | 17:00 | |
?! | |||
But yeah, that's the thing nine++ was hunting | |||
heh | 17:02 | ||
P6opaque: no such attribute '$!descriptor' on type L\213V\030I\017\277B... | |||
Geth | MoarVM/new-disp: 7566af62b2 | (Jonathan Worthington)++ | src/spesh/optimize.c
Fix mis-optimization of reads from Scalars
Just having the type facts is not enough, we also need to know that it is concrete. Surprisingly this hasn't come up before now, but did so when hllize became a dispatcher. It inserts a guard on the type only (but not on ... (6 more lines) |
17:39 | |
jnthnwrthngtn | nine: I think ^^ is a fix for the hllize thing you were hunting. In the end not actually a deopt bug; we did a deopt because we'd read "junk", and then the same junk thing was used in getattribute afterwards. | 17:41 | |
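(The pattern of the fix, sketched with spesh's fact flags; which operand gets checked and what the rewrite then does are condensed away.)

    /* A type-only fact (matching a type-only guard) also lets a type
     * object through, whose "contents" are junk; require concreteness too
     * before lowering the Scalar read. */
    MVMSpeshFacts *facts = MVM_spesh_get_facts(tc, g, ins->operands[1]);
    if ((facts->flags & MVM_SPESH_FACT_KNOWN_TYPE)
            && (facts->flags & MVM_SPESH_FACT_CONCRETE)) {
        /* ...safe to optimize the read based on facts->type... */
    }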
home time & | 17:52 | ||
17:53
sena_kun left
18:03
reportable6 left,
reportable6 joined
18:12
Altai-man left
18:41
Altai-man joined
19:57
MasterDuke47 joined
MasterDuke47 | is --full-cleanup still supposed to work when we exit because of a compilation error? | 19:58 | |
`valgrind --leak-check=full raku --full-cleanup -e 'my $a = \c[LATIN]'` has a ton of leaks | 19:59 | ||
jnthnwrthngtn | Today's discovery: if you profile CORE.setting build on new-disp you do get SQL output fine...you can read it and run the statements against a SQLite database to import it...BUT | 20:06 | |
One of the lines is 160MB long! | |||
And Java uses UTF-16 internally so that's a 320MB String that you pretty much can't avoid forming. | |||
jnthnwrthngtn wonders if we might be able to spread that rather long line over a few :) | 20:07 | ||
MasterDuke47: If it calls nqp::exit it's worth checking if that actually triggers a full cleanup or not | 20:08 | ||
MasterDuke47 | k, i'll see if i can track that down | 20:10 | |
.tell Nicholas here's a valgrind log of compiling CORE.e with --full-cleanup and it has a bunch of hash-related leaks gist.github.com/MasterDuke17/3afad...b1ec9fbb4d | 20:13 | ||
tellable6 | MasterDuke47, I'll pass your message to Nicholas | ||
dogbert11 | if anyone is interested in a SEGV just run: MVM_SPESH_NODELAY=1 MVM_SPESH_BLOCKING=1 ./rakudo-m -Ilib t/spec/S32-io/io-spec-win.t | 20:17 | |
MasterDuke47 | jnthnwrthngtn: btw, the creation of very large profiles was mildly optimized, but kind of just to the point that it actually works in most cases instead of dying or producing something that couldn't be used | ||
so i'm sure there are more optimizations to be found | 20:19 | ||
dogbert11: on new-disp? | 20:20 | ||
jnthnwrthngtn | Yeah, it takes longer to output the profile than it did to record it :) | ||
dogbert11 | MasterDuke47: yes | ||
jnthnwrthngtn | dogbert11: Did you confirm the fix I pushed earlier, btw? | 20:21 | |
dogbert11 | jnthnwrthngtn: yes indeed, it worked perfectly | ||
Thread 2 "spesh optimizer" received signal SIGSEGV, Segmentation fault. | |||
[Switching to Thread 0x7ffff6ef8700 (LWP 799641)] | 20:22 | ||
annotate_inline_start_end (inlinee_last_bb=0x7ffff3f9e1c0, inlinee_last_bb=0x7ffff3f9e1c0, inline_boundary_handler=<optimized out>, idx=4, inlinee=0x7ffff3623fa0, inliner=0x4, tc=0x5555555f1670) | |||
at src/spesh/inline.c:1301 | |||
1301 end_ann->next = end_ann_target->annotations; | |||
end_ann_target is NULL | 20:23 | ||
MasterDuke47 | `valgrind --leak-check=full ./install/bin/raku --full-cleanup -e 'use nqp; nqp::exit'` does indeed report a ton of leaks. should nqp::exit respect --full-cleanup? | ||
i think i would expect it to | 20:24 | ||
ah, nqp::exit is pretty simple. it's just `MVMint64 exit_code = GET_REG(cur_op, 0).i64; MVM_io_flush_standard_handles(tc); exit(exit_code);` | 20:27 | ||
should it instead call MVM_vm_exit or MVM_vm_destroy_instance? | 20:28 | ||
jnthnwrthngtn | MasterDuke47: Yes, but only if --full-cleanup is set, I think | 20:30 | |
Probably MVM_vm_exit | 20:31 | ||
Maybe that already checks the full cleanup flag | |||
MasterDuke47 | it doesn't | ||
jnthnwrthngtn | (Normally we don't want to try and clean up, the OS can do it so much faster.) | ||
(So it is only useful if doing leak checking) | |||
MasterDuke47 | MVM_vm_exit and MVM_vm_destroy_instance are only ever called from main.c which is where the --full-cleanup flag is checked for | 20:32 | |
should nqp::exit instead just exit the interpreter, and then let normal exiting happen (i.e., MVM_vm_exit or MVM_vm_destroy_instance)? | 20:36 | ||
dogbert11 | as far as the SEGV is concerned, it's older than github.com/MoarVM/MoarVM/commit/4d...eca0f3d836 | 20:40 | |
jnthnwrthngtn | MasterDuke47: I think it should do as it does today when it's not full cleanup, and if there is full cleanup call MVM_vm_destroy_instance(instance); before exit | 20:43 | |
MasterDuke47 | aiui nothing outside of main.c knows whether --full-cleanup was used, so we'd have to put that information somewhere that interp.c can know about it | 20:45 | |
jnthnwrthngtn | Yes | 20:48 | |
Flag on MVMInstance probably | |||
MasterDuke47 | k | ||
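(Putting that exchange together: the first two lines are the current op body quoted above; the full_cleanup flag on MVMInstance is the hypothetical addition being discussed.)

    MVMint64 exit_code = GET_REG(cur_op, 0).i64;
    MVM_io_flush_standard_handles(tc);
    if (tc->instance->full_cleanup)        /* hypothetical flag, set in
                                              main.c from --full-cleanup */
        MVM_vm_destroy_instance(tc->instance);
    exit(exit_code);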
21:02
linkable6 left,
evalable6 left
21:03
evalable6 joined
21:04
linkable6 joined
Geth | MoarVM: MasterDuke17++ created pull request #1532: Respect --full-cleanup in nqp::exit |
21:49 | |
21:58
MasterDuke47 left
23:26
greppable6 left,
reportable6 left,
linkable6 left,
unicodable6 left,
committable6 left,
statisfiable6 left,
coverable6 left,
bisectable6 left,
squashable6 left,
tellable6 left,
sourceable6 left,
releasable6 left,
benchable6 left,
shareable6 left,
quotable6 left,
notable6 left,
bloatable6 left,
evalable6 left,
nativecallable6 left,
quotable6 joined,
statisfiable6 joined,
tellable6 joined
23:27
bloatable6 joined
23:28
coverable6 joined,
unicodable6 joined,
squashable6 joined,
releasable6 joined
23:29
committable6 joined,
notable6 joined