Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
00:02 reportable6 left 00:04 reportable6 joined 01:03 jgaz joined 01:11 jgaz left 01:47 frost joined 02:47 linkable6 left, evalable6 left 02:49 linkable6 joined 02:50 evalable6 joined 04:51 evalable6 left, linkable6 left 04:52 linkable6 joined 04:54 evalable6 joined 06:02 reportable6 left 06:04 reportable6 joined 07:02 patrickb joined 08:09 linkable6 left, evalable6 left 08:10 evalable6 joined 08:11 linkable6 joined
nine MasterDuke: doesn't sound like a GC issue to me. Most of all because tc->cur_frame is a part of the ThreadContext struct and that one isn't managed by the GC (would be hard to bootstrap, since we store the GC's state in the ThreadContext as well) 10:14
tellable6 nine, I'll pass your message to MasterDuke
nine MasterDuke: does tc at this point actually look like a valid ThreadContext at all? I.e. does it have appropriate pointers where it ought to have them, do *_num or *_size fields look sane? 10:15
tellable6 nine, I'll pass your message to MasterDuke
Geth MoarVM/new-disp: 82ea6c690b | (Jonathan Worthington)++ | src/disp/syscall.c
Add a pair of syscalls for stringification

These ease the migration of NQP away from the smrt_strify op towards a dispatcher.
10:56
10:58 sena_kun joined
jnthnwrthngtn And now `need Test` works; `use Test` does not yet 11:02
oh, it does 11:03
11:34 patrickb left 11:35 patrickb joined 11:36 patrickb left 11:37 patrickb joined
dogbert17 jnthnwrthngtn++, are we getting close to some kind of release 11:37
11:43 patrickb left 11:44 patrickb joined, squashable6 left 11:45 squashable6 joined 11:47 patrickb left 11:48 patrickb joined 11:49 patrickb left 11:50 patrickb joined 11:51 patrickb left 11:52 patrickb joined 11:53 patrickb left 11:54 patrickb joined, patrickb left 12:02 reportable6 left
jnthnwrthngtn dogbert17: Depends what "close" means. There's still quite a bit to do. 12:03
otoh, there's less to do than there has been at any point so far :) 12:04
12:05 reportable6 joined 12:33 MasterDuke joined
MasterDuke . 12:33
tellable6 2021-07-13T10:15:47Z #moarvm <nine> MasterDuke: does tc at this point actually look like a valid ThreadContext at all? I.e. does it have appropriate pointers where it ought to have them, do *_num or *_size fields look sane?
hey MasterDuke, you have a message: gist.github.com/40a8832e927197b339...40fd77e449
MasterDuke nine: i think the tc looks ok gist.github.com/MasterDuke17/851f5...9ae07824bc 12:36
dogbert17 jnthnwrthngtn: close enough for you needing some more help testing stuff :) 12:50
13:00 sena_kun left
jnthnwrthngtn dogbert17: Ah, yes, that is not so far off. `make test` in Rakudo is still quite explosive, but once it's relatively clean and we're into the spectests, it gets interesting to start looking for things not known to be undone 13:05
The current task is handling multiple dispatch when one of the arguments is wrapped in a Proxy, which it turns out happens in some of the module loading internals 13:14
13:30 sena_kun joined 13:31 patrickb joined 13:33 MasterDuke left 13:40 timo left, MasterDuke joined, timo joined
jnthnwrthngtn Both `make test` and `make spectest` have a handful of tests that hang and eat memory, which is what I'll work out next, since it's annoying. 14:10
However, my first ever run of `make spectest` on new-disp passes 980 test files in full.
dogbert17 that's way cool 14:11
14:16 patrickb left, patrickb joined 14:18 patrickb left, patrickb joined 14:20 patrickb left, patrickb joined
jdv since it's annoying:) 14:20
14:22 patrickb left, patrickb joined 14:26 patrickb left, patrickb joined 14:28 dogbert17 left, dogbert17 joined 14:30 patrickb left 14:31 patrickb joined 14:32 patrickb left 14:33 patrickb joined 14:34 patrickb left 14:35 patrickb joined, patrickb left
nine MasterDuke: that's a different tc 15:06
MasterDuke: I meant at the point of the segfault 15:07
MasterDuke ah
gist updated 15:09
i'd say it looks corrupted 15:10
nine definitely 15:14
So cur_frame is what it stumbles over, but it's really either an already freed ThreadContext or a totally bogus tc pointer 15:15
Does the program actually use multiple threads? 15:16
MasterDuke not explicitly, just because of spesh
m: ($_ = "a: " until .Str) andthen .Str 15:17
camelia (signal SEGV)
MasterDuke that's all it is
nine Huh...for me it fails differently. I get a sane looking tc->cur_frame, but still a segfault in src/core/interp.c:6394 at found = GET_LEX(cur_op, 2, f); due to f being NULL 15:21
Oh, that may be because I disabled the JIT to get a useful backtrace. With JIT I do see a bogus tc 15:22
MasterDuke if it makes a difference, i had the jit disabled 15:23
Geth MoarVM/new-disp: 93c19a9b2d | (Jonathan Worthington)++ | 3 files
Support capturehasnameds on new Capture object
15:26
MoarVM/new-disp: 7397006c3b | (Jonathan Worthington)++ | src/core/args.c
Workaround for arg_count not being always set

It's going away under the new calling conventions. We'd ideally just change the code here to construct an MVMCapture, which will come to fully replace use of MVMCallCapture, however NQP's multiple dispatch will have to be ported before that can happen (or at least, it will in order to avoid a bunch of throwaway work updating the current multiple dispatch cache mechanism, which is also going away.)
15:26 Kaiepi left
jnthnwrthngtn Already up to 1025 fully passing spectest files on the latest run 15:27
MasterDuke nice
jnthnwrthngtn I guess "NYI case of race building resumable dispatch program" is my next target, since it causes failures at random :) 15:28
MasterDuke those previous commits fixed the hangs/memory growth?
15:28 Kaiepi joined 15:30 Kaiepi left
Nicholas to confirm - nqp is all "new-disp" all the time, but rakudo is only using new-disp if RAKUDO_NEW_DISP=1 is set in the environment? And setting build will fail with that set (so far it's only tests that mostly work with it)? 15:34
15:34 Kaiepi joined
nine Nicholas: I think you're a bit out of date. That environment variable is gone and it should make it through setting compilation 15:35
Nicholas ah OK. It *does* make it through the setting compilation
ASAN yawns at your progress 15:36
15:36 Kaiepi left
MasterDuke nine: gist updated with a watch on tc, it gets overwritten 15:38
15:40 Kaiepi joined
dogbert17 MasterDuke++, Nine++ SEGV hunting 15:42
jnthnwrthngtn MasterDuke: Memory growth yes (well, actually it was the Rakudo commit that did it). There is one spectest that sits eating CPU but not memory, so probably some kind of infinite loop due to a mis-dispatch or similar. 15:43
Nicholas: Things are all new-disp where they're all new-disp; there's a few places we still emit the legacy invoke instructions or findmeth and so forth that are yet to be migrated. 15:49
The big one being NQP's multiple dispatch 15:50
And also the regex compiler
Neither is awfully hard to do, just a bit tedious
timo MasterDuke: can you check with "print &tc" if it's in a register?
could very well be that it's just an optimization
if this is still that "until .Str" code we're analysing? 15:52
MasterDuke Address requested for identifier "tc" which is in register $rdi
timo: yep
nine timo: I get the segfault with --optimize=0
timo right, that's why you get bogus values
cool, then you can go with --optimize=0 for easier debugging :)
i could imagine it's an infinite recursion between "i need to check the result of the list to see if i should iterate" and "i need to iterate the array to see if there's a value in it" 15:53
check how deep the backtrace is at the point where it's doing that?
Geth MoarVM/new-disp: ffd3ca8821 | (Jonathan Worthington)++ | 3 files
Marking and cleanup of non-installed disp programs

When we produce a dispatch program, we try to transition the callsite to include it. However, another thread may have been trying to do the very same thing. Rather than deal with equivalence of programs, we just discard it and - in the event it was different rather than duplicate - it will get reproduced and installed sometime in the future if needed. ... (7 more lines)
15:55
jnthnwrthngtn Not running into that NYI gets it up to 1049 fully passing files in `make spectest`. 15:56
timo cool, i ran into that NYI with my fannkuch script when it was still using hyper-for 15:57
jnthnwrthngtn "Ambiguous multi candidates; better error NYI"...wow, past me was lazy :) 15:59
16:22 patrickb joined 16:47 MasterDuke left
jnthnwrthngtn 1125 fully passing in make spectest now, and the hanging test is seemingly gone 16:55
17:01 sena_kun left
nine MasterDuke: huh...have you ever looked at a Raku backtrace at the point of the segfault? 17:42
tellable6 nine, I'll pass your message to MasterDuke
nine That 28000 call frames deep backtrace looks a mite suspicious 17:44
17:52 MasterDuke joined
MasterDuke . 17:52
tellable6 2021-07-13T17:42:06Z #moarvm <nine> MasterDuke: huh...have you ever looked at a Raku backtrace at the point of the segfault?
MasterDuke nine, timo: yeah, we've seen the raku stacktrace. lizmat and i were discussing it yesterday and thought there were two separate bugs. one is the raku infinite recursion, which she thought was in codegen. the other is the segv (if you disable spesh inlining it just spins in the recursion) 17:54
so theoretically fixing the segv would mean that example would just spin regardless of spesh disabled or not until the rakudo bug is fixed 17:55
18:03 reportable6 left 18:05 reportable6 joined
nine m: sub b($b) { $b > 10000 ?? $b !! a($b + 1) }; sub a($a) { b($a) }; say a(1); 18:15
camelia (signal SEGV)
MasterDuke whoops. same bug? 18:16
nine Well this one creates an ordinary stack overflow in optimize_bb
19:30 AlexDaniel left
timo that's not great 19:31
we did some extra work to make sure spesh doesn't blow its stack
even with very big frames that have loads of BBs
MasterDuke see an easy fix? 19:47
timo still compiling rakudo right now 19:52
this doesn't require new-disp, right? 19:53
MasterDuke correct, master all around
timo hum. i'm on new-disp now
i'll see if it's also there, then
nope ... well, not that surprising since new-disp changes so much 19:55
MasterDuke yeah, it goes away if spesh inlining is disabled, so since new-disp is missing a bunch of speshing... 19:56
afk for a while &
20:01 MasterDuke left 20:19 patrickb left 20:20 patrickb joined
timo ok we're recursing from getting a spesh graph from an unoptimized frame into optimizing a call back into probably the same frame again 20:28
now, what might the correct resolution be here? there's already in theory a limit to how deep we'll inline, i guess that's not being checked and/or honored when we go through the path where we don't have an optimized candidate yet 20:31
jnthnwrthngtn We're not meant to inline a frame into itself; there should already be a check that we're not doing that 20:44
Of note github.com/MoarVM/MoarVM/blob/mast...line.c#L67 20:45
timo ah, let me look more closely at this particular bit 20:49
ha. that check comes after doing an initial MVM_spesh_optimize of the graph 20:51
no, that's is_graph_inlinable, not is_sf_inlinable 20:52
20:52 AlexDaniel joined
timo d'oh it's literally a few lines further up and i missed it haha 20:53
ok this one is easy, like i thought, we've got b as the inliner and a as the target_sf 21:03
jnthnwrthngtn Ah, a mutual recursion. Yeah, it'd miss that. 21:04
Unless there's a check later in the inline table
If not, can scan the inline table of the inlinee for that code ref too
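[editor's sketch] The check jnthnwrthngtn suggests (refuse the inline if the inliner's code already appears in the inlinee's inline table) might look roughly like the C below. Every type and function name here (StaticFrame, InlineEntry, Candidate, would_recurse) is hypothetical and chosen for illustration; none of it is MoarVM's actual API or data layout. As timo notes further down, it only helps when the inlinee already has a candidate with a populated table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real structures; not MoarVM's types. */
typedef struct { int id; } StaticFrame;

typedef struct {
    const StaticFrame *sf;        /* frame that was inlined at this entry */
} InlineEntry;

typedef struct {
    const StaticFrame *sf;        /* frame this candidate was built for */
    const InlineEntry *inlines;   /* frames already inlined into it */
    size_t             num_inlines;
} Candidate;

/* Returns 1 if inlining `inlinee` into `inliner` would pull the inliner's
 * own code back in, either directly or via an earlier inline. */
static int would_recurse(const StaticFrame *inliner, const Candidate *inlinee) {
    size_t i;
    assert(inliner != NULL && inlinee != NULL);
    if (inlinee->sf == inliner)
        return 1;                 /* direct self-inline */
    for (i = 0; i < inlinee->num_inlines; i++)
        if (inlinee->inlines[i].sf == inliner)
            return 1;             /* mutual recursion, e.g. a -> b -> a */
    return 0;
}
```

With sub a and sub b from nine's example, a candidate for b whose table already contains a would be rejected as an inline into a.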
timo do the inlines already land in some table before try_get_graph_from_unspecialized even returns? 21:09
jnthnwrthngtn If we're inlining a spesh candidate then the candidate has an inlines table 21:10
timo right, this is for when we only have target_sf, no spesh_cand 21:11
well, i'm already very deep in the stack, maybe i should go look near the first or second spesh run 21:13
cue five minutes of gdb printing stack traces 21:16
jnthnwrthngtn Ah, in that case you may have to check it after building the graph etc. 21:22
timo right, at the point of recursion the graph for the inline target has already been built, we're in the process of optimizing it while stumbling into the recursion after all 21:26
21:26 linkable6 left
timo it does have num_inlines = 0 in it, though 21:26
21:27 linkable6 joined
timo i'm sure this code would optimize into something truly amazing if we just inline fifty levels of it into itself/each other 21:32
a simple counter for "how deep are we currently in inlining stuff" would be enough for literally every case of accidental infinite recursion of inlinings 21:34
who knows if someone will come up with a piece of code that has multiple calls out on every step and so would go pyramid-shaped :D 21:36
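[editor's sketch] The "simple counter" timo describes can be illustrated with a toy model in C. The names (Frame, InlineState, optimize_frame) and the limit of 8 are assumptions made up for this example, not anything in MoarVM; each frame has a single call target to keep the model small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit and types, for illustration only. */
#define MAX_INLINE_DEPTH 8

typedef struct Frame {
    const char         *name;
    const struct Frame *callee;   /* single call target, NULL for a leaf */
} Frame;

typedef struct {
    int depth;     /* how many inlines deep we currently are */
    int inlined;   /* inlines performed so far */
    int refused;   /* inlines refused because of the depth limit */
} InlineState;

/* "Optimizes" a frame by inlining its callee and optimizing that in turn.
 * The depth counter bounds the recursion even for a <-> b cycles. */
static void optimize_frame(InlineState *st, const Frame *f) {
    assert(st != NULL && f != NULL);
    if (f->callee == NULL)
        return;                   /* nothing to inline */
    if (st->depth >= MAX_INLINE_DEPTH) {
        st->refused++;            /* leave it as an ordinary call */
        return;
    }
    st->depth++;
    st->inlined++;
    optimize_frame(st, f->callee);
    st->depth--;
}
```

For two frames that call each other, this stops after 8 nested inlines instead of overflowing the C stack. It bounds depth only: as timo points out, a frame with several call sites per level could still fan out widely within that bound.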
21:39 patrickb left
jnthnwrthngtn This morning Rakudo's new-disp branch ran 0 spectests because `use Test` failed. It now passes 1,197 spectest files in full. 22:06
I think that's enough for today. :) 22:07
timo very nice 22:12
i assume the commits to make that work are already up and the notification is just a little missing 22:13
jnthnwrthngtn Yes. Well, most of the recent ones are in Rakudo itself, tweaking the dispatchers 22:14
timo oh i just wasn't looking in the right channel 22:15
fannkuch runs with hyper now :) 22:24
23:43 MasterDuke joined
MasterDuke timo: looks like you're well on your way to a fix? 23:44