Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
00:03 reportable6 left 00:05 reportable6 joined 01:05 notable6 left, benchable6 left, linkable6 left, committable6 left, unicodable6 left, releasable6 left, nativecallable6 left, greppable6 left, squashable6 left, reportable6 left, tellable6 left, sourceable6 left, bisectable6 left, quotable6 left, statisfiable6 left, bloatable6 left, coverable6 left, evalable6 left, shareable6 left 01:06 evalable6 joined, benchable6 joined 01:07 unicodable6 joined, tellable6 joined, greppable6 joined, statisfiable6 joined 01:08 quotable6 joined, sourceable6 joined 02:07 committable6 joined, reportable6 joined 02:08 releasable6 joined, shareable6 joined 03:08 committable6 left, reportable6 left, releasable6 left, evalable6 left, shareable6 left, benchable6 left, statisfiable6 left, unicodable6 left, tellable6 left, greppable6 left, quotable6 left, sourceable6 left, bloatable6 joined, committable6 joined, notable6 joined 03:09 statisfiable6 joined, unicodable6 joined 03:10 quotable6 joined, benchable6 joined, sourceable6 joined, tellable6 joined 03:11 evalable6 joined 04:06 coverable6 joined 04:07 squashable6 joined 04:08 releasable6 joined 04:10 shareable6 joined 04:11 greppable6 joined 04:29 psydroid left, Voldenet left, rba left 04:30 rba joined, Voldenet joined 04:37 psydroid joined 04:41 leedo left 04:42 leedo joined 05:07 bisectable6 joined 05:08 nativecallable6 joined, linkable6 joined
Nicholas Good *, #moarvm 05:42
05:48 brrt joined 06:05 reportable6 joined
japhb Monday always comes so quickly -- especially when you're talking to people several timezones ahead ... 06:12
brrt it sneaks up on you 06:34
nine I'm pretty sure that deopt doesn't have anything to do with the bug. It happens at the very end of the frame and debug output indicates the wrong answer from EXISTS-KEY before the deopt happens 06:38
It's not that we get the wrong callframe either. The Stash we call EXISTS-KEY on clearly holds that variable: Stash element = {"\$_" => Rakudo::Internals::LoweredAwayLexical, "\$seen" => 154, "\&a" => proto sub a (;; Mu |) {*}} 06:40
Ah but that's a red herring. callframe(1) works by getting the current MVMContext (inside the callframe sub) and moving up the call chain from that. But "moving" here means basically, getting a clone of that MVMContext and adding a traversal. 07:13
Every time we actually look up something in that MVMContext it then applies these traversals (i.e. "move one frame up the call chain"). 07:14
This means that the result of my debug output, which clearly shows the $seen lexical, does not tell us anything about whether EXISTS-KEY will find that lexical.
That's because EXISTS-KEY is inlined, which seems to affect the traversal. In essence, we're trying to find $seen in &a, instead of its caller. 07:15
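A minimal sketch of the lazy-traversal behaviour nine describes above, written as a Python toy model rather than MoarVM's C. The `Frame` and `Context` classes and the `exists_key` method are hypothetical stand-ins, not MoarVM APIs; the point is only that "moving up" clones the context and records a step, and that the steps are replayed on each lookup, so being one step short looks in &a instead of its caller.

```python
class Frame:
    """Hypothetical stand-in for a call frame: a name, lexicals, and a caller."""
    def __init__(self, name, lexicals, caller=None):
        self.name = name
        self.lexicals = lexicals
        self.caller = caller


class Context:
    """Hypothetical model of a context: a start frame plus recorded traversals."""
    def __init__(self, frame, traversals=()):
        self.frame = frame
        self.traversals = tuple(traversals)

    def caller(self):
        # "Moving" walks nothing yet: it clones the context and records a step.
        return Context(self.frame, self.traversals + ("caller",))

    def resolve(self):
        # The recorded steps are replayed from the start frame on every lookup.
        frame = self.frame
        for step in self.traversals:
            if step == "caller" and frame.caller is not None:
                frame = frame.caller
        return frame

    def exists_key(self, name):
        return name in self.resolve().lexicals


mainline = Frame("mainline", {"$seen": 154})
a = Frame("&a", {"$_": None}, caller=mainline)
callframe_sub = Frame("callframe", {}, caller=a)

ctx = Context(callframe_sub).caller().caller()   # resolves to mainline
off_by_one = Context(callframe_sub).caller()     # resolves to &a
print(ctx.exists_key("$seen"), off_by_one.exists_key("$seen"))  # True False
```

In this model, dumping the lexicals of one resolved frame says nothing about what a differently resolved lookup will see, which is why the debug output was a red herring.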
lizmat Q: isn't it time to merge new-disp ? 08:22
nine (or rename it to main)
Nicholas A: I think it probably should be rebased first (and there's some nqp stuff to "fix in post" - ie commit-then-revert) 08:23
lizmat eh... I think there's some stuff in master that we want to keep ?
like a 2021.09 release ?
Nicholas yes, I meant all 3 rebased onto master
lizmat would be nice news item :-) 08:24
*a
Nicholas A: I thenk that new-disp is down to 3 modules in the ecosystem that it breaks, and those are assumed to be new-disp bugs, and jnthnwrthngtn does not run in constant time in parallel universes, and no-one else is brave enough to loop
to look
and I tink
lizmat you keep on tinkering! 08:25
Nicholas A: he was also trying to get a bit more inlining back so that the speed is similar
moon-child 'jnthnwrthngtn does not run in constant time' :D 08:27
nine As long as there are known bugs and performance issues, there's not terribly much to gain from merging. Any new issues discovered by wider circulation would only get added to the list and we'd probably get reports about issues that we knew already. 08:31
09:03 sena_kun left 09:12 sena_kun joined 09:26 brrt left
jnthnwrthngtn I wondered if the limitation on argument processing logic having to be completed prior to a deopt was really needed with the new calling convs, so tried removing it. CORE.setting compilation explodes. Hm, OK 10:34
So I thought, what if for required positional args I tweak QAST compilation to spit out the arg processing instructions and then the type checks, because that'll also remove a lot of the inlining blockers. 10:35
Same explosion. So I guess in both cases I just allow more inlining and expose some other problem.
10:44 brrt joined
jnthnwrthngtn Was sure I'd find something better to try and spesh bisect than CORE.setting compilation. Alas, no, the other options all involve threads and so aren't deterministic. 10:51
timo github.com/emeryberger/dthreads ? 10:53
jnthnwrthngtn timo: Hm, that's interesting 10:56
10:57 brrt left
timo haven't tried it yet, was just a quick google. i thought "there's gotta be something simple you can just slot in for pthreads" 10:58
Nicholas last commit was Nov 2014, so "just slot in" *might* no longer fly
unless you're running some CentOS fossil 10:59
jnthnwrthngtn Yeah, I suspect the CORE.setting bisect will be faster than trying to get that to work (I'm almost there with it) 11:00
But it's good to know for if the only repro is threaded 11:01
OK, the bisect gets me something 3 frames down in the backtrace at the crash point, so this seems legit 11:03
timo ah, i knew the name "emery berger" rang a bell, he's also worked on coz (the Causal Profiling thing) and Hoard (a memory allocator)
Nicholas Is there anything we can do to be helpful? You don't even seem to need a rubber duck... 11:04
timo hoard is meant to be much more efficient in multithreaded applications, especially in multi-processor devices 11:07
unlikely to be of too much use in moarvm since we do so much with our own allocators 11:09
jnthnwrthngtn The failure mode is that we end up passing (apparently) a Map type object to FLATTENABLE_HASH, but there's no explicit calls to that, so I'm figuring it is thanks to hllize 11:26
But also wondering if we have sp_dispatch that is going to look at the HLL, then we inline it, we'll do things for the wrong HLL
Sticking useshll on sp_dispatch_* seems to help, though that's a heavy hammer 11:30
timo we don't have anything to guard on hll yet? 11:34
jnthnwrthngtn We shouldn't need to as it's a static property of a callsite 11:36
At the moment lang-hllize calls MVM_hll_current, however when we are recording a dispatch program we have the static frame where the callsite originally appeared handy
So we can reliably obtain it even with an inlined sp_dispatch 11:37
And yes, that does seem to help 11:38
timo ah, so what we want is not to abort when the hll mismatches, we just want to access the right hll at the right time
jnthnwrthngtn Yeah 11:39
lunch, bbiab
12:03 reportable6 left
Geth MoarVM/new-disp: 2c65946784 | (Jonathan Worthington)++ | 3 files
Always locate correct HLL in lang-hllize

When we inline an sp_dispatch, use the static frame of the callsite (which is its pre-inline static frame, where the inline cache lives) in order to determine the HLL. Otherwise, we can use the wrong HLL when we record further dispatch programs after the inlining.
12:53
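A toy illustration of the commit above, in Python rather than MoarVM's C. `StaticFrame` and both lookup functions are hypothetical names; the only assumption modelled is that after inlining, the dispatch runs inside the caller's frame, so asking the running frame for its HLL can differ from asking the static frame that owns the callsite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticFrame:
    """Hypothetical static frame: just a name and the HLL it was compiled for."""
    name: str
    hll: str


def hll_via_running_frame(running):
    # What a "current HLL" lookup sees: whichever frame is executing now.
    return running.hll


def hll_via_callsite(callsite_owner):
    # What the fix uses: the static frame the callsite originally appeared in.
    return callsite_owner.hll


caller = StaticFrame("caller", "Raku")
callee = StaticFrame("callee", "nqp")

# Once `callee` is inlined into `caller`, its dispatch executes in caller's frame.
after_inline_wrong = hll_via_running_frame(caller)
after_inline_right = hll_via_callsite(callee)
print(after_inline_wrong, after_inline_right)  # Raku nqp
```

Because the HLL is a static property of the callsite, the second lookup needs no guard; it only needs the right frame to ask.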
jnthnwrthngtn Good news: that fixes the CORE.setting issue. Bad news: the others are different 12:55
However, there is the hope that the callframe one nine++ was on with is involved 12:56
Removing the "can't deopt in arg processing" restriction in inlining, and building without my (local) NQP change, gets me a superset of the `make test` issues. 13:02
> Specialization of 'EVAL' 13:09
oh lovely
Deopt one requested by interpreter in frame 'EVAL' (cuid '14069') 13:11
Will deopt 2712 -> 12
Completed deopt_one in 'EVAL' (cuid '14069')
...another inline boundary bug? argh
aha, OK, so if it's a pre-deopt point we need the equal and opposite fix as pre-deopt points did at the end offset 13:18
.oO( 4 hours of sleep is probably not optimal conditions for hunting inlining/deopt bugs )
13:20
dogbert17 jnthnwrthngtn: coffee or tea might help a bit 13:21
or you could attack something which doesn't require as much brane 13:22
13:24 MasterDuke joined
MasterDuke perhaps some lighter fare would be the MVMDispProgram leak at src/disp/program.c:2330 13:28
jnthnwrthngtn Hm, that line is just `MVMDispProgram *dp = MVM_malloc(sizeof(MVMDispProgram));`? 13:30
Or is it another one, because I added something in that file today?
MasterDuke yep, that's where valgrind reports we're leaking in `raku --full-cleanup -e ''` 13:31
definitely lost: 26,720 bytes in 442 blocks
jnthnwrthngtn I'm guessing we leak a bunch of other stuff hanging off it?
MasterDuke indirectly lost: 82,848 bytes in 1,213 blocks
jnthnwrthngtn OK, the indirect less is stuff hanging off it then, I guess 13:32
*loss
MasterDuke i've tried a couple things, but MVMCallStackDispatchRecord (where that MVMDispProgram gets stored in ->produced_dp) don't have a *_destroy 13:34
and i haven't been able to figure out where i can manually call `MVM_disp_program_destroy(tc, record->produced_dp)` 13:35
jnthnwrthngtn Ah, because the stack isn't unwound fully at exit
Geth MoarVM/new-disp: 07ce2b1acb | (Jonathan Worthington)++ | src/spesh/deopt.c
Pre-deopt offsets are inclusive when uninlining

They come before the first instruction in the inline, and so we should consider the start offset inclusively for such a deopt index.
13:37
jnthnwrthngtn MasterDuke: I think you'd have to add something a bit like the GC mark iterator over stack frames that instead destroys all it finds, and call it from wherever we destroy the thread context. 13:38
But its logic would not be like GC mark, but rather like MVM_callstack_unwind_frame in terms of doing cleanup instead 13:39
(Like GC mark in that it walks the whole stack, though)
MasterDuke walk via tc->stack_first_region+tc->stack_top? 13:41
jnthnwrthngtn You can just walk it from tc->stack_top and follow ->prev as marking does 13:42
at least, I think that's what marking does
You shouldn't have to care about regions
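The walk suggested above can be sketched as a plain linked-list traversal. This is a Python toy, not MoarVM code: `Record`, its `prev` and `produced_dp` fields, and `destroy_all` are hypothetical mirrors of the names in the discussion, and the `destroy` callback stands in for whatever the real cleanup would call.

```python
class Record:
    """Hypothetical call stack record; `prev` points toward the stack base."""
    def __init__(self, kind, prev=None, produced_dp=None):
        self.kind = kind
        self.prev = prev
        self.produced_dp = produced_dp


def destroy_all(stack_top, destroy):
    """Walk from the top record via prev, destroying any leftover program."""
    destroyed = 0
    record = stack_top
    while record is not None:
        if record.produced_dp is not None:
            destroy(record.produced_dp)
            record.produced_dp = None   # avoid a double free on a second walk
            destroyed += 1
        record = record.prev
    return destroyed


base = Record("start")
middle = Record("dispatch_record", prev=base, produced_dp="dp-1")
top = Record("dispatch_record", prev=middle, produced_dp="dp-2")

freed = []
count = destroy_all(top, freed.append)
print(count, freed)  # 2 ['dp-2', 'dp-1']
```

Like marking, it visits every record from the top down; unlike marking, it releases what it finds, and it never needs to know which region a record lives in.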
MasterDuke thanks, i'll give that some experimenting
jnthnwrthngtn I'm now down to one regression (and then only under nodelay+blocking) after removing the restrictions on inlining when we can deopt during arg processing 13:50
[Coke] regarding merge: i'd rather open the possibility for getting *new* bug reports and will volunteer to close any other actual bugs as dupes of any existing ones. (want as much testing time this month as possible)
jnthnwrthngtn Annoyingly it only wants to reproduce under the harness... 13:51
[Coke] github.com/rakudo/rakudo/wiki/Raku...isp-branch - please note any remaining issues here. We can use this to help decide when to merge. 13:55
14:05 reportable6 joined
dogbert17 and the regression is not t/spec/S06-advanced/callframe.t? 14:09
jnthnwrthngtn dogbert17: No, require.t 14:11
Even trying to replicate the env vars and so on that the harness sets I can't make it happen outside of it :/ 14:12
That makes it hard to get things like deopt logs etc.
MasterDuke have you tried running a different spectest in the background? 14:13
maybe it's the load and not the runner?
jnthnwrthngtn It's the runner; I'm not running a spectest to get it, just `make t/spec/S12-modules/require.t` or so 14:14
And it happens reliably and I can even bisect it to a particular specialization
Maybe I should push the change to see if anybody else has any luck golfing 14:15
Just cleaning up the MoarVM change in question
It does unblock a load of inlining
In fact, I think this is the final round of inlining unblocking to get us to similar inlining levels as master 14:19
MasterDuke are there any other "must have" optimizations required before the merge? 14:21
Geth MoarVM/new-disp: b619a562a9 | (Jonathan Worthington)++ | src/spesh/inline.c
Remove argument processing inlining restrictions

Previously we have refused to deoptimize during argument processing, as we didn't reconstruct the argument processing context fully during deopt. With new-disp, we do so (aided by it being rather simpler). This unblocks further inlining, and hopefully gets us back to achieving the kinds of levels as master (plus we get to inline some things we could not before too).
jnthnwrthngtn MasterDuke: Don't think so; arguably startup, arguably eliding sp_resumption ops when we can prove the inlinee could never possibly resume
MasterDuke cool beans 14:23
jnthnwrthngtn I suspect once we merge it we'll learn soon enough about major regressions that need tackling
It'd be good to get another test-t update with the above change in 14:24
sena_kun o/
jnthnwrthngtn o/ sena_kun
sena_kun jnthnwrthngtn, hi! Did you by chance have time to address rest of the modules we talked about last time?
Nicholas \o
(don't want to capsise)
MasterDuke with that change stage parse was just 45s for me. master is usually around 42-43s
jnthnwrthngtn MasterDuke: Yes, otoh we've won a bit on the other stages, or at least it's so on my machine 14:25
sena_kun (btw I'll be in Prague Wednesday's morning)
MasterDuke before that last commit stage parse on new-disp was just 46s
jnthnwrthngtn sena_kun: Not yet; those are next. I think we're down to 3 or so of them?
sena_kun jnthnwrthngtn, something like that, yes. 14:26
jnthnwrthngtn Ah, missing JIT of sp_bindcomplete and the profiler still not fully working are probably also blocker-y
Nicholas jnthnwrthngtn: would it be a good idea to create a github issue that can track this (short) list? 14:28
because the list grew a bit from what I last knew, and it might be useful for the "night" shift if they happen to exist
[Coke] sena_kun: can you list any modules broken with new-disp under the new wiki page? 14:29
Nicholas: if there's actual tickets, we can link to the tickets from the wiki. (or tag them new-disp or something)
Nicholas there's a wiki? ;-)
[Coke] github.com/rakudo/rakudo/wiki/Raku...isp-branch 14:30
jnthnwrthngtn Just added a few things, would be good to list the remaining modules to fix there too, I just need to remember what they are 14:33
dogbert17 DateTime::Timezones is one 14:35
and Test::Base I believe 14:36
lizmat and yet another Rakudo Weekly News hits the Net: rakudoweekly.blog/2021/09/20/2021-...-feedback/
dogbert17 ah lizmat 14:37
jnthnwrthngtn Hm, this is an interesting discrepancy. MVM_SPESH_INLINE_LOG=1 claims something is inlined, the profiler output says not. Furthermore, the claim it's inlined is still made when running with --profile 14:39
These can't both be correct
lizmat jnthnwrthngtn: fwiw, I've seen discrepancies like that on master as well, so this may not be a new-disp issue 14:40
sena_kun [Coke], I'll start a ticket once I do a new Blin run next Wed. 14:50
MasterDuke jnthnwrthngtn: rakudo on the jvm can't use java classes (i.e., in a jar) in a raku program, correct?
jnthnwrthngtn MasterDuke: Hm, I thought something like that was implemented, but it's been years since I looked, so I've really no idea. 14:52
MasterDuke k, thanks
jnthnwrthngtn lizmat: Hmm. I really don't see how they can happen, but it certainly seems they are. 14:55
[Coke] ... they are paving the roads outside my house today and boy is my dog giving them a piece of her mind. 14:59
MasterDuke work not up to her standards? 15:01
[Coke] Apparently not! (borf borf)
oh wait, now she's asleep again. Fickle management. 15:03
jnthnwrthngtn Gah. The profiler was telling the truth. Actually so was the inline log. But...oops. 15:10
We had: unit containing a loop, the loop body, stuff in the loop body
The stuff called in the loop body was relatively small operators and got specialized; the unit had a hot loop so via OSR got specialized, but the loop body, between them, did *not* reach the optimization threshold because it has larger bytecode 15:11
So unit got specialized and did an unspecialized inline of the body and that in turn didn't get anything inlined into it
By the next specialization run, the loop body was hot enough given its size, and we inlined stuff into it...but unit was already optimized 15:12
And using the less-well-optimized body
Lesson: the OSR threshold should not be lower than at least the second level of body threshold, otherwise this can happen 15:13
Tweak numbers, 5.48s -> 2.65s
[Coke] wow. 15:14
MasterDuke ha 15:15
Geth MoarVM/new-disp: d8f4794140 | (Jonathan Worthington)++ | src/spesh/plan.h
Tweak OSR threshold; add comment on picking it
15:18
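A toy model of the threshold interaction jnthnwrthngtn describes, in Python. The function, the threshold values, and the one-hit-per-iteration counting are all invented for illustration; real spesh planning is considerably more involved. The model assumes only that the unit is optimized once, when its OSR counter trips, and keeps whatever version of the loop body it inlined at that moment.

```python
def simulate(osr_threshold, body_threshold, op_threshold, iterations):
    """Return which version of the loop body the unit ends up inlining."""
    specialized = set()
    unit_inlined = None
    for hits in range(1, iterations + 1):
        # Innermost first, so a callee specialized on this iteration
        # is already visible to its caller.
        if hits >= op_threshold:
            specialized.add("op")
        if hits >= body_threshold:
            specialized.add("body")
        if hits >= osr_threshold and unit_inlined is None:
            # The unit is optimized once and is not revisited afterwards.
            if "body" in specialized:
                unit_inlined = "specialized body"
            else:
                unit_inlined = "unspecialized body"
    return unit_inlined


low_osr = simulate(osr_threshold=100, body_threshold=300,
                   op_threshold=50, iterations=1000)
safe_osr = simulate(osr_threshold=300, body_threshold=300,
                    op_threshold=50, iterations=1000)
print(low_osr, "/", safe_osr)  # unspecialized body / specialized body
```

With the OSR threshold below the body's threshold, the unit locks in the unspecialized body before the body ever gets the small operators inlined into it, which is the situation the tweak avoids.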
lizmat jnthnwrthngtn: time to do another test-t run it feels? 15:24
15:25 linkable6 left, evalable6 left
jnthnwrthngtn lizmat: Yes, please 15:25
actually wait a moment, I'll push one thing to Rakudo also :) 15:26
15:26 evalable6 joined
jnthnwrthngtn Done 15:27
lizmat ok, hang on :-) 15:28
Geth MoarVM/new-disp: 48ef4a5488 | (Jonathan Worthington)++ | src/core/continuation.c
Reinstate profiling when continuations are used
15:35
jnthnwrthngtn That's one missing piece :) 15:40
lizmat test-t new-disp: 1.454 / 0.766 15:42
was 1.651 / .870 15:43
on master the numbers were test-t: 1.372 / .634 15:44
jnthnwrthngtn m: say 1.372 / 1.454 15:45
camelia 0.943604
lizmat m: say .634 / .766 15:46
camelia 0.827676
jnthnwrthngtn ~6%, pretty close. Maybe I shoulda got you to measure after I put in sp_bindcomplete JIT too...
lizmat with startup having gone from ~ 120 -> 160 msecs, that shows up badly for the --race case 15:47
jnthnwrthngtn (Since it blocks a bunch of JITting)
Oh, these times include startup?
lizmat yes
jnthnwrthngtn m: say 1.454 - 0.040
camelia 1.414
jnthnwrthngtn m: say 1.372 / 1.414 15:48
camelia 0.970297
jnthnwrthngtn Even closer with that factored in, then
lizmat m: say (1.372 - .120) / (1.454 - .160)
camelia 0.967543
MasterDuke well, you'd need to subtract master's startup time too, right?
lizmat indeed :-)
jnthnwrthngtn Oh, yeah, I subtracted the difference but that's probably not quite legit :)
lizmat but I'd say we're in the same ballpark now 15:49
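The startup-adjusted comparison worked out above can be written as one small Python helper. The timings are lizmat's test-t numbers and the approximate 120 ms and 160 ms startup figures from the discussion; the `ratio` function itself is just an illustrative name.

```python
def ratio(master, branch, master_startup=0.0, branch_startup=0.0):
    """Ratio of master to branch runtime after removing each side's startup."""
    return (master - master_startup) / (branch - branch_startup)


# Single-threaded test-t: master 1.372s, new-disp 1.454s.
raw = ratio(1.372, 1.454)
adjusted = ratio(1.372, 1.454, master_startup=0.120, branch_startup=0.160)

# The --race case: master 0.634s, new-disp 0.766s.
race_raw = ratio(0.634, 0.766)
race_adjusted = ratio(0.634, 0.766, master_startup=0.120, branch_startup=0.160)

print(f"{raw:.6f} {adjusted:.6f} {race_raw:.6f} {race_adjusted:.6f}")
```

Subtracting each side's own startup, as MasterDuke points out, is what makes the comparison fair; the shorter the total run, the more the startup difference dominates, which is why the --race numbers look worse.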
jnthnwrthngtn Yeah. Maybe sp_bindcomplete JITting unlocks a bit more
nine So....where was I? 15:55
jnthnwrthngtn On vacation, I think... :) 15:56
MasterDuke starts thinking of the "where in the world is carmen san diego" theme song
jnthnwrthngtn Maybe also trying to work out how I busted callframe.t 15:57
nine Well, both are right :) 15:58
Geth MoarVM/new-disp: 7a1a85de8e | (Jonathan Worthington)++ | 2 files
JIT sp_bindcomplete
15:59
jnthnwrthngtn lizmat: You'll only need to rebuild MoarVM to see if ^^ wins us any more
MasterDuke sp_runfunc_* are added in new-disp, right? they cause a ton of missing template messages in a spesh log 16:00
lizmat jnthnwrthngtn: will in a mo
nine MasterDuke: yes
MasterDuke also sp_runbytecode_*
16:01 sena_kun left
MasterDuke looks like there's a bunch of missing sp_* templates 16:02
jnthnwrthngtn: what about sp_assertparamcheck, can that also be jitted? 16:03
jnthnwrthngtn MasterDuke: Yes, but those will likely be a bit...interesting...to template JIT, though my hope is rather less interesting than the previous scheme.
MasterDuke: It can, but is it still showing up regularly in JIT bail logs?
Oh hmm, we still JIT assertparamcheck but that's useless 16:04
MasterDuke sp_assertparamcheck: 'trait_mod:<is>'(2) 'Bool' 'trait_mod:<does>'(2) 'signature' 'onlystar' 'soft' 'defined'
jnthnwrthngtn Because it's always replaced with sp_assertparamcheck now 'cus of the IC
Let's see...
MasterDuke those counted from a spesh log of compiling CORE.c 16:05
lizmat jnthnwrthngtn: no noticeable difference with test-t 16:07
MasterDuke ugh, my raku one-liner(ish) to calculate those is not fast, 1m for a 1.4gb log file 16:08
Geth MoarVM/new-disp: 728291fb1c | (Jonathan Worthington)++ | 3 files
JIT sp_assertparamcheck

We no longer ever have assertparamcheck in specialized code, so adapt that implementation in the lego JIT. There's still some figuring out needed for having IC-accessing things in the expression JIT (probably we want some convenience macros for that).
16:11
nine Oh boy... I've found the actual guilty commit that broke callframe.t. And it's actually my very own: github.com/MoarVM/MoarVM/commit/1d...c5a872a353 16:12
jnthnwrthngtn MasterDuke++ # nudging me 16:13
In my latest CORE.c.setting compilation the total time was 48.7s; by contrast, master last I measured it was 50.4s
nine The case for f == tc->cur_frame shouldn't apply when we did not actually start traversing from tc->cur_frame. As is the case with MVMContext we got from nqp::ctx 16:14
MasterDuke jnthnwrthngtn: btw, i just commented on 728291fb1c
jnthnwrthngtn nine: Ah, so it was busted and my enabling more inlining uncovered it?
nine jnthnwrthngtn: indeed
Geth MoarVM/new-disp: d5b6979400 | (Jonathan Worthington)++ | 3 files
JIT sp_assertparamcheck

We no longer ever have assertparamcheck in specialized code, so adapt that implementation in the lego JIT. There's still some figuring out needed for having IC-accessing things in the expression JIT (probably we want some convenience macros for that).
16:15
nine Because we're now running MVMContext's existskey in an inlined block which just didn't happen before. And through my broken fix, the frame walker gets confused and walks out of that inline instead of to the caller. 16:16
jnthnwrthngtn "fixed in post" ;P
MasterDuke++
nine: oooh, existskey...hmm
MasterDuke heh, nice
jnthnwrthngtn I just wonder if that means fixing it will also fix require.t
(The explanation of what's wrong with callframe.t makes it seem more likely) 16:17
nine So the "if (f == tc->cur_frame)" should read "if (f == tc->cur_frame && we_are_still_at_the_start_frame)". 16:20
16:20 brrt joined
nine jnthnwrthngtn: do you think this is a correct spelling of ^^^? gist.github.com/niner/f3b62798b717...d388c3765b 16:23
jnthnwrthngtn nine: Only if nothing else looks at fw->started... 16:25
Which seems to be the case, so yeah, it seems alright 16:26
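A rough Python model of the condition nine spells out above. `Frame`, `FrameWalker`, and the two string results are hypothetical; this is not the contents of the gist or of frame_walker.c. It only encodes the stated rule: trust the live program counter for a frame when that frame is the current one and the walker has not yet moved from where it started.

```python
class Frame:
    """Hypothetical frame with a link to its caller."""
    def __init__(self, name, caller=None):
        self.name = name
        self.caller = caller


class FrameWalker:
    """Hypothetical walker that remembers whether it has moved yet."""
    def __init__(self, start):
        self.frame = start
        self.at_start = True

    def move_to_caller(self):
        self.frame = self.frame.caller
        self.at_start = False

    def position_source(self, cur_frame):
        # f == tc->cur_frame && we_are_still_at_the_start_frame
        if self.frame is cur_frame and self.at_start:
            return "program counter"
        return "saved offset"


outer = Frame("outer")
inner = Frame("inner", caller=outer)

live = FrameWalker(outer)                 # starts at the running frame
from_saved_context = FrameWalker(inner)   # starts from a saved context
from_saved_context.move_to_caller()       # arrives at outer by traversal

print(live.position_source(outer), "/", from_saved_context.position_source(outer))
# program counter / saved offset
```

Without the extra flag, the second walker would also use the live program counter on reaching the running frame, even though it got there from a saved context.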
nine Bad news: require.t is still broken 16:30
jnthnwrthngtn But is callframe.t fixed? 16:33
nine jnthnwrthngtn: the golfed test case and callframe.t now pass 16:36
require.t fails with Lexical with name '&allgreet' does not exist in this frame 16:37
16:40 [Coke] left
nine Fails quite reliably even in rr 16:40
jnthnwrthngtn You've done better than me at getting some useful info out of it. 16:41
nine in line 108 btw 16:42
jnthnwrthngtn Text::CSV has really a lot of inclusive time in MVM_disp_program_run...that's interesting. 16:43
Geth MoarVM: MasterDuke17++ created pull request #1545:
Add _n cases to jitting some of the new ops
16:44
MoarVM/new-disp: bb8408ebc8 | (Stefan Seifert)++ | src/spesh/frame_walker.c
Fix frame walker confused when traversing a saved context

For the currently executing frame, the frame walker uses the program counter directly to get the current position (for determining whether we are in an inlined frame) rather than a deopt index as we might not be on an instruction carrying a deopt annotation.
... (11 more lines)
16:53
nine No idea if my commit message makes it even remotely clear what's going on... 16:54
groceries& 16:55
17:05 brrt left
jnthnwrthngtn I think we're going to need a guard for object HLL. While in most cases guarding on the type suffices, and nicely eliminates in dispatch program translation, there's places it does not 17:06
The most immediate one being in raku-multi-plan, which sees a huge number of different types 17:07
Geth MoarVM/new-disp: 3deb4c150c | (Daniel Green)++ | src/jit/graph.c
Add _n cases to jitting some of the new ops
17:10
MoarVM/new-disp: 9272183955 | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/jit/graph.c
Merge pull request #1545 from MasterDuke17/add_num_cases_to_jitting_some_new_ops_on_new-disp

Add _n cases to jitting some of the new ops
nine jnthnwrthngtn: you mean while guarding on type covers object HLL indirectly, it's too specific and leads to unneeded deopts? 17:17
jnthnwrthngtn nine: Yes, well, not deopts because the site is way too polymorphic for spesh to do anything other than leave it as an sp_dispatch, but we also fill the inline cache up with entries too 17:18
nine: It's not noticeable in microbenchmarks, but any decent size program will multi-dispatch over quite a lot of different types 17:19
Text::CSV being an example of something that just blows the budget even in a simple use of it.
I'm quite relieved this is the only situation where it does so. :) 17:20
(In that particular example, anyway) 17:21
Started implementing it, but think it's time for some rest and food. 17:22
nine Yeah, food sounds more and more tempting :)
17:51 rba left, rba joined 17:53 nine left 17:54 nine joined 18:02 reportable6 left 18:03 reportable6 joined
MasterDuke huh. i have a profile with 800k calls to .contains, and it's showing as yellow. a spesh log of the same code i profiled has the 'after' of contains with only three BBs and no jit bails (template or otherwise). why would it be yellow in the profile? 18:04
and the spesh log says "JIT was successful and compilation took 106us"
timo MasterDuke: deopts? 18:25
MasterDuke 700 of them 18:52
but all in the mainline, not .contains 18:53
18:57 brrt joined
Nicholas do we have (reliable) spectest failures currently on new-disp? I thought that t/spec/S17-scheduler/every.t failed for me (this is a new one) but I can't repeat it, and I just did a clean spectest run 18:57
good *, brrt
lizmat runs a spectest on new-disp 18:58
MasterDuke nine and jnthnwrthngtn were talking about a failure in require.t, but maybe it requires some env variables to be set
nine MVM_SPESH_BLOCKING=1 MVM_SPESH_NODELAY=1 ./rakudo-m -Ilib t/spec/S11-modules/require.t 18:59
Nicholas: I see many sleeps in every.t. That test is clearly racey. 19:00
Nicholas :-( (and thanks for doing my homework for me) 19:01
brrt good * Nicholas, lizmat, MasterDuke, nine 19:06
lizmat brrt o/
Nicholas: spectest is clean for me on MacOS
nine \o
Nicholas it cmopiles, ship it!
lizmat dyslexis untie! 19:08
Nicholas :-)
timo MasterDuke: i'm actually not 100% sure if the deopt would be counted in the mainline or in .contains in this case 19:13
MasterDuke i wonder if the fact that i'm using -n has anything to do with it 19:14
gist.github.com/MasterDuke17/72433...acb022d71c is what i'm running 19:17
timo do you know -MSIL? 19:19
MasterDuke 55% specialized, 44% jitted
i'd forgotten about it until lizmat was mentioning it recently (my bash aliases predate it) 19:20
dogbert17 any c experts around
lizmat I'd consider Nicholas one
dogbert17 if valgrind complains if MoarVM is compiled with --no-optimize but not otherwise, is that a problem which should be looked into? 19:21
==702950== Thread 2 spesh optimizer: 19:22
==702950== Conditional jump or move depends on uninitialised value(s)
==702950== at 0x4BAAF89: optimize_bb_switch (optimize.c:2299)
this is from running './rakudo-valgrind-m -e 'role PDF { }' 19:24
timo personally, i don't care :) 19:26
dogbert17 but can't you run rr :)
nine jnthnwrthngtn: the require.t failure is definitely another MVMContext frame walker inlining issue 19:27
Actually it very much reminds me of my struggles to get Backtrace working reliably with MVM_SPESH_NODELAY. It again seems to be ye olde problem of traversing through a call stack when one of the frames is still active. 19:36
&REQUIRE_IMPORT gets its caller's lexpad via `my $block := CALLER::MY::`, moves on into an inlined frame which then tries to access that lexpad. In bind_key, we try to traverse from the MVMContext 2 callers out to the lexpad. But one of the frames is &REQUIRE_IMPORT and its return_offset has changed since we created the MVMContext. 19:40
19:47 gabriel80546 joined
gabriel80546 Have you ever wanted to run raku on your phone? well that is totally possible. 19:48
the great guy named Max Kapusta on stackoverflow has figured that out.
all you need is to use the app UserLand and install rakudo with `sudo apt install rakudo`
stackoverflow.com/questions/690910...1#69247911
timo it'll be kinda slow without the full jit since we don't have that for arm yet 19:53
20:00 brrt left
nine Actually I think my assessment is only some 90 % correct because there is this caller_deopt_idx thing in the frame extra that's supposed to cover this case 20:08
But then I should be heading for bed anyway...
20:14 gabriel80546 left 20:27 linkable6 joined
jnthnwrthngtn nine: One problem I ran into when implementing sp_resumption inlining support was needing to make absolutely sure the sp_dispatch or sp_runbytecode that followed got a deopt index reliably, so that the position info was up to date if you started walking on the stack top. Dunno if this is a case of that too. 20:28
timo speaking of deopt idxes, just the other day i saw a specialization that started with like 20 "set" instructions that had deopt annotations on them because they used to be guards 20:30
jnthnwrthngtn The thing that removes deopt usages doesn't go by the annotation only, but by the op properties too 20:34
And set isn't marked "may cause deopt"
So having the annotation left shouldn't be an issue (it'd also cost some cycles to find/remove, plus I like retaining them for debugging purposes anyway) 20:35
timo ah, wonderful 20:37
i thought it might stick around in the table anyway, but it probably just gets skipped when we do code-gen or so 20:39
MasterDuke huh. i'm only getting a single MVM_CALLSTACK_RECORD_START while walking the callstack in my cleanup function called from MVM_callstack_destroy 20:47
jnthnwrthngtn MasterDuke: Hmm...that'd imply that they're not leaked because of missing exit-time cleanup, but just leaked in "normal" operation, which is certainly ungood 21:03
MasterDuke oh, and this is new (i think), but another valgrind run just pointed out github.com/MoarVM/MoarVM/blob/new-...am.c#L2321 as leaked also 21:05
and some other vectors from that compile_state 21:06
jnthnwrthngtn I'd imagine it's all the same underlying issue: the dispatch program isn't being destroyed, so all that hangs off the MVMDispProgram will be leaked too. 21:07
MasterDuke yep 21:09
unfortunately my `git grep` foo isn't good enough to find things that aren't there, but should be 21:10
jnthnwrthngtn Hmm...all the callstack unwinding looks correct though. Alas, so does the MVMStaticFrame cleanup 21:11
Though another pair of eyes can't hurt; look around where MVM_disp_inline_cache_transition is called, and produced_dp is handled for those that fail to be installed during IC transition 21:14
MasterDuke k
jnthnwrthngtn That's the two paths: it gets installed, or it gets stored in the IC which hangs off the STable
Grep for produced_dp in callstack.c for the matching piece 21:16
afk for a bit 21:18
MasterDuke ah ha! found it 21:19
Geth MoarVM: MasterDuke17++ created pull request #1546:
Correctly clean up disp programs in cleanup_entry
21:35
21:39 [Coke] joined
[Coke] . 21:39
MasterDuke and with one other un-related change, valgrind now reports no leaks for `raku --full-cleanup -e ''` 21:40
Geth MoarVM: MasterDuke17++ created pull request #1547:
Clean the hash of syscalls during vm cleanup
21:45
MoarVM/new-disp: c79043e622 | (Daniel Green)++ | src/disp/inline_cache.c
Correctly clean up disp programs in cleanup_entry

Destroying the dispatch programs in inline cache entries when destroying static frames was introduced in 6b50be1f8a7c9c81fbd85e7fe44467405d1979c6, but there was a typo in the case of polymorphic entries.
22:10
MoarVM/new-disp: 5e18f3c40b | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/disp/inline_cache.c
Merge pull request #1546 from MasterDuke17/fix_disp_programs_not_getting_cleaned_from_polymorphic_inline_cache_entries_on_new-disp

Correctly clean up disp programs in cleanup_entry
jnthnwrthngtn The keys are like right next to each other :D
timo oooooh
on the german keyboard, < and > are on the same key, even
Geth MoarVM/new-disp: 405409aefe | (Daniel Green)++ | src/moar.c
Clean the hash of syscalls during vm cleanup

Otherwise valgrind will report a leak for `raku --full-cleanup -e ''`.
MoarVM/new-disp: 66f6d6f62b | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/moar.c
Merge pull request #1547 from MasterDuke17/clean_the_hash_of_syscalls_during_vm_cleanup_on_new-disp

Clean the hash of syscalls during vm cleanup
MasterDuke oh really? ha! that must make programming fun 22:15
timo well, it's even worse, {} are on shift 7 and 0 whereas [] are on shift 8 and 9 22:17
no, omg i'm totally wrong
that's not shift, that's altgr
shift 8 and 9 give you () and shift 7 and 0 give you / and = respectively 22:18
i haven't coded with de layout for ages 22:25
22:53 linkable6 left, evalable6 left 22:55 evalable6 joined 22:57 linkable6 joined
Geth MoarVM/new-disp: 52477ea3df | (Jonathan Worthington)++ | 11 files
Add support for HLL guards; use in lang-hllize

Some callsites become polymorphic or even megamorphic in the case that they only care about HLL, but we enforce it with a type guard. The Raku multi dispatch planner ran into this, meaning that in any non-trivial program that dispatches over many types, we'd end up with a full inline cache site with all the costs of that. Introduce guards on HLL in order that we can avoid this situation.
23:51
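A small Python sketch of why guarding on HLL rather than on type keeps such a callsite's inline cache small. The sample values and the `cache_entries` helper are made up for illustration; the only idea modelled is that a cache needs one entry per distinct guard value it has seen.

```python
def cache_entries(values, guard_key):
    """Count the distinct inline cache entries a callsite needs under a guard."""
    return len({guard_key(value) for value in values})


# (type name, HLL) pairs seen at one hypothetical callsite.
seen = [
    ("Int", "Raku"),
    ("Str", "Raku"),
    ("Map", "Raku"),
    ("Array", "Raku"),
    ("NQPArray", "nqp"),
]

by_type = cache_entries(seen, lambda value: value[0])
by_hll = cache_entries(seen, lambda value: value[1])
print(by_type, by_hll)  # 5 2
```

A type guard still implies the HLL, so it stays correct; it is just far more specific than a site that only cares about HLL needs, and the entry count grows with every new type that flows through.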
jnthnwrthngtn That seems to bring a minor startup improvement too 23:52
'night