Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021. |
|||
00:03
reportable6 left
00:05
reportable6 joined
01:05
notable6 left,
benchable6 left,
linkable6 left,
committable6 left,
unicodable6 left,
releasable6 left,
nativecallable6 left,
greppable6 left,
squashable6 left,
reportable6 left,
tellable6 left,
sourceable6 left,
bisectable6 left,
quotable6 left,
statisfiable6 left,
bloatable6 left,
coverable6 left,
evalable6 left,
shareable6 left
01:06
evalable6 joined,
benchable6 joined
01:07
unicodable6 joined,
tellable6 joined,
greppable6 joined,
statisfiable6 joined
01:08
quotable6 joined,
sourceable6 joined
02:07
committable6 joined,
reportable6 joined
02:08
releasable6 joined,
shareable6 joined
03:08
committable6 left,
reportable6 left,
releasable6 left,
evalable6 left,
shareable6 left,
benchable6 left,
statisfiable6 left,
unicodable6 left,
tellable6 left,
greppable6 left,
quotable6 left,
sourceable6 left,
bloatable6 joined,
committable6 joined,
notable6 joined
03:09
statisfiable6 joined,
unicodable6 joined
03:10
quotable6 joined,
benchable6 joined,
sourceable6 joined,
tellable6 joined
03:11
evalable6 joined
04:06
coverable6 joined
04:07
squashable6 joined
04:08
releasable6 joined
04:10
shareable6 joined
04:11
greppable6 joined
04:29
psydroid left,
Voldenet left,
rba left
04:30
rba joined,
Voldenet joined
04:37
psydroid joined
04:41
leedo left
04:42
leedo joined
05:07
bisectable6 joined
05:08
nativecallable6 joined,
linkable6 joined
|
|||
Nicholas | Good *, #moarvm | 05:42 | |
05:48
brrt joined
06:05
reportable6 joined
|
|||
japhb | Monday always comes so quickly -- especially when you're talking to people several timezones ahead ... | 06:12 | |
brrt | it sneaks up on you | 06:34 | |
nine | I'm pretty sure that deopt doesn't have anything to do with the bug. It happens at the very end of the frame and debug output indicates the wrong answer from EXISTS-KEY before the deopt happens | 06:38 | |
It's not that we get the wrong callframe either. The Stash we call EXISTS-KEY on clearly holds that variable: Stash element = {"\$_" => Rakudo::Internals::LoweredAwayLexical, "\$seen" => 154, "\&a" => proto sub a (;; Mu |) {*}} | 06:40 | ||
Ah but that's a red herring. callframe(1) works by getting the current MVMContext (inside the callframe sub) and moving up the call chain from that. But "moving" here means basically, getting a clone of that MVMContext and adding a traversal. | 07:13 | ||
Every time we actually look up something in that MVMContext it then applies these traversals (i.e. "move one frame up the call chain"). | 07:14 | ||
This means that the result of my debug output, which clearly shows the $seen lexical, does not tell us anything about whether EXISTS-KEY will find that lexical. | |||
That's because EXISTS-KEY is inlined, which seems to affect the traversal. In essence, we're trying to find $seen in &a, instead of its caller. | 07:15 | ||
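The lazy traversal nine describes can be modelled in a few lines. This is only an illustrative Python sketch, not MoarVM code: `Frame`, `Context`, `move_to_caller`, `resolve` and `exists_key` are hypothetical names, and treating the inlined EXISTS-KEY as an extra starting frame is a deliberate simplification. It shows why inspecting the context once says little about a later lookup: the recorded traversals are replayed at lookup time from wherever the walk starts.

```python
class Frame:
    """A call frame with its lexicals and a link to its caller."""
    def __init__(self, name, lexicals, caller=None):
        self.name = name
        self.lexicals = lexicals
        self.caller = caller


class Context:
    """Hypothetical stand-in for a context: a start frame plus pending traversals."""
    def __init__(self, frame, traversals=()):
        self.frame = frame
        self.traversals = tuple(traversals)

    def move_to_caller(self):
        # "Moving" walks nothing: it clones the context and records a traversal.
        return Context(self.frame, self.traversals + ("caller",))

    def resolve(self):
        # The traversals are only applied here, at lookup time.
        frame = self.frame
        for step in self.traversals:
            if step == "caller":
                frame = frame.caller
        return frame

    def exists_key(self, name):
        return name in self.resolve().lexicals


caller = Frame("caller of &a", {"$seen": 154})
a = Frame("&a", {"$_": None}, caller=caller)
callframe_sub = Frame("callframe", {}, caller=a)

ctx = Context(callframe_sub).move_to_caller().move_to_caller()
found_before = ctx.exists_key("$seen")  # walk ends at the caller of &a

# Simplified model of the bug: the walk now effectively starts inside an
# inlined frame, so the same two traversals stop one frame short, at &a.
ctx.frame = Frame("EXISTS-KEY (inlined)", {}, caller=callframe_sub)
found_after = ctx.exists_key("$seen")
landed_on = ctx.resolve().name
```

In this model the stored traversals never change, yet the answer does, which matches the observation that the walk ends up looking for `$seen` in `&a` instead of its caller.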
lizmat | Q: isn't it time to merge new-disp ? | 08:22 | |
nine | (or rename it to main) | ||
Nicholas | A: I think it probably should be rebased first (and there's some nqp stuff to "fix in post" - ie commit-then-revert) | 08:23 | |
lizmat | eh... I think there's some stuff in master that we want to keep ? | ||
like a 2021.09 release ? | |||
Nicholas | yes, I meant all 3 rebased onto master | ||
lizmat | would be nice news item :-) | 08:24 | |
*a | |||
Nicholas | A: I think that new-disp is down to 3 modules in the ecosystem that it breaks, and those are assumed to be new-disp bugs, and jnthnwrthngtn does not run in constant time in parallel universes, and no-one else is brave enough to loop | ||
to look | |||
and I tink | |||
lizmat | you keep on tinkering! | 08:25 | |
Nicholas | A: he was also trying to get a bit more inlining back so that the speed is similar | ||
moon-child | 'jnthnwrthngtn does not run in constant time' :D | 08:27 | |
nine | As long as there are known bugs and performance issues, there's not terribly much to gain from merging. Any new issues discovered by wider circulation would only get added to the list and we'd probably get reports about issues that we knew already. | 08:31 | |
09:03
sena_kun left
09:12
sena_kun joined
09:26
brrt left
|
|||
jnthnwrthngtn | I wondered if the limitation on argument processing logic having to be completed prior to a deopt was really needed with the new calling convs, so tried removing it. CORE.setting compilation explodes. Hm, OK | 10:34 | |
So I thought, what if for required positional args I tweak QAST compilation to spit out the arg processing instructions and then the type checks, because that'll also remove a lot of the inlining blockers. | 10:35 | ||
Same explosion. So I guess in both cases I just allow more inlining and expose some other problem. | |||
10:44
brrt joined
|
|||
jnthnwrthngtn | Was sure I'd find something better to try and spesh bisect than CORE.setting compilation. Alas, no, the other options all involve threads and so aren't deterministic. | 10:51 | |
timo | github.com/emeryberger/dthreads ? | 10:53 | |
jnthnwrthngtn | timo: Hm, that's interesting | 10:56 | |
10:57
brrt left
|
|||
timo | haven't tried it yet, was just a quick google. i thought "there's gotta be something simple you can just slot in for pthreads" | 10:58 | |
Nicholas | last commit was Nov 2014, so "just slot in" *might* no longer fly | ||
unless you're running some CentOS fossil | 10:59 | ||
jnthnwrthngtn | Yeah, I suspect the CORE.setting bisect will be faster than trying to get that to work (I'm almost there with it) | 11:00 | |
But it's good to know for if the only repro is threaded | 11:01 | ||
OK, the bisect gets me something 3 frames down in the backtrace at the crash point, so this seems legit | 11:03 | ||
timo | ah, i knew the name "emery berger" rang a bell, he's also worked on coz (the Causal Profiling thing) and Hoard (a memory allocator) | ||
Nicholas | Is there anything we can do to be helpful? You don't even seem to need a rubber duck... | 11:04 | |
timo | hoard is meant to be much more efficient in multithreaded applications, especially in multi-processor devices | 11:07 | |
unlikely to be of too much use in moarvm since we do so much with our own allocators | 11:09 | ||
jnthnwrthngtn | The failure mode is that we end up passing (apparently) a Map type object to FLATTENABLE_HASH, but there's no explicit calls to that, so I'm figuring it is thanks to hllize | 11:26 | |
But also wondering if we have sp_dispatch that is going to look at the HLL, then we inline it, we'll do things for the wrong HLL | |||
Sticking useshll on sp_dispatch_* seems to help, though that's a heavy hammer | 11:30 | ||
timo | we don't have anything to guard on hll yet? | 11:34 | |
jnthnwrthngtn | We shouldn't need to as it's a static property of a callsite | 11:36 | |
At the moment lang-hllize calls MVM_hll_current, however when we are recording a dispatch program we have the static frame where the callsite originally appeared handy | |||
So we can reliably obtain it even with an inlined sp_dispatch | 11:37 | ||
And yes, that does seem to help | 11:38 | ||
timo | ah, so what we want is not to abort when the hll mismatches, we just want to access the right hll at the right time | ||
jnthnwrthngtn | Yeah | 11:39 | |
lunch, bbiab | |||
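The HLL lookup discussed above can be illustrated with a toy model. This is a hedged Python sketch, not the MoarVM implementation: `StaticFrame`, `DispatchSite` and both lookup functions are hypothetical, and which HLL inlines into which is an assumption made for the example. It contrasts asking the currently running frame for its HLL with asking the static frame where the callsite originally appeared.

```python
class StaticFrame:
    """Hypothetical static frame carrying the HLL it was compiled for."""
    def __init__(self, name, hll):
        self.name = name
        self.hll = hll


class DispatchSite:
    """Hypothetical dispatch site that remembers its pre-inline static frame."""
    def __init__(self, owner):
        self.owner = owner


def hll_via_current_frame(running_frame, site):
    # Depends on whichever frame happens to be executing right now.
    return running_frame.hll


def hll_via_callsite(running_frame, site):
    # Depends only on where the callsite was originally compiled.
    return site.owner.hll


helper = StaticFrame("helper", "nqp")      # assumed HLL for illustration
caller = StaticFrame("caller", "Raku")     # assumed HLL for illustration
site = DispatchSite(owner=helper)

# Not inlined: the dispatch runs in its own frame and both lookups agree.
before = (hll_via_current_frame(helper, site), hll_via_callsite(helper, site))

# Inlined into the caller: the running frame is now the caller's.
after = (hll_via_current_frame(caller, site), hll_via_callsite(caller, site))
```

Only the callsite-based lookup is stable across inlining in this model, which is the property timo summarises as accessing the right HLL at the right time.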
12:03
reportable6 left
|
|||
Geth | MoarVM/new-disp: 2c65946784 | (Jonathan Worthington)++ | 3 files Always locate correct HLL in lang-hllize When we inline an sp_dispatch, use the static frame of the callsite (which is its pre-inline static frame, where the inline cache lives) in order to determine the HLL. Otherwise, we can use the wrong HLL when we record further dispatch programs after the inlining. |
12:53 | |
jnthnwrthngtn | Good news: that fixes the CORE.setting issue. Bad news: the others are different | 12:55 | |
However, there is the hope that the callframe one nine++ was working on is involved | 12:56 | ||
Removing the "can't deopt in arg processing" restriction in inlining, and building without my (local) NQP change, gets me a superset of the `make test` issues. | 13:02 | ||
> Specialization of 'EVAL' | 13:09 | ||
oh lovely | |||
Deopt one requested by interpreter in frame 'EVAL' (cuid '14069') | 13:11 | ||
Will deopt 2712 -> 12 | |||
Completed deopt_one in 'EVAL' (cuid '14069') | |||
...another inline boundary bug? argh | |||
aha, OK, so if it's a pre-deopt point we need the equal and opposite fix as pre-deopt points did at the end offset | 13:18 | ||
.oO( 4 hours of sleep is probably not optimal conditions for hunting inlining/deopt bugs ) |
13:20 | ||
dogbert17 | jnthnwrthngtn: coffee or tea might help a bit | 13:21 | |
or you could attack something which doesn't require as much brane | 13:22 | ||
13:24
MasterDuke joined
|
|||
MasterDuke | perhaps some lighter fare would be the MVMDispProgram leak at src/disp/program.c:2330 | 13:28 | |
jnthnwrthngtn | Hm, that line is just `MVMDispProgram *dp = MVM_malloc(sizeof(MVMDispProgram));`? | 13:30 | |
Or is it another one, because I added something in that file today? | |||
MasterDuke | yep, that's where valgrind reports we're leaking in `raku --full-cleanup -e ''` | 13:31 | |
definitely lost: 26,720 bytes in 442 blocks | |||
jnthnwrthngtn | I'm guessing we leak a bunch of other stuff hanging off it? | ||
MasterDuke | indirectly lost: 82,848 bytes in 1,213 blocks | ||
jnthnwrthngtn | OK, the indirect less is stuff hanging off it then, I guess | 13:32 | |
*loss | |||
MasterDuke | i've tried a couple things, but MVMCallStackDispatchRecord (where that MVMDispProgram gets stored in ->produced_dp) don't have a *_destroy | 13:34 | |
and i haven't been able to figure out where i can manually call `MVM_disp_program_destroy(tc, record->produced_dp)` | 13:35 | ||
jnthnwrthngtn | Ah, because the stack isn't unwound fully at exit | ||
Geth | MoarVM/new-disp: 07ce2b1acb | (Jonathan Worthington)++ | src/spesh/deopt.c Pre-deopt offsets are inclusive when uninlining They come before the first instruction in the inline, and so we should consider the start offset inclusively for such a deopt index. |
13:37 | |
jnthnwrthngtn | MasterDuke: I think you'd have to add something a bit like the GC mark iterator over stack frames that instead destroys all it finds, and call it from wherever we destroy the thread context. | 13:38 | |
But its logic would not be like GC mark, but rather like MVM_callstack_unwind_frame in terms of doing cleanup instead | 13:39 | ||
(Like GC mark in that it walks the whole stack, though) | |||
MasterDuke | walk via tc->stack_first_region+tc->stack_top? | 13:41 | |
jnthnwrthngtn | You can just walk it from tc->stack_top and follow ->prev as marking does | 13:42 | |
at least, I think that's what marking does | |||
You shouldn't have to care about regions | |||
MasterDuke | thanks, i'll give that some experimenting | ||
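The suggested exit-time cleanup walk can be sketched as follows. This is an illustrative Python model under stated assumptions, not MoarVM code: `Record`, the record kind strings and `destroy_all_records` are hypothetical, and only the `prev` link and the `produced_dp` field are taken from the discussion. It walks from the stack top along `prev`, ignoring regions, and destroys whatever dispatch program a dispatch record still holds.

```python
class Record:
    """Hypothetical call stack record linked to the previous record."""
    def __init__(self, kind, prev=None, produced_dp=None):
        self.kind = kind
        self.prev = prev
        self.produced_dp = produced_dp


def destroy_all_records(stack_top, destroy_dp):
    """Walk the whole stack like GC marking would, but clean up instead."""
    destroyed = 0
    record = stack_top
    while record is not None:
        if record.kind == "dispatch_record" and record.produced_dp is not None:
            destroy_dp(record.produced_dp)
            record.produced_dp = None  # avoid a double free on a later walk
            destroyed += 1
        record = record.prev
    return destroyed


freed = []
start = Record("start")
frame = Record("frame", prev=start)
dispatch = Record("dispatch_record", prev=frame, produced_dp="dp#1")
top = Record("frame", prev=dispatch)

count = destroy_all_records(top, freed.append)
```

Passing the destroy function in as a callback keeps the walk itself independent of what cleanup each record kind needs.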
jnthnwrthngtn | I'm now down to one regression (and then only under nodelay+blocking) after removing the restrictions on inlining when we can deopt during arg processing | 13:50 | |
[Coke] | regarding merge: i'd rather open the possibility for getting *new* bug reports and will volunteer to close any other actual bugs as dupes of any existing ones. (want as much testing time this month as possible) | ||
jnthnwrthngtn | Annoyingly it only wants to reproduce under the harness... | 13:51 | |
[Coke] | github.com/rakudo/rakudo/wiki/Raku...isp-branch - please note any remaining issues here. We can use this to help decide when to merge. | 13:55 | |
14:05
reportable6 joined
|
|||
dogbert17 | and the regression is not t/spec/S06-advanced/callframe.t? | 14:09 | |
jnthnwrthngtn | dogbert17: No, require.t | 14:11 | |
Even trying to replicate the env vars and so on that the harness sets I can't make it happen outside of it :/ | 14:12 | ||
That makes it hard to get things like deopt logs etc. | |||
MasterDuke | have you tried running a different spectest in the background? | 14:13 | |
maybe it's the load and not the runner? | |||
jnthnwrthngtn | It's the runner; I'm not running a spectest to get it, just `make t/spec/S11-modules/require.t` or so | 14:14 | |
And it happens reliably and I can even bisect it to a particular specialization | |||
Maybe I should push the change to see if anybody else has any luck golfing | 14:15 | ||
Just cleaning up the MoarVM change in question | |||
It does unblock a load of inlining | |||
In fact, I think this is the final round of inlining unblocking to get us to similar inlining levels as master | 14:19 | ||
MasterDuke | are there any other "must have" optimizations required before the merge? | 14:21 | |
Geth | MoarVM/new-disp: b619a562a9 | (Jonathan Worthington)++ | src/spesh/inline.c Remove argument processing inlining restrictions Previously we have refused to deoptimize during argument processing, as we didn't reconstruct the argument processing context fully during deopt. With new-disp, we do so (aided by it being rather simpler). This unblocks further inlining, and hopefully gets us back to achieving the kinds of levels as master (plus we get to inline some things we could not before too). |
||
jnthnwrthngtn | MasterDuke: Don't think so; arguably startup, arguably eliding sp_resumption ops when we can prove the inlinee could never possibly resume | ||
MasterDuke | cool beans | 14:23 | |
jnthnwrthngtn | I suspect once we merge it we'll learn soon enough about major regressions that need tackling | ||
It'd be good to get another test-t update with the above change in | 14:24 | ||
sena_kun | o/ | ||
jnthnwrthngtn | o/ sena_kun | ||
sena_kun | jnthnwrthngtn, hi! Did you by chance have time to address rest of the modules we talked about last time? | ||
Nicholas | \o | ||
(don't want to capsise) | |||
MasterDuke | with that change stage parse was just 45s for me. master is usually around 42-43s | ||
jnthnwrthngtn | MasterDuke: Yes, otoh we've won a bit on the other stages, or at least it's so on my machine | 14:25 | |
sena_kun | (btw I'll be in Prague Wednesday's morning) | ||
MasterDuke | before that last commit stage parse on new-disp was just 46s | ||
jnthnwrthngtn | sena_kun: Not yet; those are next. I think we're down to 3 or so of them? | ||
sena_kun | jnthnwrthngtn, something like that, yes. | 14:26 | |
jnthnwrthngtn | Ah, missing JIT of sp_bindcomplete and the profiler still not fully working are probably also blocker-y | ||
Nicholas | jnthnwrthngtn: would it be a good idea to create a github issue that can track this (short) list? | 14:28 | |
because the list grew a bit from what I last knew, and it might be useful for the "night" shift if they happen to exist | |||
[Coke] | sena_kun: can you list any modules broken with new-disp under the new wiki page? | 14:29 | |
Nicholas: if there's actual tickets, we can link to the tickets from the wiki. (or tag them new-disp or something) | |||
Nicholas | there's a wiki? ;-) | ||
[Coke] | github.com/rakudo/rakudo/wiki/Raku...isp-branch | 14:30 | |
jnthnwrthngtn | Just added a few things, would be good to list the remaining modules to fix there too, I just need to remember what they are | 14:33 | |
dogbert17 | DateTime::Timezones is one | 14:35 | |
and Test::Base I believe | 14:36 | ||
lizmat | and yet another Rakudo Weekly News hits the Net: rakudoweekly.blog/2021/09/20/2021-...-feedback/ | ||
dogbert17 | ah lizmat | 14:37 | |
jnthnwrthngtn | Hm, this is an interesting discrepancy. MVM_SPESH_INLINE_LOG=1 claims something is inlined, the profiler output says not. Furthermore, the claim it's inlined is still made when running with --profile | 14:39 | |
These can't both be correct | |||
lizmat | jnthnwrthngtn: fwiw, I've seen discrepancies like that on master as well, so this may not be a new-disp issue | 14:40 | |
sena_kun | [Coke], I'll start a ticket once I do a new Blin run next Wed. | 14:50 | |
MasterDuke | jnthnwrthngtn: rakudo on the jvm can't use java classes (i.e., in a jar) in a raku program, correct? | ||
jnthnwrthngtn | MasterDuke: Hm, I thought something like that was implemented, but it's been years since I looked, so I've really no idea. | 14:52 | |
MasterDuke | k, thanks | ||
jnthnwrthngtn | lizmat: Hmm. I really don't see how they can happen, but it certainly seems they are. | 14:55 | |
[Coke] | ... they are paving the roads outside my house today and boy is my dog giving them a piece of her mind. | 14:59 | |
MasterDuke | work not up to her standards? | 15:01 | |
[Coke] | Apparently not! (borf borf) | ||
oh wait, now she's asleep again. Fickle management. | 15:03 | ||
jnthnwrthngtn | Gah. The profiler was telling the truth. Actually so was the inline log. But...oops. | 15:10 | |
We had: unit containing a loop, the loop body, stuff in the loop body | |||
The stuff called in the loop body was relatively small operators and got specialized; the unit had a hot loop so via OSR got specialized, but the loop body, between them, did *not* reach the optimization threshold because it has larger bytecode | 15:11 | ||
So unit got specialized and did an unspecialized inline of the body and that in turn didn't get anything inlined into it | |||
By the next specialization run, the loop body was hot enough given its size, and we inlined stuff into it...but unit was already optimized | 15:12 | ||
And using the less-well-optimized body | |||
Lesson: the OSR threshold should not be lower than at least the second level of body threshold, otherwise this can happen | 15:13 | ||
Tweak numbers, 5.48s -> 2.65s | |||
[Coke] | wow. | 15:14 | |
MasterDuke | ha | 15:15 | |
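The OSR threshold lesson can be made concrete with a toy simulation. This is a rough Python sketch under loud assumptions: the threshold numbers are invented, `simulate` is hypothetical, and real spesh works asynchronously on batches of logged data rather than per iteration. It only models the ordering problem: if the loop in the unit triggers OSR before the loop body has been specialized, the unit inlines the unspecialized body.

```python
def simulate(osr_threshold, body_threshold, iterations):
    """Return which version of the loop body the unit ends up inlining."""
    body_specialized = False
    inlined = None  # stays None if OSR never triggers
    for i in range(1, iterations + 1):
        # The body is called once per iteration; larger bytecode is assumed
        # to need more calls before it counts as hot.
        if not body_specialized and i >= body_threshold:
            body_specialized = True
        # OSR specializes the unit as soon as its loop is hot enough.
        if inlined is None and i >= osr_threshold:
            inlined = "specialized" if body_specialized else "unspecialized"
    return inlined


# Invented numbers: OSR fires well before the body is hot enough.
low_osr = simulate(osr_threshold=100, body_threshold=300, iterations=1000)

# OSR threshold raised to the body's threshold: the body is ready in time.
raised_osr = simulate(osr_threshold=300, body_threshold=300, iterations=1000)
```

In this model the body does get specialized later in the first case, but the unit has already been optimized around the older version, which is the situation described above.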
Geth | MoarVM/new-disp: d8f4794140 | (Jonathan Worthington)++ | src/spesh/plan.h Tweak OSR threshold; add comment on picking it |
15:18 | |
lizmat | jnthnwrthngtn: time to do another test-t run it feels? | 15:24 | |
15:25
linkable6 left,
evalable6 left
|
|||
jnthnwrthngtn | lizmat: Yes, please | 15:25 | |
actually wait a moment, I'll push one thing to Rakudo also :) | 15:26 | ||
15:26
evalable6 joined
|
|||
jnthnwrthngtn | Done | 15:27 | |
lizmat | ok, hang on :-) | 15:28 | |
Geth | MoarVM/new-disp: 48ef4a5488 | (Jonathan Worthington)++ | src/core/continuation.c Reinstate profiling when continuations are used |
15:35 | |
jnthnwrthngtn | That's one missing piece :) | 15:40 | |
lizmat | test-t new-disp: 1.454 / 0.766 | 15:42 | |
was 1.651 / .870 | 15:43 | ||
on master the numbers were test-t: 1.372 / .634 | 15:44 | ||
jnthnwrthngtn | m: say 1.372 / 1.454 | 15:45 | |
camelia | 0.943604 | ||
lizmat | m: say .634 / .766 | 15:46 | |
camelia | 0.827676 | ||
jnthnwrthngtn | ~6%, pretty close. Maybe I shoulda got you to measure after I put in sp_bindcomplete JIT too... | ||
lizmat | with startup having gone from ~ 120 -> 160 msecs, that shows up badly for the --race case | 15:47 | |
jnthnwrthngtn | (Since it blocks a bunch of JITting) | ||
Oh, these times include startup? | |||
lizmat | yes | ||
jnthnwrthngtn | m: say 1.454 - 0.040 | ||
camelia | 1.414 | ||
jnthnwrthngtn | m: say 1.372 / 1.414 | 15:48 | |
camelia | 0.970297 | ||
jnthnwrthngtn | Even closer with that factored in, then | ||
lizmat | m: say (1.372 - .120) / (1.454 - .160) | ||
camelia | 0.967543 | ||
MasterDuke | well, you'd need to subtract master's startup time too, right? | ||
lizmat | indeed :-) | ||
jnthnwrthngtn | Oh, yeah, I subtracted the difference but that's probably not quite legit :) | ||
lizmat | but I'd say we're in the same ballpark now | 15:49 | |
jnthnwrthngtn | Yeah. Maybe sp_bindcomplete JITting unlocks a bit more | ||
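The test-t comparison above can be redone in one place. This Python sketch uses only the figures quoted in the log (total times in seconds and the approximate 120 ms and 160 ms startup times); the variable names are made up for the example. It subtracts each branch's own startup time before taking the ratio, as MasterDuke points out is needed.

```python
# Figures quoted in the log: test-t totals and approximate startup times.
master_total = 1.372
new_disp_total = 1.454
master_startup = 0.120
new_disp_startup = 0.160

# Raw ratio, startup included.
raw_ratio = master_total / new_disp_total

# Ratio after removing each branch's own startup cost.
adjusted_ratio = (master_total - master_startup) / (new_disp_total - new_disp_startup)

# Same comparison for the --race figures, startup included.
race_ratio = 0.634 / 0.766

print(f"raw: {raw_ratio:.6f}  adjusted: {adjusted_ratio:.6f}  race: {race_ratio:.6f}")
```

Since the startup figures are themselves approximate, the adjusted ratio is best read as a rough indication that the two branches are in the same ballpark.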
nine | So....where was I? | 15:55 | |
jnthnwrthngtn | On vacation, I think... :) | 15:56 | |
MasterDuke starts thinking of the "where in the world is carmen san diego" theme song | |||
jnthnwrthngtn | Maybe also trying to work out how I busted callframe.t | 15:57 | |
nine | Well, both are right :) | 15:58 | |
Geth | MoarVM/new-disp: 7a1a85de8e | (Jonathan Worthington)++ | 2 files JIT sp_bindcomplete |
15:59 | |
jnthnwrthngtn | lizmat: You'll only need to rebuild MoarVM to see if ^^ wins us any more | ||
MasterDuke | sp_runfunc_* are added in new-disp, right? they cause a ton of missing template messages in a spesh log | 16:00 | |
lizmat | jnthnwrthngtn: will in a mo | ||
nine | MasterDuke: yes | ||
MasterDuke | also sp_runbytecode_* | ||
16:01
sena_kun left
|
|||
MasterDuke | looks like there's a bunch of missing sp_* templates | 16:02 | |
jnthnwrthngtn: what about sp_assertparamcheck, can that also be jitted? | 16:03 | ||
jnthnwrthngtn | MasterDuke: Yes, but those will likely be a bit...interesting...to template JIT, though my hope is rather less interesting than the previous scheme. | ||
MasterDuke: It can, but is it still showing up regularly in JIT bail logs? | |||
Oh hmm, we still JIT assertparamcheck but that's useless | 16:04 | ||
MasterDuke | sp_assertparamcheck: 'trait_mod:<is>'(2) 'Bool' 'trait_mod:<does>'(2) 'signature' 'onlystar' 'soft' 'defined' | ||
jnthnwrthngtn | Because it's always replaced with sp_assertparamcheck now 'cus of the IC | ||
Let's see... | |||
MasterDuke | those counted from a spesh log of compiling CORE.c | 16:05 | |
lizmat | jnthnwrthngtn: no noticeable difference with test-t | 16:07 | |
MasterDuke | ugh, my raku one-liner(ish) to calculate those is not fast, 1m for a 1.4gb log file | 16:08 | |
Geth | MoarVM/new-disp: 728291fb1c | (Jonathan Worthington)++ | 3 files JIT sp_assertparamcheck We no longer ever have assertparamcheck in specialized code, so adapt that implementation in the lego JIT. There's still some figuring out needed for having IC-accessing things in the expression JIT (probably we want some convenience macros for that). |
16:11 | |
nine | Oh boy... I've found the actual guilty commit that broke callframe.t. And it's actually my very own: github.com/MoarVM/MoarVM/commit/1d...c5a872a353 | 16:12 | |
jnthnwrthngtn | MasterDuke++ # nudging me | 16:13 | |
In my latest CORE.c.setting compilation the total time was 48.7s; by contrast, master last I measured it was 50.4s | |||
nine | The case for f == tc->cur_frame shouldn't apply when we did not actually start traversing from tc->cur_frame. As is the case with MVMContext we got from nqp::ctx | 16:14 | |
MasterDuke | jnthnwrthngtn: btw, i just commented on 728291fb1c | ||
jnthnwrthngtn | nine: Ah, so it was busted and my enabling more inlining uncovered it? | ||
nine | jnthnwrthngtn: indeed | ||
Geth | MoarVM/new-disp: d5b6979400 | (Jonathan Worthington)++ | 3 files JIT sp_assertparamcheck We no longer ever have assertparamcheck in specialized code, so adapt that implementation in the lego JIT. There's still some figuring out needed for having IC-accessing things in the expression JIT (probably we want some convenience macros for that). |
16:15 | |
nine | Because we're now running MVMContext's existskey in an inlined block which just didn't happen before. And through my broken fix, the frame walker gets confused and walks out of that inline instead of to the caller. | 16:16 | |
jnthnwrthngtn | "fixed in post" ;P | ||
MasterDuke++ | |||
nine: oooh, existskey...hmm | |||
MasterDuke | heh, nice | ||
jnthnwrthngtn | I just wonder if that means fixing it will also fix require.t | ||
(The explanation of what's wrong with callframe.t makes it seem more likely) | 16:17 | ||
nine | So the "if (f == tc->cur_frame)" should read "if (f == tc->cur_frame && we_are_still_at_the_start_frame)". | 16:20 | |
16:20
brrt joined
|
|||
nine | jnthnwrthngtn: do you think this is a correct spelling of ^^^? gist.github.com/niner/f3b62798b717...d388c3765b | 16:23 | |
jnthnwrthngtn | nine: Only if nothing else looks at fw->started... | 16:25 | |
Which seems to be the case, so yeah, it seems alright | 16:26 | ||
nine | Bad news: require.t is still broken | 16:30 | |
jnthnwrthngtn | But is callframe.t fixed? | 16:33 | |
nine | jnthnwrthngtn: the golfed test case and callframe.t now pass | 16:36 | |
require.t fails with Lexical with name '&allgreet' does not exist in this frame | 16:37 | ||
16:40
[Coke] left
|
|||
nine | Fails quite reliably even in rr | 16:40 | |
jnthnwrthngtn | You've done better than me at getting some useful info out of it. | 16:41 | |
nine | in line 108 btw | 16:42 | |
jnthnwrthngtn | Text::CSV has really a lot of inclusive time in MVM_disp_program_run...that's interesting. | 16:43 | |
Geth | MoarVM: MasterDuke17++ created pull request #1545: Add _n cases to jitting some of the new ops |
16:44 | |
MoarVM/new-disp: bb8408ebc8 | (Stefan Seifert)++ | src/spesh/frame_walker.c Fix frame walker confused when traversing a saved context For the currently executing frame, the frame walker uses the program counter directly to get the current position (for determining whether we are in an inlined frame) rather than a deopt index as we might not be on an instruction carrying a deopt annotation. ... (11 more lines) |
16:53 | ||
nine | No idea if my commit message makes it even remotely clear what's going on... | 16:54 | |
groceries& | 16:55 | ||
17:05
brrt left
|
|||
jnthnwrthngtn | I think we're going to need a guard for object HLL. While in most cases guarding on the type suffices, and nicely eliminates in dispatch program translation, there's places it does not | 17:06 | |
The most immediate one being in raku-multi-plan, which sees a huge number of different types | 17:07 | ||
Geth | MoarVM/new-disp: 3deb4c150c | (Daniel Green)++ | src/jit/graph.c Add _n cases to jitting some of the new ops |
17:10 | |
MoarVM/new-disp: 9272183955 | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/jit/graph.c Merge pull request #1545 from MasterDuke17/add_num_cases_to_jitting_some_new_ops_on_new-disp Add _n cases to jitting some of the new ops |
|||
nine | jnthnwrthngtn: you mean while guarding on type covers object HLL indirectly, it's too specific and leads to unneeded deopts? | 17:17 | |
jnthnwrthngtn | nine: Yes, well, not deopts because the site is way too polymorphic for spesh to do anything other than leave it as an sp_dispatch, but we also fill the inline cache up with entries too | 17:18 | |
nine: It's not noticeable in microbenchmarks, but any decent size program will multi-dispatch over quite a lot of different types | 17:19 | ||
Text::CSV being an example of something that just blows the budget even in a simple use of it. | |||
I'm quite relieved this is the only situation where it does so. :) | 17:20 | ||
(In that particular example, anyway) | 17:21 | ||
Started implementing it, but think it's time for some rest and food. | 17:22 | ||
nine | Yeah, food sounds more and more tempting :) | ||
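The inline cache pressure jnthnwrthngtn describes can be pictured with a small model. This is a hedged Python sketch, not how MoarVM stores dispatch programs: `cache_entries`, the dictionary standing in for an inline cache, and the sample types are all hypothetical. It shows why guarding on type at a site that only cares about HLL yields one entry per type, while a guard on HLL yields one entry per HLL.

```python
def cache_entries(arguments, guard_key):
    """Build one cache entry per distinct guard key seen at a callsite."""
    entries = {}
    for arg in arguments:
        key = guard_key(arg)
        entries.setdefault(key, f"dispatch program guarded on {key}")
    return entries


# Hypothetical arguments seen at one site: many types, all from one HLL.
arguments = [
    {"type": type_name, "hll": "Raku"}
    for type_name in ("Int", "Str", "Array", "Hash", "Rat")
]

by_type = cache_entries(arguments, lambda arg: arg["type"])
by_hll = cache_entries(arguments, lambda arg: arg["hll"])

type_guard_entries = len(by_type)
hll_guard_entries = len(by_hll)
```

With many more types in a real program, the type-guarded site keeps growing until the cache is full, whereas the HLL-guarded one stays at a single entry in this model.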
17:51
rba left,
rba joined
17:53
nine left
17:54
nine joined
18:02
reportable6 left
18:03
reportable6 joined
|
|||
MasterDuke | huh. i have a profile with 800k calls to .contains, and it's showing as yellow. a spesh log of the same code i profiled has the 'after' of contains with only three BBs and no jit bails (template or otherwise). why would it be yellow in the profile? | 18:04 | |
and the spesh log says "JIT was successful and compilation took 106us" | |||
timo | MasterDuke: deopts? | 18:25 | |
MasterDuke | 700 of them | 18:52 | |
but all in the mainline, not .contains | 18:53 | ||
18:57
brrt joined
|
|||
Nicholas | do we have (reliable) spectest failures currently on new-disp? I thought that t/spec/S17-scheduler/every.t failed for me (this is a new one) but I can't repeat it, and I just did a clean spectest run | 18:57 | |
good *, brrt | |||
lizmat runs a spectest on new-disp | 18:58 | ||
MasterDuke | nine and jnthnwrthngtn were talking about a failure in require.t, but maybe it requires some env variables to be set | ||
nine | MVM_SPESH_BLOCKING=1 MVM_SPESH_NODELAY=1 ./rakudo-m -Ilib t/spec/S11-modules/require.t | 18:59 | |
Nicholas: I see many sleeps in every.t. That test is clearly racey. | 19:00 | ||
Nicholas | :-( (and thanks for doing my homework for me) | 19:01 | |
brrt | good * Nicholas, lizmat, MasterDuke, nine | 19:06 | |
lizmat | brrt o/ | ||
Nicholas: spectest is clean for me on MacOS | |||
nine | \o | ||
Nicholas | it cmopiles, ship it! | ||
lizmat | dyslexis untie! | 19:08 | |
Nicholas | :-) | ||
timo | MasterDuke: i'm actually not 100% sure if the deopt would be counted in the mainline or in .contains in this case | 19:13 | |
MasterDuke | i wonder if the fact that i'm using -n has anything to do with it | 19:14 | |
gist.github.com/MasterDuke17/72433...acb022d71c is what i'm running | 19:17 | ||
timo | do you know -MSIL? | 19:19 | |
MasterDuke | 55% specialized, 44% jitted | ||
i'd forgotten about it until lizmat was mentioning it recently (my bash aliases predate it) | 19:20 | ||
dogbert17 | any c experts around | ||
lizmat | I'd consider Nicholas one | ||
dogbert17 | if valgrind complains if MoarVM is compiled with --no-optimize but not otherwise, is that a problem which should be looked into? | 19:21 | |
==702950== Thread 2 spesh optimizer: | 19:22 | ||
==702950== Conditional jump or move depends on uninitialised value(s) | |||
==702950== at 0x4BAAF89: optimize_bb_switch (optimize.c:2299) | |||
this is from running `./rakudo-valgrind-m -e 'role PDF { }'` | 19:24 | ||
timo | personally, i don't care :) | 19:26 | |
dogbert17 | but can't you run rr :) | ||
nine | jnthnwrthngtn: the require.t failure is definitely another MVMContext frame walker inlining issue | 19:27 | |
Actually it very much reminds me of my struggles to get Backtrace working reliably with MVM_SPESH_NODELAY. It again seems to be ye olde problem of traversing through a call stack when one of the frames is still active. | 19:36 | ||
&REQUIRE_IMPORT gets its caller's lexpad via `my $block := CALLER::MY::`, moves on into an inlined frame which then tries to access that lexpad. In bind_key, we try to traverse from the MVMContext 2 callers out to the lexpad. But one of the frames is &REQUIRE_IMPORT and its return_offset has changed since we created the MVMContext. | 19:40 | ||
19:47
gabriel80546 joined
|
|||
gabriel80546 | Have you ever wanted to run raku on your phone? well that is totally possible. | 19:48 | |
the great guy named Max Kapusta on stackoverflow has figured that out. | |||
all you need is to use the app UserLand and install rakudo with `sudo apt install rakudo` | |||
stackoverflow.com/questions/690910...1#69247911 | |||
timo | it'll be kinda slow without the full jit since we don't have that for arm yet | 19:53 | |
20:00
brrt left
|
|||
nine | Actually I think my assessment is only some 90 % correct because there is this caller_deopt_idx thing in the frame extra that's supposed to cover this case | 20:08 | |
But then I should be heading for bed anyway... | |||
20:14
gabriel80546 left
20:27
linkable6 joined
|
|||
jnthnwrthngtn | nine: One problem I ran into when implementing sp_resumption inlining support was needing to make absolutely sure the sp_dispatch or sp_runbytecode that followed got a deopt index reliably, so that the position info was up to date if you started walking on the stack top. Dunno if this is a case of that too. | 20:28 | |
timo | speaking of deopt idxes, just the other day i saw a specialization that started with like 20 "set" instructions that had deopt annotations on them because they used to be guards | 20:30 | |
jnthnwrthngtn | The thing that removes deopt usages doesn't go by the annotation only, but by the op properties too | 20:34 | |
And set isn't marked "may cause deopt" | |||
So having the annotation left shouldn't be an issue (it'd also cost some cycles to find/remove, plus I like retaining them for debugging purposes anyway) | 20:35 | ||
timo | ah, wonderful | 20:37 | |
i thought it might stick around in the table anyway, but it probably just gets skipped when we do code-gen or so | 20:39 | ||
MasterDuke | huh. i'm only getting a single MVM_CALLSTACK_RECORD_START while walking the callstack in my cleanup function called from MVM_callstack_destroy | 20:47 | |
jnthnwrthngtn | MasterDuke: Hmm...that'd imply that they're not leaked because of missing exit-time cleanup, but just leaked in "normal" operation, which is certainly ungood | 21:03 | |
MasterDuke | oh, and this is new (i think), but another valgrind run just pointed out github.com/MoarVM/MoarVM/blob/new-...am.c#L2321 as leaked also | 21:05 | |
and some other vectors from that compile_state | 21:06 | ||
jnthnwrthngtn | I'd imagine it's all the same underlying issue: the dispatch program isn't being destroyed, so all that hangs off the MVMDispProgram will be leaked too. | 21:07 | |
MasterDuke | yep | 21:09 | |
unfortunately my `git grep` foo isn't good enough to find things that aren't there, but should be | 21:10 | ||
jnthnwrthngtn | Hmm...all the callstack unwinding looks correct though. Alas, so does the MVMStaticFrame cleanup | 21:11 | |
Though another pair of eyes can't hurt; look around where MVM_disp_inline_cache_transition is called, and produced_dp is handled for those that fail to be installed during IC transition | 21:14 | ||
MasterDuke | k | ||
jnthnwrthngtn | That's the two paths: it gets installed, or it gets stored in the IC which hangs off the STable | ||
Grep for produced_dp in callstack.c for the matching piece | 21:16 | ||
afk for a bit | 21:18 | ||
MasterDuke | ah ha! found it | 21:19 | |
Geth | MoarVM: MasterDuke17++ created pull request #1546: Correctly clean up disp programs in cleanup_entry |
21:35 | |
21:39
[Coke] joined
|
|||
[Coke] | . | 21:39 | |
MasterDuke | and with one other un-related change, valgrind now reports no leaks for `raku --full-cleanup -e ''` | 21:40 | |
Geth | MoarVM: MasterDuke17++ created pull request #1547: Clean the hash of syscalls during vm cleanup |
21:45 | |
MoarVM/new-disp: c79043e622 | (Daniel Green)++ | src/disp/inline_cache.c Correctly clean up disp programs in cleanup_entry Destroying the dispatch programs in inline cache entries when destroying static frames was introduced in 6b50be1f8a7c9c81fbd85e7fe44467405d1979c6, but there was a typo in the case of polymorphic entries. |
22:10 | ||
MoarVM/new-disp: 5e18f3c40b | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/disp/inline_cache.c Merge pull request #1546 from MasterDuke17/fix_disp_programs_not_getting_cleaned_from_polymorphic_inline_cache_entries_on_new-disp Correctly clean up disp programs in cleanup_entry Correctly clean up disp programs in cleanup_entry |
|||
jnthnwrthngtn | The keys are like right next to each other :D | ||
timo | oooooh | ||
on the german keyboard, < and > are on the same key, even | |||
Geth | MoarVM/new-disp: 405409aefe | (Daniel Green)++ | src/moar.c Clean the hash of syscalls during vm cleanup Otherwise valgrind will report a leak for `raku --full-cleanup -e ''`. |
||
MoarVM/new-disp: 66f6d6f62b | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/moar.c Merge pull request #1547 from MasterDuke17/clean_the_hash_of_syscalls_during_vm_cleanup_on_new-disp Clean the hash of syscalls during vm cleanup |
|||
MasterDuke | oh really? ha! that must make programming fun | 22:15 | |
timo | well, it's even worse, {} are on shift 7 and 0 whereas [] are on shift 8 and 9 | 22:17 | |
no, omg i'm totally wrong | |||
that's not shift, that's altgr | |||
shift 8 and 9 give you () and shift 7 and 0 give you / and = respectively | 22:18 | ||
i haven't coded with de layout for ages | 22:25 | ||
22:53
linkable6 left,
evalable6 left
22:55
evalable6 joined
22:57
linkable6 joined
|
|||
Geth | MoarVM/new-disp: 52477ea3df | (Jonathan Worthington)++ | 11 files Add support for HLL guards; use in lang-hllize Some callsites become polymorphic or even megamorphic in the case that they only care about HLL, but we enforce it with a type guard. The Raku multi dispatch planner ran into this, meaning that in any non-trivial program that dispatches over many types, we'd end up with a full inline cache site with all the costs of that. Introduce guards on HLL in order that we can avoid this situation. |
23:51 | |
jnthnwrthngtn | That seems to bring a minor startup improvement too | 23:52 | |
'night |