Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
00:00 linkable6 left, evalable6 left, evalable6 joined 00:01 linkable6 joined 00:02 reportable6 left
japhb MasterDuke, nine: No --full-cleanup involved with panic. While trying to golf the panic, I found this interesting case: 00:19
$ raku -e 'use MONKEY-SEE-NO-EVAL; EVAL buf8.allocate(100_000).raku'
Bytecode validation error at offset 52, instruction 10: 00:20
callsite expects 257 more positionals in block <unit> at -e line 1
Still trying to find what I need to golf the panic.
timo we might be using an 8bit number somewhere still 00:27
japhb timo: If I drop a digit in the allocate number (to 10_000) it succeeds. So maybe a 16-bit? Checking "interesting" numbers .... 00:29
Oooh, 65_535 gives the same error, but 65_534 gives: 00:30
Bytecode validation error at offset 524374, instruction 65551:
operand type 64 does not match register type 56 for op wval in frame <unit>
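A minimal sketch of the general hazard timo and japhb are guessing at here: a count that passes through a field narrower than the value it holds. The helper names are hypothetical and this is not MoarVM code. It only shows why 65_535 is an "interesting" boundary for a 16-bit field; it does not claim to explain the specific "257 more positionals" figure.

```c
#include <stdint.h>

/* Hypothetical helper: store an argument count in a 16-bit field, as a
 * format with a 16-bit count might. Illustration only, not MoarVM code. */
static uint16_t store_count16(uint32_t count) {
    return (uint16_t)count;   /* keeps only the low 16 bits */
}

/* Hypothetical check: does the count survive the narrowing unchanged? */
static int fits_in_16_bits(uint32_t count) {
    return count <= UINT16_MAX;
}
```

With these, store_count16(100000) yields 34464 and store_count16(65536) yields 0, while 65535 is the largest value that round-trips. A check like fits_in_16_bits before narrowing would turn the silent wrap into an explicit error.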
01:04 reportable6 joined 02:04 linkable6 left, quotable6 left, bloatable6 left, nativecallable6 left, statisfiable6 left, greppable6 left, squashable6 left, sourceable6 left, reportable6 left, shareable6 left, committable6 left, coverable6 left, bisectable6 left, evalable6 left, benchable6 left, releasable6 left, unicodable6 left, notable6 left, tellable6 left, shareable6 joined, nativecallable6 joined, squashable6 joined, benchable6 joined 02:05 quotable6 joined, reportable6 joined 02:06 bloatable6 joined
japhb Alright, best I've managed so far in reducing the panic: 02:21
raku -e 'use MONKEY-SEE-NO-EVAL; use YAMLish; my $buf = buf8.allocate(100_000); my $raku = $buf.raku; my $yaml = save-yaml $buf; for ^5 { say $_; try EVAL $raku; load-yaml $yaml }'
0
1
MoarVM panic: Tried to garbage-collect a locked mutex
On my system, hoisting either the EVAL or the load-yaml call out of the loop means it survives. 02:22
03:05 coverable6 joined 03:06 statisfiable6 joined, committable6 joined, evalable6 joined, linkable6 joined 03:07 releasable6 joined, greppable6 joined 03:38 frost-lab joined 04:06 notable6 joined, unicodable6 joined 04:16 frost-lab left 04:39 Guest92 joined 04:41 Guest92 left 05:05 sourceable6 joined 05:06 tellable6 joined 05:19 Guest92 joined 05:22 Guest92 left, frost-lab joined 06:02 reportable6 left 06:31 discord-raku-bot left, linkable6 left 06:32 discord-raku-bot joined 07:03 reportable6 joined 07:06 bisectable6 joined 07:08 frost-lab left 07:32 linkable6 joined 08:25 linkable6 left 08:26 linkable6 joined
lizmat 🏳️‍🌈 08:26
oops :-)
it's just an example of a .chars == 1 and .codes == 4 :-) 08:27
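A small sketch of the code point side of that remark, assuming the pasted character is the rainbow flag sequence U+1F3F3 U+FE0F U+200D U+1F308. In Raku, .chars counts graphemes and .codes counts code points. The C below only counts code points and bytes; grapheme segmentation needs Unicode tables and is left out. The function and constant names are made up for this illustration.

```c
#include <stddef.h>

/* Count Unicode code points in a NUL-terminated UTF-8 string by counting
 * every byte that is not a continuation byte (10xxxxxx).
 * Assumes the input is valid UTF-8. */
static size_t utf8_codepoints(const char *s) {
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* U+1F3F3 U+FE0F U+200D U+1F308 written out as UTF-8 bytes. */
static const char RAINBOW_FLAG[] =
    "\xF0\x9F\x8F\xB3" "\xEF\xB8\x8F" "\xE2\x80\x8D" "\xF0\x9F\x8C\x88";
```

utf8_codepoints(RAINBOW_FLAG) gives 4 for a string of 14 bytes, which a grapheme-aware count would report as a single character.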
MasterDuke gist.github.com/MasterDuke17/434a6...d3edaea42f a backtrace from that `MoarVM panic: Tried to garbage-collect a locked mutex` error 08:30
jnthnwrthngtn moarning o/ 09:08
MasterDuke: The backtrace won't be terribly useful, alas, 'cus it'll be in the GC, which can be triggered at any time. What may be useful is trying to work out where the mutex was allocated, which may be possible-ish with rr 09:12
The actual MVMObject isn't useful in itself 'cus it moves every GC, but iirc it points to a piece of malloc'd memory to hold the uv_mutex and you could perhaps do a reverse watch on that 09:13
MasterDuke the ->body.mutex i assume? 09:14
Nicholas \o 09:15
jnthnwrthngtn Yes 09:17
Not the pointer, but some memory location inside of it
dogbert17 Hah, what a fail, tried to 'zef install YAMLish' on master and was met with 'labeled next without loop construct' 09:38
no matter, found a workaround
dogbert17 drinks some coffee 09:39
lizmat dogbert17: you probably need to upgrade zef 09:44
hmmm... I thought zef had worked around 09:45
in any case, there's a PR (in MoarVM) by nine that should fix this
dogbert17 lizmat: thx, I got around the problem by setting MVM_SPESH_DISABLE=1 during the install 09:48
lizmat jnthnwrthngtn: perhaps we could merge nine's PR and bump MoarVM? 09:49
10:00 MasterDuke left
Geth MoarVM: a932b1732c | (Stefan Seifert)++ | 2 files
Fix spesh optimizing away still needed label register

Objects representing loop labels are kept in a register and may be used by loop handlers (like next LABEL). Spesh did not take this relationship into account, just saw a register that was written to, but not otherwise used and optimized the writers of this register away. Fix by giving a handler's label_reg the same treatment as block_reg.
Fixes Rakudo issue #4456
10:01
MoarVM: 860cc65508 | (Jonathan Worthington)++ (committed using GitHub Web editor) | 2 files
Merge pull request #1522 from MoarVM/fix_spesh_losing_label_reg

Fix spesh optimizing away still needed label register
10:03 patrickb joined
lizmat any other stuff I should wait for before bumping MoarVM? 10:04
10:05 MasterDuke joined
jnthnwrthngtn argh, that's annoying... So I got spesh linking working again. Turns out that since it emits into temporaries while forming the dispatch program and then releases them, and then runbytecode also does this and uses the temps within the lifetime of the other temps, they interfere and then boom segv 10:11
Guess they need delayed release or some such 10:12
Geth MoarVM/new-disp: 5c12d74508 | (Jonathan Worthington)++ | 3 files
Reinstate spesh linking

That is, where possible, determine the candidate that we are going to be invoking, and identify it directly, so we don't have to run through the spesh arg guard.
10:31
jnthnwrthngtn Curiously this has only so much effect on CORE.setting build time, but is rather more noticeable in `make test` 10:32
10:36 AlexDaniel left, psydroid left 10:39 AlexDaniel joined
Geth MoarVM/new-disp: fe2fe669fe | (Jonathan Worthington)++ | src/disp/inline_cache.c
Add a way to dump full inline cache backtraces
10:41
10:41 psydroid joined
Nicholas jnthnwrthngtn: ASAN now reports a leak: paste.scsys.co.uk/595621 10:47
quite a few of those
different backtraces, but all end up in translate_dispatch_program src/spesh/disp.c 10:48
lizmat dogbert17: MoarVM bumped, zef should work again on master 10:53
11:07 sena_kun joined 11:13 patrickb left 11:14 patrickb joined 11:15 patrickb left 11:16 patrickb joined 11:17 patrickb left 11:18 patrickb joined
lizmat jnthnwrthngtn: Q, will INDIRECT_NAME_LOOKUP exist in its current form in new-disp ? 11:18
11:20 patrickb left, patrickb joined 11:22 patrickb left, patrickb joined
jnthnwrthngtn lizmat: It's not something I've looked at during new-disp, and it hasn't shown up in any test failures 11:22
lizmat: So I suspect it doesn't (need to) change
lizmat ok, then I'll spend some time optimizing it :-) 11:23
jnthnwrthngtn Intuitively I don't expect it to need to
11:26 patrickb left, patrickb joined 11:28 patrickb left, patrickb joined 11:30 patrickb left, patrickb joined 11:32 patrickb left, patrickb joined 11:34 patrickb left, patrickb joined 11:36 patrickb left, patrickb joined, patrickb left 12:02 reportable6 left 12:04 reportable6 joined 12:18 patrickb joined
MasterDuke jnthnwrthngtn, nine: does the new rr backtrace in gist.github.com/MasterDuke17/434a6...d3edaea42f look normal/ok? 12:19
rr does not like if i try to watch inside the `rm->body.mutex` 12:20
timo MasterDuke: maybe try casting it to (MVMuint64*) so it doesn't need to watch an entire big memory area 12:26
MasterDuke oh...that would explain why it complains about too many breakpoints 12:27
at first i thought it was something about having installed a new kernel or two and maybe i needed to reboot
but re the backtrace i wasn't sure if we could deserialize an MVMReentrantMutex 12:29
gist updated 12:34
timo let's see if i have the opportunity to look closely when i return from errandications 12:50
Nicholas your cat will have other plans for you? 12:52
jnthnwrthngtn is back from errands 13:06
grmbl, I see broken spectests
How'd I manage that
Nicholas insufficient tea? 13:07
jnthnwrthngtn It's a bit warm in here for tea at the moment
(The air conditioner is working at this problem, however.) 13:08
Uff. I've no idea how the change I've done causes the problem I see...
Nicholas what is this concept of "too warm for tea?" Have you gone native? :-)
jnthnwrthngtn Well, I did just collect a letter telling me that my permanent residence permit is ready for collection... :) 13:09
Nicholas woohoo
jnthnwrthngtn Of course, the migration office is the other side of the city 13:10
Nicholas That sounds like the feed line for a good pun about migration. I miss TimToady 13:11
jnthnwrthngtn OK, this bug is very confusing 13:22
13:30 sena_kun left
Geth MoarVM/new-disp: 5c38f5f6f3 | (Jonathan Worthington)++ | 15 files
Eliminate legacy dispatcher ops

Which are no longer used in Rakudo.
14:43
dogbert17 m: use MONKEY-SEE-NO-EVAL; EVAL buf8.allocate(17_000).raku 15:07
camelia ( no output )
dogbert17 sigh
on new-disp I get: 15:08
MoarVM oops: Oversize callstack flattening record requested (wanted 153104, maximum 131040)
at gen/moar/Metamodel.nqp:2371 (/home/dogbert/repos/rakudo/blib/Perl6/Metamodel.moarvm:)
timo too many arguments, yeah :( 15:09
dogbert17 it does fail on master as well, see japhb's comments from the night
m: say 1024 * 128
camelia 131072
nine 131040 arguments ought to be enough... 15:12
dogbert17 I'm inclined to agree 15:15
jnthnwrthngtn Is the EVAL actually needed?
Sounds like somewhere is doing arg flattening that really shouldn't be
Even if it works, it's horribly inefficient 15:16
Geth MoarVM/new-disp: e4c801f1d0 | (Jonathan Worthington)++ | src/spesh/graph.c
Prepare spesh graph builder for inlining

It needs to be able to create graphs from instructions containing specialized dispatch-related bytecodes.
timo exciting! 15:17
Geth MoarVM/new-disp: ad20e837f9 | (Jonathan Worthington)++ | 2 files
First steps towards reinstating inlining

Try to build an inline graph, and add back the logging of whether we could inline if we actually tried to do so. Don't actually inline for now, however. Going this far does not seem to cause any regressions.
15:22
MasterDuke m: say buf8.new(|(^100_000)) 15:24
camelia Too many arguments (100001) in flattening array, only 65535 allowed.
in block <unit> at <tmp> line 1
MasterDuke i think the EVAL is required to get around that error
m: use MONKEY-SEE-NO-EVAL; EVAL "say buf8.new({(^66_000).join(q|,|)})" 15:27
camelia Bytecode validation error at offset 52, instruction 10:
callsite expects 257 more positionals
in block <unit> at <tmp> line 1
MasterDuke that's the other one japhb found when golfing the mutex panic 15:28
and here's gist.github.com/MasterDuke17/3bd1f...2480f6327a some debugging info 15:32
Nicholas jnthnwrthngtn: you turned off MVM_HASH_RANDOMIZE in commit 5c38f5f6f3fa904a10985ec93ba2caa091d506e6 15:43
nine I know where that GC issue may come from
Aaaand: it's coming from the other problem 15:44
15:44 patrickb left
MasterDuke the panic is coming from the bytecode validation? 15:44
*mutex panic
nine I.e. the Bytecode validation error happens during prepare_and_verify_static_frame called from instrumentation_level_barrier, i.e. while we're holding that mutex. So we jump out of that instrumentation_level_barrier without ever unlocking 15:45
Luckily we have a mechanism for "please unlock this mutex in case an exception gets thrown" 15:50
Alas, gotta go now
MasterDuke oh, what mechanism is that? 15:51
Geth MoarVM/new-disp: 3ff82a7004 | (Jonathan Worthington)++ | src/spesh/graph.c
Add more basic block boundary ops
16:05
MoarVM/new-disp: c66c96cfbc | (Jonathan Worthington)++ | src/moar.h
Reinstate hash randomization

Disabled to get stable spesh logs, and accidentally committed.
MoarVM/new-disp: e3c36fc015 | (Jonathan Worthington)++ | 2 files
First pass at porting the inlining algorithm

This is the minimal set of changes that would in theory be required for it to work.
jnthnwrthngtn Nicholas: oops, thanks
MasterDuke `Unhandled exception: Internal error: multiple ex_release_mutex` doh 16:20
hm. MVM_load_bytecode does a `MVM_tc_set_ex_release_mutex` before we reach `prepare_and_verify_static_frame` 16:24
make it an array of mutexes? 16:26
in prepare_and_verify_static_frame unlock the one that's already held, set the new one, then re-set the old one if an exception hasn't been thrown? 16:29
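A toy sketch of the pattern under discussion: a per-thread slot naming one mutex to unlock if an exception unwinds past the code holding it. Every name here is hypothetical, the "mutex" is a plain flag, and setjmp/longjmp stands in for exception unwinding. It is not how MoarVM implements MVM_tc_set_ex_release_mutex. It only shows why a single slot refuses a second registration, which is the situation behind "multiple ex_release_mutex".

```c
#include <setjmp.h>
#include <stddef.h>

/* Toy stand-ins; none of these names are MoarVM's. */
typedef struct { int locked; } ToyMutex;

typedef struct {
    jmp_buf   handler;           /* where a "throw" unwinds to */
    ToyMutex *release_on_throw;  /* single slot: at most one mutex */
} ToyThread;

static void toy_thread_init(ToyThread *t) { t->release_on_throw = NULL; }
static void toy_lock(ToyMutex *m)         { m->locked = 1; }
static void toy_unlock(ToyMutex *m)       { m->locked = 0; }

/* Register the mutex to unlock on throw. Returns -1 if the slot is
 * already occupied, since one slot cannot track two mutexes. */
static int toy_set_release(ToyThread *t, ToyMutex *m) {
    if (t->release_on_throw)
        return -1;
    t->release_on_throw = m;
    return 0;
}

static void toy_clear_release(ToyThread *t) { t->release_on_throw = NULL; }

/* "Throw": unlock the registered mutex, if any, then unwind. */
static void toy_throw(ToyThread *t) {
    if (t->release_on_throw) {
        toy_unlock(t->release_on_throw);
        t->release_on_throw = NULL;
    }
    longjmp(t->handler, 1);
}

/* A critical section that fails midway. Returns 1 if the throw was caught. */
static int toy_run_failing_section(ToyThread *t, ToyMutex *m) {
    if (setjmp(t->handler))
        return 1;                /* arrived here via toy_throw */
    toy_lock(m);
    toy_set_release(t, m);
    toy_throw(t);                /* simulated validation failure */
    toy_clear_release(t);        /* normal path, not reached here */
    toy_unlock(m);
    return 0;
}
```

After toy_run_failing_section returns 1, the mutex is unlocked even though the normal unlock was skipped. Calling toy_set_release twice without clearing returns -1 the second time. An array of slots or a save-and-restore of the outer mutex are the two ways around that limit floated above, each with its own cost.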
jnthnwrthngtn So inlining reenabled will get through the NQP build and fail one test; the Rakudo build fails the same test. Which...OK let's debug, but WHY does the frame the broken inlining happens in have to be EXPR (the expression parser)? :/ 16:31
Sure I'll go hunt this in 153 basic blocks :P 16:32
timo whew.
maybe it's a phi instruction with more than 65k arguments
jnthnwrthngtn :P 16:33
The upper limit of the size of that is surely the number of basic blocks? 16:34
Oh, it's a deopt bug 16:35
What on earth is it doing
I'm suspecting its table of inline extents is rather off 16:36
Nicholas ASAN or valgrind have hints?
jnthnwrthngtn Nicholas: Didn't check, but feels very unlikely; it's almost certainly an off-by-something
It thinks that it's doing uninlining 16:37
Or rather, that it should
But it really shouldn't
I guess something about the way calls look now means that end of inline marker is mis-placed or similar
Or start 16:38
MasterDuke huh. gist.github.com/MasterDuke17/e74be...8ce02e4547 doesn't fix the panic. even if that patch had some other bad behaviors, i thought it would at least prevent the panic 16:41
16:44 linkable6 left, evalable6 left
timo well, i can't wait to read the blog post at the end of this :) 16:45
16:46 linkable6 joined
jnthnwrthngtn It's a bit odd. I can change the <= to < or make inline end annotations move in the other direction to fix the deopt bug...but doing either of those gets me a different wrong deopt 16:48
Nicholas timo: Well, you never know. it might be "It's faster. Enjoy" 16:54
jnthnwrthngtn I'm now at "why does this work on master" :) 16:58
We set the end offset of an inline as the location after writing that instruction 16:59
So the <= is wrong, since it would include the instruction after the inlined one. 17:00
That'd also suggest we want to move it back, not forward
Since it should be the last inlined instruction.
oh hang on what
We're writing it as a pre instruction 17:01
I misread
That implies it has to be on the instruction after the deopt always 17:02
Doing that gets me another issue that...doesn't look like a deopt one, just bytecode corruption. Odd. 17:30
ah, or maybe it is 17:32
time for a break
18:02 reportable6 left
nine MasterDuke: are you sure it's complaining about the same mutex though? 18:35
MasterDuke no
nine MasterDuke: with that patch we now no longer unlock the tmp_ex_release_mutex, so maybe that's what gets collected while still locked 18:36
MasterDuke oh. right. hm
seems like maybe it does have to be an array of ex_release_mutexes? 18:37
nine That would slow things down though 18:40
MasterDuke yeah. my only other idea is passing the mutex down through all the function calls and unlocking it in fail() 18:41
18:45 evalable6 joined 20:03 reportable6 joined
Geth MoarVM/new-disp: 18f6e3b0eb | (Jonathan Worthington)++ | src/spesh/deopt.c
Account for deopt all vs. deopt on difference

When we are doing a deopt all, the bytecode position is the op we will return to. Thus we need to account for being one past the end of the inline in that case (inclusive), but to *not* do it for a failed guard that is immediately following a deopt, otherwise we'll end up wrongly uninlining an inline we were not in. (This may need a further look, as ... (5 more lines)
20:10
MoarVM/new-disp: 78d1cdf687 | (Jonathan Worthington)++ | src/spesh/inline.c
Correct insertion of inline end marker
MoarVM/new-disp: 42dbab6e00 | (Jonathan Worthington)++ | src/spesh/optimize.c
Switch inlining back on

For now without optimizing the inlinee with knowledge of the surrounding context, and also without creating inlines of candidates that were not yet specialized.
20:13
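A minimal sketch of the boundary rule that the message of commit 18f6e3b0eb describes, under stated assumptions. The function name, the parameters and the inclusive lower bound are hypothetical, and the real check in src/spesh/deopt.c may differ. The one point taken from the commit message is that the end of the inline is treated as inclusive for a deopt all and exclusive for a failed guard.

```c
#include <stdint.h>

/* Hypothetical check: is the deopt offset inside the inline whose bytecode
 * spans [start, end]? For a deopt all, the offset is the op we will return
 * to, so one past the last inlined instruction still counts (end inclusive).
 * For a failed guard, an offset equal to end is already outside the inline. */
static int offset_in_inline(uint32_t offset, uint32_t start, uint32_t end,
                            int is_deopt_all) {
    if (offset < start)
        return 0;
    return is_deopt_all ? offset <= end : offset < end;
}
```

With start 10 and end 20, an offset of 20 counts as inside the inline for a deopt all but outside it for a failed guard, which matches the "wrongly uninlining an inline we were not in" case the commit message warns about.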
jnthnwrthngtn So the way to debug inlining deopt bugs is apparently to order and eat a "garlic chicken" curry where they put WHOLE CLOVES of garlic - lots of them - into an already very garlicy sauce :) 20:16
I didn't try it with stressing yet, but this causes no regressions
Note, however, that it's inlining very little Raku code so far 20:17
Or even spesh linking
'cus I didn't yet teach it about dispatch programs with resumption state
nine If that's the way to debug inlining deopt bugs, I do hope there's a couple of those bugs left for me :D
timo i see a bunch of "inlining prevented by the op callercode" in a random piece of the nqp build speshlogged; isn't that an op that we can reasonably do right with inlining? 20:21
Nicholas I'm getting a massive DU backtrace when I run time perl Configure.pl --backends=moar --prefix=/home/nick/Sandpit/moar-SAN 20:25
timo someone can implement sp_guardnonzero for the jit 21:06
Geth MoarVM/new-disp: 90ddd1f768 | (Jonathan Worthington)++ | src/spesh/optimize.c
Re-enable post-inline optimizations
21:20
timo ohmygosh 21:21
jnthnwrthngtn I ran tests while I was afk, no regressions, so guess it's close
Nicholas: Congrats or something..you have the DU checking turned on, I guess? 21:22
Nicholas I had, but I turned them off and I still see it
So I'm confused
jnthnwrthngtn huh
Nicholas that's what I thought
jnthnwrthngtn "MoarVM oops in spesh thread: Malformed DU chain: reading goto of 4(2) not in graph" :D 21:25
timo whoops, goto
jnthnwrthngtn Yeah, I can guess what silly I've done
timo changed the info of an ins without freeing usages? 21:26
jnthnwrthngtn Yeah
Geth MoarVM/new-disp: e212bfcd54 | (Jonathan Worthington)++ | src/spesh/inline.c
Remove runbytecode usages of args when inlining
21:28
jnthnwrthngtn I like how much code went away when updating inlining
The main thing left to re-instate is the JIT 21:29
Which is probably doing rather little at present
Since none(runbytecode, runcfunc, sp_dispatch) are JITted so far 21:30
Even in lego, let alone expr
Geth MoarVM/new-disp: 84e75a6830 | (Jonathan Worthington)++ | 2 files
Remove properties from deprecated ops
21:36
[Coke] will we ever be able to remove the deprecated ones, or will those gaps always be there? 22:06
(And it shouldn't impact anything, right?)
timo we can fill them back up 22:08
with new ops as they come along, if they do 22:09
22:18 linkable6 left, evalable6 left 22:21 evalable6 joined
Geth MoarVM/new-disp: d48f0d9348 | (Jonathan Worthington)++ | 5 files
Start to re-work unspecialized inlining

This is a translation of what's needed to reinstate this on new-disp. However, something isn't quite right yet; enabling it leads to an NQP build failure.
22:24
jnthnwrthngtn I'm too tired to figure this breakage out today 22:30
timo good work today jnthnwrthngtn 22:45
23:20 linkable6 joined