Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021. |
|||
00:00
linkable6 left,
evalable6 left,
evalable6 joined
00:01
linkable6 joined
00:02
reportable6 left
|
|||
japhb | MasterDuke, nine: No --full-cleanup involved with panic. While trying to golf the panic, I found this interesting case: | 00:19 | |
$ raku -e 'use MONKEY-SEE-NO-EVAL; EVAL buf8.allocate(100_000).raku' | |||
Bytecode validation error at offset 52, instruction 10: | 00:20 | ||
callsite expects 257 more positionals in block <unit> at -e line 1 | |||
Still trying to find what I need to golf the panic. | |||
timo | we might be using an 8bit number somewhere still | 00:27 | |
japhb | timo: If I drop a digit in the allocate number (to 10_000) it succeeds. So maybe a 16-bit? Checking "interesting" numbers .... | 00:29 | |
Oooh, 65_535 gives the same error, but 65_534 gives: | 00:30 | ||
Bytecode validation error at offset 524374, instruction 65551: | |||
operand type 64 does not match register type 56 for op wval in frame <unit> | |||
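The probing around 65_535 above is what you would expect if a count passes through a 16-bit unsigned field somewhere. A minimal Python sketch of that reasoning follows. It is not MoarVM code, the helper names are made up for illustration, and it does not attempt to explain the specific "257 more positionals" figure.

```python
def max_unsigned(bits: int) -> int:
    """Largest value an unsigned field of `bits` bits can hold."""
    return (1 << bits) - 1


def fits(value: int, bits: int) -> bool:
    """True if a non-negative count can be stored without loss."""
    return 0 <= value <= max_unsigned(bits)


def store_unsigned(value: int, bits: int) -> int:
    """Model C-style truncation when a count is stored into a narrow field."""
    return value & max_unsigned(bits)


if __name__ == "__main__":
    for count in (10_000, 65_535, 100_000):
        print(count, fits(count, 8), fits(count, 16), store_unsigned(count, 16))
```

Under this model 10_000 fits in 16 bits, 65_535 is exactly the largest 16-bit value, and 100_000 does not fit at all, which lines up with where the behaviour changes in the experiments above.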
01:04
reportable6 joined
02:04
linkable6 left,
quotable6 left,
bloatable6 left,
nativecallable6 left,
statisfiable6 left,
greppable6 left,
squashable6 left,
sourceable6 left,
reportable6 left,
shareable6 left,
committable6 left,
coverable6 left,
bisectable6 left,
evalable6 left,
benchable6 left,
releasable6 left,
unicodable6 left,
notable6 left,
tellable6 left,
shareable6 joined,
nativecallable6 joined,
squashable6 joined,
benchable6 joined
02:05
quotable6 joined,
reportable6 joined
02:06
bloatable6 joined
|
|||
japhb | Alright, best I've managed so far in reducing the panic: | 02:21 | |
raku -e 'use MONKEY-SEE-NO-EVAL; use YAMLish; my $buf = buf8.allocate(100_000); my $raku = $buf.raku; my $yaml = save-yaml $buf; for ^5 { say $_; try EVAL $raku; load-yaml $yaml }' | |||
0 | |||
1 | |||
MoarVM panic: Tried to garbage-collect a locked mutex | |||
On my system, hoisting either the EVAL or the load-yaml call out of the loop means it survives. | 02:22 | ||
03:05
coverable6 joined
03:06
statisfiable6 joined,
committable6 joined,
evalable6 joined,
linkable6 joined
03:07
releasable6 joined,
greppable6 joined
03:38
frost-lab joined
04:06
notable6 joined,
unicodable6 joined
04:16
frost-lab left
04:39
Guest92 joined
04:41
Guest92 left
05:05
sourceable6 joined
05:06
tellable6 joined
05:19
Guest92 joined
05:22
Guest92 left,
frost-lab joined
06:02
reportable6 left
06:31
discord-raku-bot left,
linkable6 left
06:32
discord-raku-bot joined
07:03
reportable6 joined
07:06
bisectable6 joined
07:08
frost-lab left
07:32
linkable6 joined
08:25
linkable6 left
08:26
linkable6 joined
|
|||
lizmat | 🏳️‍🌈 | 08:26 | |
oops :-) | |||
it's just an example of a .chars == 1 and .codes = 4 :-) | 08:27 | ||
MasterDuke | gist.github.com/MasterDuke17/434a6...d3edaea42f a backtrace from that `MoarVM panic: Tried to garbage-collect a locked mutex` error | 08:30 | |
jnthnwrthngtn | moarning o/ | 09:08 | |
MasterDuke: The backtrace won't be terribly useful, alas, 'cus it'll be in the GC, which can be triggered at any time. What may be useful is trying to work out where the mutex was allocated, which may be possible-ish with rr | 09:12 | ||
The actual MVMObject isn't useful in itself 'cus it moves every GC, but iirc it points to a piece of malloc'd memory to hold the uv_mutex and you could perhaps do a reverse watch on that | 09:13 | ||
MasterDuke | the ->body.mutex i assume? | 09:14 | |
Nicholas | \o | 09:15 | |
jnthnwrthngtn | Yes | 09:17 | |
Not the pointer, but some memory location inside of it | |||
dogbert17 | Hah, what a fail, tried to do 'zef install YAMLish' on master and was met with 'labeled next without loop construct' | 09:38 | |
no matter, found a workaround | |||
dogbert17 drinks some coffee | 09:39 | ||
lizmat | dogbert17: you probably need to upgrade zef | 09:44 | |
hmmm... I thought zef had worked around | 09:45 | ||
in any case, there's a PR (in MoarVM) by nine that should fix this | |||
dogbert17 | lizmat: thx, I got around the problem by setting MVM_SPESH_DISABLE=1 during the install | 09:48 | |
lizmat | jnthnwrthngtn: perhaps we could merge nine's PR and bump MoarVM? | 09:49 | |
10:00
MasterDuke left
|
|||
Geth | MoarVM: a932b1732c | (Stefan Seifert)++ | 2 files Fix spesh optimizing away still needed label register Objects representing loop labels are kept in a register and may be used by loop handlers (like next LABEL). Spesh did not take this relationship into account, just saw a register that was written to, but not otherwise used and optimized the writers of this register away. Fix by giving a handler's label_reg the same treatment as block_reg. Fixes Rakudo issue #4456 |
10:01 | |
MoarVM: 860cc65508 | (Jonathan Worthington)++ (committed using GitHub Web editor) | 2 files Merge pull request #1522 from MoarVM/fix_spesh_losing_label_reg Fix spesh optimizing away still needed label register |
|||
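The commit message above describes a liveness problem: a register that is written but, as far as the optimizer can see, never read gets its writers removed, even though a handler still refers to it. A minimal Python model of that idea, assuming a much simpler representation than spesh actually uses (the `Handler` class and `removable_writes` function are hypothetical names for illustration only):

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class Handler:
    """Hypothetical stand-in for a frame handler that refers to registers."""
    block_reg: Optional[str] = None
    label_reg: Optional[str] = None


def removable_writes(written: Iterable[str], read: Iterable[str],
                     handlers: Iterable[Handler],
                     honour_label_reg: bool) -> Set[str]:
    """Registers whose writers a naive dead-code pass would delete."""
    live = set(read)
    for handler in handlers:
        if handler.block_reg is not None:
            live.add(handler.block_reg)
        # The fix described above: treat label_reg like block_reg.
        if honour_label_reg and handler.label_reg is not None:
            live.add(handler.label_reg)
    return set(written) - live


if __name__ == "__main__":
    handlers = [Handler(block_reg="r1", label_reg="r2")]
    print(removable_writes({"r0", "r1", "r2"}, {"r0"}, handlers, False))
    print(removable_writes({"r0", "r1", "r2"}, {"r0"}, handlers, True))
```

With `honour_label_reg=False` the label register looks dead and its writer would be removed; with it set to `True` nothing is removed.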
10:03
patrickb joined
|
|||
lizmat | any other stuff I should wait for before bumping MoarVM? | 10:04 | |
10:05
MasterDuke joined
|
|||
jnthnwrthngtn | argh, that's annoying... So I got spesh linking working again. Turns out that since it emits into temporaries while forming the dispatch program and then releases them, and then runbytecode also does this and uses the temps within the lifetime of the other temps, they interfere and then boom segv | 10:11 | |
Guess they need delayed release or some such | 10:12 | ||
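The interference described above can be modelled with a tiny temporary-register pool: if a temporary is released and the pool hands it straight back out, two users end up sharing one register while the first still relies on it. A hedged Python sketch (the class names are hypothetical and this is not how MoarVM's spesh manages temporaries; it only illustrates why a delayed release avoids the aliasing):

```python
class TempPool:
    """Hands out integer register numbers and reuses released ones at once."""

    def __init__(self):
        self.free = []
        self.next_reg = 0

    def acquire(self) -> int:
        if self.free:
            return self.free.pop()
        reg = self.next_reg
        self.next_reg += 1
        return reg

    def release(self, reg: int) -> None:
        self.free.append(reg)


class DelayedReleasePool(TempPool):
    """Released registers only become reusable after an explicit flush."""

    def __init__(self):
        super().__init__()
        self.pending = []

    def release(self, reg: int) -> None:
        self.pending.append(reg)

    def flush(self) -> None:
        self.free.extend(self.pending)
        self.pending.clear()


def reuse_after_release(pool: TempPool):
    """Acquire, release, acquire again; return both register numbers."""
    first = pool.acquire()
    pool.release(first)
    second = pool.acquire()
    return first, second
```

With the plain pool the second acquire returns the same register as the first. With the delayed variant it returns a fresh one until `flush()` is called.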
Geth | MoarVM/new-disp: 5c12d74508 | (Jonathan Worthington)++ | 3 files Reinstate spesh linking That is, where possible, determine the candidate that we are going to be invoking, and identify it directly, so we don't have to run through the spesh arg guard. |
10:31 | |
jnthnwrthngtn | Curiously this has only so much effect on CORE.setting build time, but is rather more noticeable in `make test` | 10:32 | |
10:36
AlexDaniel left,
psydroid left
10:39
AlexDaniel joined
|
|||
Geth | MoarVM/new-disp: fe2fe669fe | (Jonathan Worthington)++ | src/disp/inline_cache.c Add a way to dump full inline cache backtraces |
10:41 | |
10:41
psydroid joined
|
|||
Nicholas | jnthnwrthngtn: ASAN now reports a leak: paste.scsys.co.uk/595621 | 10:47 | |
quite a few of those | |||
different backtraces, but all end up in translate_dispatch_program src/spesh/disp.c | 10:48 | ||
lizmat | dogbert17: MoarVM bumped, zef should work again on master | 10:53 | |
11:07
sena_kun joined
11:13
patrickb left
11:14
patrickb joined
11:15
patrickb left
11:16
patrickb joined
11:17
patrickb left
11:18
patrickb joined
|
|||
lizmat | jnthnwrthngtn: Q, will INDIRECT_NAME_LOOKUP exist in its current form in new-disp ? | 11:18 | |
11:20
patrickb left,
patrickb joined
11:22
patrickb left,
patrickb joined
|
|||
jnthnwrthngtn | lizmat: It's not something I've looked at during new-disp, and it hasn't shown up in any test failures | 11:22 | |
lizmat: So I suspect it doesn't (need to) change | |||
lizmat | ok, then I'll spend some time optimizing it :-) | 11:23 | |
jnthnwrthngtn | Intuitively I don't expect it to need to | ||
11:26
patrickb left,
patrickb joined
11:28
patrickb left,
patrickb joined
11:30
patrickb left,
patrickb joined
11:32
patrickb left,
patrickb joined
11:34
patrickb left,
patrickb joined
11:36
patrickb left,
patrickb joined,
patrickb left
12:02
reportable6 left
12:04
reportable6 joined
12:18
patrickb joined
|
|||
MasterDuke | jnthnwrthngtn, nine: does the new rr backtrace in gist.github.com/MasterDuke17/434a6...d3edaea42f look normal/ok? | 12:19 | |
rr does not like if i try to watch inside the `rm->body.mutex` | 12:20 | ||
timo | MasterDuke: maybe try casting it to (MVMuint64*) so it doesn't need to watch an entire big memory area | 12:26 | |
MasterDuke | oh...that would explain why it complains about too many breakpoints | 12:27 | |
at first i thought it was something about having installed a new kernel or two and maybe i needed to reboot | |||
but re the backtrace i wasn't sure if we could deserialize an MVMReentrantMutex | 12:29 | ||
gist updated | 12:34 | ||
timo | let's see if i have the opportunity to look closely when i return from errandications | 12:50 | |
Nicholas | your cat will have other plans for you? | 12:52 | |
jnthnwrthngtn is back from errands | 13:06 | ||
grmbl, I see broken spectests | |||
How'd I manage that | |||
Nicholas | insufficient tea? | 13:07 | |
jnthnwrthngtn | It's a bit warm in here for tea at the moment | ||
(The air conditioner is working at this problem, however.) | 13:08 | ||
Uff. I've no idea how the change I've done causes the problem I see... | |||
Nicholas | what is this concept of "too warm for tea?" Have you gone native? :-) | ||
jnthnwrthngtn | Well, I did just collect a letter telling me that my permanent residence permit is ready for collection... :) | 13:09 | |
Nicholas | woohoo | ||
jnthnwrthngtn | Of course, the migration office is the other side of the city | 13:10 | |
Nicholas | That sounds like the feed line for a good pun about migration. I miss TimToady | 13:11 | |
jnthnwrthngtn | OK, this bug is very confusing | 13:22 | |
13:30
sena_kun left
|
|||
Geth | MoarVM/new-disp: 5c38f5f6f3 | (Jonathan Worthington)++ | 15 files Eliminate legacy dispatcher ops Which are no longer used in Rakudo. |
14:43 | |
dogbert17 | m: use MONKEY-SEE-NO-EVAL; EVAL buf8.allocate(17_000).raku | 15:07 | |
camelia | ( no output ) | ||
dogbert17 | sigh | ||
on new-disp I get: | 15:08 | ||
MoarVM oops: Oversize callstack flattening record requested (wanted 153104, maximum 131040) | |||
at gen/moar/Metamodel.nqp:2371 (/home/dogbert/repos/rakudo/blib/Perl6/Metamodel.moarvm:) | |||
timo | too many arguments, yeah :( | 15:09 | |
dogbert17 | it does fail on master as well, see japhb's comments from the night | ||
m: say 1024 * 128 | |||
camelia | 131072 | ||
nine | 131040 arguments ought to be enough... | 15:12 | |
dogbert17 | I'm inclined to agree | 15:15 | |
jnthnwrthngtn | Is the EVAL actually needed? | ||
Sounds like somewhere is doing arg flattening that really shouldn't be | |||
Even if it works, it's horribly inefficient | 15:16 | ||
Geth | MoarVM/new-disp: e4c801f1d0 | (Jonathan Worthington)++ | src/spesh/graph.c Prepare spesh graph builder for inlining It needs to be able to create graphs from instructions containing specialized dispatch-related bytecodes. |
||
timo | exciting! | 15:17 | |
Geth | MoarVM/new-disp: ad20e837f9 | (Jonathan Worthington)++ | 2 files First steps towards reinstating inlining Try to build an inline graph, and add back the logging of whether we could inline if we actually tried to do so. Don't actually inline for now, however. Going this far does not seem to cause any regressions. |
15:22 | |
MasterDuke | m: say buf8.new(|(^100_000)) | 15:24 | |
camelia | Too many arguments (100001) in flattening array, only 65535 allowed. in block <unit> at <tmp> line 1 |
||
MasterDuke | i think the EVAL is required to get around that error | ||
m: use MONKEY-SEE-NO-EVAL; EVAL "say buf8.new({(^66_000).join(q|,|)})" | 15:27 | ||
camelia | Bytecode validation error at offset 52, instruction 10: callsite expects 257 more positionals in block <unit> at <tmp> line 1 |
||
MasterDuke | that's the other one japhb found when golfing the mutex panic | 15:28 | |
and here's gist.github.com/MasterDuke17/3bd1f...2480f6327a some debugging info | 15:32 | ||
Nicholas | jnthnwrthngtn: you turned off MVM_HASH_RANDOMIZE in commit 5c38f5f6f3fa904a10985ec93ba2caa091d506e6 | 15:43 | |
nine | I know where that GC issue may come from | ||
Aaaand: it's coming from the other problem | 15:44 | ||
15:44
patrickb left
|
|||
MasterDuke | the panic is coming from the bytecode validation? | 15:44 | |
*mutex panic | |||
nine | I.e. the Bytecode validation error happens during prepare_and_verify_static_frame called from instrumentation_level_barrier, i.e. while we're holding that mutex. So we jump out of that instrumentation_level_barrier without ever unlocking | 15:45 | |
Luckily we have a mechanism for "please unlock this mutex in case an exception gets thrown" | 15:50 | ||
Alas, gotta go now | |||
MasterDuke | oh, what mechanism is that? | 15:51 | |
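The failure mode nine describes (an error thrown while a mutex is held, so the unlock is never reached) can be reproduced in miniature. The Python sketch below is only a model: `try`/`finally` stands in for whatever release-on-exception mechanism MoarVM has, and the function names are illustrative rather than taken from the MoarVM source.

```python
import threading


class ValidationError(Exception):
    """Stand-in for the bytecode validation error."""


def validate(bytecode_ok: bool) -> None:
    if not bytecode_ok:
        raise ValidationError("Bytecode validation error")


def barrier_leaky(lock: threading.Lock, bytecode_ok: bool) -> None:
    lock.acquire()
    validate(bytecode_ok)  # if this raises, the release below is skipped
    lock.release()


def barrier_safe(lock: threading.Lock, bytecode_ok: bool) -> None:
    lock.acquire()
    try:
        validate(bytecode_ok)
    finally:
        lock.release()  # runs on both the normal and the error path


def still_locked_after(barrier, bytecode_ok: bool) -> bool:
    """Run a barrier, swallow the validation error, report the lock state."""
    lock = threading.Lock()
    try:
        barrier(lock, bytecode_ok)
    except ValidationError:
        pass
    return lock.locked()
```

In the leaky version a failed validation leaves the lock held with nobody left to release it, which is the state the GC later complains about in the panic message.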
Geth | MoarVM/new-disp: 3ff82a7004 | (Jonathan Worthington)++ | src/spesh/graph.c Add more basic block boundary ops |
16:05 | |
MoarVM/new-disp: c66c96cfbc | (Jonathan Worthington)++ | src/moar.h Reinstate hash randomization Disabled to get stable spesh logs, and accidentally committed. |
|||
MoarVM/new-disp: e3c36fc015 | (Jonathan Worthington)++ | 2 files First pass at porting the inlining algorithm This is the minimal set of changes that would in theory be required for it to work. |
|||
jnthnwrthngtn | Nicholas: oops, thanks | ||
MasterDuke | `Unhandled exception: Internal error: multiple ex_release_mutex` doh | 16:20 | |
hm. MVM_load_bytecode does a `MVM_tc_set_ex_release_mutex` before we reach `prepare_and_verify_static_frame` | 16:24 | ||
make it an array of mutexes? | 16:26 | ||
in prepare_and_verify_static_frame unlock the one that's already held, set the new one, then re-set the old one if an exception hasn't been thrown? | 16:29 | ||
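The two options floated above (a single release-on-exception slot versus an array of them) can be compared with a small model. This is a Python sketch under the assumption that the slot simply remembers which mutex to unlock if an exception is thrown. The class names are hypothetical, and the error string is borrowed from the message quoted above purely for illustration.

```python
import threading


class SingleSlot:
    """One remembered mutex; nesting a second one is an error."""

    def __init__(self):
        self.mutex = None

    def set(self, mutex: threading.Lock) -> None:
        if self.mutex is not None:
            raise RuntimeError("Internal error: multiple ex_release_mutex")
        self.mutex = mutex

    def release_on_exception(self) -> None:
        if self.mutex is not None:
            self.mutex.release()
            self.mutex = None


class ReleaseStack:
    """A stack of remembered mutexes, all unlocked when an exception hits."""

    def __init__(self):
        self.mutexes = []

    def push(self, mutex: threading.Lock) -> None:
        self.mutexes.append(mutex)

    def release_all_on_exception(self) -> None:
        while self.mutexes:
            self.mutexes.pop().release()  # innermost first


def single_slot_rejects_nesting() -> bool:
    slot = SingleSlot()
    slot.set(threading.Lock())
    try:
        slot.set(threading.Lock())
    except RuntimeError:
        return True
    return False


def stack_releases_all():
    outer, inner = threading.Lock(), threading.Lock()
    slots = ReleaseStack()
    outer.acquire()
    slots.push(outer)
    inner.acquire()
    slots.push(inner)
    slots.release_all_on_exception()
    return outer.locked(), inner.locked()
```

The stack handles nested holders at the cost of extra bookkeeping on every set and clear, which is the kind of overhead a hot path may not want.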
jnthnwrthngtn | So inlining re-enabled will get through the NQP build and fail one test; the Rakudo build fails the same test. Which...OK let's debug, but WHY does the frame the broken inlining happens in have to be EXPR (the expression parser)? :/ | 16:31 | |
Sure I'll go hunt this in 153 basic blocks :P | 16:32 | ||
timo | whew. | ||
maybe it's a phi instruction with more than 65k arguments | |||
jnthnwrthngtn | :P | 16:33 | |
The upper limit of the size of that is surely the number of basic blocks? | 16:34 | ||
Oh, it's a deopt bug | 16:35 | ||
What on earth is it doing | |||
I'm suspecting its table of inline extents is rather off | 16:36 | ||
Nicholas | ASAN or valgrind have hints? | ||
jnthnwrthngtn | Nicholas: Didn't check, but feels very unlikely; it's almost certainly an off-by-something | ||
It thinks that it's doing uninlining | 16:37 | ||
Or rather, that it should | |||
But it really shouldn't | |||
I guess something about the way calls look now means that end of inline marker is mis-placed or similar | |||
Or start | 16:38 | ||
MasterDuke | huh. gist.github.com/MasterDuke17/e74be...8ce02e4547 doesn't fix the panic. even if that patch had some other bad behaviors, i thought it would at least prevent the panic | 16:41 | |
16:44
linkable6 left,
evalable6 left
|
|||
timo | well, i can't wait to read the blog post at the end of this :) | 16:45 | |
16:46
linkable6 joined
|
|||
jnthnwrthngtn | It's a bit odd. I can change the <= to < or make inline end annotations move in the other direction to fix the deopt bug...but doing either of those gets me a different wrong deopt | 16:48 | |
Nicholas | timo: Well, you never know. it might be "It's faster. Enjoy" | 16:54 | |
jnthnwrthngtn | I'm now at "why does this work on master" :) | 16:58 | |
We set the end offset of an inline as the location after writing that instruction | 16:59 | ||
So the <= is wrong, since it would include the instruction after the inlined one. | 17:00 | ||
That'd also suggest we want to move it back, not forward | |||
Since it should be the last inlined instruction. | |||
oh hang on what | |||
We're writing it as a pre instruction | 17:01 | ||
I misread | |||
That implies it has to be on the instruction after the deopt always | 17:02 | ||
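The `<=` versus `<` question above is the usual closed versus half-open interval issue. The Python sketch below models only that general point, assuming an inline is described by a start offset and an end offset. It is not MoarVM's deopt table logic, and which form is correct there depends on where the end marker is actually written, as the discussion above shows.

```python
def in_inline_closed(pos: int, start: int, end: int) -> bool:
    """Membership test that treats `end` as part of the inline."""
    return start <= pos <= end


def in_inline_half_open(pos: int, start: int, end: int) -> bool:
    """Membership test that treats `end` as one past the last offset."""
    return start <= pos < end


if __name__ == "__main__":
    start, end = 10, 20
    for pos in (9, 10, 19, 20, 21):
        print(pos,
              in_inline_closed(pos, start, end),
              in_inline_half_open(pos, start, end))
```

The two tests only disagree at `pos == end`, so a mistake here shows up as being considered inside an inline exactly one position too long, or not long enough.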
Doing that gets me another issue that...doesn't look like a deopt one, just bytecode corruption. Odd. | 17:30 | ||
ah, or maybe it is | 17:32 | ||
time for a break | |||
18:02
reportable6 left
|
|||
nine | MasterDuke: are you sure it's complaining about the same mutex though? | 18:35 | |
MasterDuke | no | ||
nine | MasterDuke: with that patch we now no longer unlock the tmp_ex_release_mutex, so maybe that's what gets collected while still locked | 18:36 | |
MasterDuke | oh. right. hm | ||
seems like maybe it does have to be an array of ex_release_mutexes? | 18:37 | ||
nine | That would slow things down though | 18:40 | |
MasterDuke | yeah. my only other idea is passing the mutex down through all the function calls and unlocking it in fail() | 18:41 | |
18:45
evalable6 joined
20:03
reportable6 joined
|
|||
Geth | MoarVM/new-disp: 18f6e3b0eb | (Jonathan Worthington)++ | src/spesh/deopt.c Account for deopt all vs. deopt on difference When we are doing a deopt all, the bytecode position is the op we will return to. Thus we need to account for being one past the end of the inline in that case (inclusive), but to *not* do it for a failed guard that is immediately following a deopt, otherwise we'll end up wrongly uninlining an inline we were not in. (This may need a further look, as ... (5 more lines) |
20:10 | |
MoarVM/new-disp: 78d1cdf687 | (Jonathan Worthington)++ | src/spesh/inline.c Correct insertion of inline end marker |
|||
MoarVM/new-disp: 42dbab6e00 | (Jonathan Worthington)++ | src/spesh/optimize.c Switch inlining back on For now without optimizing the inlinee with knowledge of the surrounding context, and also without creating inlines of candidates that were not yet specialized. |
20:13 | ||
jnthnwrthngtn | So the way to debug inlining deopt bugs is apparently to order and eat a "garlic chicken" curry where they put WHOLE CLOVES of garlic - lots of them - into an already very garlicky sauce :) | 20:16 | |
I didn't try it with stressing yet, but this causes no regressions | |||
Note, however, that it's inlining very little Raku code so far | 20:17 | ||
Or even spesh linking | |||
'cus I didn't yet teach it about dispatch programs with resumption state | |||
nine | If that's the way to debug inlining deopt bugs, I do hope there's a couple of those bugs left for me :D | ||
timo | i see a bunch of "inlining prevented by the op callercode" in a random piece of the nqp build speshlogged; isn't that an op that we can reasonably do right with inlining? | 20:21 | |
Nicholas | I'm getting a massive DU backtrace when I run time perl Configure.pl --backends=moar --prefix=/home/nick/Sandpit/moar-SAN | 20:25 | |
timo | someone can implement sp_guardnonzero for the jit | 21:06 | |
Geth | MoarVM/new-disp: 90ddd1f768 | (Jonathan Worthington)++ | src/spesh/optimize.c Re-enable post-inline optimizations |
21:20 | |
timo | ohmygosh | 21:21 | |
jnthnwrthngtn | I ran tests while I was afk, no regressions, so guess it's close | ||
Nicholas: Congrats or something..you have the DU checking turned on, I guess? | 21:22 | ||
Nicholas | I had, but I turned them off and I still see it | ||
So I'm confused | |||
jnthnwrthngtn | huh | ||
Nicholas | that's what I thought | ||
jnthnwrthngtn | "MoarVM oops in spesh thread: Malformed DU chain: reading goto of 4(2) not in graph" :D | 21:25 | |
timo | whoops, goto | ||
jnthnwrthngtn | Yeah, I can guess what silly I've done | ||
timo | changed the info of an ins without freeing usages? | 21:26 | |
jnthnwrthngtn | Yeah | ||
Geth | MoarVM/new-disp: e212bfcd54 | (Jonathan Worthington)++ | src/spesh/inline.c Remove runbytecode usages of args when inlining |
21:28 | |
jnthnwrthngtn | I like how much code went away when updating inlining | ||
The main thing left to re-instate is the JIT | 21:29 | ||
Which is probably doing rather little at present | |||
Since none(runbytecode, runcfunc, sp_dispatch) are JITted so far | 21:30 | ||
Even in lego, let alone expr | |||
Geth | MoarVM/new-disp: 84e75a6830 | (Jonathan Worthington)++ | 2 files Remove properties from deprecated ops |
21:36 | |
[Coke] | will we ever be able to remove the deprecated ones, or will those gaps always be there? | 22:06 | |
(And it shouldn't impact anything, right?) | |||
timo | we can fill them back up | 22:08 | |
with new ops as they come along, if they do | 22:09 | ||
22:18
linkable6 left,
evalable6 left
22:21
evalable6 joined
|
|||
Geth | MoarVM/new-disp: d48f0d9348 | (Jonathan Worthington)++ | 5 files Start to re-work unspecialized inlining This is a translation of what's needed to reinstate this on new-disp. However, something isn't quite right yet; enabling it leads to an NQP build failure. |
22:24 | |
jnthnwrthngtn | I'm too tired to figure this breakage out today | 22:30 | |
timo | good work today jnthnwrthngtn | 22:45 | |
23:20
linkable6 joined
|