github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
nwc10 jnthn: Rakudo setting build for your branch wedges in the same way - CPU bound in MVM_spesh_arg_guard_run_types 08:18
timotimo does the new spesh arg code perhaps generate infinite loops in very rare cases? it's not supposed to be able to, i believe? but maybe it happens anyway :) 09:22
nwc10 timotimo: maybe, but only in his branch 09:26
nwc10 jnthn / timotimo - the ASAN excitement I saw and no-pasted last night (with the profiler) *only* happens on jnthn's branch 09:50
no, HEAD^ on his branch
whilst for HEAD, ie origin/derived-specializations 5e7e701615a9b88d9567c49725306f97d822f266 First attempt at derived specializations 09:53
Both one of NQP's tests and the core setting end up in a 100% CPU loop
jnthn Hmm, wonder why I got through the NQP build/tests without that... 10:17
nwc10 I have 6 things more annoying: 10:19
+#define MVM_ARRAY_CONC_DEBUG 1
+#define FSA_SIZE_DEBUG 1
+#define MVM_SPESH_CHECK_DU 1
ooh, that one?
MVM_SPESH_BLOCKING=1
MVM_SPESH_NODELAY=1
MVM_SPESH_GUARD_DEBUG=1
oh wait, that lot
anyway, I shall now retry but with less silly buggers 10:20
jnthn Ah, probably the spesh ones
A hang where you mention is nice
Because it'll tell me exactly which guard tree is broken
nwc10 the profiler ASAN barf is real, but clearly the whole branch isn't quite good yet, so it might go away again as a side effect of another bug fix 10:21
jnthn Yeah, malformed arg guard trees could cause much trouble
yay, repro'd the hang 10:23
nwc10 \o/
jnthn lolwut 10:27
The entire guard tree is:
0: CALLSITE 0x7ffff7c64580 | Y: 1, N: 0
1: LOAD ARG 0 | Y: 2
2: STABLE CONC NQPClassHOW | Y: 3, N: 0
3: LOAD ARG 1 | Y: 4
4: LOAD ARG 2 | Y: 5
There is no 5
nwc10 The 5 is a lie! 10:28
jnthn And if it goes looking for one, that'll be memory corruption I guess :)
Geth MoarVM/derived-specializations: d3edc2be60 | (Jonathan Worthington)++ | src/spesh/arg_guard.c
Assert no guard tree buffer overrun
10:36
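The broken tree above (node 4 links to a node 5 that was never written) can be caught with a bounds check before or while the tree is walked. The following is a minimal sketch only: `GuardNode`, `guard_tree_links_valid` and `guard_tree_follow` are hypothetical names invented for illustration and are not the structures or functions in src/spesh/arg_guard.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical node layout, loosely modelled on the dump in the log. */
typedef enum { NODE_CALLSITE, NODE_LOAD_ARG, NODE_STABLE_CONC } NodeOp;

typedef struct {
    NodeOp   op;
    uint32_t yes;  /* index of the node to visit when the check passes */
    uint32_t no;   /* index of the node to visit when the check fails  */
} GuardNode;

/* Returns 1 if every yes/no link points inside the buffer, 0 otherwise. */
int guard_tree_links_valid(const GuardNode *nodes, uint32_t num_nodes) {
    for (uint32_t i = 0; i < num_nodes; i++) {
        if (nodes[i].yes >= num_nodes || nodes[i].no >= num_nodes)
            return 0;
    }
    return 1;
}

/* Follows one link, asserting that both ends stay inside the buffer. */
uint32_t guard_tree_follow(const GuardNode *nodes, uint32_t num_nodes,
                           uint32_t current, int matched) {
    assert(current < num_nodes);
    uint32_t next = matched ? nodes[current].yes : nodes[current].no;
    assert(next < num_nodes);
    return next;
}
```

With the five nodes from the dump, `guard_tree_links_valid` would report the tree as invalid because node 4 has `yes == 5`.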
timotimo you're probably lucky that whatever's at position 5 has y and n both zeroed out? 10:37
Geth MoarVM/derived-specializations: 72c9673b4e | (Jonathan Worthington)++ | src/spesh/arg_guard.c
Don't create pointless load nodes in derived spesh

If we aren't going to check against an argument, there is no need to load it.
10:47
jnthn That also led to us creating more nodes than we expected
NQP is now clean for me under the blocking/nodelay
Currently running CORE.setting build under that too
Which is taking quite a while, given I've an unoptimized debug build too... 10:48
It builds :) 10:51
Let's see what the tests have to say...
There are issues. 10:54
nwc10 drink!
jnthn umm 10:56
moar: src/jit/x64/tiles.dasc:675: MVM_jit_tile_sub_load_idx: Assertion `out != in1' failed.
Aborted (core dumped)
That is...uh...pretty distant from what I've been working on
paging Dr brrt... 10:57
nwc10 \o
jnthn That assertion failure is the error on everything so far. Hm. 10:58
2 failures with the expr JIT disabled 11:04
nwc10 jnthn: actually, do you want a Mr brrt? Because I think you need a surgeon to fix the JIT 11:17
jnthn grr, I think I need to reorder the commits, so that I can have the out of bounds fixes but not the derived specialization bit 11:21
.oO( thankfully, we have git! )
11:23
OK, there are still failures there, but at least now I can first debug the thing that should not have caused behavior changes 11:24
Um, well, the one that was "just a refactor", at least 11:25
OK, the only failure I get on `make test` without the derived specializations patch with the blocking/no delay flag is t/02-rakudo/99-misc.t 11:31
That's with the expr JIT disabled
nwc10 the profiler?
all the subtests of test 3? 11:32
jnthn Yes, got "MoarVM oops: Spesh: get_osr_deopt_index failed\" 11:32
There's extra failures with the expression JIT enabled, but I wonder if they're new in my branch... 11:33
jnthn hah, yes, the expr jit failures are indeed on master too 11:38
nwc10 odd, why do they fail for you but not me?
jnthn I don't know. You have assertions enabled, I assume?
nwc10 I believe so because I hit the one you added
jnthn I'm running `MVM_SPESH_NODELAY=1 MVM_SPESH_BLOCKING=1 make test` 11:39
No diff 11:40
Very odd 11:41
nwc10 indeed
jnthn Anyway, I guess I'll look at the profiler one first 11:42
I can sort of guess what it might be...
ah, yes... 11:44
Geth MoarVM/derived-specializations: 91efac0991 | (Jonathan Worthington)++ | 6 files
Fix candidate discarding on instrumentation

Previously it was enough to simply throw away the arg guard, since we added to it. Now that we rebuild it from the candidate list each time, we need to keep track of which spesh candidates we discarded, so that we do not include them in it again.
11:59
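The idea in that commit message, rebuilding the guard from the candidate list while skipping anything already discarded, might look roughly like this. It is a sketch under assumptions: the `Candidate` struct, its `discarded` flag and `collect_live_candidates` are made up for illustration and do not reflect MoarVM's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical candidate record; only the fields needed for the sketch. */
typedef struct {
    int id;
    int discarded;  /* set when instrumentation throws the candidate away */
} Candidate;

/* Copies the ids of live (non-discarded) candidates into out and returns
 * how many were kept. A rebuilt guard would then cover only these. */
size_t collect_live_candidates(const Candidate *cands, size_t n,
                               int *out, size_t out_cap) {
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (cands[i].discarded)
            continue;             /* never put a discarded candidate back */
        assert(kept < out_cap);   /* caller must supply enough room */
        out[kept++] = cands[i].id;
    }
    return kept;
}
```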
jnthn That fixes the profiler regression 12:00
And gets me to equivalence with master so long as the patch that actually uses derived specializations isn't in place
So now I "just" have to debug that one :)
Only two failures (so long as I disable expr JIT) 12:03
(in make test, spectest can run while I lunch...)
There are failures. 12:57
Odd, they all look like "wrong number of arguments" style errors... 13:00
Geth MoarVM/derived-specializations: 2042d9b9d2 | (Jonathan Worthington)++ | 3 files
Remove unused certain result arg guard node

We now distribute these throughout the tree.
13:04
jnthn ooh, when I spesh dump the problematic case, I get "Too many levels of inlining popped" 13:09
Aha, and disabling inlining makes things OK 13:10
This will be...interesting. 13:11
jnthn It does an inline. The caller's type information leads to a branch elimination inside of the inlinee. That renders some basic blocks unreachable. Those basic blocks in turn contained inlines. The start inline annotation gets deleted, but the end one, 'cus it lives at the start of a surviving basic block ('cus that's how they are placed) survives. 13:39
What I don't yet know is if this is the cause of the real problem, or just a dumping problem that's hiding me getting to the real one.
Geth MoarVM/derived-specializations: 39a7d006a9 | (Jonathan Worthington)++ | src/spesh/dead_bb_elimination.c
Clean up stray inline end annotations

We could end up with them after dead basic block elimination. At a minimum they just broke dumping, but they could potentially cause other kinds of confusion too.
13:58
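The cleanup described above can be pictured as a pass that drops any inline end annotation whose matching start no longer exists. This is an illustrative sketch only: `Ann`, `AnnKind` and `drop_stray_inline_ends` are hypothetical, and the flat array stands in for annotations that in MoarVM are attached to instructions in basic blocks.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ANN_INLINE_START, ANN_INLINE_END } AnnKind;

/* Hypothetical annotation: which inline it belongs to, and which end. */
typedef struct {
    AnnKind kind;
    int     inline_idx;
} Ann;

/* Copies annotations from in to out, skipping end annotations whose
 * start was deleted along with a dead basic block. out must have room
 * for n entries. Returns the number of annotations kept. */
size_t drop_stray_inline_ends(const Ann *in, size_t n, Ann *out) {
    assert(in != NULL && out != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i].kind == ANN_INLINE_END) {
            int has_start = 0;
            for (size_t j = 0; j < n; j++) {
                if (in[j].kind == ANN_INLINE_START &&
                    in[j].inline_idx == in[i].inline_idx) {
                    has_start = 1;
                    break;
                }
            }
            if (!has_start)
                continue;  /* stray end: its start is gone */
        }
        out[kept++] = in[i];
    }
    return kept;
}
```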
jnthn Good news: I fixed it. Bad news: that's not the actual problem. 13:59
jnthn urgh, I see it 14:38
Also a fairly general problem, just one we didn't run into before, or not enough to cause trouble.
Basically: 14:39
1. Planner determines that things are fairly polymorphic in a particular arg, so it'll make a derived specialization 14:40
2. But (for whatever reason, maybe not a good one) something in the optimizer decides it's going to stick a guard on something we decont out of the arg anyway. 14:41
3. Now there's a deopt point in the middle of args binding.
4. We can't cope with that
jnthn Probably we can prevent 3 from being allowed to happen. But 2 suggests something's interesting with the stats (I may have cheated somewhat here...) 14:44
And we can probably lessen the impact of ruling out 3 by a code-gen tweak
(Though may not need to) 14:45
Ah, it's less bad: we only can't *inline* things that have a deopt during args binding 14:50
Geth MoarVM/derived-specializations: 02104e30c9 | (Jonathan Worthington)++ | src/spesh/inline.c
Refuse to inline with arg op after deopt op

We cannot restore the args buffer and parameter context properly when we uninline such a frame, which can then cause a crash after the deopt. In theory this has long been a possible problem, however the new derived specializations make it more likely to happen - though possibly only because we are making poor decisions about what to guard on.
15:00
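The rule in that commit, refusing to inline when an argument-handling op comes after an op that can deopt, amounts to a single scan over the instruction stream. The sketch below assumes a simplified stream: `OpKind` and `can_inline` are hypothetical, with `OP_GUARD` standing in for any op that may deopt and `OP_ARG` for any argument-handling op.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical classification of instructions. */
typedef enum { OP_ARG, OP_GUARD, OP_OTHER } OpKind;

/* Returns 0 if an argument op appears after an op that may deopt,
 * since the args buffer could not be restored on uninline; else 1. */
int can_inline(const OpKind *ops, size_t n) {
    assert(ops != NULL || n == 0);
    int seen_deopt = 0;
    for (size_t i = 0; i < n; i++) {
        if (ops[i] == OP_GUARD)
            seen_deopt = 1;
        else if (ops[i] == OP_ARG && seen_deopt)
            return 0;
    }
    return 1;
}
```

A frame that binds all of its arguments before its first guard passes the check; one with a guard in the middle of args binding does not.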
jnthn OK, spectest now looks a bit better 15:20
And a lot of my remaining failure is 'cus my Rakudo master is some way behind HEAD, it seems... 15:21
dogbert17 perhaps you're done 15:25
jnthn Alas, no 15:38
At least one failure remains
Almost through the spectest with nodelay/blocking and will know exactly how many soon
A few failures, including a hang in S17-procasync/nonexistent.t 15:40
That said, I didn't do such a run against master
Which I'll do before any more hunting 15:41
Got some other things I need to work on today, so that'll be all for now.
dogbert17 jnthn++ 15:42
jnthn Looks like all but one are there on master also... 15:53
Geth MoarVM/derived-specializations: d9e38edb81 | (Jonathan Worthington)++ | src/spesh/inline.c
One more possible argument handling instruction

Which needs deopt handling too.
15:55
brrt \o 17:15
I hear there's jit trouble
jnthn: what's the expr JIT failure? 17:16
oh, I see it
damn
how do I trigger?
jnthn brrt: Rakudo's `make test` with `MVM_SPESH_NODELAY=1 MVM_SPESH_BLOCKING=1` on `master` did it for me 17:27
brrt ok, I'll try it 17:50
nine I actually get that error in a lot of test files when running make test on a debug build with MVM_SPESH_BLOCKING=1 MVM_SPESH_NODELAY=1 17:54
13 core dumps for a make test run
jnthn Yeah, I had something in that range. Discovered when stressing my derived spesh branch, then I discovered it was on master too 17:55
nine The tile it stumbles over is {emit = 0x7f80ce55643f <MVM_jit_tile_sub_load_idx>, node = 201, op = MVM_JIT_SUB, num_refs = 3, refs = {35, 63, 45, 0}, args = {8, 8, 0, 0, 0, 0}, values = "\001\001\002\006", register_spec = "\001\001\001\001", size = 8 '\b', debug_name = 0x7f80ce8f7a18 "(sub reg (load (idx reg reg $scale) $size))"}
2019.11 looks fine 18:05
brrt hmmm 18:06
nine In fact even 2020.01.1 looks good 18:12
oh damn, forgot to turn on debugging again 18:13
nine This is odd... it works with MoarVM on master, nqp master and rakudo 2020.01, but not anymore with rakudo on master 18:30
nine Trouble seems to have started with this commit: github.com/rakudo/rakudo/commit/49...cc166a916e 19:04
I'm pretty sure the commit itself is innocent, but it may show the code that causes the JIT to trip up 19:11
lizmat fwiw, I found other odd things with nqp::bindattr_i in the core setting, not outside of it 20:18
so that may be related