timotimo the first stab at teaching decont_* about get*ref_* shows me zero hope for both my particles benchmark as well as json_fast 00:45
but i do see two instances of the generated code trying to decont_n a wval
which can with a high probability be constant-folded, which would be nice 00:46
but somehow a goto became the writer of some register, which is ... clearly not what we want
dalek MoarVM: f51ddf8 | timotimo++ | src/spesh/optimize.c:
missed optimizing decont_s and decont_u reprops
00:47
MoarVM: 1088538 | timotimo++ | src/spesh/facts.c:
there were a bunch of "break" statements missing.
timotimo those two commits were just sensible in general, though they give us practically nothing.
06:15 domidumont joined 06:22 domidumont joined 06:40 Ven joined 06:51 zakharyas joined
jnthn nine_: Code-reviewed your Moar PR :) 09:13
dalek MoarVM/reframe: 96c8b15 | jnthn++ | src/core/threadcontext.c:
Initialize/destroy call stack region per thread.
09:23
10:12 Ven joined 10:42 brrt joined
dalek MoarVM/reframe: 736a38f | jnthn++ | src/core/interp.c:
Eliminate duplicate MVMContext creation code.

Just use the function that does it, rather than repeating it a bunch of times in the interpreter.
10:43
brrt oh, hey, i have a better solution for the write barrier thingy 10:53
jnthn o/ brrt 10:54
brrt \o jnthn
hmm, not a perfect solution though
dalek MoarVM/reframe: a129c2e | jnthn++ | src/ (7 files):
Insert frame stack -> heap promotion where needed.

We don't allocate any frames on the call stacks yet, and the code to actually move the frames is not yet implemented. This just inserts the calls to do the promotion, if needed, in (hopefully all of) the places where a GC-able object is about to come to reference a frame.
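The promotion idea in that commit message can be sketched roughly as follows. This is a toy illustration only: `Frame`, `Closure`, `promote_to_heap` and `closure_capture` are hypothetical names, not MoarVM's actual structures or functions, and the sketch ignores fixing up any other references to the moved frame.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified frame; not MoarVM's MVMFrame. */
typedef struct Frame {
    struct Frame *caller;
    int on_heap;   /* 0 = lives in the call stack region, 1 = lives on the heap */
    int payload;   /* stand-in for the frame's real contents */
} Frame;

/* Copy a stack frame to the heap if it is not there already. The caller
 * chain is promoted as well, so that no heap frame is left pointing at a
 * stack frame. */
static Frame *promote_to_heap(Frame *f) {
    if (f == NULL || f->on_heap)
        return f;
    Frame *copy = malloc(sizeof(Frame));
    assert(copy != NULL);
    memcpy(copy, f, sizeof(Frame));
    copy->on_heap = 1;
    copy->caller = promote_to_heap(f->caller);
    return copy;
}

/* A heap object that is about to come to reference a frame. */
typedef struct {
    Frame *outer;
} Closure;

/* Promotion is inserted at the point where the reference is created. */
static void closure_capture(Closure *c, Frame *f) {
    c->outer = promote_to_heap(f);
}
```

A real implementation would also have to update everything else that still points at the old stack copy, which is what several of the later commits in this log deal with.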
brrt it does fix the SEGV in install-core-dist.pl, so that's something 10:58
dalek MoarVM/reframe: e910664 | jnthn++ | src/core/continuation.c:
Remove out-dated comment.
10:59
jnthn lunch & 11:02
dalek MoarVM/reframe-jit: 2a89927 | brrt++ | src/jit/emit_x64.dasc:
write-barriers should short-circuit

This fixes the SEGV in rakudo/tools/install-core-dist.pl; it also fixes the write barrier in bindlex, which requires both a check_wb and a hit_wb.
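One way to read "short-circuit" here, sketched in plain C rather than in the JIT's DynASM. The names `Obj`, `GEN2`, `hit_wb` and `write_ref` are made up for illustration; the commit itself changes emitted x64 code, which is not shown in the log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object header with a single generation flag. */
typedef struct {
    unsigned flags;
} Obj;

#define GEN2 1u            /* hypothetical: object lives in the old generation */

static int barrier_hits = 0;
static Obj *last_hit = NULL;

/* The "hit" half: record that an old object now points at a young one. */
static void hit_wb(Obj *target) {
    barrier_hits++;
    last_hit = target;
}

/* The "check" half, followed by the store. Each && stops evaluating as soon
 * as one condition fails, so value->flags is never read when value is NULL. */
static void write_ref(Obj *target, Obj **slot, Obj *value) {
    if ((target->flags & GEN2) && value != NULL && !(value->flags & GEN2))
        hit_wb(target);
    *slot = value;
}
```

If the checks did not short-circuit, the NULL case would dereference a NULL pointer, which is one plausible route to a SEGV of the kind mentioned above.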
11:11
MoarVM/reframe-jit: d7b2218 | brrt++ | src/jit/graph.c:
Fix declaration after code
11:13
brrt slowly creeping towards correctness... 11:16
brrt afk 11:26
12:07 Ven joined
dalek MoarVM/reframe: 46a9395 | jnthn++ | src/core/frame.c:
Remove now-unrequired zero assignments.

We simply zero the frame as a whole now upon allocation, so these are no longer required.
12:09
12:15 domidumont joined
dalek MoarVM/reframe: 096a45a | jnthn++ | src/core/ (3 files):
Implement stack to heap promotion of frames.
12:39
MoarVM/reframe: 7a12456 | jnthn++ | src/gc/roots.c:
Tiny code simplification.
12:42
masak ooh, stack-to-heap promotion of frames 12:48
that sounds interesting
dalek MoarVM/reframe: f9fd956 | jnthn++ | src/gc/ (2 files):
Update GC to be aware of frames on call stacks.
13:04
JimmyZ jnthn: github.com/MoarVM/MoarVM/commit/09...950b64R604 # looks like forgetting clear . 13:06
dalek MoarVM/reframe: 775efea | jnthn++ | src/core/frame.c:
Correct a couple of thinkos in promotion.
13:07
jnthn JimmyZ: It was even wronger than that :)
ooh, sneaky, the active handler chain can get outdated 13:22
timotimo o/
dalek MoarVM/reframe: 920ec05 | jnthn++ | src/core/frame.c:
Update active handlers when promoting frames.
13:35
MoarVM/reframe: 8af6f7e | jnthn++ | src/gc/roots.c:
Add missing NULL check.
jnthn Can build/test NQP successfully with new callstack thingy switched on 13:40
(Enabled in local commits)
And Rakudo gets to building its CORE.setting and...it gets up to allocating 8 GB before I killed it :( 13:43
timotimo uh oh
jnthn I didn't update perl6_ops.c, mind. Maybe something in there wants attention :)
timotimo *shrug* it'd be interesting to know how the allocation happens. like, is it stuck in a loop of some kind that just allocates and allocates? 13:46
jnthn Found a few places that perl6_ops.c needed updates :) 14:03
Trying now with those
dalek MoarVM/reframe: 69322c6 | jnthn++ | src/core/frame.h:
Make a couple of functions public.

So they can be used in fixing up Rakudo's extops.
14:10
jnthn Didn't do it. But...uh...MVM_dump_backtrace(tc) is printing a heck of along stack trace 14:15
timotimo along of stack trace 14:17
jnthn a long :P
timotimo clearly we aren't popping stacks off the stack when we return out of code 14:18
jnthn We are
:)
If that didn't happen a lot more would be wrong :)
nwc10 jnthn: I think it might well plateau at 8Gb, as it completed for me 14:19
jnthn nwc10: You don't have my local patch that actually enables the new call stack, though :)
nwc10 ah right. good point. 14:20
jnthn oh, hah 14:21
But good to know all the changes *without* that work :)
oh...it's not what I suspected 14:22
Hmm
Seems that somehow it looks up $*ACTIONS and gets a bogus object 14:24
I thought that was going to be because of the dynlex cache holding onto a frame
But no
Anyways, somehow it seems to end up with a Perl6::Grammar instance instead of a Perl6::Actions instance and then recurses endlessly 14:25
That's the first guess from the stack trace, anyways.
timotimo does the dynlex cache have a simple on/off switch? 14:26
jnthn Dunno, but it's not clear if it's to blame 14:28
timotimo 'k 14:29
jnthn no 14:33
Somehow we have...a circular call stack o.O
timotimo the wonders of modern computing architecture 14:37
jnthn Yeah, now to figure out how the heck that happens 14:40
tc->cur_frame->caller->caller->caller->caller->header.flags
4
tc->cur_frame->caller->caller->caller->caller->caller->header.flags
0
That should never happen :)
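A circular caller chain like the one described can be detected mechanically instead of by walking `->caller` by hand in the debugger. A minimal sketch using Floyd's tortoise-and-hare check; `Frame` and `caller_chain_has_cycle` are hypothetical names, not part of MoarVM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame with only the link we care about. */
typedef struct Frame {
    struct Frame *caller;
} Frame;

/* Returns 1 if following ->caller from start ever loops back on itself,
 * 0 if the chain ends in NULL. One pointer advances a single step per
 * iteration, the other two steps; they can only meet if there is a cycle. */
static int caller_chain_has_cycle(const Frame *start) {
    const Frame *slow = start;
    const Frame *fast = start;
    while (fast != NULL && fast->caller != NULL) {
        slow = slow->caller;
        fast = fast->caller->caller;
        if (slow == fast)
            return 1;
    }
    return 0;
}
```

Unlike printing a backtrace, this terminates even when the chain is circular.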
ooh, uninlining is doing a naughty 14:51
dalek MoarVM/reframe: ead7190 | jnthn++ | src/spesh/deopt.c:
Uphold no heap->stack pointers rule in deopt.
15:03
15:05 zakharyas joined
jnthn That wasn't The Bug causing the hang though 15:08
In fact, totally disabling spesh doesn't unbust it 15:09
15:19 ggoebel114 joined 15:20 nebuchadnezzar joined
jnthn oh, I think it did fix the circular thingy...and now it runs and leaks a ton 15:20
Just forgot to free the ->work and ->env, that's all :P 15:22
OK, current status: NQP builds/test passes. Rakudo builds. :) 15:26
nwc10 it compiles? ship it!
jnthn A bunch of test fails, though they fail a sanity check I put in.
Or at least, the first I checked did 15:27
Well, recompiling 'cus I note I didn't compile some bits of Rakudo with said sanity check
dalek MoarVM/reframe: be0ca8f | jnthn++ | src/core/continuation.c:
Missing stack->heap force in continuation invoke.
15:33
jnthn Better. Guess it's spectest time...
Some issues, but guess it's "close enough" so will flip it on in the branch for others to join in with the testing. 15:39
timotimo \o/
liking the sound of that very much
dalek MoarVM/reframe: 354ed2c | jnthn++ | src/core/frame.c:
Allocate frames on the per-thread callstack.

This means we get simple add/subtract the pointer cleanup of the frame memory itself. A few remaining issues, but gets through the NQP build/test and Rakudo build/test, with remaining few issues in the spectests.
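The "add/subtract the pointer" cleanup can be pictured with a small bump allocator. This is a sketch under assumptions: `CallStackRegion`, `REGION_SIZE`, `frame_alloc` and `frame_free` are invented for illustration, a single fixed-size region is used, and sizes are assumed to be multiples of the platform's alignment.

```c
#include <assert.h>
#include <stddef.h>

#define REGION_SIZE 4096   /* hypothetical region size */

/* One region of call stack memory with a bump pointer. */
typedef struct {
    char  data[REGION_SIZE];
    char *alloc;           /* next free byte in data */
} CallStackRegion;

static void region_init(CallStackRegion *r) {
    r->alloc = r->data;
}

/* Allocating a frame just advances the pointer. Returns NULL when the
 * region is full; a real VM would presumably move on to another region. */
static void *frame_alloc(CallStackRegion *r, size_t size) {
    size_t remaining = (size_t)(r->data + REGION_SIZE - r->alloc);
    if (remaining < size)
        return NULL;
    void *p = r->alloc;
    r->alloc += size;
    return p;
}

/* Freeing the most recent frame just moves the pointer back. */
static void frame_free(CallStackRegion *r, size_t size) {
    assert((size_t)(r->alloc - r->data) >= size);
    r->alloc -= size;
}
```

This only works because frames are released in the reverse order they were allocated; a frame that has to outlive its caller's return is the case that needs promotion to the heap.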
15:41
jnthn A "further down the line" thing will be moving ->env and ->work onto that callstack too, but that'll be problematic in a few places. 15:42
So want to get this part of it straightened out first.
timotimo that'd be swell :) 15:43
jnthn The weird thing is that I'm getting some hangs in spectests, but every moar process is stuck at zero CPU
dalek MoarVM/decont_assign_nativerefs: 3181590 | timotimo++ | src/spesh/facts.c:
set KNOWN_BOX_SRC on all the native ref ops in fact discovery
15:44
MoarVM/decont_assign_nativerefs: a87daa3 | timotimo++ | src/spesh/optimize.c:
try rewriting decont_* into set or getlex

didn't manage to trigger it even once in user code, though, so 100% untested. also lots of debug spam.
nwc10 jnthn: before your fixes (and hence still on master) I can see one of the spectests hanging, with all threads sitting waiting for a mutex
(peacefully, until the end of time)
jnthn I've got a test with a single thread that's sat waiting on a condvar... 15:46
oh wait, is this gonna be...
nwc10 so it might be that something like that is happening, to get everything idle but unending
jnthn I think it's that threads were premature-exiting 15:52
timotimo ./perl6-m tools/build/install-core-dist.pl /home/timo/perl6/install/share/perl6 15:54
Internal error: zeroed target thread ID in work pass
known?
dalek MoarVM/reframe: 4503bd6 | jnthn++ | src/gc/roots.c:
tc->thread_entry_frame may be on the stack.
MoarVM/reframe: 5301caf | jnthn++ | src/core/frame.c:
Update thread_entry_frame if promoted.
jnthn timotimo: One of the above coulda potentially just fixed that. 15:55
timotimo doesn't seem so
jnthn But I didn't try make install yet
timotimo ah, ok
jnthn Seeing just a small handful of spectest fails 15:56
And one just gave "Internal error: invalid thread ID 107110992 in GC work pass" 15:57
timotimo that's more threads than we could start in a lifetime!
jnthn One explodes with Trying to unwind over wrong handler also, which is interesting. 15:58
timotimo i accidentally ran the primes code from the channel the other day with jit and it didn't asplode 16:02
oh btw, is there any sense in resolving the "real data"/"replaced" pointer of P6Opaque during gc runs? or do we perhaps already do that? 16:03
jnthn Resolving? 16:04
Surely we follow it when GC-marking?
timotimo making it not needed for future uses of the object, i mean
i don't 100% understand how the mechanism is implemented, fwiw :) 16:05
jnthn ah
timotimo but it seems like an indirection that could be removed when we copy the object and its data around anyway
jnthn maybe but...not sure it's common enough to be worth it
It'd need the GC to allocate it a bigger spot when moving it 16:06
timotimo OK, maybe i'll whip up a measurement tool for that
jnthn It's not clear how to factor that so it doesn't slow everything down for the sake of a few
Time for a break...can't concentrate any more... 16:22
But that unwind bug seems to be at the heart of many problems. 16:23
timotimo time for you to unwind a bit :)
16:43 domidumont joined
dalek MoarVM/reframe: 830f246 | jnthn++ | src/gc/roots.c:
Active handler frames may be on the call stack.
18:23
jnthn grr, the unwind bug is tricky indeed 18:32
Shows up when attempting to resume after an exception 18:33
And with a junk active handler record
Let's see if valgrind can give a hint... 18:35
timotimo food time /o/ 18:38
nwc10 that's not the usual hint that valgrind gives 18:40
jnthn Yeah...its output is more likely to send me to the beer fridge or the whisky cupboard... 18:41
Unfortunately, though, no additional insight from valgrind. 18:54
Seems the failing sanity check is catching the problem before there's a chance for corruption... 18:55
nwc10 is it one of the regular test scripts that you're running here?
jnthn t/spec/S10-packages/basic.rakudo.moar 18:56
Is the current one
nwc10 ah OK.
jnthn Though a number of them do it
nwc10 ASAN build not yet finished.
have you got something locally enabled, or is the code in git good enough?
jnthn Nothing that should matter locally 18:57
I have gist.github.com/jnthn/38fbd0a76284...386fce8fec 18:58
But not found anything that trips over those two yet
nwc10 ASAN goes SEGV
regular boring SEGV
jnthn Oh? 18:59
nwc10 NULL pointer dereference
#0 0x7f132a41f5f1 in MVM_interp_run src/core/interp.c:433
jnthn You've got MVM_JIT_DISABLE set also?
nwc10 pants
no
not in that window
thanks
jnthn :)
Yeah, I did the exact same thing 5 mins ago :)
nwc10 reaches the end of the test (I think) 19:01
ok 81 - .WHO of nested subset definition stringifies to long name
# FUDGED!
jnthn huh, so it does for me too on Linux
Boom with a GC error on Windows
nwc10 Odd, and annoying 19:02
jnthn t/spec/S03-sequence/nonnumeric.rakudo.moar I did get to blow up under valgrind
nwc10 (that it can't be replicated on Linux)
ok 25 - 'A' ... 'ZZ' does not go on forever
Trying to unwind over wrong handler
no ASAN barfage 19:03
jnthn Right, and same on Windows too
Yeah :(
That'll be some fun
nine_ jnthn: thanks! Updated the pull request
nwc10 I'm not sure if I can be of much help in the next $soon, so I might be AFK for a bit
jnthn Yeah, I'm getting tired to the point of totally useless for today also :) 19:04
Though glad that the refactor got far enough along today that we're already onto worrying about spectests.
Disabling the fixed size allocator didn't make things any differently faily 19:05
Mildly suspect continuations given where it fails, but can't see anything obviously missed. 19:06
jnthn gives up for the day 19:08
nine_: Cool, I'll look tomorrow :) 19:09
nine_ Thanks :) Have a very nice rest :)
jnthn Thanks! :)
19:14 brrt joined
brrt oh wait, jnthn, you mean that ->work can move too? 19:15
jnthn brrt: Not in this set of refactors, no
brrt hmmmm... how much further down the line will that be, then? 19:16
also, nwc10, where do you get a SEGV with jit enabled... oh wait, you probably don't run reframe-jit
ehm, we already compile with -fno-omit-frame-pointer 19:17
ok, that's interesting to know 19:18
jnthn brrt: Not sure, but when it happens it will only be a small handful of ops where it might happen.
brrt hmmm...kay
jnthn Rather than with FRAME where anything that allocates can make it happen.
brrt we'll add a check to those ops, then
jnthn Yeah
Invocation and continuation invocation are two of the paths, when we're already falling out of the JITted code. 19:19
brrt oh... of course
what about making a continuation? 19:20
jnthn reset, or control?
reset no, control yes 19:21
brrt from the current frame? i'm fuzzy on continuations
jnthn Ah, there's 3 operations
reset = mark the root of the continuation, control = take it (slicing everything off the stack down to the reset), invoke = restore those things we took onto the current stack
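The three operations can be mimicked on a toy stack of integers. This is only an analogy under stated assumptions: `Stack`, `Continuation` and the `cont_*` functions are hypothetical, and real continuations slice call frames, not ints.

```c
#include <assert.h>
#include <string.h>

#define MAX 16

typedef struct {
    int items[MAX];
    int top;        /* number of items currently on the stack */
    int reset_at;   /* position of the most recent reset mark */
} Stack;

typedef struct {
    int items[MAX];
    int count;
} Continuation;

static void push(Stack *s, int v) {
    assert(s->top < MAX);
    s->items[s->top++] = v;
}

/* reset: mark the root of the continuation. */
static void cont_reset(Stack *s) {
    s->reset_at = s->top;
}

/* control: take the continuation, slicing everything above the mark
 * off the stack. */
static Continuation cont_control(Stack *s) {
    Continuation k = {{0}, 0};
    k.count = s->top - s->reset_at;
    memcpy(k.items, s->items + s->reset_at, (size_t)k.count * sizeof(int));
    s->top = s->reset_at;
    return k;
}

/* invoke: restore the things we took onto the current stack. */
static void cont_invoke(Stack *s, const Continuation *k) {
    assert(s->top + k->count <= MAX);
    memcpy(s->items + s->top, k->items, (size_t)k->count * sizeof(int));
    s->top += k->count;
}
```

In this picture, control and invoke are the operations that move entries between the stack and somewhere else, which matches the "reset no, control yes" answer above.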
brrt aha 19:22
jnthn goes for a walk before he gets too tired to do that also :) 19:23
brrt negative jit entry label, again 19:40
moritz maybe it's a synthetic codepoint instruction :-) 19:42
brrt more probably, it's a jit entry label that doesn't belong to the function
question is, how does that happen
19:45 zakharyas joined 19:48 nebuchadnezzar joined
brrt answer, we're really in the caller's jit code 19:49
so evidently something went wrong somewhere
sequence number 924 19:52
wow....
924 frames before this goes wrong
figure that
and there's the thing; our callee is seq nr 405 19:56
.. interesting 19:59
plot thickens
but, that's for tomorrow
timotimo huh, i don't see any assign_i, assign_n or assign_s be emitted ... 20:56
ah, ok, in json_fast it does happen 20:57
but the writers of those registers turn out to only getlex 20:58
to only be getlexes, i mean
and that's not something spesh could improve, i don't think
21:17 avar joined 21:21 avar joined 21:23 Ven joined
dalek MoarVM/decont_assign_nativerefs: 154eb7b | timotimo++ | src/spesh/optimize.c:
copy-paste of try_optimize_decont_native for assign_* ops

doesn't do anything, also doesn't trigger often.
23:44