01:08 Ven joined 03:28 pyrimidine joined 04:13 leedo_ joined 05:21 leego joined 05:50 geekosaur joined 05:53 pyrimidi_ joined 05:58 geekosaur joined 06:16 domidumont joined 06:36 TimToady joined 06:53 geekosaur joined
nwc10 good \o, o/ 07:09
07:32 domidumont joined 07:38 domidumont joined 07:40 geekosaur joined 08:12 pyrimidine joined 08:51 zakharyas joined 08:53 pyrimidine joined 08:58 brrt joined
brrt \o nwc10 08:58
yoleaux2 13 Dec 2016 19:20Z <nine> brrt: nice advent calendar post! Though I have to admit that I do not yet fully understand the tile syntax. There are many "reg"s and in the explanation I'm not sure to which ones you refer. Also "the expression" is quite ambiguous.
brrt thanks nine :-)
.tell nine: I see.... I can elucidate it a bit more if you want. 08:59
yoleaux2 brrt: What kind of a name is "nine:"?!
brrt .tell nine I see.... I can elucidate it a bit more if you want
yoleaux2 brrt: I'll pass your message to nine.
jnthn moarning o/ 09:01
yoleaux2 03:08Z <MasterDuke> jnthn: explicitly putting a QAST::Op with a say in the while body prints (what i gave the say) for every line of the input file. but 'say "hi"' in the -ne doesn't ever print (nor does .say)
04:44Z <MasterDuke> jnthn: and -p doesn't print the file contents
dalek MoarVM: 8c6d4c5 | jnthn++ | src/gc/debug.h:
Only do MVM_ASSERT_NOT_FROMSPACE in GC debug mode.
09:14
09:39 pyrimidine joined
jnthn Today's unfortunate "fun": two optimizations interact to screw up the "no GC during spesh" invariant 09:53
Well, spesh is of course optimizing, that's its job 09:54
Lazy deserialization is the other one
Turns out that in a couple of places spesh looks at method caches or resolves wvals
Those can both trigger lazy deserialization
That in turn leads to trying to acquire a reentrant mutex
nwc10 er erk 09:55
you have a plan for this now?
"moar coffee"?
jnthn The reentrant mutex then marks the thread blocked, but finds itself to have been GC interrupted, so goes ahead and GCs
Breaking the inaviant 09:56
*invariant
A plan? Well, the obvious one for the method cache is "don't bother if we didn't deserialize the thing"
Since in that case we're trying to optimize a codepath that was never taken anyway (we'll fail) 09:57
The inlining wval issue is far less clear cut
Oh...actually, we can just treat it as a reason to rule out inlining 09:58
So yeah, we can fix the two cases that crop up 09:59
And I can see if there's any others liable to cause such trouble
This doesn't strike me as entirely robust however
nwc10 is it viable to add enough flags to spot the general problem?
jnthn Yes
nwc10 but at that point can it do anything other than panic/abort? 10:00
jnthn Detecting the situation robustly isn't too difficult.
nwc10 (which at least gives a bug report)
jnthn But....right, that's all we can do if we detect it.
But given that in that position we'll currently likely end up hitting a SEGV or other corruption, it's probably an improvement. 10:02
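The detect-and-panic idea discussed above can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not MoarVM's code: `ThreadCtx`, `in_spesh`, `gc_requested`, and `maybe_enter_gc` are hypothetical names, and a status code is returned where a real VM would panic, so that the behaviour is easy to check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified per-thread state (not MoarVM's thread context). */
typedef struct {
    int in_spesh;      /* non-zero while the specializer is running */
    int gc_requested;  /* non-zero if another thread has asked us to GC */
    int gc_runs;       /* how many collections this thread has joined */
} ThreadCtx;

typedef enum { GC_NOT_NEEDED, GC_RAN, GC_INVARIANT_BROKEN } GcResult;

/* Called at any point that may enter the GC, for example when a thread
 * marks itself blocked before waiting on a mutex. */
static GcResult maybe_enter_gc(ThreadCtx *tc) {
    assert(tc != NULL);
    if (!tc->gc_requested)
        return GC_NOT_NEEDED;
    if (tc->in_spesh) {
        /* "No GC during spesh" would be violated. A debug build could
         * panic here, which at least yields a usable bug report. */
        return GC_INVARIANT_BROKEN;
    }
    tc->gc_requested = 0;
    tc->gc_runs++;
    return GC_RAN;
}
```

With `in_spesh` set and a collection pending, `maybe_enter_gc` reports `GC_INVARIANT_BROKEN` instead of collecting. That mirrors the trade-off in the conversation: detection cannot recover, but it fails loudly at the point of the violation rather than later as memory corruption.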
10:03 pyrimidine joined
jnthn MVM_6model_get_how is a third such vulnerable path 10:04
10:17 pyrimidine joined 10:29 Ven joined
nine ~~ 10:47
yoleaux2 08:59Z <brrt> nine: I see.... I can elucidate it a bit more if you want
nine brrt: yes, that would be nice :)
brrt okay, here goes 11:05
so the actual 'tile' that is matched against the tree is (xor reg (load (index reg reg $stride) $size)) 11:08
the symbolic return value of that tile is a 'reg'
which means that this tile can be selected to stand in for a tree wherever an 'upper' tile needs a 'reg'
so for instance: if you'd have (add (const 4 4) (xor (load ...) (load ..))); then the tiler could select the (add reg reg) tile by replacing (const 4 4) with a 'reg' segment, and the first (load ...) with something that yields a reg; and then the (xor ... (load ..)) with our template 11:10
does that make sense
it's an excessively complex sentence, so it might not :-)
the point is that these 'reg' thingies are /symbols/ that allow the tiler to connect and select the correct tiles
they are also overloaded as /values/ by the register allocator, in that the register allocator tries to allocate a register for every 'reg' thing that is alive 11:11
so three things come into play here: 11:12
- the tiler uses them as symbols, the same way it could use 'flag' or 'void' as a symbol
- the tile template generator uses them to distinguish 'valuelike' tree nodes from 'constant' tree nodes, and generates a 'path', which is a string of indexes into the tree 11:13
- the register allocator uses the generated path (along with a bitmap) to find out how the tiles' values are used; they are the 'virtual registers' in the 'three-address-code' that is the tile list 11:14
and finally, the register allocator assigns real registers to these virtual registers 11:15
so that the tile implementation (the little C + asm code bit) can use them to generate the correct machine code 11:16
(i apologise for terrorizing you with detail) 11:19
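The role of 'reg' as a connecting symbol in the explanation above can be illustrated with a toy tiler in C. This is a sketch under stated assumptions: `Node`, `Op`, and `cover` are hypothetical names, the tile set is reduced to four tiles that all yield 'reg', the `index`, `$stride` and `$size` parts of the real tile are left out, and tiles are chosen greedily from the root down (largest tile first), which is a stand-in technique and not necessarily how MoarVM's tiler selects tiles.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OP_CONST, OP_LOAD, OP_ADD, OP_XOR } Op;

typedef struct Node {
    Op op;
    struct Node *left, *right;  /* unused children are NULL */
} Node;

/* Returns how many tiles are needed to cover the tree rooted at n.
 * Every tile in this toy set yields the symbol 'reg', so any covered
 * subtree can stand in wherever a parent tile asks for a 'reg'. */
static int cover(const Node *n) {
    assert(n != NULL);
    switch (n->op) {
    case OP_CONST:                    /* tile: (const) -> reg */
        return 1;
    case OP_LOAD:                     /* tile: (load reg) -> reg */
        return 1 + cover(n->left);
    case OP_ADD:                      /* tile: (add reg reg) -> reg */
        return 1 + cover(n->left) + cover(n->right);
    case OP_XOR:
        if (n->right->op == OP_LOAD)  /* tile: (xor reg (load reg)) -> reg */
            return 1 + cover(n->left) + cover(n->right->left);
        /* fallback tile: (xor reg reg) -> reg */
        return 1 + cover(n->left) + cover(n->right);
    }
    return 0;
}
```

For a tree shaped like `(add (const) (xor (load (const)) (load (const))))`, the second load is folded into the xor tile, so one tile fewer is needed than if every node got its own tile. The first load still gets a separate tile whose result is the 'reg' the xor tile consumes.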
dalek MoarVM: 85cac62 | jnthn++ | src/gc/finalize.c:
Add a fromspace assertion in finalize.

To try and detect a maybe-bug in there.
11:27
MoarVM: 1c06af2 | jnthn++ | / (5 files):
Avoid a number of spesh GC invariant violations.

In some cases, we could end up triggering lazy deserialization during spesh. This is undesirable, and can break the "no GC during spesh" invariant by acquiring a reentrant mutex, which along the way will block for GC and so may also enter the GC. (It also means we're
MoarVM: 5214212 | jnthn++ | src/ (3 files):
Panic if we try to GC when speshing/JITing.

Enabled by MVM_GC_DEBUG being non-zero.
nine brrt: hey I asked for the details :) 11:31
dogbert17 jnthn: although I'm hiding at $work, let me know if I can help somewhat by running spectests in the background 11:33
11:35 pyrimidine joined
dalek MoarVM: 5a96721 | jnthn++ | src/6model/sc.c:
Ensure we don't leak partially deserialized objs.

This (perhaps a tad over-zealous) check makes sure that, in the case we are in the middle of deserializing something, we don't hand it back from the SC when doing a "try to get this" style lookup.
11:46
MoarVM: cb118d7 | jnthn++ | src/spesh/ (2 files):
Avoid two more GC triggers inside of spesh.
MoarVM: b94d9cf | jnthn++ | src/core/frame.c:
Fix unrooted frame around SC object lookup.

While these only ever cause gen2 allocation, they can still trigger GC due to mutex acquisition needing to enter the GC, and lazy deserialization can trigger that.
11:55
brrt chuckles at the 'eventually static' pun 12:12
dalek MoarVM: 2edb721 | jnthn++ | src/core/interp.c:
Remove some GC debug code.

This was put in before MVM_GC_DEBUG level 2. It's now subsumed by it. Worse, one of these had a nasty bug; better leave it to the more general mechanism.
MoarVM: ccafaa4 | jnthn++ | src/spesh/args.c:
Avoid reading nativerefs in spesh.

Reading them triggers a boxing, which in turn triggers GC, which violates the spesh no-GC invariant.
jnthn brrt: Where'd you see that one? :) 12:13
I'm not sure whether or not I coined it. :P
dogbert17 jnthn: looks like you opened a can of worms 12:14
jnthn Well, it was already open, I'm just trying to make sure the worms aren't escaping all over our users :) 12:15
brrt in the 2014 advent calendar
your post, in fact
jnthn That S15-nfg/many-threads.t test was very good at shaking them out
brrt damn, is it end of 2016 already?
dogbert17 cool, I'm running make spectest with your latest fixes (1 meg nursery for speed), should I holler if I see any fails or should I wait for more fixes? 12:16
jnthn brrt: Yeah, I was looking for a way to talk about the way that dynamic code tends to have a static execution profile in many cases, and at the same time at $dayjob stuff was having various discussions of eventual consistency, so the term "eventually static" kinda fell out of the intersection of them :-) 12:17
dogbert17: Please do :)
dogbert17: I'm about to have lunch
But have S15-nfg/many-threads.t running in a loop inside of gdb
(Stops looping if it panics or SEGVs)
dogbert17 me too :-) so the tests can run while eating
jnthn :) 12:18
bbi30 :)
brrt it's a good pun imho :-) 12:19
brrt wonders if/when we can revive the 'trace spesh' effort
jnthn back 12:57
timotimo wonders about escape analysis, as well as allocation-site measurements 12:58
dogbert17 for me, t/spec/S15-nfg/many-threads.t still fails (MoarVM commit ccafaa4cf8e0b0e1) 13:03
jnthn dogbert17: I'm trying a slightly smaller nursery size now, just to see if I can squeeze anything more out of S15-nfg/threads.t but given it ran over lunch I suspect probably not
Oh :/
dogbert17 let me gdb
nwc10 use more 'lunch' ? 13:04
dogbert17 17 spectests failed during lunch (1 Meg nursery)
as usual I might have made some kind of mistake, I get: Missing or wrong version of dependency 'gen/moar/Metamodel.nqp' (from 'gen/moar/BOOTSTRAP.nqp') 13:07
jnthn Sounds like something didn't get rebuilt
nine dogbert17: it's not you, it's our build system...
dogbert17 what do I need to do; I only dir a 'perl Configure ...' followed by 'make install' from the nqp/MoarVM dir 13:10
s/dir/did/
dogbert17 tries again 13:11
timotimo might want a make clean in rakudo if not a Configure.pl 13:12
dogbert17 timotimo: thx, will try 13:13
timotimo: you were 100% correct, rerunning spectests now with 64k nursery 13:25
dalek MoarVM: 42655a7 | jnthn++ | src/6model/sc.h:
Remove well-meaning but bogus assertions.

Our GC is partially concurrent: while the mark phase is certainly done with the world stopped, we sweep afterwards. This means that we may encounter objects whose headers are still marked with the GEN2_LIVE flag, which is cleared after the stop-the-world phase, and that this can happen legitimately.
timotimo i'm glad 13:27
dogbert17 and we can now safely ignore my earlier comment about 17 failed spectests :-)
dalek MoarVM: e4784c9 | jnthn++ | src/core/interp.c:
Cope with push being used on concurrent queues.

We should probably not have re-purposed push for sending on a concurrent queue. But, with that being done, it's possible the push will trigger GC and so `obj` is out-dated when we come to consider the object write barrier.
13:28
jnthn dogbert17: How's it looking now? Or still waiting for results? :) 13:29
dogbert17: btw, managed to reproduce something close to github.com/MoarVM/MoarVM/issues/449 13:30
dogbert17 I'm only at the S06* test so far. Only one fail, i.e. t/spec/S17-supply/syntax.t, but we had a separate issue for that if I remember correctly 13:44
ha, that's the one you're looking at :-) 13:45
jnthn Yeah, I'm pretty far into figuring it out too 13:51
brrt timotimo: any idea why the throwlexpayload thingy broke? 13:54
i mean, it looked good
timotimo no clue :( 14:07
it might have caused a frame to be jitted that has some other op in it that's not right
dogbert17 jnthn: have run spectest to completion with 128k nursery only t/spec/S17-supply/syntax.t failed 14:12
dalek MoarVM: c45cb0e | jnthn++ | src/io/ (6 files):
MVMROOT around putting work on concurrent queue.

It was missing in this set of places (will audit for further ones; the timer case was a cause of problems for S17-supply/syntax.t).
14:14
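Why a missing root matters with a moving collector can be shown with a toy C model. This is a sketch under stated assumptions: `toy_alloc`, `toy_gc`, `root_push`, and `root_pop` are hypothetical names, the "heap" is two tiny semispaces, and no forwarding pointers are kept, so it only illustrates the idea behind rooting a local variable and is not how MVMROOT is implemented.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int value; } Obj;

#define SPACE_SIZE 8
static Obj spaces[2][SPACE_SIZE];  /* two semispaces */
static int cur = 0;                /* index of the space we allocate in */
static size_t used = 0;            /* slots used in the current space */

/* Temporary roots: addresses of local variables that hold object pointers. */
static Obj **temp_roots[SPACE_SIZE];
static size_t num_roots = 0;

static void root_push(Obj **slot) {
    assert(num_roots < SPACE_SIZE);
    temp_roots[num_roots++] = slot;
}

static void root_pop(void) {
    assert(num_roots > 0);
    num_roots--;
}

static Obj *toy_alloc(int value) {
    assert(used < SPACE_SIZE);
    Obj *o = &spaces[cur][used++];
    o->value = value;
    return o;
}

/* Copy every rooted object to the other space and update the rooted
 * variable, then poison the old space. Simplification: an object rooted
 * twice would be copied twice, since no forwarding pointers are kept. */
static void toy_gc(void) {
    int to = 1 - cur;
    size_t copied = 0;
    for (size_t i = 0; i < num_roots; i++) {
        spaces[to][copied] = **temp_roots[i];
        *temp_roots[i] = &spaces[to][copied++];
    }
    for (size_t i = 0; i < used; i++)
        spaces[cur][i].value = -1;  /* anything left behind is now garbage */
    cur = to;
    used = copied;
}
```

A pointer whose address was passed to `root_push` before `toy_gc` runs is updated to the object's new location. An unrooted copy of the same pointer keeps referring to the poisoned old slot, which is the kind of stale reference the rooting fixes in these commits guard against.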
dogbert17 cool, I wonder how many RTs your fixes will close 14:15
dalek MoarVM: 81c5d27 | jnthn++ | src/io/eventloop.c:
MVMROOT eventloop queue when polling it.
14:22
jnthn wonders how so much worked before these fixes :P 14:27
nwc10 "because it hates you"? 14:28
ilmari the wrong kind of luck?
nwc10 the wrong kind of lunch?
ilmari yeh, the late kind
have been dealing with an outage at work
nwc10 :-( 14:29
expense pizza delivery?
jnthn It's "wrong kind of luck" pretty much.
With normal GC intervals you're fairly unlikely to run into many of these. 14:30
nwc10 as long as lunch is ultimately "better late than never" that's probably OK
timotimo oh, those fixes are in time for the release, too 14:31
jnthn All fixes are in time for the next release. :P
MVM_GC_DEBUG is really not bad at catching stuff these days, especially at level 2 14:32
It gets stuff that we used to get away with even in small nursery size tests. 14:33
brrt timotimo: i wonder if we can add jit bisecting for those kind of things, too 14:34
jnthn Darnit. S17-supply/syntax.t has a test near the end that runs for ages. I thought it was going to make it through with the above fix. But no...there's still something. 14:36
And I forgot to stick the breakpoint on MVM_panic... 14:37
So, here we go again...
timotimo should be possible 14:38
brrt thinks about it for a bit 14:42
dogbert17 now all spectests passed for me with a 64k nursery 14:45
I meant 128k nursery 14:46
jnthn dogbert17: Still having trouble with syntax.t on a 28KB one
dogbert17 ok, i'll try that as well
jnthn I did 32KB then went for 28KB to try and catch things that get lucky on power-of-2 boundaries :) 14:47
Guess we could pick prime sizes... :)
dogbert17 got it with 32k 14:49
jnthn In syntax.t?
dogbert17 yes
jnthn is trying another tweak he's done locally
Found a very sneaky way that the inter-generational invariant mighta been violated
(The "gen2 must not point to nursery unless it's in the inter-gens list" one) 14:50
dogbert17 here's a gist, does it confirm your suspicions? gist.github.com/dogbert17/93a8ce5f...8c30b7d42c 14:51
jnthn That one actually looks fairly different 14:52
Mine is while it's working on test 68
dogbert17 aha, here it must have been test 54
jnthn yeah, I've not had it fail there 14:53
In quite a lot of runs
(all the ones where I was looking for 68)
But I'm on 28000 bytes of nursery
I'll try 32768 after
(Once I've nailed the current issue)
jnthn wonders if any of these match up with the docs build issue 14:54
dogbert17: I've run github.com/MoarVM/MoarVM/issues/450 quite a few times on HEAD (well, and one more local fix) and can't reproduce it 14:56
(So if you can I'm interested :)) 14:57
dogbert17 will try 14:58
jnthn: gist.github.com/dogbert17/d0de6163...1266c0a640 15:02
I'm on 81c5d27ebbb99, i.e. 'MVMROOT eventloop queue when polling it.' 32k nursery 15:07
dalek MoarVM: 1bbd82f | jnthn++ | src/6model/reprs/ConcBlockingQueue.c:
Do MVM_ASSIGN_REF after block/unblock.

Suppose that we do it before, and the target of the assignment is in the nursery and has already been seen there once. Thus we don't put the target into the inter-generational set because it's not in gen2. Then, when we block/unblock around the lock acquisition, GC triggers and the target is promoted. At this point the node to add into the linked list may point to a nursery object, but since we didn't actually chain the new item into the linked list representing the queue yet, the GC could never spot it at the point of promotion, so it would not place the target into the inter-gen set at this
MoarVM: 21abc2a | (Jimmy Zhuo)++ | src/core/frame.c:
Fix more unrooted frame around SC object lookup.
15:10
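The ordering problem described in the commit above can be modelled with a toy write barrier in C. This is a sketch under stated assumptions: `Obj`, `assign_ref`, `toy_promote`, and `violates_invariant` are hypothetical names, and `toy_promote` simply flips a flag to stand in for a GC run that promotes the target without being able to see the pending reference. It is not the code of MVM_ASSIGN_REF.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Obj {
    int in_gen2;        /* 1 once promoted to the old generation */
    int in_remembered;  /* 1 if already in the inter-generational set */
    struct Obj *ref;    /* the single reference this toy object holds */
} Obj;

#define MAX_REMEMBERED 16
static Obj *remembered[MAX_REMEMBERED];
static size_t num_remembered = 0;

/* Write barrier: record a gen2 object that starts pointing into the nursery. */
static void assign_ref(Obj *target, Obj *value) {
    assert(target != NULL);
    if (value && target->in_gen2 && !value->in_gen2 && !target->in_remembered) {
        assert(num_remembered < MAX_REMEMBERED);
        remembered[num_remembered++] = target;
        target->in_remembered = 1;
    }
    target->ref = value;
}

/* Stand-in for a GC run, triggered while blocking on a lock, that promotes
 * the target but cannot see the reference it holds. */
static void toy_promote(Obj *o) {
    o->in_gen2 = 1;
}

/* "gen2 must not point to the nursery unless it is in the inter-gen list". */
static int violates_invariant(const Obj *o) {
    return o->in_gen2 && o->ref && !o->ref->in_gen2 && !o->in_remembered;
}
```

Assigning first and promoting afterwards leaves a gen2 object pointing at a nursery object with nothing recorded, so `violates_invariant` returns 1. Promoting first and assigning afterwards lets the barrier see the final generation of the target and record it, which corresponds to doing the assignment after the block/unblock.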
jnthn 68 in S17-supply/syntax.t survives with that one 15:13
Checking it another time or two to try and be sure
dogbert17 you're quickly working towards another blogpost :-) 15:14
jnthn :P
Well, I've gotta write my advent post 15:15
So suspect that'll eat the bloggingz time this week
I did at least start writing the code I need for my advent post last night :)
dogbert17 did you see my socket gist above? 15:16
jnthn Yes, *also* in finish_store 15:18
Really want to try and recreate that locally 15:21
dogbert17 32k nursery does it for me
jnthn Yeah, will try after this latest run of syntax.t completes, provided it comes out OK 15:22
dogbert17 the syntax test which fails for me is: is $order, '123', 'multiple channels in whenever blocks work'; 15:23
jnthn Yeah, my 28k nursery gets through that fine every time...
Hopefully 32K will catch it 15:24
Oh, I just realized... 15:30
The code explodes inside of the Rakudo C extensions to the VM 15:31
But I didn't re-compile those
dogbert17 oO 15:32
jnthn So MVM_ASSIGN_REF won't have got the GC-debug version of itself
Which would explain why I don't see the issue ;)
Anyway, compiling with that to see
dogbert17 would be nice if you manage to reproduce the errors 15:33
jnthn Indeed
Need a quick break; bbi15 15:36
dogbert17: Hmm, still no luck 16:10
ooh, but with an 8KB nursery I provoke something 16:15
And yes, it's in the place you got it. 16:17
dogbert17 jnthn: I did a make clean on both moar and rakudo and now the problems are gone ?!? 16:20
I think we're good :) 16:22
jnthn No, it's certainly still busted
Well, it was :)
dogbert17 syntax.t?
jnthn Yeah
dogbert17 :)
jnthn With the error you got in finish_store
But I think I've patched it :)
dogbert17 cooool 16:23
must go afk, bbl 16:25
16:25 Ven joined 16:36 TimToady joined
jnthn Pushed that fix (in Rakudo repo). 16:54
16:56 pyrimidine joined
jnthn Turns out with a really tiny nursery you can squeeze even more out of that socket read vs recv test 17:04
Which, bizarrely, makes it look like when the nursery is small enough it doesn't get properly zeroed o.O 17:06
JimmyZ jnthn: github.com/MoarVM/MoarVM/blob/b94d...me.c#L1017 looks like it's missing an MVMROOT for the 'code' object? 17:11
jnthn JimmyZ: Yes, and will need pulling apart into two lines 17:12
(I didn't do a full audit of those yet) 17:13
JimmyZ the missing MVMROOT part is hard, haha; it is (and will be) always here and there
jnthn Yeah...need to write less of this stuff in C :) 17:16
OK, think I've not the concentration for hunting down more stuff today 17:17
And should save some energy to start preparing my advent post
lizmat jnthn++ 17:46
jnthn: just as a datapoint: the last fix in rakudo did not take away the problem with HARNESS_TYPE=6 make spectest 17:47
Unhandled exception in code scheduled on thread 13 17:48
No such method 'end-entries' for invocant of type 'Match'
although I needed 4 runs this time before it happened, so maybe some stability was added by that fix after all
18:28 japhb joined 18:30 domidumont joined 18:33 domidumont joined 18:45 pyrimidine joined
jnthn lizmat: I'd say there's a 90% chance that one will boil down to a code-gen bug rather than a VM bug, fwiw :) 18:45
lizmat jnthn: after the MoarVM bump I haven't been able to get HARNESS_TYPE=6 to crash 18:46
4 runs so far
jnthn Hmm :) 18:47
Interesting.
5th time's a charm :P
Or maybe it really was one of today's fixes, though if so I'm very curious which :) 18:48
BTW, bumping MOAR_REVISION was fine :) 18:49
It's hard to imagine anything I did today making things worse rather than better :-)
timotimo i thought worse is better
lizmat no, less is moar :-) 18:50
jnthn moar or less...
lizmat $ 6 ''
real 0m0.099s
wow, been a long time since I've seen that below 100 msecs
jnthn :)
Wonder if that was because I turned off the accidentally enabled GC sanity checks in a few ops... 18:51
japhb Yay! \o/
jnthn oh no, japhb /o\
I accidentally the whole day fixing GC bugs and didn't yet get to the OO::Monitors thing
japhb I STRIKE FEAR INTO MORTALS
jnthn Though I did think about it some
japhb Fair enough.
lizmat jnthn: 5th run ok 18:52
18:54 pyrimidine joined
jnthn lizmat: Not bad. :) 18:54
lizmat testing Zoffix's last fix now 18:55
jnthn japhb: Gotta be afk for a bit now, but I pushed my idea that may help to the branch maybe-fix-exception-stuff in the OO::Monitors repo 18:57
japhb: Didn't have chance to turn the report into a proper test case and try if it helps yet 18:58
jnthn bbiab 18:59
19:07 FROGGS joined
lizmat jnthn: the 6th was the charm :-( 19:08
===Unhandled exception in code scheduled on thread 10
No such method 'end-entries' for invocant of type 'Match'
moritz end-entries is a Tap method, not Match, right? 19:10
lizmat yup 19:11
it gets confused
actually, not sure it's a Tap
it's a method from Entry::Handler in lib/TAP.pm6 19:12
or from class State 19:13
which does TAP::Entry::Handler 19:14
so it gets confused about a method that is also provided by a role that it does
moritz I've seen such confusion with parameterized roles, in single-threaded code 19:17
lizmat so perhaps it's JIT / SPESH / related 19:18
moritz RT#130183 19:19
synopsebot6 Link: rt.perl.org/rt3//Public/Bug/Displa...?id=130183
19:21 domidumont joined
moritz that one doesn't seem to be SPESH-related though 19:21
maybe they are separate issues after all
19:30 pyrimidine joined 19:58 Ven joined
dalek MoarVM/missing_mvmroot: ee6b817 | (Jimmy Zhuo)++ | src/ (2 files):
add more missing mvmroot
20:09
JimmyZ jnthn: ^^ please review, thanks :) 20:10
jnthn JimmyZ: I will, but probably only makes sense to do it when I'm rested :) 20:38
So will look tomorrow :)
japhb jnthn: Sadly, doesn't look like maybe-fix-exception-stuff fixed the problem. :-( 20:44
jnthn japhb: Aww :( I thought that would nail the exception getting turned into cannot invoke... 20:48
japhb is sad also
20:48 domidumont joined
jnthn I figured it was because of the EXCEPTION(...) call that takes place inside of a CATCH block 20:49
A LEAVE is the same code path for success and exceptions
timotimo mumroot sounds vaguely edible 21:54
maybe you can brew a tea with mumroot. maybe it has some medicinal properties 21:55
geekosaur "enhances memory" :p 22:01
timotimo %) 22:04