github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
00:07 lizmat left 00:45 MasterDuke left 01:05 MasterDuke joined 05:13 kanopis joined 05:28 nebuchadnezzar left 05:29 nebuchadnezzar joined 05:34 robertle left 05:47 kanopis left 06:46 robertle joined 07:40 domidumont joined 07:46 domidumont left 07:47 domidumont joined 08:03 masak joined
masak aloha. AlexDaniel``++ found a spesh-related regression: github.com/rakudo/rakudo/issues/2057 08:04
is there some way I can help pinpoint it?
m: "" ~~ /{ (make 0 for 0) }/ && print "." for ^100
camelia .Cannot bind attributes in a Nil type object
in regex at <tmp> line 1
in block <unit> at <tmp> line 1

.................................................................................................
08:37 brrt joined 09:19 lizmat joined
brrt \o 09:19
masak o/ 09:34
brrt ohai masak 09:42
long time no \o
samcv o/ 09:47
brrt ohai samcv 09:49
jnthn o/ 10:08
masak: You can try running it with MVM_JIT_DISABLE=1 in the environment to rule out that part of the opts. Also MVM_SPESH_INLINE_DISABLE=1 and MVM_SPESH_OSR_DISABLE=1 each rule out various kinds of optimization.
brrt points at tools/jit-bisect.pl with --spesh option, which automagically figures out which options are decisive 10:16
given a test case that fails with a nonzero exit code 10:21
jnthn brrt: Any progress towards getting guards covered by exprjit? 10:29
(Unless that happened without me noticing... 10:30
brrt no... I hadn't thought of that as a priority 10:37
I can do that for sure
jnthn Well, they quite often show up in sequences of instructions 10:38
like do X, guard it, do stuff with it
brrt i see 10:48
I don't think implementing them should actually be a problem. It might trigger a sync because we might deoptimize 10:49
well, it would trigger a sync
lizmat hmmm... seeing one test break: S04-phasers/in-loop.t 14
double checking...
Geth MoarVM: 5429bf254d | (Samantha McVey)++ | build/Makefile.in
Add uthash_types.h to Makefile.in

This had been missed when the file was added.
10:50
MoarVM: e766345db9 | (Samantha McVey)++ | 2 files
Speed up hash garbage collection a lot (gc_mark); add new macros

Add new MVM_gc_worklist_add macros. These new macros assume that the worklist is large enough to hold the new entries, so MVM_gc_worklist_presize_for must be called beforehand. This removes one check and a potential function call, which is why they are named `nocheck`.
... (14 more lines)
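The presize-then-add-unchecked idea in this commit message can be sketched as below. This is a minimal illustration only: the `Worklist` type and the `worklist_*` and `mark_all` functions are hypothetical stand-ins, not MoarVM's actual worklist structure or its `MVM_gc_worklist_*` macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified worklist; not MoarVM's real data structure. */
typedef struct {
    void  **items;
    size_t  count;
    size_t  alloc;
} Worklist;

/* Grow once so that `extra` more entries are guaranteed to fit.
 * Returns 1 on success, 0 if allocation failed. */
static int worklist_presize_for(Worklist *wl, size_t extra) {
    if (wl->count + extra > wl->alloc) {
        size_t new_alloc = wl->count + extra;
        void **grown = realloc(wl->items, new_alloc * sizeof *grown);
        if (!grown)
            return 0;
        wl->items = grown;
        wl->alloc = new_alloc;
    }
    return 1;
}

/* Checked add: pays for a capacity test (and possibly a realloc) per entry. */
static int worklist_add(Worklist *wl, void *item) {
    if (!worklist_presize_for(wl, 1))
        return 0;
    wl->items[wl->count++] = item;
    return 1;
}

/* Unchecked add: the caller must have presized. The assert documents that
 * contract and compiles away when NDEBUG is defined. */
static void worklist_add_nocheck(Worklist *wl, void *item) {
    assert(wl->count < wl->alloc);
    wl->items[wl->count++] = item;
}

/* Mark every slot of a container with a single capacity check up front. */
static int mark_all(Worklist *wl, void **slots, size_t n) {
    size_t i;
    if (!worklist_presize_for(wl, n))
        return 0;
    for (i = 0; i < n; i++)
        worklist_add_nocheck(wl, slots[i]);
    return 1;
}
```

In this sketch null slots are pushed as-is; filtering them would be left to whatever later drains the list, which keeps the marking loop free of branches.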
masak brrt: I should hang out more on this channel. not doing so is more due to forgetfulness than anything else.
masak jnthn: oki, trying now
jnthn brrt: Can we defer the sync until we know we're deoptimizing, 'cus the guard fails? 10:52
masak jnthn: problem persists for MVM_JIT_DISABLE and MVM_SPESH_OSR_DISABLE, but goes away for MVM_SPESH_INLINE_DISABLE
jnthn (As in, can we in theory...)
lizmat yeah, confirmed: S04-phasers/in-loop.t test 14 fails, will bump nonetheless
masak adding this info to the github issue.
jnthn lizmat: Does it fail due to a MoarVM change or an NQP change, ooc? 10:53
lizmat good question
lemme double double check 10:54
nqp fad8b7cdaf2eae9aae appears to be a candidate 10:55
(in nqp) ?
checking if a revert fixes that 10:56
brrt jnthn: let me check 10:57
masak: welcome back :-)
(i hang out in #perl6 far too little) 10:58
masak heh :)
I'm mostly a 007 programmer these days, but recent developments have made me more interested in the runtime/bytecode end of things
lizmat jnthn: it's not that commit 11:02
jnthn lizmat: Phew :) 11:04
lizmat I'm going to bump now anyways... if it's still there in a day or two, I'll make a blocker issue
jnthn OK
lizmat m: for ^1 { POST .say; 42 } 11:10
camelia WARNINGS for <tmp>:
42
Useless use of constant integer 42 in sink context (line 1)
lizmat the essence of the test 11:11
After the bump, it becomes Mu
jnthn Hm, that can't even really be spesh to blame 11:19
Because it's a loop 1 time
Well, unless spesh breaks the compiler, but that's a stretch too for such a short program to be compiled 11:20
Geth MoarVM/new-deopt-point-algo: df0ecf2b0f | (Jonathan Worthington)++ | src/spesh/usages.c
Add unseen read handling to new deopt algorithm

So that it can better handle loops.
11:21
lizmat jnthn: maybe it used to work because of a bug, and you fixed the bug 11:22
test 15, the one after, is similar, but todoed
jnthn Hmm, that analysis is a bit expensive 11:23
brrt how's 007 going 11:26
samcv don't bump moarvm. may be some issues with my latest commit. may cause some tests to fail 11:51
timotimo the bump already happened, samcv
samcv oh ok
so it must not have issues? 11:52
:P
well i can bump it again otherwise
brrt or we screwed up and fix it later :-)
brrt has done that some times
timotimo it bumped to masak's typo fix commit i think?
samcv though not certain if it causes issues. but i'm compiling newest rakudo so i can check
masak gosh, I hope my typo fix is not causing issues :P 11:57
samcv ok the bump didn't have my change in it. so all is ok 12:04
Geth MoarVM: d8e1ad9fca | (Samantha McVey)++ | src/gc/worklist.h
Fix a crash caused by the previous commit

It turns out this can be null, so we really need to check before accessing the struct.
12:05
timotimo morepypy.blogspot.com/2015/10/pypy...nts-2.html 12:08
samcv jnthn: any more places that could benefit from quicker MVM_gc_worklist_add where we add a bunch of things at once? i'm working on adding it for arrays in addition to hashes 12:09
brrt so... I have a union of a struct and a int32; and I know the struct fits into the int32 12:10
I kind of want to use that union as 'just' integers at some point
samcv 'just' integers?
brrt I know that's not actually legal in C these days, but I also know that it will work
yes
I have an array of those unions, and one part of it is an integer, and the other is a tiny struct 12:11
samcv you mean like mystruct - mystruct = 0?
brrt no, i mean: MVMint32 *ints = (MVMint32*)structs; MVMint32 i = ints[10];
samcv ah so it's a struct of int's 12:12
brrt uhm, i mean MVMint32 *ints = (MVMint32*)unions;
samcv and the unions are 32 bits in size?
what else is in the struct other than int32?
brrt yes, the union is like union { struct { char x[4]; }; int y; }
that's basically it
samcv ah ok
timotimo samcv: when we mark SCRef maybe?
brrt so; it should never break if I cast it to integers, and deref that 12:13
on the other hand, what I understand from C standard UB is that this is not legal
samcv yeah it's not. though you could add a compile assert and make sure the size of the struct is the right size 12:14
if it is the right size i'd say you're okay
brrt can i do that in C? I know there's static_assert in C++
samcv yeah let me find it
brrt what I could do is take a reference to the y
as in: MVMint32 *ys = &(unions[10].y); 12:15
i think that'd be legal?
samcv brrt: stackoverflow.com/a/809465/8046034 12:16
essentially you make an assert macro. and if the condition is true, it makes a typedef of a positive number of bytes. but if it's not it tries to make it negative and the compiler errors 12:17
it's not a C feature until recently, but the macro is fully standards compliant, and doesn't assume anything odd about the compiler 12:18
and also won't make the code any bigger as well. there's other ways that do `char foo[sizeof(mything)==2]` but then you define a variable, and you can only put it where variables are allowed 12:22
though that should get removed when we compile on O3, but the typedef is nicer as it doesn't add variables
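The typedef trick samcv describes can be sketched as follows. The macro name `STATIC_ASSERT`, the `Cell` union, and `cell_as_int` are made up for illustration, and the inner struct is given a member name (`s`) so the sketch does not rely on anonymous structs.

```c
#include <assert.h>
#include <stdint.h>

/* Compile-time assertion that works before C11: the array size is 1 when
 * `cond` holds and -1 (a compile error) when it does not. Only a typedef is
 * declared, so no variable is added to the program. */
#define STATIC_ASSERT(cond, name) \
    typedef char static_assert_##name[(cond) ? 1 : -1]

/* Hypothetical stand-in for the union from the discussion. */
typedef union {
    struct { char x[4]; } s;
    int32_t y;
} Cell;

/* Fails to compile if the union is ever wider than one 32-bit integer. */
STATIC_ASSERT(sizeof(Cell) == sizeof(int32_t), cell_is_one_int32);

/* Read element i as an integer through the union member itself, which
 * avoids casting the whole array to a different pointer type. */
static int32_t cell_as_int(const Cell *cells, int i) {
    assert(cells && i >= 0);
    return cells[i].y;
}
```

On a C11 compiler the built-in `_Static_assert(sizeof(Cell) == sizeof(int32_t), "message")` does the same job without a macro.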
12:24 brrt left
jnthn samcv: I can only think of those two off hand 12:28
lizmat looks like Travis is red because build fails in stage optimize when moar=master 12:44
samcv jnthn: the order of the worklist doesn't matter right? 12:46
lizmat afk for a bit& 12:47
jnthn samcv: Not really. I mean, it's cache helpful if things we'll process close in time are close together
But correctness wise, no 12:48
samcv ok thanks
Geth MoarVM/new-deopt-point-algo: be56234ee0 | (Jonathan Worthington)++ | 6 files
Switch over to using the new deopt use method

We don't currently ever throw away any deopt usages, so in the immediate term this is a step backwards (since it removes the "can't do a deopt after the last deopting instruction" optimization). However, this will let us work out places where we fail to keep something alive for deopt that we should be.
12:51
jnthn Well, that explodes CORE.setting...
ah, turns out one thing I didn't think was needed in the old handling is :P 13:04
Geth MoarVM/new-deopt-point-algo: 787b7bb7e6 | (Jonathan Worthington)++ | src/spesh/usages.c
Value written by deopt instruction also needed
13:05
13:11 lizmat left, reportable6 left, reportable6 joined 13:13 brrt joined
brrt samcv: cool 13:18
samcv it looks like SCRef sometimes pushes null's but even if it does it's faster to just not check and check later when processing the worklist 13:22
jnthn Yeah :) 13:23
samcv though most of the time it doesn't. but i tested one of the roast files which did a lot of them
jnthn It's because lazy deserialization :)
samcv since i knew it was faster in most files, in which that doesn't happen
timotimo how about a vectorized null-check for a whole bunch of pointers at once? 13:24
for quickly skipping over big amounts of not-yet-deserialized objects
samcv not sure if that'd be faster?
timotimo there's sometimes like a hundred null pointers in a row
samcv does that even happen often enough?
timotimo well, the core setting is huge and many programs don't use most of the classes in there
jnthn True, but first of all you need a program that runs long enough to reach a gen2 collect. :) 13:25
timotimo also true
jnthn But yes, once that happens, the CORE.setting one should contain a lot of non-deserialized things
13:26 lizmat joined
timotimo do we ever clean up objects we've deserialized? i don't think we do. do we allocate them in the gen2 by default, then? 13:26
in that case we can skip checks for gen2 there, too
jnthn Well, SCs are eligible for GC 13:30
timotimo right, only when the SC itself gets collected do its objects die
jnthn But yeah, we presume that once we've deserialized them once, they'll be needed again 13:31
timotimo could be a tiny bit faster, too
jnthn Ugh. First attempt at using the new deopt data to decide what things we don't need to keep immediately throws away something important 13:37
Now to figure out why.
That said, a spectest comes out looking alright before I try deleting deopt usages that we in theory no longer need. 13:38
So at least the more precise capture of them seems to not be missing anything in the first place. 13:39
Geth MoarVM: 6ec68988d5 | (Samantha McVey)++ | src/6model/reprs/VMArray.c
Rename VMArray gc_mark to VMArray_gc_mark

This makes it easy to see which `gc_mark` is taking up time when we profile MoarVM code.
13:58
MoarVM: 8ac13c2942 | (Samantha McVey)++ | src/6model/reprs/VMArray.c
Optimize VMArray_gc_mark to be a bit faster

Use the new macros to get a bit of speed improvement.
samcv well this makes it a bit faster when that code is used
MoarVM: 7bf4c429f2 | (Samantha McVey)++ | src/6model/reprs/SCRef.c
Optimize SCRef_gc_mark by using faster MVM_gc_worklist_add calls

Use these faster calls, which give a noticeable improvement in code that has to do a fair amount of SCRef gc marking.
samcv the SCRef that is. and the array as well too
jnthn Ah, found it 14:16
Call optimization can introduce additional guards
And they get their own deopt point
But none of them steal/move the deopt point that was in that area in the first place 14:17
Thus we think it's not used
samcv++ # GC speedups 14:19
Geth MoarVM/new-deopt-point-algo: 3345ec3646 | (Jonathan Worthington)++ | 5 files
Fix handling of added deopt points

During call optimization, we might choose to insert extra deopting instructions, each of which gets its own annotation. However, we were then leaving the original annotation on a prepargs, which is not a deopt-causing instruction, so we would wrongly drop use of the deopt point. Fix by establishing a link between those added "synthetic" deopt points and the original one they stole the target address from.
14:50
jnthn With that, the new more aggressive deopt point deletion builds NQP and Rakudo and passes make test in both. 14:52
Unfortunately, though, the algorithm that assembles the more precise deopt point information is le slow 14:53
lizmat le slow ? 14:54
jnthn Even bad things sound classy in French :P
lizmat hehe
jnthn I suspect I can speed it up some, though it wasn't really worth it until I knew the whole thing wasn't utterly flawed 14:55
This is attempt 4 at getting sufficient deopt handling improvement.
In various cases we lower things that could potentially deopt into things that simply could not. 14:56
I'd also like to make us able to delete guards that we don't use
lizmat .oO( who guards the guards ) 14:57
jnthn Well, rather, that we have but later prove we don't need
But I noticed that even if I did that, we'd still have kept instructions alive in case a guard that is no longer there would deopt. 14:58
And we didn't track, per deopt point, which things it was keeping alive.
This was already an issue in that we'd lower a potentially deopting instruction into one that never could, but again have to assume the worst 14:59
So the branch gives us far more precise tracking of which deopt points keep alive which instructions
And then we do a pass afterwards seeing which deopt points can really deopt 15:00
And delete deopt usages for things that never could. 15:01
Anyway, this approach seems solid model wise.
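A rough sketch of the bookkeeping described above, assuming each value records the indexes of the deopt points that keep it alive. The `ValueFacts` layout, the `can_deopt` table, and both function names are hypothetical and are not taken from the MoarVM branch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEOPT_USERS 8

/* Hypothetical per-value record; field names are illustrative only. */
typedef struct {
    int    normal_uses;                   /* reads by ordinary instructions */
    int    deopt_users[MAX_DEOPT_USERS];  /* deopt points keeping it alive */
    size_t num_deopt_users;
} ValueFacts;

/* Drop deopt usages whose deopt point can no longer deopt (for example
 * because the instruction was lowered to one that never deopts).
 * can_deopt[i] is nonzero if deopt point i may still fire.
 * Returns the number of deopt usages that remain. */
static size_t prune_deopt_users(ValueFacts *v, const unsigned char *can_deopt,
                                size_t num_points) {
    size_t kept = 0, i;
    for (i = 0; i < v->num_deopt_users; i++) {
        int idx = v->deopt_users[i];
        assert(idx >= 0 && (size_t)idx < num_points);
        if (can_deopt[idx])
            v->deopt_users[kept++] = idx;
    }
    v->num_deopt_users = kept;
    return kept;
}

/* A value can be eliminated once nothing reads it and no live deopt
 * point needs it. */
static int is_dead(const ValueFacts *v) {
    return v->normal_uses == 0 && v->num_deopt_users == 0;
}
```

The point of the per-value list is that a value is no longer kept alive "in case something deopts somewhere"; it stays only while a specific deopt point that can still fire refers to it.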
15:13 brrt left
dogbert17 jnthn, lizmat: I'm quite certain that github.com/MoarVM/MoarVM/commit/1b...bc0b3de7e8 is the commit which breaks the Optimize stage on a 32 bit build 15:19
jnthn haha, part of it was O(MG) :)
Now a good bit better :) 15:20
dogbert17: Hmm, interesting.
dogbert17 I'm 90+ percent certain ... 15:21
jnthn Yeah, now even with spesh blocking the test times look sensible again :) 15:22
lizmat jnthn: so it wasn't as expensive as you thought ? 15:23
jnthn lizmat: Well, it *was* expensive the way I originally implemented it 15:24
lizmat aahhh ok
jnthn lizmat: But that's because I was repeatedly calculating an O(n) thing that was in itself in an O(a different n) thing
lizmat yeah, you quickly go to O(MG) then
jnthn Whereas it was possible to make the first thing O(1) and in some cases to realize we didn't need to do the O(n) at that point either. 15:25
That O(n) already being repeated per basic block
So basically it was cubic :)
Though in 3 different things, but I guess they can be expected to correlate in how they grow 15:26
nwc10 jnthn: is this evil O(bother) thing on the main thread? or on the spesh thread (so just slowing down optimisations, not main throughput)?
Geth MoarVM/new-deopt-point-algo: 91540080eb | (Jonathan Worthington)++ | src/spesh/usages.c
Remove leftover debugging code
15:27
MoarVM/new-deopt-point-algo: fc1378ad27 | (Jonathan Worthington)++ | src/spesh/usages.c
Don't repeatedly calculate if preds are seen

We can instead keep track of which basic blocks have had all of their preds handled, and update it by looking at the succs of a basic block after processing it. Better still, if nothing changes, then we know we don't have to bother processing the outstanding reads at all.
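One way to get the constant-time "are all preds seen" check this commit message describes is a per-block counter of unprocessed predecessors. The sketch below is an assumption about how such tracking could look; the `Block` type and function names are invented, and it assumes each block is processed exactly once.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUCCS 4

/* Hypothetical basic block: successor indexes plus a counter of
 * predecessors that have not been processed yet. */
typedef struct {
    size_t succs[MAX_SUCCS];
    size_t num_succs;
    size_t unseen_preds;
} Block;

/* Derive the initial predecessor counts from the successor lists. */
static void init_pred_counts(Block *bbs, size_t n) {
    size_t i, j;
    for (i = 0; i < n; i++)
        bbs[i].unseen_preds = 0;
    for (i = 0; i < n; i++)
        for (j = 0; j < bbs[i].num_succs; j++)
            bbs[bbs[i].succs[j]].unseen_preds++;
}

/* Call once after processing block bb. Returns how many successors just
 * had their last predecessor handled; 0 means nothing changed, so any
 * pending work that waits on "all preds seen" can be skipped. */
static size_t mark_processed(Block *bbs, size_t bb) {
    size_t j, newly_ready = 0;
    for (j = 0; j < bbs[bb].num_succs; j++) {
        Block *succ = &bbs[bbs[bb].succs[j]];
        assert(succ->unseen_preds > 0);
        if (--succ->unseen_preds == 0)
            newly_ready++;
    }
    return newly_ready;
}

/* O(1) query replacing a walk over the predecessor list. */
static int all_preds_seen(const Block *bbs, size_t bb) {
    return bbs[bb].unseen_preds == 0;
}
```

With a loop, the header still has an unseen predecessor (the back edge) when it is first processed, which is the case where reads have to be held as outstanding until the counter reaches zero.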
jnthn nwc10: The spesh thread, but that has to be sync'd up for GC reasons, so if it sticks itself in a C loop for ages then it'll clog up the lot
nwc10 thanks. As you can tell from needing that rider on the end, I'd forgotten that bit 15:28
jnthn Now, do I dare try this with NODELAY... :) 15:32
jnthn does it
dogbert17 go go go :) 15:33
jnthn Really hope it copes
dogbert17 is trying to build Rakudo with asan to see if it has anything to say ...
jnthn This was quite difficult to come up with :)
dogbert17 I'm optimistic 15:35
jnthn NQP is happy 15:36
dogbert17 yay 15:37
meh, my asan experiment failed 15:38
15:42 domidumont left 15:43 robertle left
jnthn Two NativeCall tests are a bit unhappy under NODELAY 15:44
dogbert17 that wasn't too much was it? 15:46
jnthn No, and I didn't actually run with NODELAY for a day or two, so should check if it even is my branch 15:47
My next task will be to sort out guards so we can handle proving they ain't needed 15:51
After that, I'll focus on cleaning up after all the changes :-)
And the Rakudo blockers
btw, for those of you working on spesh things: once the branch lands, the facts section includes for each register kept alive by deopt a list of the indexes of the deopt points keeping it alive. 15:54
That together with the DU graph means we now have a very clear picture of why things can't be eliminated 15:55
*DU work 15:56
Looks like what spectest failures under the nodelay exist are all error reporting/backtrace related 15:59
Which we know is an issue
And the nativecall one seems to be about OSR 16:02
And yeah, it was this branch that did it 16:04
dogbert17 do you know what the problem might be? 16:14
jnthn Though we're in unoptimized code when things go wrong 16:22
16:38 japhb joined
jnthn Ah, I think spesh plugins are perhaps to blame 16:57
They meddle with the graph a bit too much 16:58
16:58 japhb left
jnthn Hm, maybe 16:58
argh, yes, certainly
17:00 japhb joined
jnthn D'oh 17:01
That code will have to shuffle along to the optimize phase
Oh, and I see why the previous algorithm got away with it too 17:02
Though it'll be a tad more involved than just calling the code from a different place, 'cus it will also have to use that new synthetic deopt point mechanism 17:03
Will fix it later. 17:04
dinner & 17:05
lizmat samcv: sanity check: if a hash is not altered wrt to keys, will each iteration on the hash produce the same order or not ? 17:42
samcv: nvm, it needs to be consistent across different hashes with the same keys
timotimo that won't be the same 17:43
lizmat yeah
17:47 greppable6 left, greppable6 joined 17:55 domidumont joined
dogbert17 interesting, valgrind gets angry with t/spec/S24-testing/3-output.t 18:01
perhaps something for jnthn: gist.github.com/dogbert17/d6172f84...aac890ffb4 18:03
18:03 domidumont left
lizmat jnthn timotimo moritz: if "Perl 5 is Turing complete, impure, untyped, locally-scoped, dynamically-dispatched with non-local control flow" 18:18
how would you characterize Perl 6 in one line ?
I guess s/untyped/gradually typed/ 18:19
18:25 robertle joined
TimToady .oO(The language that does everything more beautifully than you.) 18:38
with a nod to Martha Stewart...
18:43 buggable left, buggable joined 18:44 buggable left
lizmat fwiw, trying to draft a comment on news.ycombinator.com/item?id=17525175 to expose any difference between Perl 5 and Perl 6 18:44
18:45 buggable joined, buggable left 18:47 buggable joined 19:05 buggable left 19:07 buggable joined 19:08 buggable left, buggable joined 19:12 buggable left, buggable joined
lizmat m: my %m is Map = ^1000; %m.WHICH for ^100; say now - INIT now 19:14
camelia 0.8990476
lizmat when I --profile this ^^^ , the 2 top entries are: the Bool *prototype* in Mu, and find_method from Metamodel.nqp:1105 19:17
and they take up 40% of the CPU for that code.
that seems weird to me: is this some artefact? Should I make a ticket for it?
the Bool prototype appears to create 2 CallCapture objects for each call (or thereabouts) 19:19
*confused* 19:20
jnthn Something sounds a bit off there, yes
dogbert17 turns out valgrind is unhappy about a lot of scripts, same error though, in lessen_deopt_requires_for_bb() 19:38
lizmat jnthn: so ticket ? 19:39
jnthn dogbert17: Hah, I don't have to debug that then :D 20:26
dogbert17: That function ceased to exist in my current branch :) 20:27
lizmat: Yeah
lizmat jnthn: moar or rakudo issue ?
jnthn lizmat: Sounds like Rakudo 20:28
lizmat ok, will do
dogbert17 jnthn: lol 20:30
MasterDuke lizmat: fwiw, here's a perf report of your above example (with the two numbers both increased to 5000) gist.github.com/MasterDuke17/8192f...325a0afc35 21:15
huh, just tried to --profile it and got `MoarVM oops: Spesh: failed to fix up handler 1 in <unit> (930, -1, -1)` 21:16
lizmat aha, ok so something screwy going on 21:19
MasterDuke updated the gist with a gdb backtrace after putting a breakpoint in MVM_oops 21:23
21:26 dogbert11 joined 21:28 dogbert17 left 21:29 AlexDaniel left 21:30 AlexDaniel joined
jnthn tries a patch to fix the facts last known issue in the new deopt algo 22:24
yoleaux 20:58Z <lizmat> jnthn: do you think .hash should always return a .Hash , or can it also return an immutable Map ?
jnthn .tell lizmat It's a contextualizer, so, so long as it returns something Hashish it's all good.
yoleaux jnthn: I'll pass your message to lizmat.
jnthn .tell lizmat Uh...Hash-ish :P
yoleaux jnthn: I'll pass your message to lizmat.
jnthn NQP is happy after the changes, and the failing NativeCall test also 22:26
Doing a full NODELAY Rakudo build now 22:27
22:33 robertle left
Geth MoarVM/new-deopt-point-algo: 8821a35820 | (Jonathan Worthington)++ | 2 files
Move speshresolve handling into optimize phase

Previously, the guards were inserted at the facts phase. We got away with this when deopt processing was more lax and in band with the facts analysis. However, now it takes place after. That's still OK for most guard insertion cases. But spesh plugins aren't a guard insertion. They are a significant graph change, rendering the graph too different from ... (10 more lines)
22:36
jnthn Hurrah. 22:38
MasterDuke how close is the branch to merging? 22:40
jnthn With luck, 3 minutes 22:41
ah, maybe a couple longer on this machine
And after that, though not today, my next task will be the changing of the guards :) 22:42
So that they produce a new SSA version
Which will allow us to understand data flow before/after the guard, to eliminate ones we can prove we don't need
Like if we call .chars we currently guard the result being an Int 22:43
But if we've inlined it, which we will because it's tiny, then we can see it box a box_i to type Int 22:44
MasterDuke nice
jnthn I'll then be very close to having my int $i = $str.chars; being able to optimize away the boxing
Geth MoarVM/master: 9 commits pushed by (Jonathan Worthington)++ 22:49
jnthn There we go :) 22:50
dogbert11 jnthn++ 22:55
t/spec/S17-promise/lock-async-stress2.t flaps. I'm quite certain it has nothing to do with your recent merge though. 23:14
jnthn No, don't think so, except to the degree that if you make things faster then you are more likely to expose existing races. 23:16
dogbert11 Not enough positional arguments; needed at least 2 in block at t/spec/S17-promise/lock-async-stress2.t line 11 23:17
23:42 lizmat left