01:23 Kaiepi joined
timotimo wow 01:27
we may want to keep this branch around for brrt so he can check performance differences of using the exprjit more or less ;) 01:30
01:45 Kaiepi joined
MasterDuke so that branch uses the exprjit more? 01:58
timotimo right, the beginning of a BB is a place where the exprjit can be "turned on"
it currently turns off at every BB's end
it also turns off after every sp_guard* op
by preventing BBs from being fused at the end of spesh when such an operation was in the last 4 ops, I thought i'd give it the ability to start up more quickly after such ops 01:59
MasterDuke hm. i didn't turn on the jit log, guess i could do that and actually count the BAILs 02:00
timotimo shouldn't influence BAIL at all 02:02
just the "built tree out of [a, b, c, d, e]" lines
would also change the "bytecode size" lines and their sum 02:03
MasterDuke ugh, not as easy to analyze
timotimo yes :( 02:07
i gots a one-liner for us 02:11
MasterDuke i just generated two logs
timotimo grep "Build tree " /tmp/no_fuse_coresettingspeshlog.txt | perl6 -e 'slurp.comb(/\w+ )> \,/).Bag.sort.say' 02:12
this counts every op in the "build tree out of" lines
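[editor's sketch] The one-liner above tallies every op name found in the tree lines of the JIT log. A rough Python equivalent follows, assuming each such line carries a bracketed, comma-separated op list (the exact line shape is an assumption; the sample lines are made up for illustration):

```python
import re
from collections import Counter

# Assumed line shape (not confirmed here): "... tree out of: [op_a, op_b, op_c, ]"
BRACKETS = re.compile(r"tree out of[^\[]*\[([^\]]*)\]")

def count_ops(lines):
    """Count how often each op name appears across all tree lines."""
    counts = Counter()
    for line in lines:
        match = BRACKETS.search(line)
        if match is None:
            continue  # not a tree line, skip it
        ops = [op.strip() for op in match.group(1).split(",")]
        counts.update(op for op in ops if op)  # ignore empty trailing entries
    return counts

# Hypothetical sample input, not taken from a real log.
sample = [
    "Build tree out of: [getlex, set, decont, ]",
    "Build tree out of: [set, set, ]",
    "some unrelated line",
]

if __name__ == "__main__":
    # Print one op per line, most frequent first.
    for op, n in count_ops(sample).most_common():
        print(f"{n:6d} {op}")
```

Summing the counts gives the total number of ops fed to the exprjit, which is the figure compared between branches.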
MasterDuke gist.github.com/MasterDuke17/da8a9...07d950be0f 02:14
timotimo probably want .say for ... instead of .say at the end 02:16
MasterDuke hm, would that be more useful sorted by value (i.e., count)?
timotimo 186943 ← no_fuse, 189182 ← master 02:18
why? :\
yes, it would have been :)
MasterDuke gist updated
timotimo well, i'm maximum confuse. 02:19
maybe i got the two files flipped
oh! 02:22
the no_fuse_bb branch isn't based on latest master
that sucks :)
Geth MoarVM/no_fuse_bb_after_guard: 65e952b9b0 | (Timo Paulssen)++ | src/spesh/optimize.c
don't fuse BB right after a guard op

arguably the biggest beneficiary of fusing BBs is the exprjit. However, it currently bails out whenever it sees a sp_guard* op.
timotimo force-pushed a rebase
i bet that's why you asked earlier 02:23
MasterDuke re-running some builds 02:25
timotimo m: say 189891 / 186056
camelia 1.020612
timotimo hum. 2% more instructions fed to the exprjit
doesn't seem too amazing tbh
m: 17552119 / 17347709 02:26
camelia WARNINGS for <tmp>:
Useless use of "/" in expression "17552119 / 17347709" in sink context (line 1)
timotimo m: say 17552119 / 17347709
camelia 1.011783112
timotimo the output grows by just over 1%, too
that's the "bytecode size" there
1755... is on no-fuse-bb and the other is on master
MasterDuke well, it's now about the same parse time, not slower 02:28
timotimo grepping out the "Cannot handle" lines, i.e. what my fusing is about, gives me 1255 + 1172 + 115 on no_fuse and 1010 + 846 + 110 on master
which makes sense, given the exprjit tries to start more often, so it also gets to stop more often 02:29
m: say 81447 / 79122 02:30
camelia 1.029385
timotimo this is the factor between "Build tree out of" line occurrences
MasterDuke have you read brrt's blog post? is any of that work approachable if one isn't brrt? 02:31
timotimo "explicit cleanup of spesh worker thread" seems like a small addition to the spesh worker shutdown branch i made a while ago 02:32
not sure about the others
MasterDuke btw, gist updated
timotimo jit bisect improvements might be easy
m: say 79122 / 186056; say 81447 / 189891 02:33
camelia 0.425259
0.428914
timotimo oops, those are reversed
m: say 79122 R/ 186056; say 81447 R/ 189891
camelia 2.351508
2.331467
timotimo that's the average number of instructions for every exprjit tree 02:34
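[editor's sketch] The same division in Python, using the figures quoted above. Which branch each number belongs to is inferred from the surrounding messages rather than stated outright, so treat the labels as an assumption:

```python
# Ops fed to the exprjit and number of "Build tree out of" lines, as quoted in the chat.
# The branch labels are an inference from context, not confirmed by the log.
ops_fed = {"master": 186056, "no_fuse_bb": 189891}
trees_built = {"master": 79122, "no_fuse_bb": 81447}

def avg_tree_size(branch):
    """Average number of ops per expression tree for one branch."""
    return ops_fed[branch] / trees_built[branch]

if __name__ == "__main__":
    for branch in ops_fed:
        print(f"{branch}: {avg_tree_size(branch):.6f} ops per tree")
```

Under that labelling the branch builds about 3% more trees but feeds in only about 2% more ops, so the average tree gets slightly shorter.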
ideally, no expr trees would become shorter, just shorter ones being added 02:35
gist.github.com/timo/ba2f23781b652...1773e790a8 - no_fuse_bb on the left, master on the right; how many elems in the tree vs how often it happened in the log 02:41
ideally it'd be more on the left every time 02:42
doesn't happen in for 9, for example 02:56
or 10, or 20, or 21, or 22, or 24
*shrug*, everything further would have to be analysis with knowledge of the file format
so i'll go to bed instead :)
good night!
MasterDuke later
02:58 ilbot3 joined
06:31 domidumont joined
07:06 domidumont joined
07:31 robertle joined
07:41 AlexDaniel joined
07:56 brrt joined
brrt . 07:56
yoleaux 27 Feb 2018 22:08Z <jnthn> brrt: "MoarVM in some places assumes that we have 'easy' access to the current instruction pointer, e.g. to lookup lexical variables" - did you mean s/lexical/dynamic/ ?
brrt yes, that's what i meant
.tell timotimo the usual suspect for bigger-code in the expr jit is spilling
yoleaux brrt: I'll pass your message to timotimo.
07:57 zakharyas joined
brrt .tell MasterDuke yes, some of these were very much intended to be easy-onramps :-) 08:03
yoleaux brrt: I'll pass your message to MasterDuke.
08:16 domidumont joined
nwc10 good *, #moarvm 09:02
samcv good * 09:59
09:59 domidumont joined
10:16 zakharyas joined
10:21 zakharyas joined
11:55 robertle joined
12:43 domidumont joined
12:49 domidumont joined
12:52 domidumont joined
12:56 zakharyas joined, domidumont1 joined
14:14 AlexDaniel joined
14:20 domidumont joined
timotimo huh, i wonder if brrt could check the assembly output, because it looks .. odd to my untrained eye 15:05
.tell brrt what's with the assembly near the end? add, add, and, add, add, and, add, add, xor, add, add, ror, add, add, and, add, add, and, add, add (bad) ... gist.github.com/timo/f6bc3b75005a5...10cd75ef5a 15:08
yoleaux timotimo: I'll pass your message to brrt.
timotimo is that the constant section that i heard about?
i suppose everything after "ret" is not assembly code at all 15:11
15:38 buggable joined
dogbert2_ hmm 15:46
ww
lizmat zz 15:47
16:01 Kaiepi joined
16:15 lizmat_ joined
16:19 jnthn_ joined
16:24 samcv_ joined
16:29 shareable6 joined, benchable6 joined, quotable6 joined, bisectable6 joined
16:31 timotimo joined, scovit joined
17:04 robertle joined
17:24 greppable6 joined, reportable6 joined, unicodable6 joined
18:15 domidumont joined
19:28 Ven`` joined
timotimo i wrote a little optimization that throws out repeated getspeshslots for the same sslot to the same register (if nothing else assigned to the register in the mean time) 19:28
it seems spectest-happy 19:29
now tbh i haven't checked how much time is actually spent in cursor_start 19:30
or how often it's called
but i imagine we have many pieces of code that looks like $!foo = 1; $!bar = 2; $!quux = 3; $!frob = 4
in any case, cursor_start doesn't even look half bad now 19:36
Geth MoarVM/no_fuse_bb_after_guard: fb07749cea | (Timo Paulssen)++ | src/spesh/optimize.c
Throw out repeated/redundant getspeshslot calls

if the target register and the spesh slot match and there's no instruction in between two calls that write to the same register, we can drop the later getspeshslot call.
... (6 more lines)
19:41
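[editor's sketch] The idea in the commit message above can be illustrated on a toy instruction list. This is not the MoarVM code (that lives in src/spesh/optimize.c and is written in C). The tuple layout and the rule that each instruction writes at most one register are simplifying assumptions made for this example:

```python
def drop_repeated_getspeshslot(instructions):
    """Remove getspeshslot ops that reload the same slot into the same register.

    instructions: list of (opcode, dest_register_or_None, operand) tuples for
    one basic block. This shape is hypothetical, chosen for illustration only.
    """
    loaded = {}  # register -> spesh slot index known to be in it right now
    kept = []
    for ins in instructions:
        opcode, dest, operand = ins
        if opcode == "getspeshslot":
            if loaded.get(dest) == operand:
                continue  # register already holds this slot, drop the reload
            loaded[dest] = operand
        elif dest is not None:
            # Any other write to the register invalidates what we knew about it.
            loaded.pop(dest, None)
        kept.append(ins)
    return kept

# Hypothetical block: the second load is redundant, the third is not,
# because "set" overwrote r1 in between.
block = [
    ("getspeshslot", "r1", 3),
    ("bindattr", None, "r1"),
    ("getspeshslot", "r1", 3),
    ("set", "r1", "r2"),
    ("getspeshslot", "r1", 3),
]

if __name__ == "__main__":
    for ins in drop_repeated_getspeshslot(block):
        print(ins)
```

The tracking table is reset per basic block here, which sidesteps any question of what other control-flow paths may have written to the register.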
20:03 Kaiepi joined 20:05 Kaiepi joined 20:07 brrt joined
brrt ohai 20:07
yoleaux 15:08Z <timotimo> brrt: what's with the assembly near the end? add, add, and, add, add, and, add, add, xor, add, add, ror, add, add, and, add, add, and, add, add (bad) ... gist.github.com/timo/f6bc3b75005a5...10cd75ef5a
brrt .tell timotimo these are 0 bytes
yoleaux brrt: I'll pass your message to timotimo.
brrt in the constants section of the code
and yes, everything after 'ret' isn't code
20:26 zakharyas joined
brrt ok, i want to take another shot at the box_s+box_i bug 21:05
my current suspicion is overspilling
timotimo brrt: got a clue why preventing bb fusing after guaranteed exprjit-killers makes stage parse reliably slower? 21:14
yoleaux 20:07Z <brrt> timotimo: these are 0 bytes
timotimo oh, wait, MD said it's now "about the same time" 21:15
22:02 Ven`` joined, Kaiepi joined
MasterDuke timotimo: your latest commit didn't seem to change parse time much, which i found a little surprising, since a year ago it was in 9th place by exclusive time gist.github.com/MasterDuke17/7926d...0621e59b03 22:09
yoleaux 08:03Z <brrt> MasterDuke: yes, some of these were very much intended to be easy-onramps :-)
timotimo well, who knows how much it's actually worth ;) 22:10
22:14 Kaiepi joined
22:17 Ven`` joined
timotimo 1:20.92, 1:21.99, 1:21.57, 1:21.79 - my branch including the commit from 3 hours ago 22:29
1:21.52, 1:20.14, 1:20.93, 1:20.43 - master branch 22:30
looks like my branch is just a tiny bit slower :(
well, that isn't nice :) 22:32
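[editor's sketch] To put a number on "a tiny bit slower", the eight timings quoted above can be averaged. This only restates the chat's figures. With four runs per branch and overlapping ranges, the difference may well be within noise:

```python
from statistics import mean

def to_seconds(stamp):
    """Convert an "M:SS.ss" wall-clock string to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)

# Timings as quoted in the chat.
branch_runs = ["1:20.92", "1:21.99", "1:21.57", "1:21.79"]
master_runs = ["1:21.52", "1:20.14", "1:20.93", "1:20.43"]

branch_mean = mean([to_seconds(t) for t in branch_runs])
master_mean = mean([to_seconds(t) for t in master_runs])
slowdown_pct = (branch_mean / master_mean - 1) * 100

if __name__ == "__main__":
    print(f"branch mean: {branch_mean:.2f}s")
    print(f"master mean: {master_mean:.2f}s")
    print(f"branch is {slowdown_pct:.1f}% slower on average")
```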
looks like i have a phi-common-value-merge patch that doesn't crash things 23:04
ifnonnull only ever gets const-folded with my getenvhash-const-folding patches, i wonder how to make it happen more often 23:07
23:24 SmokeMachine joined
23:28 MasterDuke joined
MasterDuke timotimo: FYI, cherry picking your getspeshslots commit on top of master seems slightly faster 23:29
23:34 MasterDuke joined