MasterDuke man, all sorts of exciting stuff landed this weekend 00:03
timotimo so how did it go back up to 10 gc runs i wonder 00:04
what other things excite you, MasterDuke? 00:06
MasterDuke how much i'm enjoying pubg mobile 00:08
timotimo haha 00:12
i was thinking more along the lines of rakudo-related stuff :) 00:13
MasterDuke you're making profiling better, zoffix is making number stringification better/faster, nine is doing cool stuff with the build, etc 00:15
timotimo i agree on the latter two :P 00:16
and the etc
i'm not making good progress with the profiler bits :|
MasterDuke well, i'm not doing anything much at all, so you're ahead of me 00:18
timotimo you're among the folks providing me with profiler bugs %)
i just replaced the call to .sum with a tiny "clever trick" 00:19
MasterDuke oh? 00:20
timotimo yup, the original code goes kind of like this:
set up an array of a few zeroes so we can smooth out consecutive readings
every frame, shift off the first measurement, push a new measurement at the end, calculate the sum
the new code just subtracts the shifted-off value, adds the pushed value, and keeps the running sum around 00:21
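A minimal Raku sketch of that running-sum trick (the names and window size are made up for illustration; this is not the actual ThreadPoolScheduler code):

    # Keep a fixed-size smoothing window plus a running total, so the sum
    # never has to be recomputed (and .sum never called) on every frame.
    my int $window-size = 5;
    my int @readings = 0 xx $window-size;   # start with a few zeroes
    my int $total = 0;                      # running sum of @readings

    sub add-reading(int $new --> int) {
        my int $old = @readings.shift;      # drop the oldest measurement
        @readings.push($new);               # append the newest one
        $total = $total - $old + $new;      # adjust instead of summing again
        $total
    }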
MasterDuke ah, nice
timotimo the call to the sum method was pessimized because it created an empty hash that went unused every time around (because named arguments) 00:22
we can often throw those away, but not in this case :( 00:23
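For comparison, the call that got replaced was essentially just this (illustrative, not the literal source):

    # Recompute the smoothed total every frame.  As noted above, each .sum
    # call here ended up allocating an empty hash for potential named
    # arguments, and spesh couldn't throw that allocation away in this case.
    my $smoothed = @readings.sum;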
wait, was the sum call already optimized properly the last time? 00:25
the allocations are real low now
i'll do a 300 second measurement 00:26
this one was only a minute long, but it also only had 1 gc run
i'm hopeful :)
i'd really like less than one run per minute 00:27
so a young object has two minutes to become unreferenced before it is incorrectly assumed to be long-lived (because it survived two nursery collections) 00:29
nice. 3 gcs over 5 minutes 00:31
MasterDuke seems pretty good 00:35
timotimo i'm not completely done yet 00:36
i don't think i can remove the BOOTHash allocs from prod-affinity-workers, and i can't remove the BOOTIntArray allocation from nqp::getrusage, but i can perhaps get rid of the Int allocation from getrusage-total 00:37
MasterDuke is it going to julienne fries when you're done? 00:38
timotimo i don't know what that is :o 00:41
first attempt didn't work, anyway
MasterDuke "it slices, it dices, it even does juilienne fries!" 00:44
just a trope about over-the-top ads for things 00:45
to julienne something is to cut it into long thin slices 00:46
timotimo ah, ok 00:48
waiting for a 600s profile to finish now %) 00:49
didn't even do a short one first to see if my second approach would work
when will it finish ... 00:53
there it is
i doubled the amount of allocations :D 00:54
nope, can't get it to go away. oh well. 00:58
pushed the branch 01:09
Geth MoarVM: 5cff02dafe | (Timo Paulssen)++ | 3 files
jit a few ops that ThreadPoolScheduler uses

in particular, these are used by the supervisor thread that wakes up 100x per second.
01:10
MoarVM: e7fee68f1d | (Timo Paulssen)++ | src/profiler/instrument.c
getrusage allocates, so log that in the profile.
timotimo and the new jitted ops
MoarVM: 621ca3c220 | (Timo Paulssen)++ | src/spesh/facts.c
discover type of getrusage in spesh

this allows devirtualization of atpos_i accesses to getrusage's result. Probably barely worth anything cpu-time wise.
MasterDuke cool 01:12
timotimo i'll write up a pull request for the TPS supervisor stuff 01:13
i'll also do a full measurement at 5 minutes run time 01:14
timotimo 5 minutes is so long %) 01:19
timotimo github.com/rakudo/rakudo/pull/1653 01:35
timotimo bedtime 01:36
MasterDuke nice, later... 01:55
timotimo wow. if we change getrusage to fill a given array, we actually can get the supervisor allocation-free! holy cow. 16:06
that is, if inlining prod-affinity-workers doesn't cause allocations of closures or something 16:08
awesome! there's only BOOTIntArray allocations remaining and a few types of objects that are allocated far less often 16:11
Geth MoarVM: 7198eb8014 | (Jonathan Worthington)++ | src/io/procops.c
Convey the process ID of a started child
16:20
timotimo jnthn: do you think making getrusage fill an int array for you instead of allocating one every time it's run is sane? 16:22
*doc brown voice* if my calculations are correct, when this baby runs for 88 hours, you'll see some sick shit!
hell yeah! 16:33
ran for 300 seconds, 0 gc runs
it doesn't look like it's allocating anything during its main loop
jnthn timotimo: Hmm, I guess we could 16:34
timotimo i haven't looked at tweak-workers at all yet, fwiw, because in an empty program it doesn't run
and if the program's doing work, it's probably fine for the supervisor to allocate because the other work stuff is also likely to allocate 16:35
Geth MoarVM: fdb5e4d6bb | (Timo Paulssen)++ | 9 files
make getrusage op modify an existing array

instead of allocating one in the op itself.
16:55
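Roughly the calling pattern after that change, as I read it (the exact op signature, the index constants, and the microsecond math are assumptions for illustration, not lifted from the Rakudo source):

    use nqp;

    # One reusable native int array, allocated once; nqp::getrusage now fills
    # it in place, so the supervisor's ~100 wakeups per second don't allocate
    # a fresh BOOTIntArray each time.
    my $rusage := nqp::list_i;

    sub getrusage-total(--> int) {
        nqp::getrusage($rusage);            # refills the existing buffer
        nqp::atpos_i($rusage, nqp::const::RUSAGE_UTIME_SEC) * 1_000_000
          + nqp::atpos_i($rusage, nqp::const::RUSAGE_UTIME_MSEC)
          + nqp::atpos_i($rusage, nqp::const::RUSAGE_STIME_SEC) * 1_000_000
          + nqp::atpos_i($rusage, nqp::const::RUSAGE_STIME_MSEC)
    }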
MoarVM/spesh-refactor-iffy: 5 commits pushed by (Bart Wiegmans)++ 19:19
brrt \o 20:31
timotimo o/ 20:37
Geth MoarVM/spesh-refactor-iffy: 8fecf3bfae | (Bart Wiegmans)++ | src/spesh/optimize.c
[Spesh] Add optimize_unbox

The original optimize_iffy had a special loop for unboxing boxed integers (and other things). I've taken that out and generalized it for 'unbox' operations. But as far as I've been able to determine, that isn't actually effective (the optimization is never applied). Maybe that is because rakudo outputs better code than ... (6 more lines)
20:45
brrt I am going to make a pull request for that
timotimo brrt: we probably have lots of p6box_i instructions, those would probably not be picked up by this opt, right? 20:46
brrt it basically splits a bunch of optimizations in optimize_iffy into simpler optimizations that can be applied to individual ops
timotimo oh, it doesn't actually check for the opcode at all, nice.
brrt not sure, but that is plausible
well, you wrote that code 20:47
years ago
timotimo yeah, i only just recognized that it's mine :D
brrt it is correct, I think, but I don't think it is actually useful anymore
because the actual optimization is not applied anywhere in nqp + rakudo build 20:49
timotimo right 20:50
i bet our coverage shows that
moarvm.github.io/coverage/libmoar/....html#L372 - check it out!
so it looks like we never reach safety_cur == cur; maybe because it doesn't try to cross BB boundaries? 20:51
do you think the iffy refactor is ready to hit master? 20:55
i really, really want some way to put messages about the spesh process right into the spesh log in an easy-to-reference place ... 20:56
in graph 0x7f048c0c17f0 bb 0x7f048c1576e8 optimized an unbox coming from bb 0x7f048c157688 21:00
MoarVM panic: Register types do not match between value and node
fun!
timotimo this'll be a ginormous trace ... 21:14
no big deal, just 15 megs of diff 21:16
gist.githubusercontent.com/timo/69...tfile1.txt
gist.github.com/timo/c0381e1b84352...14e12bceaf - this tells you when exactly the unbox_i optimization took place 21:17
what the heck? sometimes it switches to a different spesh graph 21:19
that's why the diffs are so humongous 21:20
because it's got a full copy of the one graph every few commits
turned inlining off, now it's only 1.5 megs <3 21:26
gist.github.com/timo/31244796c1022...d872602682 - github still truncates it :< 21:27
should be in commits f56ff15 and af2a4e2 this time, BBs 0x7ffff008c7f8 → 0x7ffff008c858 and 0x7ffff008cc18 → 0x7ffff008cbb8 should have the optimization in them 21:28
aha! 21:32
it was taking the result of the box_i (or similar) rather than the first argument
surprise surprise it doesn't crash any more (in the exact same spot at least) 21:33
not sure why, but i get a bunch of failures with telemetry again 21:37
fails with spesh disabled, too, though ... so not my fault :D 21:40
unless my optimization subtly broke the compiler
nope, even if i build all of rakudo with spesh disabled it still fails those tests 21:44
lizmat and another Perl 6 Weekly hits the Net: p6weekly.wordpress.com/2018/03/26/...ly-perl-6/ 21:56
timotimo runs a stresstest with the fixed version of optimize_unbox 22:10
jnthn timotimo++ # nice blog post 22:34
timotimo thanks :3