nwc10 Odd. hhvm 3.12 was released 11 days ago: github.com/facebook/hhvm/releases but there's nothing on the blog (hhvm.com/blog/), and nothing I spot in the repository that updates changelogs or NEWS or similar from one (or two) releases previously 08:50
and if Pyston can stick to its "every 4 months" plan then 0.5 should arrive in early March: blog.pyston.org/2015/11/03/102/ 08:52
brrt good * #moarvm 09:33
FROGGS o/
nwc10 good UGT, #moarvm
brrt it is apparently release time of the month 09:34
timotimo yeah, even for perl6 10:37
timotimo watched that "data-oriented design" talk and nothing from moarvm comes to mind where an approach like that could help our performance :| 18:13
brrt timotimo: what 'data driven design' talk 20:09
not a talk about the radical idea that you should measure before you can design, right 20:10
i mean, that's not what any other form of engineering has been doing since forever
stronger even; that when you design without measurement, it's really just sketching, not design
anyway, VMs kind of tend to be designed with a lot of exactly that kind of measurement being distilled into the general style of doing things 20:11
timotimo no, not data-driven 20:24
timotimo brrt: it's about "data is more important than code" 20:25
brrt i'm curious 20:26
timotimo it was a keynote at CppCon 2014 20:33
timotimo brrt: would you like to help me figure out the performance of a little script? 20:57
github.com/timo/SDL2_raw-p6/blob/m...e_noise.p6
in theory i suppose i could put a check outside for whether the $pitch is exactly equal to the number of pixels per row times the number of bytes per pixel, which on my machines is true 20:58
and then i can just have a $cursor that i ++
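A minimal sketch of the pitch check and running cursor described above, written in Python rather than Perl 6. The function and parameter names are hypothetical and not taken from the linked script; offsets are counted in bytes here, which is an assumption about how the buffer is addressed.

```python
# Hypothetical illustration: none of these names come from the linked script.

def can_use_cursor(width, bytes_per_pixel, pitch):
    """True when rows are tightly packed, i.e. there is no padding per row."""
    return pitch == width * bytes_per_pixel

def offsets_with_pitch(width, height, bytes_per_pixel, pitch):
    """Byte offset of every pixel, computed per row from the pitch."""
    return [y * pitch + x * bytes_per_pixel
            for y in range(height)
            for x in range(width)]

def offsets_with_cursor(width, height, bytes_per_pixel):
    """Same walk using one running cursor; only valid for tightly packed rows."""
    offsets = []
    cursor = 0
    for _ in range(width * height):
        offsets.append(cursor)
        cursor += bytes_per_pixel
    return offsets
```

When `can_use_cursor` holds, both functions visit the same offsets, so the per-pixel multiplication can be dropped; when the pitch includes padding, the cursor version would drift and the per-row form is still needed.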
the profile says we spend 52.28% inclusive, 42.75% exclusive time in postcircumfix:<[ ]> 21:01
brrt i can try
timotimo (the rest is inside ASSIGN-POS - both of those things are perfectly inlined)
then comes 95.37% of inclusive, 36% exclusive time inside the outer loop in render()
and only 7.1% inside pick 21:02
i get like 14 FPS with this whole setup
oh, my local version has the my $then = now line and the 1 / now - then fps lines thrown out 21:03
because that added a lot of pressure on the GC ... from DIVIDE_NUMBERS and other things
brrt i suppose that loop is as slim as it can be 21:04
timotimo instead i have a $frame = $frame + 1 and say $frame at the end
brrt hmmm 21:05
you spend 50% of your time in [] 21:06
does nativecast return a proxy of sorts?
timotimo no, look at the spesh, it basically turns into just a bindpos 21:07
shall i upload the spesh for you?
brrt prolly, yes :-)
timotimo brrt_whitenoise_data.tar.gz - how does that sound %) 21:12
t.h8.lv/brrt_whitenoise_data.tar.gz
brrt :-)
timotimo what i find surprising is that perf doesn't show a lot of time spent inside the bindpos function at all 21:14
timotimo ==20217== brk segment overflow in thread #1: can't grow to 0x4a56000 21:21
==20217== brk segment overflow in thread #1: can't grow to 0x4a52000
==20217== brk segment overflow in thread #1: can't grow to 0x4a41000
running valgrind with cachegrind on that script, i get a screenful of this 21:22
and why the fucking hell is bn_mul_2d so oft-called? 21:30
i shall put a breakpoint. 21:31
brrt what is bn_mul_2d 21:34
timotimo sorry, bp_mul_2d
comes from libtom
that's apparently a bit-shift to the left by a count of bits 21:35
oh, it comes from set_int 21:37
MVM_bigint_mp_set_uint64 calls it
brrt why bigint, where do bigints come from
i'm sorry, i don't really have the brain left to help you today 21:38
timotimo m: say 4279312947.base(16)
camelia rakudo-moar 620f4e: OUTPUT«FF112233␤»
timotimo it comes from the numbers i put into that array
brrt ah man 21:39
timotimo i think it fits into 32bit though?
OH!
yeah, i know what it is
we're not removing the return value of nqp::bindpos from ASSIGN-POS 21:40
and it boxes the value into an Int with p6box_i
brrt aha
cool
how did you figure that out
timotimo i found that out earlier 21:44
well, it's just that there's still a p6box_i in there
brrt afk for today 21:45
timotimo oh, also, i think the profiler doesn't instrument things that have been inlined perhaps? 21:46
yeah, the MVM_IS_32BIT_INT macro won't say "yup, it's a small int all right" because it's about signed integers here 21:54
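The range argument can be checked with a small Python sketch. This is not the definition of `MVM_IS_32BIT_INT`; the helper names are hypothetical and only model "fits in a signed 32-bit integer" versus "fits in an unsigned 32-bit integer" for the value 4279312947 from the log.

```python
# Hypothetical helpers modelling the two ranges; not MoarVM's actual macro.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1
UINT32_MAX = 2**32 - 1

def fits_signed_32(value):
    """True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX

def fits_unsigned_32(value):
    """True if value is representable as an unsigned 32-bit integer."""
    return 0 <= value <= UINT32_MAX

value = 0xFF112233  # 4279312947, the number from the log

print(fits_unsigned_32(value))  # fits in 32 bits when treated as unsigned
print(fits_signed_32(value))    # exceeds INT32_MAX when treated as signed
```

So the value does fit in 32 bits, but only as an unsigned quantity, which is consistent with a signed small-int check rejecting it.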
that also means that the code would still benefit if freeing stuff were put into a separate thread, because of all the Int objects it's gotta free there 21:59
timotimo aha! 22:02
m: say "now i gots me { 220 / 10.8 } fps" 22:03
camelia rakudo-moar 620f4e: OUTPUT«now i gots me 20.370370 fps␤»
timotimo m: say "now i gots me { 511 / 13.28 } fps"
camelia rakudo-moar 620f4e: OUTPUT«now i gots me 38.478916 fps␤»
timotimo m: say "{ 4 * 1024 * 1024 / 20 } P6bigint fit into a nursery" 22:12
camelia rakudo-moar 620f4e: OUTPUT«209715.2 P6bigint fit into a nursery␤»
timotimo m: say "{ 4 * 1024 * 1024 / (20 * 320 * 240) } frames worth of p6bigint fit into a nursery" 22:13
camelia rakudo-moar 620f4e: OUTPUT«2.730667 frames worth of p6bigint fit into a nursery␤»
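The nursery arithmetic above, restated as a Python sketch. The figures (a 4 MiB nursery, 20 bytes per P6bigint, a 320x240 frame with one Int per pixel) are the ones used in the evaluated expressions; the constant names are hypothetical.

```python
# Figures taken from the expressions in the log; names are illustrative.
NURSERY_BYTES = 4 * 1024 * 1024   # nursery size used in the log
BIGINT_BYTES = 20                 # per-object size used in the log
WIDTH, HEIGHT = 320, 240          # frame dimensions used in the log

# How many boxed integers fit into one nursery.
objects_per_nursery = NURSERY_BYTES / BIGINT_BYTES

# How many full frames of per-pixel boxed integers fit into one nursery.
frames_per_nursery = NURSERY_BYTES / (BIGINT_BYTES * WIDTH * HEIGHT)

print(f"{objects_per_nursery:.1f} objects per nursery")
print(f"{frames_per_nursery:.6f} frames per nursery")
```

At under three frames per nursery, boxing one Int per pixel would fill the nursery roughly every third frame under these assumptions.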