github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
00:18 MasterDuke joined, p6bannerbot sets mode: +v MasterDuke, MasterDuke left, MasterDuke joined, herbert.freenode.net sets mode: +v MasterDuke, p6bannerbot sets mode: +v MasterDuke
timotimo i've got a little program that does between 315 and 415 fps generating a 320x240 image from a very simple formula and blitting it to the screen 01:04
but when i replace nqp::bindpos_i with .ASSIGN-POS, it gets a whole lot slower
that's 165 to 243 frames per second 01:06
MasterDuke what about with .BIND-POS? 01:17
yoleaux 15 Nov 2018 10:18Z <AlexDaniel> MasterDuke: same can be said about rakudo releases :) But so far it wasn't bad, it's just that we need better ideas
timotimo that doesn't seem to exist? 01:18
MasterDuke src/Perl6/Metamodel/BOOTSTRAP.nqp:1341:
multis in Any, List, Array, etc 01:19
timotimo we might be able to get bindpos on a CArray to become faster, fwiw; it goes through an indirect function call to find how to assign to an int slot
s: List, BIND-POS, \(1, 1) 01:20
s: Array, "BIND-POS", \(1, 1) 01:21
int-typed carrays don't have BIND-POS, because there are no containers to speak of 01:23
is what i reckon
01:24 SourceBaby left
MasterDuke ah, i know nothing about carrays 01:24
timotimo its bind-pos is already literally just a bindpos_i with a decont 01:28
if we had loop-invariant code-motion, perhaps the guards for the type still being correct could be moved out of the hot loop
which i think is much of the remaining overhead
oh, another thing is that the code we get with ASSIGN-POS has the decont of the scalar, but doesn't know what type we're expecting 01:36
with bindpos_i directly in there, we not only get zero guards and such in between calculating the result and binding it into the carray, we also devirtualize so we directly jump into CArray's bind_pos function 01:38
that still has to 1) check whiat type of slot we have and 2) virtually call P6int's set_int function
if the jit devirtualizes that even further, the gap would widen even more 01:40
would*
there is not yet code to do that, of course :) 01:41
MasterDuke sounds like a quick little project
timotimo are you interested? :) 01:42
MasterDuke heh, if that hadn't been a sarcastic "quick little" i probably would be 01:44
02:37 evalable6 left 02:39 evalable6 joined, p6bannerbot sets mode: +v evalable6
timotimo oh 02:49
i don't think it'd be all that hard 02:50
my plan to implement it would be to add some functions to CArray.c, and a few to P6int.c that all have any access to the reprdata thrown out 02:51
so for every value that could be in the reprdata fields there'd be an individual function
and then probably a select_blah_for_jit function that takes a reprdata or stable or something and returns the appropriate function
i've recently had a commit that does something just like that, i think for nativeref 02:52
there's the added inefficiency in my approach that SDL has to copy and if i'm not mistaken also pixelformat-convert my data for every frame, which is a piece of overhead i actually see 03:04
07:38 domidumont joined 07:39 p6bannerbot sets mode: +v domidumont 07:42 lizmat left 08:48 domidumont left 10:06 robertle left 10:09 avar left, avar joined, avar left, avar joined, p6bannerbot sets mode: +v avar 10:10 p6bannerbot sets mode: +v avar, domidumont joined 10:11 p6bannerbot sets mode: +v domidumont 10:29 dalek joined, p6lert left, synopsebot_ joined, Geth joined, p6lert joined, synopsebot left, p6bannerbot sets mode: +v dalek 10:30 p6bannerbot sets mode: +v synopsebot_, p6bannerbot sets mode: +v Geth, p6bannerbot sets mode: +v p6lert 11:39 lizmat joined, p6bannerbot sets mode: +v lizmat 13:26 MasterDuke left 14:06 dogbert2_ left 14:09 patrickb joined 14:10 p6bannerbot sets mode: +v patrickb 14:23 leont joined 14:24 p6bannerbot sets mode: +v leont 14:51 zakharyas joined 14:52 p6bannerbot sets mode: +v zakharyas 15:02 lizmat left 15:06 Kaiepi left, zakharyas left 15:07 Kaiepi joined, zakharyas joined, p6bannerbot sets mode: +v Kaiepi 15:08 p6bannerbot sets mode: +v zakharyas 15:21 patrickz joined 15:22 p6bannerbot sets mode: +v patrickz 15:25 patrickb left 15:30 zakharyas left 15:31 zakharyas joined 15:32 p6bannerbot sets mode: +v zakharyas 16:06 colomon joined 16:07 p6bannerbot sets mode: +v colomon, zakharyas left 16:18 colomon left 16:52 zakharyas joined 16:53 p6bannerbot sets mode: +v zakharyas 17:05 zakharyas left, zakharyas joined 17:06 p6bannerbot sets mode: +v zakharyas 17:08 zakharyas left 17:09 zakharyas joined 17:10 p6bannerbot sets mode: +v zakharyas 17:28 lizmat joined, p6bannerbot sets mode: +v lizmat 18:04 zakharyas left
AlexDaniel M#996 is still not resolved and it'd be great to have some eyes on it 18:10
synopsebot_ M#996 [open]: github.com/MoarVM/MoarVM/issues/996 [⚠ blocker ⚠] SEGV when running example code from R#1951
19:57 patrickz left 20:04 domidumont left 20:14 brrt joined 20:15 p6bannerbot sets mode: +v brrt 21:18 brrt left, Kaiepi left, brrt joined 21:19 p6bannerbot sets mode: +v brrt
brrt \o 21:25
yoleaux 17 Nov 2018 18:04Z <AlexDaniel> brrt: bisected: colabti.org/irclogger/irclogger_lo...11-17#l196
brrt I actually think these are not the root cause
What I find is, if I change the trunc_i16 to be a signed-cast from 2 to 8 bytes, then everything works as I'd want 21:26
the current way of doing it is to unsigned-truncate them. I actually think that makes sense on a local level, but I think the problem is that it is copuled wiht a 16 bit store as well 21:27
and that may be the bit that is wrong
... yeah, I think that's the crux of the issue 21:28
the expr JIT tries to do the right thing with the small variable
and that is wrong in this case 21:29
21:38 Kaiepi joined 21:39 p6bannerbot sets mode: +v Kaiepi 21:40 brrt left
timotimo it'd be kind of cool if CStruct had the "managed?" bit in the reprdata instead of the individual instances ... 22:38
then the jit could build code that's a little bit tighter
jnthn Ain't it a property of an individual instance rather than the type? 22:42
timotimo depending on how we obtain it, it'd be a different one 22:43
but, true, we'd want a managed CArray[int] to match an unmanaged CArray[int], too
hadn't thought of that bit
one thing that's nice about CArray is that it doesn't do range checks 22:44
you might say "that's dangerous!", but all i care about is speed, lol
except when it's managed, then range checks are a thing
nine The managed bit is a static property of the code generating the CArray 22:49
timotimo so if we're still in the same frame as the one that generated the CArray, we can tell that it won't switch its managed bit and we can omit the check there :) 22:51