github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
timotimo i've got a little program that does between 315 and 415 fps generating a 320x240 image from a very simple formula and blitting it to the screen 01:04
but when i replace nqp::bindpos_i with .ASSIGN-POS, it gets a whole lot slower
that's 165 to 243 frames per second 01:06
MasterDuke what about with .BIND-POS? 01:17
yoleaux 15 Nov 2018 10:18Z <AlexDaniel> MasterDuke: same can be said about rakudo releases :) But so far it wasn't bad, it's just that we need better ideas
timotimo that doesn't seem to exist? 01:18
MasterDuke src/Perl6/Metamodel/BOOTSTRAP.nqp:1341:
multis in Any, List, Array, etc 01:19
timotimo we might be able to get bindpos on a CArray to become faster, fwiw; it goes through an indirect function call to find how to assign to an int slot
s: List, BIND-POS, \(1, 1) 01:20
s: Array, "BIND-POS", \(1, 1) 01:21
int-typed carrays don't have BIND-POS, because there are no containers to speak of 01:23
is what i reckon
MasterDuke ah, i know nothing about carrays 01:24
timotimo its bind-pos is already literally just a bindpos_i with a decont 01:28
if we had loop-invariant code-motion, perhaps the guards for the type still being correct could be moved out of the hot loop
which i think is much of the remaining overhead
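The guard-hoisting idea timotimo describes can be sketched in plain C. This is only an illustration of loop-invariant code motion applied to a type check; `ObjSketch`, `type_id`, and both functions are hypothetical names, not MoarVM structures or anything spesh emits. The hoisted form is valid only if nothing in the loop body can change the type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object layout, for illustration only. */
typedef struct {
    int      type_id;  /* stands in for the type a guard would check */
    int32_t *slots;    /* native int storage */
} ObjSketch;

/* Guard inside the hot loop: re-checked on every iteration.
 * Returns -1 on guard failure (a stand-in for deopt), 0 on success. */
static int fill_guard_in_loop(ObjSketch *o, size_t n, int expected_type) {
    for (size_t i = 0; i < n; i++) {
        if (o->type_id != expected_type)
            return -1;
        o->slots[i] = (int32_t)(i * 2);
    }
    return 0;
}

/* Guard hoisted out of the loop: checked once, because the loop body
 * never modifies type_id. */
static int fill_guard_hoisted(ObjSketch *o, size_t n, int expected_type) {
    if (o->type_id != expected_type)
        return -1;
    for (size_t i = 0; i < n; i++)
        o->slots[i] = (int32_t)(i * 2);
    return 0;
}
```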
oh, another thing is that the code we get with ASSIGN-POS has the decont of the scalar, but doesn't know what type we're expecting 01:36
with bindpos_i directly in there, we not only get zero guards and such in between calculating the result and binding it into the carray, we also devirtualize so we directly jump into CArray's bind_pos function 01:38
that still has to 1) check what type of slot we have and 2) virtually call P6int's set_int function
if the jit devirtualizes that even further, the gap would widen even more 01:40
would*
there is not yet code to do that, of course :) 01:41
MasterDuke sounds like a quick little project
timotimo are you interested? :) 01:42
MasterDuke heh, if that hadn't been a sarcastic "quick little" i probably would be 01:44
timotimo oh 02:49
i don't think it'd be all that hard 02:50
my plan to implement it would be to add some functions to CArray.c, and a few to P6int.c that all have any access to the reprdata thrown out 02:51
so for every value that could be in the reprdata fields there'd be an individual function
and then probably a select_blah_for_jit function that takes a reprdata or stable or something and returns the appropriate function
i've recently had a commit that does something just like that, i think for nativeref 02:52
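A rough sketch of the plan above, in C: one specialized bind function per possible repr data value, plus a selector the JIT could call once at compile time. Every name here (`ReprDataSketch`, `ElemKind`, `bind_i8` and friends, `select_bind_for_jit`) is hypothetical and does not come from CArray.c or P6int.c; the real repr data carries more than an element kind.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element kinds; not MoarVM's real representation. */
typedef enum { ELEM_I8, ELEM_I16, ELEM_I32, ELEM_I64 } ElemKind;

/* Hypothetical stand-in for the per-type repr data. */
typedef struct { ElemKind kind; } ReprDataSketch;

typedef void (*bind_int_fn)(void *storage, size_t index, int64_t value);

/* Specialized functions: no repr data access at all. */
static void bind_i8(void *s, size_t i, int64_t v)  { ((int8_t  *)s)[i] = (int8_t)v; }
static void bind_i16(void *s, size_t i, int64_t v) { ((int16_t *)s)[i] = (int16_t)v; }
static void bind_i32(void *s, size_t i, int64_t v) { ((int32_t *)s)[i] = (int32_t)v; }
static void bind_i64(void *s, size_t i, int64_t v) { ((int64_t *)s)[i] = v; }

/* Generic path: inspects the repr data on every single call. */
static void bind_generic(const ReprDataSketch *rd, void *s, size_t i, int64_t v) {
    switch (rd->kind) {
        case ELEM_I8:  bind_i8(s, i, v);  break;
        case ELEM_I16: bind_i16(s, i, v); break;
        case ELEM_I32: bind_i32(s, i, v); break;
        case ELEM_I64: bind_i64(s, i, v); break;
    }
}

/* Selector: inspects the repr data once and returns the function the
 * JIT could then call directly. Returns NULL for an unknown kind. */
static bind_int_fn select_bind_for_jit(const ReprDataSketch *rd) {
    switch (rd->kind) {
        case ELEM_I8:  return bind_i8;
        case ELEM_I16: return bind_i16;
        case ELEM_I32: return bind_i32;
        case ELEM_I64: return bind_i64;
    }
    return NULL;
}
```

The selector pays for the switch once at JIT-compile time, so the emitted code contains a direct call with no per-access dispatch.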
there's the added inefficiency in my approach that SDL has to copy and if i'm not mistaken also pixelformat-convert my data for every frame, which is a piece of overhead i actually see 03:04
AlexDaniel M#996 is still not resolved and it'd be great to have some eyes on it 18:10
synopsebot_ M#996 [open]: github.com/MoarVM/MoarVM/issues/996 [⚠ blocker ⚠] SEGV when running example code from R#1951
brrt \o 21:25
yoleaux 17 Nov 2018 18:04Z <AlexDaniel> brrt: bisected: colabti.org/irclogger/irclogger_lo...11-17#l196
brrt I actually think these are not the root cause
What I find is, if I change the trunc_i16 to be a signed-cast from 2 to 8 bytes, then everything works as I'd want 21:26
the current way of doing it is to unsigned-truncate them. I actually think that makes sense on a local level, but I think the problem is that it is coupled with a 16 bit store as well 21:27
and that may be the bit that is wrong
... yeah, I think that's the crux of the issue 21:28
the expr JIT tries to do the right thing with the small variable
and that is wrong in this case 21:29
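The difference between the two widening strategies brrt mentions can be shown in a few lines of C. This only illustrates unsigned truncation versus a signed cast from 2 to 8 bytes in general; it is not the expr JIT's code and not a diagnosis of M#996, and both function names are made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Keep the low 16 bits and widen them as an unsigned quantity
 * (zero extension). Negative inputs come back as large positives. */
static int64_t widen16_zero_extended(int64_t value) {
    uint16_t low = (uint16_t)value;   /* well-defined: modulo 2^16 */
    return (int64_t)low;
}

/* Keep the low 16 bits and widen them as a signed quantity
 * (sign extension), i.e. a signed cast from 2 to 8 bytes. */
static int64_t widen16_sign_extended(int64_t value) {
    uint16_t low = (uint16_t)value;
    int16_t  as_signed;
    memcpy(&as_signed, &low, sizeof as_signed);  /* reinterpret the bits */
    return (int64_t)as_signed;
}
```

For a value such as -1 the two disagree: zero extension gives 65535, sign extension gives -1. For values that fit in 15 bits they agree.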
timotimo it'd be kind of cool if CStruct had the "managed?" bit in the reprdata instead of the individual instances ... 22:38
then the jit could build code that's a little bit tighter
jnthn Ain't it a property of an individual instance rather than the type? 22:42
timotimo depending on how we obtain it, it'd be a different one 22:43
but, true, we'd want a managed CArray[int] to match an unmanaged CArray[int], too
hadn't thought of that bit
one thing that's nice about CArray is that it doesn't do range checks 22:44
you might say "that's dangerous!", but all i care about is speed, lol
except when it's managed, then range checks are a thing
nine The managed bit is a static property of the code generating the CArray 22:49
timotimo so if we're still in the same frame as the one that generated the CArray, we can tell that it won't switch its managed bit and we can omit the check there :) 22:51