github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018. |
|||
00:18
MasterDuke joined,
p6bannerbot sets mode: +v MasterDuke,
MasterDuke left,
MasterDuke joined,
herbert.freenode.net sets mode: +v MasterDuke,
p6bannerbot sets mode: +v MasterDuke
|
|||
timotimo | i've got a little program that does between 315 and 415 fps generating a 320x240 image from a very simple formula and blitting it to the screen | 01:04 | |
but when i replace nqp::bindpos_i with .ASSIGN-POS, it gets a whole lot slower | |||
that's 165 to 243 frames per second | 01:06 | ||
MasterDuke | what about with .BIND-POS? | 01:17 | |
yoleaux | 15 Nov 2018 10:18Z <AlexDaniel> MasterDuke: same can be said about rakudo releases :) But so far it wasn't bad, it's just that we need better ideas | ||
timotimo | that doesn't seem to exist? | 01:18 | |
MasterDuke | src/Perl6/Metamodel/BOOTSTRAP.nqp:1341: | ||
multis in Any, List, Array, etc | 01:19 | ||
timotimo | we might be able to get bindpos on a CArray to become faster, fwiw; it goes through an indirect function call to find how to assign to an int slot | ||
s: List, BIND-POS, \(1, 1) | 01:20 | ||
s: Array, "BIND-POS", \(1, 1) | 01:21 | ||
int-typed carrays don't have BIND-POS, because there are no containers to speak of | 01:23 | ||
is what i reckon | |||
01:24
SourceBaby left
|
|||
MasterDuke | ah, i know nothing about carrays | 01:24 | |
timotimo | its bind-pos is already literally just a bindpos_i with a decont | 01:28 | |
if we had loop-invariant code-motion, perhaps the guards for the type still being correct could be moved out of the hot loop | |||
which i think is much of the remaining overhead | |||
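The guard-hoisting idea timotimo describes can be sketched in plain C. This is only an illustration of loop-invariant code motion in general, not MoarVM or spesh code: the `IntBuffer` struct, the `SlotKind` enum and both function names are hypothetical stand-ins, and the early return stands in for whatever a real JIT would do on a failed guard.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT MoarVM's real structures. */
typedef enum { SLOT_INT32, SLOT_OTHER } SlotKind;

typedef struct {
    SlotKind kind;   /* the "type" that the guard checks */
    int32_t *data;   /* backing storage */
} IntBuffer;

/* Guard inside the hot loop: the kind is re-checked on every iteration.
 * Returns the number of elements written. */
size_t fill_guard_inside(IntBuffer *buf, size_t n) {
    size_t written = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf->kind != SLOT_INT32)
            break;                        /* guard failed, bail out */
        buf->data[i] = (int32_t)(i * 3);
        written++;
    }
    return written;
}

/* Guard hoisted: nothing in the loop body changes buf->kind, so the
 * check is loop-invariant and can run once before the loop. */
size_t fill_guard_hoisted(IntBuffer *buf, size_t n) {
    if (buf->kind != SLOT_INT32)
        return 0;                         /* guard failed, bail out */
    for (size_t i = 0; i < n; i++)
        buf->data[i] = (int32_t)(i * 3);
    return n;
}
```

Both functions write the same values when the guard holds; the hoisted version just pays for the check once instead of once per element.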
oh, another thing is that the code we get with ASSIGN-POS has the decont of the scalar, but doesn't know what type we're expecting | 01:36 | ||
with bindpos_i directly in there, we not only get zero guards and such in between calculating the result and binding it into the carray, we also devirtualize so we directly jump into CArray's bind_pos function | 01:38 | ||
that still has to 1) check what type of slot we have and 2) virtually call P6int's set_int function | |||
if the jit devirtualizes that even further, the gap would widen even more | 01:40 | ||
would* | |||
there is not yet code to do that, of course :) | 01:41 | ||
MasterDuke | sounds like a quick little project | ||
timotimo | are you interested? :) | 01:42 | |
MasterDuke | heh, if that hadn't been a sarcastic "quick little" i probably would be | 01:44 | |
02:37
evalable6 left
02:39
evalable6 joined,
p6bannerbot sets mode: +v evalable6
|
|||
timotimo | oh | 02:49 | |
i don't think it'd be all that hard | 02:50 | ||
my plan to implement it would be to add some functions to CArray.c, and a few to P6int.c that all have any access to the reprdata thrown out | 02:51 | ||
so for every value that could be in the reprdata fields there'd be an individual function | |||
and then probably a select_blah_for_jit function that takes a reprdata or stable or something and returns the appropriate function | |||
i've recently had a commit that does something just like that, i think for nativeref | 02:52 | ||
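A minimal sketch of the plan above, assuming the only thing read from the reprdata is the slot width in bits. None of this is actual CArray.c or P6int.c code: the `set_int_fn` signature, the `set_int_*` functions and the `select_set_int_for_jit` name (modelled on the "select_blah_for_jit" placeholder) are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature: store an integer into a raw slot.
 * No reprdata argument, because each variant hard-codes its width. */
typedef void (*set_int_fn)(void *slot, int64_t value);

/* One specialised function per possible width value. */
static void set_int_8(void *slot, int64_t value)  { *(int8_t  *)slot = (int8_t)value; }
static void set_int_16(void *slot, int64_t value) { *(int16_t *)slot = (int16_t)value; }
static void set_int_32(void *slot, int64_t value) { *(int32_t *)slot = (int32_t)value; }
static void set_int_64(void *slot, int64_t value) { *(int64_t *)slot = value; }

/* Hypothetical selector: called once while compiling, so the emitted
 * code can call the chosen function directly with no width check. */
set_int_fn select_set_int_for_jit(unsigned bits) {
    switch (bits) {
        case 8:  return set_int_8;
        case 16: return set_int_16;
        case 32: return set_int_32;
        case 64: return set_int_64;
        default: return NULL;   /* unknown width: caller keeps the generic path */
    }
}
```

The width dispatch moves from every store to a single lookup at compile time; returning NULL for an unexpected width leaves the caller free to fall back to the existing virtual call.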
there's the added inefficiency in my approach that SDL has to copy and if i'm not mistaken also pixelformat-convert my data for every frame, which is a piece of overhead i actually see | 03:04 | ||
07:38
domidumont joined
07:39
p6bannerbot sets mode: +v domidumont
07:42
lizmat left
08:48
domidumont left
10:06
robertle left
10:09
avar left,
avar joined,
avar left,
avar joined,
p6bannerbot sets mode: +v avar
10:10
p6bannerbot sets mode: +v avar,
domidumont joined
10:11
p6bannerbot sets mode: +v domidumont
10:29
dalek joined,
p6lert left,
synopsebot_ joined,
Geth joined,
p6lert joined,
synopsebot left,
p6bannerbot sets mode: +v dalek
10:30
p6bannerbot sets mode: +v synopsebot_,
p6bannerbot sets mode: +v Geth,
p6bannerbot sets mode: +v p6lert
11:39
lizmat joined,
p6bannerbot sets mode: +v lizmat
13:26
MasterDuke left
14:06
dogbert2_ left
14:09
patrickb joined
14:10
p6bannerbot sets mode: +v patrickb
14:23
leont joined
14:24
p6bannerbot sets mode: +v leont
14:51
zakharyas joined
14:52
p6bannerbot sets mode: +v zakharyas
15:02
lizmat left
15:06
Kaiepi left,
zakharyas left
15:07
Kaiepi joined,
zakharyas joined,
p6bannerbot sets mode: +v Kaiepi
15:08
p6bannerbot sets mode: +v zakharyas
15:21
patrickz joined
15:22
p6bannerbot sets mode: +v patrickz
15:25
patrickb left
15:30
zakharyas left
15:31
zakharyas joined
15:32
p6bannerbot sets mode: +v zakharyas
16:06
colomon joined
16:07
p6bannerbot sets mode: +v colomon,
zakharyas left
16:18
colomon left
16:52
zakharyas joined
16:53
p6bannerbot sets mode: +v zakharyas
17:05
zakharyas left,
zakharyas joined
17:06
p6bannerbot sets mode: +v zakharyas
17:08
zakharyas left
17:09
zakharyas joined
17:10
p6bannerbot sets mode: +v zakharyas
17:28
lizmat joined,
p6bannerbot sets mode: +v lizmat
18:04
zakharyas left
|
|||
AlexDaniel | M#996 is still not resolved and it'd be great to have some eyes on it | 18:10 | |
synopsebot_ | M#996 [open]: github.com/MoarVM/MoarVM/issues/996 [⚠ blocker ⚠] SEGV when running example code from R#1951 | ||
19:57
patrickz left
20:04
domidumont left
20:14
brrt joined
20:15
p6bannerbot sets mode: +v brrt
21:18
brrt left,
Kaiepi left,
brrt joined
21:19
p6bannerbot sets mode: +v brrt
|
|||
brrt | \o | 21:25 | |
yoleaux | 17 Nov 2018 18:04Z <AlexDaniel> brrt: bisected: colabti.org/irclogger/irclogger_lo...11-17#l196 | ||
brrt | I actually think these are not the root cause | ||
What I find is, if I change the trunc_i16 to be a signed-cast from 2 to 8 bytes, then everything works as I'd want | 21:26 | ||
the current way of doing it is to unsigned-truncate them. I actually think that makes sense on a local level, but I think the problem is that it is coupled with a 16 bit store as well | 21:27 | |
and that may be the bit that is wrong | |||
... yeah, I think that's the crux of the issue | 21:28 | ||
the expr JIT tries to do the right thing with the small variable | |||
and that is wrong in this case | 21:29 | ||
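The difference between the two treatments of a 16-bit value that brrt mentions can be shown in a few lines of C. This only illustrates signed versus unsigned widening from 2 to 8 bytes in general; it is not the expr JIT's code, and the function names are made up for the example.

```c
#include <stdint.h>

/* Unsigned truncate: keep the low 16 bits and zero-extend to 64 bits. */
int64_t widen_unsigned_16(int64_t value) {
    return (int64_t)(uint16_t)value;
}

/* Signed cast from 2 to 8 bytes: keep the low 16 bits and sign-extend.
 * Note: for inputs outside the int16_t range the narrowing conversion
 * is implementation-defined in C. */
int64_t widen_signed_16(int64_t value) {
    return (int64_t)(int16_t)value;
}
```

For non-negative values that fit in 16 bits the two agree. For a negative value such as -2 they differ: the unsigned variant gives 65534, while the signed variant gives -2 back.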
21:38
Kaiepi joined
21:39
p6bannerbot sets mode: +v Kaiepi
21:40
brrt left
|
|||
timotimo | it'd be kind of cool if CStruct had the "managed?" bit in the reprdata instead of the individual instances ... | 22:38 | |
then the jit could build code that's a little bit tighter | |||
jnthn | Ain't it a property of an individual instance rather than the type? | 22:42 | |
timotimo | depending on how we obtain it, it'd be a different one | 22:43 | |
but, true, we'd want a managed CArray[int] to match an unmanaged CArray[int], too | |||
hadn't thought of that bit | |||
one thing that's nice about CArray is that it doesn't do range checks | 22:44 | ||
you might say "that's dangerous!", but all i care about is speed, lol | |||
except when it's managed, then range checks are a thing | |||
nine | The managed bit is a static property of the code generating the CArray | 22:49 | |
timotimo | so if we're still in the same frame as the one that generated the CArray, we can tell that it won't switch its managed bit and we can omit the check there :) | 22:51 |