🦋 Welcome to the IRC channel of the core developers of the Raku Programming Language (raku.org #rakulang). This channel is logged for the purpose of history keeping about its development | evalbot usage: 'm: say 3;' or /msg camelia m: ... | Logs available at irclogs.raku.org/raku-dev/live.html | For MoarVM see #moarvm
Set by lizmat on 8 June 2022.
00:00 kjp left, kjp joined
06:09 japhb left, japhb joined
07:33 [TuxCM] joined
08:08 [TuxCM] left
09:21 finanalyst joined
10:27 finanalyst left
12:19 [TuxCM] joined
13:30 [TuxCM] left
14:19 [TuxCM] joined
15:46 [TuxCM] left
15:56 finanalyst joined
15:59 [TuxCM] joined
16:08 [TuxCM] left
17:08 sjm_ joined
17:54 finanalyst left
21:09 MasterDuke joined
MasterDuke i'm planning to (finally) get Text::Diff::Sift4 into the fez/zef ecosystem. as part of that move, i thought i'd try to un-nqpify it, hoping that rakudo has gotten fast enough that it's not necessary anymore 21:10
i've gotten it down to only 3x slower (with nqp ops being used in only one specific way) 21:11
but i'm not seeing many more things to try to get that last bit 21:12
here are the two versions if anybody is interested: gist.github.com/MasterDuke17/d8403...0dcdd89b56 21:15
ugexe maybe predeclare my Bool $isTrans 21:16
MasterDuke i did try that, seems to make a small-but-noticeable negative difference 21:18
yep, my rough benchmark went from ~1.46s to ~1.8s 21:20
21:22 finanalyst joined
MasterDuke i just re-measured the nqp version to make sure i was using the same benchmark, and turns out it's only 2x faster (i.e., ~0.75s for the more nqp version, ~1.46s for the less nqp version) 21:22
a profile just says all the time is spent in the main `while` loop, nothing else really stands out 21:23
i was originally using a hash for offsets in the raku version, but switching to a class was a decent improvement 21:25
ugexe is it under the inline limit? potentially you could split it up more to do so 21:28
also that's a bit unintuitive that predeclaring slows it down, but not impossible 21:29
MasterDuke yeah, wasn't expecting that at all 21:30
i haven't looked at a spesh log in a while, i'll take a look
but afk for a bit
ugexe you could technically calculate one of the ords once, the nqp::ordat($s2, $c2) 21:43
instead of twice
21:45 MasterDuke left
ugexe $s1.chars / $s2.chars would also be faster although obviously not important in the overall performance of the function 21:49
($x + $y).abs will be slightly faster than abs($x + $y) 21:54
i suspect it might be faster to do `my Bool $isTrans;` instead of `my Bool $isTrans = False;` 22:07
more algorithmically i wonder if the splice could be moved outside the inner loop so it isn't called as much 22:11
22:36 finanalyst left
22:51 MasterDuke joined
MasterDuke ugexe: the method form of a bunch of those things (e.g., `abs`, `chars`) is slower. i believe it's because the sub form accepts natives, but the method form has to box the native it's called upon 22:53
m: my int $a; for ^10_000_000 -> int $i { $a = abs($i + $a) }; say now - INIT now; say $a;
camelia 0.066115068
49999995000000
MasterDuke m: my int $a; for ^10_000_000 -> int $i { $a = ($i + $a).abs }; say now - INIT now; say $a; 22:54
camelia 2.012868178
49999995000000
ugexe ah 22:57
MasterDuke with some of the recommendations i'm now at ~1.36s 23:09
actually closer to ~1.30s 23:10
ugexe: and i'm not sure it's safe to cache `nqp::ordat($s2, $c2)`, since i believe `$c2` can change between the first and second call 23:14
timo in theory, we can eliminate the boxing in spesh, but i'm not up to date on the exact circumstances that may prevent that
MasterDuke timo: well, even if nothing prevents it, it wouldn't happen until spesh decided to optimize the call, right? 23:15
afk for a bit again 23:21
timo yes, true. the call also has to be inlined, or it won't work 23:30
unless the static optimizer happens to already have done it? do we actually do that?
23:40 MasterDuke left