01:03 guifa left 01:09 guifa joined 08:29 librasteve_ joined 09:37 lizmat_ left, lizmat joined
Geth rakudo/andthen-orelse: 67ec4989da | ab5tract++ | src/core.c/Promise.rakumod
Use .then instead of starting synchronous Awaitables
11:14
rakudo/andthen-orelse: 7516cb67d1 | ab5tract++ | src/core.c/Promise.rakumod
Fix the rest of the synchronous stuff
11:31
13:36 MasterDuke joined
MasterDuke m: my int $a; for ^10_000_000 -> int $i { $a += $i }; say now - INIT now; say $a 13:36
camelia 0.059958812
49999995000000
MasterDuke m: my uint $a; for ^10_000_000 -> uint $i { $a += $i }; say now - INIT now; say $a
camelia 2.379863236
49999995000000
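(Editor's sketch, not part of the log.) The two `m:` snippets above can be combined into one script so both timings come from the same process. This is a minimal sketch: the variable names are made up for illustration, and the absolute timings will differ from camelia's.

```raku
# Same summing loop as the two snippets above, timed separately
# so the int and uint costs can be compared in a single run.
my int $int-sum;
my $int-start = now;
for ^10_000_000 -> int $i { $int-sum += $i }
my $int-time = (now - $int-start).Num;

my uint $uint-sum;
my $uint-start = now;
for ^10_000_000 -> uint $i { $uint-sum += $i }
my $uint-time = (now - $uint-start).Num;

say "int:  {$int-time.fmt('%.3f')}s  sum=$int-sum";
say "uint: {$uint-time.fmt('%.3f')}s  sum=$uint-sum";
say "uint/int: {($uint-time / $int-time).fmt('%.1f')}x";
```

Both loops should report the same sum; only the elapsed time is expected to differ.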
MasterDuke and even looking at the `--target=optimize` output of something as simple as `my int $a = 0; my int $b = rand.Int; my int $c = $a + $b; say $c` vs the same with uints shows the problem 13:37
it's in `optimize_p6typecheckrv`, github.com/rakudo/rakudo/blob/main...3091-L3103 13:38
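(Editor's sketch, not part of the log.) The minimal pair MasterDuke describes could be written out as below, assuming each block is saved to its own file and inspected with `raku --target=optimize <file>`. The blocks are only there so both variants fit in one listing; the log does not show the optimizer output itself, so none is reproduced here.

```raku
# Signed variant, as quoted in the log.
{
    my int $a = 0;
    my int $b = rand.Int;
    my int $c = $a + $b;
    say $c;
}

# Unsigned variant: identical except for the native type.
{
    my uint $a = 0;
    my uint $b = rand.Int;
    my uint $c = $a + $b;
    say $c;
}
```

Comparing the two dumps side by side is what shows the remaining `p6typecheckrv` in the uint case, according to the discussion that follows.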
lizmat is that still worthwhile fixing? or would that be easy (didn't look at the code in question to not be distracted too much) 13:40
MasterDuke and it's because the addition is done with an nqp::add_i, but its return type doesn't match 13:41
so removing the p6typecheckrv doesn't happen 13:42
lizmat so it's a case of incorrect codegenning for uint += uint ?
MasterDuke well, there is no add_u 13:43
lizmat ah
MasterDuke o
lizmat $rettype_ps |= 1 if $rettype_ps = 10 ?? 13:44
$rettype_ps := 1 if $rettype_ps = 10 ??
MasterDuke i'd say worth fixing, but i'm not sure exactly where/how. i can't tell if rakuast would help (it currently doesn't because both are equally many times slower) 13:45
and i'm not sure if ^^^ would always be safe
or if we really do need to create an add_u (and probably sub_u, and maybe others), even if it just gets converted to an add_i lower down in the stack 13:47
lizmat nine might have an idea about that 13:50
MasterDuke interesting, if i allow the optimization when either side is 1 or 10, the `--target=optimize` output now is essentially identical, but runtime performance is unchanged 13:59
makes me think spesh is also involved
lizmat are you sure you're looking at the right thing? 14:00
MasterDuke dunno. i think so 14:01
int version: `In total, 21457 call frames were entered and exited by the profiled code. Inlining eliminated the need to create 9985097 call frames (that's 99.79%)` 14:05
uint version: `In total, 10006554 call frames were entered and exited by the profiled code. Inlining eliminated the need to create 0 call frames (that's 0%).`
lizmat ah, yes, :-) 14:07
MasterDuke from a spesh log, in the "after" of the mainline: `param_rp_u        r7(1), liti16(0)  # [000] bailed argument spesh: expected arg flag 0 to be uint or box a uint; type at position was null type tuple` 14:22
lizmat tries to invoke the Timo
14:29 MasterDuke left 14:44 MasterDuke joined
MasterDuke type_tuple is null when MVM_spesh_args is called, so the call to can_prim_spec at github.com/MoarVM/MoarVM/blob/main/...rgs.c#L489 returns 0 14:46
but i don't know how/why the type_tuple would be null 14:50
lizmat your guess is much better than mine 14:51
MasterDuke the MVMSpeshPlanned that holds it has populated cs_stats, type_stats, etc 14:52
there is a difference in the body before spesh optimizes, the uint version has two coerce_ui before the add_i and a coerce_iu after 15:10
they do get removed in the optimized version, but the param_rp_u doesn't (compared to the int version where the param_rp_i is removed) 15:13
15:44 MasterDuke left 16:17 guifa left 16:54 guifa joined 21:31 rakkable left, rakkable joined 22:58 librasteve_ left