01:03
guifa left
01:09
guifa joined
08:29
librasteve_ joined
09:37
lizmat_ left,
lizmat joined
|
|||
Geth | rakudo/andthen-orelse: 67ec4989da | ab5tract++ | src/core.c/Promise.rakumod Use .then instead of starting synchronous Awaitables |
11:14 | |
rakudo/andthen-orelse: 7516cb67d1 | ab5tract++ | src/core.c/Promise.rakumod Fix the rest of the synchronous stuff |
11:31 | ||
13:36
MasterDuke joined
|
|||
MasterDuke | m: my int $a; for ^10_000_000 -> int $i { $a += $i }; say now - INIT now; say $a | 13:36 | |
camelia | 0.059958812 49999995000000 |
||
MasterDuke | m: my uint $a; for ^10_000_000 -> uint $i { $a += $i }; say now - INIT now; say $a | ||
camelia | 2.379863236 49999995000000 |
||
MasterDuke | and even looking at the `--target=optimize` output of something as simple as `my int $a = 0; my int $b = rand.Int; my int $c = $a + $b; say $c` vs the same with uints shows the problem | 13:37 | |
it's in `optimize_p6typecheckrv`, github.com/rakudo/rakudo/blob/main...3091-L3103 | 13:38 | ||
lizmat | is that still worthwhile fixing? or would that be easy (didn't look at the code in question to not be distracted too much) | 13:40 | |
MasterDuke | and it's because the addition is done with an nqp::add_i, but its return type doesn't match | 13:41 | |
so removing the p6typecheckrv doesn't happen | 13:42 | ||
lizmat | so it's a case of incorrect codegenning for uint += uint ? | ||
MasterDuke | well, there is no add_u | 13:43 | |
lizmat | ah | ||
MasterDuke | o | ||
lizmat | $rettype_ps |= 1 if $retttype_ps = 10 ?? | 13:44 | |
$rettype_ps := 1 if $retttype_ps = 10 ?? | |||
MasterDuke | i'd say worth fixing, but i'm not sure exactly where/how. i can't tell if rakuast would help (it currently doesn't because both are equally many times slower) | 13:45 | |
and i'm not sure if ^^^ would always be safe | |||
or if we really do need to create an add_u (and probably sub_u, and maybe others), even if it just gets converted to an add_i lower down in the stack | 13:47 | ||
lizmat | nine might have an idea about that | 13:50 | |
MasterDuke | interesting, if i allow the optimization when either side is 1 or 10, the `--target=optimize` output now is essentially identical, but runtime performance is unchanged | 13:59 | |
makes me think spesh is also involved | |||
lizmat | are you sure you're looking at the right thing? | 14:00 | |
MasterDuke | dunno. i think so | 14:01 | |
int version: `In total, 21457 call frames were entered and exited by the profiled code. Inlining eliminated the need to create 9985097 call frames (that's 99.79%)` | 14:05 | ||
uint version: `In total, 10006554 call frames were entered and exited by the profiled code. Inlining eliminated the need to create 0 call frames (that's 0%).` | |||
lizmat | ah, yes, :-) | 14:07 | |
MasterDuke | from a spesh log, in the "after" of the mainline: `param_rp_u r7(1), liti16(0) # [000] bailed argument spesh: expected arg flag 0 to be uint or box a uint; type at position was null type tuple` | 14:22 | |
lizmat tries to invoke the Timo | |||
14:29
MasterDuke left
14:44
MasterDuke joined
|
|||
MasterDuke | type_tuple is null when MVM_spesh_args is called, so the call to can_prim_spec at github.com/MoarVM/MoarVM/blob/main/...rgs.c#L489 returns 0 | 14:46 | |
but i don't know how/why the tyle_tuple would be null | 14:50 | ||
lizmat | your guess is much better than mine | 14:51 | |
MasterDuke | the MVMSpeshPlanned that holds it has populated cs_stats, type_stats, etc | 14:52 | |
there is a difference in the body before spesh optimizes, the uint version has two coerce_ui before the add_i and a coerce_iu after | 15:10 | ||
they do get removed in the optimized version, but the param_rp_u doesn't (compared to the int version where the param_rp_i is removed) | 15:13 | ||
15:44
MasterDuke left
16:17
guifa left
16:54
guifa joined
21:31
rakkable left,
rakkable joined
22:58
librasteve_ left
|