01:05
vendethiel joined
01:29
vendethiel joined
01:52
vendethiel joined
02:32
btyler joined, dalek joined, hoelzro joined, oetiker joined, lue joined
02:41
lue joined
03:29
vendethiel joined
04:28
JimmyZ joined
05:04
kjs_ joined
05:18
vendethiel joined
06:04
vendethiel joined
06:40
vendethiel joined
11:10
oetiker joined, hoelzro joined, dalek joined, btyler joined
11:13
vendethiel joined
11:29
zakharyas joined
11:37
zakharyas1 joined
dalek | MoarVM: 898fe2a | (Timo Paulssen)++ | src/spesh/optimize.c: integrate set with previous instruction if possible. Cases where an operation sets an intermediate register that only gets read by a set immediately following that operation occur frequently enough that this should be worth the tiny bit of analysis work.
11:40
MoarVM: 63c5d88 | (Timo Paulssen)++ | src/spesh/args.c: non-passed optional parameters make some BBs unreachable. They don't get removed yet, perhaps because they are still referred to by the dominance tree?
12:32
timotimo | ^- this also wants an accompanying case for the other branch of named optional parameters | 12:33
where we turn the instruction into a goto and ignore the "other" successor
and i'm not entirely sure what keeps the BBs alive that no longer get referred to | 12:34
oh, i know what's wrong with that | 12:38
the potential goto we're kicking out would skip the next block. we're just turning a potential skip into a guaranteed fall-through | 12:39
that's fair, then
future optimizations may be happy to see the predecessors disappear for that particular future block
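(A minimal, self-contained C sketch of the pattern described in the 898fe2a commit above: an op that writes an intermediate register read only by an immediately following set gets rewritten to write the set's target directly. All names here, toy_ins, fold_set_into_writer and friends, are invented for illustration; MoarVM's real spesh code works on its own instruction and facts structures and does considerably more bookkeeping.)

    #include <stdio.h>

    enum opcode { OP_ADD, OP_SET };

    typedef struct {
        enum opcode op;
        int dst;          /* register written */
        int src_a, src_b; /* registers read (src_b unused by OP_SET) */
    } toy_ins;

    /* How many instructions in ins[0..n) read register r? */
    static int usage_count(toy_ins *ins, int n, int r) {
        int count = 0;
        for (int i = 0; i < n; i++) {
            if (ins[i].src_a == r) count++;
            if (ins[i].op != OP_SET && ins[i].src_b == r) count++;
        }
        return count;
    }

    /* If ins[i] writes a register whose only reader is a set immediately after
     * it, make ins[i] write the set's target directly and degrade the set to a
     * self-copy (standing in for deleting it). */
    static void fold_set_into_writer(toy_ins *ins, int n) {
        for (int i = 0; i + 1 < n; i++) {
            toy_ins *cur = &ins[i], *next = &ins[i + 1];
            if (next->op == OP_SET
                && next->src_a == cur->dst
                && usage_count(ins, n, cur->dst) == 1) {
                cur->dst = next->dst;    /* write the final target directly */
                next->src_a = next->dst; /* the set becomes "rY = rY" */
            }
        }
    }

    int main(void) {
        /* r3 = r1 + r2; r4 = r3   ==>   r4 = r1 + r2 */
        toy_ins code[] = {
            { OP_ADD, 3, 1, 2 },
            { OP_SET, 4, 3, 0 },
        };
        fold_set_into_writer(code, 2);
        printf("add writes r%d, set copies r%d -> r%d\n",
               code[0].dst, code[1].src_a, code[1].dst);
        return 0;
    }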
12:39
dalek joined
12:53
tgt joined
timotimo | i may have to build a "not quite legit" fact discovery thing that relies on the mechanism we use to turn the SSA back into regular bytecode ... | 13:26
by looking at version gaps and building an artificial "set", or alternatively raising the version on the instruction that writes to a too-low version ...
it seems like it's now safe to uncomment the optimize_can_op now | 13:33
timotimo is spec-testing right now
dalek | MoarVM: f89f527 | (Timo Paulssen)++ | src/spesh/optimize.c: re-activate optimize_can_op; survives spec tests now.
13:44
MoarVM: 2aa669d | (Timo Paulssen)++ | src/spesh/optimize.c: turning stuff into sets should also cause fact copying
timotimo | hmpf. | 13:57
given a not_i whose source operand's value is known, setting the target operand's known value as well makes the core setting compilation fail | 13:58
could it be we're setting a known_value somewhere that's not actually correct?!
dalek | MoarVM/spesh_constant_folding: 7f1646f | (Timo Paulssen)++ | src/spesh/optimize.c: trivial constant folding for not_i. This breaks the rakudo build; maybe there's an optimization somewhere that sets a KNOWN_VALUE that ends up not having the correct value set?
14:13
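(A small hypothetical C sketch of the folding attempted on the spesh_constant_folding branch above: when the source operand of a not_i has a known value, the result's value is known too. The fact_t type and KNOWN_VALUE flag are stand-ins rather than MoarVM's actual facts representation; the bug discussed above would be an earlier pass recording a wrong known value, which folding then happily propagates.)

    #include <stdio.h>

    #define KNOWN_VALUE 1  /* flag: the integer value of this register is known */

    typedef struct {
        int  flags;
        long value;
    } fact_t;

    /* dst = not_i src: fold when the source value is known. */
    static void fold_not_i(fact_t *dst, const fact_t *src) {
        if (src->flags & KNOWN_VALUE) {
            dst->flags |= KNOWN_VALUE;
            dst->value = src->value ? 0 : 1;   /* not_i produces 0 or 1 */
        }
    }

    int main(void) {
        fact_t src = { KNOWN_VALUE, 0 };
        fact_t dst = { 0, 0 };
        fold_not_i(&dst, &src);
        printf("dst known=%d value=%ld\n", dst.flags & KNOWN_VALUE, dst.value);
        return 0;
    }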
14:42
FROGGS[mobile] joined
timotimo | is it expected that the benchmark '[+] (1 .. 10_1024).comb>>.Int' will spend a whole lot of time in bind and bind_one_param? (compared to the rest of things, that is) | 14:57
13.49% of exclusive time spent in "find_best_dispatchee", 11.55% of x-time in bind and 11.05% in bind_one_param
next one is &return with 9.16% x-time spent | 14:58
also the GC activity is fun to look at. first a bunch of stuff gets promoted, then 10 collections long it'll only promote and retain a bit of stuff, almost exactly the same amount each time | 14:59
then it basically promotes + retains about 100% each time for 35 runs
then it promotes & retains almost nothing at all for almost 500 runs | 15:00
then it promotes and retains wildly fluctuating amounts between 25% and 100%
interestingly, the profiler notices exactly two allocations in total | 15:01
one BOOTCode and one Block
huh. 'my int $i = 0; while $i < 1024 { my int $j = 0; while $j < 1024 { $k = $i + $j; $j = $j + 1 }; $i = $i + 1 }; say $k' also spends almost all of its time in find_best_dispatchee + bind_one_param and bind | 15:06
23.75% + 23.29% + 21.91% | 15:07
then 10% in at_key and 5.6% in a different at_key
15:26
dalek joined
16:10
kjs_ joined
16:30
kjs_ joined
japhb | That seems ... wildly wrong. All types are known and native; there are no loop controls and the loops are of the simplest non-infinite type. Why is this not optimized out the wazoo, and never dispatching anywhere? | 16:36
timotimo | good q. | 16:46
i would have expected that as well.
something quite weird happens with another of the benchmarks where there's a whole bunch of GC runs that discard almost 100% of their data, but the allocations tab doesn't show much; a manual introspection of the nursery with my gdb helper thingie reveals that the nursery gets filled up with MVMStaticFrame instances over and over and over | 16:50
on the one hand, these things annoy me greatly because i have no clue where to look to find out what's wrong | 17:08
on the other hand: yay, there's still performance improvements that can be had!
also: why does it look like the loops in that last benchmark don't get inlined?
timotimo is hoping for a little bit of jnthn magic to fix these things up a bit | 17:09
i wonder if we do worse than we used to in these benchmarks? i can't really run benchmarks on my desktop at the moment, as its ipv4 is b0rked from a distro upgrade ... | 17:11
17:52
FROGGS__ joined
17:58
zakharyas joined
18:28
colomon joined
timotimo | i see that our profiler has no way to handle operations that may or may not allocate | 18:32
how about adding a "does the object that was passed to prof_allocated reside at the very end of the nursery?" check, and setting more extops in rakudo to "ALLOCATING"?
jnthn: sorry for the huge amount of backlog filler %) | 18:41
so ... the at_key use comes from Stash | 18:51
why the hell would it want to access a Stash object rather than going directly through the lexpad for that code?!
dalek | MoarVM: a1a9f88 | (Timo Paulssen)++ | src/core/interp.c: when bindlex fails, we should report "bindlex", not "getlex"
19:01
MoarVM: e92c6c8 | (Timo Paulssen)++ | src/profiler/instrument.c: takeclosure is a very popular allocating op.
timotimo | ^- this gives us more precise profiles
bind_one_param allocates 4194308 BOOTCode objects in the doubly nested while loop we have up there | 19:02
which ... just wow.
is_bindable and at_key give 1048577 each and find_best_dispatchee is responsible for another 2097160 | 19:03
i *still* don't know why we're going through Stash, though
we really should be inlining these blocks anyway
m: say "test"
camelia | rakudo-moar 3bbf7b: OUTPUT«test»
timotimo | m: "that benchmark allocates { 8389651 / 4194308 } as many BOOTCode as it does Scalar" | 19:04
camelia | ( no output )
timotimo | m: say "that benchmark allocates { 8389651 / 4194308 } as many BOOTCode as it does Scalar"
camelia | rakudo-moar 3bbf7b: OUTPUT«that benchmark allocates 2.00024676 as many BOOTCode as it does Scalar»
timotimo | that's pretty impressive.
but all of these Scalar allocations happen in at_key and bind_one_parameter
japhb: see what i did there? for some reason i was missing the declaration of $k as a lexical and that made stuff blow up >_< | 19:26
and of course that's already fixed in newest perl6-bench | 19:27
grrr. been hunting ghosts again | 19:28
jnthn: if you don't have time (or energy (which i would understand)) to backlog over my wall of text, i'd like you to answer at least this simple-ish question: | 19:29
for ops (i've only looked at extops here, really) that only sometimes allocate, should there be a check "does the value we got back from that operation look like it was allocated very recently?"
jnthn | Well, for simplicity we may want to consider just always putting the check in. | 19:31
Since it goes in the profiling code
timotimo | that's what i thought; so you think the check sounds like a sane idea?
i'd probably compare if object address + object size is equal to or at least "close to" the current allocation pointer of the nursery | 19:32
hm, but that wouldn't cover allocating "directly to gen2"
jnthn | To the degree a guy who has been traveling for 20 hours knows what's sane... :P
We don't need to cover "directly to gen2" really
timotimo | OK
jnthn | It's typically done by things like deserialization or bytecode loading
Which aren't really anything the user can do anything about in their program
timotimo | fair enough | 19:34
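(A toy C sketch of the heuristic discussed above: treat an object as freshly allocated when its address plus its size meets the nursery's current allocation pointer. The toy_tc and toy_object types are invented for illustration and are not MoarVM's thread context or collectable header; as noted above, objects allocated directly into gen2 are deliberately not covered.)

    #include <stddef.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        size_t size;          /* size of the allocation, header included */
        /* ... payload ... */
    } toy_object;

    typedef struct {
        char *nursery_alloc;  /* bump-allocation pointer: next free byte */
    } toy_tc;

    /* Returns 1 if obj looks like the most recent nursery allocation. */
    static int looks_freshly_allocated(toy_tc *tc, toy_object *obj) {
        return (char *)obj + obj->size == tc->nursery_alloc;
    }

    int main(void) {
        char *nursery = malloc(256);            /* stand-in for the nursery */
        toy_tc tc;
        toy_object *obj = (toy_object *)nursery;
        obj->size = 32;

        tc.nursery_alloc = nursery + 32;        /* obj was the last thing bumped */
        printf("fresh? %d\n", looks_freshly_allocated(&tc, obj));

        tc.nursery_alloc = nursery + 96;        /* something else allocated since */
        printf("fresh? %d\n", looks_freshly_allocated(&tc, obj));

        free(nursery);
        return 0;
    }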
jnthn: github.com/rakudo/rakudo/blob/nom/...aops.pm#L3 - do you know of a way to make this allocate a ludicrous amount of BOOTCode? | 19:35
could nqp::getstaticcode or something similar be used for that?
maybe a macro would be sensible ... not that we have that working yet %)
jnthn | Well, I put a desugars mechanism into Actions with the idea of turning some of these very common meta-ops into just some QAST nodes | 19:37
I can live with all the list-processing meta-ops involving a bit of HOP; you're normally dealing with a bunch of data.
timotimo | ah | 19:38
jnthn | But would prefer the assign and not ones just do some code-gen, I think...
timotimo | great. what should i grep for to find that?
jnthn | But can only do that when you know they are executing immediately | 19:39
So it's not so straightforward
timotimo | oh, you mean we have to check we're not doing something like my &foo = &[+=]
jnthn | Exact.
timotimo | i have an idea how to figure that out in the optimizer. in the Actions, however ... not so much
maybe it could be a use case for Want? | 19:40
er, no, that doesn't make sense
if += appears in void context, it wouldn't do anything at all
ah, the desugar thing is the very first thing in actions | 19:42
jnthn | The optimizer may actually be a much easier place than the desugar... | 19:46
Since you have the context to hand
timotimo | that should decrease the memory pressure on tight loops that use += a *whole* lot | 19:47
19:50
Ven joined
timotimo | 294 collections instead of 938 when i write the += out | 19:51
jnthn | so collect /o\ | 19:55
bbiab
20:10
Ven joined
dalek | MoarVM/6pe: 5a3f555 | jonathan++ | docs/6model-parametric-extensions.markdown: Start documenting the parametric 6model design.
20:59
MoarVM/6pe: b9c4ee9 | jonathan++ | / (6 files): Stub parametricity-related ops.
MoarVM/6pe: 7812954 | jonathan++ | src/6model/6model.h: STable extensions for parametricity.
MoarVM/6pe: 5edcbb5 | jonathan++ | src/gc/collect.c: GC marking for STable parametricity bits.
jnthn | Hm, and there were more commits, but I overflew dalek.... | 21:00
FROGGS__ | jnthn: that's a prep for NSA? | 21:02
such abbr
jnthn | Amongst other things, yes. | 21:03
It's one of the two main VM-level pieces needed for NSA
FROGGS__ | what's the other one? | 21:04
jnthn | Well, or it will be when I get it done. :P
Other one is the native references.
I'm very contented with the 6pe design.
Well, what I have of it so far
FROGGS | that sounds good to me :o)
jnthn | Still need some more brain cycles on the native references stuff. Something felt a little off last time I was working on those. | 21:05
Probably I just need some more concentrated, non-exhausted time.
Train journeys tend to be good thinking time, and I'll be back and forth to Stockholm like a yoyo for the next several weeks... So I think I'll get a design I like straightened out and coded up within the next weeks. :) | 21:06
FROGGS | what are native refs in one sentence?
jnthn | Well, consider the naive compilation of: my $x := @omg-i'm-a-native-int-array[42]; $x = 69; | 21:08
There's no Scalar container in a native array. A native reference is an assignable thingy that is a reference to a native location.
FROGGS | ohh, understood | 21:09
jnthn | They're a bit curious to design because you're trying to optimize for being able to kill them off in the earliest possible optimizer. :)
FROGGS | thanks :o)
jnthn | That is, those that Perl6::Optimizer can kill off, it should. What it can't, spesh + inlining should be able to do something about.
At least, for the kinds of cases people are likely to write. | 21:10
So jetlag. Very bedtime. zzz | 21:12
TimToady | o/
FROGGS | gnight jnthn :o)
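(A rough, hypothetical C sketch of jnthn's one-sentence description above: an assignable thing that refers to a native location, here an element of a native int array, standing in for "my $x := @native-int-array[42]; $x = 69;". The names are invented and this ignores everything that makes the real design hard, namely covering lexicals, attributes and positionals while staying cheap enough for Perl6::Optimizer and spesh to remove.)

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        int64_t *slot;     /* the native location being referred to */
    } toy_int_ref;

    /* Take a reference to one element of a native int array. */
    static toy_int_ref ref_to_elem(int64_t *array, size_t index) {
        toy_int_ref r = { &array[index] };
        return r;
    }

    static void    ref_assign(toy_int_ref r, int64_t value) { *r.slot = value; }
    static int64_t ref_fetch(toy_int_ref r)                 { return *r.slot; }

    int main(void) {
        int64_t native_int_array[64] = { 0 };

        /* my $x := @native-int-array[42]; $x = 69; */
        toy_int_ref x = ref_to_elem(native_int_array, 42);
        ref_assign(x, 69);

        printf("element 42 is now %lld\n", (long long)ref_fetch(x));
        return 0;
    }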
timotimo | &infix:<+> is put into the metaop as a QAST::Var lexical | 22:02
that's a tiny bit problematic
FROGGS | ⁺ <--- just a tiny bit | 22:03
timotimo | :)
i wonder how i should analyze this to figure out it's not going to change under my feet
m: &infix:<+> = sub ($a, $b) { 1 }; | 22:04
camelia | rakudo-moar 3bbf7b: OUTPUT«Cannot modify an immutable Sub+{<anon>}+{Precedence} in block <unit> at /tmp/1sBJjsoRIM:1»
FROGGS | m: my &infix:<+> = sub ($a, $b) { 1 };
camelia | ( no output )
FROGGS | would that hurt also?
timotimo | no | 22:11
it's just being referenced
oh, i'd just copy the name over into the call's name
and that'd be fine |