github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018. |
00:04
MasterDuke left
00:08
MasterDuke joined
jnthn | sleep & | 00:11 | |
evalable6 | Use of uninitialized value of type Callable in numeric context in block <unit> at /tmp/8… |
jnthn, Full output: gist.github.com/946c87169c3653d975...7cbd83d1ac | |||
jnthn | :D | ||
timotimo | that's lovely | 00:15 | |
00:55
stmuk_ joined
00:57
stmuk left
01:49
stmuk joined
01:51
stmuk_ left
03:20
MasterDuke left
05:02
rubes joined
05:29
brrt joined
05:51
robertle joined
07:00
domidumont joined
07:06
domidumont left
07:07
domidumont joined
07:32
rubes left
Geth | MoarVM: 27940b4d48 | (Bart Wiegmans)++ | 3 files Add a tool to dump sizes of REPR structures Mainly so that I have an idea of how large these structures really are and how much we'd waste when aligning to a cache line boundary. |
07:41 | |
07:42
lizmat joined
09:16
lizmat left
09:19
lizmat joined
09:22
robertle left
jnthn | morning o/ | 09:30 | |
09:30
nativecallable6 joined
09:38
lizmat_ joined
09:39
brrt left
09:40
lizmat left
09:49
lizmat_ is now known as lizmat
lizmat | Would it make sense to check in the Optimizer for the infix:<~> struct that " $_ $_ $_ " generates, and replace that by a structure that does | 10:12 | |
nqp::join('',nqp::list(" ",$_.Str," ",$_.Str," ",$_.Str," ")) | 10:13 | ||
preliminary testing shows this could make this particular case 1.8x as fast with about a third of the allocations | 10:15 | ||
nwc10 | jnthn: ASAN finds your branch to be very boring | ||
lizmat | fwiw, this is inspired by the introduction in Perl 5 of a specific op to handle this | ||
which is one of the reasons why 5.28 got a bit faster | 10:16 | ||
I figured we have most of the tools for this already :-) | |||
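A minimal sketch of the rewrite lizmat describes, written in Python rather than NQP. Chained binary concatenation builds an intermediate string at every step, while gathering the parts into a list and joining once builds the result in a single go. The function names are hypothetical and this is only a model of the idea, not Rakudo optimizer code.

```python
def concat_chain(x):
    # Models " $_ $_ $_ " compiled as nested infix:<~> calls:
    # each + produces a fresh intermediate string.
    s = str(x)
    return " " + s + " " + s + " " + s + " "


def concat_join(x):
    # Models nqp::join('', nqp::list(...)): collect the parts, join once.
    s = str(x)
    return "".join([" ", s, " ", s, " ", s, " "])
```

Both functions return the same string; only the number of intermediate allocations differs, which is where the reported saving would come from.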
nwc10 | (it might be that I'm missing stuff where tests shell out, and any barfage gets eaten and reported as "just a failure". I'm finding the spectests a bit noisy under ASAN) | ||
jnthn | lizmat: With the caveat of needing to check there's no user-defined ~ and we're calling the setting one | ||
nwc10: Glad to be boring to ASAN :) | 10:17 | ||
lizmat | jnthn: how would I know that in the optimizer? | 10:19 | |
jnthn | Loads of opts do it, I think there's even a "is in setting" sub or method to hep | 10:20 | |
*help | 10:22 | ||
yay, my decont less during binding patches in Rakudo pass spectest | |||
stresstest now | |||
nwc10 | surely stresstest starts about 20:00 CEST this evening, and serious stresstest happens if Extra Time ends at a draw? | 10:26 | |
jnthn | :D | 10:28 | |
Really | |||
Croatia looked better in the group games; I'm hoping that's just a case of "it's easy to look better under weaker opposition" rather than "they lost their groove for a couple of matches but might now find it again" | 10:31 | ||
10:37
brrt joined
jnthn | Patches I have locally get an inlined .chars method call from 13 instructions including 3 guards down to 9 instructions including 1 guard | 10:37 | |
Presumably similar applies for many other such calls | 10:38 | ||
And there's more squeezing possible yet | |||
Yeah, 2 more `set` instructions can go away | 10:40 | ||
Probably takedispatcher can go away | 10:41 | ||
Maybe a third set also | 10:42 | ||
5 ops is probably as low as it can go without escape analysis and scalar replacement and other cleverness that'd let us eliminate one more guard post-inline | 10:43 | ||
And the boxing can go away with upcoming opts | |||
10:46
brrt left
lizmat | .oO( I love unboxing fast presents ) |
10:49 | |
dogbert17 | jnthn: am I correct in assuming that, atm, some programs can be expected to run slower than they did before the rescalar work? | 10:54 | |
lizmat | dogbert17: answering for jnthn: yes, they can (as shown in test-t btw) | 10:55 | |
jnthn | Yes, that can happen. If you have any *small* examples, I'd be interested to see them. | ||
The main thing I know about is hash access | |||
Well, hash vivifying access | 10:56 | ||
dogbert17 | how about this: gist.github.com/dogbert17/1a084fce...ae89838ca1 | 10:57 | |
jnthn | Not sure why that one'd be slower :) | 11:04 | |
Will have a poke :) | |||
dogbert17 | for me, it takes ~8 sec on 2018-06 but ~10 secs now | 11:10 | |
11:11
brrt joined
dogbert17 | i.e. moarvm du-chains-and-opts | 11:11 | |
jnthn | But it was the rescalar merge that made it slower, I presume? | ||
Not the du-chains-and-opts branch? | |||
dogbert17 | both it would seem, the cutrrent version, i.e. Camelia takes 9 secs | 11:12 | |
*current | 11:13 | ||
I can try to figure out which commit might have caused this if it will save you some time for cool opts :) | 11:14 | ||
timotimo | du-chain-and-opts still has all those checks in there that probably delay speshing by a noticeable bit? but probably not by two seconds | 11:15 | |
jnthn | It also turns out use of spesh stats for decont | 11:16 | |
*turns on | |||
Which can lead to huge wins | |||
But did imply more guards in various cases | 11:17 | ||
But the commit I just did in Rakudo can reduce that :) | |||
So you may want to try it with Rakudo HEAD | |||
ooh, lunch time :) | |||
bbl | |||
dogbert17 | will try it out right away :) have a nice lunch | 11:19 | |
lizmat | Sometimes I wonder whether MoarVM should have an opt for integer sort: sorting.cr.yp.to :-) | 11:22 | |
dogbert17 | things improved quite a bit, hooray | 11:23 | |
lizmat doesn't see it, could be noise | 11:24 | ||
6.083 -> 5.923 | |||
dogbert17 | it was a specific example which improved but there's still some slowness. I'll see if I can figure out where it comes from | 11:27 | |
11:35
robertle joined
Geth | MoarVM: 0e5d6aa4ce | (Samantha McVey)++ | src/6model/reprs/MVMHash.c Add some branch prediction hints to MVMHash.c |
11:43 | |
MoarVM/Unicode-11.0: 6f6cd10706 | (Samantha McVey)++ | 3 files Update Unicode to 11.0 and update grapheme break rules Update to the 11.0 version of the Unicode database. Get rid of multiple old rules using now obsolete Emoji rules/guidelines in favor of the new Extended_Pictographic property. We don't pass three of the Unicode 11.0 grapheme break tests, though this is about where we were for the Unicode 10.0 tests and is acceptable enough to change versions. |
brrt | .ask samcv - is it possible to nest strands? | 11:51 | |
yoleaux | brrt: I'll pass your message to samcv. | ||
brrt | as in, does that ever happen | 11:52 | |
it is certainly possible, it seems | |||
timotimo | it's possible, but we flatten strands in strands | 11:53 | |
so it doesn't occur | |||
brrt | hmm | 11:56 | |
ok | |||
good to know | |||
otherwise the grapheme iterator would be problematic | |||
timotimo | "flatten" is perhaps misleading | 11:57 | |
we copy the strands of one string into the strands array of another, like, append it | |||
except it may go in the middle, of course | |||
brrt | i see | 12:01 | |
that makes sense | |||
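To illustrate what timotimo means by copying strands rather than nesting them, here is a small Python model. The `Strand` and `Rope` classes and the `concat` helper are hypothetical names for this sketch and do not mirror MoarVM's actual structs. The point is only that a strand always refers to flat storage, because concatenating a strand string appends its strands to the new strand list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Strand:
    text: str   # flat backing storage, never another rope
    start: int
    end: int    # exclusive


@dataclass
class Rope:
    strands: List[Strand]


def strands_of(s):
    # A rope contributes copies of its own strands; a flat string becomes one strand.
    if isinstance(s, Rope):
        return list(s.strands)
    return [Strand(s, 0, len(s))] if s else []


def concat(a, b):
    # Append b's strands after a's: no strand ever points at a rope.
    return Rope(strands_of(a) + strands_of(b))


def flatten(r):
    return "".join(st.text[st.start:st.end] for st in r.strands)
```

With this shape an iterator over the result only has to walk a single level of strands.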
the maximum number of strands is defined as 64, is that even respected? | 12:02 | ||
timotimo | not sure | ||
jnthn back | 12:04 | ||
samcv | what timotimo said | ||
yoleaux | 11:51Z <brrt> samcv: - is it possible to nest strands? | ||
jnthn | dogbert17: How much did things improve with that script with the latest Rakudo change? | 12:06 | |
OK, I think I finally need to take care of the `set` elimination code... | 12:09 | ||
So it doesn't bust up the graph | |||
And probably can do it better with the DU chains also | 12:12 | ||
timotimo | that seems likely | 12:13 | |
brrt | samcv: i just noticed that in strings/ops.c collapse_strands, we first have a loop to check if they all have a common storage type, and if so, to use a fast path for the collapse | 12:15 | |
samcv | brrt: yep | 12:16 | |
brrt | only thing is, if not, we assign common_storage_type to 1, which is MVM_STRING_GRAPHEME_ASCII | ||
oh, hang on | |||
i'm misreading because we're assigning to -1, not +1 | |||
never mind me | 12:17 | ||
:-) | |||
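The check brrt is reading can be modelled roughly like this in Python. The constant values are illustrative assumptions (only the ASCII value of 1 is mentioned in the chat), and `common_storage_type` is a hypothetical helper, not the code in strings/ops.c. It shows why the sentinel has to be -1: a positive 1 would be indistinguishable from a real storage type.

```python
GRAPHEME_32 = 0      # assumed value, for illustration only
GRAPHEME_ASCII = 1   # the value brrt mentions for MVM_STRING_GRAPHEME_ASCII
MIXED = -1           # sentinel: strands do not share one storage type


def common_storage_type(strand_types):
    # Return the storage type shared by all strands, or MIXED if they differ.
    common = strand_types[0]
    for t in strand_types[1:]:
        if t != common:
            return MIXED
    return common
```

A caller would take the fast path only when the result is not `MIXED`.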
samcv: given that you know the string code pretty well, any opinion about an in-situ string type | 12:25 | ||
lizmat | alas, brrt's blog post dropped from the HN front page | 12:42 | |
jnthn | dogbert17: Oh, just realized. The DU check branch still has the DU checks turned on | 12:49 | |
dogbert17: Which means we're doing a bunch of expensive sanity checking | |||
That'll be switched off before merge | |||
(It's a debugging option, I'm just keeping it on for now to catch my mistakes :)) | 12:50 | ||
nwc10 | isn't that what users are for? :-) | 12:51 | |
jnthn | It's too cruel for them to suffer *all* my mistakes :P | 12:52 | |
dogbert17 | jnthn: got it :) the Rakudo fix posted before lunch definitely helped the example I gisted | 12:53 | |
jnthn | Nice | ||
Yeah, I think all the opts we did with set before plus can be subsumed into two more general opts | 12:58 | ||
Which catch more cases | |||
timotimo | register renaming in general perhaps? | 12:59 | |
13:01
lizmat left
13:04
lizmat joined
brrt | my post was on HN at all? | 13:04 | |
lizmat | brrt: yup, made it to the front page for a while, 22 was the highest I've seen it | ||
news.ycombinator.com/item?id=17496789 | |||
33 now | 13:05 | ||
brrt | that kind of dwarfed my earlier visitor statistics :-D | 13:07 | |
jnthn | Yay, the new set elimination actually gets rid of those two set instructions I mentioned in the `chars` spesh | 13:15 | |
Aside from the takedispatcher, it's now as few instructions as it could be for the arguments in question | |||
brrt | :-o | 13:16 | |
jnthn waits for spectest | 13:17 | ||
gist.github.com/jnthn/2c891b816f9a...73b6631120 | 13:21 | ||
Annotated version | |||
Note how it proved the return type check of Int:D could be omitted too | 13:22 | ||
Geth | MoarVM/du-chains-and-opts: d69506a33b | (Jonathan Worthington)++ | 3 files Factor out inc/dec op check |
13:26 | |
MoarVM/du-chains-and-opts: 6cf1841527 | (Jonathan Worthington)++ | src/spesh/optimize.c A new algorithm for removing unrequired `set`s Checks if either the `set` is the only reader of the value that it writes, or if there is only one user of the value that it writes. Uses the DU-chain to quickly find the paired writer or reader, and if there's on conflicting register use then does a graph rewrite to enable deletion of the `set` instruction. These seem to cover the cases that the previous more adhoc `set` removal optimizations did, while also not making a mess of the SSA form in the graph. |
jnthn | gah, typo. If there's *no* conflicting... | ||
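One reading of the two cases in that commit message, as a Python toy over straight-line SSA code. Instructions are `(op, dest, sources)` tuples, the op names are opaque labels, and `eliminate_sets` is a hypothetical function for this sketch. It assumes there are no conflicting register uses, so the conflict check the commit mentions is left out, and it finds readers by scanning instead of following a DU chain.

```python
def eliminate_sets(instrs):
    """Remove `set` instructions from a list of (op, dest, sources) tuples."""
    instrs = [(op, dest, list(srcs)) for op, dest, srcs in instrs]
    changed = True
    while changed:
        changed = False
        for i, (op, dest, srcs) in enumerate(instrs):
            if op != "set":
                continue
            src = srcs[0]
            readers_of_src = [j for j, ins in enumerate(instrs) if src in ins[2]]
            readers_of_dest = [j for j, ins in enumerate(instrs) if dest in ins[2]]
            writer = next((j for j, ins in enumerate(instrs) if ins[1] == src), None)
            if readers_of_src == [i] and writer is not None:
                # Case 1: the set is the only reader of its source,
                # so the writer can produce the set's destination directly.
                w = instrs[writer]
                instrs[writer] = (w[0], dest, w[2])
            elif len(readers_of_dest) == 1:
                # Case 2: the set's result has a single user,
                # so that user can read the source directly.
                u = instrs[readers_of_dest[0]]
                instrs[readers_of_dest[0]] = (
                    u[0], u[1], [src if s == dest else s for s in u[2]])
            else:
                continue
            del instrs[i]
            changed = True
            break
    return instrs
```

Because only operands are rewritten and each register version still has exactly one writer, the result stays in SSA form.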
nwc10 | git neuralize --interactive ... | 13:28 | |
jnthn | Czech lesson now. Hope I speak better than I'm typing today... | 13:29 | |
bbl | |||
13:49
brrt left
14:32
brrt joined
jnthn back | 14:50 | ||
brrt | \o | 14:51 | |
lizmat++ for comments on the HN discussion :-) | |||
jnthn | So...what was next on my todo list... :) | 14:54 | |
brrt | well, is it about time for curry yet? | 14:55 | |
jnthn | Not quite :) | 14:57 | |
Anyway, the thing I was going to do was the box/unbox pair elimination | 14:58 | ||
timotimo | thought that'd be it | ||
jnthn | The set elimination was a good warmup for that :) | 14:59 | |
Ohh... :/ | 15:02 | ||
p6box_i is not a box_i :S | |||
timotimo | ah, right | 15:03 | |
jnthn | I was going to change that anyway though | ||
timotimo | i was surprised the day before yesterday to see no spesh functions in rakudo's ops | ||
you'll turn it into code-gen? | |||
brrt wonders about the reason for the sequence of 'data = OBJECT_BODY(root); cbq = (ConcBlockingQueue*)data;' | |||
jnthn | Yeah, think so | ||
brrt: I forget the details but it gets *very* hairy in there making sure we don't look at outdated state at all the "invisible" points GC might happen | 15:04 | ||
timotimo: Trying to decide between hllboxtype_i + box_i and wval + box_i | 15:08 | ||
timotimo | i was about to ask what you'd prefer | ||
jnthn | wval inlines cross-HLL | 15:09 | |
timotimo | wval gives you a spesh slot that'll have to be GC'd | ||
jnthn | But the former is two bytes shorter | ||
hllboxtype_i won't inline at all | |||
timotimo | oh, we don't turn that into a wval in inlining? | ||
15:10
domidumont left
jnthn | No, though we could, or a spesh slot | 15:10 | |
timotimo | i mean we totally could do that and make it inlinable that way. want me to implement it? | ||
jnthn | Hmm, that's a good point. | ||
Then we can have shorter bytecode :) | |||
It'll still be longer than today | |||
But still :) | |||
timotimo: Feel free to do that | 15:11 | ||
timotimo | do you prefer spesh slot or wval? | ||
spesh slot is shorter in the bytecode | |||
jnthn | We can't be sure it's in an SC | ||
So ss | |||
ooh, just realized...I can do this just by turning p6box_i into a desugar :) | 15:12 | ||
timotimo | i thought that's what i suggested when i said "codegen" up there %) | ||
15:13
robertle left
jnthn | Yeah, just hadn't realized if I do it as a desugar it's probably very easy :) | 15:13 | |
timotimo | right, right | ||
i wonder if get*ref should have something done to it because it also uses the hll | 15:14 | ||
jnthn | I don't think there's any practical cases of this | 15:17 | |
(cross-hll use I mean) | |||
timotimo | OK | 15:19 | |
brrt | oh, I see what you mean; the object body etc. aren't MVMROOT-ed | ||
timotimo | jnthn: do you prefer i remove :useshll from the hllboxtype_* ops, or to special case it in the "is_graph_inlinable" function to make those exempt? | ||
jnthn | timotimo: rewrite them in try_get_inline_graph and remove the annotation | 15:20 | |
timotimo | OK | ||
brrt | then again; we could simplify that by allocating the body with malloc, so it never moves | 15:21 | |
timotimo | actually, i was putting the rewrite into merge_graph, which is also where the rewrites for wvals, strings, etc lives | ||
jnthn | ah | 15:22 | |
No, I think we probably want it in the same place that we handle lexical stuff | |||
to the getlexvia opcodes | |||
timotimo | that's also in merge_graph :) | 15:23 | |
jnthn | Oh? | ||
Hm :) | |||
OK, I guess it's alright there then | |||
d'oh, we don't map the hllboxtype ops | |||
brrt | ... and, if I implement an 'unshift' method for ConcBlockingQueue, I can make an object that literally 'jumps the queue', and tells the spesh worker to stop working | 15:24 | |
plan in the making | |||
timotimo | map? | 15:25 | |
oh, you mean the desugar can't reach it | |||
jnthn | right | 15:26 | |
timotimo | i've got hllboxtype_[ins] as well as hlllist and hllhash, should i also do iter? any others? | ||
jnthn | Easily solved | ||
Think that's enough | |||
I'm not sure Rakudo even uses hlllist and hllhash though | |||
timotimo | mhm | ||
they were right next to it in the list ;) | |||
lizmat | no hlllist/hllhash in src/core | 15:28 | |
timotimo | i don't think i can really test if it works, so i'll just commit it right now. ok, jnthn? | 15:29 | |
15:31
lizmat left
jnthn | Hm, I guess :) | 15:33 | |
Well, you can maybe make a synthetic example to try it out | |||
15:33
lizmat joined
timotimo | hm, now that hllboxtype_* is mapped in nqp | 15:34 | |
jnthn | brrt: Only to under 20 extops for Perl 6 once I kill off p6box_* :) | 15:37 | |
*Down to under | |||
15:38
robertle joined
timotimo | i'm not actually sure how to get a rakudo sub that uses hllboxtype_* to be tried to inline into an nqp sub | 15:38 | |
jnthn | Probably easier the other way round since you can use :from<nqp> maybe | 15:39 | |
brrt | jnthn++ 🎉 | 15:40 | |
timotimo | oof | ||
lizmat | afk& | 15:41 | |
timotimo | ooh | 15:44 | |
looks like a moarvm test in rakudo's t/ tickles this perhaps? | |||
hm, no, not sure that's what's happening | 15:45 | ||
i get a MoarVM oops: Malformed DU chain: writer sp_getspeshslot of 4(1) in BB 19 is incorrect | |||
but i don't have to set the writer of that register when what i do is just replace an op's info and its operands list | 15:46 | ||
oh, huh, what. | 15:48 | ||
ah, yes. | 15:49 | ||
that makes the patch besser | |||
i'm happy with the patch now, i'll go ahead and push it for you | 15:50 | ||
Geth | MoarVM/du-chains-and-opts: f8c4648fb6 | (Timo Paulssen)++ | 3 files allow hllboxtype_* across hll in inlines |
MoarVM: 69e2a388eb | (Timo Paulssen)++ | src/profiler/instrument.c add a few missing allocating ops to profiler |
17:03 | ||
jnthn | Hmm, the box/unbox elim is going to need some fresh brane tomorrow, I think :) | 17:15 | |
timotimo | can you push a WIP? maybe i can have a look? | 17:16 | |
jnthn | Well, it's actually to do with the conflict-free checking in the immediate | ||
timotimo | mhm | ||
jnthn | Which I thought I could liberalize a tad, but an exploding CORE.setting suggests not | ||
timotimo | so with the less liberal version it works but isn't worth quite as much? | 17:17 | |
jnthn | Well, misses my test case :) | 17:18 | |
Geth | MoarVM/du-chains-and-opts-WIP: 51ad24c88c | (Jonathan Worthington)++ | src/spesh/optimize.c Liberalize set elimination somewhat Sometimes we've no interesting control flow to worry about within the graph: even if there may be calls to things outside the graph, those can never touch our locals. Thus a linear sequence of instructions is enough to look for to see we have no conflicts. Further, due to SSA form, we need only look for *writes* that make a new version of the ... (5 more lines) |
17:20 | |
MoarVM/du-chains-and-opts-WIP: d259d7a060 | (Jonathan Worthington)++ | src/spesh/optimize.c Untested sketch of box/unbox elimination |
jnthn | Heading home for noms and semis :) | 17:21 | |
timotimo | good noms, and happy sportsballin' | ||
17:25
domidumont joined
17:42
domidumont left
18:28
brrt left
18:53
Ven`` joined
Geth | MoarVM: 1b075ec1a7 | (Samantha McVey)++ | src/strings/ops.c Set MVM_string_substrings_equal_nocheck to static We don't use it anywhere else so make it a static function (could maybe cause it to be inlined more often). |
19:07 | |
19:09
dogbert17 left
19:17
lizmat left
19:24
lizmat joined
lizmat has given an answer to www.reddit.com/r/perl/comments/8xh...d/e27bj4k/ | 20:00 | ||
I was *really* glad to see that in this particular benchmark, you can say that Perl 6 is now 3x as fast as Perl 5 | 20:01 | ||
I wonder how many other benchmarks we could try after jnthn's work is merged :-) | |||
.ask jnthn if I see "identity" from BOOTSTRAP take 20% of a benchmark, and I see identity simply returning its parameter, shouldn't that get optimised away completely ? | 20:06 | ||
yoleaux | lizmat: I'll pass your message to jnthn. | ||
lizmat | m: my int @a; my int $i; @a.push(++$i) while $i < 5_000_000; say now - INIT now # the code in question | 20:07 | |
camelia | 0.1196005 | ||
20:40
MasterDuke joined
timotimo | that's nice | 20:53 | |
a tiny bit sad that my int @a = 1 .. 5_000_000 takes almost 2x as long | |||
hm, actually, 0.26 - 0.1 vs 0.51 - 0.1 | |||
jnthn | lizmat: Would be interested to see the code that does that, though if that really is covering the main cost of the program, we might just be measuring profile overhead... | 20:58 | |
yoleaux | 20:06Z <lizmat> jnthn: if I see "identity" from BOOTSTRAP take 20% of a benchmark, and I see identity simply returning its parameter, shouldn't that get optimised away completely ? | ||
jnthn | Arguably if we optimize an inlinee out so completely that it is only the profile instructions, we could just toss them :) | 21:00 | |
timotimo | true | ||
jnthn | Though it's maybe possible to argue that the other way. :) | ||
timotimo | yeah, the identity frame after spesh is literally no_op, sp_getarg_o, return_o | 21:01 | |
jnthn | Right, and the no_op isn't actually spat out :) | 21:02 | |
timotimo | ah, it has a prof_allocated in there, too | 21:04 | |
enterspesh, getarg, allocated, exit, return | |||
we'd be able to tell when the sp_getarg_o has been turned into a set that it can't have allocated | |||
jnthn | Yeah, though note that after inlining, once I get things as aggressive as I want, identity will probably just look like enterinlined, allocated, exit or some such :) | 21:05 | |
And if that allocated went away then enterinlined/exit being together could be the "delete" heuristic. | |||
timotimo | ok so push looks like enterspesh, getarg, allocated, set, set, getarg_i, takedispatcher, push_i | 21:06 | |
and then it passes the object that's been pushed into to identity | |||
is that right? | |||
heh, also: calls prof_allocated twice on the thing, one extra time after the "return" from the inlined code | 21:07 | ||
jnthn | Would have to see it in context | ||
I need to spend some time on natives, especially nativeref | |||
timotimo | the code is a multi method push on intarray:D; it's nqp::push_i(self, $value); self | 21:08 | |
i don't think i remember when identity was introduced | |||
jnthn | if it's at the end, return type check | 21:09 | |
Or return decont check | |||
walk, bbiab | |||
timotimo | oh, from a spesh plugin? | 21:10 | |
good walk! | |||
but yeah, lizmat, that's really just measuring profiler overhead, some of which could arguably already be kicked out with a bit of intelligence | 21:12 | ||
lizmat | ah, so you're saying "identity" wouldn't get called if there's no profiling going on ? | 21:17 | |
timotimo | no, it does get called | 21:19 | |
but it doesn't do anything except record for the profiler that it was called | 21:20 | ||
oh, jnthn, can you give me your box/unbox removal test case, too? | 21:22 | ||
MasterDuke | fwiw, perf has 'push' at the top with 7.5% for lizmat's bench | 21:23 | |
21:41
MasterDuke left
21:44
MasterDuke joined
21:48
Ven`` left
timotimo | jnthn: try_eliminate_one_box_unbox seems to have lacked a unbox_ins->info = MVM_op_get_op(MVM_OP_set) | 21:49 | |
that doesn't make things work right, of course ;) | 21:54 | ||
env MVM_SPESH_BLOCKING=yes MVM_SPESH_NODELAY=yes perl6 -e 'use Test' | 21:57 | ||
===SORRY!=== | |||
)))))))))))))))))))))))))))))))) | |||
21:58
Ven`` joined
21:59
Ven`` left
jnthn | timotimo: I didn't even compile it, I don't think, just shoved what I had in a branch so I could go home :) | 22:33 | |
Before that, I could do with figuring out why the looser conditions for set opts break things | 22:36 | ||
Though I think I have an idea why :) | |||
timotimo | i didn't actually invest noticeably much time into it apart from spewing debug printfs all over and getting super confusing explosions :) | 22:38 | |
jnthn | Yeah, when I allowed set elimination to go multi-BB, I neglected to remember that one of the benefits of it being single BB was that just checking the instruction wasn't in an inlined BB was enough | 22:40 | |
Which could cause some quite interesting explosions due to missing stuff deopt might need | 22:41 | ||
timotimo | oh, so the problem is in set elimination, not box/unbox elimination | ||
jnthn | Dammit | ||
Still blows up in CORE.setting | 22:42 | ||
Oh. It seems that it might actually be the change I did to only consider there to be a conflict if we see a write creating a new version of the register | 22:45 | ||
Ah, and that may make sense | |||
timotimo | plus hopefully something about inc/dec? | 22:48 | |
wait, does that make sense ... | |||
jnthn | No, it's not about that, I already rule those out of the opt :P | 22:49 | |
So, it's that reads are actually conflicting maybe | 22:50 | ||
jnthn writes some examples | |||
Yes | 22:51 | ||
const_s r3(1), 'blah' | |||
getlex r1(1), 'blah' | |||
getlex_n r2(2), r3(1) | |||
set r3(2), r1(1) | |||
Consider a case like that (yes, pseudo-codey) | |||
We can't turn that into `getlex r3(2), 'blah'` because otherwise we break the read of r3(1) | |||
timotimo | mhm | 22:52 | |
so we'll have to look at users again once we find our target? | |||
jnthn | No, this is just in the blocking check | 22:54 | |
But if we consider set optimization 2 | |||
set r3(1), r2(1) | |||
prepargs | |||
arg_o 1, r2(1) | |||
arg_o 1, r3(1) | |||
oops, imagine arg_o 0 in the first case, but anyway | |||
We can change the last one safely to arg_o 1, r2(1) | |||
Even though there's a read | 22:55 | ||
'cus in this case we're pushing the value down | |||
Not pulling it up | |||
So the answer is that is_conflict_free needs to take an option | |||
Let's try that out :) | 22:59 | ||
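The distinction jnthn draws between pulling a write up and pushing a read down can be sketched as follows. This Python model borrows the name `is_conflict_free` from the chat, but its signature, the `allow_reads` flag, and the tuple encoding are assumptions made for illustration. Registers are `(name, version)` pairs, literal operands are dropped, and only the instructions strictly between the two positions are examined.

```python
def is_conflict_free(instrs, start, end, reg, allow_reads):
    """Check instructions strictly between start and end for conflicts on reg."""
    for op, dest, srcs in instrs[start + 1:end]:
        if dest is not None and dest[0] == reg:
            return False  # a new version of the register is written in between
        if not allow_reads and any(s[0] == reg for s in srcs):
            return False  # an older version is still read in between
    return True


# First example from the chat: pulling the write of r3(2) up to the getlex.
PULL_UP = [
    ("const_s", ("r3", 1), []),
    ("getlex", ("r1", 1), []),
    ("getlex_n", ("r2", 2), [("r3", 1)]),
    ("set", ("r3", 2), [("r1", 1)]),
]

# Second example: pushing the read of r2(1) down into the last arg_o.
PUSH_DOWN = [
    ("set", ("r3", 1), [("r2", 1)]),
    ("prepargs", None, []),
    ("arg_o", None, [("r2", 1)]),
    ("arg_o", None, [("r3", 1)]),
]
```

For `PULL_UP` the check on `r3` between positions 1 and 3 has to treat reads as conflicts, since `getlex_n` still reads r3(1). For `PUSH_DOWN` the check on `r2` between positions 0 and 3 can ignore reads, since only a new write to `r2` would change what the last `arg_o` sees.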
timotimo | looking forward to seeing the result | ||
23:04
Kaiepi left
23:05
Kaiepi joined
jnthn | Yes, that seems to hold up | 23:11 | |
Geth | MoarVM/du-chains-and-opts: 454313773c | (Jonathan Worthington)++ | src/spesh/optimize.c Liberalize set elimination somewhat Sometimes we've no interesting control flow to worry about within the graph: even if there may be calls to things outside the graph, those can never touch our locals. Thus a linear sequence of instructions is enough to look for to see we have no conflicts. Further, due to SSA form, in the case we're moving a read forwards we need only look for ... (7 more lines) |
23:19 | |
timotimo | there's an Also there that probably doesn't belong? | 23:23 | |
OK, there's two things that make my int @a = ^50_000_000 slower than it has to be | 23:34 | ||
first: sp_getlex_ins doesn't have an expr template, so the whole thing gets jitted by the lego jit (though sometimes the exprjit makes things slower by itself) | 23:35 | ||
jnthn | huh, did I forget to push my Rakudo change to adopt the box/unbox over p6box... | ||
Darn, apparently not :( | |||
timotimo | second: the push_i doesn't get devirtualized because OSRing didn't let us figure out the type of the array being put in | ||
is there anything in particular that OSR could be missing? does it look at the types of arguments? | 23:36 | ||
it doesn't have facts for the result of sp_getarg_o, that makes it harder for this to be good | 23:39 | ||
when i warmup that STORE method by storing ^100 to an int array 100 times it goes down from 2.82s to 0.96s | 23:40 | ||
ooh, it also does the thing where it keeps logging osr hits | 23:41 | ||
so the spesh thread was partially busy the whole time, too | |||
jnthn | Despite the commit I need to try my original example still setting on my computer in the office, manually fashioning one using box_i does seem to do the right thing | 23:47 | |
timotimo | sitting, you mean? | 23:48 | |
jnthn | yes :) | ||
timotimo | that allows the sentence to parse %) | ||
Geth | MoarVM/du-chains-and-opts: 3727d18b97 | (Jonathan Worthington)++ | src/spesh/optimize.c Implement box/unbox elimination This happens in the second phase, after inlinings, thus giving us more potential pairings to work with. When we see a box, we look at its usages and find matching unbox/decont instructions. We also look down `set` chains in case the unbox/decont is a `set` away. If we see a pair, and the original unboxed value is still available, we rewrite the unbox to a `set` of the original value, removing a usage of the boxed value. |
23:56 | |
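A rough Python model of the pairing described in that commit message. It is a sketch under simplifying assumptions: instructions are `(op, dest, sources)` tuples in straight-line SSA form, `box_i` is shown with only its value operand, the unboxed value is assumed to be still available, and `eliminate_box_unbox` is a hypothetical name. As in the commit, the box itself is left in place.

```python
def eliminate_box_unbox(instrs):
    """Rewrite unbox_i of a freshly boxed value into a set of the original value."""
    out = [(op, dest, list(srcs)) for op, dest, srcs in instrs]
    for op, boxed, srcs in list(out):
        if op != "box_i":
            continue
        unboxed = srcs[0]
        # Follow set chains: every register that is just a copy of the box.
        aliases = {boxed}
        grew = True
        while grew:
            grew = False
            for o, d, s in out:
                if o == "set" and s[0] in aliases and d not in aliases:
                    aliases.add(d)
                    grew = True
        # Any unbox of the box (or of a copy) can read the original value.
        for k, (o, d, s) in enumerate(out):
            if o == "unbox_i" and s[0] in aliases:
                out[k] = ("set", d, [unboxed])
    return out
```

After the rewrite the box may have no remaining users, which is what makes a later cleanup pass worthwhile.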
jnthn | Still some things to take care of to make that actually useful. | ||
We don't delete the original box instruction | |||
Also in my Perl 6 example, a useless guard (because we can prove it ain't needed) would keep it alive anyway | 23:57 | ||
timotimo | but we do remove the user? | ||
jnthn | Yes | ||
/* Make unbox instruction no longer use the boxed value. */ | |||
MVM_spesh_usages_delete_by_reg(tc, g, unbox_ins->operands[1], unbox_ins); | |||
But we don't do another pass of DIE after this | |||
timotimo | oh wow, that's the acronym for that, eh? | 23:58 | |
that's cool | |||
anything speak against putting another round of DIE in there? | |||
jnthn | Not really. Then I can probably toss the `set` deletion and just let it be swept up | 23:59 | |
I suspect when I put in the stuff to eliminate native ref / deref pairs it's also going to produce dead instructions | |||
Probably easiest to just let them get swallowed at the end. | |||
Rather than try and delete them manually everywhere | |||
timotimo | agreed |
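The "let them get swallowed at the end" approach can be pictured with a small Python sketch, assuming DIE here stands for dead instruction elimination. The `PURE_OPS` set and the `eliminate_dead` function are hypothetical and stand in for whatever the real pass treats as side-effect free. Instructions are `(op, dest, sources)` tuples, and the pass repeats until nothing more can be dropped, since removing one instruction can leave its inputs unused.

```python
PURE_OPS = {"set", "box_i", "unbox_i", "const_i"}  # assumed to have no side effects


def eliminate_dead(instrs):
    """Drop pure instructions whose result is never read, until a fixed point."""
    instrs = list(instrs)
    changed = True
    while changed:
        used = {s for _, _, srcs in instrs for s in srcs}
        kept = [ins for ins in instrs
                if not (ins[0] in PURE_OPS and ins[1] not in used)]
        changed = len(kept) != len(instrs)
        instrs = kept
    return instrs
```

Run after a box/unbox rewrite, this removes the leftover `set` copy first and then the `box_i` that nothing reads any more.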