timotimo | i mean this surely got discussed a whole bunch of times before but i totally forgot :| | 00:01 | |
MasterDuke | no idea | 00:02 | |
i think Zoffix has done a bunch with sinking | 00:03 | ||
timotimo | it'd surely be easy to search for, but i have to get to bed | ||
MasterDuke | Zoffix: last statements in for loops don't get sunk. that's intentional, correct? | ||
later... | 00:08 | ||
TimToady | if the entire for loop is sunk, the last statement should also be sunk | 00:14 | |
yoleaux | 16 Apr 2018 10:56Z <Zoffix> TimToady: Maybe you have some interesting things to add to Perl 6 Museum: rakudo.party/post/WANTED-Perl-6-Hi...ical-Items | ||
00:16 Zoffix joined
Zoffix | .tell MasterDuke it's meant to be sunk if the for loop is sunk. It's basically like a map(). The current system is all over the shop tho and needs some rework R#1704 R#1571 | 00:18 | |
yoleaux | Zoffix: I'll pass your message to MasterDuke. | ||
synopsebot | R#1704 [open]: github.com/rakudo/rakudo/issues/1704 Body of a loop statement not getting sunk (and/or not warning about `Useless use`) in many cases | ||
R#1571 [open]: github.com/rakudo/rakudo/issues/1571 Flaws in implied sinkage / `&unwanted` helper | |||
MasterDuke | .tell timotimo some info re sinking and loops from TimToady and Zoffix irclog.perlgeek.de/moarvm/2018-04-18#i_16059397 | 01:02 | |
yoleaux | 00:18Z <Zoffix> MasterDuke: it's meant to be sunk if the for loop is sunk. It's basically like a map(). The current system is all over the shop tho and needs some rework R#1704 R#1571 | ||
synopsebot | R#1704 [open]: github.com/rakudo/rakudo/issues/1704 Body of a loop statement not getting sunk (and/or not warning about `Useless use`) in many cases | ||
yoleaux | MasterDuke: I'll pass your message to timotimo. | ||
synopsebot | R#1571 [open]: github.com/rakudo/rakudo/issues/1571 Flaws in implied sinkage / `&unwanted` helper | ||
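A minimal sketch of the sink semantics TimToady and Zoffix describe (whether Rakudo actually emits a "Useless use" warning here is exactly what R#1704 is about, so the comments below are assumptions rather than confirmed output):

    # Whole loop in sink context: its value is discarded, so the body's last
    # statement should be sunk as well, just like a map() in void context.
    for 1..3 { $_ * 2 }

    # Loop in value context: the body's last statement is wanted, because its
    # results become the loop's value.
    my @doubled = do for 1..3 { $_ * 2 };   # 2, 4, 6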
01:55 Zoffix left
01:57 ilbot3 joined
06:31 domidumont joined
nwc10 | good *, #moarvm | 06:34 | |
06:39 domidumont joined
07:30 domidumont joined
07:41 ggoebel joined
08:16 zakharyas joined
09:45 dogbert2_ joined
09:55 AlexDaniel joined
11:24 brrt joined
brrt | \o | 11:24 | |
timotimo | o/ brrt | 11:26 | |
how do you feel about adding types for spesh slot address and value to the lego jit's "emit c code" functionality? | |||
brrt | do we ever pass spesh slot addresses to c functions? | 11:30 | |
i think it's fine | 11:31 | ||
if that solves a practical problem for you :-) | 11:32 | ||
there is an alternative solution, of course | |||
which is, let the legacy jit explicitly build-and-call expression jit fragments | |||
for c functions | |||
even when not in expression jit mode | |||
timotimo | i could have implemented getwvalfrom and getstringfrom with the lego jit if we had those :) | 11:34 | |
brrt | i'm guessing you're implicitly asking me to add them :-) | 11:36 | |
timotimo | well, they're only in the exprjit right now; they might cause bails in some places due to missing from the lego jit | ||
brrt | yeah, i know | 11:37 | |
ok, so i tested, and what do you know, dynasm can actually use the xmm* registers with dynamic indexing | 11:45 | ||
ain't that awesome | 11:46 | ||
11:48 domidumont joined
brrt | but the upper 8 registers will need some care to be addressed | 11:49 | |
timotimo | oh, you got something that you want to use them with? | 11:55 | |
vectorizing our code will be quite the challenge, won't it? | |||
i mean the code our jit outputs | 11:56 | ||
i'm not sure what kinds of vectorized instructions we can find that'll apply | |||
12:10 AlexDani` joined
12:52 zakharyas joined
brrt | I use the sse registers for floating point calculations | 12:58 | |
which is specified by amd64 | 12:59 | ||
timotimo | right | 13:04 | |
so with that we'd just have more registers free to shuffle stuff around | 13:05 | ||
jnthn | The obvious source of things to vectorize is hyper-ops on native arrays | 13:06 | |
@a >>+<< @b where both are array[int] | |||
timotimo | yeah | 13:08 | |
but we'll have to do some mean inlining and analysis to get that right, or rakudo itself would have to supply all we need to know down to moar | |||
at the moment, we're not even using fastinvoke to run HYPER(&[+], $one, $two) | 13:13 | ||
i mean, sure, we'd probably want to specialcase hyper for native arrays anyway | 13:14 | ||
so we don't have to check dwim-compatibility on both iterators every step of the way | |||
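A sketch of the vectorization candidate jnthn points at, with made-up array contents; an elementwise hyper-op over two native int arrays is the kind of loop that maps naturally onto SIMD registers:

    my int @a = 1..1024;
    my int @b = 1024...1;
    my int @c = @a >>+<< @b;      # elementwise add; every element is 1025
    say @c[0, 512, 1023];         # (1025 1025 1025)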
m: my int @foo[3;2]; my int @bar[2;3]; say @foo >>+<< @bar | 13:29 | ||
camelia | Resource temporarily unavailable | ||
timotimo | should hyperops on shaped arrays 1) require the shapes to be compatible, and 2) create another shaped array as the result? or do we just ask the user to store it back into a shaped array and use the "make shaped array from long list of items" semantics? | 13:30 | |
ah, can't do that, it requires structured data for assignment | 13:31 | ||
lizmat | yeah | 13:35 | |
brrt | can i offer a counterpoint | 13:40 | |
autovectorizing by the compiler is a notoriously hard problem | ||
timotimo | OK, so we signal to the VM | 13:41 | |
brrt | if rakudo can detect that we are doing something vectorizable, why not implement that as a moar opcode, and let the C compiler take over | ||
lizmat | that would mean all backends would need to support that opcode ? | 13:42 | |
or we'd have to conditionalize that at the Perl 6 / nqp level :-( | 13:43 | ||
timotimo | we'll have a million trillion moar opcodes :P | 14:09 | |
14:30 committable6 joined
14:36 bisectable6 joined
15:51 quotable6 joined
15:55 brrt joined
brrt | lizmat: yes, but it'd be perfectly valid for an implementation to be just a for loop. or indeed, conditional to backend, or maybe in the QAST->MAST phase | 15:57 | |
i'm not afraid of a good old special case :-) | |||
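A hedged sketch, in Perl 6 rather than the C a real MoarVM op would be written in, of the "just a for loop" fallback brrt describes for a hypothetical elementwise-add primitive:

    # Hypothetical scalar fallback: semantically nothing more than a plain
    # loop, which a backend without SIMD support could use as-is.
    sub add-int-arrays(@a, @b) {
        my int @result;
        @result[$_] = @a[$_] + @b[$_] for ^@a.elems;
        @result;
    }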
nine | Would (global) deoptimization on hitting an nqp::backtrace op be a terribly bad idea? That way we would get a properly uninlined backtrace. Would anyone create a backtrace in performance sensitive code? | 16:06 | |
jnthn | Failure | 16:13 | |
Well, or at least, I guess it stashes info so it can create a backtrace later | |||
Though I guess it actually has to snapshot it as the program state changes | 16:14 | ||
nine | So probably not worth taking the hit. Anyway, duplicating what deopt_all does when creating backtraces shouldn't be that hard... | 16:22 | |
jnthn | No, probably not, it just needs the patience to do it :) | 16:23 | |
16:29 evalable6 joined
jnthn | m: say log(149, 2) | 16:48 | |
camelia | 7.219168520462162 | ||
japhb | OK, now I'm curious. Why that particular math question? | 17:28 | |
17:30 robertle joined
17:35 domidumont joined
jnthn | 149 is the largest codepoint fanout of an NFA state (it's almost certainly the NFA for infix) | 17:36 | |
We currently chug through all of those every time we want an infix operator | 17:37 | ||
log2 of that is how many comparisons a binary search would do if we were to sort those edges once and binary search them each time | |||
nine | 7.2 sounds somewhat better than 74 | 17:38 | |
jnthn | Well, and much better than 149, which is significant because every time we are in the OPP we try to find another infix operator | |
And fail when there ain't one | 17:39 | ||
Meaning that the case that we chug through without a match is also really common in that particular case | |||
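Back-of-envelope arithmetic for the numbers above (nine's 74 is presumably the average cost of the linear scan, roughly 149/2):

    say log(149, 2);              # 7.219168520462162
    say ceiling(log(149, 2));     # 8: worst-case comparisons for a binary
                                  # search over 149 sorted edges, vs. up to
                                  # 149 for the linear scan, which the common
                                  # no-match case always pays in full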
17:40 zakharyas joined
japhb | yeah, that seems like a quite significant drain. | 17:44 | |
17:45 zakharyas joined
nine | Odd.... I can't reproduce the backtrace thing in a reduced test. But it fails reliably as part of the test file. Apparently I need the use lib $?FILE.IO.parent(2).add("packages"); use Test; | 17:53 | |
nine is boarding now | 18:09 | ||
timotimo | good flight | ||
18:12 FROGGS joined
Geth | MoarVM/nfa-codepoint-fanout-opt: 510b578754 | (Jonathan Worthington)++ | 2 files: "For NFAs with many codepoints, use binary search". So far, we've chugged through all of the codepoints, one by one. This is suboptimal, since the largest ones generated by the Perl 6 grammar can have up to ~150 possibilities, which is a lot to linearly search through. Instead, sort them once and then do a binary search. This shaves 1% off the total instructions spent parsing a sizable Perl 6 program, thanks to nqp_nfa_run accounting for only 87% of the instructions that it did prior to this change. | 18:27 |
jnthn | dinner & | 18:28 | |
japhb | Nice targeted work. | ||
[Coke] | jnthn++ | 18:33 | |
timotimo | stepping through the program seems to imply it's not actually osr-ing the MYHYPER frame?! | 19:50 | |
jnthn | timotimo: What do the spesh stats say for it? | 20:21 | |
timotimo | it receives boatloads of spesh logs and the osr hits go up more and more | 20:22 | |
jnthn | And then it makes a specialized version, or? | ||
timotimo | i just found it a bit implausible that it'd be this fast without even entering speshed code | ||
yes, the specialized version gets built very fast | |||
jnthn | Do we deopt from it? | ||
timotimo | the profiler sees only a single deopt and that's from BUILDALL | ||
jnthn | There's a #define in deopt.c where it'll log them; could check that to see if it matches | 20:23 | |
timotimo | will do | ||
no, all deopts happen before the "say now" that's in front of the workload | 20:24 | ||
ix.io/188M is my current code, you can try it yourself. be sure to run it with commandline argument "optimized" | 20:25 | ||
all i can imagine is it's not succeeding in the guard tree somehow | |||
dinner time! \o/ | |||
20:46 AlexDaniel joined
21:01 Kaiepi joined
21:14 ggoebel joined
timotimo | let's see if i can't figure this out. | 21:20 | |
22:37 releasable6 joined
timotimo | ok, so ... it kind of almost seems like the callsite that was set during our MVM_spesh_log_entry differs from the one we see in osr-ing, and that causes us to not ever find a matching spesh candidate | 22:58 | |
sadly, rr has some problems with this code in particular ... | 23:05 |