Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021. |
|||
00:03
reportable6 left
00:04
reportable6 joined
00:43
frost joined
|
|||
Geth | MoarVM/new-disp: 49624aeabd | (Timo Paulssen)++ | src/jit/graph.c legojit capturepos* ops and capture*named* |
01:25 | |
timo | i forgot if these go away or something but right now they were high in the bailer leaderboards | 01:26 | |
isfalse_s is also very strong | |||
02:10
linkable6 left,
evalable6 left
02:12
evalable6 joined
03:12
linkable6 joined
05:38
unicodable6 left,
reportable6 left,
coverable6 left,
nativecallable6 left,
tellable6 left,
bloatable6 left,
evalable6 left
05:39
unicodable6 joined
05:40
coverable6 joined,
reportable6 joined
05:41
evalable6 joined,
tellable6 joined
06:02
reportable6 left
06:04
reportable6 joined
|
|||
Nicholas | good *, reportable6 | 06:04 | |
07:41
nativecallable6 joined
08:04
tealecloud joined
|
|||
Nicholas | jnthnwrthngtn: coffee! | 09:37 | |
09:39
bloatable6 joined
|
|||
dogbert11 | perhaps he's run out of coffee | 10:10 | |
Nicholas | oh noes. I hope not. | 10:14 | |
10:19
tealecloud left
|
|||
dogbert11 | that would indeed be a disaster | 10:20 | |
m: my $str = "3.0/2"; my $num = +$str; | 10:23 | ||
camelia | ( no output ) | ||
10:24
tealecloud joined
|
|||
dogbert11 | m: my $str = "3.0/2"; my $num = +$str; say $num | 10:24 | |
camelia | 1.5 | ||
dogbert11 | looks innocent enough | ||
try running it under new-disp with MVM_SPESH_NODELAY=1 | 10:25 | ||
timo | m: say 15.002 / (15.002 + 80.654 + 8.536 + 1.558); | 10:26 | |
camelia | 0.141863 | ||
dogbert11 | hello timo :) | ||
timo | m: say (15.002 + 80.654 + 8.536 + 1.558) * 0.003 | 10:27 | |
camelia | 0.31725 | ||
timo | i'm trying to make my achievement from yesterday seem bigger | ||
the 0.3% of time spent in that one function was all in stage MAST, so i'd like to express it as a fraction of stage mast instead of a fraction of the whole core setting compilation | 10:28 | ||
m: say 0.31725 * 100 / 15.002 | 10:29 | ||
camelia | 2.114718 | ||
timo | i think this is the percentage of stage mast that the function was taking before i made it basically go away? | ||
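A minimal sketch of the arithmetic timo runs through camelia above, redone in Python. It assumes the four numbers are per-stage timings in seconds and that 15.002 is stage MAST (that is the value timo divides by); the variable names are illustrative and not taken from any profiler output.

```python
# Stage timings in seconds, copied from the chat. Treating the first entry
# as stage MAST is an assumption based on timo dividing by 15.002.
stage_seconds = [15.002, 80.654, 8.536, 1.558]
mast_seconds = stage_seconds[0]

# The function took 0.3% of the whole core setting compilation.
total_seconds = sum(stage_seconds)
function_seconds = total_seconds * 0.003

# Re-express that time as a percentage of stage MAST alone.
share_of_mast_percent = 100 * function_seconds / mast_seconds

print(f"whole compilation: {total_seconds:.3f} s")
print(f"time in function:  {function_seconds:.5f} s")
print(f"share of MAST:     {share_of_mast_percent:.3f} %")
```

Under those assumptions the function accounted for roughly 2.1% of stage MAST, which matches the camelia result.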
timo crawls out of bed, shambles across the hallway moaning "prrraaaaaiiisseeeeee ... prrraaaaaiiisseeeeeee ..." | 10:30 | ||
10:30
squashable6 left
10:32
squashable6 joined
|
|||
jnthnwrthngtn | Hurrah, coffee | 10:37 | |
Nicholas | \o/ | ||
jnthnwrthngtn | I do appear to have slept the whole morning...darn cold. | 10:38 | |
Hm, do I see speedups in the backlog... :) | |||
10:39
sena_kun joined
|
|||
Nicholas | you seem to be suggesting that "after the morning" exists - this cold seems to be causing you to hallucinate :-( | 10:39 | |
jnthnwrthngtn | timo: Yes, the MAST section of the features page can go. And the roadmap could sure use an update... | 10:40 | |
sena_kun does bumps | |||
tellable6 | 2021-09-08T17:27:35Z #moarvm <Kaiepi> sena_kun merged the Data::Record, Kind::Subset::Parameteric, Trait::Traced prs | ||
dogbert11 | coffee FTW | 10:43 | |
jnthnwrthngtn | timo: I suspect fetching the ops we use often into &func is probably going to be reasonably good compared to the lookups every time; not sure there'll be much more to win with any custom dispatcher magic | ||
timo: The JITting you added is certainly worthwhile; those are used in implementing dispatchers quite a lot | 10:45 | ||
timo | there's now an nqp pull request for the lookup movement | 10:50 | |
there's still some in there that could go into the global scope like null and set and such | |||
but i have the feeling that we've got a list_b or two with a whole lot of entries | |||
unless we're actually creating a list now and serializing it to get it out with a wval | 10:54 | ||
oops the PR doesn't compile lmao | 10:55 | ||
10:57
MasterDuke_ joined
|
|||
sena_kun | hmm, I wanted to do a run with zef this time to get more info, but zef --config-path=data/zef-config.json update took 7 minutes and is still not finished. :/ | 10:57 | |
MasterDuke_ | timo: did I see you say that only the js backend uses directives? | ||
timo | i did say that! please tell me if i'm wrong? | 10:58 | |
MasterDuke_ | IIRC, that's how I implemented getting the right file name and line number into the rakudo backtraces | ||
timo | interesting. i did not see any place that actually sets that dynamic variable | 10:59 | |
outside of vm/js/ | |||
jnthnwrthngtn | timo: The approach looks sensible to me | ||
Though yeah, please make sure it doesn't break anything :D | 11:00 | ||
MasterDuke_ | Hm, maybe the implementation has changed since? | ||
Oh, what about in rakudo? | |||
timo | when did you do that? i seem to recall it was only a month or two ago? | ||
i believe i looked there as well | |||
MasterDuke_ | Implement directives? That was ~2016 | 11:01 | |
timo | oh, haha | ||
MasterDuke_ | Probably the first largish thing I did | 11:02 | |
timo | '/home/timo/perl6/install/bin/nqp-m' --module-path=blib --ll-exception --target=mbc --output=blib/Perl6/Actions.moarvm gen/moar/Actions.nqp | ||
list_b op with 384 blocks | |||
MasterDuke_ | But feel free to optimize it, I think it's been untouched since then | ||
timo | when something is executed once instead of 385 times, it's *gotta* have an impact! | ||
MasterDuke_ | So if the dynamic variable can be removed that would be great | 11:03 | |
timo | the metamodel has 412 blocks, can't wait to see what the core setting has to offer | 11:06 | |
17.3k blocks :D | 11:08 | ||
MasterDuke_ | 17.3k blocks oughta be enough for everybody | 11:09 | |
timo | i'm doing a little timing for before/after moving the lookup out of the loop for list_b | 11:14 | |
m: say my $b = 1631272404295393032 - 1631272404295036551; say my $a = 1631272616735221172 - 1631272616704178601; say 100 * $b / $a | 11:18 | ||
camelia | 356481 31042571 1.148361713 |
||
timo | m: say 1631272739972972185 R- 1631272741032432729 | 11:19 | |
camelia | 1059460544 | ||
timo | m: say 31042571 / 1059460544 | ||
camelia | 0.0293003559 | ||
timo | haha, i improved something that takes a little less than a thirtieth of a second in the core setting stage mast | ||
but i made it almost 100x faster | 11:20 | ||
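A rough Python sketch of the before/after comparison for the list_b timing above. It assumes the raw values are nanosecond timestamps, that the larger interval is the old code and the smaller one the new code, and that the third pair of timestamps brackets all of stage MAST; none of that is stated explicitly in the chat.

```python
# Start/end timestamps copied from the chat, assumed to be nanoseconds.
fast_ns = 1631272404295393032 - 1631272404295036551   # lookup moved out of the loop
slow_ns = 1631272616735221172 - 1631272616704178601   # lookup inside the loop
mast_ns = 1631272741032432729 - 1631272739972972185   # assumed: whole stage MAST

# Python integers are arbitrary precision, so the subtractions are exact.
speedup = slow_ns / fast_ns
share_of_mast = slow_ns / mast_ns

print(f"old: {slow_ns / 1e6:.3f} ms, new: {fast_ns / 1e6:.3f} ms")
print(f"speedup: {speedup:.1f}x")
print(f"old cost as a fraction of stage MAST: {share_of_mast:.4f}")
```

With these inputs the speedup comes out at about 87x, which is the "almost 100x" from the chat, on a piece of work that was about 3% of stage MAST.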
MasterDuke_ | The accumulation of marginal gains can yield noticeable results | 11:21 | |
timo | the other improved ops are much much harder to measure since they are called a whole lot more often and then i'll be reaching "too much overhead for measuring" territory | 11:22 | |
jnthnwrthngtn | I think anything that doesn't harm code quality (and this arguably improves it anyway) and gives a measurable, even if tiny, win is worth it; they add up. | 11:23 | |
Especially when added up over all people using Raku. :) | 11:25 | ||
timo | m: say 7926346 / 1059460544 | 11:26 | |
camelia | 0.0074814924 | ||
timo | 0.0075 seconds off of the install_core_modules task :) | ||
which took 20.58 seconds in total | 11:27 | ||
+++ Cleaning up MOAR | 11:33 | ||
rm -f -- | |||
[...] | |||
inst-perl6-debug-m.o .o .o .o .o .o dynext/libperl6_ops_moar.so | 11:34 | ||
someone *really* wants .o gone | |||
i'm running a spec test to be extra sure | 11:40 | ||
11:42
MasterDuke_ left
12:02
reportable6 left
12:04
reportable6 joined
|
|||
timo | it passed, but i forgot about the spec test run until now :D | 12:15 | |
sena_kun | no new findings in Blin so far. the modules we prepared PRs for should be bumped for the fixes to be visible, but other than that all the hard things left. | 12:42 | |
I'm running one with zef now. | |||
4 modules, 1 needs a PR and 3 others are hard, from less deps to more deps: hide-methods, DateTime::Timezones, HTML::Canvas. | 12:43 | ||
maybe running with zef will uncover something too | 12:44 | ||
12:44
frost left
|
|||
lizmat | sena_kun: any idea what's wrong with hide-methods ? | 12:45 | |
Nicholas | to check, if I understand "4 modules, 1 needs a PR and 3 others are hard" - that means just 3 modules left to investigate, that are broken by new-disp but work on master? | ||
sena_kun | lizmat, that's a question for jnthnwrthngtn I'd say. | ||
lizmat | what is the error mode? | ||
sena_kun | Nicholas, well, AFAIK they are "investigated", it's just that to create a fix is hard. | 12:46 | |
lizmat, # Failed test 'could B.bar not be found now' | |||
lizmat | hmmm... it does use ^find_method | 12:48 | |
sena_kun | lizmat, it's a diff between master and new-disp, so it is a matter of new-disp to be resolved I believe. | 12:49 | |
lizmat | ok, then I won't touch hide-methods until jnthnwrthngtn tells me I should fix something :-) | 12:50 | |
i guess this is the real troublezone: | 12:52 | ||
# cannot use nextcallee because that would refer | |||
# to the original method that got wrapped. | |||
if $class.^mro[1].can($name).head -> &nextone { | |||
nextone(SELF, |c) | |||
} | |||
timo | i wonder what makes the optimizer's visit_children so expensive. is it just that it's called so often, or is it something like "the istype if/else cascade is not optimal for the typical distribution of types" or something | 12:55 | |
12:56
tealecloud left
12:57
tealecloud joined
|
|||
timo | 13498 Stmts 1st | 13:06 | |
17642 Block | |||
18600 typecheck | |||
37405 Want | |||
43675 Xval or vm | |||
56221 Stmts 2nd | |||
113632 wval | |||
308335 Op | |||
323898 Var | |||
the order they are in is op, want, var, block, stmts 1st, stmts 2nd, regex, wval, typecheck, xval or vm | 13:07 | ||
13:08
brrt joined
|
|||
timo | so want is a whole lot earlier in the checks than it tends to appear (in the core setting), op and var could be switched but it's not a big difference, wval or vm wants a lot earlier | 13:09 | |
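One way to put a number on the branch-order question is to count how many type checks the cascade performs in total. The sketch below is a simplified model, not the Perl6::Optimizer code: it assumes every node leaves the cascade at its first matching branch, that each check costs the same, and that regex nodes (for which no count was posted) occur zero times.

```python
# Node counts from timo's listing; "regex" has no posted count, so 0 is assumed.
counts = {
    "op": 308335, "want": 37405, "var": 323898, "block": 17642,
    "stmts 1st": 13498, "stmts 2nd": 56221, "regex": 0,
    "wval": 113632, "typecheck": 18600, "xval or vm": 43675,
}

# Order in which the cascade currently tests the types, as given in the chat.
current_order = ["op", "want", "var", "block", "stmts 1st", "stmts 2nd",
                 "regex", "wval", "typecheck", "xval or vm"]

def total_checks(order):
    """Checks performed if each node exits at its first matching branch."""
    return sum(counts[name] * position
               for position, name in enumerate(order, start=1))

# Testing the most frequent types first minimises the total in this model.
by_frequency = sorted(counts, key=counts.get, reverse=True)

nodes = sum(counts.values())
current = total_checks(current_order)
best = total_checks(by_frequency)
print(f"current order:   {current / nodes:.2f} checks per node")
print(f"frequency order: {best / nodes:.2f} checks per node")
```

The model only counts comparisons, so it says nothing about whether the saving is visible next to the rest of the work done per node.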
brrt | good * #moarvm | 13:14 | |
dogbert11 | hello brrt | 13:16 | |
jnthnwrthngtn | timo: I'd not invest too much time in Perl6::Optimizer as I'm not sure it'll survive in its current form beyond rakuast | ||
timo | ah i guess that's a good point :) | ||
jnthnwrthngtn | Of course, easy wins like reordering those branches by usage is fine | 13:17 | |
o/ brrt | |||
sena_kun: Wow, so we're really down to just 4 modules broken? :D | |||
timo | the most expensive code frame showing up in my perf reports is MATCH followed by _ws, then termish, then !alt | ||
not sure i can actually get any performance out of them as easily | |||
jnthnwrthngtn | (Well, aside from those using nqp::ops where we've decided it's SEP...) | 13:18 | |
timo | SEP? | ||
jnthnwrthngtn | Somebody Else's Problem | ||
timo | oh, that was about modules, right | ||
jnthnwrthngtn | yes :) | 13:19 | |
timo | i should also mention of course that MVM_interp_run has like 7.82% so that's frames that aren't jitted | ||
so those just don't show up in the measurements as individual entries | |||
sena_kun | jnthnwrthngtn, well, tbh I am not fond of pakku output with some modules, I am trying zef but the update seems to be still running after... 36 minutes and so I cannot run it. So it's still a possibility there are more of them, of course. But maybe not by a lot. | ||
brrt | timo: does that imply that we JIT 92% of runtime? | 13:20 | |
(or that 92% of the runtime, is JITted code?) | |||
or am I reading that number entirely wrong | |||
jnthnwrthngtn | brrt: I have an op sp_resumption which at runtime is just meant to do the same as `null` (e.g. null out one register), but it has a bunch of other operands...in the lego JIT it was easy to just make it JIT like `null`, when I tried to copy the `null` template in the EXPR JIT I got an error from the template compiler. | 13:21 | |
timo | well, that's not inclusive | ||
jnthnwrthngtn | brrt: Dunno if you've time/interest to have a poke at it; I'm guessing it's something small/silly | 13:22 | |
brrt | I have time and interest :-) | ||
timo | it literally just means time spent directly in interp_run. the next functions are frame_dispatch, disp_program_run, nqp_nfa_run, fixed_size_alloc, spesh_arg_guard_run, allocate_frame | ||
brrt | ah, okay | ||
and any other downstream functions, too, I presume | 13:23 | ||
jnthnwrthngtn | brrt: OK, then new-disp branches in moar/nqp/rakudo, copy-paste the template for null and change the name to `sp_resumption`, compile and then run probably anything in Rakudo | ||
iirc it was an oops | |||
brrt | ah | ||
only 220 commits away from master... | 13:24 | ||
oh, I was far behind with that branch | 13:30 | ||
hey, isn't that an implicit goto though? | |||
no, it has varargs registers | 13:31 | ||
timo | looks like reordering did not make a measurable difference so *shrug* | 13:36 | |
jnthnwrthngtn | brrt: Varargs regs, but they don't matter at all here | 13:47 | |
brrt | I think it is this: 'MoarVM oops in spesh thread: Can't add constant for operand type 18' | ||
jnthnwrthngtn | Looks familiar | ||
brrt | I ... think I know that code :-) | 13:56 | |
that would be MVM_reg_uint16 probably | 13:57 | ||
jnthnwrthngtn | Ah, just something the template compiler doesn't know about? | 14:00 | |
timo | brrt: do you isfalse_s or do i? | 14:08 | |
do we get a little node that negates an int register perhaps? | 14:10 | ||
maybe i'll introduce MVM_JIT_RV_INT_NEGATED | 14:16 | ||
14:27
MasterDuke_ joined
|
|||
MasterDuke_ | timo: I was doing some experimenting with jitting isfalse_s recently, by changing the c function to take a flag whether or not to negate the result (some other similar function in the same file does the same thing), but my rough testing showed it to be slower | 14:29 | |
But istr it was difficult to find good test raku code | 14:30 | ||
timo | well, i made the return value mode now that negates the value | ||
MasterDuke_ | Might want to search the logs, I may have given some more details then | ||
timo | that shouldn't make it slower in general | 14:31 | |
after jitting isfalse_s did anything commonly replace the bail with something else? | |||
14:32
MasterDuke_ left
14:33
MasterDuke_ joined
|
|||
MasterDuke_ | Yeah, that'd probably be better. Don't remember why I didn't try implementing that | 14:33 | |
Might be able to simplify whatever that other op was too | 14:34 | ||
timo | twitter.com/davidrevoy/status/1436...9768081412 - cool we have a nice illustration of it now | 14:36 | |
oh, did you have other ops where the negation would be helpful? | 14:37 | ||
also it's impressive that we negate an integer by issuing test + setnz + movzx + final mov to the target register | 14:38 | ||
MasterDuke_ | MVM_coerce_istrue has a "flip" parameter that you might be able to remove | 14:39 | |
timo | that is kind-of API, though | 14:43 | |
MasterDuke_ | How do you mean? The nqp op doesn't have that parameter | 14:47 | |
Ha, look at src/spesh/optimize.c:1221 that can be simplified (using the existing flip argument), fine as is if the arg gets removed | 14:49 | ||
timo | MVM_blahblah that are exported can be considered API that users of moarvm could use | 14:50 | |
i'm on new-disp, the lines don't line up | 14:51 | ||
MasterDuke_ | Oh it's gone there, I guess part of that whole coerce_* turned into dispatchers | 14:52 | |
Geth | MoarVM/new-disp: 325625c74f | (Bart Wiegmans)++ | 2 files [JIT] Compile MVM_reg_uint16 constants |
14:53 | |
timo | dispatch_n isn't jitted, which is used for nqp-numify | 14:58 | |
sorry, i meant sp_dispatch_n | |||
jnthnwrthngtn | Oh...I thought all the sp_dispatch_* were | 15:21 | |
I suspect that's easy to rectify | |||
brrt++ # that was an easy one | 15:28 | ||
brrt: Bytecode invocation is now a single op (`sp_runbytecode_*`) rather than a sequence of ops, which maybe will open the way to expr JIT of that too | 15:29 | ||
Or at least remove one pain point | |||
15:33
evalable6 left,
linkable6 left
|
|||
brrt | I can have a look at some (later) time | 15:37 | |
jnthnwrthngtn | No hurry at all. :) They are done (by nine++) in the lego jit already | 15:44 | |
brrt | :-) | 15:45 | |
time for dinner! | |||
jnthnwrthngtn | Enjoy | ||
lizmat | brrt: but nine is on holiday for the next 10 days | ||
nudge nudge :-) | 15:46 | ||
brrt | hehe | 15:48 | |
nudge received | |||
15:52
MasterDuke_ left
15:53
brrt left
|
|||
Nicholas | oh, his timing is excellent | 16:03 | |
ASAN is very excited, and the only thing that changed was 1 commit on MoarVM | |||
.tell brrt You've made ASAN very excited! paste.scsys.co.uk/595856 | 16:05 | ||
tellable6 | Nicholas, I'll pass your message to brrt | ||
Nicholas | compiling the core setting | ||
Geth | MoarVM/new-disp: 4bcc5a0e88 | (Jonathan Worthington)++ | src/core/args.c Remove always-false null checks There will always be at least one callstack record below an executing frame record, and if that is a region start, it means there's a whole region of records prior to it. Spotted by mlschroe++. |
16:12 | |
Nicholas | not really in a position to test that. building a non ASAN build to see whether valgrind also gets excited | 16:13 | |
jnthnwrthngtn | grmbl, I tested it before rebasing with brrt's PR and now more core setting build fails with "stack smashing detected" | 16:16 | |
ah, I see what's wrong | 16:18 | ||
Nicholas | you're doing better than I am | ||
jnthnwrthngtn | .oO( Smashing stack, grommit ) |
16:19 | |
Geth | MoarVM/new-disp: c0320b0818 | (Jonathan Worthington)++ | src/jit/expr.c Survive varargs ops in the expression JIT They can have more operands than the maximum for non-vararg ops, which could lead to buffer overruns. |
16:20 | |
Nicholas | if it makes more sense to rebase to fix this cleanly, please do it. I'm not going to come visit and wave a pitchfork in frustration, even though I *think* I probably can actually travel to you with one without hitting security theatre | 16:21 | |
jnthnwrthngtn | Naughty fingers typed "bugger overruns", glad I caught that before pushing. | ||
lizmat | .oO( it is better to be buffered ) |
||
Nicholas | oh, my, varargs. The thing that (if I vaguely remember) the x86_64 ABI says have the argument count in a register. That regular functions do not. | 16:22 | |
At least, IIRC, you can't safely cast one to the other and have it work | |||
jnthnwrthngtn | Nicholas: varargs MoarVM ops, not ABI varargs, thankfully | ||
Nicholas | ah OK. phew | ||
jnthnwrthngtn | They don't imply having to do C-level varargs calls | ||
Mixed feelings on the rebase; the reason that the commit introduced a SEGV was more that it uncovered another shortcoming, rather than being in itself wrong | 16:23 | ||
Nicholas | yes, then I think don't | 16:24 | |
"it should have worked" rather than "whoops, goof" | |||
um, oops, ASAN even more excited | 16:25 | ||
now in NQP. (twice, due to parallel make) | 16:26 | ||
jnthnwrthngtn | Oh, curious. Maybe I didn't understand the expr jit code as well as I thought then... | ||
Curious, my Rakudo build completed fine :) | |||
Nicholas | er, two different backtraces. | 16:27 | |
gen/moar/stage1/nqpmo.moarvm is paste.scsys.co.uk/595858 | 16:28 | ||
jnthnwrthngtn | lol | 16:29 | |
Nicholas | er, no, I'm confused, they are the same i think | 16:30 | |
jnthnwrthngtn | Just before that | ||
/* A HACK. | |||
lol | |||
OK, I can fix this easily | |||
Nicholas | again, "you're doing better than I am" | 16:31 | |
I'm told that it's time to eat. So & | |||
Geth | MoarVM/new-disp: ebfd4d5d6e | (Jonathan Worthington)++ | src/jit/expr.c Ensure at least 2 slots in expr JIT operands array It relies on this when compiling inc_i/dec_i ops. |
16:32 | |
16:33
evalable6 joined
|
|||
jnthnwrthngtn | That should do it | 16:33 | |
Happy nomming | 16:34 | ||
16:34
linkable6 joined
16:57
sena_kun left
17:49
leont left
17:50
leont joined
18:02
reportable6 left
18:04
reportable6 joined
18:13
brrt joined
|
|||
Geth | MoarVM/new-disp: 29b67875bd | (Jonathan Worthington)++ | 3 files Better sp_bindcomplete handling on inline Instead of making it block an inline, we can simply delete it, since if we're inlining then (at least for now) we know it will be a no-op. |
18:24 | |
MoarVM/new-disp: 4f2b0bef8e | (Jonathan Worthington)++ | src/spesh/disp.c Propagate value result facts in dispatch programs This means that dispatchers that resolve to the identity function - such as hllize - will not end up losing the facts known about their input. |
|||
jnthnwrthngtn | Curiously, getting dispatch program translation to not insert guards if we already have facts about types proving the guard isn't needed leads to a SEGV in NQP compilation. | 18:28 | |
Will have to track down why; that should be a trivial opt to do. (And while in theory we could eliminate those guards later, their insertion involves an SSA version split, which is quite expensive.) | 18:30 | ||
Though it seems we aren't always deleting proven guards either, and evidently aren't doing so in these cases. | |||
Time for food/rest | 18:31 | ||
18:33
tealecloud left
19:11
tealecloud joined
19:26
brrt left
19:32
brrt joined
|
|||
Nicholas | brrt: jnthnwrthngtn fixed the thing that .tell will tell you about | 19:43 | |
jnthnwrthngtn: now somewhere in spectest with your most current MoarVM | 19:44 | ||
brrt | o | 19:45 | |
tellable6 | 2021-09-10T16:05:09Z #moarvm <Nicholas> brrt You've made ASAN very excited! paste.scsys.co.uk/595856 | ||
brrt | huh | ||
Nicholas | see scrollback. your change hit a bug | ||
er, log. | 19:46 | ||
brrt | ah, I see it | ||
broken assumptions were broken; MVM_MAX_OPERANDS no longer made sense | 19:47 | ||
my C89 heart is a little disappointed by the variable size array | 19:52 | ||
but I'll accept | |||
timo | sp_bindcomplete also showed up regularly on the bailed jit messages | 20:32 | |
Geth | MoarVM/new-disp: 32e0c279f7 | (Timo Paulssen)++ | 3 files implement negating return values for c functions, isfalse_s |
20:40 | |
jnthnwrthngtn | Nicholas: Does that mean no ASAN excitement? | 20:46 | |
dogbert11 | jnthnwrthngtn: there are some bugs lurking in new-disp | 21:08 | |
e.g. try running: while MVM_SPESH_NODELAY=1 ./rakudo-m -e 'sleep 1; sub f($str) { my $num = +$str; }; f "3.0/2"'; do :; done | 21:09 | ||
21:16
tealecloud left
|
|||
timo | Type check failed in binding to parameter 'nu'; expected Int but got Rat (3.0) | 21:38 | |
this error? | |||
dogbert11 | timo: yes | 21:51 | |
21:51
brrt left
|
|||
timo | ok so we've got the check istype($result, Int) && istype($denom, Int) and that calls Rat.new, otherwise we call infix:</> with the two numbers | 21:59 | |
and it is in fact calling Rat.new here | |||
i can't seem to get the code of interest into the spesh log, let's see ... | 22:01 | ||
ah, haha | 22:03 | ||
i was ctrl-c'ing during the "sleep 1" of the next run | |||
so the code had in fact not happened yet | |||
dogbert11 | I snuck the 'sleep 1' since it helped reveal the problem quicker | 22:06 | |
timo | when i have time i'll have to rr this i'm afraid | 22:34 | |
22:51
evalable6 left,
linkable6 left
22:52
evalable6 joined
22:54
linkable6 joined
23:54
coverable6 left,
tellable6 left,
linkable6 left,
unicodable6 left,
benchable6 left,
releasable6 left,
nativecallable6 left,
sourceable6 left,
bloatable6 left,
statisfiable6 left,
bisectable6 left,
reportable6 left,
evalable6 left
23:55
tellable6 joined,
benchable6 joined
23:57
nativecallable6 joined,
coverable6 joined
|