Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
00:03 reportable6 left 00:04 reportable6 joined 00:43 frost joined
Geth MoarVM/new-disp: 49624aeabd | (Timo Paulssen)++ | src/jit/graph.c
legojit capturepos* ops and capture*named*
timo i forgot if these go away or something but right now they were high in the bailer leaderboards 01:26
isfalse_s is also very strong
02:10 linkable6 left, evalable6 left 02:12 evalable6 joined 03:12 linkable6 joined 05:38 unicodable6 left, reportable6 left, coverable6 left, nativecallable6 left, tellable6 left, bloatable6 left, evalable6 left 05:39 unicodable6 joined 05:40 coverable6 joined, reportable6 joined 05:41 evalable6 joined, tellable6 joined 06:02 reportable6 left 06:04 reportable6 joined
Nicholas good *, reportable6 06:04
07:41 nativecallable6 joined 08:04 tealecloud joined
Nicholas jnthnwrthngtn: coffee! 09:37
09:39 bloatable6 joined
dogbert11 perhaps he's run out of coffee 10:10
Nicholas oh noes. I hope not. 10:14
10:19 tealecloud left
dogbert11 that would indeed be a disaster 10:20
m: my $str = "3.0/2"; my $num = +$str; 10:23
camelia ( no output )
10:24 tealecloud joined
dogbert11 m: my $str = "3.0/2"; my $num = +$str; say $num 10:24
camelia 1.5
dogbert11 looks innocent enough
try running it under new-disp with MVM_SPESH_NODELAY=1 10:25
timo m: say 15.002 / (15.002 + 80.654 + 8.536 + 1.558); 10:26
camelia 0.141863
dogbert11 hello timo :)
timo m: say (15.002 + 80.654 + 8.536 + 1.558) * 0.003 10:27
camelia 0.31725
timo i'm trying to make my achievement from yesterday seem bigger
the 0.3% of time spent in that one function was all in stage MAST, so i'd like to express it as a fraction of stage mast instead of a fraction of the whole core setting compilation 10:28
m: say 0.31725 * 100 / 15.002 10:29
camelia 2.114718
timo i think this is the percentage of stage mast that the function was taking before i made it basically go away?
timo crawls out of bed, shambles across the hallway moaning "prrraaaaaiiisseeeeee ... prrraaaaaiiisseeeeeee ..." 10:30
10:30 squashable6 left 10:32 squashable6 joined
jnthnwrthngtn Hurrah, coffee 10:37
Nicholas \o/
jnthnwrthngtn I do appear to have slept the whole morning...darn cold. 10:38
Hm, do I see speedups in the backlog... :)
10:39 sena_kun joined
Nicholas you seem to be suggesting that "after the morning" exists - this cold seems to be causing you to hallucinate :-( 10:39
jnthnwrthngtn timo: Yes, the MAST section of the features page can go. And the roadmap could sure use an update... 10:40
sena_kun does bumps
tellable6 2021-09-08T17:27:35Z #moarvm <Kaiepi> sena_kun merged the Data::Record, Kind::Subset::Parameteric, Trait::Traced prs
dogbert11 coffee FTW 10:43
jnthnwrthngtn timo: I suspect fetching the ops we use often into &func is probably going to be reasonably good compared to the lookups every time; not sure there'll be much more to win with any custom dispatcher magic
timo: The JITting you added is certainly worthwhile; those are used in implementing dispatchers quite a lot 10:45
timo there's now an nqp pull request for the lookup movement 10:50
there's still some in there that could go into the global scope like null and set and such
but i have the feeling that we've got a list_b or two with a whole lot of entries
unless we're actually creating a list now and serializing it t get it out with a wval 10:54
oops the PR doesn't compile lmao 10:55
10:57 MasterDuke_ joined
sena_kun hmm, I wanted to do a run with zef this time to get more info, but zef --config-path=data/zef-config.json update took 7 minutes and is still not finished. :/ 10:57
MasterDuke_ timo: did I see you say that only the js backend uses directives?
timo i did say that! please tell me if i'm wrong? 10:58
MasterDuke_ IIRC, that’s how I implemented getting the right file name and line number into the rakudo backtraces
timo interesting. i did not see any place that actually sets that dynamic variable 10:59
outside of vm/js/
jnthnwrthngtn timo: The appraoch looks sensible to me
Though yeah, please make sure it doesn't break anything :D 11:00
MasterDuke_ Hm, maybe the implementation has changed since?
Oh, what about in rakudo?
timo when did you do that? i seem to recall it was only a month or two ago?
i believe i looked there as well
MasterDuke_ Implement directives? That was ~2016 11:01
timo oh, haha
MasterDuke_ Probably the first largish thing I did 11:02
timo '/home/timo/perl6/install/bin/nqp-m' --module-path=blib --ll-exception --target=mbc --output=blib/Perl6/Actions.moarvm gen/moar/Actions.nqp
list_b op with 384 blocks
MasterDuke_ But feel free to optimize it, I think it’s been untouched since then
timo when something is executed once instead of 385 times, it's *gotta* have an impact!
MasterDuke_ So if the dynamic variable can be removed that would be great 11:03
timo the metamodel has 412 blocks, can't wait to see what the core setting has to offer 11:06
17.3k blocks :D 11:08
MasterDuke_ 17.3k blocks oughta be enough for everybody 11:09
timo i'm doing a little timing for before/after moving the lookup out of the loop for list_b 11:14
m: say my $b = 1631272404295393032 - 1631272404295036551; say my $a = 1631272616735221172 - 1631272616704178601; say 100 * $b / $a 11:18
camelia 356481
timo m: say 1631272739972972185 R- 1631272741032432729 11:19
camelia 1059460544
timo m: say 31042571 / 1059460544
camelia 0.0293003559
timo haha, i improved something that takes a little less than a thirtieth of a second in the core setting stage mast
but i made it almost 100x faster 11:20
MasterDuke_ The accumulation of marginal gains can yield noticeable results 11:21
timo the other improved ops are much much harder to measure since they are called a whole lot more often and then i'll be reaching "too much overhead for measuring" territory 11:22
jnthnwrthngtn I think anything that doesn't harm code quality (and this arguably improves it anyway) and gives a measurable, even if tiny, win is worth it; they add up. 11:23
Especially when added up over all people using Raku. :) 11:25
timo m: say 7926346 / 1059460544 11:26
camelia 0.0074814924
timo 0.0075 seconds off of the install_core_modules task :)
which took 20.58 seconds in total 11:27
+++ Cleaning up MOAR 11:33
rm -f --
inst-perl6-debug-m.o .o .o .o .o .o dynext/libperl6_ops_moar.so 11:34
someone *really* wants .o gone
i'm running a spec test to be extra sure 11:40
11:42 MasterDuke_ left 12:02 reportable6 left 12:04 reportable6 joined
timo it passed, but i forgot about the spec test run until now :D 12:15
sena_kun no new findings in Blin so far. the modules we prepared PRs for should be bumped for the fixes to be visible, but other than that all the hard things left. 12:42
I'm running one with zef now.
4 modules, 1 needs a PR and 3 others are hard, from less deps to more deps: hide-methods, DateTime::Timezones, HTML::Canvas. 12:43
maybe running with zef will uncover something too 12:44
12:44 frost left
lizmat sena_kun: any idea what's wrong with hide-methods ? 12:45
Nicholas to check, if I understand "4 modules, 1 needs a PR and 3 others are hard" - that means just 3 modules left to investigate, that are broken by new-disp but work on master?
sena_kun lizmat, that's a question for jnthnwrthngtn I'd say.
lizmat what is the error mode?
sena_kun Nicholas, well, AFAIK they are "investigated", it's just that to create a fix is hard. 12:46
lizmat, # Failed test 'could B.bar not be found now'
lizmat hmmm... if does use ^find_method 12:48
sena_kun lizmat, it's a diff between master and new-disp, so it is a matter of new-disp to be resolved I believe. 12:49
lizmat ok, then I won't touch hide-methods until jnthnwrthngtn tells me I should fix something :-) 12:50
i guess this is the real troublezone: 12:52
# cannot use nextcallee because that would refer
# to the original method that got wrapped.
if $class.^mro[1].can($name).head -> &nextone {
nextone(SELF, |c)
timo i wonder what makes the optimizer's visit_children so expensive. is it just that it's called so often, or is it something like "the istype if/else cascade is not optimal for the typical distribution of types" or something 12:55
12:56 tealecloud left 12:57 tealecloud joined
timo 13498 Stmts 1st 13:06
17642 Block
18600 typecheck
37405 Want
43675 Xval or vm
56221 Stmts 2nd
113632 wval
308335 Op
323898 Var
the order they are in is op, want, var, block, stmts 1st, stmts 2nd, regex, wval, typecheck, xval or vm 13:07
13:08 brrt joined
timo so want is a whole lot earlier in the checks than it tends to appear (in the core setting), op and var could be switched but it's not a big difference, wval or vm wants a lot earlier 13:09
brrt good * #moarvm 13:14
dogbert11 hello brrt 13:16
jnthnwrthngtn timo: I'd not invest too much time in Perl6::Optimizer as I'm not sure it'll survive in its current form beyond rakuast
timo ah i guess that's a good point :)
jnthnwrthngtn Of course, easy wins like reordering those branches by usage is fine 13:17
o/ brrt
sena_kun: Wow, so we're really down to just 4 modules broken? :D
timo the most expensive code frame showing up in my perf reports is MATCH followed by _ws, then termish, then !alt
not sure i can actually get any performance out them as easily
jnthnwrthngtn (Well, aside from those using nqp::ops where we've decided it's SEP...) 13:18
timo SEP?
jnthnwrthngtn Somebody Else's Problem
timo oh, that was about modules, right
jnthnwrthngtn yes :) 13:19
timo i should also mention of course that MVM_interp_run has like 7.82% so that's frames that aren't jitted
so those just don't show up in the measurements as individual entries
sena_kun jnthnwrthngtn, well, tbh I am not fond of pakku output with some modules, I am trying zef but the update seems to be still running after... 36 minutes and so I cannot run it. So it still a possibility there are more of them, of course. But maybe not by a lot.
brrt timo: does that imply that we JIT 92% of runtime? 13:20
(or that 92% of the runtime, is JITted code?)
or am I reading that number entirely wrong
jnthnwrthngtn brrt: I have an op sp_resumption which at runtime is just meant to do the same as `null` (e.g. null out one register), but it has a bunch of other operands...in the lego JIT it was easy to just make it JIT like `null`, when I tried to copy the `null` template in the EXPR JIT I got an error from the template compiler. 13:21
timo well, that's not inclusive
jnthnwrthngtn brrt: Dunno if you've time/interest to have a poke at it; I'm guessing it's something small/silly 13:22
brrt I have time and interest :-)
timo it literally just means time spent directly in interp_run. the next functions are frame_dispatch, disp_program_run, nqp_nfa_run, fixed_size_alloc, spesh_arg_guard_run, allocate_frame
brrt ah, okay
and any other downstream functions, too, I presume 13:23
jnthnwrthngtn brrt: OK, then new-disp branches in moar/nqp/rakudo, copy-paste the template for null and change the name to `sp_resumption`, compile and then run probably anything in Rakudo
iirc it was an oops
brrt ah
only 220 commits away from master... 13:24
oh, I was far behind with that branch 13:30
hey, isn't that an implicit goto though?
no, it has varargs registers 13:31
timo looks like reordering did not make a measurable difference so *shrug* 13:36
jnthnwrthngtn brrt: Varargs regs, but they don't matter at all here 13:47
brrt I think it is this: 'MoarVM oops in spesh thread: Can't add constant for operand type 18'
jnthnwrthngtn Looks familiar
brrt I ... think I know that code :-) 13:56
that would be MVM_reg_uint16 probably 13:57
jnthnwrthngtn Ah, just something the template compiler doesn't know about? 14:00
timo brrt: do you isfalse_s or do i? 14:08
do we get a little node that negates an int register perhaps? 14:10
maybe i'll introduce MVM_JIT_RV_INT_NEGATED 14:16
14:27 MasterDuke_ joined
MasterDuke_ timo: I was doing some experimenting with jitting isfalse_s recently, by changing the c function to take a flag whether or not to negate the result (some other similar function in the same file does the same thing), but my rough testing showed it to be slower 14:29
But istr it was difficult to find good test raku code 14:30
timo well, i made the return value mode now that negates the value
MasterDuke_ Might want to search the logs, I may have given some more details then
timo that shouldn't make it slower in general 14:31
after jitting isfalse_s did anything commonly replace the bail with something else?
14:32 MasterDuke_ left 14:33 MasterDuke_ joined
MasterDuke_ Yeah, that’d probably be better. Don’t remember why I didn’t try implementing that 14:33
Might be able to simplify whatever that other op was too 14:34
timo twitter.com/davidrevoy/status/1436...9768081412 - cool we have a nice illustration of it now 14:36
oh, did you have other ops where the negation would be helpful? 14:37
also it's impressive that we negate an integer by issuing test + setnz + movxz + final mov to the target register 14:38
MasterDuke_ MVM_coerce_istrue has a ‘flip’ parameter that you might be able to remove 14:39
timo that is kind-of API, though 14:43
MasterDuke_ How do you mean? The nqp op doesn’t have that parameter 14:47
Ha, look at src/spesh/optimize.c:1221 that can be simplified (using the existing flip argument), fine as is if the arg gets removed 14:49
timo MVM_blahblah that are exported can be considered API that users of moarvm could use 14:50
i'm on new-disp, the lines don't line up 14:51
MasterDuke_ Oh it’s gone there, I guess part of that whole coerce_* turned into dispatchers 14:52
Geth MoarVM/new-disp: 325625c74f | (Bart Wiegmans)++ | 2 files
[JIT] Compile MVM_reg_uint16 constants
timo dispatch_n isn't jitted, which is used for nqp-numify 14:58
sorry, i meant sp_dispatch_n
jnthnwrthngtn Oh...I thought all the sp_dispatch_* were 15:21
I suspect that's easy to rectify
brrt++ # that was an easy one 15:28
brrt: Bytecode invocation is now a single op (`sp_runbytecode_*`) rather than a sequence of ops, which maybe will open the way to expr JIT of that too 15:29
Or at least remove one pain point
15:33 evalable6 left, linkable6 left
brrt I can have a look at some (later) time 15:37
jnthnwrthngtn No hurry at all. :) They are done (by nine++) in the lego jit already 15:44
brrt :-) 15:45
time for dinner!
jnthnwrthngtn Enjoy
lizmat brrt: but nine is on holiday for the next 10 days
nudge nudge :-) 15:46
brrt hehe 15:48
nudge received
15:52 MasterDuke_ left 15:53 brrt left
Nicholas oh, his timing is excellent 16:03
ASAN is very excited, and the only thing that changed was 1 commit on MoarVM
.tell brrt You've made ASAN very excited! paste.scsys.co.uk/595856 16:05
tellable6 Nicholas, I'll pass your message to brrt
Nicholas compiling the core setting
Geth MoarVM/new-disp: 4bcc5a0e88 | (Jonathan Worthington)++ | src/core/args.c
Remove always-false null checks

There will always be at least one callstack record below an executing frame record, and if that is a region start, it means there's a whole region of records prior to it. Spotted by mlschroe++.
Nicholas not really in a position to test that. building a non ASAN build to see whether valgrind also gets excited 16:13
jnthnwrthngtn grmbl, I tested it before rebasing with brrt's PR and now more core setting build fails with "stack smashing detected" 16:16
ah, I see what's wrong 16:18
Nicholas you're doing better than I am
.oO( Smashing stack, grommit )
Geth MoarVM/new-disp: c0320b0818 | (Jonathan Worthington)++ | src/jit/expr.c
Survive varargs ops in the expression JIT

They can have more operands than the maximum for non-vararg ops, which could lead to buffer overruns.
Nicholas if it makes more sense to rebase to fix this cleanly, please do it. I'm not going to come visit and wave a pitch fork in frustration, even though I *think* I probably can actually travel to you with one without hitting security theathre 16:21
jnthnwrthngtn Naughty fingers typed "bugger overruns", glad I caught that before pushing.
.oO( it is better to be buffered )
Nicholas oh, my, varargs. The thing that (if I vageuly remember) the x86_64 ABI says have the argument count in a register. That regular functions do not. 16:22
At least, IIRC, you can't safely cast one to the other and have it work
jnthnwrthngtn Nicholas: varargs MoarVM ops, not ABI varargs, thankfully
Nicholas ah OK. phew
jnthnwrthngtn They don't imply having to do C-level varargs calls
Mixed feelings on the rebase; the reason that the commit introduced a SEGV was more that it uncovered another shortcoming, rather than being in itself wrong 16:23
Nicholas yes, then I think don't 16:24
"it should have worked" rather than "whoops, goof"
um, oops, ASAN even more excited 16:25
now in NQP. (twice, due to parallel make) 16:26
jnthnwrthngtn Oh, curious. Maybe I didn't understand the expr jit code as well as I thought then...
Curious, my Rakudo build completed fine :)
Nicholas er, two different backtraces. 16:27
gen/moar/stage1/nqpmo.moarvm is paste.scsys.co.uk/595858 16:28
jnthnwrthngtn lol 16:29
Nicholas er, no, I'm confused, they are the same i think 16:30
jnthnwrthngtn Just before that
/* A HACK.
OK, I can fix this easily
Nicholas again, "you're doing better than I am" 16:31
I'm told that it's time to eat. So &
Geth MoarVM/new-disp: ebfd4d5d6e | (Jonathan Worthington)++ | src/jit/expr.c
Ensure at least 2 slots in expr JIT operands array

It relies on this when compiling inc_i/dec_i ops.
16:33 evalable6 joined
jnthnwrthngtn That should do it 16:33
Happy nomming 16:34
16:34 linkable6 joined 16:57 sena_kun left 17:49 leont left 17:50 leont joined 18:02 reportable6 left 18:04 reportable6 joined 18:13 brrt joined
Geth MoarVM/new-disp: 29b67875bd | (Jonathan Worthington)++ | 3 files
Better sp_bindcomplete handling on inline

Instead of making it block an inline, we can simply delete it, since if we're inlining then (at least for now) we know it will be a no-op.
MoarVM/new-disp: 4f2b0bef8e | (Jonathan Worthington)++ | src/spesh/disp.c
Propagate value result facts in dispach programs

This means that dispatchers that resolve to the identity function - such as hllize - will not end up losing the facts known about their input.
jnthnwrthngtn Curiously, getting dispatch program translation to not insert guards if we already have facts about types proving the guard isn't needed leads to a SEGV in NQP compilation. 18:28
Will have to track down why; that should be a trivial opt to do. (And while in theory we could eliminate those guards later, their insertion involves an SSA version split, which is quite expensive.) 18:30
Though it seems we aren't always deleting proven guards either, and evidently aren't doing so in these cases.
Time for food/rest 18:31
18:33 tealecloud left 19:11 tealecloud joined 19:26 brrt left 19:32 brrt joined
Nicholas brrt: jnthnwrthngtn fixed the thing that .tell will tell you about 19:43
jnthnwrthngtn: now somewhere in spectest with your most current MoarVM 19:44
brrt o 19:45
tellable6 2021-09-10T16:05:09Z #moarvm <Nicholas> brrt You've made ASAN very excited! paste.scsys.co.uk/595856
brrt huh
Nicholas see scrollback. your change hit a bug
er, log. 19:46
brrt ah, I see it
broken assumptions were broken; MVM_MAX_OPERANDS no longer made sense 19:47
my C89 heart is a little disappointed by the variable size array 19:52
but I'll accept
timo sp_bindcomplete also showed up regularly on the bailed jit messages 20:32
Geth MoarVM/new-disp: 32e0c279f7 | (Timo Paulssen)++ | 3 files
implement negating return values for c functions, isfalse_s
jnthnwrthngtn Nicholas: Does that mean no ASAN excitement? 20:46
dogbert11 jnthnwrthngtn: there are some bugs lurking in new-disp 21:08
e.g. try running: while MVM_SPESH_NODELAY=1 ./rakudo-m -e 'sleep 1; sub f($str) { my $num = +$str; }; f "3.0/2"'; do :; done 21:09
21:16 tealecloud left
timo Type check failed in binding to parameter 'nu'; expected Int but got Rat (3.0) 21:38
this error?
dogbert11 timo: yes 21:51
21:51 brrt left
timo ok so we've got the check istype($result, Int) && istype($denom, Int) and that calls Rat.new, otherwise we call infix:</> with the two numbers 21:59
and it is in fact calling Rat.new here
i can't seem to get the code of interest into the spesh log, let's see ... 22:01
ah, haha 22:03
i was ctrl-c'ing during the "sleep 1" of the next run
so the code had in fact not happened yet
dogbert11 I snuck the 'sleep 1' since it helped reveal the problem quicker 22:06
timo when i have time i'll have to rr this i'm afraid 22:34
22:51 evalable6 left, linkable6 left 22:52 evalable6 joined 22:54 linkable6 joined 23:54 coverable6 left, tellable6 left, linkable6 left, unicodable6 left, benchable6 left, releasable6 left, nativecallable6 left, sourceable6 left, bloatable6 left, statisfiable6 left, bisectable6 left, reportable6 left, evalable6 left 23:55 tellable6 joined, benchable6 joined 23:57 nativecallable6 joined, coverable6 joined