Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
Nicholas good *able6, #moarvm 07:25
MasterDuke i spun up my win10 vm last night to track down the (hopefully final) windows bug in my gmp branch. but then i realized it was late on a sunday and i didn't really want to do debugging on windows. so i killed virtual mechanical dinosaurs instead 07:51
patrickb MasterDuke++ # being reasonable by keeping things -Ofun 07:59
nine MasterDuke: virtual mechanical dinosaurs? Ha! Same here :) 08:25
MasterDuke nine: also playing horizon zero dawn? or has the next one been released already? 08:34
brrt \o 08:37
Nicholas o/
MasterDuke brrt: while you're here. is there a way to invert the result of a function call in the lego jit? it's easy in the template jit, but i don't know how to do it in src/jit/graph.c 08:39
nine MasterDuke: Yes, currently in my first New Game+ 08:41
MasterDuke ah, nice. this is my first time playing it (i've gotten years behind in my games) 08:42
patrickb Just noticed that moarvm.org/roadmap.html and moarvm.org/features.html are largely out of date. 08:44
moarvm.org/contributing.html still has freenode on it 08:46
brrt MasterDuke: I thought there was a way but I don't recall 08:47
MasterDuke k. it's easy enough to add an argument to just flip the return value 08:48
brrt there isn't, yet.
but, there is `MVMJitRvMode` which will allow you to add stuff
Nicholas patrickb: I think it's possible to make PRs against github.com/MoarVM/moarvm.org/ 08:49
MasterDuke i meant it's easy in this case to add an argument to the function i'd be calling. but interesting, i hadn't thought about a new MVMJitRVMode... 08:53
patrickb Nicholas: I thought about creating a PR, but figured, that especially for the roadmap and features pages I don't have a broad and deep enough insight into the current state of things and where potential for improvement is / what the plans are to make a decent update.
Nicholas makes sense. 08:54
brrt that's what we use MVMJitRvMode for, also for dereference and stuff
Nicholas I should add that I have no idea how things get from that repository to deployment and publication
jnthnwrthngtn Any commits to the master branch will be automatically deployed in 10-15 minutes 10:25
(for moarvm.org, that is)
Geth MoarVM: 2f5c21fb65 | (Nicholas Clark)++ | src/core/str_hash_table.c
Fix another bug in MVM_str_hash_fsck().

Calling `MVM_str_hash_entries` when just the control structure was allocated
  (the empty hash optimisation) would trigger an assertion failure.
We need to check `control->cur_items` and `control->max_items` explicitly. It also makes sense to check for `control` being NULL and handling that case
  (instead of segfaulting).
10:38
lizmat and yet another Rakudo Weekly News hits the Net: rakudoweekly.blog/2021/07/26/2021-...in-summer/ 13:27
jnthnwrthngtn lizmat++ 13:37
lizmat: I guess "branche" is a typo
nine lizmat: dogbert17++ has helped a lot with new-disp as well by finding reproducible ways to provoke the segfaults I then fixed 13:40
lizmat jnthnwrthngtn: indeed, fixed :-) 13:41
added dogbert17++ 13:42
nine lizmat++
jnthnwrthngtn Turns out that I left the commit where I was going to debug the SSA version splitting thing at home. No matter; support for `callwith`/`nextwith` also needs doing :) 13:44
Altai-man jnthnwrthngtn, gods bless you. 13:52
dogbert17 hooray, I'm a celebrity :) 14:01
sometimes I get spurious test fails with the message 'No subtests run'. When I rerun the test everything is ok. 14:20
but this time I got a coredump 14:21
nine, timo: is it possible to extract anything of interest from this: gist.github.com/dogbert17/0825eaf6...0f8604eb65 14:25
jnthnwrthngtn dogbert17: That'd imply spesh_cand is somehow bogus; no idea how 14:26
oh
It could be unlucky timing
dogbert17 it *feels* timing related 14:27
jnthnwrthngtn Hm, my first idea doesn't quite make sense
dogbert17 darn
jnthnwrthngtn I wondered if it was because we didn't initialize ->spesh_cand to NULL before doing the logging, so it'd be a bogus pointer when we walk the callstack 14:28
MasterDuke that's on new-disp?
jnthnwrthngtn But no, I can see the nulling right above it
MasterDuke: Must be, since MVM_frame_dispatch is in the stacktrace and that doesn't exist on master :)
MasterDuke heh. hard to argue with that logic 14:29
dogbert17 I guess I'll have to run with --no-optimize for a while hoping that the problem returns 14:36
jnthnwrthngtn m: say 1325 / 1349 14:43
evalable6 0.982209 14:44
dogbert17 passing tests?
jnthnwrthngtn Yeah, +4 higher than the best number I've seen so far, but we're gradually losing more due to missing fixes from `master`, so I think the work I just did for `callwith` on methods got us 6 new fully passing test files 14:45
dogbert17 that's really cool. Are you considering a rebase? 14:46
jnthnwrthngtn Sometime later this week. 14:47
They're a little inconvenient for everyone working on the new-disp branches
dogbert17 so how many tests are left if we ignore master
jnthnwrthngtn Don't know exactly, but I think less than 20 14:50
jnthnwrthngtn gets more tea and then has a go at making callwith work with the wrap dispatcher 14:51
jnthnwrthngtn Hurrah, now all but 1 of the wrap tests passes 16:04
The failing one being thanks to the wrong exception type when there's no resumable dispatch in scope. 16:05
MasterDuke jnthnwrthngtn: are is(true|false)_s going to become dispatchers? 16:09
jnthnwrthngtn No 16:11
There's nothing to dispatch
They work on a str register
And always do the same thing 16:12
[Coke] moarvm.org - don't think the source for this is in a repo...
... dammit, didn't review! :)
MasterDuke oh. right. timo mentioned that earlier. everyone forget i asked that question (again)...
Nicholas [Coke]: has the tide in your coffee cup gone out? 16:13
[Coke] Nicholas: well, I didn't try to add the gatorade powder to the coffee today, so I guess I'm doing better.
Nicholas :-) 16:14
[Coke] afks to do some errands.
jnthnwrthngtn Quite a lot of new passing tests with wrap callwith fixed up, alas no new entirely passing test files to add to the tally.
Altai-man [Coke], what's wrong with moarvm.org? 16:22
jnthnwrthngtn Altai-man: It was mentioned in backlog that it still mentions freenode 16:23
Altai-man jnthnwrthngtn, ah, on the contributing page, alright. I thought I screwed something up with a release.
jnthnwrthngtn No :)
Altai-man btw, no https is apparently a point of annoyance for it as well. :( 16:24
jnthnwrthngtn moarvm.org/ # works fine? 16:32
Altai-man ah, you're right 16:33
Geth MoarVM/new-disp: df38d6b7fa | (Jonathan Worthington)++ | 3 files
Allow configurable "no resumption in scope" error
16:42
jnthnwrthngtn m: say 1328 / 1349 17:23
evalable6 0.984433
jnthnwrthngtn Home time o/ 17:24
Altai-man maybe it's time for a blin run 17:27
MasterDuke this is backwards progress. i jitted is(true|false)_s and got my test runtime to increase from 0.37s to 0.43s 17:38
brrt oops.... 17:50
MasterDuke anyone see anything terribly wrong about gist.github.com/MasterDuke17/87042...659d8d216f ? 17:53
my test case is `my int $a := 0; my $b; my int $s := nqp::time; my str $si := ~$s; while $a < 100_000_000 { if $si eq "" { $b := 3 } else { $b := 5 }; ++$a }; say(nqp::time - $s); say($b)` in nqp 17:54
nqp: my int $a := 0; my $b; my int $s := nqp::time; my str $si := ~$s; while $a < 10_000_000 { if $si eq "" { $b := 3 } else { $b := 5 }; ++$a }; say(nqp::time - $s); say($b) 17:58
oh, camelia is down
nine: ^^^
but wow. that version above takes ~0.37s locally. if the loop body is instead `$b := $s eq "" ?? 3 !! 5;` it takes ~5.8s 18:00
MasterDuke each version only has 14 frames. but the if does no garbage collections and the ternary version does 1144 gcs. 100% of the extra time is spent in gc 18:05
nine So what's the ternary version allocating? 18:18
MasterDuke VMString
100000004 of them vs 6 18:19
nine Wait a minute. In one it's $si eq "" in the other it's $s eq "" 18:23
MasterDuke they have different --target=optimize, but each version is essentially the same as their body. if, condition, bind, value, bind, value vs bind, if, condition value, value
doh 18:24
now ternary is only ~0.04s slower 18:25
nine ha 18:26
MasterDuke profiles look identical, except ternary is slower, but no obvious reason why 18:27
nine what's the margin of error?
MasterDuke don't know exactly. results are relatively consistent 18:28
fastest ternary was 0.410s, slowest if was 0.379 18:29
over ~10 runs of each
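A minimal sketch of the comparison being made here, in Python rather than NQP: if the fastest run of the slower variant is still slower than the slowest run of the faster variant, the two timing ranges do not overlap, which suggests the gap is not just run-to-run noise. The function name `disjoint` is hypothetical, and only the two extremes quoted above (0.410s and 0.379s) are used as data.

```python
def disjoint(slower_runs, faster_runs):
    """True if every timing in slower_runs exceeds every timing in faster_runs."""
    return min(slower_runs) > max(faster_runs)

# Only the extremes were reported: fastest ternary run and slowest if/else run.
fastest_ternary = 0.410
slowest_if = 0.379

print(disjoint([fastest_ternary], [slowest_if]))
```

With full lists of the ~10 runs each, the same call would work unchanged; a proper margin of error would still need something like a standard deviation over those runs.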
same with MVM_SPESH_BLOCKING=1 18:32
seems like the difference is probably still there with spesh disabled 18:35
nine m: say "glad to be back" 18:43
camelia glad to be back
MasterDuke nqp: my int $a := 0; my $b; my int $s := nqp::time; my str $si := ~$s; while $a < 100_000_000 { $b := $si eq "" ?? 3 !! 5; ++$a }; say(nqp::time - $s); say($b) 18:45
camelia 382541405
5
MasterDuke nqp: my int $a := 0; my $b; my int $s := nqp::time; my str $si := ~$s; while $a < 100_000_000 { if $si eq "" { $b := 3 } else { $b := 5 }; ++$a }; say(nqp::time - $s); say($b)
camelia 346955696
5
nine nqp: my int $a := 0; my $b; my int $s := nqp::time; my str $si := ~$s; my $three := 3; my $five := 5; while $a < 10_000_000 { $b := $si eq "" ?? $three !! $five; ++$a }; say(nqp::time - $s); say($b) 18:58
camelia 31529592
5
nine nqp: my int $a := 0; my int $b; my int $s := nqp::time; my str $si := ~$s; while $a < 10_000_000 { $b := $si eq "" ?? 3 !! 5; ++$a }; say(nqp::time - $s); say($b) 18:59
camelia 36367617
5
nine nqp: my int $a := 0; my int $b; my int $s := nqp::time; my str $si := ~$s; while $a < 10_000_000 { $b := $si eq "" ?? 3 !! 5; ++$a }; say(nqp::time - $s); say($b)
camelia 39076562
5
MasterDuke both are a little bit faster with the $three/$five, but the difference between the two remains (locally) 19:00
MasterDuke perf shows ~22% spent in MVM_coerce_istrue_s in the fastest version, but yeah, my jitting makes it worse (i assume it's not the jit's fault, but the added branch in MVM_coerce_istrue_s) 19:09
nine Well it comes down to a single set instruction that makes the difference. The ternary version actually writes the result into a temporary register in the branches and at the end writes from the temporary to the target register. The if/else version writes to the target directly.
MasterDuke nice find. is the temp reg use in the ternary version necessary? 19:10
nine Not even spesh can eliminate this additional set, because we cross a PHI boundary
MasterDuke huh. i've always thought of ternary as same-to-faster than if/else 19:11
nine Well...obviously no, because there's an equivalent version that gets by without it. But, writing a code generator that pulls this off might be a bit tricky
It looks easy in a case like $a := $cond ?? $b !! $c; But ternaries are not just used in assignments. For something like foo($cond ?? $a !! $b) you very much need that register for the result. 19:14
nqp: my int $a := 0; my int $b; my int $s := nqp::time; my str $si := ~$s; while $a < 10_000_000 { $si eq "" ?? ($b := 3) !! ($b := 5); ++$a }; say(nqp::time - $s); say($b) 19:17
camelia 35218600
5
nine This compiles to exactly the same as the if/else version
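The difference nine describes can be illustrated with a toy code generator, sketched here in Python. The instruction names (`unless`, `const`, `goto`, `label`, `set`) and the register name `tmp0` are illustrative assumptions, not MoarVM's actual bytecode or NQP's real code generator; the point is only that the expression form needs a result register and one trailing `set`, while the statement form writes the target directly.

```python
def emit_ternary_assign(target, then_val, else_val):
    """Ternary used as an expression: both branches fill a temporary,
    then a single `set` copies the temporary into the target."""
    tmp = "tmp0"  # hypothetical temporary register
    return [
        ("unless", "cond", "else"),
        ("const", tmp, then_val),
        ("goto", "end"),
        ("label", "else"),
        ("const", tmp, else_val),
        ("label", "end"),
        ("set", target, tmp),
    ]

def emit_if_else_assign(target, then_val, else_val):
    """if/else with a bind in each branch: the target is written directly."""
    return [
        ("unless", "cond", "else"),
        ("const", target, then_val),
        ("goto", "end"),
        ("label", "else"),
        ("const", target, else_val),
        ("label", "end"),
    ]

def count_op(code, op):
    """Count how many instructions in `code` use opcode `op`."""
    return sum(1 for ins in code if ins[0] == op)

ternary = emit_ternary_assign("b", 3, 5)
if_else = emit_if_else_assign("b", 3, 5)
print(count_op(ternary, "set"), count_op(if_else, "set"))
```

In this model the two sequences differ by exactly one instruction, the final `set`. For a call like foo($cond ?? $a !! $b) the temporary is the only place the result can live, which is why a generator cannot simply drop it everywhere.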
MasterDuke well, i guess if it's any consolation, both nqp and rakudo have only a couple lines that match `git grep -P '^\s*\$\w+ := .*\?\?'` 19:20
which actually kind of surprises me 19:21
oh, a few more if you use \S+ instead of \w+ 19:22
and a lot of others are broken up over multiple lines and won't match that regex 19:24
[Coke] Didn't lizmat do a bunch of work to replace unoptimized raku/nqp code with more optimized nqp calls? wouldn't surprise me if that's why 19:25
... also I wonder if once new-disp hits if we can consider unrolling some of that.
MasterDuke `git grep -P '^\s*[$@%]\S+\s+:= .*\?\?'` finds some more 19:27
i also find it crazy that literal numbers in nqp are slower than binding a variable to the number and then using the variable. how difficult would that be to optimize? 19:33
nine Well the $foo := $cond ?? $a !! $b pattern does lend itself well to static optimization. 19:42
The issue with the literal numbers is that they will get boxed for every loop iteration, while my version with $three and $five pulls that boxing out of the loop. 19:43
The my int $b; version of course doesn't have the boxing overhead in the first place 19:44
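The boxing point can be modeled with a small Python sketch. The `Box` class and both loop functions are hypothetical stand-ins that just count allocations; they are not how NQP or MoarVM represent boxed integers.

```python
class Box:
    """Stand-in for a boxed integer; counts how many boxes get created."""
    allocations = 0

    def __init__(self, value):
        Box.allocations += 1
        self.value = value

def loop_with_literals(n, cond):
    """Box a fresh value on every iteration, like a literal used in the loop body."""
    Box.allocations = 0
    b = None
    for _ in range(n):
        b = Box(3) if cond else Box(5)
    return Box.allocations

def loop_with_hoisted(n, cond):
    """Box both values once before the loop, like binding $three and $five first."""
    Box.allocations = 0
    three, five = Box(3), Box(5)
    b = None
    for _ in range(n):
        b = three if cond else five
    return Box.allocations

print(loop_with_literals(1000, False), loop_with_hoisted(1000, False))
```

In this model the literal version allocates once per iteration and the hoisted version allocates twice in total, regardless of the iteration count.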
MasterDuke hm, right 19:46
lizmat so: $foo := $cond ?? a !! b is bad ?? 19:55
MasterDuke it's a tiny bit slower than $cond ?? ($foo := a) !! ($foo := b) 19:59
i wouldn't go about mass re-writing any of nqp/rakudo yet 20:03
nine It's really only a very tiny bit slower 20:05
lizmat but isn't that more bytecode, and thus less likely to inline?
nine by about one instruction in the loop 20:06
timo i'm thinking since the exprjit currently doesn't cross bb boundaries, we're leaving a little on the table in terms of performance when emitting the actual assembly code 20:13
[Coke] (rewrite) no, not without a clear advantage, agreed.
timo nine: when you say that one set that we can't eliminate crosses a phi boundary, can we perhaps change what writes to that register to write to the register that set writes to? 20:14
nine Well it's certainly possible, since there is the equivalent version that gets by without the temp register. Getting it right may just be tricky. 20:16
timo here's an idea, put the ++$a in front, before the ??!! 20:24
don't think it'll actually do a lot 20:25
but it'd put the add operation into the first block's exprjit tree
MasterDuke oh wow, that is a bunch faster 20:26
nqp: my int $a := 0; my int $b := 0; my int $s := nqp::time; while $a < 1_000_000_000 { $b := $a < 100 ?? 3 !! 4; ++$a; }; say(nqp::time - $s); say($b) 20:27
camelia 1639696338
4
MasterDuke nqp: my int $a := 0; my int $b := 0; my int $s := nqp::time; while $a < 1_000_000_000 { ++$a; $b := $a < 100 ?? 3 !! 4; }; say(nqp::time - $s); say($b)
camelia 1421361707
4
MasterDuke i get about the same results locally with MVM_SPESH_BLOCKING=1 20:28
nqp: my int $a := 0; my int $b := 0; my int $s := nqp::time; while $a < 1_000_000_000 { ++$a; $a < 100 ?? ($b := 3) !! ($b := 4); }; say(nqp::time - $s); say($b) 20:29
camelia 1651706311
4
MasterDuke now that version is slower
timo so just noise?
MasterDuke yes. no. dunno 20:30
maybe we'd need to look at more realistic code to see the difference
optimizing/benchmarking is hard, i'm going to go kill some more virtual mechanical dinosaurs 20:31
timo good choice 20:37
jnthnwrthngtn m: say 1330 / 1349 22:29
camelia 0.985915
jnthnwrthngtn 3 more from callwith/nextwith support for multi dispatch
ah, sorry, 2 more 22:30