#moarvm on 22 October 2021 - Raku Programming Language Log

Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021.
00:02 reportable6 left 00:05 reportable6 joined 00:37 evalable6 joined 01:37 evalable6 left, linkable6 left 01:38 linkable6 joined 02:18 [Coke] left 02:21 [Coke] joined 02:27 tbrowder left 02:28 tbrowder joined 02:39 evalable6 joined 05:33 notable6 left, linkable6 left, statisfiable6 left, committable6 left, quotable6 left, unicodable6 left, squashable6 left, nativecallable6 left, evalable6 left, coverable6 left, shareable6 left, sourceable6 left, bloatable6 left, benchable6 left, bisectable6 left, greppable6 left, tellable6 left, releasable6 left, reportable6 left, evalable6 joined, linkable6 joined 05:35 nativecallable6 joined, tellable6 joined, benchable6 joined 05:36 statisfiable6 joined, quotable6 joined 06:02 reportable6 joined
nine	jnthnwrthngtn: we really should merge github.com/MoarVM/MoarVM/pull/1573 before the release	06:27
06:33 bisectable6 joined 06:34 releasable6 joined 06:35 bloatable6 joined, coverable6 joined 07:31 patrickb joined 07:33 sourceable6 joined 07:34 greppable6 joined, notable6 joined 08:33 squashable6 joined 08:35 shareable6 joined
lizmat	I'll wait to do a bump until that merge happened	09:04
jnthnwrthngtn	moarning o/	09:34
	Bit breezy on the walk to the office, but nothing like I imagine it was yesterday :)
	Good: a sudden gust didn't blow me into the river
	Bad: a sudden gust didn't blow me into a pub either
09:34 committable6 joined
jnthnwrthngtn	nine: Oh! Somehow I thought that was already in; I'm not sure why	09:35
Nicholas	\o
Geth	MoarVM: b92ca73b48 | (Stefan Seifert)++ | src/spesh/optimize.c Fix uninitialized registers after deopt from dispatch guards A dispatch gets translated to a sequence of operations culminating in the runbytecode instruction. The pre-deopt index of the original instruction will be found on the runbytecode itself or any of the guards stacked up before it. When looking for the pre-deopt index, we didn't take into account, that the instruction holding a suitable deopt pre ins annotation may also itself have ... (10 more lines)	09:37
	MoarVM: 8c7b734d87 | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/spesh/optimize.c Merge pull request #1573 from MoarVM/fix-pea-segfaults Fix uninitialized registers after deopt from dispatch guards
lizmat	I'll take that as my cue ?	09:38
	jnthnwrthngtn ^^
jnthnwrthngtn	Yup
lizmat	oki
jnthnwrthngtn	japhb: Nice to see more new-disp speedup results. Even the worst of the rather variable mandelbrot measurements is better too...	09:46
lizmat	MoarVM bumpred	09:56
	.oO( bumpred? )
Nicholas	burped?
jnthnwrthngtn	So, further to my experiment with moving ->work allocations into the callstack, it seems that we can also safely allocate env there for stack-allocated frames, with the condition that upon heap promotion, we also copy the env area into something allocated by the FSA too	10:01
nine	jnthnwrthngtn: regarding the finalizer discussion. What makes a point (like returning) safe for invoking something? Or what's the problem with invoking after arbitrary ops?	10:03
jnthnwrthngtn	Reasoning: the only way we could end up with something looking at the ->env at a distance (such as from another thread) is if there was a heap reference.
	nine: At a minimum, deopt relies on every such point being a deopt all point
	nine: And that same mechanism is also used by the frame walker and now also dispatch resumption	10:04
	You even fixed a bug not long back where we had an invocation by, I think, loadbytecode, and it wasn't marked as a deopt all point
nine	Is there a cost to making something (like e.g. goto) a deoptallpoint?	10:05
	Hah! Indeed I did :)
jnthnwrthngtn	Yes, deopt points imply deopt usages in the spesh graph. Those in turn inhibit things like DCE, set elimination, etc.
	To the degree I've been trying to work out how a replay scheme would look (that is, we deopt not to the precise instruction, but to the last pure instruction, so deopt points come earlier and we get less deopt usages)	10:06
	Almost every time you look at a spesh log and think "grr, why didn't it delete this instruction", the answer is a deopt usage.	10:07
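The point about deopt usages blocking dead-code elimination can be illustrated with a toy model. This is only a sketch and not MoarVM's spesh code: the instruction tuples, the register names and the `eliminate_dead` helper are all invented for illustration. It shows that a register which the optimized code never reads still has to be computed if a deoptimization would need it.

```python
# Toy model only: each instruction is (dest_register, op, source_registers).
# Hypothetical names; this is not how MoarVM's spesh graph is represented.

def eliminate_dead(instructions, live_out, deopt_usages):
    """Drop instructions whose result is never needed.

    live_out:     registers read after the block ends.
    deopt_usages: registers the interpreter would need after a deopt, so
                  they must stay computed even if optimized code never
                  reads them.
    """
    needed = set(live_out) | set(deopt_usages)
    kept = []
    # Walk backwards so a kept instruction marks its sources as needed.
    for dest, op, sources in reversed(instructions):
        if dest in needed:
            kept.append((dest, op, sources))
            needed.discard(dest)
            needed.update(sources)
    kept.reverse()
    return kept


block = [
    ("r0", "const", ()),
    ("r1", "const", ()),
    ("r2", "add", ("r0", "r1")),  # only unoptimized code would read r2
    ("r3", "set", ("r0",)),
]

without_deopt = eliminate_dead(block, live_out={"r3"}, deopt_usages=set())
with_deopt = eliminate_dead(block, live_out={"r3"}, deopt_usages={"r2"})
```

Without the deopt usage, the `add` and the constant feeding only it disappear; with it, all four instructions survive, which is the "why didn't it delete this instruction" effect in miniature.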
nine	Feels like I should have been able to come up with all these answers myself :/ I certainly knew each of these bits already at some point
jnthnwrthngtn	Not sure they're obvious.	10:08
nine	Well I did fix a missing deoptallpoint and I did wonder quite a few times why we couldn't simplify the bytecode some more and ran into deopt when investigating	10:09
lizmat	hmm... the logs server appears to use significantly more memory after this bump
	(in my dev situation, the live server still runs on 2021.09)	10:10
jnthnwrthngtn	grr, of course doing the env thing is going to make OSR a little more fun too...	10:14
nine	But, but, but, more fun is good, isn't it? :D	10:16
jnthnwrthngtn	Maybe not when it's me, Friday, and pointer arithmetic :D	10:17
	On the upside, I think I won't end up having to do a delicate GC dance around continuations, which was about 60% of the time spent on getting ->work allocated on the callstack	10:18
10:34 evalable6 left, linkable6 left 10:37 evalable6 joined 10:44 Altai-man joined
Altai-man	lizmat++ # bump	10:49
tellable6	2021-10-20T18:21:31Z #raku-dev <tbrowder> Altai-man you are very welcome
10:49 Geth left 10:50 Geth joined
nine	Nice to be able to close a segfault issue ticket with just a comment for a change :) (re #4520)	10:59
jnthnwrthngtn	grmbl, wonder how I've managed to make a segv...	11:26
	lunch, bbiab	11:41
12:02 reportable6 left 12:37 linkable6 joined
	jnthnwrthngtn back	12:49
	Guess I shouldn't feel too bad, my mistake involved one of the 3 hardest things in computer science... :P	13:19
Altai-man	You don't mean an off-by-one right? :P	13:21
nine	jnthnwrthngtn: the comment in github.com/MoarVM/MoarVM/blob/mast...sp.c#L1155 only talks about runbytecode, but that flag is also set for runcfunc. Which one is wrong?	13:23
jnthnwrthngtn	No, cache invalidation	13:27
nine	const_i64 r11(0), liti64(1099511627775) # [014] unboxed literal to value 1099511627775	13:36
	sp_runnativecall r5(3), r9(0), liti64(140737352346240), r10(0), r11(0)
	That's just beautiful :)
13:36 unicodable6 joined 13:37 linkable6 left
jnthnwrthngtn	:D	13:39
13:40 linkable6 joined
jnthnwrthngtn	nine: runcfunc in optimize.c does have code to free them	13:40
nine	so the comment should include runcfunc?	13:41
jnthnwrthngtn	nine: So I'd say the comment is wrong
	Yes
	I think it originally was only runbytecode and then it got tweaked
nine	"Make sure we delay release of temporaries since optimization can add further ones." covers it well enough I'd say. It's clear from the case statements to which ops this applies to	13:42
jnthnwrthngtn	up	13:44
	*yup
	OK, apart from fixing up OSR, moving env to the callstack for non-heap frames seems to work
	Another 1.5s off the full Rakudo build, around 1s of it from stage parse	13:45
nine	LOL "MoarVM panic: Unknown disaptch op when resolving callsite"	13:47
jnthnwrthngtn	wat
nine	Why can't I find the source of this message? Because I didn't copy and paste it into my ack command. I typed it fresh and didn't do the typo
jnthnwrthngtn	Oh, I only just spotted it!	13:48
nine	Turns out, there are quite a few places one needs to add new dispatchy ops to
	Now why does it try to mark that int register like an object pointer in the GC?	13:50
	Easy: because sometimes just copying code without understanding it is not really enough. Asked for a temp register with the wrong kind in the UnboxInt translation	13:53
Geth	MoarVM/new-disp-nativecall: 11 commits pushed by (Stefan Seifert)++ review: github.com/MoarVM/MoarVM/compare/e...1a37f36665	14:18
nine	This push contains the first working version of sp_runnativecall	14:19
jnthnwrthngtn	Wow, including JITting?	14:37
	Ah, I guess it's easily possible without that	14:38
	ah, I see :)	14:39
	Still very nice progress
nine	Surprisingly this seems to cover all the native calls that occur during csv-ip5xs.pl	14:46
	So in the good tradition of benchmark driven development, the next step will indeed be to get some JITing going	14:47
jnthnwrthngtn	Does it show an improvement with this much done?	14:54
nine	I do think so	15:02
15:05 reportable6 joined
nine	0m14.187s before 0m13.902s after (best of 10 runs each)	15:06
	Variability is high with results in a range of up to +2s, but values seem to be better on average as well.	15:07
jnthnwrthngtn	I guess the JITting would be the big win at this point, since it'll eliminate all of the dyncall/libffi setup overhead	15:08
nine	In theory it could have even been worse, as sp_runnativecall causes JIT bails which sp_dispatch_o doesn't
jnthnwrthngtn	Ah, yes, that also	15:09
Geth	MoarVM/cheaper-frames: 92f3bac575 | (Jonathan Worthington)++ | 5 files Allocate some frame environments on the callstack Only do this for frames that live on the callstack rather than on the heap. This is the case when we have a lexical environment but it is never captured or the frame doesn't escape in other ways. When frames do escape, we have to also move the environment out of the callstack and onto the heap. We already do go to some effort to allocate on the heap ... (5 more lines)	15:11
15:17 patrickb left 15:25 nebuchadnezzar left
nine	Now that I think of it, there's actually no reason to insist on native functions to box their results.	15:31
jnthnwrthngtn	Not at all :)	16:04
	Hm, I thought that splitting out the specialized vs. unspecialized forms of MVM_frame_dispatch would be a win, but it apparently is not one at all	16:07
	Or I did something wrong.
	Looking forward to the new nativecall JIT integration, so we can get rid of frame->args and frame->cur_args_callsite	16:09
nine	Would be nice if I could make it a much more regular part of the JIT, too	16:11
jnthnwrthngtn	m: say 1.222 / 1.278	16:19
camelia	0.956182
jnthnwrthngtn	Seems the work/env move is another 4% off test-t
	Seems it's about that off everything that isn't a micro-benchmark that ends up mostly inlined	16:21
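The step from the ratio 0.956182 to "another 4% off" can be spelled out as below. This is a small sketch that assumes 1.222 and 1.278 are the after and before timings of test-t; the log does not state their units, and `percent_saved` is a helper name invented here.

```python
def percent_saved(after, before):
    """Return the percentage by which `after` is smaller than `before`."""
    return (1 - after / before) * 100


ratio = 1.222 / 1.278                 # same division as the `m:` line above
saved = percent_saved(1.222, 1.278)   # a little over 4 percent

print(f"ratio {ratio:.6f}, about {saved:.1f}% less")
```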
nine	Hm...to be able to avoid the unboxing, we'd have to start out with a natively typed dispatch instruction like dispatch_i instead of dispatch_o. But can't get it to emit that even when assigning the result of an --> int64 sub directly into an int64 variable	16:23
jnthnwrthngtn	No, we don't do the code-gen for that properly yet	16:25
	I think setting a .returns on the QAST::Op call node would do it	16:26
nine	Sounds like a bit of a yak	16:31
jnthnwrthngtn	Indeed	16:41
	Probably not immediately worth it
nine	I don't even find where the call node really gets created	16:52
jnthnwrthngtn	nine: Maybe the easiest place is in the optimizer, where if we know what we're calling, we can look at the .returns of the callee, and set that on the QAST::Op call	17:00
	I don't think it can be done at the creation time of that node as subs can be post-declared, for example, so we don't try and resolve the sub at that point
	home time o/	17:06
17:08 Altai-man left
nine	jnthnwrthngtn: apparently past you has already thought along the same lines and implemented just that: github.com/rakudo/rakudo/blob/mast....nqp#L3188	17:42
	A little over 10 years ago actually: github.com/rakudo/rakudo/commit/c0...810595f4e6	17:46
lizmat	wow	17:54
nine	Of course that raises the question of why we don't see dispatch_i used then. The answer is: we set the returns on the QAST::Want instead of the QAST::Op(:op<call>) node	17:55
lizmat	so that never worked ?	17:56
nine	It did. Till commit 3cc9d765b2b350c9d15d0164ed53a9914b333afb in 2012	17:59
17:59 linkable6 left
lizmat	well, that's before I really got involved, so "never" is pretty accurate to me then	18:01
18:02 reportable6 left 18:05 reportable6 joined
jnthnwrthngtn	nine: Nice detective work :)	18:10
nine	Still quite the yak. Fixing that leads to "Unsupported register return kind for dispatch op" with a $res_kind of $MVM_reg_int32	18:12
18:12 nebuchadnezzar joined
nine	Which kinda makes sense. We need to extend smaller ints to the full width if we want to use native registers. So needs a coercion there	18:14
	Though not even the old invocation code had such coercions. It simply used the primspec to decide on the instruction kind.	18:21
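What "extend smaller ints to the full width" means can be sketched with plain sign extension. This is a generic illustration, not the coercion MoarVM or NQP would emit; `sign_extend` and `to_int64_bits` are hypothetical helpers, and the example assumes the 32-bit value is signed.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a signed integer."""
    mask = (1 << bits) - 1
    value &= mask
    sign_bit = 1 << (bits - 1)
    # Subtracting 2**bits when the sign bit is set yields the negative value.
    return value - (1 << bits) if value & sign_bit else value


def to_int64_bits(value):
    """Two's-complement bit pattern of a signed value in a 64-bit register."""
    return value & 0xFFFFFFFFFFFFFFFF


raw32 = 0xFFFFFFFE                 # the int32 bit pattern for -2
widened = sign_extend(raw32, 32)   # -2 as a full-width integer
```

Copying the 32 bits into a 64-bit register without this step would leave the upper half zero and turn -2 into 4294967294, which is why a coercion is needed somewhere.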
20:01 evalable6 left 20:02 evalable6 joined
jnthnwrthngtn	m: say 1.79 / 2.07	20:28
camelia	0.864734
jnthnwrthngtn	A recursive fib benchmark (we can't inline recursions, so it's a decent test of callframe setup/teardown) shows a nice improvement with work/env moved to the callstack	20:29
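The benchmark itself is not shown in the log. As a hypothetical stand-in, the Python sketch below counts how many calls a naive recursive fib makes, which is why such a benchmark is dominated by call-frame setup and teardown rather than by arithmetic. The `calls` counter and the choice of `fib(20)` are assumptions made for illustration.

```python
calls = 0


def fib(n):
    """Naive doubly recursive Fibonacci; every invocation is a real call."""
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result = fib(20)
print(f"fib(20) = {result} took {calls} calls")
```

Each call does one comparison and at most one addition, so nearly all the time goes into entering and leaving frames.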
MasterDuke	nice	21:24
21:31 SmokeMachine left, discord-raku-bot left, leont left, Nicholas left, Nicholas joined, discord-raku-bot joined, SmokeMachine joined 21:32 leont joined
timo	\o/	21:48
21:58 discord-raku-bot left, discord-raku-bot joined 22:02 linkable6 joined 22:04 kjp left 22:06 kjp joined 22:51 kjp left 22:52 kjp joined 23:02 Mondenkind is now known as moon-child
japhb	Rebuilt rakudo just after the bump (so only the bump commit new on the rakudo side, but 31 commits farther on the MoarVM side). mandelbrot-pixels was yet faster, but still variable:	23:46
	16 zooms of 62460 pixels each in 5.202 seconds = 192102 pixels/second
	16 zooms of 62460 pixels each in 5.505 seconds = 181546 pixels/second
	16 zooms of 62460 pixels each in 6.526 seconds = 153125 pixels/second
	16 zooms of 62460 pixels each in 5.891 seconds = 169635 pixels/second
	16 zooms of 62460 pixels each in 6.493 seconds = 153912 pixels/second
	Carefully watching it, it seems like the speed was uneven within a single run (not just overall slower or faster) -- some zooms were noticeably slower than others each run.	23:47
	(I suppose there's some of that to be expected just from the math, but it was enough to make me wonder.)	23:48
	That variability is after quiescing my machine.
	(I thought that might have been contributing last time -- despite it not having affected the 2021.09 runs much -- so I shut down most of my apps.)	23:51
	After running the test a bunch of times, I'm seeing that 15xxxx pixels/second is way more common than the faster variants, but they still do show up occasionally.
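To put a number on "still variable", the five runs quoted above can be summarized as follows. This is only a sketch over the figures as logged; the variable names are made up here and nothing is re-measured.

```python
from statistics import mean

# (seconds, pixels/second) pairs copied from the five runs quoted above.
runs = [
    (5.202, 192102),
    (5.505, 181546),
    (6.526, 153125),
    (5.891, 169635),
    (6.493, 153912),
]

rates = [rate for _, rate in runs]
average = mean(rates)
spread = max(rates) / min(rates)   # fastest run relative to slowest run

print(f"mean {average:.0f} px/s, fastest/slowest = {spread:.2f}")
```

The fastest run is roughly a quarter faster than the slowest, which matches the impression that the 15xxxx results and the faster ones form two separate groups.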
	Yeah, confirmed faster with attacks benchmark as well:	23:54
	Min: 73.0 ms (13.7 fps) - Ave: 190.9 ms (5.2 fps) - Max: 376.1 ms (2.7 fps)
	50%: 186.9 ms - 75%: 235.3 ms - 90%: 259.0 ms - 95%: 269.8 ms - 99%: 359.9 ms
timo	might be interesting to extract out of a spesh log which frames get inlined into which other frames, and to see if that differs noticeably between slow and fast runs	23:55
japhb	Is there an easy way to do that already, or is it a SMOP?	23:58
