#moarvm on 8 October 2021 - Raku Programming Language Log

Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021.
00:02 reportable6 left 01:04 reportable6 joined
japhb	m: my int @a = (^100); my int $b; my int $i = (^100).pick; say $i; for ^50_000_000 -> int $n { $b = @a.AT-POS($i) }; say now - INIT now; say $b	01:16
camelia	97 16.736326012 97
japhb	???
	That's almost exactly twice as slow, which is very suspicious	01:17
moon-child	twice as slow as what?
japhb	moon-child: Masterduke posted the same thing 6 hours ago, except with `$b = @a[$i]`. Which itself was several times slower with $i being a native int, and thus tripping over failed optimizations of lexical refs, rather than an Int.	01:19
	Combined, both problems lead to my version being over 11x slower than the faster version.	01:20
moon-child	oh; I need to read more scrollback
01:32 squashable6 joined 02:36 bloatable6 left, evalable6 left, committable6 left, squashable6 left, shareable6 left, quotable6 left, greppable6 left, coverable6 left, nativecallable6 left, bisectable6 left, benchable6 left, statisfiable6 left, linkable6 left, sourceable6 left, unicodable6 left, reportable6 left, tellable6 left, notable6 left, releasable6 left, statisfiable6 joined 02:37 notable6 joined, releasable6 joined, greppable6 joined, evalable6 joined, bloatable6 joined, shareable6 joined 02:38 nativecallable6 joined, committable6 joined, linkable6 joined, sourceable6 joined 02:39 squashable6 joined, quotable6 joined 03:36 bisectable6 joined 03:38 tellable6 joined, benchable6 joined 04:36 unicodable6 joined 04:37 coverable6 joined
japhb	I've updated my draft MoarVM JIT AArch64 port scoping doc to address all the feedback I've gotten so far: gist.github.com/japhb/0c2108affd31...b3eb6a7947	05:05
	It has several new sections (and new subsections of existing sections).
moon-child	'even x64 may trap on unaligned SIMD access' there are unaligned versions of most simd ops	05:09
	(so, not a blocker; just a nice-to-have in the same respects as regular alignment improvements)	05:10
japhb	moon-child: Do those unaligned variants require more recent updates to x64 SIMD? At least one reviewer considered this problem a blocker, so I'm curious if it's just a difference in expected ISA extensions ....	05:34
moon-child	doesn't look like it. E.g. movapd and movupd are both sse2 (amd64 baseline)	05:44
Nicholas	good ,	05:46
	I didn't really mean to cause an accidental digression on SIMD. The reason I'd mentioned it was not in the context of x86_64 JITs, but really more C code. Don't assume that C compilers on x86_64 will keep compiling your code which relies on lax alignment (ie undefined behaviour)	05:48
	it's a trap! :-)
06:00 linkable6 left, evalable6 left
nine	jnthnwrthngtn: but pass-decontainerized does not do a track-attr, but a track-arg!	06:12
	Well both, but track-arg comes first and the value's type is used in the conditional
Nicholas	japhb: looks good to me. I like how you've phrased some of the things we mentioned, and I can't see what to add/change
nine	jnthnwrthngtn: ah, we're both right. Yes the track-attr does imply that guard, but we only do the track-attr if the value is of a certain type. If it's not in the first run, we never do the track-attr, thus no guard gets installed	06:14
japhb	moon-child: Ah OK, thank you
	Nicholas: Excellent, thank you.
	Anyone have any objections to adding the current porting scope draft to github.com/MoarVM/MoarVM/tree/master/docs/jit ? Seems like this research ought to be captured for posterity, no matter who actually picks up the porting project.	06:18
Nicholas	I don't, but I guess that brrt ought to check the draft first (but is that just "a PR and ask brrt to review it?")	06:21
japhb	Nicholas: That's a decent point,
	s/','/'.'/	06:22
Geth	MoarVM: japhb++ created pull request #1560: Doc research on scoping for an AArch64 JIT port	06:34
japhb	(Feel free to pile on as reviewers if y'all so desire; I've already added brrt at Nicholas's suggestion.)	06:36
07:01 linkable6 joined 07:04 patrickb joined 07:05 reportable6 joined 07:40 brrt joined
brrt	good * #moarvm	07:57
Nicholas	good *, brrt
patrickb	o/	08:05
08:57 brrt left 09:02 evalable6 joined
jnthnwrthngtn	moarning o/	10:00
Nicholas	\o	10:01
jnthnwrthngtn	nine: Ah, I guess the point you're making is that a given callsite may deal in both containerized and decontainerized args.	10:02
	In the same positions
10:02 evalable6 left, linkable6 left
jnthnwrthngtn	Which could happen, yes, although the majority of callsites will be monomorphic in everything including this, a bunch that are polymorphic otherwise will still be monomorphic in this, so we're talking about a small number of cases	10:04
10:04 linkable6 joined
jnthnwrthngtn	And then the question is whether having all the permutations of deconts possible stacked up at the callsite is wise.	10:04
10:05 evalable6 joined
jnthnwrthngtn	Explicitly looking for if the site is blowing up may be a more bullet-proof solution.	10:05
	otoh, in the case of a non-multi dispatch we'd end up with type guards on values that'd not be there otherwise in many cases	10:06
	on the third hand, those guards might be inserted to pick a linked specialization
	In summary, worse might be better or better might be better :)	10:07
Nicholas	jnthnwrthngtn: I think you meant "gripping hand". ie en.wikipedia.org/wiki/The_Gripping_Hand	10:10
10:35 brrt joined
jnthnwrthngtn	Hm, I can't remember if I already had my second coffee or not...this isn't a good sign.	10:42
Geth	MoarVM/master: 8 commits pushed by (Jonathan Worthington)++ - Normalize filenames for debug server - Produce one breakpoint instruction per line - Fix issues with resuming suspended threads - Only produce debugger debugging output when asked - Fix regression in stepping - A little more debug output for stepping - Avoid duplicate response for suspend/resume all - Merge pull request #1559 from MoarVM/debug-server-fixes	10:45
10:53 sena_kun joined
nine	jnthnwrthngtn: I came across the situation with github.com/niner/Inline-Perl5/blob...5.pm6#L257 where the first caller passed a non-containerized value and the second caller a Scalar and value gets passed on to $!p5.p5_get_type which then exploded	11:16
11:31 frost joined
nine	So, back to square 1 with the stack_top assertions	11:34
jnthnwrthngtn	nine: Oh...you re-used it for NativeCall	11:52
	nine: Yeah, this isn't good re-use. NativeCall absolutely depends on it. The normal situation has it as an optimization.	11:53
nine	I didn't re-use the sub, but copied and modified the code. Then I ran into this issue and figured that it'd probably affect the original sub as well
jnthnwrthngtn	Yes, but their contexts are different
nine	Ah, ok, that makes sense then.
jnthnwrthngtn	Pre-coffee me earlier didn't catch on to your problem being what it does with NativeCall :)
nine	In hindsight, I could have mentioned that :)	11:54
12:02 reportable6 left 12:03 reportable6 joined
nine	jnthnwrthngtn: I think the most important take away from this is that misunderstanding aside, I understand dispatchers well enough to be able to figure out such issues :)	12:06
jnthnwrthngtn	bus-factor++ :)	12:10
12:16 ggoebel joined
jnthnwrthngtn	Curiously, on the machine I'm working on today, the script dogbert17 posted yesterday is less time-variable than my home one...but it does sometimes take rather longer to run	12:39
	From doing a few measurements it at first looked like simplifying the bytecode size threshold scheme yielded an improvement, but I wasn't sure so hacked up a script to do 100 runs and give me a histogram. Turns out that with that to look at, it's actually no help at all	12:40
dogbert17	I noticed that the code can run slowly even if MVM_JIT_EXPR_DISABLE=1. That doesn't happen very often though.	12:46
lizmat	fwiw, I've noticed over the years that sometimes Raku just runs a lot slower, for the same program	12:47
	I've always assumed some worst case in hashing
jnthnwrthngtn	Yes, but the worst case in hashing is potentially so much worse because it affects optimization decisions that in turn have a significant impact.	12:50
	Anyway, it's clear from this that doing a lot of measurements and looking at the histogram is going to be important here, because it's easy to do a change and a handful of runs and think there's an improvement.	12:51
	Whereas one can just get lucky/unlucky	12:52
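[Editor's sketch] The "do 100 runs and look at a histogram" approach described above could look roughly like the following. This is not the script jnthnwrthngtn used (that one is not shown in the log); the function names, the 0.5 s bucket width and the run count are all assumptions chosen for illustration.

```python
import subprocess
import time
from collections import Counter

def time_command(cmd):
    """Return wall-clock seconds for one run of cmd (a list of arguments)."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start

def histogram(timings, bucket_width=0.5):
    """Group timings into buckets of bucket_width seconds, keyed by bucket start."""
    counts = Counter(int(t // bucket_width) for t in timings)
    return {b * bucket_width: counts[b] for b in sorted(counts)}

def render(hist):
    """One line per bucket: bucket start, run count, and a bar of that length."""
    return "\n".join(f"{start:6.1f}s {n:3d} {'#' * n}" for start, n in hist.items())

def measure(cmd, runs=100):
    """Time cmd `runs` times and return the rendered histogram (hypothetical helper)."""
    return render(histogram([time_command(cmd) for _ in range(runs)]))
```

Calling `print(measure(["raku", "some-script.raku"]))` (script name hypothetical) before and after a change gives two distributions to compare, rather than a handful of possibly lucky timings.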
nine	Benchmarking is hard :/	12:53
jnthnwrthngtn	Indeed, especially with VMs
	Which do complex opts based on sampling
	MoarVM is in very good company in having problems in this area.
	dl.acm.org/doi/pdf/10.1145/3133876 for anyone curious	12:54
	But the punchline is	12:55
	"Repeating our experiment on 3 different machines, we found that at most 43.5% of ⟨VM, benchmark⟩ pairs consistently reach a steady state of peak performance."
	This is how it looks in my office machine for the triangle numbers case: gist.github.com/jnthn/a3bfc7d0a32c...e46ee63706	12:57
	(this is without any changes applied)
lizmat	wow
nine	That 9.2 is when MoarVM went for a coffee break?	12:58
jnthnwrthngtn	That or just a couple of important missed inlinings.	12:59
	Obtaining full spesh logs distorts the problem, but I did dump out inlinings when looking at this before writing the tool to do the histogram and noticed in the longest runs we missed inlining of slip-all, for example.	13:01
dogbert17	what can cause that to happen
jnthnwrthngtn	One experiment I did (added to gist) is what happens if we have larger spesh log buffers. That turns out not to help at all, and in fact seems to make things slightly worse	13:03
	Specialization order is partly decided on by stack depth, so this can be involved.	13:05
	(if the numbers are inaccurate for example)	13:06
	I do wonder a bit if the fact that the stack model doesn't really account for continuations is involved
nine	With a MVM_CALLSTACK_RECORD_NESTED_RUNLOOP Inline::Perl5 makes it through its tests as well! Well, except for a single test that ends up passing a Proxy to a native function	13:15
jnthnwrthngtn	Yay :)	13:16
	How is the performance at this point? I know it's not wired through to JIT and stuff yet.
13:20 psydroid left, AlexDaniel left
nine	Can't really tell yet. csv-ip5xs.pl breaks if run with more than 1931 lines of input - even with spesh disabled	13:22
13:22 AlexDaniel joined
jnthnwrthngtn	OK	13:22
13:22 psydroid joined
jnthnwrthngtn	Hm, so...	13:22
	gist.github.com/jnthn/a3bfc7d0a32c...-frame-txt
	Turns out that maintaining stack depth in the sim stack frame rather than as a property of the sim stack itself does have a very noticeable effect: while still not eliminating the problem, it seems to increase the likelihood of reaching one of the lower timings	13:23
nine	So some 1900 times the native sub returns a Pointer. But suddenly it returns Int instead. What the?	13:24
jnthnwrthngtn	o.O
nine	That'd be a classic spesh issue, except for that I'm pretty sure I've spelt MVM_SPESH_DISABLE=1 right	13:25
	Next guess is GC
dogbert17	missing root?	13:29
timo	my guess is rr knows :P
dogbert17	haha	13:30
	timo: btw, has the cat moved away from your monitor?
timo	yes	13:33
Geth	MoarVM/spesh-stability: a3b6e7bb34 | (Jonathan Worthington)++ | 2 files Increase stack depth tracking accuracy Stack depth is used in order to make decisions about what order to specialize things in. Since we sample the interpreted program, we can end up with missing data at some points, and further we update the statistics in batches, and can have buffer ends being reached in all kinds of situations. Tracking the stack depth by keeping it on the frame rather than as a global property seems to lead to more accurate results and thus more regularly reaching peak performance.	13:48
timo	you may not like it, but this is what peak performance looks like	13:49
jnthnwrthngtn	dogbert17: I'm curious if that branch makes any difference on your machine	13:50
nine	Ooh...looks like it is GC, but not something mundane like missing rooting. Looks more like a destructor is doing a native call (with a callback even) at an inopportune time
jnthnwrthngtn	o.O	13:53
nine	OTOH removing the DESTROY methods doesn't actually change anything. So that call might be innocent after all	13:54
Nicholas	timo: I'm confused by the context of "this is what peak performance looks like" and how it relates to the cat
jnthnwrthngtn	It might relate to my commit message instead :P
timo	we postpone destruction / finalizer calls to after garbage collection finishes don't we?
nine	we do
timo	that is supposed to make things like your first assumption not able to happen :D	13:56
nine	timo: That's probably why it took me a year to figure this one out: github.com/niner/Inline-Perl5/comm...251d9f0d6d	13:57
	Well that and that it simply was incredibly hard to reproduce	14:00
dogbert17	jnthnwrthngtn: let me check	14:01
timo	wow, that's tricky	14:03
14:04 frost left
timo	i wonder if we can do anything better in spesh dumping in terms of slowing the spesh thread down	14:08
	in theory a second thread could be made responsible for rendering spesh graphs to text and outputting, then freeing the spesh data and such
dogbert17	jnthnwrthngtn: executed the program ten times, eight of those were fast and two slow
jnthnwrthngtn	dogbert17: Whereas before none of them were fast?	14:10
timo	remember that paper that showed steady states of performance are often reached after a surprisingly long time?
dogbert17	I need to get the numbers ...	14:11
jnthnwrthngtn	timo: Is it a different one than that which I linked earlier?	14:12
nine	jnthnwrthngtn: preliminary results obtained by disabling the p5_sv_refcnt_dec function completely: master 22.78s, origin/new-disp-nativecall (dispatching to generated function bodies) 10.92s, dispatch to boot-foreign-code (calling the generic MVM_nativecall_invoke): 11.73s	14:13
timo	oh, earlier?	14:14
	i missed the link haha
	that looks like the one i was thinking of
jnthnwrthngtn	nine: so iiuc, with boot-foreign-code we are running in half the time of master, and competitive with the generated function bodies but without having to do all the generation pain?	14:16
nine	Yes, that's about it. It's also roughly the performance we've had pre-new-disp
jnthnwrthngtn	I assume the master timing is with the generated function bodies too?
	Oh, master is master after new-disp
	OK :)
	I'd be very glad to see the generated function bodies vanish before rakuast :D	14:17
nine	master is not using the generated function bodies, because the $!do replacing trickery doesn't work as well anymore for unknown reason. Or rather it's unknown why it worked before new-disp
jnthnwrthngtn	hah :)
nine	It doesn't because we're replacing the $!do of the original function objects but call clones (which retain the original $!do). No idea why that should be different with new-disp	14:18
jnthnwrthngtn	Me either	14:19
	So in theory one we integrate the boot-foreign-code approach with the JIT, we should be winning quite nicely
	*once
nine	That's what I hope :)	14:20
	Will be a good base for NativeCall 2.0 as well	14:21
dogbert17	jnthnwrthngtn: it seems (after ten 'before' tests) that we had six fast executions (with a two sec variability) and four slow ones (approx. 8 secs slower)
jnthnwrthngtn	dogbert17: Aha, so the "runs slower with expr JIT" was really "often runs slower with expr JIT"?	14:22
dogbert17	indeed	14:23
	if expr JIT is off the program will run 'fast' 9-10 times of ten	14:25
timo	so the average time went up, but the fastest reachable state was perhaps better or at least the same?
jnthnwrthngtn	timo: More like the chance of it running fast went up	14:26
	OK, so that seems to help, even if not a full solution
dogbert17	yes	14:27
	I guess it helped in your tests as well	14:28
timo	ah i meant when we turn it on, not when we turn it off
jnthnwrthngtn	Yes, compare baseline (first histogram) and stack-depth-in-frame (final histogram) in gist.github.com/jnthn/a3bfc7d0a32c...e46ee63706	14:29
dogbert17	just to avoid any misunderstandings :) 1 - most programs seem to run a bit faster when the expr JIT is turned off. 2 - some programs (e.g. the gist) can sometimes run quite a bit slower than usual.	14:48
	so with the gist on my machine with expr JIT enabled: slow ~35s, fast ~27s with a 40 vs 60 percent chance of occurring. Without expr JIT a fast run is at ~26s and a slow one at ~32s, but statistically it seems to run fast 90-95 percent of the time.	14:50
	with the patch from jnthnwrthngtn and with expr JIT on (i.e. normal settings) the stats are like 80% fast and 20% slow	14:54
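[Editor's sketch] The figures dogbert17 quotes can be turned into a rough expected run time per configuration. The sketch below assumes a simple two-outcome model (each run is either "fast" or "slow"), takes 90% as the fast probability without the expr JIT (the low end of the quoted 90-95%), and assumes the patched build keeps the same ~27s/~35s timings as the unpatched one, which the log does not state.

```python
def expected_seconds(fast, slow, p_fast):
    """Mean run time when a run is `fast` with probability p_fast, else `slow`."""
    return p_fast * fast + (1 - p_fast) * slow

# Timings and probabilities as quoted in the chat; see the assumptions above.
expr_jit_on = expected_seconds(fast=27, slow=35, p_fast=0.60)
expr_jit_off = expected_seconds(fast=26, slow=32, p_fast=0.90)
patched = expected_seconds(fast=27, slow=35, p_fast=0.80)

for label, value in [("expr JIT on", expr_jit_on),
                     ("expr JIT off", expr_jit_off),
                     ("patched, expr JIT on", patched)]:
    print(f"{label:22s} {value:5.1f}s")
```

Under those assumptions the means come out at roughly 30.2s, 26.6s and 28.6s, so most of the gap between the configurations comes from how often the slow mode is hit rather than from the ~1s difference between the fast runs.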
	dogbert17 continues his ramblings	14:56
	another example is thundergnat's 'White Noise' script, with expr JIT on the frame rate on my machine is ~74, if it's turned off the FPS is ~80	14:57
jnthnwrthngtn	I guess the smaller factor in the expr JIT being slower is missing devirt, and then wider variability in timings is about how well we make optimization decisions	14:58
	With the expr jit taking a bit longer to produce code making it more likely we get dubious timings
dogbert17	perhaps it's not relevant here but do you remember this commit comment:	14:59
	"This does mean that there is a discrepancy between when the template JIT wishes to add labels and when the lego JIT wishes to, which is worth a closer look, however even if that is figured out, it's still better not to do work in the template JIT that will ultimately be a throwaway due to a missing template."	15:00
jnthnwrthngtn	Yes, though I don't think that's a factor here.
dogbert17	ok, just thought I should throw it out there :)	15:01
jnthnwrthngtn	Best I can tell the problems relating to decision making mostly impact upon inlining choices
	*specialization ordering decisions	15:02
timo	i've been pondering like regularly throwing old spesh stuff out and redoing it with more info, and with outdated info tossed out. that is probably dependent on making that one object a collectable, forgot which it was, but it was difficult	15:03
15:04 ggoebel left
timo	could be more trouble than it's worth, dunno	15:08
	SETTING::src/core.c/ThreadPoolScheduler.pm6:602 does an allocation :o	15:10
15:10 ggoebel joined
timo	was pretty cool when the TPS supervisor didn't allocate at all	15:10
jnthnwrthngtn	timo: MasterDuke++ has a PR that does much of the work on letting us discard old candidates	15:12
	Based on excessive deopts
	I need to look at it now we've got new-disp merged
timo	ah, yes, MVMSpeshCandidate was heap-allocated and we wanted it gc-managed so we can let it die	15:13
lizmat	timo: what is exactly doing the allocation?
	the loop itself?
timo	MVM_frame_takeclosure it looks like	15:14
lizmat	timo: remind me why that is a loop { } and not an nqp::while(1) ?	15:15
timo	dunno
lizmat	I guess readability, but still
	afk for a few hours&
timo	the second time it hit the gc was in the clone op, probably cloning a Block object	15:16
	it does take a while for it to hit gc, so that's nice
15:17 brrt left
timo	one of the frames being invoked also has "please allocate on heap" set, which we set if we notice a frame tends to need to go on the heap, like when we take continuations	15:19
15:20 ggoebel left
timo	yeah it allocates at a very leisurely pace, nothing to worry about i guess	15:23
[Coke]	Nicholas: I was ready to come back and mention the gripping hand after reviewing, see you beat me... handily.	15:56
	<hades.gif>	16:00
MasterDuke	yeah, that remove-spesh-candidates PR was 99% done, hopefully just needed a final review and confirmation that something i thought was a bit odd was ok. i haven't tried rebasing it yet, i can probably start on that this weekend
16:40 patrickb left
jnthnwrthngtn	OK, so if we don't try and reconstruct stack depth in spesh, but instead just track stack depth in the interpreter and send that to spesh, we get a completely accurate stack depth, even accounting for continuations.	16:51
	That gets us to histograms like this: gist.github.com/jnthn/a3bfc7d0a32c...-depth-txt	16:52
	Which is a bimodal distribution. I did another run and it did the same.	16:53
	So I think this means that:
	1. Precise stack depths get rid of a lot of the cases where we do a little bit worse, leading to the majority of runs clustering around the lower end.	16:54
	2. Perhaps something else, not relating to the stack depths and thus ordering decisions, is going on to give us the few cases on the second, less common (on this machine) mode.	16:55
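[Editor's sketch] The difference between reconstructing stack depth on the consumer side and recording it at the source can be shown with a toy model. This is not MoarVM's spesh log or simulated stack; the event format and all function names below are hypothetical. The point is only that once a sampled log loses an entry, a depth counter rebuilt from enter/exit events drifts, while a depth stamped onto each surviving entry by the interpreter stays correct.

```python
def run_interpreter(calls):
    """Yield (event, frame, depth), with depth tracked by the interpreter itself."""
    depth = 0
    for event, frame in calls:
        if event == "enter":
            depth += 1
            yield (event, frame, depth)
        else:
            yield (event, frame, depth)
            depth -= 1

def reconstructed_depths(log):
    """Rebuild each frame's depth from enter/exit events alone (drifts on gaps)."""
    depth, result = 0, {}
    for event, frame, _recorded in log:
        if event == "enter":
            depth += 1
            result[frame] = max(result.get(frame, 0), depth)
        else:
            depth = max(depth - 1, 0)
    return result

def recorded_depths(log):
    """Use the depth the interpreter attached to each surviving entry."""
    result = {}
    for event, frame, recorded in log:
        if event == "enter":
            result[frame] = max(result.get(frame, 0), recorded)
    return result

calls = [("enter", "main"), ("enter", "a"), ("enter", "b"), ("exit", "b"),
         ("exit", "a"), ("enter", "c"), ("exit", "c"), ("exit", "main")]
full_log = list(run_interpreter(calls))
# Simulate a sampling gap: the entry for frame "a" never reaches the consumer.
sampled_log = [entry for i, entry in enumerate(full_log) if i != 1]
```

With the gap in place, `recorded_depths(sampled_log)` still reports `b` at depth 3 and `c` at depth 2, while `reconstructed_depths(sampled_log)` places them at 2 and 1.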
dogbert17	does this mean that you have another patch?	17:00
Geth	MoarVM/spesh-stability: 824fb58b72 | (Jonathan Worthington)++ | 7 files Track stack depths precisely in interpreter And pass those along to spesh. This means that it always has accurate stack depths to use when making decisions about ordering of the specializations.	17:07
jnthnwrthngtn	dogbert17: Yes, I'm curious what the results are on your machine
dogbert17	let's see	17:09
	ten attempts, five slow and five fast	17:17
jnthnwrthngtn	Interesting.	17:20
	Though curiously while I made it bi-modal, there were a few more entries towards the second mode than before	17:22
dogbert17	fwiw, 12 runs with MVM_SPESH_BLOCKING=1 were all 'fast'	17:27
Geth	MoarVM: e22a190b7d | (Geoffrey Broadwell)++ | 2 files Doc research on scoping for an AArch64 JIT port * Reasons to work on AArch64 as our second JIT platform * Currently known JIT porting risks * Required knowledge for porters * Potential development environments * Porting roadmap sketch * Alternate roadmap paths to consider * Example of expected x64 versus AArch64 tile differences * Resource links discovered during research	17:30
	MoarVM: 78fd4944f2 | (Geoffrey Broadwell)++ (committed using GitHub Web editor) | 2 files Merge pull request #1560 from japhb/master Doc research on scoping for an AArch64 JIT port
	MoarVM/spesh-stability: 5040b94723 | (Jonathan Worthington)++ | 2 files Slightly lower seems-monomorphic thresholds When we have a threshold of 150 invocations before specialization, it only takes one or two lost datapoints in sampling in order to end up determining something is not type-stable from the statistics. Loosen this up a little.	17:53
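[Editor's sketch] The commit message above explains why a strict type-stability threshold is fragile at 150 invocations. The sketch below assumes the decision is a simple percentage test (most common observed type tuple versus total invocations); the function name and the 99 and 98 percent values are hypothetical and are not taken from the MoarVM source.

```python
def seems_monomorphic(total_hits, tuple_counts, min_percent):
    """True if the most common type tuple covers at least min_percent of all hits.

    tuple_counts maps an observed type tuple to how many log entries carried it;
    hits whose log entry was lost appear in total_hits but in no tuple count.
    """
    if total_hits == 0 or not tuple_counts:
        return False
    top = max(tuple_counts.values())
    return 100 * top >= min_percent * total_hits

# 150 invocations, all really (Int,), but two datapoints were lost in sampling.
observed = {("Int",): 148}
strict = seems_monomorphic(150, observed, min_percent=99)    # hypothetical threshold
relaxed = seems_monomorphic(150, observed, min_percent=98)   # hypothetical threshold
```

Here 148 of 150 is about 98.7%, so the stricter hypothetical threshold rejects a callee that is in fact type-stable, while the slightly looser one accepts it.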
	MoarVM/spesh-stability: 1a2cf2d5b4 | (Jonathan Worthington)++ | src/spesh/log.c Ensure we always spesh log full type tuples If when we log an entry we don't have enough space in the spesh log to send all of the parameter types, then send the current log now, so we can record the complete entry and parameter types in the new one. This is because the spesh stats incorporation will, at log boundaries, discard any incomplete type tuple it sees.
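[Editor's sketch] The "send the current log early" idea from the commit message above can be illustrated with a small fixed-capacity buffer. This is a toy model rather than the code in src/spesh/log.c; the class, its methods and the entry layout (one record for the call plus one per parameter type) are assumptions made for the example.

```python
class LogSketch:
    """Fixed-capacity log that never splits one call's records across batches."""

    def __init__(self, capacity, send):
        self.capacity = capacity
        self.send = send          # callback that receives each completed batch
        self.entries = []

    def flush(self):
        """Hand the current batch to the consumer and start a fresh one."""
        if self.entries:
            self.send(self.entries)
            self.entries = []

    def log_call(self, frame, param_types):
        needed = 1 + len(param_types)   # entry record plus one per parameter
        if needed > self.capacity:
            raise ValueError("call needs more records than the log can hold")
        if len(self.entries) + needed > self.capacity:
            self.flush()                # send early so the tuple stays whole
        self.entries.append(("entry", frame))
        self.entries.extend(("param", frame, i, t)
                            for i, t in enumerate(param_types))
        if len(self.entries) == self.capacity:
            self.flush()

batches = []
log = LogSketch(capacity=4, send=batches.append)
log.log_call("f", ["Int", "Str"])   # uses 3 of 4 slots
log.log_call("g", ["Int", "Int"])   # would not fit, so the first batch goes out early
log.flush()
```

Each batch then holds complete calls only, so a consumer that drops incomplete tuples at batch boundaries has nothing to drop. The cost is that a batch may be sent with a few slots unused.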
jnthnwrthngtn	dogbert17: That final one seems to have gotten rid of the second mode for me, although I've only done one run (they're time-consuming, and I should go home and cook dinner :))	17:55
17:59 sena_kun left 18:02 reportable6 left
dogbert17	jnthnwrthngtn++, will check. Dinner sounds like an excellent idea :)	18:03
19:01 nebuchadnezzar left 19:04 reportable6 joined 19:19 dogbert17 left, dogbert17 joined
jnthnwrthngtn	Uff, back on my machine at home the results look no better at all	20:33
lizmat	m: sub a() {}; say a.^ver # jnthnwrthngtn does that seem like a sane thing ?	20:34
camelia	6.c
jnthnwrthngtn	Well, it's the same as saying Sub.^ver	20:37
	And I guess that's defined since 6.c :)	20:38
lizmat	I'm surprised we don't need the &
jnthnwrthngtn	oh hah :D	20:39
	Well, then it's same as Nil.^ver, and same story :)
	(It's calling the sub, which returns Nil)
	It can't be about the Sub itself, because .^ means it's about the type.	20:40
	(Even if an & were present, I mean)
lizmat	m: class a:ver<0.0.1> { }; sub a() { }; say a.^ver		Copy link Message link Add to gist Remove Run code
camelia	v0.0.1		Copy link Message link Add to gist Remove
lizmat	so how does that work then ?
	I guess it's in the grammar handling .^ right ?	20:41
jnthnwrthngtn	Not at all, it's just how we parse names
	If it's a type it's a term, if not it's a listop
	The decision is made before we see the .^	20:42
	See term:sym<name> (iirc) in the grammar
lizmat	ok
jnthnwrthngtn	m: class say {}; say 42
camelia	===SORRY!=== Error while compiling <tmp> Two terms in a row at <tmp>:1 ------> class say {}; say⏏ 42 expecting any of: infix infix stopper statement end statement modifier st…
lizmat	well, at least I've figured out how App::Mi6 insists on versioning some of my modules as "v6.c" :-)	20:43
jnthnwrthngtn	Another example :)
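The name-parsing rule discussed above (a known type parses as a term, anything else as a listop, decided before the `.^` is seen) can be modelled roughly as follows. This is a toy Python sketch, not Rakudo's grammar; `classify_name`, `meta_ver`, and the lookup tables are hypothetical and exist only to mirror the two `camelia` results in this log.

```python
def classify_name(name, known_types):
    """Decide how a bare identifier parses, before any postfix is seen."""
    return "term" if name in known_types else "listop"


def meta_ver(name, known_types, subs, type_versions):
    """Model `name.^ver` under the term-versus-listop rule."""
    if classify_name(name, known_types) == "term":
        # The name denotes the type itself, so .^ver reports that type's version.
        return type_versions[name]
    # Listop: the sub is called first; .^ver then applies to the type
    # of whatever the call returned.
    result_type = subs[name]()
    return type_versions[result_type]


# Assumed toy data: an empty sub returns Nil, and Nil reports version 6.c.
subs = {"a": lambda: "Nil"}
type_versions = {"Nil": "6.c", "a": "v0.0.1"}

# Only `sub a() {}` is declared: `a` is a listop, so we get Nil's version.
only_sub = meta_ver("a", {"Nil"}, subs, type_versions)

# `class a:ver<0.0.1>` is also declared: `a` is now a term naming the type.
with_class = meta_ver("a", {"Nil", "a"}, subs, type_versions)

print(only_sub)    # 6.c
print(with_class)  # v0.0.1
```

The point of the sketch is that the postfix never influences the classification: once a type named `a` is in scope, the sub of the same name is no longer what the bare name refers to.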
lizmat	when using App::Mi6 you cannot have a distribution "foo" exporting just a sub "foo"	20:46
	as it will always set the version to "v6.c" because of this behaviour :-(
	contemplating solutions overnight&
moon-child	just noticed: moarvm.org/contributing.html still mentions freenode	21:03
21:17 linkable6 left, evalable6 left, evalable6 joined 21:19 linkable6 joined 21:36 ggoebel joined
dogbert17	jnthnwrthngtn: I got 6 fast and 4 slow with your latest fixes	21:54
	btw, did you get some nice food?
22:06 ggoebel left 22:38 ggoebel joined 22:49 colemanx left 23:01 ggoebel left
jnthnwrthngtn	I cooked some pasta and sausage dish and it was nice :)	23:04
	Goodness, I have numbers and they are weird	23:17
	Unlike the sort-of-modality I saw on my office machine before it became bimodal, it's bimodal from the start (before any of my changes) on my machine at home	23:18
	The following are the percentages of runs in which the desirable modality is achieved	23:19
	52% before any changes
	63% with just simple depth improvements
	65% with just complex depth improvements
	65% with just lower type stability percentages
	64% with just logging changes for full type tuples
	57% with all improvements together
	It gets better, if you combine lower stability percentages with logging changes for full types you get 53% and a trimodal distribution	23:35
	At this point I'm hoping this is a bad dream I'll wake up from, or that I'm miraculously drunk on two beers and doing the measurements wrong.	23:36
dogbert17	it's an odd problem, not easily solved it seems	23:37
	guess it depends on the beers, are they from some monastery in Belgium by any chance :)	23:41
jnthnwrthngtn	Alas no, they're two tasty but not especially strong IPAs.
	53% again for depth improvements + lower thresholds, although the histogram is even weirder	23:52
