#moarvm on 6 February 2023 - Raku Programming Language Log

Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021.
timo1	this isn't microoptimization, this is nanooptimization, except when you do nanotechnology you can do impressive things you wouldn't be able to do with regular "small stuff", and nanooptimization is just useless :P	09:52
nine	It looks impressive though ;)	09:53
timo1	this improvement in the bytecode came from changing the nqp source code tho, so it really literally only applies in this one frame	10:41
nine	It's a common frame though	10:42
lizmat	and yet another Rakudo Weekly News hits the Net: rakudoweekly.blog/2023/02/06/2023-...en-davies/	11:53
el gatito (* advocate)*	what is a "frame"?	13:41
Voldenet	stack frame perhaps	13:49
nine	Actually in this context a piece of bytecode, i.e. a code block	13:53
el gatito (* advocate)*	oh	13:54
timo1	nine: are we talking about the same code? the piece of code inside EXPORTHOW.nqp inside the nqp source?	14:04
	does this actually run more than once?
nine	I guess once in every process?	14:06
timo1	i see it up to twice (after putting an nqp::sin_n that fprintf stderr in) during build	14:10
	nqp startup is only 0.04 so this can barely do anything :P	14:14
japhb	I'd love to get that down an order of magnitude, but I suspect that requires more large-scale engineering. :-)	14:16
timo1	for sure
nine	I have always wondered what exactly we spend that 100ms on in rakudo startup	14:17
timo1	we do a load of deserialization of stuff for example	14:18
nine	Is there so much that we have to deserialize right away?	14:19
timo1	the "work queue" nature of the deserialization work code makes it a little tricky to attribute work to what "caused" it	14:24
japhb	Deserialization is lazy, isn't it?	14:44
	Kindof wonder if there's any performance benefit to figuring out what needs to be deserialized for -e '' and just doing that always, as fast as we can.	14:45
timo1	deserialization is lazy, yes	14:47
	you're thinking maybe less "context switches" would benefit startup performance if we blaze through a whole chunk of serialized data ahead of time?	14:49
nine	Even more if we store that stuff close together and benefit from caching	14:53
timo1	how hard is it going to be to reshuffle objects in the serialized blob?	14:54
japhb	Yeah, what both of you said
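
The attribution problem and the "do it all up front" idea from the exchange above can be illustrated with a toy model. This is only a sketch: `BLOB`, `demand` and `load_all` are hypothetical names invented here, and nothing in it reflects MoarVM's actual serialization format or code. It shows a worklist-driven lazy load, where an object needed by two parents is charged to whichever one happened to queue it first, next to an eager linear pass in storage order.

```python
from collections import deque

# Toy serialized blob: index -> (name, indices of objects it references).
# Purely illustrative; this is NOT MoarVM's serialization format.
BLOB = {
    0: ("Str", [1]),
    1: ("Mu", []),
    2: ("Int", [1]),
    3: ("Rat", [2, 0]),
}


def demand(index, loaded, trace):
    """Lazily load `index` and everything reachable from it via a worklist.

    `trace` records (object, object that queued it) pairs, so we can see
    which parent each load ends up attributed to.
    """
    worklist = deque([(index, None)])
    while worklist:
        current, cause = worklist.popleft()
        if current in loaded:
            continue  # already loaded on behalf of somebody else
        name, refs = BLOB[current]
        loaded[current] = name
        cause_name = BLOB[cause][0] if cause is not None else "<demand>"
        trace.append((name, cause_name))
        for ref in refs:
            worklist.append((ref, current))


def load_all(loaded):
    """Eager alternative: one linear pass over the blob in storage order."""
    for index in sorted(BLOB):
        loaded.setdefault(index, BLOB[index][0])


if __name__ == "__main__":
    loaded, trace = {}, []
    demand(3, loaded, trace)
    for name, cause in trace:
        print(f"{name} loaded because of {cause}")
```

In this toy, both `Int` and `Str` reference `Mu`, but the trace only charges `Mu` to `Int` because that entry reached the front of the queue first. Real attribution has the same order dependence, which is the difficulty timo1 describes.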
Woodi	maybe just serializing it once, compiling it, saving as executable to disc and then just loading it at startup ? if it is too specific then options for compiling different executables, eg. for one-liners, for long running services ? and if it can un-jit when needed then nothing is lost	15:57
Woodi	ultimate option: generating asm code / executable that just do what asked, like generating grep-like binary without object stuff...	16:01
Voldenet	perl5 starts in 5ms, nqp takes 33ms to start, raku takes 140ms to start, so nqp takes around 25% of startup time	16:07
timo1	the nqp command does some things that the raku command doesn't have to, i'm not sure how much sense it makes to express nqp as a fraction of raku startup	16:08
	for example, rakudo shouldn't load the nqp grammar, actions, world, and optimizer	16:09
Voldenet	I see, so the only real way to measure perf reliably is to actually use the profiler	16:11
timo1	i would say that is accurate, yeah
	don't forget that rakudo doesn't start up too much slower than perl5 if you include some support for classes like moo or moose or whichever	16:12
Voldenet	Moo itself takes 15ms to load	16:15
timo1	moose takes a lot longer, right? since moo is kind of "moose but lighter"?	16:16
Voldenet	Yeah, moo and mouse are a lot faster, moose takes 120ms	16:18
nine	While that makes us look less bad, it distracts from the fact that we could load a lot faster	16:19
timo1	right, not saying we shouldn't improve load times, it is definitely a goal up there in terms of priority	16:20
	you think perl5 without any modules is something we could eventually reach? or at least load in 2x to 3x the time?	16:24
nine	I honestly don't know. Perl doesn't have to do much when starting up	16:28
Voldenet	python3: 23ms, nodejs: 60ms, ruby: 50ms	16:29
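
Startup figures like the ones quoted above can be collected with a small harness. The log does not say how Voldenet measured them, so this is a minimal sketch under assumptions: `time_startup` and `fraction_of` are hypothetical helper names, and the harness simply takes the median wall-clock time of running an empty program several times.

```python
import statistics
import subprocess
import time


def time_startup(cmd, runs=10):
    """Return the median wall-clock seconds needed to run `cmd` to completion."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def fraction_of(part_ms, whole_ms):
    """Express one startup time as a fraction of another."""
    return part_ms / whole_ms
```

Assuming the interpreters are on `PATH`, calls such as `time_startup(["raku", "-e", ""])` or `time_startup(["perl", "-e", ""])` time an empty program. With the numbers from the log, `fraction_of(33, 140)` is about 0.24, in line with the "around 25%" estimate, though as timo1 points out the `nqp` command does work that `raku` does not, so the ratio should be read loosely.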
timo1	rakudo runs just over 200M branches during startup and misses 1.94% of them, where python runs 15M and misses 4.20% (lol) of them	16:30
	how do we feel about turning spesh on a little later than when the program starts?	16:36
	disabling spesh gives me a time of 0.181 wallclock vs spesh enabled gives 0.178, but the task-clock is 171msec vs 255msec, which just means when spesh is on we use 1.43 cpus and when it's off we use 0.95 cpus	16:37
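
The arithmetic behind these figures can be spelled out. This sketch only re-derives numbers from the values quoted in the log; `cpus_utilized` and `missed_branches` are hypothetical helper names, and the inputs are the rounded values from the messages above, so results can differ slightly in the last digit from what the tool printed.

```python
def cpus_utilized(task_clock_ms, wallclock_s):
    """CPU time consumed divided by elapsed time, i.e. average CPUs in use."""
    return (task_clock_ms / 1000.0) / wallclock_s


def missed_branches(total_branches, miss_rate_percent):
    """Absolute number of mispredicted branches for a given miss rate."""
    return total_branches * miss_rate_percent / 100.0


if __name__ == "__main__":
    print(f"spesh on:  {cpus_utilized(255, 0.178):.2f} CPUs")
    print(f"spesh off: {cpus_utilized(171, 0.181):.2f} CPUs")
    print(f"rakudo misses: {missed_branches(200e6, 1.94):,.0f}")
    print(f"python misses: {missed_branches(15e6, 4.20):,.0f}")
```

With spesh on this gives about 1.43 CPUs, matching the log. With spesh off the rounded inputs give roughly 0.94 rather than the quoted 0.95, which is within rounding of the inputs. In absolute terms rakudo mispredicts about 3.88M branches against about 0.63M for python, so the lower miss rate still means roughly six times as many misses.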
Voldenet	page-faults is especially big number on raku (17045) compared to python3 (1111) nodejs (2494) ruby (2263)	16:46
timo1	yeah that probably has something to do with how much ram we use also	16:49
	and how much of the files we map we read from probably?
Voldenet	ram usage probably matters a bit, but nodejs uses 3x more rss than ruby and there's not much of a difference in startup time	16:53
timo1	interesting	16:55
	we can use perf to measure where page faults tend to happen
	haha, 61% in __memset_sse2_unaligned_erms, 13.4% mi_page_free_list_extend, 9.18% in _dl_relocate_object	16:57
	here, MVM_bytecode_unpack has 2.28%, maybe_grow_hash lands at 1.26%, MVM_spesh_log_entry at a surprising (to me) 1.1%, another .85% in MVM_spesh_log_decont, 0.73% in MVM_serialization_demand_object, 0.55% in MVM_spesh_log_type	16:59
	this is the page-faults performance counter, not the minor-faults one	17:00
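
A per-symbol breakdown like the one above can be produced by sampling on the `page-faults` event with Linux `perf`. The exact invocation timo1 used is not in the log, so the following is a sketch of one plausible way to do it: the helper names, the `perf.data` output path and the `raku -e ''` target are assumptions, and the code only builds the command lines rather than running them.

```python
import shlex


def perf_record_cmd(target, event="page-faults", output="perf.data"):
    """Build a `perf record` command that samples `event` with call graphs."""
    return ["perf", "record", "-e", event, "-g", "-o", output, "--", *target]


def perf_report_cmd(input_file="perf.data"):
    """Build a `perf report` command that prints the breakdown as plain text."""
    return ["perf", "report", "-i", input_file, "--stdio"]


if __name__ == "__main__":
    # Assumed target: an empty Raku program, as in the startup discussion.
    print(shlex.join(perf_record_cmd(["raku", "-e", ""])))
    print(shlex.join(perf_report_cmd()))
```

Passing `event="minor-faults"` instead would select the other counter timo1 mentions, which makes it easy to compare the two profiles.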
Voldenet	so apparently just allocating larger chunks could improve performance	17:01
timo1	for spesh logs we already allocate one big chunk that we just write data to linearly	17:02
Voldenet	I can see why page faults here are surprising
