Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021.
Geth | MoarVM: MasterDuke17++ created pull request #1940: Fix two GCC compiler warnings | 00:33
lizmat | And yet another Rakudo Weekly News hits the Net: rakudoweekly.blog/2025/05/19/2025-...s-is-mini/ | 10:01
13:19 JimmyZhuo joined
13:33 JimmyZhuo left
16:09 kjp left
16:10 kjp joined
jnthn | patrickb: (prevent dispatch programs from being recorded) You probably don't want to do that, because recording is the expensive step, but you can either 1) return something that receives the callee, does the work, and calls the callee, or 2) do it using dispatch resumptions, as used for example in a `proto` with a body, which needs to run the body and then continue the multi-dispatch at the {*} | 20:58
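A minimal plain-C analogy of option 1 (a hedged sketch only: the real mechanism is an NQP dispatch program delegating to the callee, and every name here is illustrative, not MoarVM API):

    #include <stdio.h>

    /* Option 1 in miniature: rather than suppressing recording, hand
     * control to a wrapper that receives the original callee, does the
     * per-call work, then invokes the callee with the same arguments. */
    typedef int (*callee_fn)(int);

    static int work_then_call(callee_fn callee, int arg) {
        printf("doing the extra work before the call\n");
        return callee(arg);
    }

    static int target(int x) { return x * 2; }

    int main(void) {
        return work_then_call(target, 21) == 42 ? 0 : 1;
    }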
.tell MasterDuke I somewhat suspect that, just as SIMD instructions have been used for various kinds of fast parsing and validation, one could write something that makes good use of them to at least blast through UTF-8 content and verify that it is already in normalized form (at least "all in the ASCII range and no \r" can tell you that to a limited degree). A multi-pass algorithm has poor memory locality, so I'm not surprised that fast | 21:08
tellable6 | jnthn, I'll pass your message to MasterDuke
jnthn | UTF-8 decoding followed by normalization wasn't faster. Trouble is, what if you chew through a meg and only then spot a problem? But I guess you could make a strand for the OK part and then continue with the full path. | 21:09
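A sketch in plain C of the limited check jnthn mentions: a single pass verifying the buffer is all-ASCII with no \r, in which case the content is already in normalized form. The bitwise reduction auto-vectorizes well; the function name and shape are illustrative, not MoarVM code:

    #include <stddef.h>
    #include <stdint.h>

    /* Returns 1 when buf is pure ASCII and contains no \r; such content
     * needs no normalization work. One pass, branch-light loop body that
     * compilers can auto-vectorize (or that SIMD can blast through). */
    static int ascii_no_cr(const uint8_t *buf, size_t len) {
        uint8_t acc = 0;
        int saw_cr = 0;
        for (size_t i = 0; i < len; i++) {
            acc |= buf[i];               /* any high bit => non-ASCII   */
            saw_cr |= (buf[i] == '\r');  /* \r may join a \r\n grapheme */
        }
        return (acc & 0x80) == 0 && !saw_cr;
    }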
21:36 mandrillone joined
mandrillone | What if Unicode normalization were done only when it is needed, that is, when comparing strings? | 21:37
It often happens that one compares a single string with many others. In that case you can normalize that single string, and then the algorithm for everything else is very simple (bar some unrealistic corner cases) | 21:38
It is rare to sort long lists of strings | 21:39
for other uses, such as counting graphemes, you don't really need full-fledged normalization | 21:41
also it is something you rarely do IMHO
If you go with "Unicode normalization only when needed", then you need to make string comparison non-commutative from the performance perspective. A convention can be chosen for which side is the pivot | 21:43
or some heuristic implemented to detect that
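A rough C sketch of the "normalize only on comparison" idea, assuming utf8proc as the normalizer (MoarVM has its own); LazyStr and the function names are hypothetical, and error handling is elided:

    #include <string.h>
    #include <utf8proc.h>  /* assumed normalizer, standing in for MoarVM's own */

    /* Store the string exactly as received; compute and cache its NFC
     * form the first time a comparison actually needs it. */
    typedef struct {
        char *raw;  /* NUL-terminated UTF-8, as stored */
        char *nfc;  /* cached NFC form, NULL until first needed */
    } LazyStr;

    static const char *nfc_of(LazyStr *s) {
        if (!s->nfc)
            s->nfc = (char *)utf8proc_NFC((const utf8proc_uint8_t *)s->raw);
        return s->nfc;
    }

    static int lazy_str_eq(LazyStr *a, LazyStr *b) {
        /* Fast path: byte-identical inputs are equal under any
         * normalization form, so neither side gets normalized. */
        if (strcmp(a->raw, b->raw) == 0)
            return 1;
        return strcmp(nfc_of(a), nfc_of(b)) == 0;
    }

The cached nfc field plays the role of the pivot convention above: a string compared many times pays for normalization once.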
21:43 mandrillone left
lizmat | mandrillone: how would that handle .chars ? | 21:53
tellable6 | lizmat, I'll pass your message to mandrillone
21:55 rakkable left, rakkable joined
jnthn | By making it O(n) instead of O(1), I guess. And I dunno how one is meant to apply normalization to only one of the strings involved in a comparison and expect a sensible result, nor how to know which of the strings is the more-compared one locally. | 22:09
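For context on that O(n): without eager normalization, counting graphemes means walking every codepoint. A sketch using utf8proc's iteration and grapheme-break API (the function name is illustrative; MoarVM's NFG records the grapheme count up front, which is what keeps .chars O(1)):

    #include <utf8proc.h>

    /* O(n) grapheme count over raw UTF-8, roughly what .chars would
     * have to do on every call if strings were stored un-normalized.
     * Uses the stateless break check; the stateful variant handles the
     * full extended grapheme cluster rules (e.g. emoji ZWJ sequences). */
    static long grapheme_count(const utf8proc_uint8_t *buf,
                               utf8proc_ssize_t len) {
        utf8proc_int32_t prev = -1, cp;
        utf8proc_ssize_t i = 0;
        long count = 0;
        while (i < len) {
            utf8proc_ssize_t n = utf8proc_iterate(buf + i, len - i, &cp);
            if (n <= 0)
                break;  /* invalid sequence; error handling elided */
            if (prev < 0 || utf8proc_grapheme_break(prev, cp))
                count++;  /* a break before cp starts a new grapheme */
            prev = cp;
            i += n;
        }
        return count;
    }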
23:46 mandrillone joined
mandrillone | Indeed, O(n) instead of O(1). Possibly with caching of the result | 23:48
tellable6 | 2025-05-19T21:53:13Z #moarvm <lizmat> mandrillone: how would that handle .chars ?
mandrillone | If normalization is applied lazily, you expect a small number of synthetics. The second string doesn't need normalization, since any grapheme that doesn't belong to the first string can just be handled on the fly | 23:51
23:54 mandrillone left
23:55 mandrillone joined
mandrillone | Thing is, string comparison is such a frequent operation that you'd expect some form of caching. The base algorithm would just store the string as-is and then normalize on the fly on each comparison. By starting with this terrible algorithm, one can improve it a lot step by step | 23:56
focusing on meaningful optimizations
23:56 mandrillone left