#moarvm on 4 February 2025 - Raku Programming Language Log

00:42 MasterDuke joined
MasterDuke	timo: i had `-march=native` set for my desktop (x86-64), but not the laptop (aarch64). setting on the laptop slightly improved times	00:43
	adding `&& !writing_32bit` to the loop condition (to break out early) meant clang couldn't vectorize it (didn't try gcc)	00:45
timo	OK, presumably there's a way to get it to not have trouble with that. do you think putting the definition of the loop variable inside the for (***; ;) could help? it shouldn't keep the variable alive after, so it wouldn't matter if it's the number of the spot it stopped exactly	00:49
MasterDuke	nope, clang still says it can't vectorize	00:51
	also, i have a version that checks for crlf in that same loop, and that speeds things up even more (but makes an early exit even harder to do)	00:52
	let me push that so you can see	00:54
	done	00:57
	my test corpus is perhaps not very good though	00:59
	172 10	01:00
	255 65
	588 57
	1243 49
	2389 41
	4821 33
	8560 25
	17693 17
	45023 9
	205321 1
	count of lengths
	i really should just stick a print in and capture what's decoded during a rakudo build...	01:01
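
A rough C sketch of the early-exit trade-off discussed above. It is not the code from MasterDuke's branch or from MoarVM. It assumes `writing_32bit` means "a byte above 127 was seen", and the names `ScanResult`, `scan_latin1`, `needs_32bit_chunked` and `CHUNK` are made up for illustration. The first function does one pass with a fixed trip count and no data-dependent exit, tracking both the high bit and CRLF pairs. The second shows one common compromise: run fixed-size blocks without an exit and only test for the exit between blocks. Whether a given compiler actually vectorizes either loop has to be checked, for example with clang's `-Rpass=loop-vectorize` and `-Rpass-missed=loop-vectorize` remarks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for illustration; not a MoarVM structure. */
typedef struct {
    int    needs_32bit; /* 1 if any byte has its high bit set (> 127) */
    size_t crlf_pairs;  /* number of "\r\n" sequences in the buffer */
} ScanResult;

/* One pass, no early exit: the trip count depends only on len, so the
 * compiler is free to process several bytes per iteration. */
static ScanResult scan_latin1(const uint8_t *buf, size_t len) {
    assert(buf != NULL || len == 0);
    ScanResult r = { 0, 0 };
    uint8_t acc = 0;
    if (len > 0)
        acc = buf[0];
    for (size_t i = 1; i < len; i++) {
        acc |= buf[i];
        /* Branch-free CRLF test on the previous and current byte. */
        r.crlf_pairs += (size_t)((buf[i - 1] == '\r') & (buf[i] == '\n'));
    }
    r.needs_32bit = (acc & 0x80) != 0;
    return r;
}

enum { CHUNK = 64 }; /* block size is an arbitrary choice for this sketch */

/* Early exit only between blocks: each inner loop has a constant trip
 * count, and the exit test runs once per CHUNK bytes. */
static int needs_32bit_chunked(const uint8_t *buf, size_t len) {
    assert(buf != NULL || len == 0);
    size_t i = 0;
    for (; len - i >= CHUNK; i += CHUNK) {
        uint8_t acc = 0;
        for (size_t j = 0; j < CHUNK; j++)
            acc |= buf[i + j];
        if (acc & 0x80)
            return 1;
    }
    /* Tail shorter than one block. */
    uint8_t acc = 0;
    for (; i < len; i++)
        acc |= buf[i];
    return (acc & 0x80) != 0;
}
```

For inputs shorter than `CHUNK` the chunked version never gets a chance to exit early and degenerates to the plain loop, so how much it helps depends on the length distribution of the strings actually being decoded.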
01:12 MasterDuke left
timo	do you think a special case for a single character could be worth it?	01:25
	i wonder what percentage come from the string heap in .moarvm files	01:26
02:47 MasterDuke joined
MasterDuke	timo: i got a print of what's decoded during a rakudo build, characteristics are a bit different:	02:48
	121992 11
	135586 12
	142789 9
	150224 10
	158566 8
	179682 7
	183802 3
	534090 4
	561218 5
	612082 6
	so based on this, no, i don't think special casing a single character is a good idea	02:49
	that's the problem with benchmarks...
	btw, this is my benchmark, very open to suggestions. `MVM_SPESH_BLOCKING=1 raku -e 'use nqp; my str $f; for ^10 { for "latin1".IO.slurp.lines -> str $s { my $a := nqp::list_i; nqp::encode($s, "iso-8859-1", $a); $f = nqp::decode($a, "iso-8859-1") }; }; say now - INIT now; say $f'`	02:52
	don't like how i'm also encoding everything, but i don't really grok encoding/decoding, so not sure if there's a way to avoid it in this case	02:53
timo	you can slurp the file as binary, then you get a buf out. haven't tried how .lines works with binary handles. with just \n as the line end byte i imagine it would work fine?	03:02
MasterDuke	hm. so `.slurp(:bin).lines -> $a { ...` ?	03:05
	`No such method 'lines' for invocant of type 'Buf[uint8]'`	03:06
	my branch doesn't seem to be as much faster with this corpus as the input to the benchmark	03:10
timo	ah sorry it would have to be .IO.lines or so	03:14
	slurp gives the buffer, you want to use the lines iterator on the IO::Handle or IO::Path
	maybe make the array of bufs eagerly up front so it doesn't mix between the decoding work	03:15
MasterDuke	? in the past i've seen IO.lines as quite a bit slower than slurping and then .lines
timo	may want to put the "read file into array of bufs" outside the "for ^10" for the measurement then?	03:17
MasterDuke	well, given this file is quite a bit bigger than the previous one i've just ditched the `for ^10` part	03:18
	but i don't know how to not do the encode each time
timo	encode is what you do to go from Str to Buf	03:20
MasterDuke	yeah. it would be nice to slurp/lines directly into a Buf	03:21
timo	check what IO::Handle.get does when the handle is opened without an encoding	03:23
	depending on whether you set chomp on or off you'll either have strings where it's got a newline at the end every time, or not a single time	03:24
MasterDuke	but then i'll still need to turn the string into a buf	03:25
timo	i was hoping IO::Handle without encoding (or i guess with :bin?) gives you buf instead of Str	03:27
MasterDuke	`Cannot do 'get' on a handle in binary mode`	03:28
timo	>:(	03:29
	you know, i think we should just put support in for that for people who really know that it's really what they want and what can go wrong %)
	i need to go to bed i have a headache :(	03:30
MasterDuke	good luck with that. about to go to bed here too	03:31
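
timo's suggestion above (build the array of per-line buffers eagerly, outside the timed region, so only the decoding work is measured) can be sketched in C. This is an illustration of the measurement layout only, not the Raku benchmark and not MoarVM code: `LineSlice`, `split_lines`, `fake_decode` and `time_decode` are hypothetical names, and `fake_decode` is a stand-in that just counts bytes above 127 where the real benchmark would call the decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* A view into the slurped file; the bytes are not copied. */
typedef struct {
    const uint8_t *start;
    size_t         len;
} LineSlice;

/* Split buf on '\n' (newline excluded from each slice). A trailing
 * segment without a final newline counts as a line if it is non-empty.
 * Returns a malloc'd array the caller must free, or NULL on failure. */
static LineSlice *split_lines(const uint8_t *buf, size_t len, size_t *count_out) {
    assert(count_out != NULL);
    size_t cap = 16, n = 0, start = 0;
    LineSlice *lines = malloc(cap * sizeof *lines);
    if (!lines)
        return NULL;
    for (size_t i = 0; i <= len; i++) {
        if (i == len || buf[i] == '\n') {
            if (i == len && start == len)
                break; /* nothing left after the last newline */
            if (n == cap) {
                cap *= 2;
                LineSlice *tmp = realloc(lines, cap * sizeof *lines);
                if (!tmp) {
                    free(lines);
                    return NULL;
                }
                lines = tmp;
            }
            lines[n].start = buf + start;
            lines[n].len = i - start;
            n++;
            start = i + 1;
        }
    }
    *count_out = n;
    return lines;
}

/* Stand-in for the decode call under test: counts bytes above 127. */
static size_t fake_decode(const uint8_t *p, size_t len) {
    size_t high = 0;
    for (size_t i = 0; i < len; i++)
        high += p[i] >> 7;
    return high;
}

/* Time only the decode loop; the slices were prepared beforehand.
 * The accumulated result is written to *sink so the work is not
 * optimized away. Returns CPU seconds as reported by clock(). */
static double time_decode(const LineSlice *lines, size_t n, int passes, size_t *sink) {
    size_t total = 0;
    clock_t t0 = clock();
    for (int p = 0; p < passes; p++)
        for (size_t i = 0; i < n; i++)
            total += fake_decode(lines[i].start, lines[i].len);
    clock_t t1 = clock();
    *sink = total;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Reading the file and calling `split_lines` happen once, before `time_decode` starts its clock, which mirrors moving the "read file into array of bufs" step outside the `for ^10` loop.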
03:42 MasterDuke left
09:01 sena_kun joined
09:56 sugarbee1 left
09:57 sugarbeet joined
10:39 El_Che left
18:42 japhb left
18:49 japhb joined
[Coke]	looks like we have a mix of 3rdparty https:// and git@ submodules. Any interest in standardizing?	18:56
ugexe	as we don't update those repositories we should use https	19:33
	https is generally recommended as well	19:34
	at $work we've explicitly gone through everything and changed them to https. we use github and github enterprise	19:36
19:47 [Coke] left, [Coke]_ joined
21:34 [Coke]_ is now known as [Coke]
[Coke]	what is 3rdparty/sha1 ? has a remote of moarvm itself?	21:34
	ah, it's not a 3rdparty. why are we storing it there?
timo	probably because it's so small? and unlikely to need to change?	21:38
[Coke]	why not keep it with the other source?	21:48
	ah, to keep licensing more smooth, probably
timo	that at least sounds like a good reason to me	21:50
22:29 sena_kun left