#raku-dev on 4 March 2026 - Raku Programming Language Log

🦋 Welcome to the IRC channel of the core developers of the Raku Programming Language (raku.org #rakulang). This channel is logged for the purpose of history keeping about its development | evalbot usage: 'm: say 3;' or /msg camelia m: ... | Logs available at irclogs.raku.org/raku-dev/live.html | For MoarVM see #moarvm Set by lizmat on 8 June 2022.
08:42 sivoais left
11:02 finanalyst joined
11:14 finanalyst left
lizmat	this feels LTA:	12:00
	m: 56792.chr
camelia	( no output )
lizmat	m: say 56792.chr
camelia	Error encoding UTF-8 string: could not encode Unicode Surrogate codepoint 56792 (0xDDD8) in block <unit> at <tmp> line 1
lizmat	ShimmerFairy ^^	12:01
	it's nqp::encoderepconf() throwing in Encoding::Encoder::Builtin.encode-chars	12:02
timo	really depends on whether we want Str in NFG to be able to hold a lone surrogate codepoint or not	13:33
	it makes sense that it explodes when trying to encode it to a buf in order to print it out	13:34
	m: 56792.chr.encode("utf8-c8").say
camelia	Error encoding UTF-8 string: could not encode Unicode Surrogate codepoint 56792 (0xDDD8) in block <unit> at <tmp> line 1
timo	i guess that's not how you get that out of there huh
	gist.github.com/milseman/c22a0413d...07e3ee7c8b - surely interesting to cross-reference	13:40
[Coke]	m: 56792.chr.NFKD.say	13:45
camelia	NFKD:0x<ddd8>
16:32 sivoais joined
16:48 [Coke]_ joined
16:51 [Coke] left
18:38 [Coke]_ is now known as [Coke]
Geth	rakudo/main: a97c7a33c1 | (Elizabeth Mattijsen)++ | src/core.c/IO/Path.rakumod Rework IO::Path.slurp(:bin) Inspired by af30c7bed30b725a12 Instead of always asking for the filesize beforehand, it now asks for the filesize if the initial read filled the initial size buffer (1MB). If that exceeds INT_MAX (minus 1MB) a slow path is taken ... (9 more lines)	19:30
lizmat	m: use nqp; my $b = nqp::setelems(Buf.new,0x080000000).decode	19:32
camelia	MoarVM panic: Memory allocation failed; could not allocate 2147483648 bytes
lizmat	m: use nqp; my $b = nqp::setelems(Buf.new,0x07fffffff).decode
camelia	MoarVM panic: Memory allocation failed; could not allocate 2147483647 bytes
lizmat	m: use nqp; my $b = nqp::setelems(Buf.new,0x07ffffff).decode
camelia	( no output )	19:33
lizmat	m: use nqp; say nqp::setelems(Buf.new,0x07ffffff).decode.chars
camelia	134217727
lizmat	m: use nqp; say nqp::setelems(Buf.new,0x07ffffff).decode.chars.base(16)
camelia	7FFFFFF
lizmat	m: use nqp; say nqp::setelems(Buf.new,0x07fffffff).decode.chars.base(16)	19:34
camelia	MoarVM panic: Memory allocation failed; could not allocate 2147483647 bytes
lizmat	weird, that works on my machine, but then again that has 64G
	0x07fffffff works for me, 0x08000000 fails	19:35
	so I guess I will put in a check for > 0x07fffffff before trying to decode
timo	PIO? oh yeah that's uhhh Physical Input/Output of course :)	19:51
20:27 patrickb left
20:41 patrickb joined
20:47 finanalyst joined
Geth	rakudo/main: 06f16f6d58 | (Elizabeth Mattijsen)++ | 2 files Provide better error when trying to decode too large blobs Apparently at least on some OSes (and/or on MoarVM) it's impossible to decode buffers that have more than 0x07fffffff elements: this used to throw a hard untrappable memory exceeded error thrown from the guts of the VM. This adds logic to check the number of elements in the blob to be decoded, and produces a Failure if the number of elements is too large: Too many bytes to decode: 2418709326 is more than 2147483647	21:03
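
The guard described in that commit message can be sketched roughly as follows. This is an illustration in Python, not the Rakudo change itself: the names `checked_decode` and `MAX_DECODABLE` are hypothetical, and where the commit produces a Raku Failure, this sketch raises a `ValueError`. The idea is only to show checking the element count up front so the caller gets a trappable error instead of a VM-level panic.

```python
# Limit taken from the commit message: 0x7fffffff == 2147483647 elements.
MAX_DECODABLE = 0x7FFFFFFF  # hypothetical constant name


def checked_decode(blob: bytes, encoding: str = "utf-8",
                   limit: int = MAX_DECODABLE) -> str:
    """Decode blob, refusing early when it has more elements than limit.

    The size check happens before any decoding work, so an oversized
    input yields an ordinary, catchable error.
    """
    n = len(blob)
    if n > limit:
        # Message format mirrors the one quoted in the commit message.
        raise ValueError(f"Too many bytes to decode: {n} is more than {limit}")
    return blob.decode(encoding)
```

The `limit` parameter exists only so the check can be exercised with small inputs, for example `checked_decode(b"hi", limit=1)`, without allocating a 2 GiB buffer.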
[Coke]	;win 12	21:56
22:06 finanalyst left
ShimmerFairy	lizmat: To respond to earlier, surrogates are where things get real subtle. Surrogate codepoints are, in fact, codepoints, and you can manipulate them just like you would any other codepoint (though, obviously, their practical use is quite limited). Surrogates however are not "Unicode scalar values", which is the only kind of codepoint allowed in the UTF encodings. So trying to store a surrogate in any UTF encoding is an error, but	23:36
	holding onto one as a codepoint isn't. Since NFG strings operate at a higher level than raw UTF encoding, there's a logic to them being OK in Strs.
	I tried looking, but Unicode doesn't seem to have guidance on how a language's string type should handle surrogate codepoints, and I'm not sure what I think the right answer is. For one thing, since Strs are conceptualized as grapheme sequences (i.e. codepoint sequences with bookkeeping), it makes sense to allow all valid codepoints in them. On the other hand, you'll never be able to do UTF-based I/O with such strings, so their	23:40
	usability is limited to within the program, and I can't offhand think of any uses.
timo	put it in a string you really don't want your program to output so users or other processes can see them :) :)	23:44
ShimmerFairy	Yeah, you could (ab)use them as like noncharacters that absolutely must not exit the program's internal memory.	23:47
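
The distinction drawn above between a codepoint and a Unicode scalar value can be illustrated with a small sketch. It is written in Python rather than Raku, purely as an analogy: Python's `str` also happens to hold a lone surrogate as a codepoint, while its strict UTF-8 encoder rejects it. Nothing here reflects how Rakudo or MoarVM are implemented, and the helper names `is_scalar_value` and `try_utf8` are made up for illustration.

```python
def is_scalar_value(cp: int) -> bool:
    """True if cp is a Unicode scalar value.

    Scalar values are all codepoints 0..0x10FFFF except the
    surrogate range 0xD800..0xDFFF.
    """
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def try_utf8(s: str):
    """Return the UTF-8 bytes of s, or None if strict UTF-8 cannot encode it."""
    try:
        return s.encode("utf-8")
    except UnicodeEncodeError:
        return None


# The same codepoint as in the log: 56792 == 0xDDD8, a lone surrogate.
lone = chr(56792)

# Holding it in a string as a single codepoint works.
assert len(lone) == 1 and ord(lone) == 0xDDD8

# It is a codepoint, but not a scalar value, so encoding is refused.
assert not is_scalar_value(ord(lone))
assert try_utf8(lone) is None
```

In this analogy the string can carry the surrogate around internally, but any attempt at UTF-based output fails, which matches the "must not exit the program" use mentioned above.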
