#raku-beginner on 19 June 2022 - Raku Programming Language Log

This channel is intended for people just starting with the Raku Programming Language (raku.org). Logs are available at irclogs.raku.org/raku-beginner/live.html Set by lizmat on 8 June 2022.
03:46 discord-raku-bot left 03:47 discord-raku-bot joined 04:01 guifa left 06:15 frost joined 08:04 frost left 08:31 discord-raku-bot left, discord-raku-bot joined 08:37 discord-raku-bot left, discord-raku-bot joined 08:49 discord-raku-bot left, discord-raku-bot joined 10:23 guer joined
guer	I have difficulties trying to call an (old) API which I suspect can't handle ut8-strings ("åäö", ...). I've tried lots more or less "creative" conversion attempts ut8 -> latin-1, but they all end up in a ny utf8-string. What is the proper way?	10:29	Copy link Message link Add to gist Remove
	... in a new ...		Copy link Message link Add to gist Remove
	Maybe I could rephrase my question as:	10:38	Copy link Message link Add to gist Remove
	How do I get a string from a Buf as in: "'åäö'.encode('latin-1') ==> Buf"		Copy link Message link Add to gist Remove
10:40 discord-raku-bot left, discord-raku-bot joined
lizmat	but you cannot convert utf8 into latin1, unless the actual string only has codepoints < 128 ?	10:49	Copy link Message link Add to gist Remove
	or am I mistaking ASCII for Latin 1 here?	10:52	Copy link Message link Add to gist Remove
guer	Yes, maybe. Latin-1 has/had representaatioins for those characters/strings.	10:53	Copy link Message link Add to gist Remove
lizmat	ok, waking up a bit I guess	10:57	Copy link Message link Add to gist Remove
	so you're saying that the old API is tripping on 'åäö' ?	10:59	Copy link Message link Add to gist Remove
	m: say "åäö".ords # that seems to be correct values ?		Copy link Message link Add to gist Remove Run code
camelia	(229 228 246)		Copy link Message link Add to gist Remove
11:00 thundergnat joined
guer	Can't remeber really, Lat1 was a long time ago, But they seems plusible.	11:00	Copy link Message link Add to gist Remove
	plausible	11:01	Copy link Message link Add to gist Remove
thundergnat	m: say 'åäö'.encode('Latin-1').decode('Latin-1'); # perhaps?		Copy link Message link Add to gist Remove Run code
camelia	åäö		Copy link Message link Add to gist Remove
lizmat	m: .say for 'åäö'.encode('Latin-1')		Copy link Message link Add to gist Remove Run code
camelia	229 228 246		Copy link Message link Add to gist Remove
lizmat	the encoding into Latin-1 didn't change thos	11:02	Copy link Message link Add to gist Remove
	e		Copy link Message link Add to gist Remove
guer	thundergnat: Thats what I started out with. But the result seems to be a ut8 string.	11:03	Copy link Message link Add to gist Remove
thundergnat	Well, utf8 upto code point 255 is identical to Latin-1.		Copy link Message link Add to gist Remove
lizmat	right	11:04	Copy link Message link Add to gist Remove
	so if that API borks, it's getting fed other codepoints ?		Copy link Message link Add to gist Remove
	lizmat yawns again, trying to wake up		Copy link Message link Add to gist Remove
guer	Hmm, maybe I have to dig deeper into Lat1 codes ... But if they where identical why would the api complain?	11:07	Copy link Message link Add to gist Remove
lizmat	guer: that's my point: you would have to find out about what the API is complaining exactly :-)	11:08	Copy link Message link Add to gist Remove
guer	thundergnat: Hard experiences from some decades ago teached me that the unicode points for these Swedish characters do not map Latin1 representations.	11:10	Copy link Message link Add to gist Remove
	well, anyway: thanks for your help. I hoped that this way to test could sharpen my complaints to the API manainers, but I can't get any further rightt now.	11:14	Copy link Message link Add to gist Remove
lizmat	m: .say for (229,228,246).Buf.decode("latin1").uninames	11:16	Copy link Message link Add to gist Remove Run code
camelia	No such method 'Buf' for invocant of type 'List' in block <unit> at <tmp> line 1		Copy link Message link Add to gist Remove
lizmat	m: .say for Buf.new(229,228,246).decode("latin1").uninames		Copy link Message link Add to gist Remove Run code
camelia	LATIN SMALL LETTER A WITH RING ABOVE LATIN SMALL LETTER A WITH DIAERESIS LATIN SMALL LETTER O WITH DIAERESIS		Copy link Message link Add to gist Remove
lizmat	guer: looks like those 3 are latin-1 ?		Copy link Message link Add to gist Remove
guer	Yes, I know. That was one of the reasons that I suspected the output was unicode, not Lat1.	11:17	Copy link Message link Add to gist Remove
thundergnat	It's actually remarkably difficult to get Raku to NOT treat those characters as Latin-1 characters. Raku normalizes by default.	11:35	Copy link Message link Add to gist Remove
	m: put 'åäö'.comb>>.NFD>>.list>>.chrs>>.uniname>>.join(', ').join: "\t";	11:36	Copy link Message link Add to gist Remove Run code
camelia	LATIN SMALL LETTER A, COMBINING RING ABOVE LATIN SMALL LETTER A, COMBINING DIAERESIS LATIN SMALL LETTER O, COMBINING DIAERESIS		Copy link Message link Add to gist Remove
guer	1. I don't understand why the above shows that the string is Latin-1. To me it is just two possible unicode ways to represent the string.	11:45	Copy link Message link Add to gist Remove
	2. Maybe I wasn't clear enough from the beginning: The string I presented above is a utf8 string. So showing that this string witthout conversion is still a Unicode string confuses me.	11:49	Copy link Message link Add to gist Remove
	Thanks again for your efforts! I hope I won't bother you again in this matter.	12:00	Copy link Message link Add to gist Remove
12:39 Nemokosch joined 13:33 discord-raku-bot left, discord-raku-bot joined
stevied	so when testing a module, is there a good way to run unit tests on methods that are private?	15:32	Copy link Message link Add to gist Remove
	I guess there's a big debate about whether it should even be done	15:33	Copy link Message link Add to gist Remove
15:35 discord-raku-bot left 15:36 discord-raku-bot joined 16:28 guer left 16:42 Nemokosch left
	looking at raku.land/zef:tony-o/Data::Dump	22:44	Copy link Message link Add to gist Remove
	says "all of these options can be overridden in the DATA_DUMP environment variable in the format: indent=4,color=false"		Copy link Message link Add to gist Remove
	where is the variable and how do I change it? I don't see it in %*ENV	22:45	Copy link Message link Add to gist Remove
	adding it to %*ENV doesn't make a difference	22:48	Copy link Message link Add to gist Remove
kjp	Sorry to be rather late to the party, but that discussion 12 hours ago about utf8 and latin-1 seemed to be very confused. In particular, people seemed to be conflating "utf8" and "unicode code point".	23:49	Copy link Message link Add to gist Remove
	utf8 is only identical to latin-1 (really ASCII) up to code point 127. Unicode code points are identical to latin-1 up to 255.	23:50	Copy link Message link Add to gist Remove
	Thus 229, 228 and 246 are both latin-1 and unicode code points for the same characters, but they are not the utf8 representaion of anything.	23:52	Copy link Message link Add to gist Remove

Please report any issues / comments / feature requests as an issue on App::Raku::Log.

Thank you!