This channel is intended for people just starting with the Raku Programming Language (raku.org). Logs are available at irclogs.raku.org/raku-beginner/live.html
Set by lizmat on 8 June 2022.
05:33 siavash joined 07:50 guest23 joined
guest23 Hi. I'd like to titlecase some stuff. Is there a good way to do so in raku? I thought about using Inline::Perl5 and Lingua::EN::Titlecase or Lingua::EN::Titlecase::Simple, but don't seem to be able to alter the rules via “our” variables (e.g. @SMALL_WORDS in *::Simple). Is there a way to do this? Thanks 07:53
08:01 ab5tract joined
guest23 I found this on the Inline::Perl5 github. I'm not sure whether it means it's possible, but needs examples, or isn't possible: github.com/niner/Inline-Perl5/issues/139 08:08
siavash You can do it on words with `wordcase` method and on strings with `tclc` method, but it seems `wordcase` is what you want, unless you want something that comes with predefined rules docs.raku.org/type/Str#routine_wordcase 08:21
08:30 ab5tract left
guest23 I'd prefer a module since there're lots of edge cases (e.g. handling first/last words, first word in parens, hyphenated words, etc.), but I'll have a look. Thanks 08:34
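A minimal sketch of the built-in `wordcase` route suggested above. The first two calls mirror the examples in the `Str` documentation. The small-word list and the `simple-titlecase` helper are hypothetical names made up for illustration, not a standard rule set, and the sketch does not attempt the edge cases guest23 lists (parentheses, hyphenated words, last word).

```raku
# Built-in word-casing, as shown in the Str documentation.
say "raku programming".wordcase;                                  # Raku Programming
say "have fun working on raku".wordcase(:where({ .chars > 3 }));  # Have fun Working on Raku

# Hypothetical small-word list (an assumption for this sketch).
my $small-words = set <a an and at but by for in of on or the to>;

# Hypothetical helper: capitalise every word that is not a small word,
# then force the first character of the whole title to titlecase.
sub simple-titlecase(Str $title --> Str) {
    my $cased = $title.lc.wordcase(:where({ $_ ∉ $small-words }));
    $cased.tc;
}

say simple-titlecase("the lord of the rings");   # The Lord of the Rings
```

The `:where` argument decides per word whether the filter (by default `tclc`) is applied, which is the hook that a custom rule set would plug into.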
08:45 tonyo left, tonyo joined 09:14 siavash left 09:15 siavash joined 09:16 guest23 left, guest23 joined 09:24 siavash left, CIAvash joined 09:28 CIAvash left 09:48 CIAvash joined 10:02 CIAvash left 10:11 guest23 left 12:40 ab5tract joined 12:43 lizmat_ joined 12:46 ab5tract_ left, lizmat left 12:47 lizmat_ left, lizmat joined 12:48 ab5tract left 13:03 guest23 joined 14:07 guest23 left 14:16 guest23 joined
guest23 I'm having a problem with IO::Path. When I use .Str (or .path) on one with certain characters such as ÿ in them, I get weird symbols printed to my terminal. When I use uniname on the characters I get <private-use-10FFFD> printed. Any idea what's going on? I'm getting the paths by using the code from the bottom of docs.raku.org/routine/dir. 14:20
I'm on macOS using Rakudo™ v2023.09
librasteve guest23: I just tried that on my mac (example program that lists all files and directories recursively) - works fine for me 14:25
I am on 2023.08
it's fine on 2023.09 also 14:27
can you maybe post your code to a gist and I can try that?
guest23 hmm. It must be something with the filename. I just created a test directory and manually typed the same name and it was fine 14:34
librasteve i suspect that unicode in filenames is still very novel and maybe some apps do not handle it well (Finder rename?) 14:36
here is something from reddit (when getting files saved by others) ... In macOS, you can do this by going to System Preferences > Language & Region > Advanced, and selecting the encoding used by her coworkers under "Number, Currency, and Date Formatting". 14:39
maybe what you have is not utf-8? did you read Pawel's blogs on unicode? there's likely a way to check what you have in raku
guest23 Thank you. I'll have a look 14:40
Making the directory using iTerm 2 works fine, but making it using the Finder causes the issue.
librasteve dev.to/bbkr/series/23930
guest23 System Preferences > Language & Region > Advanced doesn't seem to exist anymore 14:42
(System Settings, rather)
librasteve what do you get with echo $LANG? 14:45
I get en_GB.UTF-8
guest23 en_GB.UTF-8 14:47
I've created two directories: A/ÿ using iTerm and B/ÿ using Finder. ls A | hexdump gives me bfc3 000a, ls B | hexdump gives me cc79 0a88. 14:59
Blob.new(0xC3, 0xBF).decode.say gives me ÿ
librasteve hmmm looks like iTerm is round tripping correctly but is set to a different encoding 15:00
My (vanilla) Terminal > Prefs > Encodings has 'Unicode (UTF-8)' and a bunch of other like Western (ASCII), Japanese (MacOS) and so on ... 15:02
guest23 But $path.path.comb.map: *.uniname.say; outputs the weird private-use things 15:07
librasteve Can you maybe try with vanilla Terminal? 15:08
guest23 raku -e 'dir("A").map: *.basename.say' 15:09
ÿ
raku -e 'dir("B").map: *.basename.say'
y􏿽xCC􏿽x88
raku -e 'dir("A").map: *.basename.uninames.say'
(LATIN SMALL LETTER Y WITH DIAERESIS)
raku -e 'dir("B").map: *.basename.uninames.say'
(LATIN SMALL LETTER Y <private-use-10FFFD> LATIN SMALL LETTER X LATIN CAPITAL LETTER C LATIN CAPITAL LETTER C <private-use-10FFFD> LATIN SMALL LETTER X DIGIT EIGHT DIGIT EIGHT)
nemokosch could it consist of combining code points? 15:10
the second one 15:11
guest23 hexdump says the filename created with Finder contains the bytes cc79 0a88, rather than the bytes bfc3 000a used for the filename created with Terminal. Combining diaeresis is CC88, so it looks like Finder is using a combining codepoint and Terminal is using a combined character (C3BF) 15:18
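A short sketch of the two byte sequences discussed above. Plain `hexdump` prints 16-bit units with the bytes of each pair swapped, so `bfc3 000a` corresponds to the bytes C3 BF 0A and `cc79 0a88` to 79 CC 88 0A (`hexdump -C` shows them in order). The variable names are made up for this illustration.

```raku
# Bytes of the two directory names, without the trailing newline.
my $from-iterm  = Blob.new(0xC3, 0xBF);         # precomposed ÿ (U+00FF)
my $from-finder = Blob.new(0x79, 0xCC, 0x88);   # y followed by U+0308

# Decoding as plain UTF-8 normalises both to the same single grapheme.
my $name-a = $from-iterm.decode('utf8');
my $name-b = $from-finder.decode('utf8');
say $name-a eq $name-b;    # True
say $name-b.chars;         # 1
say $name-b.uninames;      # (LATIN SMALL LETTER Y WITH DIAERESIS)

# The normalisation forms expose the two different code point sequences.
say $name-b.NFC.list;      # (255)
say $name-b.NFD.list;      # (121 776)
```

So at the string level Raku does treat the two spellings as equal. The difference only shows up in how the raw file name bytes are decoded.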
nemokosch dang 15:19
what are you trying to do again?
guest23 Should be able to treat them the same in raku though, right?
Just process some albums. Broke on Queensrÿche. I can just use mv to fix it, but I don't think I should need to, should I? 15:20
librasteve I just tried this the way you describe (ie use finder to rename to 'Broke on Queensrÿche' and then use the raku -e you show) - I now get the same private-use issue as you 15:44
it was working ok for me before since I was just making then reading the filename in Terminal 15:45
my guess is that macOS file system and raku have different ideas on how to manage combined unicode chars (as nemo suggests) ... it's probably worth making an issue for this on rakudo since afaik macOS is a gold standard for this 15:47
guest23 Gotcha. Yeah, I was discovering as I went so wasn't clear. Glad you can reproduce
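A possible explanation and workaround, offered as an assumption rather than a confirmed diagnosis: Rakudo appears to decode file names with the `utf8-c8` encoding, which keeps byte sequences that would not survive normalisation recoverable by turning them into private-use synthetics. That would match the `<private-use-10FFFD>` output seen above. The helper name `normalise-name` is hypothetical.

```raku
# The bytes Finder wrote for the name: y followed by COMBINING DIAERESIS.
my $raw = Blob.new(0x79, 0xCC, 0x88);

# Decode the way file names are assumed to be decoded (utf8-c8).
my $as-c8 = $raw.decode('utf8-c8');
say $as-c8.uninames;       # compare with the dir("B") output in the log

# Hypothetical helper: recover the original bytes, then decode them as
# ordinary UTF-8 so the usual normalisation applies.
sub normalise-name(Str $name --> Str) {
    $name.encode('utf8-c8').decode('utf8')
}

say normalise-name($as-c8).uninames;
say normalise-name($as-c8) eq "ÿ";
```

With this, something like `dir("B").map: { normalise-name(.basename).say }` would be one way to get a printable name, while the original `IO::Path` is still the one to use for opening the file.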
librasteve anyway, back to titlecase, this should work for you: 15:48
use Inline::Perl5;
my $p5 = Inline::Perl5.new;
$p5.use('Lingua::EN::Titlecase');
my $tc = $p5.invoke('Lingua::EN::Titlecase', 'new', 'word_punctuation => "[^]"');
$p5.run( 'print join(" ", sort(keys(%Lingua::EN::Titlecase::LC)), "\n")' );
$p5.run( '$Lingua::EN::Titlecase::LC{"yep"} = 1' );
$p5.run( 'print join(" ", sort(keys(%Lingua::EN::Titlecase::LC)), "\n")' );
print $tc.title("and again and again yep, cow^catcher not cow£catcher but differently");
And Again and Again yep, Cow^catcher Not Cow£Catcher but Differently%
I ended up going with the non Simple cpan module since that has an explicit constructor where you can control the ATTRS such as word_punctuation => 1, original => 1, title => 1, uc_threshold => 1, mixed_threshold => 1, 15:49
15:50 habere-et-disper joined
metacpan.org/release/ASHLEY/Lingua...tlecase.pm 15:50
guest23 Ah, thanks. That's a lot more complex than I'd tried. I was just doing `use Lingua::EN::Titlecase::Simple 'titlecase'; @Lingua::EN::Titlecase::Simple::SHORD_WORD.push: ...` and hoping it'd work. 15:51
I think I'll just use Lingua::EN::Titlecase::Simple and fix up a few bits instead.
Thanks for that for future Inline::Perl5 bits
Do you want me to create the issue about IO::Path or do you have a better idea of what to write? 15:52
habere-et-disper m: say sqrt: 100
camelia ===SORRY!=== Error while compiling <tmp>
Unsupported use of bare "sqrt". In Raku please use: .sqrt if you meant
to call it as a method on $_, or use an explicit invocant or argument,
or use &sqrt to refer to the function as a noun.
at <…
guest23 (Made a typo in my code above)
m: say sqrt 100 15:54
camelia 10
habere-et-disper In the REPL I get `100` for `sqrt: 100` 15:57
16:02 guifa left
guest23 It is treating sqrt: as a label there or something? (I'm not sure. I should leave this to the professionals.) 16:02
*Is it
habere-et-disper That sounds plausible, thanks. :) 16:03
nemokosch I think so, actually 16:11
librasteve you can make an issue at github search rakudo 16:12
16:20 habere-et-disper left
m: say sqrt 100 16:46
Raku eval 10
librasteve m: say 100.sqrt
Raku eval 10
librasteve the colon is only needed when there are additional arguments to a method call 16:47
m: say [1,2].push: 3 16:48
Raku eval [1 2 3]
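The calls from this exchange side by side, as a small sketch. A name followed by a colon at the start of a statement is a label, which seems to be how the REPL read `sqrt: 100` (a label `sqrt` on the statement `100`). The loop at the end is an illustrative example that is not from the log.

```raku
say sqrt 100;          # 10, function call without parentheses
say sqrt(100);         # 10, same call with parentheses
say 100.sqrt;          # 10, method call, no colon needed
say [1, 2].push: 3;    # [1 2 3], the colon introduces the method's arguments
say [1, 2].push(3);    # [1 2 3], same call with parentheses

# A label names a loop so that `next` or `last` can target it.
OUTER: for 1..3 -> $i {
    for 1..3 -> $j {
        next OUTER if $j == 2;   # jump to the next $i
        say "$i $j";             # prints "1 1", "2 1", "3 1"
    }
}
```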
18:17 ab5tract_ joined 19:20 Nemokosch joined 20:04 Nemokosch left 22:11 guest23 left
nhail sub altsum(Array @x) { return sum @x } say altsum [0,1]; I don't understand what's wrong here? I get the error Type check failed in binding to parameter '@x'; expected Positional[Array] but got Array ([0, 1]) 23:02
23:12 teatime joined
nemokosch Array @x is @x of Array, not @x is Array 23:16
nhail So what should I put in the type signature? 23:17
23:42 teatime left, teatime joined
nemokosch m: sub altsum(Array $x) { return sum $x } say altsum [0,1]; 23:47
Raku eval 1
nemokosch One way, kind of a Gordian knot solution 23:48
m: sub altsum(@x where Array) { return sum @x } say altsum [0,1];
Raku eval 1
23:48 teatwo joined
nemokosch With a custom type check 23:48
I wonder now 23:49
m: sub altsum(@x of Int) { return sum @x } my Int @x = 0, 1; say altsum @x; 23:50
Raku eval Exit code: 1 ===SORRY!=== Error while compiling /home/glot/main.raku Cannot resolve caller trait_mod:<of>(Parameter:D, Int:U); none of these signatures matches: (Mu:U $target, Mu:U $type) (Routine:D $target, Mu:U $type) at /home/glot/main.raku:1
nemokosch Yeah lest traits do anything type-related in a signature :\ 23:51
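A sketch of the signature variants touched on above, to make the `Array @x` point concrete: a type written before an `@` parameter constrains the elements, not the container. All sub and variable names here are hypothetical.

```raku
# Untyped array parameter: accepts any Positional, including [0, 1].
sub sum-any(@x) { sum @x }
say sum-any([0, 1]);                 # 1

# `Int @x` only accepts an array declared with that element type;
# passing a plain [0, 1] here would fail the type check.
sub sum-ints(Int @x) { sum @x }
my Int @typed = 0, 1;
say sum-ints(@typed);                # 1

# Checking the elements at call time accepts a plain [0, 1].
sub sum-checked(@x where { .all ~~ Int }) { sum @x }
say sum-checked([0, 1]);             # 1

# `Array @x` means "a Positional whose elements are Arrays".
sub count-rows(Array @x) { @x.elems }
my Array @rows = [0, 1], [2, 3];
say count-rows(@rows);               # 2
```

For the original `altsum`, the untyped `@x` form is probably the simplest fix unless the element type really needs to be enforced.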
23:52 teatime left 23:55 teatwo left, teatwo joined