01:04 librasteve_ left 04:30 sugarbeet left 04:39 sugarbeet joined 10:20 sugarbeet left 10:21 sugarbeet joined
timo we don't seem to have any information about gpg signatures of our release tarballs on the moarvm site 11:37
is the "fork me on github" banner on the top right broken for everyone else too? when i open the image in a new tab i get shown an Access Denied error by amazon aws 11:41
11:50 ShimmerFairy joined
ShimmerFairy So today I discovered that MoarVM was still on Unicode 15, and I thought it'd be nice to bump it up to Unicode 17. Little did I know this happens to include changes to how graphemes work, so I no longer feel qualified to try this for the very first time. 12:01
I have at least gotten as far as generating C files that compile, but that's about it at the moment. 12:02
timo that's already very useful
do you also have a link or two for the grapheme changes we have to adopt?
ShimmerFairy This is the new rule, GB9c, which involves a new property (that happens to require changes to ucd2c.pl, which I've managed): www.unicode.org/reports/tr29/#GB9c 12:03
timo that looks to be the same as the definition of conjunctCluster in Table 1c right? and the [ ] syntax there are character classes? so [\p{InCB=Extend} \p{InCB=Linker}]* means any amount of characters with Extend or Linker as the InCB property 12:08
the reason why there's a single InCB=Linker between those [...]* is just to make sure there is at least one =Linker somewhere between the =Consonant and all the =Consonant ones at the end? 12:09
oh, no, the parentheses that are +ed go around everything except the first =Consonant 12:10
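[Editor's sketch] The reading timo arrives at above can be checked with a small regex: one Consonant, followed by one or more groups of "Extend/Linker run containing at least one Linker, then a Consonant". This is a minimal illustration, not MoarVM code. The `INCB` table is a hand-picked, hypothetical subset of the property data, and the names `incb_classes` and `is_conjunct_cluster` are invented for the example.

```python
import re

# Hypothetical hand-picked subset of Indic_Conjunct_Break values,
# encoded as one letter each: C = Consonant, L = Linker, E = Extend.
INCB = {
    0x0915: "C",  # DEVANAGARI LETTER KA
    0x0937: "C",  # DEVANAGARI LETTER SSA
    0x094D: "L",  # DEVANAGARI SIGN VIRAMA
    0x200D: "E",  # ZERO WIDTH JOINER
}

# conjunctCluster: Consonant ( [Extend Linker]* Linker [Extend Linker]* Consonant )+
CONJUNCT = re.compile(r"C(?:[EL]*L[EL]*C)+")

def incb_classes(text):
    """Map each code point to its one-letter InCB class ("N" if it has none)."""
    return "".join(INCB.get(ord(ch), "N") for ch in text)

def is_conjunct_cluster(text):
    """True if the whole string forms a single conjunct cluster."""
    return CONJUNCT.fullmatch(incb_classes(text)) is not None
```

With this table, `is_conjunct_cluster("\u0915\u094D\u0937")` holds (KA, VIRAMA, SSA), while replacing the virama with a ZWJ does not match, because no Linker sits between the two consonants.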
ShimmerFairy I haven't fully internalized the rule myself, I just happened to notice that Unicode 15.1 just so happened to affect a core feature of Raku. 12:11
timo right, thanks a lot for keeping an eye out for us :) 12:12
ShimmerFairy I was exploring how Rakudo uses unicode in some of its default grammar rules, and then I had the bright idea to ask "wait, what version of Unicode are we using right now, exactly?". 12:13
timo i think the right way to implement the new requirements to the grapheme cluster algo is to give MVMNormalizer a field to track "are we in a sequence of InCB-related codes", and "have we seen a Linker yet" 12:24
until now we've only looked at the unicode property Grapheme_Cluster_Break, which I assume the InCB-relevant characters don't have, since there's no mention of Indic_Conjunct_Break in Table 2 there 12:28
can I already get your unicode db changes somewhere? 12:29
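[Editor's sketch] The two flags timo proposes above ("are we in a sequence of InCB-related codes" and "have we seen a Linker yet") could be tracked roughly as follows. This is written in Python for brevity and is only an assumption about how such state might look; it is not the MVMNormalizer implementation. `InCBTracker` and `no_break_before` are hypothetical names, and the sketch covers only GB9c, not the other grapheme break rules.

```python
class InCBTracker:
    """Tracks just enough state to apply GB9c while scanning code points."""

    def __init__(self):
        self.in_sequence = False  # have we seen a Consonant that is still "open"?
        self.seen_linker = False  # has a Linker appeared since that Consonant?

    def no_break_before(self, incb):
        """Feed the InCB value of the next code point.

        Returns True when GB9c forbids a break before this code point,
        i.e. it is a Consonant preceded by Consonant [Extend Linker]*
        with at least one Linker in between.
        """
        joined = incb == "Consonant" and self.in_sequence and self.seen_linker
        if incb == "Consonant":
            # A Consonant always starts (or continues) a potential sequence.
            self.in_sequence = True
            self.seen_linker = False
        elif incb == "Linker" and self.in_sequence:
            self.seen_linker = True
        elif incb == "Extend" and self.in_sequence:
            pass  # Extend keeps the sequence alive without changing the flags
        else:
            # Any other code point ends the sequence.
            self.in_sequence = False
            self.seen_linker = False
        return joined
```

Feeding `"Consonant", "Linker", "Consonant"` yields `False, False, True`: only the final Consonant is glued to what precedes it by this rule.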
ShimmerFairy I can commit the MoarVM changes to a branch (or main branch if it's fine to have a half-baked update there). 12:31
timo i'd prefer a branch
ShimmerFairy That's what I was thinking, especially considering it's been 3 years since MoarVM's unicode support has been updated. 12:32
timo > The derivation of the values for the Indic_Conjunct_Break (InCB) property is fairly complex. 12:34
o_O
we just take this from a database, right? we don't have to implement any of these rules ourselves?
ah, phew. it's in the DerivedCoreProperties.txt which i think we use 12:37
ShimmerFairy Yeah, and that file required a change to ucd2c.pl, because until the new InCB property it was filled with only binary properties. 12:38
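[Editor's sketch] To illustrate the change ShimmerFairy describes: binary properties in DerivedCoreProperties.txt use two fields (code point or range, property name), while InCB lines carry a third field with the value. The snippet below is a minimal Python illustration of handling both shapes. It is not the ucd2c.pl code, and `parse_line` is a hypothetical helper.

```python
def parse_line(line):
    """Parse one DerivedCoreProperties.txt-style line.

    Returns (code_point_range, property_name, value), where value is True
    for binary properties, or None for blank and comment-only lines.
    """
    data = line.split("#", 1)[0].strip()  # drop the trailing comment
    if not data:
        return None
    fields = [f.strip() for f in data.split(";")]
    # First field is either a single code point or "START..END", in hex.
    first, _, last = fields[0].partition("..")
    start = int(first, 16)
    end = int(last, 16) if last else start
    prop = fields[1]
    # Enumerated properties such as InCB have a third field with the value.
    value = fields[2] if len(fields) > 2 else True
    return range(start, end + 1), prop, value
```

For example, `parse_line("094D ; InCB; Linker # Mn DEVANAGARI SIGN VIRAMA")` gives `(range(0x94D, 0x94E), "InCB", "Linker")`, and a line like `0041..005A ; Alphabetic` gives the value `True`.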
timo I see
ShimmerFairy I'll push the branch just as soon as I figure out/remember how to wrangle GitHub into letting me. 12:40
timo interestingly, the InCB=Extend is based off of GCB=Extend and removing the codes that have InCB=Linker or InCB=Consonant
that may or may not let us put parts of the code to handle InCB into paths that are already there so we have to query the property less often? 12:41
ShimmerFairy Oh, apparently I'm not a part of MoarVM (pretty sure I'm a part of the Raku org though) 12:46
timo feel free to send the patch through alternative means if github is annoying 12:47
ShimmerFairy if you can't add me to the MoarVM repo then I could engineer a temporary fork at least.
timo i don't have my github login details on this device :| 12:49
ShimmerFairy That's alright, I can bug people later. Kinda funny how throughout all the years I never needed access to MoarVM until now. 12:50
lizmat ShimmerFairy o/ 12:51
ShimmerFairy lizmat o/
lizmat japhb was working on upping Unicode from 15 to 17
ShimmerFairy I saw some evidence of that (the MoarVM/docs/ file for unidata mentions the InCB addition, it just didn't materialize anywhere in the codebase). 12:52
timo didn't we merge fixes to the ucd2c.pl script already that came from those efforts?
lizmat and has done quite a lot of work already, mostly cleaning up
timo: could be.... but maybe japhb can tell themselves... but maybe busy with BF 12:53
timo ~6 months ago looks to be when all that happened?
lizmat (as in Black Friday)
well, it came after the Raku Core Summit
and that's less than 6 months ago
ShimmerFairy Here's what I had to do to get the updated files working, at any rate: github.com/ShimmerFairy/MoarVM/tre...icode-17.0
lizmat well, I guess close 12:54
ShimmerFairy (This is the first time in my life I've ever had to write Perl code. It's probably very unidiomatic.) 12:55
lizmat yeah, I wonder why we couldn't make that raku at this point
ShimmerFairy It would be nice, especially since "parsing text files" is exactly the sorta thing Raku has great features for. 12:56
(and the fact that the generated files are committed means a new dev doesn't need Raku to build MoarVM the first time, so that's not a concern.) 12:57
lizmat indeed.. but please hold off until japhb has been able to chime in: I would hate to see double effort (wasted) with this
timo we don't have a function that turns a range from its string representation into a proper range object right?
lizmat m: dd "1..4".EVAL 12:58
camelia 1..4
timo well, yeah, but i'd like a slightly less huge hammer :)
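[Editor's sketch] A "less huge hammer" than EVAL is a narrow parser that accepts only integer endpoints and the `..` operator with optional `^` exclusions. The version below shows the idea in Python rather than Raku; it is an assumption about what such a helper might do, not an existing Raku or Rakudo function, and `parse_int_range` is a hypothetical name.

```python
import re

# Integer endpoints with optional "^" markers for excluded endpoints,
# e.g. "1..4", "1^..4", "1..^4", "1^..^4".
RANGE_RE = re.compile(r"^\s*(-?\d+)\s*(\^?)\.\.(\^?)\s*(-?\d+)\s*$")

def parse_int_range(text):
    """Turn a range written as a string into a Python range, without eval."""
    m = RANGE_RE.match(text)
    if m is None:
        raise ValueError(f"not an integer range: {text!r}")
    lo, excl_lo, excl_hi, hi = m.groups()
    start = int(lo) + (1 if excl_lo else 0)  # "^.." skips the first value
    stop = int(hi) + (0 if excl_hi else 1)   # "..^" skips the last value
    return range(start, stop)
```

Anything that is not a plain integer range, such as arbitrary code, is rejected with a `ValueError` instead of being executed.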
ShimmerFairy lizmat: Don't worry, I'm not in the mood to change this very beefy Perl code to Raku code just for kicks. I think getting up to v17 should be the first priority. 12:59
lizmat ShimmerFairy: agree, but my point is that japhb might have done a lot already
ShimmerFairy Fair, wonder where that work would be if it's anywhere public. I didn't see any branches in MoarVM or the like already. 13:00
lizmat i *think* it was in a branch 13:01
timo it looks like japhb/moarvm has no commits in it from the last year, so if there's any changes he's made since the last merged commits, they are not easy to find :)
unfortunately, the "branches" list on github says all branches have been updated 6 months ago, which i'm not sure if that was just from cloning or from pushing all branches from a local clone to that fork or what
ShimmerFairy by the way, another future improvement would be for the UCD download script to allow for smarter updating than "delete UNIDATA", if possible. Just on principle it didn't feel nice downloading data from unicode.org over and over when fixing the emoji download. 13:02
timo but a "git log --graph --all" doesn't show commits from japhb in any branch outside of main
lizmat ok, then maybe I got it all wrong :-(
ShimmerFairy There were definitely updates (I saw that ucd2c.pl was cleaned up a bit, for example), it's just not immediately apparent that any 15->17 work has happened. 13:03
lizmat ok, well the impetus for that work was the desire to up the unicode version, I know that for sure :-) 13:04
ShimmerFairy I have to say, it is odd that docs/unidata-file-formats.md discusses the data change in Unicode 16 as if it's been implemented somewhere.
timo i'll be AFK for a little while 13:08
ShimmerFairy I think I might keep poking at regen'ing the roast tests, just to see what my changes have done, but I won't commit anything until japhb (and possibly others) can chime in. 13:18
japhb o/ 15:57
Welp, I can tell you there were things I expected out of today, but seeing my name a dozen times in the log was not one of them. :-D
timo :D 15:59
japhb So ... I have a few remaining diffs I can commit and push to UCD-download.raku and ucd2c.pl. IIRC when I left off I was working on making it so the rearranged data files on the FTP server Just Worked, and cleaning up the last untouched/undocumented bits of ucd2c.pl. 16:02
I am happy to push my bits and then let someone else take over; my Raku time has lately been mostly in the Terminal::* modules, and I'd rather not be in a "licked the cookie" situation, which it sounds like I was. :-( 16:04
ShimmerFairy: How do you want me to push my changes, for least damage to your work? 16:05
ShimmerFairy japhb: Since you've been working on it long before I was, I figure your changes are a lot less hacky and "just make it work" than mine. I only did the minimum to get something working, while googling things like "how do you get the size of an array in Perl". So my work's quality ain't the best. 16:06
If you're still concerned, you can check the branch in my MoarVM fork and see that it really wasn't much. 16:11
japhb OK, I'll take a look at that. I'm multitasking kind of heavily this morning, but I will let you know soon. 16:19
16:49 nine left, nine joined
ShimmerFairy Take your time, there's nothing I'm doing that needs the last couple of Unicode versions right this second. 17:05
22:21 rakkable left 22:22 rakkable joined