[10:07] *** rnddim is now known as ShimmerFairy

[10:14] <Geth> ¦ MoarVM: patrickbkr++ created pull request #2010: fix gettid on older glibc's

[10:14] <Geth> ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/pull/2010

[11:34] <timo> should we put an `#ifndef gettid` or so?

[12:16] <ShimmerFairy> timo: Taking a quick look, if we include unistd.h (which we do, looking at the PR changes), and if we define the _GNU_SOURCE macro (which I don't know if we do), then we'd already be pulling in a definition of gettid, which might cause conflicts. gettid is a real function prototype on my system, though, so ifndef wouldn't catch it.

[12:17] <timo> OK, sounds like merging the pull request as-is is fine, then?

[12:22] <ShimmerFairy> I just manually edited in the PR fix to my local copy, and it compiled fine (though I didn't bother re-running the Configure script first, on the very slim off chance that matters). So it at least builds fine, even if I don't care for how it unconditionally replaces the real gettid() function for people with new-enough glibcs.

[12:22] <ShimmerFairy> (For reference, glibc 2.30 came out in August 2019, so you'd have to be running a pretty old system to not have gettid() at this point. Not so old as to be implausible, though.)

[12:25] <timo> right, for building binary releases it's generally a good idea to just build on the oldest OS you can think of, so you get maximum portability, at least that's what I think the thought is

[12:27] <ShimmerFairy> Makes sense, though I would suggest that nearly 7-year-old software is pushing it a bit (but I'm a gentoo user, so I have an unusual perspective on using old versions of software).

[12:28] <ShimmerFairy> In any case, if we had a build system that could test for the presence of a function and define a macro to let us conditionally put in a replacement when needed, that would be better.

[12:30] <timo> we do have that in the build system, yes. I have recent-ish commits in my branch for musttail-based interpreter loop stuff

[12:34] <ShimmerFairy> It's pretty much just a theoretical concern, I just have a philosophical issue with overriding the system-provided function when available (after all, what if someday gettid() isn't the same as that syscall?). It's probably just fine as-is, though.

[12:35] <timo> that makes sense to me

[12:36] <timo> I wouldn't be against saying something like "when you're making a release tarball, set -DMVM_REALLY_OLD_GLIBC and work off of that"

[12:39] <ShimmerFairy> Looks like there are glibc version macros you could use to gate the gettid() #define, not quite sure yet if you're meant to use them in user code though.

[12:49] <ShimmerFairy> timo: Just added a comment on the PR about a conditional check that I think should work, figured it'd be better explaining it there than in IRC.

[12:55] <timo> does this work fine with musl libc too? I don't know if it defines the __GLIBC__ symbol and such

[12:58] <ShimmerFairy> I wouldn't know, I'm not very practiced in writing tests for older systems. (The best solution is still probably just the build system testing and defining a HAVE_GETTID-style macro.)

[13:03] <ShimmerFairy> Oh, looks like there's __GLIBC_PREREQ(2, 30) that makes it easier to test for a glibc version.

[13:07] <ShimmerFairy> Also, a quick glance tells me that musl doesn't really define any macros to test it's being used?

[13:16] <ShimmerFairy> As of a few years ago, at last, the musl devs were utterly allergic to defining any kind of __MUSL__ macros, so having the config system test for the presence of the function is the only good solution for musl users.
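
The two options discussed above (a build-system probe and a glibc version check) can be sketched together. This is only an illustration under stated assumptions: `HAVE_GETTID` is a hypothetical macro that a configure-time probe would define, and `EXAMPLE_USE_LIBC_GETTID` and `example_gettid` are names invented here. It is not the code from the pull request.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for gettid() and syscall() declarations */
#endif
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>

/* HAVE_GETTID is hypothetical: a build-system probe would define it
 * when the C library provides gettid(). This covers musl as well,
 * which offers no macro identifying itself. */
#if defined(HAVE_GETTID)
#  define EXAMPLE_USE_LIBC_GETTID 1
#elif defined(__GLIBC__) && defined(__GLIBC_PREREQ)
   /* Nested on purpose: __GLIBC_PREREQ(...) must not be expanded by a
    * preprocessor that has never seen the macro. */
#  if __GLIBC_PREREQ(2, 30)
#    define EXAMPLE_USE_LIBC_GETTID 1
#  endif
#endif

/* Returns the kernel thread ID of the calling thread. */
static pid_t example_gettid(void) {
#ifdef EXAMPLE_USE_LIBC_GETTID
    return gettid();                    /* real libc wrapper, left untouched */
#else
    return (pid_t)syscall(SYS_gettid);  /* fallback for older C libraries */
#endif
}
```

With this shape the system-provided `gettid()` is never overridden by a macro; the fallback only exists under a separate name.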

[13:17] <ShimmerFairy> *at least

[15:39] <timo> there is only a single mention of "gettid()" in the entire file. I think it's probably still correct to use `syscall(SYS_gettid)` even when glibc offers a gettid() function, and since this is for a very linux-specific feature, it might be no issue at all to just always use syscall here?

[15:50] <ShimmerFairy> That sounds reasonable to me. I do wonder what the point of the gettid() function is in the first place, if there's a reason beyond "it'd be nice to not need syscall() directly". If for example gettid() existed on non-Linux systems, and if our linux-specific code could someday be applied to those other systems, then it'd make sense to use gettid() instead.

[15:57] <timo> the code that uses it is to output the jitdump format, which currently is to my knowledge linux specific - at least the specification lives in the linux source tree

[16:14] <ShimmerFairy> Sounds good to me then, not like it can't be changed later on anyway. If we want something unconditional and not involving the build system any more, then I think using the syscall() that always works would be better than the gettid() that sometimes doesn't.
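
The "always use syscall" alternative needs no version checks at all. A minimal sketch, assuming a Linux target; the function name `jitdump_tid_sketch` is invented for illustration and is not taken from MoarVM:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* makes sure syscall() is declared */
#endif
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Asks the kernel for the calling thread's ID directly. This does not
 * depend on the C library shipping a gettid() wrapper. */
static pid_t jitdump_tid_sketch(void) {
    return (pid_t)syscall(SYS_gettid);
}
```

Because the libc symbol `gettid` is neither called nor redefined here, there is nothing to conflict with on systems that do provide it.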

[16:16] <ShimmerFairy> In all though, I want to reiterate that my "objection" really isn't much of one. I just couldn't help but notice the PR was technically clobbering gettid() on systems that do have it, and wanted to at least point it out.

[16:16] <timo> yeah it's fair

[16:26] <timo> I pushed a commit but Geth isn't pointing it out. maybe it actually ended up in patrickb's own repository rather than the moarvm one, so there wasn't a call to the notification webhook

[16:26] <Geth> ¦ MoarVM/main: 2766e8ef86 | (Patrick Böker)++ (committed using GitHub Web editor) | src/jit/compile.c

[16:26] <Geth> ¦ MoarVM/main: fix gettid on older glibc's (#2010)

[16:26] <Geth> ¦ MoarVM/main: 

[16:26] <Geth> ¦ MoarVM/main: * fix gettid on older glibc's

[16:26] <Geth> ¦ MoarVM/main: 

[16:26] <Geth> ¦ MoarVM/main: Glibc < 2.30 does not define `gettid()`. The man page states:

[16:26] <Geth> ¦ MoarVM/main: 

[16:26] <Geth> ¦ MoarVM/main:     Glibc does not provide a wrapper for this system call; call it using

[16:26] <Geth> ¦ MoarVM/main: <…commit message has 12 more lines…>

[16:26] <Geth> ¦ MoarVM/main: review: https://github.com/MoarVM/MoarVM/commit/2766e8ef86

[16:27] <timo> this has the squash message which also has my message as part of it

[16:29] <timo> I'm still not sure how we should address the nativecall issue on clang where clang and gcc disagree on the upper bits of a smaller-than-64bit argument being cleared by the caller or not

[16:31] <timo> that's what is breaking one of our nativecall tests in CI
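
To make the disagreement concrete, here is a portable model of it, not actual ABI or MoarVM code. A 64-bit value stands in for the register that carries a 32-bit argument, and the two functions (both names invented here) model a callee that trusts the caller to have cleared the upper bits versus one that narrows the value itself.

```c
#include <stdint.h>

/* Models a callee that assumes the caller already cleared the upper
 * 32 bits of the register: whatever is up there leaks into the result. */
static uint64_t callee_trusts_caller(uint64_t raw_register) {
    return raw_register;
}

/* Models a callee that makes no such assumption and truncates to the
 * declared 32-bit width before using the value. */
static uint64_t callee_narrows(uint64_t raw_register) {
    return (uint64_t)(uint32_t)raw_register;
}
```

If the register holds `0xDEADBEEF0000002A` for an intended argument of 42, the first function sees a huge number while the second sees 42. A mismatch like this only shows up when caller and callee were built under different assumptions.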

[16:31] <timo> anyway thanks patrickb++

[16:48] <[Coke]> wasn't there a commit that fixed the Changelog?

[16:50] <timo> it looks like there wasn't one yet, but you did ask for one, so I guess I'll write one? a bit later today though, gotta err a runnand

[16:50] <[Coke]> thought I saw an email but I don't see it in the closed PR. :(

[16:51] <timo> maybe it was done in the changelog wiki page instead of in the code repo?

[16:57] <[Coke]> there is no changelog wiki page for moarvm.

[16:57] <[Coke]> that's for rakudo itself.

[16:59] <[Coke]> later today is perfectly fine, thanks

[17:01] <lizmat> meanwhile I'll be bumping NQP and Rakudo  :-)

[18:41] <timo> what would the changelog entry look like? I imagine it'd go into a freshly-created "New in 2025.06" section? "+ Don't rely on `gettid` function for JITDUMP format" maybe?

[19:01] <timo> [Coke]: opinions?

[19:27] <[Coke]> New: is fine. Theoretically it's a 2025.06 but we don't know for sure yet.

[19:28] <[Coke]> or New in *

[19:30] <timo> I think commonly changelogs have a section at the top with a "dummy name" essentially

[19:36] <[Coke]> App::Mi6 has {{$NEXT}}

[19:36] <[Coke]> there's no automation on it in the moar release process, so any placeholder is fine. You can even use the next release as a placeholder.

[19:44] <patrickb> that gettid fix is a direct copy of libuv code. That's why I was pretty confident there wouldn't be any strange side effects of preprocessor overriding the name.

[19:45] <[Coke]> mentioned in #raku-dev, but we need to catch potential release breakers like this... before the release. simple matter of adding all the binary release OSes to the azure pipeline yml?

[19:48] <timo> possibly, yeah. we may have to go through docker images rather than just having azure give us the OS we want?

[20:05] <patrickb> yeah. Azure doesn't give us such old distros. It's been quite fiddly to get the current setup working. (And had to fix stuff up once or twice as well because distros shut down their package mirrors...)

[20:49] <[Coke]> I assume we want like a split, where we have new stuff in azure "standard", and specific stuff running in containers.

[20:49] <[Coke]> (or maybe we just move everything into containers)

[21:23] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: 43749e1ec1 | (Timo Paulssen)++ | 2 files

[21:23] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: When decoding over 10k bytes of utf8 or utf8-c8 data, mark thread blocked

[21:23] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: 

[21:23] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: this allows GC runs to happen while a thread is busy going through a large

[21:23] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: buffer of bytes.

[21:23] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: 

[21:23] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: Smaller buffers of bytes don't really need us to go to the effort of blocking

[21:23] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: and unblocking the thread, as with a series of smaller decodes, the allocation

[21:23] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: of the resulting string would join in on GC runs waiting to happen in a timely

[21:23] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: manner, I expect.

[21:23] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: review: https://github.com/MoarVM/MoarVM/commit/43749e1ec1
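
The idea in the commit message above can be sketched roughly as follows. Everything here is a stand-in: `ExampleThread`, `example_mark_blocked`, `example_mark_unblocked`, and the byte-counting loop are invented for illustration, and the threshold value is only assumed from the "over 10k bytes" wording. The real change is in the linked commit.

```c
#include <stddef.h>

/* Assumed from "over 10k bytes"; the actual constant may differ. */
#define EXAMPLE_BLOCK_THRESHOLD 10000

/* Stand-in for a VM thread context. */
typedef struct {
    int blocked;      /* 1 while the GC may run without this thread */
    int block_calls;  /* how often the thread was marked blocked */
} ExampleThread;

static void example_mark_blocked(ExampleThread *t)   { t->blocked = 1; t->block_calls++; }
static void example_mark_unblocked(ExampleThread *t) { t->blocked = 0; }

/* Stand-in "decode": counts ASCII bytes. Only large inputs pay for
 * blocking and unblocking around the long-running loop. */
static size_t example_decode(ExampleThread *t, const unsigned char *bytes, size_t len) {
    int big = len > EXAMPLE_BLOCK_THRESHOLD;
    size_t i, ascii = 0;

    if (big)
        example_mark_blocked(t);
    /* While blocked, this loop touches only the raw byte buffer. */
    for (i = 0; i < len; i++)
        if (bytes[i] < 0x80)
            ascii++;
    if (big)
        example_mark_unblocked(t);

    return ascii;
}
```

Small inputs skip the blocking calls entirely, which matches the commit message's reasoning that a series of small decodes reaches a GC opportunity soon enough through allocation.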

[21:24] <Geth> ¦ MoarVM: timo++ created pull request #2012: When decoding over 10k bytes of utf8 or utf8-c8 data, mark thread blo…

[21:24] <Geth> ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/pull/2012

[21:24] <timo> lizmat: this one's for you

[21:28] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: 545d0337a6 | (Timo Paulssen)++ | docs/ChangeLog

[21:28] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: Changelog entry for gc while decode

[21:28] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: review: https://github.com/MoarVM/MoarVM/commit/545d0337a6

[21:28] <lizmat> wow... I had completely forgotten about that... it all makes sense now  :-)

[21:37] <timo> what "that" in particular?

[21:38] <timo> also, I can't tell you how i stumbled upon that gist, I think it was an open tab in one of my fifty million browser windows for uhhhhhhh a couple of months probably

[21:42] <timo> anyway, decoding a gigabyte of utf8 in 12 seconds isn't *that* terrible right?

[21:44] <japhb> timo: It's kinda terrible.  :-|

[21:44] <timo> decoding and normalizing i should say

[21:45] <timo> actually 12 seconds is utf8-c8, utf8 is 9.2s

[21:46] <japhb> Don't know what's normal for normalizing time in the broader world.  I know that verification and initial decode of UTF-8 can be pretty dang fast these days -- within a factor of 2 of the raw memcopy rate as I recall

[21:46] <japhb> But our normalization is a lot more than that, so ... not sure

[21:47] <timo> we do have a fast path in there in theory for situations where we don't need to do anything for normalization

[21:47] <japhb> Maybe comparison with Swift might be enlightening?  As I recall Swift has proper grapheme support (though not necessarily implemented the same as ours)

[21:47] <timo> yeah I believe they have it put into any ops that do traversal in strings

[21:48] <timo> instead of up-front

[21:51] <timo> doing verification without normalization is almost trivial, but for normalization you have to hit the unicode database

[21:56] <timo> decoding the same stuff to latin1 is 0.94 seconds (this is the entire process lifetime, not just the decoding)

[21:58] <timo> decoding a gigabyte of just zeroes is a pretty poor benchmark in any case; the big benefit is you can generate the full array of zeroes very quickly

[22:01] <japhb> nodnod

[22:10] <timo> anyway, oops i broke it :)

[22:11] <lizmat> ugexe: re POPULATE, I think  PRODUCE-META-ATTACHABLES is the wrong place for it, as it is also being called for roles

[22:11] <tellable6> lizmat, I'll pass your message to ugexe

[22:12] <lizmat> oops, ww

[22:25] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: 0a3852be50 | (Timo Paulssen)++ | src/strings/utf8_c8.c

[22:25] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: Don't b0rk the temp root stack in utf8-c8 decode

[22:25] <Geth> ¦ MoarVM/utf8_and_c8_decode_mark_thread_blocked: review: https://github.com/MoarVM/MoarVM/commit/0a3852be50

[22:29] <timo> if only we had a macro that made this easier

[22:33] <timo> someone got opinions on attempting a fuzzing campaign against a fuzzing target that is "compile code with legacy compiler, then with rakuast, and if one but not the other gives an error, consider that a 'desired result'", so we find cases where they disagree?

[22:33] <timo> disagree about something being wrong, that is
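
The oracle for such a differential fuzzing target is small. A sketch under loud assumptions: the function-pointer type, `example_frontends_disagree`, and the two toy "frontends" are all invented here and merely stand in for invoking the legacy compiler and RakuAST.

```c
#include <stdbool.h>

/* A frontend takes source text and reports whether it compiled cleanly. */
typedef bool (*example_compile_fn)(const char *source);

/* Toy stand-ins for the two real frontends. */
static bool example_accept_all(const char *source)   { (void)source; return true; }
static bool example_reject_empty(const char *source) { return source[0] != '\0'; }

/* The fuzzer's "desired result": exactly one frontend reports an error. */
static bool example_frontends_disagree(example_compile_fn legacy,
                                       example_compile_fn rakuast,
                                       const char *source) {
    return legacy(source) != rakuast(source);
}
```

A fuzzing harness would typically turn a `true` result into a crash or abort so that the fuzzer saves the input as a finding.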

[22:48] <japhb> Seems valuable to me.  :-)

