japhb m: say .is-identifier for < foo _foo foo3 _foo3 >; # lizmat 00:13
camelia True
False
False
False
AlexDaniel what 00:17
6c: say .is-identifier for < foo _foo foo3 _foo3 >; 00:18
japhb AlexDaniel: Earlier today she sped up Pair.raku by factoring out that routine. Unfortunately I don't think it's quite right.
AlexDaniel: It's new today.
committable6 AlexDaniel, gist.github.com/7cd536bdd1321f8ad0...b0eb26bb77
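A minimal sketch of the kind of check being discussed, assuming all four names above are meant to count as identifiers. The log does not show the real `is-identifier` routine, so the rule below (a letter or underscore, then word characters, with optional `-` or `'` separated segments) is only a simplified approximation, and `is_identifier` is a hypothetical name:

```python
import re

# Simplified approximation of an identifier rule: a letter or underscore,
# then word characters, optionally followed by "-" or "'" separated
# segments that each start with a letter or underscore again.
_IDENT = re.compile(r"[^\W\d]\w*(?:[-'][^\W\d]\w*)*")


def is_identifier(name: str) -> bool:
    """Return True if the whole string matches the simplified rule."""
    return _IDENT.fullmatch(name) is not None


for name in ("foo", "_foo", "foo3", "_foo3"):
    print(name, is_identifier(name))
```

Under this rule a leading underscore and a trailing digit are both accepted, while a leading digit is not.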
japhb OK, found the *start* of the rakudo range that will cause the cro-websocket/t/http-router-websocket.t segfault on my system -- 2019.11-291-g9a571b685 works just fine, and 2019.11-292-g6bc64c713 is the first rakudo that will trigger the segfaults. 00:59
Note that I am testing with the *github HEAD* version of cro-websocket, not the CPAN version, because I need to test using the updated certificate files (otherwise it all just fails with OpenSSL cert expiry failures) 01:00
That may have been why other people saw a different situation.
In any case, that's the reliable beginning of trouble for me locally. I have to $life again for a while, but I'll backlog later. 01:01
01:52 Altai-man_ joined 01:55 sena_kun left
toddr AlexDaniel: Sorry to pound the issue, but now that I've explained the risks of where to put the repo, what path would you like it put in? I'm good with whatever. Note that wherever you put it, that is where rt.perl.org will redirect to, so it needs to be the semi-permanent destination. 02:03
03:03 sourceable6 left, releasable6 left, coverable6 left, notable6 left, shareable6 left, statisfiable6 left, reportable6 left, greppable6 left, bisectable6 left, quotable6 left, unicodable6 left, nativecallable6 left, squashable6 left, benchable6 left, committable6 left, bloatable6 left 03:04 bloatable6 joined, reportable6 joined 03:05 notable6 joined, shareable6 joined, statisfiable6 joined, squashable6 joined, releasable6 joined, unicodable6 joined, greppable6 joined 03:06 bisectable6 joined, committable6 joined, nativecallable6 joined, quotable6 joined, benchable6 joined, sourceable6 joined, coverable6 joined
Geth rakudo/revert-3388-problem-solving-142: 9b1cb33135 | (Vadim Belman)++ (committed using GitHub Web editor) | 4 files
Revert "Export packages by user given names"
03:22
rakudo: vrurg++ created pull request #3406:
Revert "Export packages by user given names"
rakudo: 9b1cb33135 | (Vadim Belman)++ (committed using GitHub Web editor) | 4 files
Revert "Export packages by user given names"
03:23
rakudo: b12ba42c8f | (Vadim Belman)++ (committed using GitHub Web editor) | 4 files
Merge pull request #3406 from rakudo/revert-3388-problem-solving-142

Revert "Export packages by user given names"
03:53 sena_kun joined 03:55 Altai-man_ left 05:52 Altai-man_ joined 05:54 sena_kun left
AlexDaniel toddr: github.com/Raku/old-issue-tracker 06:07
toddr: feel free to recreate the repo if you need to
bartolin lizmat: afaics github.com/rakudo/rakudo/commit/09e66e504f didn't cause any problems for the JVM backend 06:24
07:53 sena_kun joined 07:55 Altai-man_ left
lizmat Files=1295, Tests=109667, 211 wallclock secs (28.61 usr 8.34 sys + 2958.63 cusr 272.70 csys = 3268.28 CPU) 08:06
09:37 Ulti left 09:52 Altai-man_ joined 09:54 sena_kun left
nine japhb: sorry, still cannot reproduce in any way 11:09
jdv79: same for you. None of my usual tricks helps in recreating the segfault. 11:24
There's nothing I can do right now :/ 11:25
11:53 sena_kun joined 11:55 Altai-man_ left
|Tux| Rakudo version 2019.11-393-gb12ba42c8 - MoarVM version 2020.01
csv-ip5xs            0.712 -  0.740
csv-ip5xs-20         6.454 -  6.457
csv-parser          22.491 - 23.172
csv-test-xs-20       0.427 -  0.430
test                 7.220 -  7.451
test-t               1.726 -  1.743
test-t --race        0.805 -  0.811
test-t-20           29.371 - 29.711
test-t-20 --race     9.281 -  9.337
12:03
nine Well, well....I can't reproduce it on my development machine, but of course, things explode immediately in our production system... 12:39
lizmat ;-( 12:40
nine Why the hell do these things only pop up immediately after a release?! 12:41
dogbert11 nine: I can repro jdv's error as well 12:42
I'm 'bisecting' atm and the problem is introduced *very* shortly after fc41473c03659 12:43
seems to be 357adb45bf831 12:45
github.com/MoarVM/MoarVM/commit/35...6542636fa5 12:46
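The manual 'bisect' described above amounts to a binary search over an ordered commit range. A minimal sketch, assuming the first commit is known good, the last is known bad, and everything after the first bad commit stays bad. The commit labels and the `is_bad` predicate are hypothetical stand-ins for building each revision and running the failing test:

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in an ordered range.

    Assumes commits[0] is good, commits[-1] is bad, and that every
    commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history c0..c7 where the breakage starts at c5.
history = [f"c{i}" for i in range(8)]
print(first_bad(history, lambda c: int(c[1:]) >= 5))
```

Each step halves the range, so even a few hundred commits need fewer than ten builds.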
nine But...how? 12:55
In our development VM, I got the segfault exactly once. Then it only complained about not being able to read certificate files and log files and after fixing those it always dies with "Could not find symbol '&SSL_CTX_use_certificate_chain_file'" 13:02
dogbert11 I cloned github.com/croservices/cro-http.git and ran the test from the clone directory 13:05
'zef install cro' doesn't work for me atm. Problems with certificates and OO::Monitors 13:06
nine Ah, yes, that gives me a reproducible segfault in the dev VM 13:09
dogbert11 yay
nine The bad thing is, that I don't have rr there
dogbert11 does the 'bisect' above help at all? 13:10
nine Any information can help 13:12
dogbert11 for me things work on fc41473c03659 but SEGV's on 357adb45bf83 13:14
lizmat: did you see colabti.org/irclogger/irclogger_lo...0-01-06#l1 ? 13:16
lizmat yeah, I was working on fixing that when I had to start working on the Weekly 13:17
thanks for the hint
bisectable6: say (_foo => 42).perl
bisectable6 lizmat, Bisecting by output (old=2015.12 new=b12ba42) because on both starting points the exit code is 0
lizmat, bisect log: gist.github.com/690318c36dd7483dd3...70b0d22308
lizmat, (2020-01-05) github.com/rakudo/rakudo/commit/0d...5a99c578fa
lizmat yup, that's the one :-) 13:18
nine dogbert11: it may depend on openssl version 13:28
I have 1.0.2j on the dev system where it fails and 1.1.1d on my machine where it works 13:29
sena_kun is happy to see there is progress in identifying the segfault (and unhappy that it is there after a release) 13:31
dogbert11 I have 1.0.2g it seems 13:33
sena_kun I hope it is possible to make a hotfix release later, though we need to fix some things beforehand (e.g. cro; jnthn has already been notified about the need for CPAN releases) and do another ecosystem run
nine And works on our newer dev VM with OpenSSL 1.1.0i 13:35
dogbert11 I have 1.1.1d on my RPi 4 :)
lizmat sena_kun: only a MoarVM release, right ? 13:36
sena_kun lizmat, we don't have a rakudo one yet 13:39
lizmat, so yes
lizmat ok, *phew*, so a MoarVM point release is still an option :-) 13:40
13:40 pmurias_ left
sena_kun lizmat, I suspect this is THE option currently, right. 13:41
lizmat now only the minor issue of finding and fixing the problem :-)
sena_kun lizmat, and me torturing ecosystem modules. ;) 13:42
nine And yes! Upgrading OpenSSL on the old dev VM makes the segfaults go away
lizmat :-) 13:43
nine I guess the "Could not find symbol '&SSL_CTX_use_certificate_chain_file'" comes from OpenSSL determining the libopenssl version at compile time. Obviously not the right thing to do 13:48
13:52 Altai-man_ joined, pmurias_ joined 13:55 sena_kun left
nine And downgrading libopenssl made the error appear on my laptop where I can use rr :) 14:00
dogbert11 excellent 14:02
nine Well there's definitely a bug in IO::Socket::Async::SSL. It never checks the return value of the SSL_CTX_new call which can be NULL when an error happened. 14:20
Ah yes, the "human-readable error message" for that call is: error:140A90A1:lib(20):func(169):reason(161) 14:24
Now everything is clear!
14:26 pmurias_ left, pmurias joined
nine Ah, with some more initialization, we get the slightly more useful error: "error:140A90A1:SSL routines:SSL_CTX_new:library has no ciphers" 14:26
The missing initialization for error messages and the segfault actually have the same reason: IO::Socket::Async::SSL does call both OpenSSL::SSL::SSL_library_init() and OpenSSL::SSL::SSL_load_error_strings() in its mainline. 14:34
But then this happens, illuminated by a gdb warning: warning: Temporarily disabling breakpoints for unloaded shared library "/usr/lib64/libssl.so"
So we load the native lib, then unload it again and then load it without initialization. 14:35
I didn't even know that one could unload a shared library
So my fixes are actually quite innocent! It's just another one of these "how could this have worked for so long??" cases 14:42
The actual source of the issue is: github.com/MoarVM/MoarVM/blob/mast...all.c#L147
When we GC a NativeCall site (i.e. any native function), we free the library handle. But the handle will be shared among all functions that come from the same library! 14:43
[Coke] nine++ 14:44
lizmat ew... so that implies you're ok if you're just using one function from each library you load ?
nine lizmat: or if you keep those functions around, which is what happens in most cases 14:45
lizmat aha... so the assumption was not that crazy
nine But if you assume that functions are never collected, there's no reason to implement the unloading in gc_free... 14:48
Well the proper fix would probably involve some reference counting for lib handles. A quick fix can be to just remove the call to MVM_nativecall_free_lib. That seems to fix it here 14:49
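A minimal sketch of the reference-counting idea mentioned above, not the fix that was eventually pushed. It assumes one shared handle per library name and uses plain Python callables as stand-ins for loading and unloading a native library. `LibRegistry`, `fake_load`, and `fake_unload` are hypothetical names:

```python
class LibRegistry:
    """Hands out one shared handle per library name and counts its users."""

    def __init__(self, loader, unloader):
        self._loader = loader      # stand-in for loading a native library
        self._unloader = unloader  # stand-in for unloading it again
        self._entries = {}         # name -> [handle, refcount]

    def acquire(self, name):
        entry = self._entries.get(name)
        if entry is None:
            entry = self._entries[name] = [self._loader(name), 0]
        entry[1] += 1
        return entry[0]

    def release(self, name):
        entry = self._entries[name]
        entry[1] -= 1
        if entry[1] == 0:  # last user is gone, so unloading is safe now
            self._unloader(entry[0])
            del self._entries[name]


events = []


def fake_load(name):
    events.append(("load", name))
    return "handle:" + name


def fake_unload(handle):
    events.append(("unload", handle))


registry = LibRegistry(fake_load, fake_unload)
site_a = registry.acquire("libssl.so")  # first native function
site_b = registry.acquire("libssl.so")  # second function, same library
registry.release("libssl.so")           # one site collected: no unload yet
print(events)
registry.release("libssl.so")           # last site collected: unload
print(events)
```

The point of the count is that freeing one call site no longer pulls the library out from under the other sites that share the handle.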
lizmat I'll take some leakage over complicated reference counting shenanigans any time 14:50
especially since those are no guarantee against leakage when buggy
(note the use of the word "when" rather than "if" :-) 14:51
nine Well in this place it's really just one place that allocates and one place that may free, so I ought to get it right in no more than 13 tries
in this case
Damn it, the perfect joke would have been "get it right in no more than -1 tries" :) 14:52
Even jokes about reference counting seem to be hard to get right...
lizmat once had a countdown on a program where *many* people were eagerly awaiting what was to come 14:53
and then the countdown went to -1, -2, -3 ARRGH... :-)
and people thought that was the joke :-( 14:54
moritz :-) 14:58
nine Actually I could even use the GC instead of a reference count. That would involve adding a new 6model repr. 14:59
In any case I'll probably need a hash to record the loaded library handles in 15:00
pmurias [Coke]: I emailed Makoto, and I have to wait for a vote. I'll contact you by email for anything important/urgent 15:06
tellable6 2020-01-05T22:26:21Z #raku-dev <[Coke]> pmurias - I haven't heard anything since I spoke to makoto. IRC's not a great way to reach me, please use email or TPF slack for more urgent things.
15:12 lucasb joined
[Coke] pmurias: Sorry about the delays, but hopefully this is the final stretch. Thanks again for your work on the JS backend. 15:14
15:16 pmurias left, pmurias joined
toddr AlexDaniel then I can just rename p6rt. DONE! github.com/Raku/old-issue-tracker/issues 15:27
nine Oh, gc_free is also called by repossession.
AlexDaniel toddr: so… this is it?
toddr yep. I've got the RT mapping. I think lizmat wanted it. 15:28
AlexDaniel toddr: you mean rt id → github issue?
lizmat whee!
AlexDaniel I want to have that too
lizmat yeah, I want to have that to fix all references in core code to use the new numbers :)
toddr ok... gist? 15:29
AlexDaniel sure
toddr gist.github.com/toddr/5c182f56f4b6...c8b922955f 15:30
I've passed this on to P6 NOC. He'll set up the mapping and do the final shutdown of rt.perl.org 15:32
p5 NOC? 15:33
whatevs
Words are hard
nine Actually, we never free the handle that is later used for the failing call. There seem to be 2 different handle pointers in use for libssl.so. Unloading the one will also uninitialize the other (as the lib is probably still only loaded once) 15:39
[Coke] toddr++ 15:47
nine Ok, the sequence seems to be: load IO::Socket::Async::SSL, run its mainline which will cause loading of libssl.so and the calls to SSL_library_init and SSL_load_error_strings, then we repossess the objects which triggers gc_free and unloading of the lib. Then we load the lib again when we reinitialize all those repossessed objects 15:53
15:53 sena_kun joined 15:54 Altai-man_ left
nine So the real trick will be to have the library survive repossession. Even a reference count won't do as we gc_free all NativeCall objects, even when they will still be used 15:58
I guess the reason why Inline::Perl5 is not affected is that I run initialization in an INIT phaser exported by Inline::Perl5, i.e. only after repossession. 16:01
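The sequence described above can be sketched as a small simulation. This is only a model under stated assumptions: `FakeLib` is a hypothetical stand-in for a native library whose initialization state is lost on unload, and `run` replays the load, init, repossession, and reload steps in order:

```python
class FakeLib:
    """Stand-in for a native library whose state lives only while loaded."""

    def __init__(self):
        self.loaded = False
        self.initialized = False

    def load(self):
        if not self.loaded:
            self.loaded = True
            self.initialized = False  # a fresh load starts uninitialized

    def unload(self):
        self.loaded = False
        self.initialized = False

    def library_init(self):
        self.initialized = True

    def ctx_new(self):
        # Mirrors the reported symptom: no context without initialization.
        return object() if self.initialized else None


def run(unload_on_gc_free: bool):
    lib = FakeLib()
    lib.load()             # module mainline loads the library...
    lib.library_init()     # ...and initializes it once
    if unload_on_gc_free:  # repossession triggers gc_free on the call site
        lib.unload()
    lib.load()             # repossessed objects are set up again
    return lib.ctx_new()


print(run(True) is None)   # library was unloaded: context creation fails
print(run(False) is None)  # library survived: context creation works
```

In the model, running the initialization only after repossession (as described for the INIT phaser approach) would also avoid the failure, because nothing unloads the library afterwards.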
japhb catches up with segfault analysis backlog -- nine++ 16:05
OOC, why does the OpenSSL version matter for this?
nine I think it's that OpenSSL 1.1 doesn't depend as much on that call to SSL_library_init 16:06
japhb Ah 16:10
nine japhb: just pushed the fix to MoarVM. Please test! 16:20
17:31 pmurias left 17:37 pmurias joined
jdv79 wonder how it happened but cool that it might be fixed! 17:39
not near my box at the moment but should be in 7h or so 17:40
17:52 Altai-man_ joined 17:54 sena_kun left 18:11 |Tux| left 18:16 |Tux| joined
lizmat notable6: weekly 18:28
notable6 lizmat, 9 notes: gist.github.com/065abbc760d5b81dc7...c04911bb92
lizmat notable6: weekly reset 18:41
notable6 lizmat, Moved existing notes to “weekly_2020-01-06T18:41:13Z”
AlexDaniel lizmat: definitely github.com/perl6/old-issue-tracker and toddr++ 18:47
lizmat AlexDaniel: that URL 404's ? 18:48
AlexDaniel I mean
github.com/Raku/old-issue-tracker
lizmat hehe
19:53 sena_kun joined 19:54 pochi_ left, pochi joined, Altai-man_ left
nine Oh wow... I just realized that the Open Build Service even checks for actual changes in the built files on a rebuild. 20:00
And it even writes those to the log. As e.g. for NQPHLL.moarvm which clearly has some differences in the order of some elements: build.opensuse.org/package/live_bu...ory/x86_64 20:01
Considering how the OBS already is so advanced with a huge build farm, web and command line interface, rebuilds of all dependent packages on any upload including running tests and as pointed out diffing files on rebuilds and all of that for a multitude of distributions and architectures... 20:03
... I wonder why we're even talking about building some test infrastructure ourselves.
Oh, and not to forget, in contrast to Travis and AppVeyor, the OBS is 100 % reliable. If a build fails, it's because there's something broken in our stuff 20:05
20:12 pochi left, pochi joined
japhb nine: The machine that I was seeing the segfaults on is at a different (network isolated) location sadly, and this machine runs a different distro than that one. Still, I might be able to recreate if I get (un?)lucky. I see you didn't bump -- what's the build invocation to ensure I get exactly the build of current rakudo/nqp but the MoarVM commit with the proposed fix? 20:18
nine japhb: I just use 3 separate git clones of rakudo, nqp and MoarVM. With that it's just a git pull && make install 20:19
japhb nine: Ah, OK. I *start* with separate bare clones, but then in my build script I clone my bare rakudo into a new working tree, and then clone the bare nqp and MoarVM into that same tree in the subdirectories that the default build expects, and then start the overall build with `perl Configure.pl --backends=moar --gen-nqp --gen-moar && make && make test && make install` -- so I'm relying on the _REVISION files to do their thing. 20:25
sena_kun do we have anything against bumping the fix? 20:35
nine nope 20:38
I already deployed packages with the fix on our production system and it fixed the issue there as well 20:39
sena_kun oki 20:40
I'll bump then and with some tweaks will run another round of blin. luckily, Cro::TLS and friends installation is fixed.
20:49 AlexDaniel left
lizmat and another Rakudo Weekly News hits the Net: rakudoweekly.blog/2020/01/06/2020-...foresight/ 20:54
21:18 pmurias left 21:42 pyrimidine joined, pyrimidine left, pyrimidine joined
sena_kun t/spec/S11-modules/import-long.rakudo.moar ........................ Dubious, test returned 1 (wstat 256, 0x100) 21:43
hmmmmmmmmm
I am running spectest again, but... 21:46
21:46 |Tux| left
sena_kun t/08-performance/99-misc.t ...................................... Dubious, test returned 1 (wstat 256, 0x100) 21:48
getting such things is not normal, right?
21:52 Altai-man_ joined 21:55 sena_kun left
japhb sena_kun: There are some known flappers. I don't spectest much, so I'm out of date, but lizmat may know them. zoffix used to keep an up to date list, but I'm assuming that left when zoffix did. 21:55
tellable6 japhb, I'll pass your message to sena_kun
Altai-man_ japhb, thanks. 21:56
21:56 |Tux| joined
Altai-man_ this one is not a flapper, sadly... 22:03
Geth Blin/enchance: f65d27431e | Altai-man++ | bin/blin.p6
Revert "Ignore abandoned modules"

This reverts commit f3f8d4500d257f34a2d943f3d08c30fe9f11d4bf.
Too controversial to die on this hill.
22:20
Blin/master: 9 commits pushed by Altai-man++, (Aleks-Daniel Jakimenko-Aleksejev)++ 23:02
23:06 |Tux| left 23:08 |Tux| joined 23:12 AlexDaniel joined, AlexDaniel left, AlexDaniel joined
Geth rakudo/master: 10 commits pushed by (Daniel Green)++, MasterDuke17++
review: github.com/rakudo/rakudo/compare/b...5701171f23
23:42
23:51 lucasb left 23:53 sena_kun joined 23:55 Altai-man_ left