[08:00] nine: re https://github.com/MoarVM/MoarVM/pull/1454#pullrequestreview-627484184 i assume i need to introduce a mutex as part of moving them to the instance?
[08:14] good *, #moarvm
[08:35] Good *
[09:57] ¦ MoarVM: 9b5d14c0c9 | (Daniel Green)++ | src/core/exceptions.c
[09:57] ¦ MoarVM: Save malloc+free per frame when creating backtrace
[09:57] ¦ MoarVM:
[09:57] ¦ MoarVM: By using a fixed size char array. The number of instructions reported by
[09:57] ¦ MoarVM: callgrind do vary a bit, even with `MVM_SPESH_BLOCKING=1`, but it seems
[09:57] ¦ MoarVM: to save ~1m instructions for `my $a; $a = do given "hi" { when 3 {
[09:57] ¦ MoarVM: "three" }; when 5 { "five" }; when "hi" { "hello" }; when * > 40 {
[09:57] ¦ MoarVM: "> 40" }; default { "default" } } for ^1_000; say $a`.
[09:57] ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/commit/9b5d14c0c9
[09:57] ¦ MoarVM: 63a69af5c2 | (Daniel Green)++ | 3 files
[09:57] ¦ MoarVM: Use the FSA for MVMActiveHandlers
[09:57] ¦ MoarVM:
[09:57] ¦ MoarVM: A lot of them get quickly allocated and freed when generating
[09:57] ¦ MoarVM: backtraces, so use the FSA to hopefully avoid the actual malloc/free.
[09:57] ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/commit/63a69af5c2
[09:57] ¦ MoarVM: 42959e4645 | MasterDuke17++ (committed using GitHub Web editor) | 3 files
[09:57] ¦ MoarVM: Merge pull request #1457 from MasterDuke17/minor_optimizations_when_generating_backtraces
[09:57] ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/commit/42959e4645
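As a rough illustration of the pattern in the two commits above (a sketch only, not the actual MoarVM code; the function names are invented): the first commit formats each backtrace line into a fixed-size char array on the stack instead of a per-frame heap buffer, and the second avoids the same malloc/free churn for MVMActiveHandlers by going through the FSA, MoarVM's fixed-size allocator for small, frequently recycled structures.

    /* Illustrative only -- hypothetical helpers, not MoarVM source. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Before: every backtrace frame pays for a malloc and a later free. */
    char *frame_line_heap(const char *file, unsigned line) {
        size_t len = strlen(file) + 32;
        char *buf = malloc(len);              /* heap round trip per frame */
        if (buf)
            snprintf(buf, len, "  at %s:%u", file, line);
        return buf;                           /* caller must free(buf) */
    }

    /* After: a fixed-size automatic array, no allocator involved at all. */
    void frame_line_stack(const char *file, unsigned line, FILE *out) {
        char buf[1024];                       /* fixed size, lives on the stack */
        snprintf(buf, sizeof buf, "  at %s:%u", file, line);
        fputs(buf, out);
        fputc('\n', out);
    }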
[09:58] MasterDuke: I don't think so?
[09:58] If anything moving them into the instance would make accessing them (a bit) safer.
[09:59] nine: hm. naively moving them causes a pretty quick segv
[10:01] https://gist.github.com/MasterDuke17/c0900fc0f176c7d43f878002d4e8227e don't have time to debug now, but this causes a segv with `raku --full-cleanup -e ''`
[10:41] just verified that https://github.com/MoarVM/MoarVM/issues/1333 is still present in case anyone is interested :)
[10:42] * dogbert11 worst sales pitch ever
[11:07] dogbert11: how can I reproduce it?
[11:10] nine: running it in a loop with ASAN and FSA_SIZE_DEBUG 1. It does not happen all the time.
[11:15] Oh... at some point I removed the --asan from my debug-on script
[11:17] this one is a bit tedious atm. I can try to narrow it down a bit, i.e. whether a specific test is causing the problem.
[11:19] there's also https://github.com/MoarVM/MoarVM/issues/1129
[11:20] it tends to crop up quite often when running the script in a loop
[11:27] if we get a message like 'Cannot find method 'sink' on 'BOOTCode': no method cache and no .^find_method' does that mean that MoarVM has called MVM_exception_throw_adhoc?
[11:28] MVM_exception_throw_adhoc_free in this case
[11:28] MasterDuke: ah cool, so that's where a breakpoint should be placed then
[11:38] dogbert11: I usually look for the error message ("no method cache" in this case) and set a breakpoint on that exact line.
[11:39] Setting a breakpoint on MVM_exception_throw_adhoc doesn't gain much since we do throw a lot of exceptions even in normal operation
[11:42] nine: thx, the stacktrace looks like this
[11:42] (gdb) bt
[11:42] #0 MVM_6model_find_method (tc=0x55555555a180, obj=0x555558102cb0, name=0x55555889c860, res=0x5555582d7558, throw_if_not_found=1) at src/6model/6model.c:142
[11:42] #1 0x00007ffff784e785 in MVM_interp_run (tc=0x55555555a180, initial_invoke=0x7ffff79cee08, invoke_data=0x5555556090e8, outer_runloop=0x0) at src/core/interp.c:1833
[11:42] #2 0x00007ffff79cef8e in MVM_vm_run_file (instance=0x555555559660, filename=0x5555555595e0 "/home/dogbert/.rakudobrew/versions/moar-master/install/share/perl6/runtime/perl6.moarvm") at src/moar.c:504
[11:42] #3 0x00005555555558e8 in main ()
[11:44] Which unfortunately doesn't help much, since that's just what executing a findmeth op looks like
[11:45] FWIW I've already sunk a couple of hours into this issue. It's clearly caused by spesh. Yet it is also quite elusive. For example I've yet to see it fail with MVM_SPESH_INLINE_DISABLE=1 but that doesn't mean anything since pretty much everything you could do makes it go away
[11:48] I guess dumping the bytecode doesn't help much either
[11:49] it failed now and the bytecode looks the same as in the original report so that part is consistent at least
[11:57] multitasking a bit, I got the nonblocking await bug again, it seemed to fail after test 21 had completed
[11:58] yeah, test 22 (Server survived) might possibly be a suspect
[12:01] nine: I just made the following fail, 'while MVM_SPESH_INLINE_DISABLE=1 perl6 --ll-exception -Ilib repos/rakudo/t/spec/S17-supply/return-in-tap.t; do :; done'
[12:05] Thanks for proving my point about how devious this bug is :)
[12:05] any tips about how to proceed?
[12:06] Anyway this weekend's debugging made me think of another thing: hash randomization. Maybe we can find a hash salt that makes it fail more consistently
[12:06] MVM_SPESH_BLOCKING + disabled hash randomization should lead to very consistent runs
[12:06] and where do I turn off randomization?
[12:07] MVM_HASH_RANDOMIZE in src/moar.h
[12:08] thx, will try to turn it off
[12:09] Then there's the constant hash salt in src/core/str_hash_table.c:151 which by default is 0 but can be set to other values to try different hash orderings.
[12:09] Actually needs an addition at line 172 to set control->salt in that case, too
[12:12] so the salt needs to be set in two places?
[12:12] yes
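A toy demonstration of why the salt changes behaviour (this is not MoarVM's real hashing, just an arbitrary mixing function for illustration): with a different salt the same keys land in different buckets, so anything that walks the table sees a different iteration order, and everything downstream of that order shifts with it.

    /* Toy example: hash-table iteration order depends on the salt. */
    #include <stdint.h>
    #include <stdio.h>

    #define NBUCKETS 8

    uint64_t toy_hash(const char *key, uint64_t salt) {
        uint64_t h = salt ^ 0x9e3779b97f4a7c15ULL;   /* arbitrary mixing constants */
        while (*key) {
            h ^= (unsigned char)*key++;
            h *= 1099511628211ULL;
        }
        return h;
    }

    int main(void) {
        const char *keys[] = { "sink", "quit", "tap", "done" };
        uint64_t salts[] = { 0, 42 };
        for (int s = 0; s < 2; s++) {
            printf("salt %llu:", (unsigned long long)salts[s]);
            /* walk the buckets in order and show where each key ended up */
            for (unsigned b = 0; b < NBUCKETS; b++)
                for (int k = 0; k < 4; k++)
                    if (toy_hash(keys[k], salts[s]) % NBUCKETS == b)
                        printf(" %s", keys[k]);
            printf("\n");
        }
        return 0;
    }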
[12:14] btw, I managed to golf issue 1333 a bit and wrote a comment on the issue
[12:44] dogbert11: I've tested salts from 0 to 200 with 10 runs each. Haven't seen a single failure yet :(
[12:49] :(
[12:50] is the hash randomization affecting spesh then?
[13:20] Not directly. It can affect the order of operations where hashes are involved, which leads to a different order of spesh logs (as calls are made in a different order), which can mean that e.g. there may or may not be a speshed candidate available for inlining when we spesh a caller
[13:30] perhaps it would make sense to not use hashes there?
[13:31] or perhaps have an interface to some dual list structure that would look like a hash, but wouldn't actually hash but do a serial lookup and thus be repeatable?
[13:31] not sure what the keys on those hashes are
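The "dual list structure" suggested above would look roughly like the sketch below: keys and values kept in parallel arrays in insertion order, with a serial lookup, so iteration order is always the same. All names here are invented for illustration, and as nine points out just below, spesh itself turns out not to use hashes at all.

    /* Hypothetical insertion-ordered map with serial (O(n)) lookup. */
    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        char   **keys;
        void   **values;
        size_t   num, alloc;
    } OrderedMap;

    void *ordered_map_get(OrderedMap *m, const char *key) {
        for (size_t i = 0; i < m->num; i++)       /* serial lookup, no hashing */
            if (strcmp(m->keys[i], key) == 0)
                return m->values[i];
        return NULL;
    }

    void ordered_map_put(OrderedMap *m, const char *key, void *value) {
        for (size_t i = 0; i < m->num; i++) {
            if (strcmp(m->keys[i], key) == 0) {   /* overwrite an existing key */
                m->values[i] = value;
                return;
            }
        }
        if (m->num == m->alloc) {                 /* grow both arrays together */
            m->alloc  = m->alloc ? m->alloc * 2 : 8;
            m->keys   = realloc(m->keys,   m->alloc * sizeof *m->keys);
            m->values = realloc(m->values, m->alloc * sizeof *m->values);
        }
        m->keys[m->num] = malloc(strlen(key) + 1);
        strcpy(m->keys[m->num], key);
        m->values[m->num] = value;
        m->num++;
    }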
[14:10] lizmat: where?
[14:11] spesh?
[14:11] you're saying hashes are being used in spesh logs, and as such are a source of unrepeatability
[14:12] Spesh isn't using hashes for anything. But we use hashes for lots of things in Raku. Which means that the order of operations of lots of things depends on hash order. The spesh log just reflects that order of operations.
[14:12] aaah... ok
[14:12] sorry, I've misunderstood then
[14:13] np
[14:14] fwiw, I've replaced a lot of hashes in the RakuAST grammar by actual classes / objects
[14:14] well, no, actually, I've made it possible to do so :-)
[14:15] they still need to be hacked in :-)
[14:16] That must be a win :)
[14:28] heh, this bug is annoying as hell
[14:29] tried something different, i.e. 'while MVM_SPESH_LOG=1 perl6 --ll-exception -Ilib repos/rakudo/t/spec/S17-supply/return-in-tap.t; do :; done'
[14:29] ok 1 - return in a Supply quit handler works fine
[14:30] 1..1
[14:30] ok 1 - return in a Supply quit handler works fine
[14:30] Segmentation fault (core dumped)
[14:31] Did you by any chance catch that core dump?
[14:31] And have a build with debug symbols?
[14:32] I might, let me see
[14:34] nine: https://gist.github.com/dogbert17/c6fa4ea72ba201fb35fb0c45171ff701
[14:37] I suspect you'll be able to replicate the above quite easily
[14:43] And another Rakudo Weekly News hits the Net: https://rakudoweekly.blog/2021/04/05/2021-14-floating-rats/
[14:43] lizmat++
[15:21] dogbert11: actually, I have yet to see a segfault with this. At least I got the "Cannot find method 'sink'" for the first time today
[15:22] Since even with MVM_SPESH_BLOCKING=1 and hash randomization disabled the bug is... well, quite random, my only other idea is uninitialized memory
[15:23] Which fits the bug apparently going away with just different debugging/logging options set
[15:35] nine: don't you get any problems at all when running with the spesh log?
[16:16] It seems to fail more often with spesh log and less often with spesh blocking
[16:21] but you get neither SEGVs nor output like "free(): corrupted unsorted chunks"
[16:33] nope
[16:37] bizarre
[16:38] the SEGVs disappear if MVM_SPESH_BLOCKING is set or if MoarVM is compiled with --no-optimize
[16:42] I'm running on 'Rakudo(tm) v2021.03-116-g0b91b21af. Built on MoarVM version 2021.03-24-g1538f98da.'
[16:55] * dogbert11 wipes and reinstalls rakudo
[17:09] Disabling ASLR does not seem to make any difference
[17:19] got the crash on my reinstalled rakudo
[17:20] dogbert@dogbert-VirtualBox:~/repos/rakudo$ while MVM_SPESH_LOG=1 ./rakudo-m -Ilib t/spec/S17-supply/return-in-tap.t; do :; done
[17:20] 1..1
[17:20] ok 1 - return in a Supply quit handler works fine
[17:20] free(): corrupted unsorted chunks1..1
[17:20] 1..1
[17:20] ok 1 - return in a Supply quit handler works fine
[17:20] Segmentation fault (core dumped)
[17:47] huh...
[17:48] Meanwhile I've tried whether memory poisoning from GC_DEBUG=3 makes a difference: no failure so far...
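For context, memory poisoning in general looks something like the sketch below (a generic illustration, not MoarVM's GC_DEBUG code): freed memory is overwritten with a recognizable byte pattern so that stale reads blow up loudly instead of quietly returning plausible-looking data. The flip side is that it also perturbs memory contents and timing, which is exactly the kind of change that can make a heisenbug hide.

    /* Generic memory-poisoning sketch; the names are made up. */
    #include <stdlib.h>
    #include <string.h>

    #define POISON_BYTE 0xEF   /* any pattern unlikely to be a valid pointer */

    void poisoning_free(void *p, size_t size) {
        if (!p)
            return;
        memset(p, POISON_BYTE, size);   /* stale readers now see 0xEFEFEF... */
        free(p);
    }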
[18:02] something is very fishy but I don't know what.
[18:03] the SEGVs only show up if I enable the spesh log, weird
[18:05] always the same backtrace?
[18:44] nine: sorry for the late reply, was eating dinner. The backtrace (on my vanilla Rakudo install) looks like this: https://gist.github.com/dogbert17/321a7eccbbf5da55df7d166b6f2b3a08
[18:56] https://github.com/MoarVM/MoarVM/issues/1434 also mentions MVM_SPESH_LOG
[19:04] MasterDuke: the backtrace in 1434 looks similar to the second one posted just above your comment
[19:05] MasterDuke: does anything strange happen if you run 'while MVM_SPESH_LOG=1 ./rakudo-m -Ilib t/spec/S17-supply/return-in-tap.t; do :; done'
[19:05] or is it only on my system
[19:12] <[Coke]> ls
[19:12] <[Coke]> ... ww
[19:16] MasterDuke: how can I reproduce that one from 1434?
[19:18] Ah, revert a5ca53a8b4bfd789ef5329218900737bd566a841
[19:26] And yet again... the fix is to simply run things in rr
[19:26] rr is the magic "make MoarVM perfectly stable" tool
[19:53] After 295 runs with --chaos, i.e. random scheduling decisions, it did end in a segfault. But a different one
[19:55] sounds like a hard-won success
[19:56] My segfault is deep inside vprintf when printing what appears to be a rather mundane guard_tree string
[20:00] I've seen that as well
[20:02] #3 _IO_new_file_xsputn (f=0x55a9b753a9d0, data=, n=7) at fileops.c:1197
[20:02] #4 0x00007fe5cafa6af2 in __vfprintf_internal (s=0x55a9b753a9d0, format=0x7fe5cb4adcc9 "After:\n%s", ap=0x7fe5ca94d970, mode_flags=2) at ../libio/libioP.h:948
[20:02] #5 0x00007fe5cb3ec5fe in MVM_spesh_debug_printf () from //home/dogbert/repos/rakudo/install/lib/libmoar.so
[20:12] Ok, vfprintf is segfaulting because we already closed the spesh log file handle in MVM_vm_exit. It's just sloppy cleanup
[20:21] ¦ MoarVM/fix_segfault_on_exit_with_spesh_log: af5b7e4790 | (Stefan Seifert)++ | src/moar.c
[20:21] ¦ MoarVM/fix_segfault_on_exit_with_spesh_log: Fix possible segfault on exit when using spesh log
[20:21] ¦ MoarVM/fix_segfault_on_exit_with_spesh_log:
[20:21] ¦ MoarVM/fix_segfault_on_exit_with_spesh_log: A normal exit without --full-cleanup did not stop the spesh thread. So spesh
[20:21] ¦ MoarVM/fix_segfault_on_exit_with_spesh_log: may actually still be active and try to print things to the spesh log when
[20:21] ¦ MoarVM/fix_segfault_on_exit_with_spesh_log: MVM_vm_exit closes the spesh log file handle. This would lead to a segfault in
[20:21] ¦ MoarVM/fix_segfault_on_exit_with_spesh_log: vfprintf. Fix by stopping and joining the spesh thread even in MVM_vm_exit if
[20:22] ¦ MoarVM/fix_segfault_on_exit_with_spesh_log: spesh logging is active.
[20:22] ¦ MoarVM/fix_segfault_on_exit_with_spesh_log: review: https://github.com/MoarVM/MoarVM/commit/af5b7e4790
[20:22] ¦ MoarVM: niner++ created pull request #1458: Fix possible segfault on exit when using spesh log
[20:22] ¦ MoarVM: review: https://github.com/MoarVM/MoarVM/pull/1458
[20:22] MasterDuke, dogbert11: please try ^^^
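The shape of both the bug and the fix, reduced to generic pthreads (a sketch only; the names are invented and this is not the actual MoarVM code): a worker thread keeps writing to a shared FILE*, so the exiting thread must tell it to stop and join it before closing the handle, otherwise the worker may end up calling vfprintf on a dead stream.

    /* Minimal stop-then-join-then-close sketch; invented names. */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static FILE      *log_fh;
    static atomic_int keep_running = 1;

    static void *log_worker(void *arg) {
        (void)arg;
        while (atomic_load(&keep_running))
            fprintf(log_fh, "guard tree ...\n");   /* stand-in for spesh logging */
        return NULL;
    }

    int main(void) {
        pthread_t tid;
        log_fh = fopen("spesh.log", "w");
        if (!log_fh)
            return 1;
        pthread_create(&tid, NULL, log_worker, NULL);

        /* ... the program runs ... */

        /* Buggy exit path: calling fclose(log_fh) here, while the worker is
         * still running, is the crash in vfprintf seen above. */

        /* Fixed order: stop the worker, join it, then close the handle. */
        atomic_store(&keep_running, 0);
        pthread_join(tid, NULL);
        fclose(log_fh);
        return 0;
    }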
[20:23] Lesson of the day: in multi-threaded programs, always post backtraces of all threads
[20:30] dogbert11: regarding issue 1333 try switching lines 885 and 886 in src/core/frame.c, i.e. set returner->extra = NULL before freeing it.
[20:55] nine: https://github.com/MoarVM/MoarVM/issues/1434 was triggered by the now removed t/09-moar/10-inline-array-pos.t
[20:56] <[Coke]> updated listing in https://github.com/MoarVM/MoarVM/issues/1135, broke it out by merged vs. unmerged. shows how old, last committer...
[20:57] <[Coke]> If someone OKs it, I can delete all the merged branches on the list.
[20:57] <[Coke]> weird, there's a branch that travis committed to?
[21:03] <[Coke]> nwc10: https://github.com/MoarVM/MoarVM/issues/995 - any idea what platform that might be?
[21:06] nine: nm, i see you mention reverting the commit that removed them. and yes, after restoring the tests, master segfaults pretty quickly; your branch hasn't segfaulted after a while longer
[21:13] [Coke]: fwiw, https://github.com/MoarVM/MoarVM/issues/995 might be OBE if/when https://github.com/MoarVM/MoarVM/pull/1402 is ever merged
[21:16] looks like nine++ solved the speshlog problem