github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
MasterDuke nine: re github.com/MoarVM/MoarVM/pull/1454...-627484184 i assume i need to introduce a mutex as part of moving them to the instance? 08:00
nwc10 good *, #moarvm 08:14
japhb Good * 08:35
Geth MoarVM: 9b5d14c0c9 | (Daniel Green)++ | src/core/exceptions.c
Save malloc+free per frame when creating backtrace

By using a fixed size char array. The number of instructions reported by callgrind does vary a bit, even with `MVM_SPESH_BLOCKING=1`, but it seems to save ~1m instructions for `my $a; $a = do given "hi" { when 3 { "three" }; when 5 { "five" }; when "hi" { "hello" }; when * > 40 { "> 40" }; default { "default" } } for ^1_000; say $a`.
09:57
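A standalone sketch of the pattern behind 9b5d14c0c9 (simplified; not the actual exceptions.c diff, and the function names and buffer size here are illustrative):

    #include <stdio.h>
    #include <stdlib.h>

    /* Before: a heap buffer allocated (and later freed) for every
     * backtrace frame. */
    char *frame_line_before(const char *file, int line) {
        char *buf = malloc(256);              /* one malloc per frame... */
        snprintf(buf, 256, "  at %s:%d", file, line);
        return buf;                           /* ...and one free, later */
    }

    /* After: a caller-provided fixed size char array; no allocator
     * traffic at all. */
    void frame_line_after(char buf[256], const char *file, int line) {
        snprintf(buf, 256, "  at %s:%d", file, line);
    }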
MoarVM: 63a69af5c2 | (Daniel Green)++ | 3 files
Use the FSA for MVMActiveHandlers

A lot of them get quickly allocated and freed when generating backtraces, so use the FSA to hopefully avoid the actual malloc/free.
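And a toy illustration of why the FSA in 63a69af5c2 helps (MoarVM's real allocator in src/core/fixedsizealloc.c is more involved, with per-size bins and thread safety; this only shows the free-list idea):

    #include <stdlib.h>

    /* Recycle same-size blocks through a free list so rapid alloc/free
     * cycles (like MVMActiveHandlers during backtraces) skip most
     * malloc/free calls. Blocks must be >= sizeof(FreeNode) bytes. */
    typedef struct FreeNode { struct FreeNode *next; } FreeNode;
    static FreeNode *free_list = NULL;

    void *fsa_alloc(size_t bytes) {
        if (free_list) {               /* fast path: reuse a freed block */
            FreeNode *n = free_list;
            free_list = n->next;
            return n;
        }
        return malloc(bytes);          /* slow path: a real allocation */
    }

    void fsa_free(void *p) {           /* never calls free(); just recycles */
        FreeNode *n = p;
        n->next = free_list;
        free_list = n;
    }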
MoarVM: 42959e4645 | MasterDuke17++ (committed using GitHub Web editor) | 3 files
Merge pull request #1457 from MasterDuke17/minor_optimizations_when_generating_backtraces
nine MasterDuke: I don't think so? 09:58
If anything moving them into the instance would make accessing them (a bit) safer.
MasterDuke nine: hm. naively moving them causes a pretty quick segv 09:59
gist.github.com/MasterDuke17/c0900...02d4e8227e don't have time to debug now, but this causes a segv with `raku --full-cleanup -e ''` 10:01
dogbert11 just verified that github.com/MoarVM/MoarVM/issues/1333 is still present in case anyone is interested :) 10:41
dogbert11 worst sales pitch ever 10:42
nine dogbert11: how can I reproduce it? 11:07
dogbert11 nine: running it in a loop with ASAN and FSA_SIZE_DEBUG 1. It does not happen all the time. 11:10
nine Oh... at some point I removed the --asan from my debug-on script 11:15
dogbert11 this one is a bit tedious atm. I can try to narrow it down a bit, i.e. figure out whether a specific test is causing the problem. 11:17
there's also github.com/MoarVM/MoarVM/issues/1129 11:19
it tends to crop up quite often when running the script in a loop 11:20
if we get a message like 'Cannot find method 'sink' on 'BOOTCode': no method cache and no .^find_method' does that mean that MoarVM has called MVM_exception_throw_adhoc? 11:27
MasterDuke MVM_exception_throw_adhoc_free in this case 11:28
dogbert11 MasterDuke: ah cool, so that's where a breakpoint should be placed then
nine dogbert11: I usually look for the error message ("no method cache" in this case) and set a breakpoint on that exact line. 11:38
Setting a breakpoint on MVM_exception_throw_adhoc doesn't gain much, since we do throw a lot of exceptions even in normal operation 11:39
dogbert11 nine: thx, the stacktrace looks like this 11:42
(gdb) bt
#0 MVM_6model_find_method (tc=0x55555555a180, obj=0x555558102cb0, name=0x55555889c860, res=0x5555582d7558, throw_if_not_found=1) at src/6model/6model.c:142
#1 0x00007ffff784e785 in MVM_interp_run (tc=0x55555555a180, initial_invoke=0x7ffff79cee08 <toplevel_initial_invoke>, invoke_data=0x5555556090e8, outer_runloop=0x0) at src/core/interp.c:1833
#2 0x00007ffff79cef8e in MVM_vm_run_file (instance=0x555555559660, filename=0x5555555595e0 "/home/dogbert/.rakudobrew/versions/moar-master/install/share/perl6/runtime/perl6.moarvm") at src/moar.c:504
#3 0x00005555555558e8 in main ()
nine Which unfortunately doesn't help much, since that's just what executing a findmeth op looks like 11:44
FWIW I've already sunk a couple of hours into this issue. It's clearly caused by spesh, yet it's also quite elusive. For example, I've yet to see it fail with MVM_SPESH_INLINE_DISABLE=1, but that doesn't mean anything, since pretty much everything you could do makes it go away 11:45
dogbert11 I guess dumping the bytecode doesn't help much either 11:48
dogbert11 it failed now and the bytecode looks the same as in the original report so that part is consistent at least 11:49
multitasking a bit, I got the nonblocking await bug again, it seemed to fail after test 21 had completed 11:57
yeah, test 22 (Server survived) might possibly be a suspect 11:58
dogbert11 nine: I just made the following fail, 'while MVM_SPESH_INLINE_DISABLE=1 perl6 --ll-exception -Ilib repos/rakudo/t/spec/S17-supply/return-in-tap.t; do :; done' 12:01
nine Thanks for proving my point about how devious this bug is :) 12:05
dogbert11 any tips about how to proceed?
nine Anyway this weekend's debugging made me think of another thing: hash randomization. Maybe we can find a hash salt that makes it fail more consistently 12:06
nine MVM_SPESH_BLOCKING + disabled hash randomization should lead to very consistent runs 12:06
dogbert11 and where do I turn off randomization
nine MVM_HASH_RANDOMIZE in src/moar.h 12:07
dogbert11 thx, will try to turn it off 12:08
nine Then there's the constant hash salt in src/core/str_hash_table.c:151 which by default is 0 but can be set to other values to try different hash orderings. 12:09
Actually needs an addition at line 172 to set control->salt in that case, too
dogbert11 so the salt needs to be set in two places? 12:12
nine yes
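Spelled out, the experiment nine describes looks roughly like this (a sketch; the surrounding str_hash_table.c code is assumed, not quoted):

    /* src/core/str_hash_table.c, around line 151: a constant salt used
     * instead of a random one; try different values for different
     * hash orderings. */
    MVMuint64 salt = 0;     /* 0 is the default; try 1, 2, ... */

    /* ...and around line 172: also record the chosen salt on the
     * control structure, so later lookups hash with the same value. */
    control->salt = salt;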
dogbert11 btw, I managed to golf issue 1333 a bit and wrote a comment on the issue 12:14
nine dogbert11: I've tested salts from 0 to 200 with 10 runs each. Haven't seen a single failure yet :( 12:44
dogbert11 :( 12:49
is the hash randomization affecting spesh then? 12:50
nine Not directly. It can affect the order of operations where hashes are involved, which leads to a different order in the spesh logs (as calls are made in a different order), which can mean that e.g. there may or may not be a speshed candidate available for inlining when we spesh a caller 13:20
lizmat perhaps it would make sense to not use hashes there ? 13:30
or perhaps have an interface to some dual list structure that would look like a hash, but would do a serial lookup instead of actually hashing, and thus be repeatable? 13:31
not sure what the keys on those hashes are
nine lizmat: where? 14:10
lizmat spesh ? 14:11
you're saying hashes are being used in spesh logs, and as such are a source of unrepeatability
nine Spesh isn't using hashes for anything. But we use hashes for lots of things in Raku. Which means that the order of operations of lots of things depends on hash order. The spesh log just reflects that order of operations. 14:12
lizmat aaah... ok
sorry, I've misunderstood then
nine np 14:13
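A standalone toy demo of the chain nine describes above (not MoarVM code): the same keys land in a different bucket order under a different salt, so anything driven by hash iteration order differs with it.

    #include <stdio.h>

    #define NBUCKETS 8

    /* Tiny salted string hash, for illustration only. */
    static unsigned hash(const char *s, unsigned salt) {
        unsigned h = salt;
        while (*s)
            h = h * 31 + (unsigned char)*s++;
        return h % NBUCKETS;
    }

    /* Print keys in bucket order, i.e. the order an iterator sees them. */
    static void dump_order(unsigned salt) {
        const char *keys[] = { "sink", "quit", "tap", "done" };
        printf("salt %u:", salt);
        for (unsigned b = 0; b < NBUCKETS; b++)
            for (int i = 0; i < 4; i++)
                if (hash(keys[i], salt) == b)
                    printf(" %s", keys[i]);
        printf("\n");
    }

    int main(void) {
        dump_order(0);   /* prints: salt 0: done sink tap quit */
        dump_order(1);   /* prints: salt 1: quit tap done sink */
        return 0;
    }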
lizmat fwiw, I've replaced a lot of hashes in the RakuAST grammar by actual classes / objects 14:14
well, no, actually, I've made it possible to do so :-)
they still need to be hacked in :-) 14:15
nine That must be a win :) 14:16
dogbert11 heh, this bug is annoying as hell 14:28
tried something different, i.e. 'while MVM_SPESH_LOG=1 perl6 --ll-exception -Ilib repos/rakudo/t/spec/S17-supply/return-in-tap.t; do :; done' 14:29
ok 1 - return in a Supply quit handler works fine
1..1 14:30
ok 1 - return in a Supply quit handler works fine
Segmentation fault (core dumped)
nine Did you by any chance catch that core dump? 14:31
And have a build with debug symbols?
dogbert11 I might, let me see 14:32
nine: gist.github.com/dogbert17/c6fa4ea7...45171ff701 14:34
I suspect you'll be able to replicate the above quite easily 14:37
lizmat And another Rakudo Weekly News hits the Net: rakudoweekly.blog/2021/04/05/2021-...ting-rats/ 14:43
dogbert11 lizmat++
nine dogbert11: actually, I have yet to see a segfault with this. At least I got the "Cannot find method 'sink'" for the first time today 15:21
Since even with MVM_SPESH_BLOCKING=1 and hash randomization disabled the bug is...well, quite random, my only other idea is uninitialized memory 15:22
Which fits the bug apparently going away with just different debugging/logging options set 15:23
dogbert11 nine: don't you get any problems at all when running with the spesh log? 15:35
nine It seems to fail more often with spesh log and less often with spesh blocking 16:16
dogbert11 but you get neither SEGVs nor output like "free(): corrupted unsorted chunks"? 16:21
nine nope 16:33
dogbert11 bizarre 16:37
the SEGVs disappear if MVM_SPESH_BLOCKING is set or if MoarVM is compiled with --no-optimize 16:38
I'm running on 'Rakudo(tm) v2021.03-116-g0b91b21af. Built on MoarVM version 2021.03-24-g1538f98da.' 16:42
dogbert11 wipes and reinstalls rakudo 16:55
nine Disabling ASLR does not seem to make any difference 17:09
dogbert11 got the crash on my reinstalled rakudo 17:19
dogbert@dogbert-VirtualBox:~/repos/rakudo$ while MVM_SPESH_LOG=1 ./rakudo-m -Ilib t/spec/S17-supply/return-in-tap.t; do :; done 17:20
1..1
ok 1 - return in a Supply quit handler works fine
free(): corrupted unsorted chunks1..1
1..1
dogbert11 ok 1 - return in a Supply quit handler works fine 17:20
Segmentation fault (core dumped)
nine huh... 17:47
Meanwhile I've checked whether memory poisoning from GC_DEBUG=3 makes a difference: no failure so far... 17:48
dogbert11 something is very fishy but I don't know what. 18:02
the SEGVs only show up if I enable the spesh log, weird 18:03
nine always the same backtrace? 18:05
dogbert11 nine: sorry for the late reply, was eating dinner. The backtrace (on my vanilla Rakudo install) looks like this: gist.github.com/dogbert17/321a7ecc...6b6f2b3a08 18:44
MasterDuke github.com/MoarVM/MoarVM/issues/1434 also mentions MVM_SPESH_LOG 18:56
dogbert11 MasterDuke: the backtrace in 1434 looks similar to the second one posted just above your comment 19:04
MasterDuke: does anything strange happen if you run 'while MVM_SPESH_LOG=1 ./rakudo-m -Ilib t/spec/S17-supply/return-in-tap.t; do :; done' 19:05
or is it only on my system
[Coke] ls 19:12
... ww
nine MasterDuke: how can I reproduce that one from 1434? 19:16
Ah, revert a5ca53a8b4bfd789ef5329218900737bd566a841 19:18
nine And yet again...the fix is to simply run things in rr 19:26
nine rr is the magic "make MoarVM perfectly stable" tool 19:26
nine After 295 runs with --chaos, i.e. random scheduling decisions, it did end in a segfault. But a different one 19:53
dogbert11 sounds like a hard won success 19:55
nine My segfault is deep inside vprintf when printing what appears to be a rather mundane guard_tree string 19:56
dogbert11 I've seen that as well 20:00
#3 _IO_new_file_xsputn (f=0x55a9b753a9d0, data=<optimized out>, n=7) at fileops.c:1197 20:02
#4 0x00007fe5cafa6af2 in __vfprintf_internal (s=0x55a9b753a9d0, format=0x7fe5cb4adcc9 "After:\n%s", ap=0x7fe5ca94d970, mode_flags=2) at ../libio/libioP.h:948
#5 0x00007fe5cb3ec5fe in MVM_spesh_debug_printf () from //home/dogbert/repos/rakudo/install/lib/libmoar.so
nine Ok, vfprintf is segfaulting because we already closed the spesh log file handle in MVM_vm_exit. It's just sloppy cleanup 20:12
Geth MoarVM/fix_segfault_on_exit_with_spesh_log: af5b7e4790 | (Stefan Seifert)++ | src/moar.c
Fix possible segfault on exit when using spesh log

A normal exit without --full-cleanup did not stop the spesh thread. So spesh may actually still be active and try to print things to the spesh log when MVM_vm_exit closes the spesh log file handle. This would lead to a segfault in vfprintf. Fix by stopping and joining the spesh thread even in MVM_vm_exit if spesh logging is active.
20:21
MoarVM: niner++ created pull request #1458:
Fix possible segfault on exit when using spesh log
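The shape of that fix, as a hedged sketch (the helper name below is hypothetical and the rest of MVM_vm_exit is assumed; instance->spesh_log_fh matches what the chat describes):

    void MVM_vm_exit(MVMInstance *instance) {
        /* Without --full-cleanup the spesh worker used to be left
         * running; stop and join it before its log handle goes away. */
        if (instance->spesh_log_fh) {
            stop_and_join_spesh_thread(instance);  /* hypothetical helper */
            fclose(instance->spesh_log_fh);
        }
        /* ...rest of the minimal (non --full-cleanup) teardown... */
    }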
nine MasterDuke, dogbert11: please try ^^^
nine Lesson of the day: in multi-threaded programs, always post backtraces of all threads 20:23
nine dogbert11: regarding issue 1333 try switching lines 885 and 886 in src/core/frame.c, i.e. set returner->extra = NULL before freeing it. 20:30
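As a hedged sketch of that swap (the real frame.c context around lines 885/886, including which free call is used, is assumed; a local temp keeps the free valid after the pointer is nulled):

    /* Publish NULL before the memory is released, so nothing can still
     * observe a dangling returner->extra while/after it is freed. */
    MVMFrameExtra *extra = returner->extra;
    returner->extra = NULL;
    MVM_fixed_size_free(tc, tc->instance->fsa, sizeof(MVMFrameExtra), extra);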
MasterDuke nine: github.com/MoarVM/MoarVM/issues/1434 was triggered by the now removed t/09-moar/10-inline-array-pos.t 20:55
[Coke] updated listing in github.com/MoarVM/MoarVM/issues/1135, broke it out by merged vs. unmerged. shows how old, last committer... 20:56
If someone OKs it, I can delete all the merged branches on the list. 20:57
weird, there's a branch that travis committed to?
[Coke] nwc10: github.com/MoarVM/MoarVM/issues/995 - any idea what platform that might be? 21:03
MasterDuke nine: nm, i see you mention reverting the commit that removed them. and yes, after restoring the tests, master segfaults pretty quickly, while your branch hasn't segfaulted after running a good while longer 21:06
[Coke]: fwiw, github.com/MoarVM/MoarVM/issues/995 might be OBE if/when github.com/MoarVM/MoarVM/pull/1402 is ever merged 21:13
dogbert11 looks like nine++ solved the speshlog problem 21:16