github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
00:04 lucasb left 00:24 AlexDaniel left 00:25 AlexDaniel joined, AlexDaniel left, AlexDaniel joined 01:55 mst left 05:37 evalable6 left, linkable6 left 05:39 evalable6 joined, linkable6 joined 06:15 Kaiepi joined
MasterDuke hm. has anyone pulled after my merge and run some spectests? i'm getting a bunch of random fails. different test file every time, they all run fine by themselves 07:53
haven't been able to catch anything in gdb, valgrind, etc 07:54
i've seen this before, so i don't think it's 100% caused by my PR. but maybe it exacerbated things? 08:04
oh. just got 'MoarVM panic: Must not GC when in the specializer/JIT' when building rakudo 08:05
ah. i have MVM_GC_DEBUG set to 1 08:08
would explain why building rakudo is so much slower 08:09
hm. the problems are hard to repro, and don't consistently happen in the same place. is there a way for MVM_panic to pause so i can attach gdb? or can it just start gdb for me? 08:20
is it not safe to do MVM_repr_at_key_o or MVM_HASH_GET in src/spesh/optimize.c? do they need to be rooted or something like that? 08:27
github.com/MoarVM/MoarVM/blob/mast...ze.c#L1328 is where it's MVM_panic'ing 08:28
hm. there aren't any other MVM_gc_mark_thread_blocked() calls in src/spesh/optimize.c... 08:29
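On the question of getting a panic to pause so gdb can be attached: the log does not show MoarVM offering this, so the following is only a minimal sketch of the common "spin until the debugger clears a flag" pattern. The names `pause_for_debugger_if_requested`, `should_pause_on_panic`, `panic_wait_flag` and the environment variable `PANIC_WAIT_FOR_DEBUGGER` are all hypothetical and not part of MoarVM.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag that a debugger clears to let the process continue. */
volatile int panic_wait_flag = 1;

/* Hypothetical policy: pause only when the variable is set to exactly "1". */
int should_pause_on_panic(const char *env_value) {
    return env_value != NULL && strcmp(env_value, "1") == 0;
}

/* Hypothetical hook, meant to be called at the top of a panic routine. */
void pause_for_debugger_if_requested(void) {
    if (!should_pause_on_panic(getenv("PANIC_WAIT_FOR_DEBUGGER")))
        return;
    fprintf(stderr, "panic in pid %ld; attach with: gdb -p %ld\n",
            (long)getpid(), (long)getpid());
    while (panic_wait_flag)   /* in gdb: set var panic_wait_flag = 0 */
        sleep(1);
}
```

With something like this in place, the process would print its pid and wait; after attaching, `bt` shows the stack at the point of the panic, and clearing the flag lets execution go on.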
08:34 sena_kun joined
MasterDuke well, if i just remove it i get `MoarVM panic: Invalid owner in item added to GC worklist` instead (with MVM_GC_DEBUG set to 3 i'm now getting these when building nqp, and much more reliably) 08:43
nine And if you then add appropriate rooting? 09:01
MasterDuke working on that now 09:02
nine I seemed to remember something about spesh not being allowed to GC. But then I tried to check this and found a couple of places in spesh that did, so I thought I misremembered
MasterDuke not sure exactly what needs to be rooted. entry and hash? 09:03
hm. MVM_gc_mark_thread_(un)blocked removed and syms, hll_name, hash, entry all rooted, but same panic about invalid owner 09:05
nine I don't think MVM_repr_at_key_o or MVM_HASH_GET allocate, so there shouldn't be any GC there. Also that would throw the same "Must not GC when in the specializer" error anyway 09:09
Also entry is not an MVMCollectable, it won't move 09:10
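The rooting question above can be illustrated with a toy moving collector. This is a sketch under stated assumptions, not MoarVM's implementation: `Obj`, `root_push`, `root_pop`, `toy_gc` and `demo` are made-up names, and the "collector" only copies rooted objects and rewrites the rooted pointers. It shows why a pointer to a collectable must be rooted across anything that can GC, and why data that lives outside the collected spaces (as nine says of `entry`) does not need it.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ROOTS 8

typedef struct { int value; } Obj;          /* stands in for a collectable */

static Obj    from_space[4], to_space[4];
static Obj  **roots[MAX_ROOTS];             /* addresses of rooted locals */
static size_t num_roots = 0;

void root_push(Obj **slot) { roots[num_roots++] = slot; }
void root_pop(void)        { num_roots--; }

/* Toy "collection": copy every rooted object, then fix up the rooted pointer. */
void toy_gc(void) {
    size_t next = 0;
    for (size_t i = 0; i < num_roots; i++) {
        to_space[next] = **roots[i];
        *roots[i] = &to_space[next++];
    }
    memset(from_space, 0xAB, sizeof from_space);  /* poison the old copies */
}

/* Returns 1 if the rooted pointer was updated and the unrooted one went stale. */
int demo(void) {
    Obj *rooted = &from_space[0], *unrooted = &from_space[1];
    rooted->value = 42;
    unrooted->value = 7;
    root_push(&rooted);
    toy_gc();
    root_pop();
    return rooted == &to_space[0] && rooted->value == 42
        && unrooted == &from_space[1] && unrooted->value != 7;
}
```

Rooting something that never moves is harmless in this model, which matches the observation later in the log that rooting extra variables did not change the outcome.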
MasterDuke then the only other call is to uv_mutex_lock ? 09:11
nine Sorry, no idea what would fail here :/ Guess it's a job for rr
MasterDuke ugh. guess i have to switch to the slow laptop then 09:12
09:24 Kaiepi left 09:25 leont joined, Kaiepi joined 09:35 Altai-man_ joined 09:38 sena_kun left 11:36 sena_kun joined 11:38 Altai-man_ left
timotimo there are spots where spesh is allowed to GC, it's only in between places when all our datastructures that may be pointing at stuff have been freed already 12:52
i wonder if some at_key_o calls will box 12:53
we don't have native-typed hashes, but maybe contexts or something like that have at_key ops for "native" things
MVMContext's at_key is indeed a bit more complicated 13:05
it uses the frame walker, gets the lexical via MVM_spesh_frame_walker_get_lex, which can vivify lexicals 13:06
which is an allocating op
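A minimal sketch of the pattern timotimo describes, where a lookup that reads like a plain "get" allocates on first access. The type `LazyLex` and the function `lazy_get` are hypothetical stand-ins and do not reflect the real `MVM_spesh_frame_walker_get_lex` signature or behaviour; the sketch only shows how an allocation can hide behind a read.

```c
#include <stdlib.h>

typedef struct {
    int *slot;        /* NULL until the first access vivifies it */
    int  allocations; /* counts how often the lookup had to allocate */
} LazyLex;

/* Looks like a plain read, but allocates the slot on first use. */
int *lazy_get(LazyLex *lex) {
    if (lex->slot == NULL) {
        lex->slot = calloc(1, sizeof *lex->slot);
        lex->allocations++;
    }
    return lex->slot;
}
```

In a runtime where allocation can trigger GC, a caller in a "must not GC" region would need to know that such a lookup is not a pure read.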
lizmat is that the one that dies if the key cannot be found?
timotimo nah, that one returns 0 when nothing was found 13:07
lizmat there's one REPR that does that, and it is very irritating :-)
timotimo umm, i think we shouldn't mark the spesh thread blocked actually 13:08
that would allow GC to kick in, have another thread steal this thread's work, and then gc would run inside of spesh 13:09
specifically, inside spesh's optimize phase
which is a big no-no
MasterDuke: would you like to test my hypothesis by removing the "mark thread blocked"/"unblocked" from spesh's optimize? 13:10
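The hypothesis above can be modelled in a few lines. This is a deliberately simplified sketch, not MoarVM's GC orchestration (which, per the discussion, also involves another thread stealing the blocked thread's work); `ThreadState`, `Thread`, `gc_may_start` and `gc_would_run_inside_optimizer` are hypothetical names. The assumption is that a collector waits only for threads that are still running, so a thread that marks itself blocked no longer holds the collector back.

```c
#include <stddef.h>

typedef enum { RUNNING, AT_SAFEPOINT, BLOCKED } ThreadState;

typedef struct {
    ThreadState state;
    int in_optimizer;   /* 1 while the thread holds GC-unsafe optimizer data */
} Thread;

/* In this model the collector may start once no thread is still RUNNING. */
int gc_may_start(const Thread *threads, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (threads[i].state == RUNNING)
            return 0;
    return 1;
}

/* Returns 1 if starting GC now would collect under a thread mid-optimization. */
int gc_would_run_inside_optimizer(const Thread *threads, size_t n) {
    if (!gc_may_start(threads, n))
        return 0;
    for (size_t i = 0; i < n; i++)
        if (threads[i].in_optimizer)
            return 1;
    return 0;
}
```

In this model, a thread that is mid-optimization and still counted as running keeps the collector waiting, while the same thread marked blocked lets the collection begin underneath it.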
Geth_ MoarVM/master: 5 commits pushed by (Stefan Seifert)++ 13:25
13:35 Altai-man_ joined 13:38 sena_kun left 13:39 travis-ci joined
travis-ci MoarVM build failed. Stefan Seifert 'Add a method for clearing a cached index on a MAST::Frame 13:39
travis-ci.org/MoarVM/MoarVM/builds/695662033 github.com/MoarVM/MoarVM/compare/5...7ee4e8f3b6
13:39 travis-ci left
MasterDuke timotimo: i tried that. removing the *_(un)block() calls changed the panic from `Must not GC when in the specializer/JIT` to `Invalid owner in item added to GC worklist` 14:20
and assuming it's not a problem to MVMROOT something that doesn't need it, rooting all of syms, hll_name, hash, and entry didn't change anything 14:23
dogbert11 Hmm, are things a bit buggy atm? I get MoarVM panics all the time. 15:05
it seems to fail in two places, either in 'run_gc' or in 'MVM_spesh_graph_mark' 15:08
MasterDuke dogbert11: yeah, looks like something was wrong in my PR i merged yesterday
dogbert11 ok, any clues? 15:09
MasterDuke been a bit of chat about it today with nine and timotimo
dogbert11 cool
MasterDuke well, i was calling MVM_gc_mark_thread_(un)blocked(). that's wrong. but removing those still panics and that we haven't figured out 15:10
calling those during spesh optimize
nine Any luck with rr? 15:11
The (un)block calls are definitely wrong
MasterDuke just starting building stuff on your server now
dogbert11 is it this one? MVM_gc_mark_thread_blocked (tc=0x699790) at src/gc/orchestrate.c:314 15:12
MasterDuke nope, src/spesh/optimize.c:1328
dogbert11 ok, I'll try to remove them as well 15:13
MasterDuke and the associated unblocked at the end of the function
nine dogbert11: just revert 9b60b37ca9cba24f027183d666343267efa1a172 locally
linkable6 (2020-05-03) github.com/MoarVM/MoarVM/commit/9b60b37ca9 Spesh and JIT get(cur)?hllsym
MasterDuke i wonder why these problems didn't show up while i was initially developing the pr 15:14
dogbert11 ok, removing the (un)block seems to have made one of the errors disappear, now I get the 'MVM_spesh_graph_mark' one all the time 15:17
15:19 lucasb joined
MasterDuke starting an `rr record` now 15:29
15:36 sena_kun joined 15:38 Altai-man_ left
MasterDuke apparently i haven't used rr in a while 15:39
i don't seem to have any symbols 15:41
i built moarvm with --debug=3, but i'm getting nothing in my `rr replay` 15:51
afk for a bit 15:58
16:54 mst joined, ChanServ sets mode: +o mst
MasterDuke nine: have you used rr on that machine before? 17:21
17:35 Altai-man_ joined 17:38 sena_kun left 17:52 zakharyas joined
nine no, just installed it 18:08
MasterDuke: where did you install rakudo to? 18:09
MasterDuke nine: i ran `perl Configure.pl --prefix=/home/masterduke/raku/ --debug=3 --telemeh && make -j12 install` in a moarvm checkout (after some source edits) and `perl Configure.pl --no-silent-build --prefix=/home/masterduke/raku/ --backends=moar && make -j12 install` in an nqp checkout 18:12
and then `rr record '/home/masterduke/raku/bin/moar' --libpath=src/vm/moar/stage0 src/vm/moar/stage0/nqp.moarvm --bootstrap --module-path=gen/moar/stage1 --no-regex-lib --target=mbc --setting=NULL --stable-sc=stage1 --output=gen/moar/stage1/NQPCORE.setting.moarvm gen/moar/stage1/NQPCORE.setting`
[Coke] tele... meh.
MasterDuke `rr replay` loads the recording, but a backtrace just shows addresses 18:13
nine MasterDuke: I just started rr replay as your user from your home directory and when I interrupt the program I get very nice backtraces 18:14
gist.github.com/niner/6184e97f889b...d903133c8a
MasterDuke huh. continuing does seem to work. but running didn't... 18:16
nine But that's how you use rr :)
It is a recorded program, so you just need to continue
MasterDuke well, their docs do say you can restart it though 18:17
fwiw, this is what i got gist.github.com/MasterDuke17/328a0...1b563c8e82
but yeah, continuing seems to be making progress 18:18
MasterDuke did look at the rr docs because /me hadn't used it in a while, but somehow still went wrong 18:21
huh. now i got a sigkill, not a break at MVM_panic 18:23
gist.github.com/MasterDuke17/f3c33...cc54a936d9 is the backtrace i get after the sigkill 18:51
19:15 mst left 19:19 mst joined, ChanServ sets mode: +o mst 19:36 sena_kun joined 19:38 Altai-man_ left
nine MasterDuke: is the sigkill reproducible? Have you tried a different recording? 19:55
20:07 Kaiepi left, Kaiepi joined 20:16 leont_ joined
MasterDuke nine: in rr, yes. been very afk, but just started another recording to see if that'll be the same 20:17
20:21 leont left, zakharyas left 20:28 zakharyas joined
MasterDuke the new recording does the same thing (sigkill) in replay 20:30
20:47 zakharyas left
timotimo MasterDuke: i bet you're just seeing the abort from the wrong thread 21:13
because the backtrace is just "spesh worker waiting for work to be submitted from another thread"
21:17 lucasb left 21:35 Altai-man_ joined 21:38 sena_kun left 21:53 rypervenche left 21:54 rypervenche joined, MasterDuke left, rypervenche left, MasterDuke joined 21:55 vrurg left, [Coke] left, vrurg joined, [Coke] joined, AlexDaniel` left 21:56 AlexDaniel` joined, rypervenche joined
MasterDuke hm. info threads only shows one 21:57
21:58 nebuchadnezzar left, TimToady left, colomon_ left, nine left 21:59 nebuchadnezzar joined, TimToady joined, colomon_ joined, nine joined
MasterDuke doh, i should restart it 21:59
22:00 AlexDaniel` left 22:10 AlexDaniel left 22:13 AlexDaniel` joined
timotimo why restart instead of going backwards? 22:19
hm, well, there are situations where it won't let you perhaps? 22:21
MasterDuke well, i don't know what i'm looking for yet 22:27
and why isn't my breakpoint in MVM_panic hitting?
22:28 AlexDaniel joined, AlexDaniel left, AlexDaniel joined 23:35 leont_ left 23:36 sena_kun joined 23:37 Altai-man_ left