github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
MasterDuke hm. has anyone pulled after my merge and run some spectests? i'm getting a bunch of random fails. different test file every time, they all run fine by themselves 07:53
haven't been able to catch anything in gdb, valgrind, etc 07:54
i've seen this before, so i don't think it's 100% caused by my PR. but maybe it exacerbated things? 08:04
oh. just got 'MoarVM panic: Must not GC when in the specializer/JIT' when building rakudo 08:05
ah. i have MVM_GC_DEBUG set to 1 08:08
would explain why building rakudo is so much slower 08:09
hm. the problems are hard to repro, and don't consistently happen in the same place. is there a way for MVM_panic to pause so i can attach gdb? or have it just start gdb for me? 08:20
is it not safe to do MVM_repr_at_key_o or MVM_HASH_GET in src/spesh/optimize.c? do they need to be rooted or something like that? 08:27
github.com/MoarVM/MoarVM/blob/mast...ze.c#L1328 is where it's MVM_panic'ing 08:28
hm. there aren't any other MVM_gc_mark_thread_blocked() calls in src/spesh/optimize.c... 08:29
MasterDuke well, if i just remove it i get `MoarVM panic: Invalid owner in item added to GC worklist` instead (with MVM_GC_DEBUG set to 3 i'm now getting these when building nqp, and much more reliably) 08:43
nine And if you then add appropriate rooting? 09:01
MasterDuke working on that now 09:02
nine I seemed to remember something about spesh not being allowed to GC. But then I tried to check this and found a couple of places in spesh that did, so I thought I misremembered
MasterDuke not sure exactly what needs to be rooted. entry and hash? 09:03
hm. MVM_gc_mark_thread_(un)blocked removed and syms, hll_name, hash, entry all rooted, but same panic about invalid owner 09:05
nine I don't think MVM_repr_at_key_o or MVM_HASH_GET allocate, so there shouldn't be any GC there. Also that would throw the same "Must not GC when in the specializer" error anyway 09:09
Also entry is not an MVMCollectable, it won't move 09:10
MasterDuke then the only other call is to uv_mutex_lock ? 09:11
nine Sorry, no idea what would fail here :/ Guess it's a job for rr
MasterDuke ugh. guess i have to switch to the slow laptop then 09:12
timotimo there are spots where spesh is allowed to GC, it's only in between places when all our datastructures that may be pointing at stuff have been freed already 12:52
i wonder if some at_key_o calls will box 12:53
we don't have native-typed hashes, but maybe contexts or something like that have at_key ops for "native" things
MVMContext's at_key is indeed a bit more complicated 13:05
it uses the frame walker, gets the lexical via MVM_spesh_frame_walker_get_lex, which can vivify lexicals 13:06
which is an allocating op
lizmat is that the one that dies if the key cannot be found?
timotimo nah, that one returns 0 when nothing was found 13:07
lizmat there's one REPR that does that, and it is very irritating :-)
timotimo umm, i think we shouldn't mark the spesh thread blocked actually 13:08
that would allow GC to kick in, have another thread steal this thread's work, and then gc would run inside of spesh 13:09
specifically, inside spesh's optimize phase
which is a big no-no
MasterDuke: would you like to test my hypothesis by removing the "mark thread blocked"/"unblocked" from spesh's optimize? 13:10
Geth_ MoarVM/master: 5 commits pushed by (Stefan Seifert)++ 13:25
travis-ci MoarVM build failed. Stefan Seifert 'Add a method for clearing a cached index on a MAST::Frame 13:39
travis-ci.org/MoarVM/MoarVM/builds/695662033 github.com/MoarVM/MoarVM/compare/5...7ee4e8f3b6
MasterDuke timotimo: i tried that. removing the *_(un)block() calls changed the panic from `Must not GC when in the specializer/JIT` to `Invalid owner in item added to GC worklist` 14:20
and assuming it's not a problem to MVMROOT something that doesn't need it, rooting all of syms, hll_name, hash, and entry didn't change anything 14:23
dogbert11 Hmm, are things a bit buggy atm? I get MoarVM panics all the time. 15:05
it seems to fail in two places, either in 'run_gc' or in 'MVM_spesh_graph_mark' 15:08
MasterDuke dogbert11: yeah, looks like something was wrong in my PR i merged yesterday
dogbert11 ok, any clues? 15:09
MasterDuke been a bit of chat about it today with nine and timotimo
dogbert11 cool
MasterDuke well, i was calling MVM_gc_mark_thread_(un)blocked(). that's wrong. but removing those still panics and that we haven't figured out 15:10
calling those during spesh optimize
nine Any luck with rr? 15:11
The (un)block calls are definitely wrong
MasterDuke just starting building stuff on your server now
dogbert11 is it this one? MVM_gc_mark_thread_blocked (tc=0x699790) at src/gc/orchestrate.c:314 15:12
MasterDuke nope, src/spesh/optimize.c:1328
dogbert11 ok, I'll try to remove them as well 15:13
MasterDuke and the associated unblocked at the end of the function
nine dogbert11: just revert 9b60b37ca9cba24f027183d666343267efa1a172 locally
linkable6 (2020-05-03) github.com/MoarVM/MoarVM/commit/9b60b37ca9 Spesh and JIT get(cur)?hllsym
MasterDuke i wonder why these problems didn't show up while i was initially developing the pr 15:14
dogbert11 ok, removing the (un)block seems to have made one of the errors disappear, now I get the 'MVM_spesh_graph_mark' one all the time 15:17
MasterDuke starting an `rr record` now 15:29
MasterDuke apparently i haven't used rr in a while 15:39
i don't seem to have any symbols 15:41
i built moarvm with --debug=3, but i'm getting nothing in my `rr replay` 15:51
afk for a bit 15:58
MasterDuke nine: have you used rr on that machine before? 17:21
nine no, just installed it 18:08
MasterDuke: where did you install rakudo to? 18:09
MasterDuke nine: i ran `perl Configure.pl --prefix=/home/masterduke/raku/ --debug=3 --telemeh && make -j12 install` in a moarvm checkout (after some source edits) and `perl Configure.pl --no-silent-build --prefix=/home/masterduke/raku/ --backends=moar && make -j12 install` in an nqp checkout 18:12
and then `rr record '/home/masterduke/raku/bin/moar' --libpath=src/vm/moar/stage0 src/vm/moar/stage0/nqp.moarvm --bootstrap --module-path=gen/moar/stage1 --no-regex-lib --target=mbc --setting=NULL --stable-sc=stage1 --output=gen/moar/stage1/NQPCORE.setting.moarvm gen/moar/stage1/NQPCORE.setting`
[Coke] tele... meh.
MasterDuke `rr replay` loads the recording, but a backtrace just shows addresses 18:13
nine MasterDuke: I just started rr replay as your user from your home directory and when I interrupt the program I get very nice backtraces 18:14
gist.github.com/niner/6184e97f889b...d903133c8a
MasterDuke huh. continuing does seem to work. but running didn't... 18:16
nine But that's how you use rr :)
It is a recorded program, so you just need to continue
MasterDuke well, their docs do say you can restart it though 18:17
fwiw, this is what i got gist.github.com/MasterDuke17/328a0...1b563c8e82
but yeah, continuing seems to be making progress 18:18
MasterDuke did look at the rr docs because /me hadn't used it in a while, but somehow still went wrong 18:21
huh. now i got a sigkill, not a break at MVM_panic 18:23
gist.github.com/MasterDuke17/f3c33...cc54a936d9 is the backtrace i get after the sigkill 18:51
nine MasterDuke: is the sigkill reproducible? Have you tried a different recording? 19:55
MasterDuke nine: in rr, yes. been very afk, but just started another recording to see if that'll be the same 20:17
MasterDuke the new recording does the same thing (sigkill) in replay 20:30
timotimo MasterDuke: i bet you're just seeing the abort from the wrong thread 21:13
because the backtrace is just "spesh worker waiting for work to be submitted from another thread"
MasterDuke hm. info threads only shows one 21:57
MasterDuke doh, i should restart it 21:59
timotimo why restart instead of going backwards? 22:19
hm, well, there are situations where it won't let you perhaps? 22:21
MasterDuke well, i don't know what i'm looking for yet 22:27
and why isn't my breakpoint in MVM_panic hitting?