MasterDuke is missing an MVMROOT? if i'm rr'ing correctly (somewhat doubtful to be honest) it seems to indicate this is where the location of an sc_ix gets changed from 0 to 4294967295 09:58
tellable6 2020-11-17T03:50:57Z #raku-dev <melezhik> MasterDuke: ^^
MasterDuke re nine's comment here i did `watch -l watch -l worklist->list[worklist->items]` at the point of the segv, but all i get is `Old value = <unreadable>; New value = (MVMCollectable **) 0x0` when i reverse-cont (or continue from start) 10:15
did i do that watchpoint correctly? 10:17
(copy pasta, i only had one `watch -l` when i actually entered it in rr) 10:18
nine MasterDuke: worklist->list[worklist->items] is the next free slot for adding to the worklist 10:34
MasterDuke: you actually want to watch -l gen2roots[i] 10:35
MasterDuke where's the i from? 10:43
tc->gen2roots[sc_idx]? 10:45
nine the MVM_gc_root_add_gen2s_to_worklist frame 10:46
MasterDuke ah, thanks 10:47
ok, i added that, reverse-cont, it hits 10:50
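[note] A sketch of the rr session the messages above describe, pieced together from the log. How many times `up` is needed is an assumption, since it depends on the backtrace at the crash; `gen2roots` and `i` are the locals nine refers to in the MVM_gc_root_add_gen2s_to_worklist frame.

```text
(rr) continue                  # run forward until the SIGSEGV is reported
(rr) backtrace                 # find the MVM_gc_root_add_gen2s_to_worklist frame
(rr) up                        # repeat until that frame is selected
(rr) watch -l gen2roots[i]     # hardware watchpoint on the slot's address
(rr) reverse-continue          # run backwards to the last write to that slot
```

`watch -l` watches the memory location the expression currently evaluates to, so the watchpoint stays valid after leaving the frame where `i` is in scope.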
nine Is the object still intact at that point? 10:51
MasterDuke we're at `tc->gen2roots[tc->num_gen2roots] = c;` in MVM_gc_root_gen2_add 10:53
guess i'm interested in the `c`?
sc = {sc_idx = 4048871824, idx = 21871} 10:55
nine So, already broken when it gets added to gen2roots. Where are we at that point? 10:56
MasterDuke set_obj_at_offset in src/6model/reprs/P6opaque.c:21; bind_attribute in src/6model/reprs/P6opaque.c:389;
nine where does the value come from? 10:57
MVM_dump_bytecode(tc) comes in handy at that point
timotimo i recommend routinely asking gdb for "event"
i think that's how you spell that command at least 10:58
gives you a number you can use to return to this moment in "time"
nine isn't that checkpoint?
MasterDuke what i have so far 10:59
nine So you're in line 77 and the broken object is loc_8_obj, i.e. tc->cur_frame->work[8].o 11:01
and it's the result of getlex in 70
That's odd, since that's simply the %spec argument to trait_mod:<is>(Routine:D $r, :prec(%spec)!) 11:03
So I'd figure that it's not a missing MVMROOT of that object, but something else overwriting it. Thus a watch -l c-> and reverse-cont 11:04
MasterDuke old: 23; new: 30 11:05
nine Oh boy, I'm spending too much time on things like that. I didn't even have to look anywhere to type that command...
wait, 23? that's a sane number
MasterDuke in rebless
should i go forward again to the segv and then reverse? 11:07
nine I'd do that, yes. It's always better to be too careful and confirm results than to follow the wrong lead 11:10
MasterDuke old: 4045642928; new: 31 11:11
nine that's the spot!
MasterDuke deserialize
nine huh 11:12
MasterDuke gist updated
nine That's wrong: p (MVMObject)c 11:19
Should read p *(MVMObject*)c
MasterDuke $8 = {header = {sc_forward_u = {forwarder = 0x1600000017, sc = {sc_idx = 23, idx = 22}, st = 0x1600000017}, owner = 1, flags1 = 0 '\000', flags2 = 2 '\002', size = 144}, st = 0x556feffaa300} 11:39
nine so that's actually ok there 11:40
if you still got that watch -l c-> active, you can just cont to see where it gets overwritten
MasterDuke from when the gen2roots[i] hits? 11:41
nine yes 11:42
MasterDuke just to be sure i'm doing the steps right, i continued to the segv, reverse-continued to when the watch -l gen2roots[i] hit, then continued 11:44
the first watch -l c-> gives old: 23; new 31 11:45
next gives old:31; new: 4045642928
nine Sounds good. Since we know that the object is ok when it gets added to gen2roots, you can also just watch -l c-> when it segfaults and reverse-cont
ok the old:31; new: 4045642928 is where it gets interesting
where is that?
MasterDuke gist updated, rr3.log 11:46
fyi, this has been run with MVM_SPESH_DISABLE=1 (segvs with or without, i just thought it'd make things simpler) 11:49
nine oh, good
is tc->allocate_in_gen2 set? 11:50
MasterDuke $9 = 1
nine So, the collectable that gets overwritten is at 0x556ff154e190 11:54
The P6Opaque that's deserialized is at address 0x556ff154e118
And we're writing 0x78 bytes in, i.e. at 0x556FF154E190 11:55
That seems to suggest that the c pointer is in fact outdated and points at a freed or moved object. 11:56
Btw. have you tried with GC_DEBUG enabled? 11:57
Since it's clearly a GC issue that might help and at least explode earlier
MasterDuke haven't
nine Pro tip to rr users: clean your ~/.local/share/rr from time to time... 11:59
MasterDuke `MoarVM panic: SC index out of range`, no surprise... 12:01
with GC_DEBUG=2
seems to be the same thing. sc_idx is fine when the gen2roots[i] hits, crazy new value is in deserialize 12:08
just dies at the panic instead of a segv 12:09
nine Since the issue is that we operate with an outdated pointer to an object, GCing more often with GC_DEBUG=3 may help shake out the actual place where we miss the MVMROOT 12:14
MasterDuke so slow... 12:15
nine I know... OTOH it can run unattended
MasterDuke another panic, sc_idx = 3532567984. but MVM_gc_root_add_gen2s_to_worklist is not in the backtrace 12:32
but we do seem to be much earlier in the runtime. we're in gen/moar/stage2/QRegex.nqp:868 12:38
MasterDuke still, same crazy sc_idx set in deserialize 13:14
nine so you'll have to trace c back to where it comes from 13:15
MasterDuke item right? cause we're in 13:24
nine yeah, the broken object 13:25
MasterDuke is correct? here value is rooted before allocate() is called 13:32
timotimo where is the root? 13:33
oh, it gets put into the register
nine MasterDuke: the comment above it explains
MasterDuke well yeah, but the comment is repeated in clone 13:34
nine and both are correct 13:35
MasterDuke ah. value is used later in clone. though the comment has a type, there's no 'obj' in clone 13:37
*typo 13:38
ugh, i did `watch -l item`, but that's just hitting for every call of process_worklist 13:45
nine where do you want it to break? 13:48
MasterDuke well, i guess that could be fine. i'm just not 100% sure which call to process_worklist i care about 13:50
the first one before the sc_idx goes crazy? 13:51
nine I'd say you want to know where it entered the worklist
You already know where and how sc_idx gets overwritten. That didn't help because it's obviously not the writer's fault. It's the fault of whoever holds a pointer to an object and lets it get outdated 13:52
timotimo i wonder what we must do to make perf report able to annotate jitted frames 15:16
with the perf map it can tell us which piece in the output belongs to what perl6-level frame 15:17
MasterDuke hm, where was i 20:01
looks like i did `watch -l *item` for some reason
brrt \o 20:11
nwc10 o/ 20:12
