github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
00:58 ggoebel_ joined 01:35 Altai-man_ joined 01:36 sena_kun left 01:42 MasterDuke left 01:56 ggoebel_ left, ggoebel_ joined 02:31 ggoebel_ left 02:38 ggoebel_ joined
nine Gooood morning, #moarvm! 05:57
(no, I'm absolutely not a morning person; that's why I have to fake it so badly)
07:48 patrickb joined
nine One of these segfaults during the build: dev.azure.com/MoarVM/MoarVM/_build...247c82120e 08:04
El_Che nine: I am 99% sure I can now keep the cores on my setup. Now I'm building in a loop to get a segfault :) 08:12
nine El_Che: excellent!
El_Che it was not obvious, to say the least :) 08:13
Schrödinger's segfaults: when you are not looking for them they happen all the time; when you are, everything keeps building fine 08:26
nine That's exactly the problem....
I've never had one of these segfaults when building locally on my machine. And I do build now and then 08:27
El_Che I think it's related to the limited resources of cloud instances 08:46
08:58 zakharyas joined 09:07 MasterDuke joined
MasterDuke just merged github.com/MoarVM/MoarVM/pull/1428 09:11
09:16 domidumont joined 09:50 MasterDuke left 09:54 MasterDuke joined 09:56 El_Che left
nine [ 207s] make: *** [Makefile:505: gen/moar/stage2/QASTNode.moarvm] Segmentation fault 09:57
They really are somewhat common - just not on any developer's machine :/
09:58 El_Che joined
MasterDuke nine: you know, i'm not sure what repossession actually is/does? 10:07
nine repossession happens when your module "Foo::Bar" loads another module (e.g. "Foo") and gets some object from that module: maybe a class (i.e. a type object), most often a stash (i.e. the "Foo" stash, the namespace where Foo::Bar gets stored), but really anything. If you modify this object during compilation of your module (e.g. by adding "Foo::Bar" to the "Foo" stash), the modified version of the object gets 10:08
written into the precompilation file of your module. After all, you'd expect the modification to survive precompilation and subsequent loading of your module. I.e. you expect that when you load "Foo::Bar", the "Foo" namespace still contains a Bar. In other words, the Foo::Bar module takes possession of the Foo stash object.
Inside MoarVM this is done by taking the existing Foo stash object (created by loading Foo), clearing it (overwriting it with 0s) and then reinitializing it with the data from the Foo::Bar precompilation file.
Luckily I've posted that explanation to our team channel earlier :)
MasterDuke ah, that makes sense. ha
nine And I know, my sentence structure is horrible 10:09
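To make the mechanics concrete, here is a rough C sketch of the in-place repossession idea nine describes above. All names are hypothetical, not MoarVM's actual API; the point is only that the object keeps its address, so every existing reference to it automatically sees the new contents.

```c
#include <string.h>

/* Hypothetical stand-in for a stash object; not MoarVM's layout. */
typedef struct {
    int   num_entries;
    void *entries;      /* namespace contents */
} Stash;

/* Stub: populate a stash from a module's precompilation data. */
static void deserialize_stash(Stash *target, const unsigned char *blob) {
    (void)blob;
    target->num_entries = 2;  /* e.g. Foo's own entries plus Bar */
}

/* Repossession: clear the live object, then reinitialize it from the
 * Foo::Bar precompilation file. Because the address never changes,
 * no other pointer to the stash needs to be fixed up. */
static void repossess(Stash *existing, const unsigned char *precomp_blob) {
    memset(existing, 0, sizeof *existing);
    deserialize_stash(existing, precomp_blob);
}
```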
jnthn morning o/ 10:26
nwc10 \o 10:28
11:10 linkable6 left, evalable6 left 11:12 evalable6 joined 11:13 linkable6 joined 12:18 zakharyas left, tobs` joined 12:19 tobs` is now known as tobs 13:15 MasterDuke left 13:19 MasterDuke joined 13:38 MasterDuke left, MasterDuke joined 13:46 zakharyas joined
MasterDuke hm, when adding a new spesh candidate, it's MVM_ASSIGN_REFed: `MVM_ASSIGN_REF(tc, &(spesh->common.header), new_candidate_list[spesh->body.num_spesh_candidates], candidate);`. but when removing one, the others are just memcpy'd into a new chunk of memory. is that sufficient? or do they need to have done to them what their initial MVM_ASSIGN_REF did? 14:53
jnthn Should suffice because the collectable that is pointing to them was already pointing to them 14:57
When we add the other ones we also just memcpy the existing ones 14:58
MasterDuke doh, right!
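For context, a minimal sketch of the kind of generational write barrier MVM_ASSIGN_REF implies (hypothetical names, not MoarVM's actual code): the barrier records when an old-generation object gains a pointer to a nursery object, so the GC can still find that nursery object. A plain memcpy of candidate pointers the holder already references creates no new edge, which is jnthn's point.

```c
/* Hypothetical collectable; only the generation flag matters here. */
typedef struct Collectable {
    int in_old_gen;
    /* ... payload ... */
} Collectable;

/* Stub: record holder in the GC's remembered set. */
static void add_to_remembered_set(Collectable *holder) { (void)holder; }

/* Barriered assignment: needed when holder may start pointing at a
 * nursery object it did not previously reference. */
static void assign_ref(Collectable *holder, Collectable **slot,
                       Collectable *target) {
    if (holder->in_old_gen && target && !target->in_old_gen)
        add_to_remembered_set(holder);
    *slot = target;
}
```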
thought i'd found something when i realized github.com/MoarVM/MoarVM/pull/1426...3230cbR465 wasn't necessary (and i thought maybe dangerous because it's going through the sf again instead of using 15:03
github.com/MoarVM/MoarVM/pull/1426...3230cbR411 ), but no change, still `MoarVM panic: Zeroed owner in item added to GC worklist`
no change after removing the MVM_spesh_arg_guard_discard
jnthn What keeps it from hitting that error? 15:04
MasterDuke not sure what you mean?
jnthn If MVM_spesh_candidate_discard_one is completely empty, does it still fail? 15:05
MasterDuke well, commenting out the call to MVM_spesh_candidate_discard_one makes it succeed. commenting out just the cleanup at the end github.com/MoarVM/MoarVM/pull/1426...bR470-R474 and it still fails 15:06
hm. guess i can try just copying the existing stuff to new memory without actually removing anything and see what that does 15:07
jnthn Yeah, I was gonna say, maybe go through it uncommenting things 15:10
The spesh->body.num_spesh_candidates-- happens a bit earlier than I'd maybe expect, but I don't think it's likely to be harmful
MasterDuke i had wondered about that, but moving it later didn't change anything
jnthn Did you carefully check that there's never multiple reads of cands_and_arg_guards? 15:12
MasterDuke i'm gonna say no, because i'm not sure how i'd do that
jnthn Well, I was thinking the update in frames.c, but it looks correct 15:13
MasterDuke oh. just copying everything instead of copying all except the one to remove succeeded just fine (for the currently failing file, now let me try the full nqp build) 15:17
i do still clean up the old stuff, so it's something about only partially copying 15:19
yep, nqp builds
so this if/else is where the problem is github.com/MoarVM/MoarVM/pull/1426...bR437-R468 15:20
jnthn Hmmm
MasterDuke of course my first thought was an off-by-one in the copying logic of the if branch, but not only did i double- and triple-check it, the panic would also sometimes happen when the else branch was taken 15:22
(though i certainly still welcome an outside review of that copying logic)
jnthn I've just stared at it for the third time; I don't think there's an off-by-one 15:24
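For reference, the general shape of the "copy all except index i" step under discussion, as a generic sketch rather than the PR's actual code:

```c
#include <string.h>

/* Copy n pointers from src to dst, skipping the element at index i
 * (0 <= i < n): first the prefix, then the suffix shifted down by
 * one slot. The classic off-by-one lurks in the second length. */
static void copy_all_but(void **dst, void **src, size_t n, size_t i) {
    memcpy(dst, src, i * sizeof *src);
    memcpy(dst + i, src + i + 1, (n - i - 1) * sizeof *src);
}
```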
15:25 patrickb84 joined 15:28 patrickb left
MasterDuke i assume there must be something still holding a reference to the non-copied candidate 16:39
17:06 squashable6 left 17:07 squashable6 joined 17:25 patrickb84 left 17:29 MasterDuke left
dogbert2 .seen nine 18:16
tellable6 dogbert2, I saw nine 2021-02-04T12:02:38Z in #raku-dev: <nine> lizmat: it doesn't say any more than that line I posted :/
dogbert2 nine: could this data race have anything to do with yesterday's SEGV? gist.github.com/dogbert17/f0d40faf...279764e1cf 18:17
I guess not, but it doesn't hurt to ask :-) the line is close to the SEGV line (6360) 18:18
18:19 domidumont left
nine dogbert2: err...which of the segfaults? 18:46
dogbert2 m: my $n := 1; ($n + 1 for ^17000) xx 20 18:51
evalable6 (signal SIGSEGV) WARNINGS for /tmp/x7hgnyHmzK:
dogbert2 nine: ^^^
nine ah, that one :) 18:52
dogbert2: don't think so. And jnthn seems to know already where that segfault comes from 18:58
dogbert2 ah well, it was worth a shot. Thx for looking into it. 19:00
nine dogbert2: it's also hard to see where that data race would come from. Spesh's codegen writes into a freshly allocated buffer that's not visible to any outside code until codegen is done.
If there's a data race there, wouldn't that imply that we're using a buffer that's still in use elsewhere? I.e. someone's still processing free'd bytecode 19:01
Which....we don't have until MasterDuke's branch is merged :)
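A small sketch of the publish-after-initialization pattern nine is describing (hypothetical names, not MoarVM's actual code): the buffer becomes visible to other threads only through a single release store after codegen finishes, so there is no window in which another thread could observe partial writes.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    _Atomic(unsigned char *) bytecode;
} Candidate;

static void finish_codegen(Candidate *cand, size_t len) {
    unsigned char *buf = malloc(len);
    if (!buf)
        return;
    memset(buf, 0, len);  /* stand-in for the real codegen writes */
    /* No other thread can see buf until this release store. */
    atomic_store_explicit(&cand->bytecode, buf, memory_order_release);
}
```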
OTOH I've just finished my strength training and brane may not yet get its usual share of oxygen 19:03
dogbert2 I guess we'll have to wait a few minutes to see if you change your mind :) 19:04
lizmat knows that feeling after cycling 42km at 23km/hour average
(without any electric support)
19:13 zakharyas left
nine No, it doesn't mean someone's using free'd bytecode, it means someone is using a free'd MVMCode object. If line 6350 is indeed MVMFrame *f = ((MVMCode *)GET_REG(cur_op, 6).o)->body.outer; 19:25
dogbert2 I believe that's the line in question. Hmm, don't we have a macro to check whether we're indeed using a freed object? 19:31
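One common shape for such a check, as a hedged sketch rather than anything MoarVM actually ships: poison objects when they are freed and assert on a magic value at every use, so a use-after-free trips immediately instead of reading stale data.

```c
#include <assert.h>
#include <string.h>

#define LIVE_MAGIC 0xC0DEC0DEu

typedef struct {
    unsigned magic;   /* holds LIVE_MAGIC while the object is alive */
    /* ... payload ... */
} Obj;

/* Scribble over the object so any later access trips ASSERT_LIVE. */
static void poison_on_free(Obj *o) {
    memset(o, 0xDD, sizeof *o);
}

#define ASSERT_LIVE(o) assert((o)->magic == LIVE_MAGIC)
```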
19:41 MasterDuke joined
MasterDuke dogbert2: do you get the same output if you roll moarvm back to just before the PR i merged yesterday? 19:42
dogbert2 MasterDuke: yes 20:28
MasterDuke cool, good to know i didn't introduce a race 20:30
jnthn grmbl, my reward for refactoring the Rakudo method dispatchers to be ready to do method deferral is sigsegv 22:02
MasterDuke sadly, those have not been in short supply recently 22:08
jnthn gosh, these dispatchers sure pile up a capture tree... 22:09
For general entertainment... gist.github.com/jnthn/aeca91e75f20...92a97707c7 22:11
El_Che "Siri, describe geek humor in one gist" 22:12
jnthn huh, this gets stranger; it should be grabbing the attribute out of value 3, not of 2 22:19
El_Che my joke of the day: I am trying to get the moarvm build to segfault all day in my devbuild github workflow (restarting it in a loop): nothing 22:23
on my regular package workflow where I am testing repo integrations but not the core files: 2 segfaults in a row
fml
jnthn Duh, it was silly. 22:32
El_Che: "fun" 22:33
MasterDuke El_Che: could be the optimization level. maybe try --debug, but not --optimize=0 22:35
El_Che MasterDuke: I have seen the devbuild break before, just not once the core setup was ready :) 22:36
I'll run it without the optimize
MasterDuke ah, heh
El_Che Altai-man_ informed me that bintray is closing so I will be moving the repos to cloudsmith (if everything works as expected there) 22:38
so far support has been fast (my gpg key couldn't be added to the setup, now ok) 22:39
MasterDuke: it looks like the default optimization level is a source of segfaults 22:44
(probably triggered on low-resource VMs/containers more than elsewhere)
MasterDuke huh 22:45
El_Che I have seen way more segfaults in the regular builds compared to the debug ones
I am building on 24 distro/version combinations, so the numbers are there to get a segfault once in a while 22:46
it seems not related to specific distros or releases
just random
rerun the job and the build is ok
MasterDuke do you get a core with the regular builds? 22:48
22:50 lizmat left
jnthn Enough for me today, but all the preparatory work is ready for me to have a go at callsame/nextsame with methods using the new dispatcher 22:53
MasterDuke nice. how close would that put you to the end? 22:54
El_Che if you see the white light, stop 22:55
jnthn Still quite a few bits to do, but they're extensions of what is already in place; going from no mechanism to do dispatch resumption to having the heart of one is a significant step forward. 22:58
I still need to complete the bits for multiple active resumable bits of a dispatch (wrap in multi in method...), figure out how exactly to factor nextwith and callwith (that want to change the arguments), and see if my plan for multis with `where` clauses or similar works out. 23:00
After that, it'll be a case of migrating everything that hasn't already adopted the new dispatch mechanism to do so, making sure I can toss the current multi-dispatch cache, invocation spec and method caching mechanisms out, and teaching spesh a bunch more things about the dispatcher. 23:03
The final step being for performance, while the others are about correctness 23:04
Well, to the degree the whole thing isn't about performance anyway :P
MasterDuke sounds like a walk in the park 23:08
jnthn Hurts my brain less than EA, at least :) 23:11
MasterDuke ha 23:14