github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018. |
|||
00:58
ggoebel_ joined
01:35
Altai-man_ joined
01:36
sena_kun left
01:42
MasterDuke left
01:56
ggoebel_ left,
ggoebel_ joined
02:31
ggoebel_ left
02:38
ggoebel_ joined
|
|||
nine | Gooood morning, #moarvm! | 05:57 | |
(no, I'm absolutely no morning person, that's why I have to fake it so badly) | |||
07:48
patrickb joined
|
|||
nine | One of these segfaults during the build: dev.azure.com/MoarVM/MoarVM/_build...247c82120e | 08:04 | |
El_Che | nine: I am 99% sure I can now keep the cores on my setup. Now loop building to get a segfault :) | 08:12 | |
nine | El_Che: excellent! | ||
El_Che | it was not obvious, to say the least :) | 08:13 | |
Schrödinger's segfaults: when you are not looking for them they happen all the time, when you are everything keeps building fine | 08:26 | ||
nine | That's exactly the problem.... | ||
I've never had one of these segfaults when building locally on my machine. And I do build now and then | 08:27 | ||
El_Che | I think it's related to the limited resources of cloud instances | 08:46 | |
08:58
zakharyas joined
09:07
MasterDuke joined
|
|||
MasterDuke | just merged github.com/MoarVM/MoarVM/pull/1428 | 09:11 | |
09:16
domidumont joined
09:50
MasterDuke left
09:54
MasterDuke joined
09:56
El_Che left
|
|||
nine | [ 207s] make: *** [Makefile:505: gen/moar/stage2/QASTNode.moarvm] Segmentation fault | 09:57 | |
They really are somewhat common - just not on any developer's machine :/ | |||
09:58
El_Che joined
|
|||
MasterDuke | nine: you know, i'm not sure what repossession actually is/does? | 10:07 | |
nine | repossession happens when your module "Foo::Bar" loads another module (e.g. "Foo") and gets some object from that module, maybe a class (i.e. type object) or most often a stash (i.e. the "Foo" stash - the namespace where Foo::Bar gets stored) but really anything. If you modify this object during compilation of your module (e.g. by adding "Foo::Bar" to the "Foo" stash), the modified version of the object gets | 10:08 | |
written into the precompilation file of your module. After all, you'd expect the modification to survive precompilation and subsequent loading of your module. I.e. you expect when you load "Foo::Bar" that the "Foo" namespace still contains a Bar. In other words, the Foo::Bar module takes possession of the Foo stash object. | |||
Inside MoarVM this is done by taking the existing Foo stash object (created by loading Foo), clearing it (overwriting with 0s) and then reinitializing it with the data we got from the Foo::Bar precompilation file. | |||
Luckily I've posted that explanation to our team channel earlier :) | |||
MasterDuke | ah, that makes sense. ha | ||
nine | And I know, my sentence structure is horrible | 10:09 | |
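A minimal sketch of the clear-and-reinitialize step nine describes above, in plain C. `ToyStash`, `toy_repossess` and `toy_has` are hypothetical names made up for illustration and are not MoarVM's actual structures or API; the precompilation file is stood in for by a second struct. The point shown is only that the existing object keeps its address while its contents are zeroed and rebuilt.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a stash: a tiny fixed-size namespace. */
#define TOY_MAX_ENTRIES 4
typedef struct {
    int  num_entries;
    char names[TOY_MAX_ENTRIES][16];
} ToyStash;

/* "Repossess" an existing object: keep its address, overwrite its
 * contents with 0s, then rebuild it from the state recorded by the
 * module that modified it (here: from_precomp). */
static void toy_repossess(ToyStash *existing, const ToyStash *from_precomp) {
    memset(existing, 0, sizeof *existing);
    existing->num_entries = from_precomp->num_entries;
    for (int i = 0; i < from_precomp->num_entries; i++) {
        /* The memset above guarantees NUL termination. */
        strncpy(existing->names[i], from_precomp->names[i],
                sizeof existing->names[i] - 1);
    }
}

/* Returns 1 if the stash contains an entry with the given name. */
static int toy_has(const ToyStash *s, const char *name) {
    for (int i = 0; i < s->num_entries; i++)
        if (strcmp(s->names[i], name) == 0)
            return 1;
    return 0;
}
```

Anything that already held a pointer to the "Foo" stash keeps pointing at the same memory and sees "Bar" after the call, which is the behaviour the explanation relies on.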
jnthn | morning o/ | 10:26 | |
nwc10 | \o | 10:28 | |
11:10
linkable6 left,
evalable6 left
11:12
evalable6 joined
11:13
linkable6 joined
12:18
zakharyas left,
tobs` joined
12:19
tobs` is now known as tobs
13:15
MasterDuke left
13:19
MasterDuke joined
13:38
MasterDuke left,
MasterDuke joined
13:46
zakharyas joined
|
|||
MasterDuke | hm, when adding a new spesh candidate, it's MVM_ASSIGN_REFed `MVM_ASSIGN_REF(tc, &(spesh->common.header), new_candidate_list[spesh->body.num_spesh_candidates], candidate);`. but when removing one, the others are just memcopied into a new chunk of memory. is that sufficient? or do they need to have done to them what their initial MVM_ASSIGN_REF did? | 14:53 | |
jnthn | Should suffice because the collectable that is pointing to them was already pointing to them | 14:57 | |
When we add the other ones we also just memcpy the existing ones | 14:58 | ||
MasterDuke | doh, right! | ||
thought i'd found something when i realized github.com/MoarVM/MoarVM/pull/1426...3230cbR465 wasn't necessary (and i thought maybe dangerous because it's going through the sf again instead of using | 15:03 | ||
github.com/MoarVM/MoarVM/pull/1426...3230cbR411 ), but no change, still `MoarVM panic: Zeroed owner in item added to GC worklist` | |||
no change after removing the MVM_spesh_arg_guard_discard | |||
jnthn | What keeps it from hitting that error? | 15:04 | |
MasterDuke | not sure what you mean? | ||
jnthn | If MVM_spesh_candidate_discard_one is completely empty, does it still fail? | 15:05 | |
MasterDuke | well, commenting out the call to MVM_spesh_candidate_discard_one succeeds. commenting out the cleanup at the end github.com/MoarVM/MoarVM/pull/1426...bR470-R474 and it fails | 15:06 | |
hm. guess i can try just copying the existing stuff to new memory without actually removing anything and see what that does | 15:07 | ||
jnthn | Yeah, I was gonna say, maybe go through it uncommenting things | 15:10 | |
The spesh->body.num_spesh_candidates-- happens a bit earlier than I'd maybe expect but I don't think it's likely to be harmful | |||
MasterDuke | i had wondered about that, but moving it later didn't change anything | ||
jnthn | Did you carefully check that there's never multiple reads of cands_and_arg_guards? | 15:12 | |
MasterDuke | i'm gonna say no, because i'm not sure how i'd do that | ||
jnthn | Well, I was thinking the update in frames.c, but it looks correct | 15:13 | |
MasterDuke | oh. just copying everything instead of copying all except the one to remove succeeded just fine (for the currently failing file, now let me try the full nqp build) | 15:17 | |
i do still cleanup the old stuff, so it's something about only partially copying | 15:19 | ||
yep, nqp builds | |||
so this if/else is where the problem is github.com/MoarVM/MoarVM/pull/1426...bR437-R468 | 15:20 | ||
jnthn | Hmmm | ||
MasterDuke | of course my first thought was an off-by-one in the copying logic of the if branch, but not only did i double and triple check it, it would also sometimes happen when the else branch was taken | 15:22 | |
(though i certainly still welcome an outside review of that copying logic) | |||
jnthn | I've just stared at it for the third time, I don't think there's an off-by-one | 15:24 | |
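For reference, the "copy all except the one to remove" logic under review can be sketched as below. This is not the code from the PR: `ToyCandidate` and `toy_copy_without` are hypothetical, plain `malloc` stands in for whatever allocator the real code uses, and no GC write barrier is involved. It only illustrates the two memcpy ranges where an off-by-one would hide.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical candidate type; the real structure is much larger. */
typedef struct { int id; } ToyCandidate;

/* Build a fresh array holding every pointer from `old` except the one
 * at `remove_idx`. Returns 0 on success, -1 on a bad index or failed
 * allocation. When the last remaining element is removed, *out is NULL
 * and *out_count is 0. The caller frees `old` and *out separately. */
static int toy_copy_without(ToyCandidate **old, size_t count,
                            size_t remove_idx,
                            ToyCandidate ***out, size_t *out_count) {
    if (remove_idx >= count)
        return -1;

    size_t n = count - 1;
    ToyCandidate **fresh = NULL;
    if (n > 0) {
        fresh = malloc(n * sizeof *fresh);
        if (!fresh)
            return -1;
        /* Elements before the removed slot keep their index. */
        if (remove_idx > 0)
            memcpy(fresh, old, remove_idx * sizeof *fresh);
        /* Elements after it shift down by one. */
        if (remove_idx < n)
            memcpy(fresh + remove_idx, old + remove_idx + 1,
                   (n - remove_idx) * sizeof *fresh);
    }
    *out = fresh;
    *out_count = n;
    return 0;
}
```

Removing the first, a middle, and the last element exercises all three combinations of the two memcpy calls, which is a cheap way to rule out the off-by-one before looking for a stale reference elsewhere.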
15:25
patrickb84 joined
15:28
patrickb left
|
|||
MasterDuke | i assume there must be something still holding a reference to the non-copied candidate | 16:39 | |
17:06
squashable6 left
17:07
squashable6 joined
17:25
patrickb84 left
17:29
MasterDuke left
|
|||
dogbert2 | .seen nine | 18:16 | |
tellable6 | dogbert2, I saw nine 2021-02-04T12:02:38Z in #raku-dev: <nine> lizmat: it doesn't say any more than that line I posted :/ | ||
dogbert2 | nine: could this data race have anything to do with yesterday's SEGV? gist.github.com/dogbert17/f0d40faf...279764e1cf | 18:17 | |
I guess not but it doesn't hurt to ask :-) the line is close to the SEGV line (6360) | 18:18 | ||
18:19
domidumont left
|
|||
nine | dogbert2: err...which of the segfaults? | 18:46 | |
dogbert2 | m: my $n := 1; ($n + 1 for ^17000) xx 20 | 18:51 | |
evalable6 | (signal SIGSEGV) WARNINGS for /tmp/x7hgnyHmzK: | ||
dogbert2 | nine: ^^^ | ||
nine | ah, that one :) | 18:52 | |
dogbert2: don't think so. And jnthn seems to know already where that segfault comes from | 18:58 | ||
dogbert2 | ah well, it was worth a shot. Thx for looking into it. | 19:00 | |
nine | dogbert2: it's also hard to see where that data race would come from. Spesh's codegen writes into a freshly allocated buffer that's not visible to any outside code until codegen is done. | ||
If there's a data race there, wouldn't that imply that we're using a buffer that's still in use elsewhere? I.e. someone's still processing free'd bytecode | 19:01 | ||
Which....we don't have until MasterDuke's branch is merged :) | |||
OTOH I've just finished my strength training and brane may not yet get its usual share of oxygen | 19:03 | ||
dogbert2 | I guess we'll have to wait a few minutes to see if you change your mind :) | 19:04 | |
lizmat knows that feeling after cycling 42km at 23km/hour average | |||
(without any electric support) | |||
19:13
zakharyas left
|
|||
nine | No, it doesn't mean someone's using free'd bytecode, it means someone is using a free'd MVMCode object. If line 6350 is indeed MVMFrame *f = ((MVMCode *)GET_REG(cur_op, 6).o)->body.outer; | 19:25 | |
dogbert2 | I believe that's the line in question. Hmm, don't we have a macro to check whether we're indeed using a freed object | 19:31 | |
19:41
MasterDuke joined
|
|||
MasterDuke | dogbert2: do you get the same output if you back moarvm to just before the PR i merged yesterday? | 19:42 | |
dogbert2 | MasterDuke: yes | 20:28 | |
MasterDuke | cool, good to know i didn't introduce a race | 20:30 | |
jnthn | grmbl, my reward for refactoring the Rakudo method dispatchers to be ready to do method deferral is sigsegv | 22:02 | |
MasterDuke | sadly, those have not been in short supply recently | 22:08 | |
jnthn | gosh, these dispatchers sure pile up a capture tree... | 22:09 | |
For general entertainment... gist.github.com/jnthn/aeca91e75f20...92a97707c7 | 22:11 | ||
El_Che | "Siri, describe geek humor in one gist" | 22:12 | |
jnthn | huh, this gets stranger; it should be grabbing the attribute out of value 3, not of 2 | 22:19 | |
El_Che | my joke of the day: I am trying to get moarvm build to segfault all day in my devbuild github workflow (restarting it in a loop): nothing | 22:23 | |
on my regular package workflow where I am testing repo integrations but not the core files: 2 segfaults in a row | |||
fml | |||
jnthn | Duh, it was silly. | 22:32 | |
El_Che: "fun" | 22:33 | ||
MasterDuke | El_Che: could be the optimization level. maybe try --debug, but not --optimize=0 | 22:35 | |
El_Che | MasterDuke: I have seen the devbuild break before, just not once the core setup was ready :) | 22:36 | |
I'll run it without the optimize | |||
MasterDuke | ah, heh | ||
El_Che | Altai-man_ informed me that bintray is closing so I will be moving the repos to cloudsmith (if everything works as expected there) | 22:38 | |
so far support was fast (my gpg key couldn't be added to the setup, now ok) | 22:39 | ||
MasterDuke: it looks like default optimizing is a source of segfaults | 22:44 | ||
(probably triggered on low resource VMs/containers more than elsewhere) | |||
MasterDuke | huh | 22:45 | |
El_Che | I have seen way more segfaults in the regular builds compared to the debug ones | ||
I am building on 24 distros/version, so the numbers are there to get a segfault once in a while | 22:46 | ||
it seems not related to specific distros or releases | |||
just random | |||
rerun the job and the build is ok | |||
MasterDuke | do you get a core with the regular builds? | 22:48 | |
22:50
lizmat left
|
|||
jnthn | Enough for me today, but all the preparatory work is ready for me to have a go at callsame/nextsame with methods using the new dispatcher | 22:53 | |
MasterDuke | nice. how close would that put you toward the end? | 22:54 | |
El_Che | if you see the white light, stop | 22:55 | |
jnthn | Still quite a few bits to do, but they're extensions of what is already in place; going from no mechanism to do dispatch resumption to having the heart of one is a significant step forward. | 22:58 | |
I still need to complete the bits for multiple active resumable bits of a dispatch (wrap in multi in method...), figure out how exactly to factor nextwith and callwith (that want to change the arguments), and see if my plan for multis with `where` clauses or similar works out. | 23:00 | ||
After that, it'll be a case of migrating everything that hasn't already adopted the new dispatch mechanism to do so, making sure I can toss the current multi-dispatch cache, invocation spec and method caching mechanisms out, and teaching spesh a bunch more things about the dispatcher. | 23:03 | ||
The final step being for performance, while the others are about correctness | 23:04 | ||
Well, to the degree the whole thing isn't about performance anyway :P | |||
MasterDuke | sounds like a walk in the park | 23:08 | |
jnthn | Hurts my brain less than EA, at least :) | 23:11 | |
MasterDuke | ha | 23:14 |