| IRC logs at
Set by AlexDaniel on 12 June 2018.
00:02 evalable6 left, committable6 left, linkable6 left 00:03 evalable6 joined, committable6 joined 00:04 linkable6 joined 00:30 frost-lab left 01:07 dogbert11 joined 01:10 dogbert17 left 02:10 quotable6 left, notable6 left, squashable6 left, bisectable6 left, committable6 left, bloatable6 left, evalable6 left, linkable6 left, nativecallable6 left, releasable6 left, benchable6 left, greppable6 left, coverable6 left, sourceable6 left, shareable6 left, unicodable6 left, tellable6 left, statisfiable6 left 02:11 sourceable6 joined, greppable6 joined, nativecallable6 joined, bloatable6 joined 02:12 tellable6 joined, notable6 joined, releasable6 joined, squashable6 joined, coverable6 joined, linkable6 joined, committable6 joined 02:13 shareable6 joined, benchable6 joined, quotable6 joined, statisfiable6 joined, bisectable6 joined 02:14 evalable6 joined, unicodable6 joined 02:41 lucasb left 03:13 leont left 05:36 MasterDuke left 07:40 domidumont joined 07:43 domidumont left 07:49 domidumont joined 07:56 domidumont left 07:58 domidumont joined
timotimo finally back at my regular desktop system ... 08:03
... wow, one of the drives on the raid isn't showing up, eh? that's great 08:05
08:09 sena_kun joined 08:14 dumarchie joined
dumarchie Isn't a classical example of a data race? 08:14
08:16 sivoais left, camelia left 08:20 Geth left, Geth joined 08:21 domidumont left 08:22 sivoais joined, camelia joined 08:28 domidumont joined
nine timotimo: can you re-add it? 08:36
I've had devices fall out of the RAID twice in the past couple of weeks. But a simple add and they're up again 08:37
08:52 zakharyas joined 09:08 Altai-man joined 09:10 sena_kun left 09:22 MasterDuke joined
MasterDuke timotimo: ping re 09:53
Geth MoarVM/modern-raku-script-extensions: a7efa956f9 | (Elizabeth Mattijsen)++ | 6 files
Fix some .p6 -> .raku changes that were missed

Nine++ for the spot
dumarchie MasterDuke, can you have a look at ? 10:05
MasterDuke i'm not sure what's the race there? 10:08
but i suspect you'll really need nine, jnthn, timotimo, or nwc10 to take a look 10:09
dumarchie Two threads see the cache is not properly initialized. Both assign a new object. Then later one of them finds out the object is different than expected.
MasterDuke hm 10:11
dumarchie I'm not a C programmer, but maybe even the pointer itself could become corrupted? 10:15
MasterDuke there is a mutex that's used for some multicache operations. e.g., 10:17
you could try wrapping that init in a lock/unlock 10:18
dumarchie I guess it's safe to use the mutex_multi_cache_add for initialization as well. I'll give it a try. It's probably more interesting than writing a hello_world.c :) 10:22
Otoh, do we want a lock/unlock for every MVM_multi_cache_add ? 10:23
jnthn iirc, the idea was that threads may race to install the multi cache, and whoever loses just ends up with that being GC'd later
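A minimal sketch of the "race to install, loser is discarded" idea jnthn describes, assuming a plain C11 compare-and-swap on a shared pointer. `Cache`, `shared_cache`, `try_install` and `current_cache` are hypothetical names, not MoarVM code, and `free` stands in for the garbage collector reclaiming the losing object.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a multi dispatch cache object. */
typedef struct { int entries; } Cache;

/* Shared slot; NULL means "not initialized yet". */
static _Atomic(Cache *) shared_cache = NULL;

/* Try to publish a freshly built cache.
 * Returns true if this caller won the race. A loser's candidate is
 * released here (in a VM the GC would reclaim it later instead). */
bool try_install(Cache *candidate) {
    Cache *expected = NULL;
    if (atomic_compare_exchange_strong(&shared_cache, &expected, candidate))
        return true;
    free(candidate);
    return false;
}

/* Every caller, winner or loser, continues with the installed cache. */
Cache *current_cache(void) {
    return atomic_load(&shared_cache);
}
```

Two callers that both saw an uninitialized slot can each build a candidate and call `try_install`; exactly one succeeds and both then read the same pointer through `current_cache`.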
tellable6 hey jnthn, you have a message:
dumarchie jnthn, that makes sense to me. But does the cur_node check in line 258 make sense from that perspective? 10:26
10:28 frost-lab joined 10:29 Kaeipi joined
dumarchie line 257 that is 10:29
10:29 Kaiepi left
MasterDuke dumarchie: do you still have the problem with a current rakudo? according to one of your recent comments, you still have a rakudo from 2020.10, even though your moarvm is from 2020.11 10:34
dumarchie All my recent debugging was with Rakudo v2020.10-275-gc63f078a2 10:36
MasterDuke there were some changes to cas between now and then, but i think they were just optimizations and shouldn't have changed how it works
dumarchie I don't see them in 10:39
MasterDuke 10:40
oh ha. was by you 10:41
dumarchie :) I checked and they are part of the Rakudo I built 10:42
MasterDuke but it says v2020.10-275-gc63f078a2 ? something's off then 10:43
did you pull tags? 10:47
dumarchie iirc I just did a `git pull upstream master` 10:48
That should pull tags, right? 10:49
MasterDuke nope. need to do a round of pull/push with `--tags`
dumarchie Oh... maybe a `--rebase` does it as well? That's what I normally use. 10:50
MasterDuke but this does make it interesting that i can't repro. i thought you had an old rakudo, but if not then i wonder why i don't get the panic
what's your OS and hardware? 10:51
dumarchie Windows 10. Let me figure out how to get CPU info. 10:52
Intel Core i5-6200U 10:53
I guess most devs have an AMD? 10:54
MasterDuke well, i know nine and i do 10:56
dumarchie A colleague of mine also failed to repro on Linux on AMD.
MasterDuke yeah, i'm on linux also
dumarchie But note that I can't consistently repro. Only once in a while I have the "Switching to Thread" followed by a panic. 10:58
MasterDuke what compiler are you using? 10:59
i usually build with gcc, but i'll try switching to clang
dumarchie For my latest build I used gcc provided by MinGW 11:00
11:02 tib left
MasterDuke been running in a loop for a couple minutes, still no panic 11:06
dumarchie Maybe you can speed it up with `benchmark/stack.raku 100` to limit the number of values pushed and popped per run. 11:12
MasterDuke couple minutes of that each with gcc and clang, no panic 11:23
dumarchie You also didn't see the "Switching to Thread"? 11:24
MasterDuke nope 11:28
dumarchie Can you limit the number of CPU cores MoarVM uses? 11:50
MasterDuke i think so 11:55
dumarchie Maybe it helps if you limit them to 4 or 2. 12:02
12:02 linkable6 left, evalable6 left
MasterDuke been running various configurations with 2 cores, no dice so far 12:02
12:03 linkable6 joined
MasterDuke how long does it usually take for you to get an error? 12:04
12:05 evalable6 joined
dumarchie It varies between 1 and about 20 runs. 12:10
MasterDuke i've probably done over 1k across the different configurations. seems like it might be a windows thing 12:11
i think [Coke] runs on windows, maybe he can repro to confirm 12:12
afk for a bit 12:13
13:02 lizmat_ joined 13:04 lizmat left 13:09 sena_kun joined 13:10 Altai-man left 13:12 frost-lab left 13:20 zakharyas left 13:28 leont joined 14:00 domidumont1 joined 14:02 dumarchie left, domidumont left
nine dumarchie: ignore the "Switching to Thread". That's just gdb saying that it switches to the thread that called abort() as that's most likely what you as a user want to look at 14:02
tellable6 nine, I'll pass your message to dumarchie
14:06 lucasb joined
nine Naively putting that allocation inside the locked area can cause a deadlock: allocation may trigger garbage collection. If another thread is waiting for that mutex, it won't be able to enter the GC, so GCing threads are waiting for that thread to join and that thread is waiting for the multi cache mutex to become available 14:08
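One common way to sidestep the deadlock nine describes is to allocate before taking the mutex and hold the lock only for the check-and-install step. The sketch below illustrates that pattern under stated assumptions; it is not necessarily what MoarVM does. `Cache`, `cache_mutex`, `allocate_cache` and `ensure_cache` are hypothetical names, and `calloc` stands in for an allocation that might trigger a GC run.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a multi dispatch cache object. */
typedef struct { int entries; } Cache;

static pthread_mutex_t cache_mutex = PTHREAD_MUTEX_INITIALIZER;
static Cache *installed = NULL;

/* Stand-in for an allocation that could trigger garbage collection. */
static Cache *allocate_cache(void) {
    return calloc(1, sizeof(Cache));
}

/* Returns the cache that ends up installed (NULL only if allocation
 * failed and nothing had been installed before). */
Cache *ensure_cache(void) {
    /* 1. Allocate while holding no lock, so no thread holds
     *    cache_mutex while a collection may be pending. */
    Cache *fresh = allocate_cache();

    /* 2. Hold the lock only for the short check-and-install step. */
    pthread_mutex_lock(&cache_mutex);
    if (installed == NULL && fresh != NULL) {
        installed = fresh;
        fresh = NULL;            /* ownership moved to the shared slot */
    }
    Cache *result = installed;
    pthread_mutex_unlock(&cache_mutex);

    /* 3. Drop an unused candidate outside the lock. */
    free(fresh);
    return result;
}
```

The cost of this layout is an occasional wasted allocation when the cache was already installed; in exchange, nothing that can block on the collector runs inside the locked region.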
14:11 lizmat_ is now known as lizmat 14:29 zakharyas joined 14:36 dumarchie joined
dumarchie nine, good point I guess 14:37
tellable6 2020-12-14T14:02:39Z #moarvm <nine> dumarchie: ignore the "Switching to Thread". That's just gdb saying that it switches to the thread that called abort() as that's most likely what you as a user want to look at
14:44 zakharyas1 joined 14:47 zakharyas left
dumarchie Maybe it would be better to obtain and manipulate a private pointer to the `cache` (i.e. the `cache_obj->body`) instead of manipulating and checking the `cache_obj` itself? 14:58
MasterDuke dumarchie: is valgrind available for windows? 15:00
nine dumarchie: I don't see what that would change 15:02
also that's exactly what happens in line 165 15:03
dumarchie: did you notice that the message says cur_node != 0, re-check == 0000000000000000? In my book, 0000000000000000 is decidedly 0. Previous messages in the same GH issue said <nil> which should be quite 0, too 15:09
dumarchie: what does gdb say that cur_node actually is?
dumarchie How do I ask gdb? 15:11
nine p cur_node 15:12
dumarchie Let me try to trigger another panic.
nine in that call frame. From MVM_panic you'll have to do up
dumarchie You mean `(gdb) up` and then `(gdb) p cur_node` ? 15:13
nine yes
MasterDuke might need to recompile with `--optimize=0` 15:18
dumarchie Just MoarVM, I suppose. How do I do that efficiently? 15:20
nine Just run MoarVM's manually and do a make install. No need to touch nqp or rakudo 15:21
dumarchie Running `nqp\MoarVM>perl --optimize=0 --debug=3` 15:28
15:34 MasterDuke left, Kaeipi left 15:35 Kaeipi joined, Kaeipi left 15:36 Kaeipi joined
[Coke] 15:39
15:41 MasterDuke joined
dumarchie OK, I hit the breakpoint and did `bt`. Should I also do `f 1`? 15:42
With just `up` and `p cur_node` I get `$1 = <optimized out>` 15:44
MasterDuke that should be the same as `up`, so yeah
you did `make install` after the configure? 15:45
dumarchie Yes, both in `nqp\MoarVM` and in the main `rakudo` directory. But I see that *install\bin\moar.dll* was not changed... 15:47
nine dumarchie: did you still have the debugger running when doing make install?
That could prevent make install from replacing the file 15:48
dumarchie No, I'm using just one command prompt.
Geth MoarVM/try_fix_multi_cache_add: f19e0e9f67 | (Stefan Seifert)++ | 2 files
Try to fix "Corrupt multi dispatch cache: cur_node != 0, re-check == 0"

MVM_multi_cache_find did a couple more checks when looking for a multi candidate than MVM_multi_cache_add, i.e. it was more picky. This could lead to the check for a pre-existing candidate to not find one and the code for finding the right tree node to extend getting surprised by finding a matching candidate after all.
nine dumarchie: can you please try this patch?
dumarchie It looks like the first `gmake install` from *nqp\MoarVM>* installed in *nqp\MoarVM\install\bin\* and the second `gmake install` thought that was OK... 15:56
nine dumarchie: oh, you have to specify the --prefix to
Same prefix as for rakudo 15:57
dumarchie The other files in the main *install\bin\* _were_ updated ...
nine by the make install in rakudo
dumarchie I guess so. To apply your patch I can just do a `git pull` in *nqp\MoarVM\* ? 15:59
nine and `git checkout try_fix_multi_cache_add`
then make install. After you ran again with a correct --prefix 16:00
dumarchie That would be `--prefix="c:\raku\rakudo\install\"` in my case, I guess. 16:02
nine looks good
dumarchie Do you prefer the first or the last compile error? 16:05
First: src\gen\config.c:238: error: unterminated argument list invoking macro "MVMROOT" 16:06
nine There shouldn't be any :D
what the? I was nowhere near that code
dumarchie src\gen\config.c:26:5: error: 'MVMROOT' undeclared (first use in this function); did you mean 'MVMIOOps'?
src\gen\config.c:26:12: error: expected ';' at end of input 16:07
MVMROOT(tc, config, {
^ 16:08
nine There's something seriously wrong.
Can you please try a `git clean -xfd` followed by `perl Configure.pl --prefix="c:\raku\rakudo\install"` and `make install`?
dumarchie OK 16:09
nine If that alone doesn't help, then maybe deleting the c:\raku\rakudo\install completely before that will 16:10
dumarchie I did the `git clean -xfd` and `perl --optimize=0 --debug=3 --prefix="c:\raku\rakudo\install\"`.
nine No need for --optimize=0 --debug=3 16:11
dumarchie Just to be sure: does it matter where I do the `gmake install`?
nine in the MoarVM directory
dumarchie Same error. I will delete both c:\raku\rakudo\install and c:\raku\rakudo\nqp\MoarVM\install and try again. 16:14
B.t.w. `git status` tells me "modified: 3rdparty/libuv (untracked content)". Does that matter? 16:17
config.o is now compiled fine, so I guess not 16:20
nine step by step :) 16:21
dumarchie Okay, compiled the moar binaries. I guess I now have to do another `gmake install` in c:\raku\rakudo\ 16:25
nine no 16:26
or yes, since you deleted install previously
in that case you also need a make install in nqp
dumarchie Indeed, thanks :) 16:27
+++ Rakudo installed succesfully! 16:28
5th run without debugger: 16:30
MoarVM panic: Corrupt multi dispatch cache: cur_node != 0, re-check == 000000000892F3C0
Would you like me to try again with gdb?
nine Well....this is interesting. This is the first time that re-check actually isn't 0 16:32
please, gdb would be nice
Actually it would have been a good idea to just extend the error message to include cur_node's actual value
Then one wouldn't need the debugger for that
dumarchie Shall I wait for your commit? 16:38
Geth MoarVM/try_fix_multi_cache_add: a344c580ed | (Stefan Seifert)++ | src/6model/reprs/MVMMultiCache.c
Add some more diagnostics to "Corrupt multi dispatch cache" error
nine there ^^^
MasterDuke -cur_node? why the minus? 16:42
nine Because leaf nodes (containing the index into the results array) in the multi dispatch cache tree are marked by being negative 16:46
MasterDuke ah
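A small sketch of the encoding nine explains, where a negative node value marks a leaf and its magnitude is the index into the results array. The type and helper names are hypothetical, and the sketch assumes result indices start at 1 so that 0 can keep meaning "no node"; the actual MoarVM layout may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* A node reference is one of:
 *   > 0  index of an inner tree node
 *   < 0  leaf; the negated value is an index into the results array
 *   == 0 no node */
typedef int32_t NodeRef;

/* Encode a results index (assumed >= 1) as a leaf reference. */
static inline NodeRef make_leaf(int32_t result_index) {
    return -result_index;
}

/* True if the reference points at a leaf. */
static inline bool is_leaf(NodeRef node) {
    return node < 0;
}

/* Recover the results index from a leaf reference. */
static inline int32_t leaf_index(NodeRef node) {
    return -node;
}
```

Under this scheme a diagnostic that prints `-cur_node` is printing the results index a leaf refers to, which is why the minus sign shows up in the extended error message.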
dumarchie Hmm, now it takes more time to reproduce the panic. 16:54
MasterDuke but it does still panic? 16:56
dumarchie Not yet
16:56 dumarchie left 16:57 dumarchie joined
dumarchie Finally: MoarVM panic: Corrupt multi dispatch cache: cur_node != 0: -1 (000000000891F3C0), re-check == 000000000891F3C0 16:58
nine Makes me wonder: what if there are actually 2 bugs here and I did fix the one? 16:59
dumarchie In line 210, the `MVM_multi_cache_find` could operate on a `cache_obj` that is modified by a thread that has not reached line 209 yet, I think 17:03
nine How would that be modified? 17:04
dumarchie Line 162.
nine cache_obj is a local variable
line 162 allocates a new object that other threads do not have any reference to 17:05
dumarchie I'm not going to argue with you :) 17:06
I don't know C. I just saw the parameter declared as `MVMObject *cache_obj`
17:08 Altai-man joined
dumarchie I guess that dereferences the pointer? 17:08
17:10 sena_kun left
nine that declares a pointer. cache_obj is just a pointer. dereferencing is via -> 17:11
I only see 2 ways how a second check could find a candidate where the first one didn't: either the cache changed, or the thing we're looking for changed between the first and the second find 17:13
The cache is protected by the mutex. Unless there's some other code that modifies the cache without taking the mutex, I don't see how the cache could change 17:14
MasterDuke there is a MVM_MULTICACHE_DEBUG that enables a `dump_cache`. dump at every find?
dumarchie Afk to buy food. 17:15
17:49 vrurg left 18:04 domidumont1 left
nine Following the hypothesis that the thing we're looking for changes mid-way: we're looking for an MVMCallCapture. That's a set of arguments. The arguments to cas are: (Mu $target is rw, Mu \expected, Mu \value) 18:08
18:12 vrurg joined
nine What if $target gets changed by a different thread between us checking for that cas candidate for the first time and the second time? 18:13
18:17 vrurg left
timotimo yooooo 18:38
i was without a usable desktop machine for a couple of days :| 18:39
nine ouch 18:40
timotimo upgraded my media file storage from a 12tb raid0 to a 14tb raid1 18:43
three disks to two disks
nine Can't confirm my hypothesis of the changing capture 18:57
lizmat and yet another Rakudo Weekly News hits the Net: 19:09
19:30 zakharyas1 left
dumarchie nine: in my code the type of value contained in `$target` changes from `:U` to `:D` 19:31
19:33 MasterDuke left
dumarchie Furthermore the panic appears to be triggered by `prefix:<⚛>`, but maybe that first sees a `:U` and then a `:D` because another thread already performed a `cas`. 19:46
nine yes, that's what I figured, but I still cannot reproduce it even with creating those exact circumstances 20:00
20:02 dumarchie left 20:17 vrurg joined 20:18 patrickb joined 20:21 vrurg left 21:09 sena_kun joined 21:10 Altai-man left 21:18 dumarchie joined
dumarchie nine: I enabled MVM_MULTICACHE_DEBUG and created a gist with the output of the last four additions before a panic: 21:20
The first and last tree dump appear to be the same cache. Maybe the difference between them tells you something.
21:35 patrickb left 21:44 zakharyas joined 21:53 zakharyas left 21:56 vrurg joined 22:01 vrurg left 22:04 vrurg joined 22:09 vrurg left 22:12 sena_kun left 22:19 MasterDuke joined 23:28 vrurg joined 23:33 vrurg left 23:56 vrurg joined