github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018. |
|||
00:58
leont joined
01:49
Altai-man_ joined
01:52
sena_kun left
02:15
sivoais left
02:20
leont left
03:50
sena_kun joined
03:52
Altai-man_ left
05:50
Altai-man_ joined
05:52
sena_kun left
07:47
gugod1 is now known as gugod
07:51
sena_kun joined
07:52
Altai-man_ left
|
|||
nine | dogbert17: oh, I would love to know that! | 08:17 | |
dogbert17: can you reproduce it somehow at least some times? | |||
08:38
patrickb joined
09:50
Altai-man_ joined
09:53
sena_kun left
10:39
leont joined
|
|||
dogbert17 | nine: I can reproduce the problem if I run the scripts in a loop | 11:27 | |
it looks like this: gist.github.com/dogbert17/892d5892...580cf59e39 | 11:30 | ||
11:51
sena_kun joined
11:52
Altai-man_ left
13:14
lucasb joined
13:17
regreg left
13:50
Altai-man_ joined
13:52
sena_kun left
14:36
patrickb left
15:42
patrickb joined
15:51
sena_kun joined
15:52
Altai-man_ left
16:43
lucasb left
|
|||
nine | dogbert17: oh, so it's failing to acquire some lock | 16:45 | |
17:32
domidumont joined
|
|||
nine | Of course the issue refuses to get reproduced on my machine | 17:35 | |
17:49
Altai-man_ joined
17:52
sena_kun left
|
|||
nine | dogbert17: it would be interesting to know what return value causes the locking failure. We have that in r in 3rdparty/libuv/src/unix/signal.c:121 but change all return values to -1 in line 124. That seems unnecessary as all non-0 return values are interpreted as failure anyway. | 18:10 | |
dogbert17: you could change that to just return r and print the error value to stderr in line 150 before calling abort() | 18:11 | ||
dogbert17 | nine: I'll give it a shot | 18:19 | |
I usually run a spectest in another window in order to stress the system a bit | 18:24 | ||
nine | tried that already with TEST_JOBS=80 | 18:25 | |
dogbert17 | Now that's a high number :) | 18:27 | |
Now the question is, will it crash again ... | 18:32 | ||
patrickb | Can someone have a look at github.com/MoarVM/MoarVM/pull/1248 It's a pretty basic PR. | 18:34 | |
nine | patrickb: why is that even using qx in the first place? | 18:51 | |
patrickb | nine: What would be the alternative? | 18:52 | |
nine | system of course | 18:53 | |
Or open if you need to read the command's output | 18:56 | ||
Both take lists of arguments which is much safer | |||
open my $fh, '-|', $cmd, $arg1, $arg2, ...; | 18:57 | ||
18:58
domidumont left
|
|||
dogbert17 | Heh, the bug seems to have gone on strike | 18:59 | |
Noticed another strange thing though. I'm running while ./nqp t/nqp/111-spawnprocasync.t; do :; done | 19:00 | ||
and suddenly some odd text was printed between tests. It turned out to be this: | 19:01 | ||
Out-of-sync package detected in package_declarator_class at oder') {} | |||
my $dec := nqp::crea | |||
(value in braid: GLOBALish, value in $*PACKAGE: GLOBALish) | |||
patrickb | nine: I'll look | ||
dogbert17 gets something to eat | 19:09 | ||
19:16
zakharyas joined
19:50
sena_kun joined
19:52
Altai-man_ left
|
|||
dogbert17 | nine: it seems as if the 'real' return value is also -1 | 19:57 | |
nine | dogbert17: oh, damn....should have read the docs first: "On error, -1 is returned, and errno is set appropriately." | 20:07 | |
according to read(2) | |||
dogbert17 | fortunately I printed that value as well :-) | 20:08 | |
it's YOU | 20:09 | ||
i.e. errno = 9 :) | |||
patrickb | nine: How would I go about using system or open when I want to suppress and capture stdout and stderr? | ||
moritz | patrickb: is that a perl 5 or a raku question? | 20:12 | |
nine | moritz: perl 5 | 20:13 | |
moritz | then IPC::Run is an often cited answer to that question | 20:14 | |
or IPC::Open2 / IPC::Open3 if you are looking for something in perl core | |||
patrickb | moritz: Perl5. I was just tasked with removing qx from github.com/MoarVM/MoarVM/blob/mast...modules.pl | 20:16 | |
nine | patrickb: why do you even capture STDERR? Frankly, I don't think it's worth the hassle | 20:17 | |
patrickb | It's used to provide an error message containing the original output. | 20:18 | |
nine | But if you don't capture the original output it will still be there | 20:19 | |
The order may be reversed, but that's still better than not working at all if your directories are named a bit unusual | 20:20 | ||
patrickb | True. I'm more concerned about 2>&1 not working on Windows than I am about the dir quoting though. | ||
nine | dogbert17: errno 9 is EBADF /* Bad file number */ | 20:21 | |
dogbert17: so my guess would be that it's -1 in this case | 20:27 | ||
dogbert17 | can file descriptors be negative? | 20:28 | |
nine | No. -1 is used in libuv to signify that they are not yet opened: static int uv__signal_lock_pipefd[2] = { -1, -1 }; | 20:29 | |
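The EBADF diagnosis above can be checked in isolation. This is a minimal sketch assuming a POSIX system, where read(2) on a descriptor of -1 returns -1 and sets errno to EBADF (9 on Linux). The helper name try_read_byte is hypothetical and is not part of libuv.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not a libuv function): try to read one byte
 * from fd. Returns 0 if read() did not fail, otherwise the errno
 * value that read() set. Retries when interrupted by a signal. */
static int try_read_byte(int fd) {
    char c;
    ssize_t r;

    do {
        r = read(fd, &c, 1);
    } while (r < 0 && errno == EINTR);

    return r < 0 ? errno : 0;
}
```

Calling try_read_byte(-1) mirrors a read on a lock pipe descriptor that still holds the -1 "not yet opened" marker, or one that has already been closed.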
20:30
sivoais joined
|
|||
dogbert17 | nine: I think I managed to catch the problem in strace | 20:44 | |
nine: gist.github.com/dogbert17/c2b2687e...6ebedc72ea | 20:46 | ||
20:47
zakharyas left
|
|||
nine | dogbert17: but that looks like a normal successful run? | 20:49 | |
dogbert17 | but what about the line: [pid 9135] tgkill(9133, 9135, SIGABRT <unfinished ...> | ||
line 122 in the gist | |||
nine | oh....got distracted by all the exited with 0 | 20:50 | |
dogbert17 | I have seen the r = -1 and errno = 9 sometimes but my while loop running the script didn't quit. This is just like it I believe. | 20:51 | |
do you want a normal run to compare with? | 20:54 | ||
nine | I've done that locally | 20:55 | |
Also of interest: [pid 9135] <... write resumed> ) = -1 EPIPE (Broken pipe) | 20:57 | ||
The writes and reads of "*" are uv__signal_lock and uv__signal_unlock | 20:58 | ||
So the write to the broken pipe is uv__signal_unlock | |||
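The EPIPE seen in the strace output can be reproduced outside libuv. This is a minimal sketch assuming a POSIX system: writing the "*" token to a pipe whose read end is already closed fails with EPIPE. The function name is hypothetical, and ignoring SIGPIPE here is an assumption of the sketch so that the error shows up as a return value. It is not a claim about how libuv or MoarVM configure signals.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical demo (not libuv code): create a pipe, close its read
 * end, then write one "*" byte to the write end.
 * Returns the errno from the failed write, 0 if the write
 * unexpectedly succeeded, or -1 if the pipe could not be created. */
static int write_after_reader_closed(void) {
    int fds[2];
    char token = '*';
    int err = 0;
    void (*old_handler)(int);

    if (pipe(fds) != 0)
        return -1;

    /* Assumption of this sketch: ignore SIGPIPE so the failing
     * write() returns -1 instead of terminating the process. */
    old_handler = signal(SIGPIPE, SIG_IGN);

    close(fds[0]);                    /* the reader goes away first */
    if (write(fds[1], &token, 1) < 0) /* then the "unlock" write fails */
        err = errno;

    close(fds[1]);
    if (old_handler != SIG_ERR)
        signal(SIGPIPE, old_handler); /* restore previous disposition */

    return err;
}
```

On Linux the function is expected to return EPIPE, matching the "write resumed ... = -1 EPIPE (Broken pipe)" line quoted from the gist.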
vrurg | How hard would it be to dump stack trace in MVM_exception_throwpayload prior to a panic? I'm getting "MoarVM panic: Internal error: Unwound entire stack and missed handler" and can't locate the problem. | 20:59 | |
nine | vrurg: I'd just set a break point there and call MVM_dump_backtrace(tc) | 21:00 | |
vrurg | nine: Sorry, missed the key point: raku/nqp stack. | 21:01 | |
nine | vrurg: MVM_dump_backtrace will give you exactly that | 21:02 | |
vrurg | Oh, thanks! | ||
nine | dogbert17: I'm pretty sure "[pid 9133] close(4 <unfinished ...>" is closing uv__signal_lock_pipefd[0], and that's why the "[pid 9135] write(5, "*", 1 <unfinished ...>" immediately after it fails | 21:03 | |
vrurg | Though I'm afraid there could be another problem. The panic is produced by frame unwinding. And it leaves NULL in tc->cur_frame. Can I just temporarily store the original pointer without taking any measures against gc collection? | 21:04 | |
nine | dogbert17: can you try it with libuv commit 1de151700d30629090d6a06e47cb92d538601577 reverted? | 21:07 | |
dogbert17 | ah, you suspect a libuv bug? | 21:12 | |
nine | Well, I suspect it's a regression and we did have a libuv bump recently. And that bump brings a commit that touches the signal code and is related to closing. Worth a try I'd say | 21:14 | |
dogbert17 | gah, my revert gets reverted when building moarvm | 21:22 | |
nine | dogbert17: looks like you need to commit that submodule change in the MoarVM repo, too | 21:25 | |
dogbert17 | yeah | 21:26 | |
do I have to rebuild rakudo as well? | 21:29 | ||
nine | no | ||
dogbert17 | unless I did something wrong it does not help, I still see | 21:34 | |
[pid 13713] exit_group(0) = ? | |||
[pid 13739] tgkill(13713, 13739, SIGABRT <unfinished ... | |||
nine | Well, would have been nice... | 21:45 | |
dogbert17 | for how long have you noticed this problem? | 21:46 | |
21:50
Altai-man_ joined
|
|||
nine | First confirmed time seems to be February 5th | 21:50 | |
no, the 7th | |||
dogbert17 | That's the same day that we updated to libuv 1.34.2 | 21:51 | |
nine | Can you try downgrading again? | 21:52 | |
Maybe I just suspected the wrong commit | |||
21:52
sena_kun left
|
|||
patrickb | nine: Now I switched over to open(). Mind to have another look? | 21:54 | |
dogbert17 | the previous version we used was 1.33 | ||
which was merged on Oct. 18 2019 | 21:55 | ||
Geth | MoarVM: 9a706ba082 | (Patrick Böker)++ | tools/update-submodules.pl Change update-submodules.pl to not use the shell at all This should make this more platform agnostic and a bit safer. As a side effect of this the detailed error reporting that checked the STDERR and reported based on that is now gone, because it's not easily possible in perl to capture STDERR without using backticks. | 22:02 | |
MoarVM: bb69a97426 | niner++ (committed using GitHub Web editor) | tools/update-submodules.pl Merge pull request #1248 from patrickbkr/git-cache-dir-with-spaces Fix buiding when git reference dir has spaces in its path | |||
dogbert17 | I get tgkill(4179, 4190, SIGABRT <unfinished ...> with version 1.33 of libuv as well | 22:03 | |
something must have changes so that the exitcode is not always zero | |||
*changed | |||
nine | So you get that SIGABRT in strace but no core dump? | 22:05 | |
dogbert17 | correct | ||
nine | so it _is_ the change in libuv | ||
is the same true for the revert you tried previously? | 22:06 | ||
dogbert17 | yes | ||
it coredumps quite seldom but the tgkill line can be reproed quite easily by having a spectest running in the background | 22:07 | |
nine | Oooh...could it just be that it previously swallowed that SIGABRT? | ||
dogbert17 | quite possible | ||
nine | * Calls to raise() or abort() to programmatically raise a signal are not detected by libuv; these will not trigger a signal watcher. | 22:08 | |
according to the docs | |||
22:10
zakharyas joined
|
|||
leont | Catching SIGABRT is generally a terrible idea, only acceptable use-case would be attaching a debugger or some such | 22:12 | |
nine is off to bed now | 22:21 | ||
patrickb | o/ | 22:22 | |
22:33
zakharyas left
|
|||
Geth | MoarVM: patrickbkr++ created pull request #1256: Fix submodule updating | 22:40 | |
patrickb | The above should be merged soon. Build from fresh checkouts broke with the previous commit. The above is the corresponding fix. | 22:41 | |
23:19
harrow left
23:45
patrickb left
23:47
harrow joined
23:50
sena_kun joined
23:52
Altai-man_ left
|