03:38
lizmat_ joined,
[Coke]_ joined
03:39
vrurg left,
vrurg joined
03:40
Geth left
03:41
[Coke] left,
Nicholas left,
lizmat left
03:42
gfldex left
03:43
gfldex joined
03:47
Nicholas joined
08:44
sena_kun joined
08:59
lizmat_ left
09:00
Geth joined,
lizmat joined
13:11
TheAthlete joined
17:05
TheAthlete left
lizmat | A MoarVM crash with one of my modules: gist.github.com/lizmat/24bddc13083...c52079761b | 20:24 | |
suggestions? | |||
timo | i know barely anything about lldb | 20:39 | |
ugexe | fwiw on an older rakudo i also can reproduce the segfault but the lldb output seems more useful | 20:40 | |
gist.github.com/ugexe/f33d9ec6910b...d6b5356520 | |||
timo | yeah a million times more useful for sure | 20:41 | |
could you print the values of test (probably *test) and *test->st? | 20:42 | ||
or maybe test->st before *test->st | |||
ugexe | yeah if you can tell me how to do that | 20:43 | |
timo | i guess just print test and print *test, but i'd have to look through the docs as well | ||
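A minimal lldb sketch of the commands being suggested here, assuming the crashing frame has a local pointer named test as in the session above (names will differ in other frames):

    (lldb) print test        # the pointer value itself
    (lldb) print *test       # the struct it points to
    (lldb) print test->st    # the st pointer inside that struct
    (lldb) print *test->st   # the structure st points to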
ugexe | what seems odd to me is there are only 2 threads | 20:44 | |
i guess i figured more threads would have been started by the time things get around to actually running rakudo code | 20:45 | ||
timo | if we exec processes, i assume the async io thread would be started. if we create a ThreadPoolScheduler in order to run "start" blocks we would create the supervisor thread | 20:46 | |
execing processes i would expect to happen when we do precompilation using subprocesses | |||
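For reference, a small lldb sketch of how the thread-count observation above can be checked from inside the debugger; these are stock lldb commands, nothing MoarVM-specific:

    (lldb) thread list             # enumerate the threads lldb sees in the process
    (lldb) thread backtrace all    # backtrace of every thread, to see what each was doing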
ugexe | (lldb) print *test | 20:47 | |
error: Couldn't apply expression side effects : Couldn't dematerialize a result variable: couldn't read its memory | |||
timo | ok, please try "frame variable" | ||
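When the expression evaluator fails like this, lldb's frame variable command reads locals straight from the debug info instead of compiling and running an expression; a rough sketch, reusing the variable name from the session:

    (lldb) frame variable          # dump all locals of the current frame
    (lldb) frame variable test     # just the 'test' local
    (lldb) frame variable *test    # frame variable can also follow a pointer local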
ugexe | gist.github.com/ugexe/68d6f1605b04...d0352cd53c | 20:48 | |
fwiw im on the latest rakudo commit now
(still seeing the same error and lldb output from before) | |||
timo | ok, i don't think test is supposed to be null so that's definitely funny | ||
i should try reproducing it on my linux box so i have the chance to use rr so i can debug in reverse | 20:49 | ||
hey, can you try setting spesh blocking to 1? | 20:50 | ||
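"Spesh blocking" is controlled by an environment variable that makes execution wait for specialization results instead of letting spesh run asynchronously; a sketch of re-running the failing case that way, with the script name as a placeholder:

    MVM_SPESH_BLOCKING=1 raku failing-script.raku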
we do have a MVM_spesh_dump_arg_guard(tc, sf, ag) where we would get a textual representation of the arg guard tree and its nodes, we just gotta find the SF that goes with it, that would be in the caller frame or the one above that | 20:53 | ||
ugexe | doesn't seem to have changed anything | ||
timo | indeed, the MVM_spesh_poll_for_result has a local variable sf | 20:54 | |
"frame select 1" should give you the frame for poll_for_result | 20:56 | ||
ugexe | gist.github.com/ugexe/4a92e11bf66d...700e8e240e | 20:57 | |
timo | yep, in there you can "frame variable" to get the value of sf that we need | 20:59 | |
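Putting the last two suggestions together, a sketch of the command sequence (the frame number depends on the actual backtrace):

    (lldb) frame select 1      # switch to the caller frame containing poll_for_result
    (lldb) frame variable sf   # read its local 'sf', the static frame needed below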
ugexe | gist.github.com/ugexe/44f4ae0766a2...61456a0902 | 21:00 | |
:) | |||
timo | then we should be able to "print MVM_spesh_dump_arg_guard(tc, sf, spesh->body.spesh_arg_guard)" | ||
haha, thanks optimization | 21:01 | ||
ugexe | (lldb) print MVM_spesh_dump_arg_guard(tc, sf, spesh->body.spesh_arg_guard) | ||
timo | ok instead of sf we can just use tc->cur_frame->static_info i think | ||
ugexe | error: Couldn't materialize: couldn't get the value of variable sf: variable not available | ||
error: errored out in DoExecute, couldn't PrepareToExecuteJITExpression | |||
timo | maybe it's cur_frame->body.static_info | ||
ugexe | (lldb) print MVM_spesh_dump_arg_guard(tc, tc->cur_frame->static_info, spesh->body.spesh_arg_guard) | 21:02 | |
(char *) 0x00000350936ae000 "Latest guard tree for 'insert-also' (cuid: 24, file: site#sources/1CA230DD87FD1ED08E9604B09C828C9D9EA80971 (Array::Sorted::Util):153)\n\n0: CALLSITE 0x100c50060 | Y: 1, N: 0\n1: LOAD ARG 0 | Y: 2\n2: STABLE CONC Int | Y: 3, N: 0\n3: LOAD ARG 1 | Y: 4\n4: STABLE CONC Array | Y: 5, N: 0\n5: RESULT 0\n\n" | |||
timo | anyway, it's likely that MVM_SPESH_OSR_DISABLE=1 will make the crash go away, but we do want to figure out why the crash happens of course | 21:04 | |
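Like spesh blocking above, this is an environment variable; a sketch of the suggested workaround, the script name again being a placeholder:

    MVM_SPESH_OSR_DISABLE=1 raku failing-script.raku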
ugexe | github.com/lizmat/Array-Sorted-Uti...kumod#L153 | ||
i think that is the line referenced | |||
timo | ok cool so, the node we're crashing on is current_node=2, so the check if what we have is a concrete Int in argument 0 | 21:05 | |
does "print MVM_dump_backtrace(tc)" work? | 21:06 | ||
ugexe | gist.github.com/ugexe/a58660d63332...26a6065f75 | ||
timo | ok i'm not sure we can realistically expect to find what exactly goes wrong without some kind of record&replay / time-traveling debugger like rr | 21:08 | |
let me try to reproduce this locally | |||
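A rough sketch of the rr workflow being considered; rr replays under gdb rather than lldb, the raku invocation is a placeholder, and the watch expression is only an illustration of the idea:

    rr record raku failing-script.raku   # record a run that ends in the crash
    rr replay                            # replay exactly the same run under gdb
    (gdb) continue                       # run forward to the crash
    (gdb) watch -l some->suspect_field   # hypothetical: watch the memory that ends up bad
    (gdb) reverse-continue               # run backwards to whoever last wrote it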
perfect, it also crashes here | 21:12 | ||
unlucky: when i "rr record" i get a different kind of crash, and i'm not 100% sure why | 21:22 | ||
here's my hypothesis without solid evidence so far: we might be trying to osr-check inside the body of a dispatcher and we're getting very confused™ about it | 21:36 | ||
no, maybe the output of dump_bytecode was putting the arrow in the wrong spot | 21:55 | ||
i really want something for rr that lets me have some kind of timeline to see where i am, and where i was recently | 22:20 | ||
there's an uninline that happens nearby, i sure hope this isn't a case where uninline messes up, that's no fun to debug :ed | 22:25 | ||
:D | |||
22:47
sena_kun left
22:52
[Coke]_ is now known as [Coke]
timo | you know, we should really try building a mode or tool that randomly fails guards so we can torture-test our deoptimization code | 23:52 |