Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
07:27 lizmat joined
07:49 lizmat left
09:13 librasteve_ joined
09:16 lizmat joined
09:21 lizmat left
11:23 librasteve_ left
12:26 lizmat joined
timo I can literally just™ longjmp in a signal handler? as long as the signal wasn't thrown during an async-signal-unsafe function? 16:46
Voldenet > pubs.opengroup.org/onlinepubs/9799...ngjmp.html 17:18
if I understand correctly then any call to malloc or printf becomes UB afterwards 17:19
timo right, it has to be an async-signal-safe function that was running at the time the signal was thrown 17:21
otherwise you're just fully screwed
unfortunate for me, who is using `use NativeCall; sub printf(int64 $ptr) is native(Str) { * }; printf(100); say "lol"` as a test program to get a segfault to react to 17:23
since printf is literally one of the examples in the signal-safety man page for unsafe functions 17:24
strlen would be safe, surely 17:25
yup.
Voldenet man7.org/linux/man-pages/man7/sign...ety.7.html 17:27
timo yes, that's the page i was referencing
Voldenet to me it's amazing that printf isn't safe but write is
timo well, printf goes through stdio's buffering implementation, write goes straight to the kernel i think
i haven't looked at the implementation of write, but i assume it just puts the arguments in the right order for the system call and invokes it with ... int 0x80 or the syscall instruction or whatever is what invokes system calls these days 17:28
printf and friends also involve locking, which is always a great thing to be interrupting :D 17:29
Voldenet ah, so that means that printf could be safe 17:30
timo elaborate?
Voldenet printf could, instead of operating on buffers, directly write 17:31
which doesn't seem fast, but safe
timo wait what the hell glibc 2.1 introduced functions that let you get a stack trace? for real?
yeah, if you want that you can always snprintf + write i guess?
Voldenet except snprintf is not safe either :> 17:32
timo but yes, can be much much slower, especially for very short bits ... i think printf will actually call the write function for each directive in the format string, as well as the bits in between? i could be wrong
i wonder why snprintf isn't on the safe list 17:33
stackoverflow.com/questions/678399...async-safe - "The POSIX standard does not require snprintf to be async-signal safe–let's adopt the convention of the GNU C Library manual and call that "AS-Safe" for short. However, it is possible for a vendor to implement snprintf in such a way that it is AS-Safe, by ensuring it does not make any calls to non-AS-Safe functions, such as malloc, or do anything else that might make it non-AS-Safe (e.g. attempt to take out locks or mutexes, or access global or thread-local state). And if a vendor does that, then their snprintf implementation will be AS-Safe in practice, and if they want to, they can then officially document it as AS-Safe, as an extension to the standard." 17:35
snprintf: Preliminary: | MT-Safe locale | AS-Unsafe heap | AC-Unsafe mem | 17:36
Voldenet I think setlocale can partially initialize locale for thread during snprintf somehow 17:37
or reinitialize
so if snprintf calls getlocale for something, locale could become partially initialized 17:38
timo that sounds like fun! 17:41
Voldenet docs.oracle.com/cd/E36784_01/html/...tf-3c.html
heh these docs even mention Async-Signal-Safe as long as you don't fiddle with locale
timo i'm not sure we target Solaris
backtrace() will dynamically load libgcc when first called (unless it's already there) which makes it not async signal safe unless you always pull in libgcc before the first time you want to use it ;( 17:42
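A rough sketch of the "pull in libgcc before the first time you want to use it" workaround: call backtrace() once during normal startup, so the lazy load happens outside any signal handler. This assumes glibc's <execinfo.h>; backtrace_warm_up and backtrace_dump are hypothetical names. As far as the glibc documentation goes, backtrace_symbols_fd writes straight to the file descriptor without calling malloc, and function names may only show up for symbols in the dynamic symbol table (for example when linking with -rdynamic).

```c
#include <execinfo.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Preallocated so nothing has to be allocated at crash time. */
static void *frames[MAX_FRAMES];

/* Call once at startup: forces the unwinder library to be loaded now. */
void backtrace_warm_up(void) {
    void *scratch[2];
    backtrace(scratch, 2);
}

/* Intended for use at crash time: writes one line per frame to fd. */
int backtrace_dump(int fd) {
    int n = backtrace(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, n, fd);
    return n;   /* number of frames captured */
}
```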
Voldenet in practice, if I read it correctly, longjmp in signal handlers could only work in controlled environment 17:48
since any use of an unsafe function makes the whole block of code unsafe as well 17:51
timo yeah, we definitely can't handle all situations
i'm not expecting that we could just turn a segfault into a raku exception and continue running 17:53
but it would be nice to give some information about what's going wrong beyond just "Segmentation Fault (core dumped)" for common cases
Voldenet hm, allocating a buffer and doing that write is actually not bad 17:54
right, it'd have to be preallocated, because no malloc 17:55
timo what write do you mean?
Voldenet the unistd.h one
since it's required to be async-signal-safe
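A small sketch of the "preallocated buffer plus write" idea, using only write(2) from unistd.h. The message text and the names safe_write_all and report_crash are made up for illustration.

```c
#include <errno.h>
#include <unistd.h>

/* Writes all of buf to fd using only write(2); returns 0 on success, -1 on error. */
int safe_write_all(int fd, const char *buf, size_t len) {
    int saved_errno = errno;            /* a handler should not clobber errno */
    int rc = 0;

    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: just retry */
            rc = -1;
            break;
        }
        buf += n;                       /* partial write: advance and go again */
        len -= (size_t)n;
    }
    errno = saved_errno;
    return rc;
}

/* Static storage: no malloc and no stdio buffering involved. */
static const char crash_msg[] = "fatal: caught SIGSEGV\n";

void report_crash(int fd) {
    safe_write_all(fd, crash_msg, sizeof crash_msg - 1);
}
```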
huh, `_Fork(3)` is async-signal-safe as well… 17:57
timo haha, so if i catch a segfault, the first thing i should do is fork and do the handling in the child process! 17:59
Voldenet but I wonder if it's then safe to operate on partially initialized locale 18:01
that doesn't seem sane
timo surely not 18:02
"best effort" i would say
Voldenet hm, execve is safe as well and probably more useful 18:10
timo maybe with a "traceme" :D 18:11
i'm not sure why backtrace_symbols_fd doesn't give me function names :| 18:13
18:18 librasteve_ joined
Voldenet hm, maybe backtrace + write + addr2line would work 18:19
timo oh, dope, with libffi the stack actually just continues after the bit that nativecall sets up to do the call 18:20
it has one function name for libffi out of 3 frames on the stack, and zero names out of 4 frames in libc.so 18:21
how do you use addr2line when ASLR is involved? 18:25
ah it supposedly supports symbol + offset, i guess then i need to change the --exe= to the .so for the individual line 18:27
⬢ [timo@toolbx raku]$ addr2line --exe=//var/home/timo/raku/prefix/lib/libmoar.so -a --pretty-print "MVM_nativecall_dispatch+0x1813" 18:28
0x0000000000050423: /var/home/timo/raku/moarvm/src/core/nativecall_libffi.c:1287 (discriminator 7)
so, now on a segfault moarvm forks and raises SIGSTOP, and the child attaches with ptrace (which also stops the parent, but maybe the parent reaches raise(SIGSTOP) quicker than the child reaches the attach attempt) 19:41
in the child process I should actually have the moarvm-related state of all threads readable by going via the instance and enumerating the threads, and with ptrace I can get the actual Instruction Pointer (all registers, really) for backtrace purposes and more 19:43
the backtrace convenience functions from glibc don't seem to allow using an arbitrary stack pointer + frame pointer + instruction pointer as a starting point, though, so I'd want to work with (a) libunwind directly 19:44
though not exactly sure if it's interesting to see the C stacks of all other threads when there was a segfault? 21:25
unless your program is heavily using nativecalls maybe? or to see which threads are waiting inside of like, pop or shift on a ConcBlockingQueue (i.e. worker threads waiting for a job to come through the job queue) 21:31
lizmat well, I guess any program using Inline::Perl5 might be interested in that ? 22:25
timo ah, presumably, yeah 22:32
lizmat the idea of forking after a segfault... and then inspecting the parent process... brilliant! 22:35
timo well ... maybe 22:37
the man page for async-safety points out that fork may be removed from the list of async safe functions in the future 22:38
just forking without an exec is cool because the entire parent process memory is still there in your own memory space and your stack is still the stack that led up to the crash
but at the same time, just getting rid of the other threads by forking (or cloning, which may be safe still) off a new process doesn't cause global state, like locks on stdio buffers and such, to be reset to a sane state 22:40
you can clone and exec a binary that gets the pid and thread id of your crashed thread on the commandline and that can then ptrace your crashed process to get everything out of it 22:42
but tbh at that point you can perhaps just exec gdb with a commandline that has a few commands in it that give good information 22:43
gdb isn't always there of course, or lldb, or whatever 22:45
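A sketch of what "exec gdb with a few commands" could look like from a forked child. The path /usr/bin/gdb is an assumption, and build_gdb_argv and exec_gdb_on are hypothetical names. execve is used because it is on the async-signal-safe list, and the pid is formatted by hand to avoid snprintf. On systems that restrict ptrace, attaching to the parent may need extra setup.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define GDB_PATH "/usr/bin/gdb"   /* assumption: adjust for the actual system */

extern char **environ;

static char pid_text[24];
static char *gdb_argv[8];

/* Decimal formatting without snprintf. */
static void fmt_pid(char *buf, size_t cap, pid_t pid) {
    char tmp[24];
    size_t n = 0, pos = 0;
    unsigned long v = (unsigned long)pid;

    do {
        tmp[n++] = (char)('0' + v % 10);   /* digits come out in reverse */
        v /= 10;
    } while (v != 0 && n < sizeof tmp);
    while (n > 0 && pos + 1 < cap)
        buf[pos++] = tmp[--n];
    buf[pos] = '\0';
}

/* Builds: gdb -p PID -batch -ex "thread apply all bt" */
char **build_gdb_argv(pid_t target) {
    fmt_pid(pid_text, sizeof pid_text, target);
    gdb_argv[0] = (char *)"gdb";
    gdb_argv[1] = (char *)"-p";
    gdb_argv[2] = pid_text;
    gdb_argv[3] = (char *)"-batch";
    gdb_argv[4] = (char *)"-ex";
    gdb_argv[5] = (char *)"thread apply all bt";
    gdb_argv[6] = NULL;
    return gdb_argv;
}

/* Meant to be called in a forked child, e.g. with target = getppid(). */
void exec_gdb_on(pid_t target) {
    execve(GDB_PATH, build_gdb_argv(target), environ);
    _exit(127);   /* only reached if gdb could not be started */
}
```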
lizmat ah,... ok :-( 22:46
timo doing output with only safe functions is possible 22:49
but it's a bunch of re-implementation of stuff 22:50
like V mentioned earlier, snprintf and such rely on global state for locale stuff, so outputting anything formatted would be annoying
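To give an idea of the re-implementation involved, here is a minimal sketch of formatting an address as hex without snprintf, locale, or heap. fmt_hex is a hypothetical helper; the result could then be passed to write(2).

```c
#include <stddef.h>
#include <stdint.h>

/* Writes "0x..." plus a terminating NUL into buf.
   Returns the number of characters written (without the NUL),
   or 0 if the buffer is too small. */
size_t fmt_hex(char *buf, size_t cap, uintptr_t value) {
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof value];          /* two hex digits per byte */
    size_t ndigits = 0, pos = 0;

    do {                                 /* digits come out least significant first */
        tmp[ndigits++] = digits[value & 0xf];
        value >>= 4;
    } while (value != 0);

    if (cap < ndigits + 3)               /* "0x" + digits + NUL */
        return 0;

    buf[pos++] = '0';
    buf[pos++] = 'x';
    while (ndigits > 0)
        buf[pos++] = tmp[--ndigits];     /* reverse into the output buffer */
    buf[pos] = '\0';
    return pos;
}
```

For example, the offset 0x50423 from the addr2line output earlier would come out as the seven characters "0x50423".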
lizmat ack 22:53
timo it's not entirely clear under what circumstances we can do much better than a C stacktrace + a moar stack trace and also moar stack traces of all threads (those would be possible without ptrace), and if we fork + ptrace parent we can get C stack traces of the other threads 22:55
lizmat feels like a valuable thing to have 22:56
sooo much better than "Bus Error"
lizmat gets some shuteye 22:57
timo it'd be dope to also have disassembly around the error location, but that's more libraries to pull in
good eye liz
good shut?
it'd be very valuable to have on CI jobs that otherwise don't let us get at a core dump easily 22:58
but these errors often happen on "not the default configuration", like on some variant of ARM or on windows or on macs 22:59
.o( pretending x86_64 linux is the default for my own sanity )