#moarvm on 5 November 2024 - Raku Programming Language Log

Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021.
04:09 lizmat_ joined 04:12 kjp left 04:13 lizmat left, kjp joined 04:14 kjp left, kjp joined 05:25 Techcable left, harrow left, Nicholas left, gfldex left, ilogger2_ left, tbrowder left, committable6 left, quotable6 left 05:27 committable6 joined, quotable6 joined, harrow joined, Nicholas joined, gfldex joined, ilogger2_ joined, tbrowder joined 05:28 Techcable joined 05:32 Techcable left 05:33 Techcable joined 07:39 lizmat_ left, lizmat joined
patrickb	I have a malloced C string that I'd like to throw in an adhoc exception. But I see no way to do so without leaking the memory in that string. Is there a way?	08:55
lizmat	timo or nine might know a way	08:58
timo	yeah there is
patrickb	Oh I think I see. There is a waste arg.	08:59
timo	MVM_exception_throw_adhoc_free i think is what the function is called, you pass it an array of pointers to c strings that ends with a NULL to signal the end, and it takes care of freeing the values for you after making the string from the format
patrickb	Whoop!
timo	btw i've created a new command for the debug server
patrickb	Do tell?	09:01
	!
timo	it 1) lists all filenames it knows about (for breakpoints) and 2) sends you notifications whenever new files are introduced to the files table, and it allows you to optionally ask the thread to stop after encountering any new file	09:02
	plus, for convenience, send a stack trace immediately if you want
patrickb	Oh! That's kind of related to what I want to do. I.e. make moar send the source of such files.	09:04
timo	right
lizmat	thread name PR merged	09:05
timo	for the next step i want to also make a big ol' list of line numbers that have a breakpoint instruction corresponding to them somewhere
	make that available via the protocol
lizmat	ok, now that we're all here: I'd like to make a module today that would be callable with -Mfoo from the command line	09:06
	and which would force a full stack trace when Ctrl-C is pressed	09:07
timo	since the breakpoints are based on annotations being read and turned into breakpoint instructions when the instrumentation barrier is first hit by a frame, that makes it kind of "lazy" when lines from the file show up
lizmat	basically, it would invoke itself with the debug-server enabled, and add a signal handler
patrickb	timo: What's the use of such a list? Is it just so that the client doesn't have to keep track of active breakpoints?	09:08
timo	plus i would really like there to be more specificity to breakpoint locations, since one line can have instructions from multiple frames in it (think of a line with multiple map instructions with blocks and what-not)
	it's needed so that the user can get feedback on whether a line they put a breakpoint on will even ever trigger or not	09:09
	i may also provide a way to force all frames from a given compunit, or all frames everywhere, to be instrumentation-barriered immediately, or perhaps a way to filter all frames by what file names are in their annotations	09:10
	i'm a bit annoyed by the commands in the moar remote app. they are kind of wordy. it'd really be great to have something more TUI like that also has some keybinds maybe. though it's not required to make it TUI-like to support some keys on the input side, of course	09:14
	patrickb: do you already have something in mind for how to register what source belongs to a given compunit? i'm thinking the only really reliable way to work with this is to have rakudo and nqp call a moar syscall (for example) that takes the compunit and either the source text as a string, or maybe a path to a local file, plus some extra information. for example, we do a lot of small compilations	09:19
	for BEGIN time things (rakuast improves upon this by not always having to compile to run code) and there it'd be interesting to have a start and end position of the relevant code as well
	and it'd be good to be able to grab the bytecode of a frame over the connection as well, but for that we would have to come up with a format :D	09:25
09:49 sena_kun joined
patrickb	I have still very little idea of all the moving parts in this. But my naive idea was to have rakudo annotate the source to the comp units. (And provide an opt out.)	10:12
	So we'd have the source embedded in the bytecode. I'd ideally zstd compress it (and put zstd into moar by default).	10:13
	patrickb is back to $work	10:15
	timo: I'm pretty sure you have a lot more context in that area. Before I start working on this we should definitely first consult and make a plan together.	10:36
13:27 sena_kun left 13:33 sena_kun joined 13:34 sena_kun left, sena_kun joined
timo	you know, we don't have to change anything about the moarvm file format, we can just have a buffer with the zstd compressed source serialized into the serialized blob, and some way to find which object is the one we want when looking for the sources	14:17
lizmat	could even be an HLL-level thing in RakuAST	14:41
	for each frame, a substr into the total source file?
timo	it'd be nice if the source string were not deserialized and decompressed until it is actually accessed, that should definitely be possible in terms of SMOP in the rakuast nodes	14:42
	otherwise we're going to get a bad memory usage increase for the core setting for example	14:43
lizmat	right
timo	but also i'm not entirely sure if putting source text, even compressed, into the precomp files is a great idea	14:44
	we already install sources alongside distributions, right? with a hash of the contents?
lizmat	I mean, the core c setting source gzipped is 500K or so	14:45
	the core c setting bytecode file is about 29M	14:46
	feels like an additional 500K would only make it about 2% larger?
timo	even with zstd -19 it's 431 KiB big	14:47
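A quick check of the size estimate, as a sketch only. It assumes "500K", "431 KiB" and "29M" are all binary units (KiB and MiB), which the log does not spell out; the helper name is made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def relative_growth(extra_bytes: int, base_bytes: int) -> float:
    """Return how much larger the base file gets, in percent."""
    return 100.0 * extra_bytes / base_bytes


# Figures quoted in the log; the unit interpretation is an assumption.
bytecode = 29 * MIB                               # core c setting bytecode
gzip_pct = relative_growth(500 * KIB, bytecode)   # gzipped source, "500K or so"
zstd_pct = relative_growth(431 * KIB, bytecode)   # zstd -19 result

print(f"gzip: +{gzip_pct:.2f}%  zstd -19: +{zstd_pct:.2f}%")
```

Under those assumptions both variants stay below the "about 2%" estimate.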
lizmat	wonder whether the .Uni of the source would even compress better	14:50
timo	i can give that a try
	aww, you can only spurt a buf8 or buf16	14:54
	we don't have anything fast built-in to turn a buf32 into a buf16 or buf8 with big or little endian byte ordering?	14:55
patrickb	The reason why I'd like to have it in the byte code is availability and	14:57
	extensibility.	14:58
timo	and the issue of source on disk different from source in storage probably also?
patrickb	If you put it into the bytecode, then moar will by design not run into issues of the likes of not finding the sources.
timo	well, maybe not for installed dists, but for .precomp files when you use -I .
patrickb	Also it's possible to provide sources for things that don't exist on the disk. E.g. EVALed stuff.	14:59
timo	indeed, but for EVALed stuff we don't have to store it in the compunit since the debugserver is also in the process
	no need to compress then either, and can leave it out if the debugserver is not turned on	15:00
patrickb	That would shift a lot of the source handling logic into moar, right?
timo	can you elaborate a bit? i'm getting a headache (unrelated to this discussion) and thinking is a bit hard right now	15:01
patrickb	If it'd be an annotation on the CompUnit, then moar would not have to deal with how the source is generated at all.
	The Raku compiler would add the annotation during compilation. Moar would simply reach for that annotation and return it if found.	15:02
	All the fancy stuff of how to find the source (on the disk, some EVAL string, some deparsed RakuAST) would then live in the compiler, not in moar.	15:03
timo	ok, so the behaviour of read-uint8 on a buf32 is that the offset you pass in selects the array slot, then you get the lowest 8 bits of the value in it as your result	15:04
lizmat	there's also an Endian argument	15:05
	hmmm	15:06
timo	right, i imagine that will let you get either the highest or the lowest byte, but not the two in the middle :D
lizmat	actually, for read-uint8 the endian setting has no meaning
	for buf32 you'd need read-uint32 ?
timo	yeah but i want to get the individual bytes so i can write them out :D	15:07
	well, we can write_fhb an int16 buf so i can get the lower and upper two-bytes with read-uint16 and stash them in my result buf16
	actually, if i get the BigEndian first, then the LittleEndian, for a 16bit read, i get the incredibleness of middle-endian in the result	15:09
	oh no
	actually what i get with LittleEndian and BigEndian is the lowest 16 bits both times, but once "backwards" and once "forwards"	15:10
	the upper 16 bits of the 32bit buf entry remain inaccessible to /.read-u?int[8|16]/	15:11
	could be easiest to just create a CArray to hold all the data and just nativecast it
lizmat	well, the semantics of the .readxxx methods are based on blob8	15:12
timo	that's fair
	it's tricky to make it make sense intuitively + consistently	15:13
lizmat	yup	15:15
	afk&
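The read behaviour timo describes can be modelled in a few lines. This is a sketch of the observations in the log, not Rakudo's implementation: the function names are illustrative, and which endian flag yields the "forwards" result is an assumption (a little-endian host). The last helper shows the plain way to get all four bytes of each 32-bit element.

```python
import struct


def read_uint8(buf32, offset):
    # As described in the log: the offset selects the array slot and
    # only the lowest 8 bits of that slot are returned.
    return buf32[offset] & 0xFF


def read_uint16(buf32, offset, little_endian=True):
    # Both byte orders only ever see the lowest 16 bits of the slot;
    # one of them returns those two bytes swapped.
    low16 = buf32[offset] & 0xFFFF
    if little_endian:
        return low16
    return ((low16 & 0xFF) << 8) | (low16 >> 8)


def buf32_to_bytes(buf32, little_endian=True):
    # Serialize every element as a full 32-bit unsigned integer, so the
    # upper 16 bits are no longer lost.
    fmt = ("<" if little_endian else ">") + f"{len(buf32)}I"
    return struct.pack(fmt, *buf32)


slot = [0x11223344]
print(hex(read_uint8(slot, 0)))
print(hex(read_uint16(slot, 0, True)), hex(read_uint16(slot, 0, False)))
print(buf32_to_bytes(slot).hex(), buf32_to_bytes(slot, False).hex())
```

In this model the `0x1122` half of the element never shows up in either 8-bit or 16-bit reads, which matches the "upper 16 bits remain inaccessible" remark.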
timo	508 KiB core_c_setting_uni_nfc_to_32bit_binary.bin.zst; 432 K gen/moar/CORE.c.setting.zst	15:20
	there are 3053 codepoints outside of 0x00..0xff in the core setting, compared to 3262397 inside 0x00..0xff	15:40
	looks like the variable size of utf8 is working very well here, and zstd -19 doesn't gain anything from using a fixed width encoding
15:47 sena_kun left, sena_kun joined
timo	zstd --ultra --long -22 is not noticeably better either	15:53
	like 7KiB better for 32bit integers and 1KiB better for utf8	15:54
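The codepoint counts explain why a fixed-width encoding has nothing to offer here. The sketch below redoes that arithmetic and adds a small helper for measuring any string; Python's `str` stands in for Raku's `.Uni`, and the helper name and sample text are illustrative only.

```python
def encoding_sizes(text: str) -> dict:
    """Count codepoints and compare UTF-8 with fixed-width 32-bit storage."""
    return {
        "codepoints": len(text),
        "outside_latin1": sum(1 for ch in text if ord(ch) > 0xFF),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf32_bytes": len(text.encode("utf-32-le")),  # "-le" avoids a BOM
    }


# Counts quoted in the log for the core setting.
outside, inside = 3053, 3262397
total = outside + inside
fixed_width_bytes = 4 * total                # one 32-bit value per codepoint
share_outside = 100.0 * outside / total      # percent needing more than 8 bits

print(f"{total} codepoints, {fixed_width_bytes} bytes as 32-bit values")
print(f"{share_outside:.3f}% of codepoints are outside 0x00..0xff")

sample = encoding_sizes("say 'h\u00e9llo' \u221e")
print(sample)
```

With under a tenth of a percent of codepoints outside 0x00..0xff, the 32-bit form starts out roughly four times larger before compression, so the compressor has to win all of that back just to break even.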
	patrickb: do you have a good idea for how we would handle source files that use #?line directives?	15:55
patrickb	Unsure, but isn't that a non-issue?	15:58
timo	not exactly sure
	i'm going to go lie down for my head :|	15:59
patrickb	The #?line thing is there to bridge the gap between what's in the source that's compiled and the original source files, right?
	Recover well,
timo	right, we mainly use it for anything we gen-cat while building rakudo, but i think users can also use it if they like	16:00
patrickb	If so, then the file that's actually compiled is the one that we would end up putting in the byte code. So there is no line number mismatch anymore.	16:01
	We would simply ignore those #?line statements.
timo	the annotations we put into the moarvm file reference lines "after" remapping to individual files	16:02
patrickb	Oh.	16:03
	So a single comp unit can reference lines in multiple files?	16:04
timo	that's right. but we don't want to just get the contents of these files as they are, since gen-cat also handles #?if moar and #?if jvm and whatnot
	they make the line numbers line up by emitting empty lines where something didn't match though	16:05
patrickb	Would it be feasible to have "raw" line number annotations in addition to the "mapped" ones we already have?
timo	i wouldn't like them to be on every annotation, but it'd be fine to have a mapping from filename to line (maybe character) offset into the source so you can fix them up very cheaply	16:06
	although	16:07
	don't try what happens when you have something like #?line foo.txt 1 bla bla #?line bar.txt 1 bla bla #?line foo.txt 10 lol tricked you #?line bar.txt 99 what
	i.e. the same file name multiple times in the file
	or even the same filename a few times in a row but with overlapping starting line numbers?	16:08
	ok the syntax is different
	i thought the #line comment had to be a little bit more special than that. i wonder if anybody put a comment like that by accident in their own code? maybe it's only active during rakudo build somehow??	16:09
patrickb	Would it make more sense to have annotations for raw line numbers only and a mapping that takes the #?line magic into account? Feels a bit backwards to do these transformations and then add logic to revert it again.
timo	the logic to revert it would only be for the debug server / client, whereas the correctly mapped line numbers are used in stack traces and everything	16:12
patrickb	right
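The "mapping from filename to line offset" idea can be sketched as a small segment table. Everything here is an assumption for illustration: the directive is taken to look like `#line N filename` on a line of its own (the log only says the real syntax differs from the `#?line` example), and none of the function names exist in MoarVM or Rakudo.

```python
import re
from bisect import bisect_right

# Assumed directive shape: "#line <number> <filename>" on its own line.
DIRECTIVE = re.compile(r"^#\s*line\s+(\d+)\s+(\S+)\s*$")


def build_segments(source: str):
    """Return [(raw_start, filename, mapped_start)], one per directive.

    raw_start is the 1-based line, in the concatenated source, of the
    first line after the directive; mapped_start is the number it reports.
    """
    segments = []
    for raw, line in enumerate(source.splitlines(), start=1):
        m = DIRECTIVE.match(line)
        if m:
            segments.append((raw + 1, m.group(2), int(m.group(1))))
    return segments


def raw_to_mapped(segments, raw_line):
    """Map a line of the concatenated source to (filename, line)."""
    starts = [s[0] for s in segments]
    i = bisect_right(starts, raw_line) - 1
    if i < 0:
        return None  # before the first directive
    start, filename, mapped = segments[i]
    return filename, mapped + (raw_line - start)


def mapped_to_raw(segments, filename, mapped_line, total_lines):
    """Map (filename, line) back to every matching raw line.

    A list is returned because the same filename may appear in several
    segments, possibly with overlapping line ranges.
    """
    hits = []
    for i, (start, name, mapped) in enumerate(segments):
        # The segment ends just before the next directive line.
        if i + 1 < len(segments):
            end_exclusive = segments[i + 1][0] - 1
        else:
            end_exclusive = total_lines + 1
        raw = start + (mapped_line - mapped)
        if name == filename and start <= raw < end_exclusive:
            hits.append(raw)
    return hits


source = "\n".join([
    "#line 1 foo.txt", "a", "b",
    "#line 1 bar.txt", "c",
    "#line 10 foo.txt", "d",
])
segments = build_segments(source)
print(segments)
print(raw_to_mapped(segments, 3), mapped_to_raw(segments, "foo.txt", 10, 7))
```

Returning a list from the reverse lookup keeps the repeated-filename case from the log visible to the caller, who can then decide how to report an ambiguous breakpoint location.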
timo	OK i'll go rest for real now
patrickb	o/
timo	feel free to investigate, i don't know very much about the machinery behind line numbers and annotations and such	16:13
	it randomly occurs to me that there's more "interesting" stuff to account for in terms of filename / line number stuff in the context of the debug server. for example, we allow multiple versions of the same module to be loaded simultaneously. if we don't give sources/AB/CDEFABCDEF as the filename, the user of the debug protocol can't really express which of the files with the same name they mean	16:15
	when they create a new breakpoint
	having the actual source in the compunit is one step towards making this better, but the problem of identifying what the user is referring to is almost orthogonal to that	16:16
	maybe we just want to change the API to allow either a filename as string, or a filename as an int that refers to a table. same for when the debug server sends over stack traces and other bits that have a filename in them	16:17
	plus, there's maybe even a use case for differentiating a breakpoint spot by more dimensions. like what if we have an additional number for every breakpoint that refers to the same filename and line number that we increment every time we generate code that has that in it, then you could break in just one spesh candidate for example, or only when the frame is non-speshed, or only when it's jitted,	16:20
	etc etc
	per-spesh-candidate breakpoints make it simple to "break here but only if this function has been called with three object arguments"	16:21
patrickb	I guess from a usability point of view we'd want to be able to see a list of all open files (with identifiers that allow distinguishing them) and a way to add breakpoints to files that haven't been loaded yet. When defining such breakpoints one should ideally be able to be fuzzy as being forced to get the identifier perfectly right is annoying.
timo	100% agree on the fuzzy thing. integrating fzf would probably not be a bad choice	16:22
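A sketch of the "filename as string or as an int into a table" idea, with a loose lookup on the client side. The class and its methods are hypothetical and not part of the debug protocol; plain substring matching stands in for a real fuzzy matcher such as fzf, and the paths are made-up examples.

```python
class FileTable:
    """Hypothetical table mapping numeric ids to the filenames seen so far."""

    def __init__(self):
        self._names = []  # the list index doubles as the file id

    def add(self, name: str) -> int:
        """Register a filename and return its id."""
        self._names.append(name)
        return len(self._names) - 1

    def name(self, file_id: int) -> str:
        return self._names[file_id]

    def resolve(self, ref):
        """Return all ids matching an int id or a (partial) filename.

        An empty list means no match; more than one entry means the
        reference is ambiguous and the user should be asked to pick.
        """
        if isinstance(ref, int):
            return [ref] if 0 <= ref < len(self._names) else []
        exact = [i for i, n in enumerate(self._names) if n == ref]
        if exact:
            return exact
        return [i for i, n in enumerate(self._names) if ref in n]


table = FileTable()
table.add("sources/AB/CDEFABCDEF")
table.add("sources/CD/0123456789")
table.add("lib/Foo.rakumod")

print(table.resolve(1))           # by id
print(table.resolve("Foo"))       # unique partial match
print(table.resolve("sources/"))  # ambiguous: two candidates
```

Sending the id over the wire would let a client name one of several loaded files that share a module name, while the string form stays available for breakpoints on files that have not been loaded yet.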
patrickb	You already added something that notifies whenever a file is loaded, right?
timo	well, i call it "loaded" but really it's when the instrumentation barrier is first hit on any frame that mentions the filename. which i guess already happens with a compunit's mainline frame being run, and perhaps even in the dependencies+deserialize frame?	16:23
patrickb	Then that logic wouldn't need to live in the server but could reside in the client! (Given the server gives the client a chance to register new breakpoints before proceeding with executing code of the newly loaded file.)
timo	yes, there is an option to suspend the thread after it finishes the instrumentation barrier, though making all threads suspend at that point is a little bit more annoying to write	16:24
	so actually, after the first thread loads the first frame from the file and goes to sleep, another thread could run past and run any other frame from that file, since it's now no longer "never seen before"	16:25
patrickb	Hm. That's a problem we'll need to solve...	16:26
timo	i can think of a way to prevent other threads from racing past to a different frame after the first one has been seen. it'd be semantically different from "stop all threads at that moment" since threads that aren't touching that file at all would continue running	16:28
	nah, the most sensible thing is to implement the "suspend all threads" thing at that exact point	16:29
	and there should be a similar thing to "file loaded" (name change pending, maybe) that refers to compunits rather than seeing annotations in frames. and probably also a request to immediately read all annotations and get all filenames and line numbers as soon as a compunit is loaded? or maybe that should always happen by default if the debug server is running	16:31
	ok, AFK for real now
patrickb	o/ again :-)	16:33
Geth	MoarVM/more_helpful_oops_messages: 4 commits pushed by (Timo Paulssen)++ - turn cmp_write_str with const string into simpler code - debugserver: output correct info for in situ strings - output unknown command type number when NYI/unknown - sprinkle hopefully helpful hints in error messages	22:30
23:21 sena_kun left