Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021. |
|||
04:09
lizmat_ joined
04:12
kjp left
04:13
lizmat left,
kjp joined
04:14
kjp left,
kjp joined
05:25
Techcable left,
harrow left,
Nicholas left,
gfldex left,
ilogger2_ left,
tbrowder left,
committable6 left,
quotable6 left
05:27
committable6 joined,
quotable6 joined,
harrow joined,
Nicholas joined,
gfldex joined,
ilogger2_ joined,
tbrowder joined
05:28
Techcable joined
05:32
Techcable left
05:33
Techcable joined
07:39
lizmat_ left,
lizmat joined
|
|||
patrickb | I have a malloced C string that I'd like to throw in an adhoc exception. But I see no way to do so without leaking the memory in that string. Is there a way? | 08:55 | |
lizmat | timo or nine might know a way | 08:58 | |
timo | yeah there is | ||
patrickb | Oh I think I see. There is a waste arg. | 08:59 | |
timo | MVM_exception_throw_adhoc_free i think is what the function is called, you pass it an array of pointers to c strings that ends with a NULL to signal the end, and it takes care of freeing the values for you after making the string from the format | ||
patrickb | Whoop! | ||
timo | btw i've created a new command for the debug server | ||
patrickb | Do tell? | 09:01 | |
! | |||
timo | it 1) lists all filenames it knows about (for breakpoints) and 2) sends you notifications whenever new files are introduced to the files table, and it allows you to optionally ask the thread to stop after encountering any new file | 09:02 | |
plus, for convenience, send a stack trace immediately if you want | |||
patrickb | Oh! That's kind of related to what I want to do. I.e. make moar send the source of such files. | 09:04 | |
timo | right | ||
lizmat | thread name PR merged | 09:05 | |
timo | for the next step i want to also make a big ol' list of line numbers that have a breakpoint instruction corresponding to it somewhere | ||
make that available via the protocol | |||
lizmat | ok, now that we're all here: I'd like to make a module today that would be callable with -Mfoo from the command line | 09:06 | |
and which would force a full stack trace when Ctrl-C is pressed | 09:07 | ||
timo | since the breakpoints are based on annotations being read and turned into breakpoint instructions when the instrumentation barrier is first hit by a frame, that makes it kind of "lazy" when lines from the file show up | ||
lizmat | basically, it would invoke itself with the debug-server enabled, and add a signal handler | ||
patrickb | timo: What's the use of such a list? Is it just so that the client doesn't have to keep track of active breakpoints? | 09:08 | |
timo | plus i would really like there to be more specificity to breakpoint locations, since one line can have instructions from multiple frames in it (think of a line with multiple map instructions with blocks and what-not) | ||
it's needed so that the user can get feedback on whether a line they put a breakpoint on will even ever trigger or not | 09:09 | ||
i may also provide a way to force all frames from a given compunit, or all frames everywhere, to be instrumentation-barriered immediately, or perhaps a way to filter all frames by what file names are in their annotations | 09:10 | ||
i'm a bit annoyed by the commands in the moar remote app. they are kind of wordy. it'd really be great to have something more TUI like that also has some keybinds maybe. though it's not required to make it TUI-like to support some keys on the input side, of course | 09:14 | ||
patrickb: do you already have something in mind for how to register what source belongs to a given compunit? i'm thinking the only really reliable way to work with this is to have rakudo and nqp call a moar syscall (for example) that takes the compunit and either the source text as a string, or maybe a path to a local file, plus some extra information. for example, we do a lot of small compilations | 09:19 | ||
for BEGIN time things (rakuast improves upon this by not always having to compile to run code) and there it'd be interesting to have a start and end position of the relevant code as well | |||
and it'd be good to be able to grab the bytecode of a frame over the connection as well, but for that we would have to come up with a format :D | 09:25 | ||
09:49
sena_kun joined
|
|||
patrickb | I have still very little idea of all the moving parts in this. But my naive idea was to have rakudo annotate the source to the comp units. (And provide an opt out.) | 10:12 | |
So we'd have the source embedded in the bytecode. I'd ideally zstd compress it (and put zstd into moar by default). | 10:13 | ||
patrickb is back to $work | 10:15 | ||
timo: I'm pretty sure you have a lot more context in that area. Before I start working on this we should definitely first consult and make a plan together. | 10:36 | ||
13:27
sena_kun left
13:33
sena_kun joined
13:34
sena_kun left,
sena_kun joined
|
|||
timo | you know, we don't have to change anything about the moarvm file format, we can just have a buffer with the zstd compressed source serialized into the serialized blob, and some way to find which object is the one we want when looking for the sources | 14:17 | |
lizmat | could even be a HLL level thing in RakuAST | 14:41 | |
for each frame, a substr into the total source file ? | |||
timo | it'd be nice if the source string were not deserialized and decompressed until it is actually accessed, that should definitely be possible in terms of SMOP in the rakuast nodes | 14:42 | |
otherwise we're going to get a bad memory usage increase for the core setting for example | 14:43 | ||
lizmat | right | ||
timo | but also i'm not entirely sure if putting source text, even compressed, into the precomp files is a great idea | 14:44 | |
we already install sources alongside distributions, right? with a hash of the contents? | |||
lizmat | I mean, the core C setting source gzipped is 500K or so | 14:45 | |
the core c setting bytecode file is about 29M | 14:46 | ||
feels like an additional 500K would only make it about 2% larger ? | |||
timo | even with zstd -19 it's 431 KiB big | 14:47 | |
lizmat | wonder whether the .Uni of the source would even compress better | 14:50 | |
timo | i can give that a try | ||
aww, you can only spurt a buf8 or buf16 | 14:54 | ||
we don't have anything fast built-in to turn a buf32 into a buf16 or buf8 with big or little endian byte ordering? | 14:55 | ||
patrickb | The reason why I'd like to have it in the byte code is availability and | 14:57 | |
extensibility. | 14:58 | ||
timo | and the issue of source on disk different from source in storage probably also? | ||
patrickb | If you put it into the bytecode, then moar will by design not run into issues of the likes of not finding the sources. | ||
timo | well, maybe not for installed dists, but for .precomp files when you use -I . | ||
patrickb | Also it's possible to provide sources for things that don't exist on the disk. E.g. EVALed stuff. | 14:59 | |
timo | indeed, but for EVALed stuff we don't have to store it in the compunit since the debugserver is also in the process | ||
no need to compress then either, and can leave it out if the debugserver is not turned on | 15:00 | ||
patrickb | That would shift a lot of the source handling logic into moar, right? | ||
timo | can you elaborate a bit? i'm getting a headache (unrelated to this discussion) and thinking is a bit hard right now | 15:01 | |
patrickb | If it'd be an annotation on the CompUnit, then moar would not have to deal with how the source is generated at all. | ||
The Raku compiler would add the annotation during compilation. Moar would simply reach for that annotation and return it if found. | 15:02 | ||
All the fancy stuff of how to find the source (on the disk, some EVAL string, some deparsed RakuAST) would then live in the compiler, not in moar. | 15:03 | ||
timo | ok, so the behaviour of read-uint8 on a buf32 is that the offset you pass in selects the array slot, then you get the lowest 8 bits of the value in it as your result | 15:04 | |
lizmat | there's also an Endian argument | 15:05 | |
hmmm | 15:06 | ||
timo | right, i imagine that will let you get either the highest or the lowest byte, but not the two in the middle :D | ||
lizmat | actually, for read-uint8 the endian setting has no meaning | ||
for buf32 you'd need read-uint32 ? | |||
timo | yeah but i want to get the individual bytes so i can write them out :D | 15:07 | |
well, we can write_fhb a int16 buf so i can get the lower and upper two-bytes with read-uint16 and stash them in my result buf16 | |||
actually, if i get the BigEndian first, then the LittleEndian, for a 16bit read, i get the incredibleness of middle-endian in the result | 15:09 | ||
oh no | |||
actually what i get with LittleEndian and BigEndian is the lowest 16 bits both times, but once "backwards" and once "forwards" | 15:10 | ||
the upper 16 bits of the 32bit buf entry remains inaccessible to /.read-u?int[8|16]/ | 15:11 | ||
could be easiest to just create a CArray to hold all the data and just nativecast it | |||
lizmat | well, the semantics of the .readxxx methods are based on blob8 | 15:12 | |
timo | that's fair | ||
it's tricky to make it make sense intuitively + consistently | 15:13 | ||
lizmat | yup | 15:15 | |
afk& | |||
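For reference, the conversion discussed above (getting at every byte of each 32-bit entry in a chosen byte order, not only the low bits) can be sketched outside Raku. This is a minimal Python illustration under stated assumptions, not Raku or MoarVM code; `split_u32` is a made-up helper name.

```python
import struct

def split_u32(values, byteorder="little"):
    """Serialize 32-bit unsigned integers into raw bytes, 4 bytes per entry."""
    prefix = "<" if byteorder == "little" else ">"
    return struct.pack(f"{prefix}{len(values)}I", *values)

entries = [0x11223344, 0x0001F600]
little = split_u32(entries, "little")
big = split_u32(entries, "big")

# Masking only reaches the low half of an entry; the upper half needs a shift.
low16 = [v & 0xFFFF for v in entries]
high16 = [(v >> 16) & 0xFFFF for v in entries]
```

The shift in `high16` is the step that a low-bits-only read cannot express, which is why a cast over the whole buffer looked like the easier route in the discussion.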
timo | 508 KiB core_c_setting_uni_nfc_to_32bit_binary.bin.zst; 432 K gen/moar/CORE.c.setting.zst | 15:20 | |
there are 3053 codepoints outside of 0x00..0xff in the core setting, compared to 3262397 inside 0x00..0xff | 15:40 | ||
looks like the variable size of utf8 is working very well here, and zstd -19 doesn't gain anything from using a fixed width encoding | |||
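The comparison above (codepoints inside and outside 0x00..0xff, and variable-width versus fixed-width encoding before compression) can be reproduced in miniature. The sketch below is only an illustration: it uses Python's `zlib` as a stand-in for zstd, the sample text is invented, and `encoding_report` is a hypothetical helper, so the numbers it yields say nothing about the real core setting.

```python
import zlib

def encoding_report(text):
    """Count codepoints in and outside 0x00..0xff and compare encoded sizes."""
    narrow = sum(1 for ch in text if ord(ch) <= 0xFF)
    wide = len(text) - narrow
    utf8 = text.encode("utf-8")
    utf32 = text.encode("utf-32-le")  # fixed width: 4 bytes per codepoint, no BOM
    return {
        "narrow": narrow,
        "wide": wide,
        "utf8_bytes": len(utf8),
        "utf32_bytes": len(utf32),
        # zlib is a stand-in here; the discussion used zstd -19
        "utf8_compressed": len(zlib.compress(utf8, 9)),
        "utf32_compressed": len(zlib.compress(utf32, 9)),
    }

# Invented sample: mostly ASCII with one codepoint above 0xff per line
sample = "my $x = 42; say $x; # \u00e9 \u2192 done\n" * 200
report = encoding_report(sample)
```

Comparing `report["utf8_compressed"]` with `report["utf32_compressed"]` on real input is the experiment described in the log; the sketch deliberately makes no claim about which one wins.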
15:47
sena_kun left,
sena_kun joined
|
|||
timo | zstd --ultra --long -22 is not noticeably better either | 15:53 | |
like 7KiB better for 32bit integers and 1KiB better for utf8 | 15:54 | ||
patrickb: do you have a good idea for how we would handle source files that use #?line directives? | 15:55 | ||
patrickb | Unsure, but isn't that a non issue? | 15:58 | |
timo | not exactly sure | ||
i'm going to go lie down for my head :| | 15:59 | ||
patrickb | The #?line thing is there to bridge the gap between what's in the source that's compiled and the original source files, right? | ||
Recover well. | |||
timo | right, we mainly use it for anything we gen-cat while building rakudo, but i think users can also use it if they like | 16:00 | |
patrickb | If so, then the file that's actually compiled is the one that we would end up putting in the byte code. So there is no line number mismatch anymore. | 16:01 | |
We would simply ignore those #?line statements. | |||
timo | the annotations we put into the moarvm file reference lines "after" remapping to individual files | 16:02 | |
patrickb | Oh. | 16:03 | |
So a single comp unit can reference lines in multiple files? | 16:04 | ||
timo | that's right. but we don't want to just get the contents of these files as they are, since gen-cat also handles #?if moar and #?if jvm and whatnot | ||
they make the line numbers line up by emitting empty lines where something didn't match though | 16:05 | ||
patrickb | Would it be feasible to have "raw" line number annotations in addition to the "mapped" ones we already have? | ||
timo | i wouldn't like them to be on every annotation, but it'd be fine to have a mapping from filename to line (maybe character) offset into the source so you can fix them up very cheaply | 16:06 | |
although | 16:07 | ||
don't try what happens when you have something like #?line foo.txt 1 bla bla #?line bar.txt 1 bla bla #?line foo.txt 10 lol tricked you #?line bar.txt 99 what | |||
i.e. the same file name multiple times in the file | |||
or even the same filename a few times in a row but with overlapping starting line numbers? | 16:08 | ||
ok the syntax is different | |||
i thought the #line comment had to be a little bit more special than that. i wonder if anybody put a comment like that by accident in their own code? maybe it's only active during rakudo build somehow?? | 16:09 | ||
patrickb | Would it make more sense to have annotations for raw line numbers only and a mapping that takes the #?line magic into account? Feels a bit backwards to do these transformations and then add logic to revert it again. | ||
timo | the logic to revert it would only be for the debug server / client, whereas the correctly mapped line numbers are used in stack traces and everything | 16:12 | |
patrickb | right | ||
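The idea of a cheap fix-up table from mapped (file, line) positions back to raw lines in the concatenated source, including the awkward case where one file name appears several times, can be sketched as follows. Everything here is an assumption for illustration: the directive shape `#line N filename` may not match what Rakudo actually accepts, and `build_segments` and `mapped_to_raw` are hypothetical helpers, not part of Rakudo or MoarVM.

```python
import re

DIRECTIVE = re.compile(r"^#line\s+(\d+)\s+(\S+)\s*$")  # assumed directive shape

def build_segments(source):
    """Return (raw_start, filename, mapped_start) for each directive.

    raw_start is the 1-based raw line number of the line after the directive.
    """
    segments = []
    for raw_no, line in enumerate(source.splitlines(), start=1):
        m = DIRECTIVE.match(line)
        if m:
            segments.append((raw_no + 1, m.group(2), int(m.group(1))))
    return segments

def mapped_to_raw(segments, total_lines, filename, mapped_line):
    """Return every raw line that could correspond to (filename, mapped_line)."""
    candidates = []
    for i, (raw_start, name, mapped_start) in enumerate(segments):
        # A segment ends just before the next directive line, or at end of file.
        raw_end = segments[i + 1][0] - 2 if i + 1 < len(segments) else total_lines
        raw = raw_start + (mapped_line - mapped_start)
        if name == filename and raw_start <= raw <= raw_end:
            candidates.append(raw)
    return candidates

demo = "#line 1 foo.txt\na\nb\n#line 1 bar.txt\nc\n#line 10 foo.txt\nd\n#line 2 foo.txt\ne\n"
segments = build_segments(demo)
total = len(demo.splitlines())
```

With this invented input, asking for line 2 of `foo.txt` yields two raw candidates, which is the overlapping-file-name ambiguity raised above: a per-file offset alone cannot pick between them.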
timo | OK i'll go rest for real now | ||
patrickb | o/ | ||
timo | feel free to investigate, i don't know very much about the machinery behind line numbers and annotations and such | 16:13 | |
it randomly occurs to me that there's more "interesting" stuff to account for in terms of filename / line number stuff in the context of the debug server. for example, we allow multiple versions of the same module to be loaded simultaneously. if we don't give sources/AB/CDEFABCDEF as the filename, the user of the debug protocol can't really express which of the files with the same name they mean | 16:15 | ||
when they create a new breakpoint | |||
having the actual source in the compunit is one step towards making this better, but the problem of identifying what the user is referring to is almost orthogonal to that | 16:16 | |
maybe we just want to change the API to allow either a filename as string, or a filename as an int that refers to a table. same for when the debug server sends over stack traces and other bits that have a filename in them | 16:17 | ||
plus, there's maybe even a use case for differentiating a breakpoint spot by more dimensions. like what if we have an additional number for every breakpoint that refers to the same filename and line number that we increment every time we generate code that has that in it, then you could break in just one spesh candidate for example, or only when the frame is non-speshed, or only when it's jitted, | 16:20 | ||
etc etc | |||
per-spesh-candidate breakpoints make it simple to "break here but only if this function has been called with three object arguments" | 16:21 | ||
patrickb | I guess from a usability point of view we'd want to be able to see a list of all open files (with identifiers that allow distinguishing them) and a way to add breakpoints to files that haven't been loaded yet. When defining such breakpoints one should ideally be able to be fuzzy as being forced to get the identifier perfectly right is annoying. | ||
timo | 100% agree on the fuzzy thing. integrating fzf would probably not be a bad choice | 16:22 | |
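The two ideas above, referring to a file either by name or by an integer index into a table, and matching names fuzzily, can be sketched together. This is a toy model only: `FileTable` and its methods are hypothetical and not part of the debug protocol, and `difflib` stands in for a real fuzzy finder such as fzf.

```python
import difflib

class FileTable:
    """Hypothetical table mapping integer ids to file names the VM has seen."""

    def __init__(self):
        self._names = []

    def register(self, name):
        """Add a file entry and return its id; identical names get distinct ids."""
        self._names.append(name)
        return len(self._names) - 1

    def resolve(self, ref):
        """Accept an int id or a string name; return the list of matching ids."""
        if isinstance(ref, int):
            return [ref] if 0 <= ref < len(self._names) else []
        return [i for i, n in enumerate(self._names) if n == ref]

    def fuzzy(self, query, limit=3):
        """Return ids whose names are close to the query, best match first."""
        unique = list(dict.fromkeys(self._names))
        close = difflib.get_close_matches(query, unique, n=limit, cutoff=0.4)
        return [i for name in close for i, n in enumerate(self._names) if n == name]

table = FileTable()
foo_v1 = table.register("lib/Foo.rakumod")  # one version of a module
bar = table.register("lib/Bar.rakumod")
foo_v2 = table.register("lib/Foo.rakumod")  # second version, same file name
```

Here `table.resolve("lib/Foo.rakumod")` returns two ids, so a name alone is ambiguous when two versions of a module are loaded, while an integer id always selects exactly one entry.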
patrickb | You already added something that notifies whenever a file is loaded, right? | ||
timo | well, i call it "loaded" but really it's when the instrumentation barrier is first hit on any frame that mentions the filename. which i guess already happens with a compunit's mainline frame being run, and perhaps even in the dependencies+deserialize frame? | 16:23 | |
patrickb | Then that logic wouldn't need to live in the server but could reside in the client! (Given the server gives the client a chance to register new breakpoints before proceeding with executing code of the newly loaded file.) | ||
timo | yes, there is an option to suspend the thread after it finishes the instrumentation barrier, though making all threads suspend at that point is a little bit more annoying to write | 16:24 | |
so actually, after the first thread loads the first frame from the file and goes to sleep, another thread could run past and run any other frame from that file, since it's now no longer "never seen before" | 16:25 | ||
patrickb | Hm. That's a problem we'll need to solve... | 16:26 | |
timo | i can think of a way to prevent other threads from racing past to a different frame after the first one has been seen. it'd be semantically different from "stop all threads at that moment" since threads that aren't touching that file at all would continue running | 16:28 | |
nah, the most sensible thing is to implement the "suspend all threads" thing at that exact point | 16:29 | ||
and there should be a similar thing to "file loaded" (name change pending, maybe) that refers to compunits rather than seeing annotations in frames. and probably also a request to immediately read all annotations and get all filenames and line numbers as soon as a compunit is loaded? or maybe that should always happen by default if the debug server is running | 16:31 | ||
ok, AFK for real now | |||
patrickb | o/ again :-) | 16:33 | |
Geth | MoarVM/more_helpful_oops_messages: 4 commits pushed by (Timo Paulssen)++ | 22:30 | |
23:21
sena_kun left
|