Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021. |
|||
00:03
reportable6 left
00:05
reportable6 joined
06:02
reportable6 left
06:05
reportable6 joined
|
|||
Nicholas | good *, #moarvm | 07:00 | |
Nicholas follows the URL from last night. | |||
Today I learned that "session mead" was a thing. | |||
nine | With my deserialization fixes t/05-autocomplete.t ran successfully in a loop for 10 hours or so. Then it crashed with "corrupted double-linked list" message by malloc when resizing an array during deserialization. | 07:32 | |
Nicholas | Gah! And of course you have no idea whether you tickled a different bug. | 07:33 | |
nine | It's random memory corruption. Could be the issue I was chasing, but also something else. No way to determine that from just a core dump. | 07:35 | |
Nicholas | you need (at least) ASAN, if not get lucky with rr. Aaargh, Aaargh | ||
nine | The 10 hours are a good sign anyway. After all, I do have a reasonable explanation for why my changes fix bugs. They just don't fix all of them :) | 07:39 | |
At least the weather is suitable today for running stuff in loops during the day :) | 07:43 | ||
So I ran it with asan, but of course it complains about leaks. Add --full-cleanup and bang! "Collectable 0x623000000130 in a gen2 freelist accessed" every single time. | 07:50 | ||
Nicholas | oh, that | 07:51 | |
yes, I've seen that. | |||
nine | But use of gentle force allowed for me to reproduce it in gdb and it's just the missing worker thread cleanup in --full-cleanup that's causing this. Maybe I should continue my work on that some day | ||
jnthnwrthngtn | moarning o/ | 09:07 | |
Nicholas | \o | ||
jnthnwrthngtn | Time to find out what happens if I switch the QAST compiler to compile call and callmethod to use new-disp.. | 09:17 | |
Nicholas | make sure you have good backups! | ||
candles, matches and a flashlight | 09:18 | ||
jnthnwrthngtn | .oO( generator for the beer fridge ) |
10:08 | |
I realized that I don't have to swich both method calls and normal calls at the same time, so currently going for method calls. | 10:09 | ||
Nicholas | jnthnwrthngtn: or simply, "a deep enough wine cellar such that it maintains a good temperature with zero carbon footprint" | 10:11 | |
jnthnwrthngtn | Argh. So we get some of the way then...run directly into the bootstrap | 10:20 | |
Nicholas | does that make it lunchtime? | 10:21 | |
jnthnwrthngtn | 'cus of course there isn't one NQPRoutine to consider, there's two | ||
Nicholas | and this won't get easie runless you actually do both at once? | 10:23 | |
10:30
AlexDaniel joined
|
|||
jnthnwrthngtn | Well, there's an easy hack while I ponder if there's a better solution :) | 10:34 | |
Then there was an ordering issue. And now I'm running into not having ported multiple dispatch yet :) | 10:39 | ||
lizmat hopes jnthnwrthngtn will be able to write a cool blog post as a cooling down exercise :-) | 10:47 | ||
jnthnwrthngtn | Hm, this isn't quite so painful as feared. With QAST -> MAST compilation compiling callmethod nodes into the dispatch op, NQP builds and fails only 13 test files | 11:03 | |
Well, I don't know how painful fixing those will be :) | |||
Geth isn't reporting still, but I did a temporary thing that lets me get the single dispatch ported and working and then do the multiple dispatch after. | 11:04 | ||
lizmat | I'm afraid recent RSC events caused Geth to become unresponsive :-( | 11:09 | |
jnthnwrthngtn | Ah, OK | ||
.oO( Raku Secret Commits ) |
11:10 | ||
lizmat | I'm in the middle of preparing my presentation at esLibre tomorrow, so have no time to look at this now | ||
jnthnwrthngtn | It's fine, I'll survive :) | ||
I'm meant to work on something else this afternoon anyway. | 11:11 | ||
Nicholas | that's OK too - afternoons don't actually exist | ||
nine | Because....noon doesn't exist, right? | 11:12 | |
Nicholas | I'm not sure quite why. Lunch still exists. | ||
but maybe it's just one sort of "all day breakfast" | |||
nine | One can have lunch in the morning | ||
jnthnwrthngtn | oh, right, lunch | ||
I didn't push the commit that switches over callmethod; will try and debug some of the test issues first | 11:14 | ||
Maybe more of that later today. Dunno. We'll see how quickly I tire of this afternoons's Go + strangely documented C API advernture. :) | |||
*adventure, even | |||
Nicholas | Given that choice, it sounds very tempting to "out to lunch" and stay there. Pesky COVID, spoiling these plans... | 11:15 | |
12:02
reportable6 left
12:03
reportable6 joined
14:17
dogbert17 left
|
|||
nine | I don't know what it is about this test but it seems to inevitably end in some crash when run often enough. Now up: segfault in spesh stats add_type_at_offset | 14:33 | |
Cought in rr this time :) | 14:34 | ||
Certainly doesn't look right. The computer may be excused for getting confused by this: $4 = {bytecode_offset = 542, num_types = 1426, types = 0x4, invokes = 0xffff00020000021e, num_invokes = 542, num_type_tuples = 1426, type_tuples = 0x10, plugin_guards = 0xffff000000000592, num_plugin_guards = 1904} | 14:35 | ||
It looks so weird because it's actually a static frame's handler: {start_offset = 542, end_offset = 1426, category_mask = 4, action = 0, block_reg = 2, goto_offset = 0, label_reg = 2, inlinee = 0} | 14:39 | ||
15:05
frost left
|
|||
jnthnwrthngtn | Well, that's pretty broken. | 15:19 | |
nine | Considering that MVMSpeshStatsByOffset are malloced, this means that it must have been freed prematurely, which would happen when the MVMStaticFrameSpesh holding it gets collected. Trying to verify that now | 15:26 | |
Alternatively it could also be that handler that got freed prematurely and its memory re-used for the by_offset. | 15:28 | ||
If my break point on MVM_free with the MVMSpeshStatsByOffset's address doesn't hit, that's probably the case | 15:29 | ||
jnthnwrthngtn | Turns out a lot of the failing NQP tests were due to testing things that are changing or relying on things that are changing incidentally; after fixing those up, there's only 4 test files pointing out almost certain problems. | 16:57 | |
And 2 are the same exception | 16:58 | ||
lizmat | wow, that's good progress! | 16:59 | |
jnthnwrthngtn | One MoarVM fix later, down to 3. One of them is because I didn't reinstate the method not found error reporting callback in the new dispatcher yet. | 17:31 | |
And the other two are a compiler blow-up in termish, which is weird in so far as we must go through that tens of thousands of times without trouble elsewhere. | 17:32 | ||
Think that's for after dinner. | 17:34 | ||
nine | Tens of thousands of times? Are you running with spesh enabled? | 17:38 | |
So...the break point on MVM_free did not fire | 17:39 | ||
lizmat | so what is the "<" twigil ? | 17:42 | |
docs says: "<Index into match object (not really a variable)" | |||
but is no further docs | 17:43 | ||
(yes, working on a presentation on sigils and twigils) | 17:44 | ||
nine | But....the point where we overwrite the MVMSpeshStatsByOffset is immediately after allocating the handler array. So it can't be a freed handler | 17:49 | |
jnthnwrthngtn | nine: No, I just mean that given we compile the entire of NQP and all but these 2 tests, we must go thorugh the termish rule a lot of times without trouble, so why does it break this time? | 17:51 | |
(Not actually asking you. :)) | |||
Or rather, in exactly 2 cases | |||
lizmat: It's not really a twigil, but I think it's about $<foo> | 17:52 | ||
lizmat | aha... ok | ||
jnthnwrthngtn | food, bbl | 17:53 | |
nine | Wait a minute...this already doesn't look exactly sane: (rr) p *tss | 17:58 | |
$20 = {arg_types = 0x0, hits = 37, osr_hits = 0, by_offset = 0x47e8d70, num_by_offset = 67108865, max_depth = 1} | |||
Unless 67 million entries in that array are to be expected. Which I doubt. If nothing else, from the fact alone that we grow this array one entry at a time, which we ususally don't do if we expect it to grow a lot | 17:59 | ||
Nicholas | I'm not sure how to be a more helpful rubber duck | 18:00 | |
sure, quack! | |||
but I don't know what useful question to add | |||
I guess I can ask if your beer fridge is well stocked. | 18:01 | ||
Or are they mead fridges these days? :-) | 18:02 | ||
oh yes. sherry. Let me test that... | |||
18:02
reportable6 left
|
|||
nine | Ooooh...the plot thickens. That fishy tss thing, we get it via tss = simf->type_idx >= 0 ? &(simf->ss->by_callsite[simf->callsite_idx].by_type[simf->type_idx]) : NULL; | 18:05 | |
18:05
reportable6 joined
|
|||
nine | Now simf->type_idx is 8. But! simf->ss->by_callsite[simf->callsite_idx].num_by_type is just 4! So it's an out of bounds array access. | 18:06 | |
The bad news is that this means we have some logic error and I'm not terribly familiar with spesh's stack simulation code. | 18:07 | ||
18:21
dogbert17 joined
|
|||
dogbert17 | nine: seems like you fell into a rabbit hole :) | 18:22 | |
nine | Noooo...that never happens! Would be a complete first | ||
dogbert17 | so have many bugs have you fixed, one, two or more | 18:23 | |
MasterDuke | looks like it's been a busy day | 18:40 | |
github.com/faster-cpython/ideas/is...dated-desc has some interesting stuff we might be able to steal/be inspired by | 19:35 | ||
22:24
m6502 joined
23:16
m6502 left
|