|
Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021. |
|||
|
00:03
reportable6 left
00:05
reportable6 joined
06:02
reportable6 left
06:05
reportable6 joined
|
|||
| Nicholas | good *, #moarvm | 07:00 | |
| Nicholas follows the URL from last night. | |||
| Today I learned that "session mead" was a thing. | |||
| nine | With my deserialization fixes, t/05-autocomplete.t ran successfully in a loop for 10 hours or so. Then it crashed with a "corrupted double-linked list" message from malloc when resizing an array during deserialization. | 07:32 | |
| Nicholas | Gah! And of course you have no idea whether you tickled a different bug. | 07:33 | |
| nine | It's random memory corruption. Could be the issue I was chasing, but also something else. No way to determine that from just a core dump. | 07:35 | |
| Nicholas | you need (at least) ASAN, or else to get lucky with rr. Aaargh, Aaargh | ||
| nine | The 10 hours are a good sign anyway. After all, I do have a reasonable explanation for why my changes fix bugs. They just don't fix all of them :) | 07:39 | |
| At least the weather is suitable today for running stuff in loops during the day :) | 07:43 | ||
| So I ran it with asan, but of course it complains about leaks. Add --full-cleanup and bang! "Collectable 0x623000000130 in a gen2 freelist accessed" every single time. | 07:50 | ||
| Nicholas | oh, that | 07:51 | |
| yes, I've seen that. | |||
| nine | But use of gentle force allowed me to reproduce it in gdb, and it's just the missing worker thread cleanup in --full-cleanup that's causing this. Maybe I should continue my work on that some day | ||
| jnthnwrthngtn | moarning o/ | 09:07 | |
| Nicholas | \o | ||
| jnthnwrthngtn | Time to find out what happens if I switch the QAST compiler to compile call and callmethod to use new-disp.. | 09:17 | |
| Nicholas | make sure you have good backups! | ||
| candles, matches and a flashlight | 09:18 | ||
| jnthnwrthngtn | .oO( generator for the beer fridge ) | 10:08 | |
| I realized that I don't have to switch both method calls and normal calls at the same time, so currently going for method calls. | 10:09 | ||
| Nicholas | jnthnwrthngtn: or simply, "a deep enough wine cellar such that it maintains a good temperature with zero carbon footprint" | 10:11 | |
| jnthnwrthngtn | Argh. So we get some of the way then...run directly into the bootstrap | 10:20 | |
| Nicholas | does that make it lunchtime? | 10:21 | |
| jnthnwrthngtn | 'cus of course there isn't one NQPRoutine to consider, there's two | ||
| Nicholas | and this won't get easier unless you actually do both at once? | 10:23 | |
|
10:30
AlexDaniel joined
|
|||
| jnthnwrthngtn | Well, there's an easy hack while I ponder if there's a better solution :) | 10:34 | |
| Then there was an ordering issue. And now I'm running into not having ported multiple dispatch yet :) | 10:39 | ||
| lizmat hopes jnthnwrthngtn will be able to write a cool blog post as a cooling down exercise :-) | 10:47 | ||
| jnthnwrthngtn | Hm, this isn't quite so painful as feared. With QAST -> MAST compilation compiling callmethod nodes into the dispatch op, NQP builds and fails only 13 test files | 11:03 | |
| Well, I don't know how painful fixing those will be :) | |||
| Geth isn't reporting still, but I did a temporary thing that lets me get the single dispatch ported and working and then do the multiple dispatch after. | 11:04 | ||
| lizmat | I'm afraid recent RSC events caused Geth to become unresponsive :-( | 11:09 | |
| jnthnwrthngtn | Ah, OK | ||
| .oO( Raku Secret Commits ) | 11:10 | ||
| lizmat | I'm in the middle of preparing my presentation at esLibre tomorrow, so have no time to look at this now | ||
| jnthnwrthngtn | It's fine, I'll survive :) | ||
| I'm meant to work on something else this afternoon anyway. | 11:11 | ||
| Nicholas | that's OK too - afternoons don't actually exist | ||
| nine | Because....noon doesn't exist, right? | 11:12 | |
| Nicholas | I'm not sure quite why. Lunch still exists. | ||
| but maybe it's just one sort of "all day breakfast" | |||
| nine | One can have lunch in the morning | ||
| jnthnwrthngtn | oh, right, lunch | ||
| I didn't push the commit that switches over callmethod; will try and debug some of the test issues first | 11:14 | ||
| Maybe more of that later today. Dunno. We'll see how quickly I tire of this afternoon's Go + strangely documented C API advernture. :) | |||
| *adventure, even | |||
| Nicholas | Given that choice, it sounds very tempting to "out to lunch" and stay there. Pesky COVID, spoiling these plans... | 11:15 | |
|
12:02
reportable6 left
12:03
reportable6 joined
14:17
dogbert17 left
|
|||
| nine | I don't know what it is about this test but it seems to inevitably end in some crash when run often enough. Now up: segfault in spesh stats add_type_at_offset | 14:33 | |
| Caught in rr this time :) | 14:34 | ||
| Certainly doesn't look right. The computer may be excused for getting confused by this: $4 = {bytecode_offset = 542, num_types = 1426, types = 0x4, invokes = 0xffff00020000021e, num_invokes = 542, num_type_tuples = 1426, type_tuples = 0x10, plugin_guards = 0xffff000000000592, num_plugin_guards = 1904} | 14:35 | ||
| It looks so weird because it's actually a static frame's handler: {start_offset = 542, end_offset = 1426, category_mask = 4, action = 0, block_reg = 2, goto_offset = 0, label_reg = 2, inlinee = 0} | 14:39 | ||
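To illustrate nine's observation, here is a minimal sketch of how the bytes of a frame handler read as nonsense when viewed through a by-offset stats record. The `DemoHandler` and `DemoByOffset` structs and the `view_handler_as_by_offset` helper are hypothetical, simplified stand-ins that model only the leading fields visible in the gdb output; they are not MoarVM's actual definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a frame handler: only the first
 * four 32-bit fields from the gdb output are modelled. */
typedef struct {
    uint32_t start_offset;
    uint32_t end_offset;
    uint32_t category_mask;
    uint32_t action;
} DemoHandler;

/* Hypothetical, simplified stand-in for a by-offset stats record. */
typedef struct {
    uint32_t bytecode_offset;
    uint32_t num_types;
    uint64_t types_bits;   /* stands in for the 64-bit `types` pointer */
} DemoByOffset;

/* Copy the raw bytes of a handler into a by-offset record, which is what
 * effectively happens when both end up describing the same memory. */
static DemoByOffset view_handler_as_by_offset(DemoHandler h) {
    DemoByOffset view;
    assert(sizeof view == sizeof h);   /* both are 16 bytes in this sketch */
    memcpy(&view, &h, sizeof view);
    return view;
}
```

Feeding in `{542, 1426, 4, 0}` yields `bytecode_offset = 542` and `num_types = 1426`, and on a little-endian machine `types_bits` comes out as 4, which lines up with the `types = 0x4` seen in the dump.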
|
15:05
frost left
|
|||
| jnthnwrthngtn | Well, that's pretty broken. | 15:19 | |
| nine | Considering that MVMSpeshStatsByOffset are malloced, this means that it must have been freed prematurely, which would happen when the MVMStaticFrameSpesh holding it gets collected. Trying to verify that now | 15:26 | |
| Alternatively it could also be that handler that got freed prematurely and its memory re-used for the by_offset. | 15:28 | ||
| If my break point on MVM_free with the MVMSpeshStatsByOffset's address doesn't hit, that's probably the case | 15:29 | ||
| jnthnwrthngtn | Turns out a lot of the failing NQP tests were due to testing things that are changing or relying on things that are changing incidentally; after fixing those up, there's only 4 test files pointing out almost certain problems. | 16:57 | |
| And 2 are the same exception | 16:58 | ||
| lizmat | wow, that's good progress! | 16:59 | |
| jnthnwrthngtn | One MoarVM fix later, down to 3. One of them is because I didn't reinstate the method not found error reporting callback in the new dispatcher yet. | 17:31 | |
| And the other two are a compiler blow-up in termish, which is weird in so far as we must go through that tens of thousands of times without trouble elsewhere. | 17:32 | ||
| Think that's for after dinner. | 17:34 | ||
| nine | Tens of thousands of times? Are you running with spesh enabled? | 17:38 | |
| So...the break point on MVM_free did not fire | 17:39 | ||
| lizmat | so what is the "<" twigil ? | 17:42 | |
| docs say: "<Index into match object (not really a variable)" | |||
| but there are no further docs | 17:43 | ||
| (yes, working on a presentation on sigils and twigils) | 17:44 | ||
| nine | But....the point where we overwrite the MVMSpeshStatsByOffset is immediately after allocating the handler array. So it can't be a freed handler | 17:49 | |
| jnthnwrthngtn | nine: No, I just mean that given we compile the entirety of NQP and all but these 2 tests, we must go through the termish rule a lot of times without trouble, so why does it break this time? | 17:51 | |
| (Not actually asking you. :)) | |||
| Or rather, in exactly 2 cases | |||
| lizmat: It's not really a twigil, but I think it's about $<foo> | 17:52 | ||
| lizmat | aha... ok | ||
| jnthnwrthngtn | food, bbl | 17:53 | |
| nine | Wait a minute...this already doesn't look exactly sane: (rr) p *tss | 17:58 | |
| $20 = {arg_types = 0x0, hits = 37, osr_hits = 0, by_offset = 0x47e8d70, num_by_offset = 67108865, max_depth = 1} | |||
| Unless 67 million entries in that array are to be expected. Which I doubt. If nothing else, from the fact alone that we grow this array one entry at a time, which we usually don't do if we expect it to grow a lot | 17:59 | ||
| Nicholas | I'm not sure how to be a more helpful rubber duck | 18:00 | |
| sure, quack! | |||
| but I don't know what useful question to add | |||
| I guess I can ask if your beer fridge is well stocked. | 18:01 | ||
| Or are they mead fridges these days? :-) | 18:02 | ||
| oh yes. sherry. Let me test that... | |||
|
18:02
reportable6 left
|
|||
| nine | Ooooh...the plot thickens. That fishy tss thing, we get it via tss = simf->type_idx >= 0 ? &(simf->ss->by_callsite[simf->callsite_idx].by_type[simf->type_idx]) : NULL; | 18:05 | |
|
18:05
reportable6 joined
|
|||
| nine | Now simf->type_idx is 8. But! simf->ss->by_callsite[simf->callsite_idx].num_by_type is just 4! So it's an out of bounds array access. | 18:06 | |
| The bad news is that this means we have some logic error and I'm not terribly familiar with spesh's stack simulation code. | 18:07 | ||
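The out-of-bounds access nine describes can be sketched as follows. This is a defensive lookup under stated assumptions, not a fix for the underlying logic error and not MoarVM's code: `DemoByType`, `DemoByCallsite` and `lookup_by_type` are hypothetical names, and only `by_type`, `num_by_type` and `type_idx` come from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the per-type and per-callsite
 * stats records mentioned in the log. */
typedef struct {
    uint32_t hits;
} DemoByType;

typedef struct {
    DemoByType *by_type;
    uint32_t    num_by_type;
} DemoByCallsite;

/* Return the by-type entry for type_idx, or NULL when the index is
 * negative (as in the original expression) or past the end of the array
 * (the check the original expression does not make). */
static DemoByType *lookup_by_type(DemoByCallsite *cs, int32_t type_idx) {
    assert(cs != NULL);
    if (type_idx < 0)
        return NULL;
    if ((uint32_t)type_idx >= cs->num_by_type)
        return NULL;
    return &cs->by_type[type_idx];
}
```

With `num_by_type` at 4, an index of 8 now returns NULL instead of a pointer into unrelated memory, so the stale index would surface at the lookup rather than as a later crash.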
|
18:21
dogbert17 joined
|
|||
| dogbert17 | nine: seems like you fell into a rabbit hole :) | 18:22 | |
| nine | Noooo...that never happens! Would be a complete first | ||
| dogbert17 | so how many bugs have you fixed, one, two or more? | 18:23 | |
| MasterDuke | looks like it's been a busy day | 18:40 | |
| github.com/faster-cpython/ideas/is...dated-desc has some interesting stuff we might be able to steal/be inspired by | 19:35 | ||
|
22:24
m6502 joined
23:16
m6502 left
|
|||