Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021. |
00:00  timo left, TempIRCLogger joined, discord-raku-bot joined, timo joined
00:02  reportable6 left
00:03  reportable6 joined
01:03  quotable6 joined
02:17  rakuUser left
04:03  quotable6 left, notable6 left, nativecallable6 left, evalable6 left, statisfiable6 left, tellable6 left, squashable6 left, bloatable6 left, greppable6 left, unicodable6 left, linkable6 left, coverable6 left, committable6 left, benchable6 left, sourceable6 left, reportable6 left, bisectable6 left, releasable6 left, shareable6 left
04:04  evalable6 joined, benchable6 joined
04:05  committable6 joined, coverable6 joined, statisfiable6 joined, linkable6 joined
04:06  bloatable6 joined, unicodable6 joined, nativecallable6 joined, tellable6 joined, greppable6 joined, releasable6 joined
05:04  shareable6 joined
05:07  reportable6 joined
06:02  reportable6 left
06:03  reportable6 joined
06:04  notable6 joined
06:05  bisectable6 joined
06:06  squashable6 joined
06:09  squashable6 left
06:11  squashable6 joined
nine | So, the current implementation of the fallback method on Junctions relies on AUTOTHREAD being kinda stupid and threading on every Junction in the arguments. | 07:21 | |
07:31  frost joined
nine | The obvious solution is to just not use AUTOTHREAD there at all. We already know that the first argument is the Junction we want to thread over. So no need to look for it and no need to check the parameter's type. | 08:02 | |
This fixes the $junction.some-method infiniloop and improves performance of method calls on junctions. | |||
08:06  sourceable6 joined
nine | Alas... the real problem emerges: multis. The auto-generated protos claim Mu as argument types. Accordingly we don't actually autothread, but call the multi with a Junction, which then causes a bind failure, which triggers autothreading, and we end up in yet another infinite recursion | 08:07 | |
Btw. CALL-ME is actually even simpler than the fallback method: it just calls each of the eigenstates and collects their results into a new Junction. | 08:37 ||
By punting on the multi issue (i.e. not trying to be smarter with them and instead autothreading always) I can get through make test and spectest without failures. The first "All tests successful." that new-disp has seen on a spectest :) | 08:38 ||
Of course that's not a full solution. OTOH I'm pretty sure it's an improvement over what we have in master. | 08:39 | ||
I fear that for a full solution we actually need to replace AUTOTHREAD with a dispatcher. Or several dispatchers. Or, even worse, integrate it into the existing ones. I don't see how else we could get out of the multi infiniloop | 08:42 ||
09:55  evalable6 left, linkable6 left
09:56  linkable6 joined
Geth | MoarVM/new-disp: 478 commits pushed by (Jonathan Worthington)++, (Timo Paulssen)++, (Daniel Green)++, (Stefan Seifert)++, (Nicholas Clark)++ review: github.com/MoarVM/MoarVM/compare/a...735164299a |
09:58 | |
nine | PSA: I have rebased and force-pushed the new-disp branches, so we won't again chase bugs that were already fixed on master. Also, a merge is coming nearer anyway. I have taken the opportunity to remove the commit/revert pairs. | 10:00 |
timo | your first message today means that when we have a function that takes one Mu and one Any, and both get a Junction, we will still thread both of them? | 10:11 |
Nicholas | nine: did you remove the revert pairs? | 10:12 | |
oh, wait, you did | |||
timo | it sounds like he did, from that last message | ||
Nicholas | I can't read far enough | ||
nine++ | |||
fixed in post! | |||
nine | timo: yes (to the junction question) | 10:13 | |
timo | last time i looked into junctions / dispatchers i was wondering about that exact thing, but i was assuming our current implementation already solved that somehow | 10:15 | |
nine | Pushed my Junction fix to rakudo new-disp, fixing the last remaining spec test | 10:16 | |
timo | i think there must be cases that we literally can't do correctly. like when there's one multi candidate that takes Mu, Any and one that takes Any, Mu | ||
what can we even do in that case; disambiguate like in other cases where multiple multis match? | |||
nine | timo: well, the upside of this endeavour is that now I know enough to have a sensible discussion :) | 10:17 |
timo | boop the user on the snoot and tell them "dont do that"? | ||
nine | Both sound sensible to me | 10:18 | |
10:28  frost left
nine | m: for ^10 { Thread.start: -> { mkdir "/tmp/testdir"; CATCH { default { note $_ } } } } | 10:30 | |
camelia | Failed to create directory '/tmp/testdir' with mode '0o777': Failed to mkdir: File exists in block at <tmp> line 1
Failed to create directory '/tmp/testdir' with mode '0o777': Failed to mkdir: File exists in block at <tmp> line 1 |
nine | This ^^^ is actually a TOCTOU race condition in MoarVM but only on non-Windows | 10:31 | |
timo | i think there is another T in that acronym | 10:33 | |
or i guess it doesn't have to be | |||
nine | Well, true. One could argue that it's not critical though :) | 10:35 | |
timo | doesn't that want like a sleep or so in order not to quit too early? | 10:36 | |
nine | Well, it managed to demonstrate the bogus "Failed to mkdir: File exists", so I claim it's good enough :) | 10:37 | |
timo | we normally return success for mkdir when the folder already exists? | 10:41 | |
nine | We do so on Windows, and before commit 8e9afcf52e79ef58026fdf32bdc01078e6ca5afd (Tue Jun 15 11:41:17 2021 +0700) did so everywhere else | 10:42 |
10:42  linkable6 left
10:44  linkable6 joined
MasterDuke | huh, i don't recall ever seeing that happen | 10:45 |
btw, if anybody feels like some light reading, i'm not sure why gist.github.com/MasterDuke17/07eaa...f898dda43c doesn't fix the example of `use MONKEY-SEE-NO-EVAL; use YAMLish; my $buf = buf8.allocate(100_000); my $raku = $buf.raku; my $yaml = save-yaml $buf; for ^5 { say $_; try EVAL $raku; load-yaml $yaml }` | 10:47 | ||
Geth | MoarVM/new-disp: ffd33a736c | (Stefan Seifert)++ | src/io/dirops.c
Fix TOCTTOU race condition in mkdir on non-Windows
mkdir should report successfully creating a directory if it already exists. We do so on Windows and before commit 8e9afcf52e79ef58026fdf32bdc01078e6ca5afd did so reliably everywhere else. However that commit introduced a race condition. Since we first check if the directory already exists and only if not try to create it, this leaves open a window of opportunity for some other process or ... (10 more lines) |
10:51 | |
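For reference, a minimal POSIX-only sketch of the race-free pattern the commit describes (illustrative only; the actual code in src/io/dirops.c is more involved and has a Windows counterpart): attempt the mkdir unconditionally and only interpret the error afterwards, instead of stat()ing first and creating only when the path is missing.

    /* Sketch, not the actual patch: treat EEXIST-on-a-directory as success. */
    #include <errno.h>
    #include <sys/stat.h>
    #include <sys/types.h>

    static int mkdir_ok_if_exists(const char *path, mode_t mode) {
        if (mkdir(path, mode) == 0)
            return 0;                          /* we created it ourselves */
        if (errno == EEXIST) {
            struct stat st;
            /* someone else won the race; fine as long as it's a directory */
            if (stat(path, &st) == 0 && S_ISDIR(st.st_mode))
                return 0;
        }
        return -1;                             /* genuine failure */
    }

Checking for existence first and only then creating is exactly the check-then-act window the commit message refers to: another thread or process can create the directory between the two steps.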
10:55  evalable6 joined
timo | MasterDuke: you're working on this with master branches or with new-disp? | 10:57 | |
MasterDuke | master | 11:05 | |
11:06  quotable6 joined
timo | what number does it usually reach before it crashes and burns? | 11:11 | |
MasterDuke | never gets past 1 | 11:13 | |
timo | on new-disp it does | 11:24 | |
MasterDuke | huh. with or without my patch? | 11:27 | |
dogbert11 | does anyone else get a hang when running 'MVM_SPESH_NODELAY=1 ./rakudo-m -Ilib t/02-rakudo/99-misc.t' under new-disp? | 11:37 | |
timo | without your patch | 11:41 | |
yeah it does look like it hangs | |||
who wants to build the incredibly tricky to implement "dump backtraces for all threads in gdb" | 11:45 | ||
nine | What hangs is actually this: MVM_SPESH_NODELAY=1 gdb --args ./rakudo-m --profile -e 'my %functions = (1 => sub (@args) { [-] @args }, 4 => sub (@args) { [+] @args }); sub foo(@ast) { %functions{@ast[0]}(@ast.map: {$_ ~~ Array ?? foo $_ !! $_} ); }([4, [1, 1, 2]]) for ^250;' | 11:47 | |
Ah, it's pretty straightforward: MVM_profile_instrumented_start waits on tc->instance->cond_spesh_sync without marking the thread blocked, while the spesh thread is waiting for the other thread to join the GC run. | 11:49 ||
dogbert11 | deadlock? | 11:50 | |
nine | yes | ||
Geth | MoarVM/new-disp: 9429bdcb16 | (Stefan Seifert)++ | src/profiler/instrument.c
Fix a potential deadlock when starting to profile
MVM_profile_instrumented_start waited on tc->instance->cond_spesh_sync without marking the thread blocked. If the spesh thread was at that point waiting for the other thread to join the GC run, we would end up in a dead lock. Fix by marking the thread blocked while waiting for the mutex and cond var. |
11:55 | |
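The shape of that fix, sketched below with MoarVM's usual GC-blocking helpers (MVM_gc_mark_thread_blocked / MVM_gc_mark_thread_unblocked; the mutex, condition-variable and flag names here are stand-ins, apart from cond_spesh_sync which the discussion quotes, so this is not the literal diff): a thread that is about to sleep on a mutex or condition variable must first tell the GC it is blocked, so a collection started by another thread does not wait on it forever.

    /* Assumes #include "moar.h" for MVMThreadContext and the GC helpers. */
    static void wait_for_spesh_sync(MVMThreadContext *tc, uv_mutex_t *mutex,
                                    uv_cond_t *cond, volatile int *done) {
        MVM_gc_mark_thread_blocked(tc);     /* a GC run elsewhere won't wait on us */
        uv_mutex_lock(mutex);
        while (!*done)
            uv_cond_wait(cond, mutex);      /* releases the mutex while sleeping */
        uv_mutex_unlock(mutex);
        MVM_gc_mark_thread_unblocked(tc);   /* may join a GC run that started meanwhile */
    }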
12:02  reportable6 left
dogbert11 | nine++, shouldn't we have seen this problem before? | 12:02 | |
or have we just been 'lucky' | 12:03 | ||
Geth | MoarVM/new-disp: a234d1dab0 | (Stefan Seifert)++ | src/profiler/instrument.c
Fix a potential deadlock when ending profiling
MVM_profile_instrumented_end waited on tc->instance->cond_spesh_sync without marking the thread blocked. If the spesh thread was at that point waiting for the other thread to join the GC run, we would end up in a dead lock. Fix by marking the thread blocked while waiting for the mutex and cond var. |
12:04 | |
dogbert11 | oh, the fixes are coming fast :) | 12:05 | |
nine | dogbert11: I think we've just been lucky. After all the deadlock required a GC run just at the wrong time | ||
timo | any changes for like startup can make a noticeable difference to gc load | 12:06 | |
we can do a lot of startup with not a lot of GC by now, right? | |||
nine | no idea | 12:07 | |
dogbert11 | I can't but wonder if it would be possible to sweep the codebase for problems like this | 12:08 | |
timo | (gdb) print tc->nursery_alloc_limit - tc->nursery_tospace | 12:09 | |
$6 = 4194304 | |||
(gdb) print tc->nursery_alloc - tc->nursery_tospace | |||
$7 = 4010880 | |||
this is from rakudo -e 'exit()' | |||
so when we've parsed a tiny program and have compiled and run it, we're about 180 kbytes before the first GC run | 12:10 | ||
the good thing about that is when the entire compiler and parser stuff has already become garbage by the time we're in user code, we'll probably have a very empty nursery after that | |||
that's just theorycrafting, though | 12:11 | ||
ok my program is now use nqp; nqp::force_gc(); sleep(5); and the numbers i get are 4194304 and 23128 | 12:22 | ||
so right after that forced gc the nursery is about half a percent used | |||
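In C terms, the arithmetic done in gdb above is just pointer differences over the thread context's nursery fields (the field names are the ones printed in the session; this is an illustrative helper, not existing MoarVM code):

    #include <stdio.h>
    /* Assumes "moar.h" for MVMThreadContext. nursery_tospace is the base of
     * the current semi-space, nursery_alloc the bump pointer, and
     * nursery_alloc_limit the end that triggers the next GC run.            */
    static void report_nursery_usage(MVMThreadContext *tc) {
        size_t size = (char *)tc->nursery_alloc_limit - (char *)tc->nursery_tospace;
        size_t used = (char *)tc->nursery_alloc       - (char *)tc->nursery_tospace;
        fprintf(stderr, "nursery: %zu of %zu bytes used, %zu left before the next GC run\n",
                used, size, size - used);
    }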
m: say 0x03d20000 - 0x00405000 | 12:25 | ||
camelia | 59879424 | ||
timo | m: say (0x03d20000 - 0x00405000) / (1024 * 1024) | 12:26 | |
camelia | 57.105469 | ||
timo | (just the heap size of the process when it's just about finished) | ||
i had that gdb python script that analyzes the heck out of the nursery and gen2 in terms of how many objects of what repr and type and such, but it was incredibly slow. maybe i'll rewrite it, or at least a part of it, in C at some point | 12:33 | ||
lizmat | where did it get its data from ? | 12:35 | |
timo | it essentially walked all the slots in the gen2 and walked from object to object in the nursery | 12:36 | |
lizmat | aah... /me missed the "gdb" part :-) | ||
timo | slot_type = int(STable['REPR_data'].cast(gdb.lookup_type("MVMArrayREPRData").pointer())['slot_type']) | ||
*shudders* | 12:37 | ||
the tricky bit about the gen2 is that we don't have the info about if a slot is occupied or not directly, but we do have a free list that points from one slot to the next | 12:38 | ||
i guess we could check if the first 8 bytes of the slot is a pointer into one of the gen2 pages of the same size | 12:39 | ||
my implementation back then would go collect all the pages, have a bit (well, a Bool) for every slot, then walk the entire free list and punch holes, and only after that would it go through all the objects that it now knows to be valid |||
not actually sure if this benefits literally anybody :) | 12:55 | ||
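A rough sketch of the "punch holes" idea (all names below are hypothetical, not MoarVM's actual size-class structures): assume every fixed-size slot in a page is occupied, then walk the free list, whose links live in the first word of each free slot, and clear those entries; whatever stays marked must hold a live object.

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical page layout: num_slots slots of slot_size bytes each.
     * free_list threads through the free slots via their first word.     */
    static void mark_occupied(char *page, size_t slot_size, size_t num_slots,
                              void *free_list, uint8_t *occupied) {
        memset(occupied, 1, num_slots);                 /* assume all slots live */
        for (void *cur = free_list; cur; cur = *(void **)cur) {
            char *p = (char *)cur;
            if (p >= page && p < page + slot_size * num_slots)
                occupied[(size_t)(p - page) / slot_size] = 0;  /* punch a hole */
        }
    }

Only after that pass does it make sense to interpret the remaining slots as objects and read their STable/REPR, which is what the gdb script did, and what made it slow in Python.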
13:05  reportable6 joined
timo | i also had it display occupancy of gen2 pages so you'd get a little graphic that shows how holes are distributed | 13:24 | |
i mean, tracy has much prettier display for that, but still | 13:28 | ||
MasterDuke | nine++ also fixed that warning in the mkdir code | 13:36 | |
timo: interesting re new-disp not MVM_oops'ing | 13:37 | ||
timo | could be related to us not optimizing quite as much | ||
i assume you've already checked like turning spesh pieces off | 13:38 | ||
MasterDuke | well, it happens with jit disabled. i think nine diagnosed it as pretty straightforward. two mutexes are locked, one of which is set as the ex_release_mutex. an exception is thrown (in bytecode validation iirc), the ex_release_mutex is unlocked, the other is gc'ed while still locked | 13:42 | |
timo | why do we explode in bytecode validation in the first place | 13:50 | |
like, that's for when we're reading bad bytecode | |||
nine | MasterDuke: this doesn't look right: gist.github.com/MasterDuke17/07eaa...-patch-L63 | 13:55 | |
MasterDuke | m: use MONKEY-SEE-NO-EVAL; EVAL "say buf8.new({(^66_000).join(q|,|)})" | ||
camelia | Bytecode validation error at offset 52, instruction 10: callsite expects 257 more positionals in block <unit> at <tmp> line 1 |
nine | MasterDuke: you're not releasing ex_release_mutex[0] if ex_release_mutex[1] is set? | ||
MasterDuke | right. i'm assuming they work like a stack | 13:56 | |
i.e., set ex_release; <code>; set ex_release; <code>; release; <code>; release | 13:58 | ||
nine | But if an exception is thrown, we unravel the whole C stack. So all of those ex_release_mutex should be released | 13:59 | |
Ah, not in the manually called MVM_tc_set_ex_release_mutex. | 14:01 | ||
MasterDuke | right, and there's a simple fix i'll just add, call MVM_tc_release_ex_release_mutex twice at src/core/exceptions.c:996 | ||
nine | MasterDuke: but in src/core/exceptions.c:996 | ||
14:02  tealecloud left
MasterDuke | i'm not sure why it doesn't happen on new-disp. something else must have changed, but did it actually fix the problem or just mask it somehow? | 14:05 | |
ugh. still `MoarVM panic: Tried to garbage-collect a locked mutex` on master even with the second MVM_tc_release_ex_release_mutex added | 14:07 | ||
timo | ah, the call to the , operator is too large | 14:12 | |
well, we did change how invocation works in new-disp | |||
MasterDuke | i didn't try that shortened example on new-disp, anyone have it built right now? | 14:13 | |
timo | yes, give me a moment | ||
Bytecode validation error at offset 54, instruction 9: | 14:14 | ||
register operand index 54 out of range 0..3 | |||
in block <unit> at -e line 1 | |||
y | |||
MasterDuke | hm. a similar error, but not exactly the same | 14:29 | |
but shouldn't there still be a second mutex getting collected? | 14:30 | ||
since there is in fact a bytecode validation error | |||
timo | probably if you turn on --full-cleanup haha | 14:36 | |
jnthnwrthngtn | nine++ # passing spectest on new-disp \o/ | 14:38 | |
dogbert11 | nine: are you still around | 15:24 | |
could this be a deadlock similar to the one you fixed a few hours ago? gist.github.com/dogbert17/8b4ce8bd...cf9f5d4be5 | 15:28 | ||
15:28  tealecloud joined
nine | dogbert11: yes, indeed! | 15:49 | |
timo | was clean spectests our last task before merging to master or do we switch over nativecall first? | 15:59 | |
nine | I think Blin is the final frontier | 16:00 | |
timo | ah, of course | ||
16:24  tealecloud left
nine | dogbert11: so, you know how to fix it :) | 16:35 | |
dogbert11 | nine: yes, I believe so but it seems as if github has changed a bit, my username password no longer works when pushing, sigh | 16:51 | |
they're talking some kind of token rubbish | 16:53 ||
Nicholas | dogbert11: I've not read it properly yet, but they're talking about stuff here: github.blog/2021-09-01-improving-g...ty-github/ | 17:11 | |
dogbert11 | Nicholas: thx, I've now managed to follow their instructions | 17:21 | |
Geth | MoarVM: dogbert17++ created pull request #1536: Fix another potential deadlock when starting to profile |
17:32 | |
dogbert11 | nine: done, I shamelessly stole some of your text from a previous commit | 17:33 | |
18:02  reportable6 left
nine | dogbert11: no worry. I stole from my first commit for the second one as well :D | 18:11 | |
Geth | MoarVM/new-disp: d17593de7c | (Jan-Olof Hendig)++ | src/profiler/profile.c
Fix another potential deadlock when starting to profile
Very much like commit 9429bdc. The difference here was that MVM_profile_start waited for tc->instance->cond_spesh_sync without marking the thread blocked. A deadlock can occur if the spesh thread is at this point waiting for the other thread to join the GC run. Fix by marking the thread blocked while waiting for the mutex and cond var. |
18:13 | |
MoarVM/new-disp: 0306ed31a8 | niner++ (committed using GitHub Web editor) | src/profiler/profile.c
Merge pull request #1536 from dogbert17/fix-possible-deadlock
Fix another potential deadlock when starting to profile |
MasterDuke | so i think jnthnwrthngtn has some inlining work planned, but other than that it's just a clean Blin run new-disp is waiting for? | 18:37 | |
19:04  reportable6 joined
20:21  discord-raku-bot left
20:22  discord-raku-bot joined
jnthnwrthngtn | I'm not sure "just" is quite the right word for getting Blin clean enough to merge, but yes, what's left is 1) fix regressions, 2) get us able to inline Raku multi/method calls again | 21:20 | |
I think probably sometime soonish after the next release would be a good time, if we're looking good on those | 21:21 | ||
Then we've a release cycle for feedback on anything really pressing | |||
timo | before inlining happens again, is benchmarking a silly idea? | 21:26 | |
maybe comparing current new-disp with master but inlining disabled? | 21:27 | ||
jnthnwrthngtn | I didn't restore spesh linking yet either, so even that's not quite comparable | 21:29 | |
Geth | MoarVM/new-disp: 17c21493d4 | (Jonathan Worthington)++ | 2 files
Make some frame walker functions static inlines
During compilation, dynamic variable lookup consumes a lot of cycles, and the spesh frame walker used to implement it is thus very hot. In fact, it accounted for 9.21% of CPU instruction fetches during CORE.setting compilation. With this change, we save ~7 billion of those instruction fetches, lowering it to 8.78%. |
jnthnwrthngtn | So worth waiting for that at least, I'd say | ||
timo | holy wow | 21:31 | |
i'm very surprised that so many things aren't being inlined automatically already | 21:32 | ||
jnthnwrthngtn | Me too | 21:38 | |
Many of these are cross-compilation-unit | |||
But I'd still expect LTO to do better | 21:39 | ||
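A generic illustration of what the "static inlines" commit above does (all names here are made up, not the real spesh frame walker API): a tiny, hot predicate defined in a header as a static inline can be inlined at every call site, even when the callers live in different compilation units, which ordinary cross-file calls only get with LTO.

    /* walker_inlines.h -- hypothetical example, not MoarVM's real code.
     * Every .c file that includes this gets its own copy of the definition,
     * so the compiler can fold the check into the caller's hot loop instead
     * of emitting a call into another compilation unit.                      */
    #ifndef WALKER_INLINES_H
    #define WALKER_INLINES_H

    typedef struct {
        void *cur_frame;     /* frame currently being visited, NULL when done */
        int   visiting_outers;
    } walker_state;

    static inline int walker_done(const walker_state *ws) {
        return ws->cur_frame == NULL;
    }

    #endif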
So far as dynamic variables go, I suspect we really do look up more of them now | |||
And create pressure on caching of them | |||
timo | i don't remember what we recently added in terms of dynamics | 21:40 | |
jnthnwrthngtn | $*MAST_FRAME is newly needed for numerous ops (hllize, istype, etc.) | ||
timo | there's not something we can do about that? | 21:41 | |
i imagine it's a dynamic var because we use the stack to have a stack of them while recursing? | 21:42 | ||
so can't just use an attribute on some object | |||
jnthnwrthngtn | Well, we can keep a stack on the object or some such also | 21:45 | |
And cache the current one in an attribute | 21:46 | ||
I think the compiler isn't even an instance at the moment | |||
But we could make it one | |||
timo | have we run the dynamic variable caching analysis script in the last few years? | 21:47 | |
22:04  evalable6 left, linkable6 left
22:05  linkable6 joined
23:06  evalable6 joined