github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
08:06 zakharyas joined 08:14 AlexDaniel left 08:27 AlexDaniel joined, AlexDaniel left, AlexDaniel joined 08:32 Kaeipi left 08:33 Kaiepi joined 09:26 squashable6 left 09:27 squashable6 joined
MasterDuke is there a reason ctxouter isn't jitted? the implementation in interp.c looks pretty simple github.com/MoarVM/MoarVM/blob/mast...3356-L3364 09:28
09:30 sena_kun joined
MasterDuke hm. colabti.org/irclogger/irclogger_lo...03-03#l102 "18:37 brrt I don't recall the exact reason, but there was a reason ctxouter didn't work." 09:31
that was a year ago, not sure that a lot has been done to the jit in the meantime, so whatever reason probably still holds? 09:32
lizmat yeah, fraid so, although it might be worth pinging brrt 09:50
.seen brrt
tellable6 lizmat, I saw brrt 2020-04-08T11:33:08Z in #moarvm: <brrt> \o
lizmat hopes brrt is doing ok
sena_kun lizmat, I saw his messages an hour ago or so in another place. 10:04
lizmat ok, good to hear!
10:56 Altai-man_ joined 10:58 sena_kun left 11:24 zakharyas left 11:45 pamplemousse joined 11:57 MasterDuke left 12:17 MasterDuke joined 12:57 sena_kun joined 12:58 Altai-man_ left 13:07 pamplemousse left 13:09 pamplemousse joined 13:37 farcas1982regreg joined 14:06 robertle joined
MasterDuke yep, this patch gist.github.com/MasterDuke17/40d52...44a9477cc3 causes `Frame has no lexical with name '$?PACKAGE' at gen/moar/stage2/NQPHLL.nqp:1499 (/home/dan/Source/perl6/install/share/nqp/lib/NQPHLL.moarvm:SET_BLOCK_OUTER_CTX)` when running install-core-dist.raku after successfully building rakudo 14:09
and lots of rakudo's tests fail 14:10
14:43 robertle left 14:56 Altai-man_ joined 14:58 sena_kun left
MasterDuke hm. v2 of the patch gist.github.com/MasterDuke17/40d52...r_v2-patch has a very similar failure `Frame has no lexical with name '::?CLASS'` 15:56
i don't have any bash history for jit-bisect anymore, anybody remember how it's supposed to be run? 16:09
nine MasterDuke: are there any other JIT implementations of ops that use contexts and/or the framewalker? 16:16
Could be that MVM_context_apply_traversal relies on some book keeping data that's just not set up by the JIT
MasterDuke nine: i copied the implementation of ctxcallerskipthunks (its interp.c implementation is identical except for the literal passed to MVM_context_apply_traversal) 16:17
github.com/MoarVM/MoarVM/blob/mast...1027-L1048 and github.com/MoarVM/MoarVM/blob/mast...4241-L4250 16:18
a bisect is currently running 16:20
`JIT Broken Frame/BB: 1 / 91===SORRY!===Frame has no lexical with name '$_'` 16:22
nine Ah, I see. Then I'd guess that the error is actually in another JITed op and implementing ctxouter just unlocks that 16:23
MasterDuke nine: care to see the log the jit bisect produced? i've never really understood them enough to find anything in them that points out where to look 16:24
nine can take a look 16:25
MasterDuke gist.github.com/MasterDuke17/40d52...44a9477cc3 has it 16:26
jnthn So working on dispatch has led me to our calling conventions.
MasterDuke they need changing? 16:27
jnthn And looking at how we can efficiently implement the whole capture tweakery thing 16:28
Because the naive approach - well, also what we'd do when evaluating a dispatcher to record a guard/transform chain - is just to produce new MVMCaptures each time
nine So...you're gonna tell us that it will be a lot faster to pass on arguments in the future? 16:29
jnthn But we don't want to do that for the real guard chain walk.
Anyway, focusing back on what we do today for a moment 16:30
prepargs <callsite> - OK, so the callsite contains the argument register kinds, and also now the named argument names 16:31
arg_o 0, r(0)
arg_o 1, r(2)
The integer in the middle there writes into the args buffer. But we always, afaik, emit those in order. That's pretty redundant.
But wait, the information that it's an object argument is redundant too, 'cus that's in the callsite 16:32
And in fact, why do we even have an args buffer at all? It means we have to copy twice. 16:33
First, register to args buffer
Then in binding, args buffer to parameter
nine Couldn't the callsite contain the list of work registers that contain the args? They are determined at compile time anyway 16:38
jnthn I don't think it should contain the actual work register indices 16:41
Because we can't intern callsites so widely then
But I think it could contain constants 16:42
So then we have
prepargs <callsite>
[list of 16-bit integers identifying registers]
dispatch ...
That way, every arg is 2 bytes instead of 6 bytes today (or 2 bytes instead of 14 bytes for named args) 16:43
timotimo list of integers, like, literally where we'd normally have bytecode?
jnthn Yes 16:44
They're effectively "varargs" to the prepargs
nine Basically a prepargs OP with a variable number of arguments
jnthn hah!
And what if we take it even further? 16:45
dispatch_o r(0), <callsite>, 'dispatcher-name'
And then followed by the list of 16-bit integers 16:46
So instead of a 2-argument call today being prepargs (2 + 4 bytes), 2 arg_o instructions (2 * 6 bytes) and one invoke instruction (2 + 2 + 2 bytes), for a total of 24 bytes *and* 4 instructions to interpret 16:47
It'd be 2 (dispatch_o instruction code) + 2 (result register) + 4 (callsite) + 4 (dispatcher name) + 3 * 2 registers (one register is the invokee) = 18 bytes 16:48
1 instruction to interpret 16:49
And no copying into an arg buffer
No arg buffer for the GC to have to collect
In fact, no arg buffer to allocate at all
So every frame takes less ->work too
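The byte counts jnthn quotes above can be checked with a few lines of Python. This is only an arithmetic sketch: the field widths are the ones given in the discussion, the split of the invoke instruction into opcode plus two registers is an assumption, and the function names are made up for illustration rather than taken from MoarVM.

```python
# Sizes in bytes, as quoted in the discussion (not read from the MoarVM source).
OPCODE = 2      # 16-bit instruction code
REGISTER = 2    # 16-bit register index
ARG_INDEX = 2   # 16-bit slot number in the args buffer
CALLSITE = 4    # callsite reference
STRING = 4      # string reference, e.g. the dispatcher name


def current_call_bytes(num_args):
    """prepargs, one arg_o per argument, then one invoke (today's scheme)."""
    prepargs = OPCODE + CALLSITE
    arg_o = OPCODE + ARG_INDEX + REGISTER
    invoke = OPCODE + REGISTER + REGISTER  # assumed: result + invokee registers
    return prepargs + num_args * arg_o + invoke


def proposed_call_bytes(num_args):
    """One dispatch_o followed by 16-bit register indices (invokee + args)."""
    dispatch_o = OPCODE + REGISTER + CALLSITE + STRING
    return dispatch_o + (num_args + 1) * REGISTER


print(current_call_bytes(2), proposed_call_bytes(2))
```

For a 2-argument call this reproduces the 24 bytes versus 18 bytes from the log, and the per-argument cost drops from 6 bytes to 2.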
The other thing I'm thinking to do is move flattening up front 16:51
So we do it at the callsite 16:52
And for cases where we have, say, up to N positional args flattened in, we resolve it to an interned callsite
Maybe some rule for named ones too
(Need to be careful that a malicious program doesn't explode the memory use :))
16:57 sena_kun joined
jnthn And if I hang this new way of doing things off the new `dispatch` instruction, I've got a gradual migration path for implementing this. :) 16:58
16:59 Altai-man_ left
jnthn Ok, home time 17:04
jnthn hopes the time invested in the design work will mean he has an easier/shorter time of the impl work :) 17:05
17:25 zakharyas joined
MasterDuke nine: guess nothing jumped out at you in that bisect log? 18:50
18:56 Altai-man_ joined 18:57 zakharyas left 18:58 sena_kun left
nwc10 jnthn: er, hangon, currently each *call* causes allocation? Or "each call site on first call"? 19:00
timotimo which allocation are you referring to? 19:12
19:13 zakharyas joined
nwc10 17:49 < jnthn> No arg buffer for the GC to have to collect 19:13
timotimo that's more a "have a couple pointers that have to be put into a worklist" thing
nwc10 timotimo: I'm not familiar (at all) with the MoarVM calling convention, so I can't easily follow from jnthn's long description what is "plan he can rule out now" versus "current"
(sort of clear what "future" is intended to be, but of course "no plan survives contact with the enemy") 19:14
timotimo arg_buffer is actually a pointer into *work, eh? so maybe we're currently just allocating it at the end of the registers area or something?
lizmat also, will these plans affect the JIT in any way ? 19:17
will new ops need to be JITted
I assume so
jnthn nwc10: Currently the registers area for a frame has an area known as the "args buffer"; we keep a pointer into it also. 19:56
nwc10: The GC needs to walk these registers based on the callsite describing which ones are objects/strings 19:57
It's not a big amount of work, but every little helps.
nwc10 ah OK thanks 19:58
jnthn lizmat: Remains to be seen exactly how it works out, but it's unlikely that the op the interpreter uses will be JITted directly.
lizmat yeah, figured as much 19:59
by having the ops do more, wouldn't that make it harder to JIT ?
jnthn We'll be able to do things from inlining (op disappears) through specialization linking and so turning it into a fastinvoke of a specialization and fall back to at least a variant that avoids some of the overheads. 20:00
lizmat: Only if we ever let the JIT see it. :) 20:01
lizmat ok, so you're saying the JIT is going to have simpler targets ?
jnthn Well, in the inlining case it's got no op, in the linked specialization case it's a lot like today. That covers the monomorphic majority without really needing any changes. But yeah, a nicer fallback form for the JIT is possible, perhaps even including JITting the guard tree as it exists at the point we produce the specialization. 20:03
20:04 MasterDuke left
jnthn tbh, I'm mostly worried at this point about how badly we'll behave on the megamorphic minority, 'cus as it stands the design hasn't got a great answer to that. 20:06
nwc10 "Doctor doctor, it hurts when I do this" "Well, don't do that then" 20:07
one of the two English meta-jokes that I'm aware of
the other being
"Two Irishmen sitting on the floor. One fell off"
lizmat megamorphic as "method raku" existing on many types in a single dispatch chain? 20:08
nwc10 These will make no sense unless you are aware of various stereotype and set-piece English jokes
jnthn lizmat: Yes, if you have one particular callsite that encounters many different types, for example. 20:09
I was always fond of the one where they saw an advert saying "tree fellers wanted" and were like, "darn, there's only two of us"... :) 20:10
lizmat: Though the other variant is stable type but many method names (the current factoring of how we invoke action methods looks this way) 20:11
And the worst would be $so-many-types."$so-many-names"() :) 20:12
20:12 MasterDuke joined
lizmat couldn't a guard be something like "type seen"? 20:14
jnthn Well, normally you'd see a type and a method name and they won't change much, so the approach of "guard on type and name" (if name ain't already a constant) works out fine. 20:15
But if you see 100 types and 100 method names, you don't want to build a tree of 10,000 entries 20:16
At some point you're better off with having a per-target-type hash 20:17
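The 10,000 figure can be pictured with a toy model in Python. It assumes a single callsite that records one guard entry per (type, method name) pair it sees; the class and the names in it are hypothetical and do not reflect how MoarVM actually stores guards.

```python
from itertools import product


class CallSiteGuards:
    """Toy per-callsite cache: one entry per (type, method name) pair seen."""

    def __init__(self):
        self.entries = {}

    def dispatch(self, type_name, method_name, method_tables):
        key = (type_name, method_name)
        if key not in self.entries:
            # Miss: resolve through the type's own method table, then record it.
            self.entries[key] = method_tables[type_name][method_name]
        return self.entries[key]


# 100 hypothetical types, each offering the same 100 hypothetical method names.
types = [f"T{i}" for i in range(100)]
names = [f"m{j}" for j in range(100)]
method_tables = {t: {n: f"{t}.{n}" for n in names} for t in types}

site = CallSiteGuards()
for t, n in product(types, names):
    site.dispatch(t, n, method_tables)

print(len(site.entries))  # guard entries accumulated at this one callsite
```

The per-type tables in `method_tables` are shared by every callsite, whereas the guard entries here belong to one callsite only, which is why the megamorphic case is the worrying one.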
lizmat so why not start out with one?
jnthn ?
lizmat a per-target hash ?
or a per-target list ? 20:18
jnthn We do that today. 20:19
$ perl6 -e 'say X::AdHoc.^methods(:all).elems'
165
$ cat src/core.c/Exception.pm6 | grep class | wc -l 20:20
322
m: say 165 * 322
camelia 53130
jnthn Just for that one file, there's 53,000 serialized hash entries in CORE.setting's precomp thanks to this.
Even if we assume we manage to do it compact enough that there's 2 bytes each for the key and hash (it'll be worse in a big comp unit like CORE.setting), that's 200KB. 20:22
That's *before* you use the type and we deserialize the per-type method cache hash.
lizmat I wonder if X::AdHoc needs that many methods 20:24
maybe Exception should be made outside of Any ?
jnthn Was just doing the calculation, and I reckon it's 40 bytes just for the hash bucket storage once expanded...
m: say 165 * 40 20:25
camelia 6600
jnthn This only happens for the types you use, but still...
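The two estimates above work out as follows. This is just the arithmetic from the log: the method and class counts are the ones jnthn printed, the 4 bytes per serialized entry is his stated lower-bound assumption (2 bytes each for the two halves of an entry), and 40 bytes per expanded bucket is his own rough figure.

```python
methods_per_type = 165   # X::AdHoc.^methods(:all).elems, from the log
class_count = 322        # lines matching "class" in Exception.pm6, from the log

# One serialized method cache entry per (class, method) pair.
entries = methods_per_type * class_count

# Lower bound on serialized size: 2 + 2 bytes per entry.
serialized_bytes = entries * (2 + 2)

# Expanded hash storage for a single type once it is actually used.
expanded_bytes_per_type = methods_per_type * 40

print(entries, serialized_bytes, expanded_bytes_per_type)
```

That gives 53130 entries and a little over 200KB serialized for this one file, plus about 6.6KB of bucket storage for each type whose cache gets deserialized.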
lizmat: It's not really to do with exceptions, it's everything. I just picked it as a file that illustrates that Raku code is quite class-dense. 20:26
Or at least, can be.
lizmat yeah, but this was really outside of this discussion :-)
jnthn Especially given they have safety/performance benefits over hashes.
Anyway, no, I don't really think Exceptions not being Any would help matters. :) 20:27
lizmat it doesn't break the build, but it does break installing core modules
jnthn I'm just noting why the pre-calculation of a method cache for every type is costly now we have the size of standard library and people running the size of applications they do :)
And why I'm keen to move away from it as part of this set of changes, so we at least only build it for the cases that really need it. 20:28
(The other part of the story here is that I relied on this pre-calc to resolve a bootstrap loop also, and will probably have to find another way to circularity saw that too...) 20:32
nwc10 wonders if the circularity saw is related to [Tux]'s chainsaw. (This is probably far too cross-channel an in-joke. Don't cross the streams) 20:33
lizmat yeah, I got it :-) 20:34
on p5p, [Tux] would always be ready to rip out code that had become obsolete and removable 20:35
nwc10 and I smile, because formats never met *his* view of these criteria :-) 20:36
lizmat well, they may have been obsolete, but definitely not removable ?
nwc10 my opinion (I stress both of these) is that removing the *implementation* gains little, as it is (relatively) bug free and self contained. But optionally disabling it lexically would allow all the "magic" variables to be disabled, which would "free up" a lot of "syntax space" 20:38
all the things like (IIRC) $= $; $-
needing to be treated as scalars
jnthn wanders away for a bit to do homework 20:41
nwc10 I should wander away to do sleep 20:42
20:56 pamplemousse left 20:57 sena_kun joined 20:58 Altai-man_ left 21:19 zakharyas left 22:01 sena_kun left 22:30 farcas1982regreg left