github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018. |
|||
01:13
Merfont joined,
Kaeipi left
02:29
squashable6 left
02:31
squashable6 joined
03:32
pamplemousse left
05:52
squashable6 left
05:53
squashable6 joined
|
|||
nwc10 | good *, #moarvm | 07:52 | |
07:53
zakharyas joined
08:57
MasterDuke left
09:05
farcas1982regreg joined
09:06
Altai-man_ joined
09:13
robertle joined
|
|||
jnthn | o/ | 09:22 | |
Altai-man_ | o/ | 09:23 | |
jnthn, don't want to be pushy, but slack please (not a command to slack, though if you want to...) | 09:24 | ||
09:27
sena_kun joined
09:28
Altai-man_ left
09:32
MasterDuke joined
|
|||
nwc10 | \o | 09:43 | |
MasterDuke | hm. `my @a = ^1000; my @b; for ^1_000 { @b = @a.rotate(3); }; say @b.head` causes 2040708 calls to MVM_spesh_arg_guard_run. that seems like a few more than there should be? | 09:53 | |
jnthn | Well, there's a million elements to process, that's 2 arg guard checks per element | 09:55 | |
"should be" is kinda hard to say; does it run faster with spesh disabled? If not, it's coming out ahead even wit them. | 09:56 | ||
*with | |||
Of course it's nice when it can do specialization linking and avoid it. | |||
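The estimate in the exchange above can be written out as plain arithmetic. This is only a sketch: `expected_guard_runs` is a hypothetical helper, not a MoarVM function, and the "2 checks per element" figure is taken from jnthn's message as given.

```c
#include <stdint.h>

/* Hypothetical helper (not part of MoarVM): estimate how many arg guard
 * runs to expect when every processed element costs a fixed number of
 * guard checks. */
uint64_t expected_guard_runs(uint64_t iterations,
                             uint64_t elements_per_iteration,
                             uint64_t checks_per_element) {
    return iterations * elements_per_iteration * checks_per_element;
}

/* Difference between a measured count and the estimate. Assumes the
 * measured count is at least as large as the estimate. */
uint64_t guard_run_surplus(uint64_t observed, uint64_t expected) {
    return observed - expected;
}
```

With 1000 loop iterations over a 1000-element array and 2 checks per element, the estimate is 2,000,000 runs, so the reported 2,040,708 is about 2% above it.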
MasterDuke | if i increase the for look to 10k, with spesh is about 0.5s faster, ~5.8s instead of ~6.3s | 09:59 | |
*for loop | |||
but a perf report shows more time in MVM_spesh_arg_guard_run and resolve_using_guards than most other code | 10:00 | ||
10:02
robertle left,
robertle joined
|
|||
MasterDuke | and this is the code that has a weird discrepancy between what a --profile shows and a spesh log says. fwiw, the profile says 50% of time is spent in interpreted code | 10:03 | |
10:04
robertle left
|
|||
MasterDuke | and the other 50% in jitted code, 0% in specialized | 10:05 | |
no change with BLOCKING and NODELAY | 10:06 | ||
jnthn | What about disable? | 10:10 | |
e.g. are we looking at a situation where spesh makes things worse, or just could do better than it currently does? | |||
(Agree something sounds a bit off, but coming out ahead is still coming out ahead... :)) | |||
In the end, though, the spesh arg guard selects a specialization that depends on those guards, and that in turn means it's absorbing the cost of parameter type checking and probably other type checks in the original code. | 10:14 | ||
10:14
Altai-man_ joined
|
|||
MasterDuke | disabling spesh is 0.5s slower (when the loop is 10k instead of 1k) | 10:16 | |
10:17
sena_kun left
|
|||
MasterDuke | so yes, spesh is helping, it just seems to be sub-optimal somehow | 10:17 | |
fwiw, gist.github.com/MasterDuke17/6e63d...0974ea521d has a count of current_node values at the end of each iteration of github.com/MoarVM/MoarVM/blob/mast...L417-L472, if this info can be useful somehow | 10:21 | ||
how/where does a profile know that a function was speshed or jitted? | 12:06 | ||
lizmat | MasterDuke: how does that compare to my @a = ^1000; my @b; for ^1000 { @b = @a.map: * + 1 } | 12:08 | |
that passes the same .push-all / .push path, afaik, but it *does* spesh and makes the .push only .02% of CPU or so | 12:09 | ||
MasterDuke | yeah, i see that | 12:11 | |
jnthn | MasterDuke: optimize.c in spesh rewrites the profile enter ops, iirc | 12:13 | |
MasterDuke: To call a "we're profiled" version | 12:14 | |
oops, we're specialized | |||
And then the JIT emits them as "we're jitted" :) | |||
12:16
sena_kun joined
|
|||
MasterDuke | jnthn: github.com/MoarVM/MoarVM/blob/mast...2937-L2940 ? | 12:16 | |
timotimo | and the jit jits "enterspesh" to pass "entered via jit" instead | 12:17 | |
jnthn | yes, that looks like it | ||
MasterDuke | is there an easy way to log the function name at that point? | ||
i.e., via a printf | |||
jnthn | probably g->sf->body.name | ||
You'll need to call the thingy to turn it into a C string first | 12:18 | ||
MasterDuke | k, thanks | ||
12:18
Altai-man_ left,
Merfont left
|
|||
jnthn | So...hm, what next... | 12:18 | |
12:19
pamplemousse joined
12:20
Kaiepi joined
|
|||
jnthn | At first I figured I could think about my ideas for changing how callstack is used (so we can have non-frame things on it) until I actually needed resumable dispatchers. | 12:20 | |
uh, put off, not think about :) | 12:21 | ||
Wonder where I was originally going with that sentence... | |||
Anyway, I started looking into capture transformations and so forth and figure I'll need working space for those beyond the immediate boring cases. | |||
So it probably comes sooner rather than later. | 12:22 | ||
(Latest notes on the capture stuff are at gist.github.com/jnthn/e81634dec57a...operations ) | 12:23 | ||
nwc10 | jnthn: "what next..." surely is one of "coffee", "lunch" or "beer" depending on what time of the day, er, morning, it is. | 12:25 | |
jnthn | I already did coffee and lunch :) | 12:26 | |
MasterDuke | hm. there's only one push hit in that example. it shows up at optimize_bb_switch 22 times. when profiling, it hits the case linked above 2 times | ||
nwc10 | you're allowed "walk" under the current rules? | ||
jnthn | Yes | ||
So long as I'm wearing a face covering. | |||
nwc10 | "washing up" seems to be a thing too | ||
jnthn | I think I'll implement the new capture representation, and see where that takes me :) | 12:27 | |
MasterDuke | so why would the profile think it wasn't speshed at all? | ||
12:27
regreg joined
|
|||
jnthn | I actually arrived at work thinking about a beer today...my walk to work coincided with the big brewery I walk near making a batch and the wind direction being correct to carry it to where I walk :) | 12:28 | |
MasterDuke | i have one of those small home-brew kits. i have about a gallon of irish stout sitting right next to me that i need to check if it's ready to be bottled | 12:29 | |
12:30
farcas1982regreg left,
regreg_ joined
12:32
regreg left
12:35
regreg_ left
|
|||
MasterDuke | lizmat: interesting, that push isn't being called for the code in the loop. it's creating the initial array from the range | 12:36 | |
well, in the case of rotate it's also getting hit there. but in the other examples it's just creating the initial array | 12:37 | |
lizmat | so you're saying it's reifying every time ? | ||
MasterDuke | maybe? | 12:38 | |
lizmat | sticking in a @a[999] before the loop, doesn't seem to change it | 12:39 | |
MasterDuke | it seems to perform the type specialization after 1000 hits. so i thought that might explain why it wasn't shown as speshed/jitted initially, since we were only creating a 1k array | 12:40 | |
lizmat | afk for a bit& | 12:41 | |
MasterDuke | but that doesn't actually make sense, because in the rotate example it's also being hit in the loop body, and the profile shows 1001000 calls | ||
Geth | MoarVM/new-disp: 1f7e27a5c5 | (Jonathan Worthington)++ | 8 files Stub in new MVMCapture REPR and BOOTCapture To be used as part of the new dispatch mechanism, and will eventually take over from the current MVMCallCapture (which will, when it's no longer needed, be removed). |
12:52 | |
jnthn | I'm steadily coming to terms with the fact that all of argument processing is going to have to change, and I'll need two versions of it... | 12:56 | |
(for the migration period) | |||
12:57
farcas1982regreg joined
|
|||
MasterDuke | a profile of `my @a = ^2_000; my @b; for ^1_000 { @b = @a.rotate(3); }; say @b.head` shows 2001172 logs of push with mode MVM_PROFILE_ENTER_NORMAL, 656 logs of mode MVM_PROFILE_ENTER_JIT_INLINE, and 172 logs of mode MVM_PROFILE_ENTER_JIT | 13:04 | |
jnthn | Is there any evidence in the spesh log of it not rewriting the profile instruction properly? | ||
(Note: there may be multiple specializations of it) | 13:05 | ||
MasterDuke | nothing is jumping out at me | 13:09 | |
first instruction in BB of after for 'push' is prof_enterspesh | 13:15 | ||
BB 1 | 13:16 | ||
lizmat | jnthn: if all of argument processing is going to change, would that be an opportunity to fix some dispatch quirks ? | 13:45 | |
m: multi a($a) { "one" }; multi a($a, *%_) { "two" }; a 42 # specifically | 13:46 | ||
camelia | Ambiguous call to 'a(Int)'; these signatures all match: :($a) :($a, *%_) in block <unit> at <tmp> line 1 |
||
jnthn | lizmat: Argument processing at the VM level, which is well below the decision making of the multi-dispatcher | 13:56 | |
lizmat | ok | ||
jnthn | Having played with some ideas for a while, I think I've found a way to have all of 1) no copying to a temporary argument buffer in the common case, 2) no branching in argument fetch instructions depending on if the args we get came directly from the dispatch instruction or via a capture/flattening/whatever, and 3) a quite simple design. | 13:59 | |
moritz | that sounds pretty awesome! | 14:02 | |
now let's hope that step 3 doesn't involve solving the halting problem :D | |||
MasterDuke | nice | 14:05 | |
jnthn | I'm gonna get funny looks for it 'cus it also involves an array of the form [0,1,2,3,4,...] :) | 14:06 | |
nine starts practicing his funny looks | 14:10 | ||
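One way to picture the design jnthn describes is sketched below. It is a guess at the idea, not the actual MoarVM code: every name here (`ArgsView`, `args_contiguous`, `args_from_registers`, `arg_fetch`) is hypothetical. The assumption is that arguments are always read through an index map, so a fetch never has to ask where the arguments came from; arguments laid out contiguously share one identity map of the form [0,1,2,...], while arguments scattered among registers supply their own map.

```c
#include <stdint.h>

#define MAX_ARGS 8

/* Hypothetical illustration only; these are not MoarVM's real structs. */
typedef struct {
    const int64_t  *source; /* storage that holds the argument values */
    const uint16_t *map;    /* map[i] is the index in source of argument i */
    uint16_t        count;  /* number of arguments */
} ArgsView;

/* Shared identity map [0,1,2,...] for arguments stored contiguously. */
static const uint16_t identity_map[MAX_ARGS] = {0, 1, 2, 3, 4, 5, 6, 7};

/* Arguments already sit in order in a buffer: reuse the identity map.
 * Assumes count <= MAX_ARGS. */
ArgsView args_contiguous(const int64_t *buffer, uint16_t count) {
    ArgsView v = { buffer, identity_map, count };
    return v;
}

/* Arguments are scattered among registers: the map names each one,
 * so nothing is copied into a temporary buffer. */
ArgsView args_from_registers(const int64_t *registers,
                             const uint16_t *map, uint16_t count) {
    ArgsView v = { registers, map, count };
    return v;
}

/* The fetch is the same in both cases, with no branch on the origin. */
int64_t arg_fetch(const ArgsView *v, uint16_t i) {
    return v->source[v->map[i]];
}
```

The trade-off in this sketch is one extra indirection per fetch in exchange for a single code path and no argument copying.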
14:15
Altai-man_ joined
14:18
sena_kun left
14:33
zakharyas left
|
|||
jnthn | Written up here: gist.github.com/jnthn/e81634dec57a...operations | 14:55 | |
(as much because writing it up helps me get the idea straight in my head...) | |||
I figure that flattening results will also be something that lives on our (soon to be more genearl purpose) call stack. | 15:03 | ||
*general, gah :) | |||
So, next question: how much does today's arg handling code need to change to deal with this... :) | 15:16 | ||
MasterDuke | if you care about editing corrections: "the destination of the call will will", "brining the baggage of argument processing" (i prefer picked arguments), "expanded flattening arguemnts" | 15:18 | |
jnthn | No, I don't really | ||
It's my personal notes that I'm sharing as go :) | 15:19 | ||
*as I go | |||
I'll spend some time turning appropriate parts into docs when I'm through. | |||
nine | Your personal notes tend to end up as our only documentation ;) | ||
MasterDuke | arg, and i even had a typo in my snide comment! | ||
15:24
robertle joined
|
|||
MasterDuke | so the after speshed version of push does in fact have prof_enterspesh, and says at the end "Specialization took ..." and "JIT was successful ...". push is called a couple million times, so is it somehow not getting the new optimized version? something about the guard tree? | 15:31 | |
jnthn | Maybe, or something inappropriate about the call site (like flattening)? | 15:32 | |
Currently if the call flattens args, no spesh for you | 15:33 | ||
(Thus why flattening is moving to being the first step with `dispatch`) | |||
MasterDuke | in push's BB 7: "prepargs callsite(0x7f28000356c0, 2 arg, 2 pos, nonflattening, interned)" | 15:37 | |
that was in the before, no mention at all in the after | 15:38 | ||
jnthn | ah, then it was inlined | ||
MasterDuke | push inlines assign-scalar-no-whence-no-typecheck. later push-all inlines push | 15:39 | |
push is just 3 BBs in its after state | 15:40 | ||
oops, four, the numbering starts at 0 | 15:41 | ||
don't think i've ever looked at a spesh log taken while profiling (it's usually one or the other), but it's nice to see the prof_enters and prof_enterinlines, and the paired prof_exits | 15:43 | ||
15:48
robertle left
|
|||
timotimo | t.co/0sdmkJmW5r - beautiful, and probably applies very well to comp sci | 15:53 | |
jnthn | :D | ||
16:16
sena_kun joined
16:18
Altai-man_ left
|
|||
Geth | MoarVM/new-disp: 8e5095262d | (Jonathan Worthington)++ | 13 files Prepare the way for new args handling Split the param context up into legacy, new, and common parts. Go through all the places broken by this. In most of them, panic if we shouldn't be there with non-legacy args, or if we didn't implement handling of dispatch args. In some places, update them with handling for the new dispatch args approach too. ... (7 more lines) |
17:40 | |
jnthn | Took a bit of doing, but probably the best migration strategy | 17:44 | |
And gets me to a point where I think that my path to having the `dispatch` callback of a built-in dispatcher being called is "just write code" | 17:46 | ||
And getting beyond *that* point means I need to actually work out how the dispatch program representation will look. :) | 17:48 | ||
Anyways, home time, I think | 17:50 | ||
18:15
Altai-man_ joined
18:18
sena_kun left
18:39
Kaeipi joined
18:40
zakharyas joined
18:43
Kaiepi left
18:45
farcas1982regreg left
18:46
farcas1982regreg joined
|
|||
Geth | MoarVM: MasterDuke17++ created pull request #1283: Fix the order of some MVM_calloc arguments |
19:57 | |
20:16
sena_kun joined
20:18
Altai-man_ left
20:29
robertle joined
20:48
MasterDuke left
20:56
zakharyas left
21:23
MasterDuke joined
21:34
pamplemousse left
21:35
farcas1982regreg left,
pamplemousse joined
21:40
robertle left
|
|||
MasterDuke | should this template exist? github.com/MoarVM/MoarVM/blob/mast...2813-L2818 | 21:52 | |
the lego jit doesn't have a prof_enter implementation, so i guess it'll never be called. but isn't the point of calling MVM_profile_log_enter with MVM_PROFILE_ENTER_NORMAL to mean it was interpreted? | 21:54 | ||
22:15
Altai-man_ joined
22:18
sena_kun left
22:36
Kaeipi left,
Kaeipi joined
|
|||
timotimo | oh, that should definitely not be ENTER_NORMAL | 22:36 | |
but it's also not supposed to survive speshing or jitting | 22:37 | |
MasterDuke | and templates don't get called without a lego jit implementation, right? | 22:39 | |
timotimo | no, a template can be used even if the lego jit doesn't have anything for it | 22:40 | |
MasterDuke | ah. well, changing it to MVM_PROFILE_ENTER_JIT didn't change anything about a profile of that rotate example | 22:42 | |
22:45
Altai-man_ left
|
|||
Geth | MoarVM: MasterDuke17++ created pull request #1284: Use correct mode when JITting prof_enter |
22:52 | |
timotimo | OK | 22:53 | |
MasterDuke | i'm still trying to figure out why that push isn't being reported as speshed or jitted | 22:58 | |
23:49
MasterDuke left
|
|||
Geth | MoarVM: 565e55ce17 | (Daniel Green)++ | 3 files Fix the order of some MVM_calloc arguments It's number of elements then size. |
23:58 | |
MoarVM: c4917b192b | (Jonathan Worthington)++ (committed using GitHub Web editor) | 3 files Merge pull request #1283 from MasterDuke17/fix_order_of_calloc_arguments Fix the order of some MVM_calloc arguments |
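For reference, the argument order named in that commit message matches standard C `calloc`, which takes the number of elements first and the size of each element second. A minimal sketch using plain `calloc` rather than `MVM_calloc`; `alloc_counters` is a hypothetical helper, not something from the MoarVM source.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a zero-initialized array of `count`
 * 32-bit counters. The first argument to calloc is the number of
 * elements, the second is the size of one element. */
uint32_t *alloc_counters(size_t count) {
    return calloc(count, sizeof(uint32_t));
}
```

Swapping the two arguments usually requests the same number of bytes, so a fix like this is mostly about stating the intent correctly.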