github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
00:05 Kaiepi joined 00:38 Kaiepi left 01:54 lizmat joined 02:22 bloatable6 joined 02:39 Util joined 03:51 Kaiepi joined
Geth MoarVM/ryu: 6 commits pushed by (Nicholas Clark)++ 05:45
nwc10 good *, #moarvm 05:46
japhb Good way to say hello, with a push! :-) 05:49
06:11 sivoais left 06:39 sivoais joined 06:50 patrickb joined
patrickb o/ 06:52
nwc10 \o 06:54
Geth MoarVM: patrickbkr++ created pull request #1484:
CI: Update package index before installing packages
07:03
patrickb nwc10: I hope the above will fix the ryu PR. 07:04
nwc10 I was about to ask you exactly that :-)
patrickb Work on CIs tends to be a lot of pain. I have quite a bit of sympathy for MasterDuke and his attempt to improve the Azure chain. 07:07
MasterDuke++ 07:08
Geth MoarVM: 19db00f75d | (Patrick Böker)++ | azure-pipelines.yml
CI: Update package index before installing packages

This should fix missing package errors.
07:36
MoarVM: 09c4c4d427 | (Patrick Böker)++ (committed using GitHub Web editor) | azure-pipelines.yml
Merge pull request #1484 from patrickbkr/ci-update-package-index

CI: Update package index before installing packages
patrickb It now installs gdb 10. So the fix seems to actually have been correct. 07:37
07:39 lizmat left
Geth MoarVM/ryu: 6 commits pushed by (Nicholas Clark)++ 07:47
nwc10 and now rebased onto that
07:53 sena_kun left, lizmat joined 07:59 sena_kun joined 08:06 LizBot joined 08:28 zakharyas joined
timotimo you know, jnthn, if we know that a frame we just hit the instrumentation barrier on in order to validate it is going to call into another soon after, we could totally queue the other frame for immediate verification, perhaps off-thread, and win the tiniest amount of latency 09:14
jnthn Given a lot of frames are quite small, I think off-thread might not be a win, in that the coordination could dominate, but one could go on bytecode size 09:24
timotimo i wonder what workload i had where a lot of time was spent on verification. perhaps "the empty program" 09:25
jnthn But yeah, since the instrumentation has to walk the bytecode anyway...
There's a memory trade-off also iirc, because bytecode validation depends on the frame being fully deserialized and I think also creates annotation maps 09:26
And just because one frame can be statically seen as referencing another doesn't mean that it will actually call it
For example, CATCH blocks 09:27
timotimo cdn.discordapp.com/attachments/557...nknown.png here's the moar heapanalyzer with its bytecode validation zones
jnthn In which case we risk doing work ahead of time that we'd never really do
09:28 frost-lab joined
jnthn The long tail is interesting there 09:29
09:30 cog_ left
jnthn I wonder what it took 9ms to validate 09:30
I should really install that Tracy thing when I get to tuning up new-disp 09:31
Looks pretty amazing in terms of what you can visualize and find out 09:32
timotimo i'm putting in reporting for what exactly is being validated right now so we can see
can't reproduce the 9 right now 09:35
jnthn I guess this can be sensitive to context switches and other sources of load, or does it somehow account for that? 09:44
timotimo i would have seen it if i had zoomed to the zone 09:48
and i think the "self time" and "running time" and other stats also show that
it could be the mainline of nqp/lib/QAST.moar takes a bit long 09:51
hm, though for that i only got the name, not the filename, which i should have gotten if it were that file 09:52
that was the last load bytecode region before that validation however 09:54
09:55 domidumont joined
timotimo or i have to see it like a stack 09:55
ok i have the filename in it as well now 09:58
nwc10 patrickb: yes, your fix fixed it 09:59
and I realise "most context free message so far today" 10:00
oops. Azure, apt and d'oh!
timotimo MASTOps' frame with uuid 827 wins with 993 μs, next up is nqp/lib/QAST.moarvm's <mainline> with 777 μs, then core.c.setting.moarvm's 19141 with 766 μs, then it drops a bit with 400 μs for the core setting's unit
that very first one is the frame that has a boatload of locals, and the code is just getcode + takeclosure + getcode + takeclosure et cetera, followed by checkarity, paramnamesused, then wval + bindlex a lot, and then it sets up integer arrays by pushing const_i64_16 values into them over and over 10:02
and then a hash with string keys, where integers are boxed, but it's not even caching the hllboxtype_i, it just gets it over and over again :D 10:03
it's got 16.67k instructions in total 10:04
also 866 registers
wonder if anything keeps us from getting these arrays and hashes created at compile-time and properly serialized 10:05
nwc10 I've not used Bloaty McBloatface, but from what I've read it tells you the size of various sections. Is there any easy way to do that for MoarVM compiled bytecode files? So one can see whether a change moves stuff between sections? Or what total size the various different bytecode tags add up to? 10:06
timotimo what does "bytecode tags" mean to you here? 10:07
just 10.15k instructions in the <mainline> of QAST.moarvm, but it looks even more wasteful in some spaces 10:08
nwc10 good question. Wasn't clear. Say, how much is serialised arrays vs, say, serialised code. Except, I realise that this is a daft idea because everything contains everything else
timotimo haha, yeah true
10:13 samcv left, samcv joined
timotimo do we do any good on repeated findmeth with the same name on the same object (grabbed fresh with a wval each time) when every findmeth is in a different spot in a frame we only run a single time? 10:17
right now our QAST.nqp ends up with a boatload of top-level calls to QAST::MASTOperations.add_core_moarop_mapping, for example, and that's a wval + decont + findmeth every time, though we could perhaps bind QAST::MASTOperations itself to a local, have the method found once up front, and call it over and over again 10:19
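A minimal NQP sketch of the hoisting timotimo describes; the mapping names below are made up and the method's argument list is an assumption, not taken from the real QASTOperationsMAST.nqp:

    # Today's pattern: every top-level call re-does wval + decont + findmeth.
    QAST::MASTOperations.add_core_moarop_mapping('neg_i', 'neg_i');
    QAST::MASTOperations.add_core_moarop_mapping('abs_i', 'abs_i');

    # Hoisted: bind the object once, resolve the method once, then just invoke it.
    my $ops        := QAST::MASTOperations;
    my $add_moarop := nqp::findmethod($ops, 'add_core_moarop_mapping');
    $add_moarop($ops, 'neg_i', 'neg_i');
    $add_moarop($ops, 'abs_i', 'abs_i');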
jnthn Even better would be to build the mapping data structure at BEGIN time, but maybe something blocks us on that 10:25
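A sketch of the BEGIN-time idea jnthn mentions; the class name, hash name, and contents are invented, and installing the result into the package stash via .WHO is just one possible way to expose it. Whatever the BEGIN block builds is serialized with the compilation unit, so later loads deserialize it instead of re-running the setup code:

    class QAST::MoaropMappings {}
    BEGIN {
        # Built while the module is compiled, then serialized into the
        # .moarvm file; the mainline no longer rebuilds it at startup.
        my %m;
        %m<add_i> := 'add_i';
        %m<sub_i> := 'sub_i';
        QAST::MoaropMappings.WHO<%mapping> := %m;
    }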
timotimo we currently use closures to create these core moarop mappers; serializing closures isn't a problem tho 10:26
why are these operations even still there now that we have the big hash of writer subs 10:27
i guess for argument count validation and such
jnthn And sometimes name discrepancies, I guess 10:28
timotimo that's right 10:29
not just sometimes, actually kind of a lot. many related to underscores 10:32
10:42 frost-lab left
timotimo anyway, if these two frames are improved to rely on serialization instead of mainline execution, imagine the milliseconds this could save 11:08
lizmat would take any msecs saved at startup
timotimo at least 1 millisecond from validation, and i don't have any measurement for how long these two mainlines take to execute by themselves
lizmat where would I need to make changes?
timotimo the trickier one would be in nqp's vm/moar/QAST/QASTOperationsMAST.nqp 11:10
huh. the MASTOps frame that i thought was the one taking so long to validate is the big begin block that makes up the entire moarvm/lib/MAST/Ops.nqp file 11:12
i don't see it get invoked by anything in the dump of MASTOps.moarvm either 11:14
does the frame being something else's outer cause it to be validated?
lizmat things like %hll_inlinability{$hll} := {} unless nqp::existskey(%hll_inlinability, $hll); 11:15
could benefit from using nqp::ifnull
reducing number of lookups
but I guess that's only once per op anyway
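A sketch of the nqp::ifnull rewrite lizmat is suggesting; it assumes nqp::atkey returns null for a missing key and nqp::bindkey returns the value it binds, and the $for_hll name is purely illustrative:

    # Before: an existskey check plus a second lookup on the next use.
    %hll_inlinability{$hll} := {} unless nqp::existskey(%hll_inlinability, $hll);

    # After: one atkey; only bind a fresh hash when the key was absent.
    my $for_hll := nqp::ifnull(
        nqp::atkey(%hll_inlinability, $hll),
        nqp::bindkey(%hll_inlinability, $hll, nqp::hash())
    );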
timotimo that lives in add_core_moarop_mapping or so? 11:16
as long as we run that code on startup a tiny improvement like that could help. if we can get it to run during compile instead, the win isn't as big 11:18
i mean, the win from making it run at compile time is big; the win from using ifnull in that case will not be as big 11:22
11:27 zakharyas left 11:44 patrickb left
timotimo cdn.discordapp.com/attachments/557...nknown.png - guess what, you can just nativecall into the TracyC functions, like ___tracy_emit_message 11:52
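A minimal Raku NativeCall sketch of the kind of call timotimo means; the 'TracyClient' library name is an assumption (use whatever Tracy client library you built with TRACY_ENABLE), and the signature mirrors Tracy's public C API ___tracy_emit_message(const char*, size_t, int):

    use NativeCall;

    # Bind Tracy's C message API; the library name is assumed, adjust to your build.
    sub ___tracy_emit_message(Str $txt, size_t $size, int32 $callstack)
        is native('TracyClient') { * }

    my $msg = 'hello from Raku';
    ___tracy_emit_message($msg, $msg.encode.bytes, 0);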
tracy is telling me "sampling is disabled due to non-native scheduler clock. are you running under a VM?" and i don't know what that means. 11:54
lizmat not running on a physical machine ? 11:55
but under valgrind? or in a container ?
timotimo yeah but this is on hardware 11:56
11:57 MasterDuke left
timotimo no VMs that i'm aware of 11:57
lizmat maybe Tracy knows something you don't ? 11:58
can you actually *see* the hardware ?
timotimo yeah 11:59
i mean, i see the enclosure it's inside of
12:02 domidumont left
lizmat ok :-) 12:05
nine timotimo: the kernel can use a variety of clock sources. What does /sys/devices/system/clocksource/clocksource0/current_clocksource say about it? 13:05
timotimo it says hpet 13:06
High Precision Extreme Timer?
the other available one is acpi_pm
for masterduke, tracy's sampling worked fine, i wonder what the clock source is on that machine? 13:07
13:15 Nornie28 joined 13:19 Nornie28 left 13:24 LizBot left 14:07 domidumont joined
nine on mine it's tsc 14:28
timotimo Traditional Source of Clocks 14:30
what clock sources are available on your machine, nine? 14:31
my desktop also has tsc active and has tsc, hpet, and acpi_pm available
(the system i'm working from right now is my laptop)
nine tsc hpet acpi_pm 14:34
14:47 zakharyas joined
timotimo the tracy dev points out that the flag "cap_user_time_zero" isn't set when doing perf_open; it's not clear how to get that set, but perhaps it's just NYI in the kernel for my hardware? 14:55
15:08 zakharyas left 15:09 zakharyas joined 16:07 domidumont left 16:22 domidumont joined 17:06 lizmat left 17:16 lizmat joined 17:18 domidumont left 17:24 LizBot joined 18:04 dogbert17 left 18:07 dogbert17 joined 18:08 LizBot left, LizBot joined 18:47 zakharyas left 19:56 zakharyas joined 20:18 zakharyas left 20:32 cog_ joined 20:43 MasterDuke joined
MasterDuke timotimo: /sys/devices/system/clocksource/clocksource0/current_clocksource is tsc for me 20:44
20:50 unicodable6 left, tellable6 joined 20:51 evalable6 joined, greppable6 left, releasable6 left, committable6 joined 20:52 shareable6 joined, sourceable6 joined 20:53 bisectable6 joined, statisfiable6 left 20:54 statisfiable6 joined 20:55 unicodable6 joined 20:56 tellable6 left 20:57 releasable6 joined 21:05 sourceable6 left, benchable6 left, sourceable6 joined, bloatable6 left 21:07 nativecallable6 joined, tellable6 joined, bloatable6 joined, greppable6 joined
MasterDuke .tell jnthn in case you missed them, some questions here colabti.org/irclogger/irclogger_lo...-04-30#l53 21:25
tellable6 MasterDuke, I'll pass your message to jnthn
21:49 evalable6 left, shareable6 left, statisfiable6 left, bisectable6 left, unicodable6 left, committable6 left, sourceable6 left, bloatable6 left, tellable6 left, releasable6 left, coverable6 left, greppable6 left, nativecallable6 left
MasterDuke oh, looks like i may have figured out this azure pipeline enough to get it a bit simplified, assuming people like the changes 21:51
21:51 unicodable6 joined, sourceable6 joined 21:52 notable6 joined, greppable6 joined, releasable6 joined, statisfiable6 joined, committable6 joined 21:53 evalable6 joined, tellable6 joined
MasterDuke spoke a little too soon... 21:55
22:36 Kaiepi left 23:15 Kaiepi joined