github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
00:15 Kaypie left, Merfont joined 00:16 p6bannerbot sets mode: +v Merfont 00:17 Kaypie joined, Merfont left 00:18 p6bannerbot sets mode: +v Kaypie 00:43 Ven`` left 04:07 KDr2 joined, p6bannerbot sets mode: +v KDr2 04:54 AlexDani` joined, p6bannerbot sets mode: +v AlexDani` 05:02 AlexDaniel left
nwc10 good morning, #moarvm 06:22
It's still dark outside, but I think that it's morning now
06:59 domidumont joined 07:00 p6bannerbot sets mode: +v domidumont
masak good consensus morning, nwc10 07:04
09:01 zakharyas joined 09:02 p6bannerbot sets mode: +v zakharyas 09:25 robertle_ joined 09:26 p6bannerbot sets mode: +v robertle_ 10:04 patrickb joined 10:05 p6bannerbot sets mode: +v patrickb
nwc10 jnthn: I built origin/pea with ASAN - it gets to Rakudo's tests, and then lots fail. The couple I checked manually were both regular SEGVs in spesh code. I didn't dig further 10:09
but no pavement pizza
10:16 lizmat joined, p6bannerbot sets mode: +v lizmat
jnthn nwc10: Odd, without ASAN it's happy...but what env do you have? 10:28
Or I should say, it's happier :) 10:30
nwc10 +#define MVM_ARRAY_CONC_DEBUG 1
+#define FSA_SIZE_DEBUG 1
+#define MVM_SPESH_CHECK_DU 1
MVM_SPESH_BLOCKING=1
MVM_SPESH_NODELAY=1
all that 10:31
jnthn Ah...probably the NODELAY throwing up more issues 10:37
I'll do that if I can't get any of the exploding spectests that only need BLOCKING to work
Uh, "work" as in "show up the problems"
Hmm, OK, the one in S32-io/open.t has a deopt happening in a frame that had two SRs, so this looks like a promising one to investigate :) 10:43
10:54 lizmat left 10:57 lizmat joined, p6bannerbot sets mode: +v lizmat
jnthn Ohh...just discovered that when we inline, we don't fix up the deopt_users entries... 11:05
In the incoming graph
timotimo well, that'd be rather bad :) :)
nwc10 this is a pea specific bug? Or general?
jnthn Well, nothing cared before now I suspect
timotimo maybe no removal of unused deopt things after the inlining pass 11:06
jnthn Well, depends how you look at it. If pea is the first thing that cares about the information, which I think it is, then it's pea specific. Though of course it's arguably general. :)
Technically we should always have, but calculating stuff that wasn't used is a waste of CPU cycles. :) 11:07
timotimo good point
nwc10 OK, possibly splitting hairs here, and not relevant as you know the cause, but my concept of "general" is "can one create a testcase that triggers it (however hairy) in the original codebase". 11:08
jnthn Ah
I don't think one can
Nothing used the info
So far as I can tell 11:09
nwc10 OK, but at the level of "internals working as implicitly documented" I'd agree that it's arguably "general" here.
timotimo nwc10: how hairy is "if $compilerversion eq "12345" { Pointer.new(0).deref }? :)
nwc10 (not that this matters either)
timotimo: not sure. In that there are (also) documented ways to make Perl 5 SEGV (or abort) which one can trigger if one is allowed to feed it code 11:10
timotimo but this makes everything a general general bug :P
since you can use any aspect of the runtime environment to test and then just null pointer deref for a crash :P 11:11
but ... *shruuuuuuug*
11:11 domidumont left
nwc10 ah OK, I see what you mean 11:11
no, I think that that's cheating
timotimo "no matter how hairy, except when you're cheating" ;) 11:12
but yeah, it's definitely cheating
nwc10 I guess I'm implicitly thinking that "fiddling with the internals in a way that breaks the intenral's invariants is cheating"
timotimo much like eating perl6 spec tests and just outputting "1..1\nok 1 -\n"
nwc10 er, grammar
(mine)
jnthn: with MVM_SPESH_NODELAY unset the spectest fails I see are all from (I think) the same backtrace in log_parameter src/spesh/log.c:94 11:13
timotimo t/09-moar/00-misc.t crashed for me during "make test", but it won't crash in isolation 11:18
oh yikes
my ram is filled to the brim 11:19
that could have something to do with it
jnthn No, that one fails for me also :)
Darn, fixing up those indices doesn't make my problem go away
nwc10 does lunch?
jnthn So there's more do it than that
nwc10: Only temporarily :P
nwc10 beer? "longer, but still temporarily"? 11:20
jnthn Ohh...argh...
Of course this isn't quite going to do it
Because the object we did SR on is in the caller
We've then inlined something
So any deopt point inside of that inline would implicate the object in the inliner 11:21
So, we already store the deopt index of the original invoke in the inlinee table 11:35
When we seen an inlined deopt point, we can search backwards to find the nearest inline start index annotation, which will then get us a table entry 11:36
Though handling multi-level inlines will need a bit of care here
Also this feels a tad less than optimal, but I guess "first make it correct" 11:38
Ah, also: when we're taking an existing spesh candidate, we don't calculate its deopt usages. 11:43
(An existing one to inline, I mean)
Darn, this really needs some thinking. 11:44
I also wondered momentariliy if there's a simplification to be had because in such cases we'd have already done the PEA/SR on the inlinee, but no, that doesn't help at all, because the allocation site might be in the inlinee and the inlining makes SR possible because now the thing doesn't escape. 11:50
(Because its usages are all in the inliner)
I think this ties into the deeper long-standing problem that we can't optimize inside of inlinees much after inlining them because we don't understand the usages of things across deopt points 11:51
I'd thought that we could perhaps get deopt handling for scalar replaced stuff to work out without having to solve that more general case
Alas, I don't think we can.
I think we'll need to solve that problem first.
So that'll be a bit of a detour :) 11:52
timotimo sounds like potentially a big bit
maybe a nibble?
jnthn Well, we need to do it at some point I guess... :) 11:53
Now I have to figure out how to do it
I think perhaps the best way is to store something like a list of (bytecode offset of a write instruction, number of deopt users, ...that number of indices...) 11:54
And then when we are parsing the bytecode, we can store the mapping of instruction to bytecode offset, and then use that to create the deopt users list 11:56
timotimo i wonder if write_array_int from serialization is ever "expensive enough to warrant an optimization", because it uses MVM_repr_at_pos_i, which is virtual, but we're most likely going to see a majority of VMArray there
jnthn timotimo: I think all of those can probably get JIT love at some point 11:57
timotimo oh? how so? 11:58
jnthn I figure we can just poke the value directly into the array storage after a bounds check, and just fall back if the bounds check fails 11:59
timotimo i'm currently talking about serialization.c
jnthn Oh... 12:00
Sorry, I misread and thought this was about bytecode writing
timotimo no, right now i'm looking at the serialized blob :) 12:01
jnthn ah, ok
timotimo i kind of believe we don't know terribly much about the serialized blob itself
so the first order of business for me is to see if the int arrays we serialize tend to be good for being compressed, for example by using differential encoding with our varints
that ought to work well for line number annotations, for example 12:02
at the moment we'd be using at least 2 bytes for every line past 127 or something
jnthn I think annotations are in their own segment 12:03
timotimo on the other hand, how much data do the annotations even take? haven't looked yet.
yep, they are
jnthn lunch, bbiab 12:04
timotimo cool. in the core setting we have 1458 empty int arrays and 765 non-empty int arrays 12:17
we're writing a type tag and varint for the size, i think we can save a byte for every empty int array by having an "empty int array" tag 12:18
on the other hand, ~1.3kbytes saved for a 14 megabyte file ... maybe not that impressive :)
nwc10 How many bits are free in the array type tag? 12:19
IIRC (and this was 2 years ago or more) there were 4 bits free at the low order of somewhere
timotimo 252 intarr: 2 elems 1 min 1 max 1 absdiffmax
237 intarr: 1 elems 1 min 1 max 1 absdiffmax
100 intarr: 2 elems 0 min 1 max 1 absdiffmax
53 intarr: 1 elems 2 min 2 max 2 absdiffmax
you mean we should have a few "custom int arrays" in there? :)
nwc10 not sure. re-read your previous statement more carefully and "empty array" consuming 1 type might be better than something custom consuming a bit 12:20
timotimo we have a byte and we're using numbers 1 through 13 at the moment 12:21
so 4 bits out of 8 bits used so far
we can have an assortment of ready-made int arrays, and maybe there's object arrays that are all vmnull except for one slot, for example 12:22
interesting. that int array writing code isn't used at all for most of the steps in rakudo compilation 12:24
so i'll have to look around for more
right, even VMArray's serialize doesn't just write_ref itself 12:26
12:26 zakharyas left, lucasb joined, p6bannerbot sets mode: +v lucasb
timotimo perhaps that's because write_ref doesn't handle anything stable related? 12:26
12:27 domidumont joined 12:28 p6bannerbot sets mode: +v domidumont
timotimo huh, there's still really not that many at all 12:33
now running the core setting with VMArray's serialize instrumented 12:34
intvmarr: 2 slot 27 elems 78796800 min 1483228800 max 1483228800 absdiffmax - what is that, like unix time as an unpacked Rat? :) 12:42
oh, 27 elems, maybe it's the leap seconds table 12:43
who knows how many vmarrays of obj contain only P6int objects, or scalars with a P6int in them, or a boxed bigint ... 12:51
jnthn OK, so...time for a new branch to play with this deopt info preservation... 13:07
13:13 AlexDani` is now known as AlexDaniel
timotimo gen/moar/Perl6-Grammar.nqp 13:25
serialized blob size: 4172256 (4074 kbytes)
Stage mast : serialized blob size: 5427488 (5300 kbytes) 13:27
i would have expected this to be bigger, tbh
AlexDaniel anybody still looking at R#2564 ? 13:38
synopsebot R#2564 [open]: github.com/rakudo/rakudo/issues/2564 [āš  blocker āš ] "Cannot invoke this object (REPR: Null; VMNull)" in t/04-nativecall/06-struct.t on mipsel
AlexDaniel nine, timotimo, jnthn: ā†‘ ?
it's likely too late, but stillā€¦
timotimo Stage mast : serialized blob size: 5427488 (5300 kbytes) 13:48
15.205
Stage mbc : annotations_size: 686988
0.337 13:49
really not small, but also not quite gigantic savings to be had
jnthn AlexDaniel: I'll see if I can find time before the next release, if nobody else gets to it, but nine++ looks to be a good way into debugging it
Geth MoarVM/persist-inline-deopt-info: 0fef86cbe8 | (Jonathan Worthington)++ | 4 files
Save deopt usage info when writing bytecode

When we encounter an instruction and the register it writes has depot users, then persist the bytecode location and the using deopt indices. This is so we'll be able to reconstruct it when doing inlining.
A future improvement would be to only keep it when the bytecode is within the inline size limit, using the same rules as the inliner itself uses to make that decision.
13:51
jnthn That's in theory saving the info. Now to try for the reconstruction.
13:59 zakharyas joined 14:00 p6bannerbot sets mode: +v zakharyas
AlexDaniel c: 66f8ee0^,66f8ee0 index $_, 0 for 1 .. 10000; say now - BEGIN now 14:16
committable6 AlexDaniel, Ā¦66f8ee0^: Ā«0.132468ā¤Ā» Ā¦66f8ee0: Ā«1.4713747ā¤Ā» 14:17
AlexDaniel R#2593
synopsebot R#2593 [open]: github.com/rakudo/rakudo/issues/2593 [regression][āš  blocker āš ] 'index' routine shows severe performance decline in a loop
patrickb When including moar.h, is it good style to rely on it including uv.h (or any other lib in there) or should one double include? 15:19
15:21 Kaypie is now known as Kaiepi 15:40 Ven`` joined 15:41 p6bannerbot sets mode: +v Ven`` 15:45 MasterDuke joined, p6bannerbot sets mode: +v MasterDuke, MasterDuke left, MasterDuke joined, herbert.freenode.net sets mode: +v MasterDuke, p6bannerbot sets mode: +v MasterDuke
MasterDuke i was just struck by R#2593 and added some fprintfs of (result|gs)->num_guards to append_guard() and free_dead_guards() 15:49
synopsebot R#2593 [open]: github.com/rakudo/rakudo/issues/2593 [regression][āš  blocker āš ] 'index' routine shows severe performance decline in a loop
MasterDuke i don't know of any particularly good way to associate an allocate with a free, but if i sort the numbers and look at the tail, it doesn't exactly match up 15:50
hm, though only the very last set of numbers don't match. the numbers appear to be in groups of 4, so the end of the allocated list has 4 22140s and 4 22143s, but the end of the freed list has 4 22137s and 4 22140s 15:53
jnthn MasterDuke: I bet the set of guards the plugin installs somehow never match the value in question 15:54
MasterDuke hm, i'm a little worried that the mismatch is just because the last set that's allocated is never freed because the program exits 15:56
would it be safe to use the address of the MVMSpeshPluginGuardSet (e.g., gs or results) as a unique/shared key across the allocate and free? 15:57
16:01 robertle_ left
MasterDuke jnthn: where is the 'return type checking spesh plugin'? 16:11
jnthn Rakudo repo 16:12
src/vm/moar/spesh-plugins.nqp or something like
MasterDuke thanks
i'd be lying if i said i understood what's happening there (i found the right plugin). you think the error is in that code, or the moarvm code it triggers? 16:15
jnthn I think in that code 16:18
There's various instructions that set up guards, which are the conditions under which the plugin result will be used 16:19
But if it somehow makes a set of conditions that don't match what was actually received, then it'll just be invoked again and again, and keep trying to add again and again
Geth MoarVM/persist-inline-deopt-info: e6bf5697cb | (Jonathan Worthington)++ | 3 files
Recreate deopt users when inlining from bytecode
16:24
MoarVM/persist-inline-deopt-info: 1742b35525 | (Jonathan Worthington)++ | src/spesh/inline.c
Fixup deopt_users deopt indexes when inlining
MasterDuke jnthn: thanks, i'll have to ponder that a while 16:26
16:26 lucasb left
MasterDuke everyone: i make no promises at all, don't wait on me if you know what's wrong and are inclined to fix 16:28
jnthn I think my brain will be taken up with this deopt stuff for a little bit :) 16:30
In theory those 3 commits mean we now have deopt usage info for inlinees so can start to optimize them a bit more aggressively. 16:31
MasterDuke is the ` || nqp::istype($ret, Nil)` correct here github.com/rakudo/rakudo/blob/mast...s.nqp#L214 ? the other *_coerce subs don't have that, but their corresponding non-coersion subs do 16:37
jnthn Yes, I think Nil can go through in that case correctly 16:39
MasterDuke k 16:40
16:45 domidumont left
MasterDuke huh, adding an nqp::say in the spesh plugin causes the rakudo build to die with `Method 'new' not found for invocant of class 'X::Dynamic::NotFound'` in stage parse 16:57
timotimo MasterDuke: have you considered the debugserver for this task? :) 17:00
17:01 domidumont joined 17:02 p6bannerbot sets mode: +v domidumont
jnthn Hm, this is odd. When I enable dead instruction elimination inside of inlined basic blocks, there's a few new failures. In one of them, the difference in the frame that seems to be the significant one is just 3 deleted `set` instructions. But the error is `const_n32 NYI` 17:11
And in a frame some way down the call stack
It did do a deopt all earlier involving the frame down the stack, and it was triggered from the frame that bissecting points out 17:15
But I can't see yet how on earth deleting 3 set instructions in mixin results in the deopt of !reduce a few frames down the stack ending up broken 17:17
MasterDuke timotimo: hm, the debugserver can access code inside the spesh plugins? 17:18
jnthn MasterDuke: I'd expect that'd work since it's just normal code 17:21
17:22 zakharyas left
timotimo it should be able to, yeah 17:23
i don't see a reason why not
MasterDuke it was my bad. changed `nqp::say($orig_type)` to 'nqp::say($orig_type.HOW.name($orig_type))` and now everything is working as expected 17:30
any way to tell where/why the plugin triggered? or is somethere where the debugserver is the best option? 17:32
timotimo: p6-moarvm-remote? that's the debugserver? 17:33
17:34 coverable6 left
timotimo that's just the debug client library 17:34
17:34 coverable6 joined
timotimo you'll want the app that goes with it, and start moarvm with --debug-port and probably also --debug-suspend 17:34
17:35 p6bannerbot sets mode: +v coverable6
jnthn dinner time, will continue debugging the dead instruction deletion from inlines tomorrow to try and figure out what on earth is going on... 17:38
MasterDuke fwiw, here's a list of the $orig_type and $type as seen by the plugin gist.github.com/MasterDuke17/0ebc2...0c7f9f971b 17:41
17:42 Ven`` left
MasterDuke unique list of those pairs 17:42
no more than 5 lines for most of the pairs, 29524 lines of `Orig_type = Int:D and type is now = Int` 17:43
17:54 zakharyas joined 17:55 p6bannerbot sets mode: +v zakharyas
MasterDuke timotimo: `Missing serialize REPR function for REPR VMException (BOOTException)` when trying to start the debug client 17:57
that's a SORRY! 17:58
timotimo oh? well, that isn't so good
MasterDuke might it not like the nqp::say i left in the spesh plugin?
timotimo i'll be afk for a little bit, but i'll look into it afterwards
hm, that's not very likely, but not completely impossible 17:59
the debug server client doesn't have to run the same version as the server
bbl
MasterDuke same error when using 2018.06 to run the client 18:00
later...
if i reduce the number of iterations in the example code from 10k down to 1k then the number of `Orig_type = Int:D and type is now = Int` lines is only 3280 18:10
18:15 releasable6 left, notable6 left, quotable6 left, shareable6 left, undersightable6 left 18:16 undersightable6 joined, quotable6 joined, shareable6 joined, notable6 joined, releasable6 joined, p6bannerbot sets mode: +v undersightable6, p6bannerbot sets mode: +v quotable6, p6bannerbot sets mode: +v shareable6, p6bannerbot sets mode: +v notable6, p6bannerbot sets mode: +v releasable6
timotimo MasterDuke: it could be from Terminal::ANSIColor missing, perhaps? 18:22
maybe i didn't build the recognition properly or something
MasterDuke oh, that sounds familiar
timotimo could you try $! = Nil; below the try require in lib/App/Moarvm/Debug/Formatter.pm6?
MasterDuke works now that i added a -I pointing to Terminal::ANSIColor 18:23
timotimo can you also test that change? 18:24
MasterDuke yep, works also 18:25
timotimo cool, i'll push that later, then 18:29
MasterDuke more info from that common case: `Orig_type = Int:D and type = Int and definite_check = 1 and nqp::iscont($rv) = 0 and rv = Nil and nqp::isnull($coerce_to) = 1` 18:38
which, if i'm reading the plugin code correctly, means `nqp::speshguardconcrete($rv);` is called and then `&identity` is returned (instead of `make-unchecked-coercion($rv, $coerce_to);`) 18:39
huh, why is $rv Nil? shouldn't it be Int:D? 18:46
oh right, index returns Nil if the needle isn't found 18:55
19:01 zakharyas left
MasterDuke huh. turning `nqp::speshguardtype($rv, $rv.WHAT);` into `unless nqp::istype($rv, Nil) { nqp::speshguardtype($rv, $rv.WHAT); }` drops the runtime from 4s to 2.6s and mem from 2.4g to 1.6g, but that's still way higher than 2018.06 rakudo, where it's 0.4s and 88m 19:39
oh hey, i see `BAIL: Unexpected opcode in invoke sequence: <speshresolve>` in the spesh log. there is a jit case for sp_speshresolve though... did it get renamed? 19:47
afk for a while 19:58
20:02 domidumont left
timotimo speshresolve isn't allowed to remain in the post-spesh code i don't think 20:03
20:04 brrt joined 20:05 p6bannerbot sets mode: +v brrt
brrt ohai #moarvm 20:05
timotimo ohai brrt 20:07
brrt ohai timotimo :-) 20:08
I'm trying to work out how to specify the register class in the tiler... 20:09
I have encoded that with a convoluted bit-packing hack that I'm now sorry about 20:16
nwc10 good *, brrt 20:18
brrt good *, nwc10 20:19
correction, that I'm *really* sorry about 20:21
(and now, I will have to unpack it)
note to self: bitpacking really is only useful in some cases 20:23
yay, the expr jit compiler will correctly insert a store_num/load_num pair for a num-typed 'set' 20:35
only trouble is that it crashes..
actually, why does that crash in the tiler 20:36
patrickb moarvm does not include a full libuv, right? 20:42
brrt what's a 'full' libuv? I was under the impression it does 20:43
so I'm wondering what more things we don't include that you are thinking of :-)
patrickb looking at the Makefile it only compiles a selected set of c files into moarvm
timotimo we only include the libuv parts for the relevant architecture/OS i think?!
patrickb I'm interested in uv_exepath which returns an absolute path for the current executable 20:44
brrt oh, I see
yeah, we kind of inline it
patrickb That's spread over several files: libuv/src/unix/{obenbsd.c procfsexepath.c ...} 20:45
brrt I see 20:46
patrickb Would it be feasable to include those, just so I can use that one function?
Well, not even from moarvm, but from rakudo :-P
I'm writing a replacement for the sh/bat rakudo runners in C and I need a way to retrieve the executables location. That is non-trivial. 20:51
brrt ooc, why? 20:55
(not the runners, I get that)
I still have a plan to make ELF-loadable MoarVM bytecode
patrickb There is no canonical way in POSIX to retrieve the program name and/or path. On linux use proc if you can, do argv[0] mangling if you can't. 20:57
After that, do symlink resolution.
Other OS' are different. 20:58
Or was that why I need the location in the first place? 20:59
brrt yes
the last bit :-)
patrickb Relocatability
brrt what aspect of it?
I'm curious. I'd always imagined one day solving this by somehow linking all of rakudo into the executable 21:00
Although that day is always far off :-)
anyway, /me afk
patrickb During load perl6 needs access to several files. the main perl6.moarvm and several others.
brrt (I need to update a bunch of templates to ensure that we use load_num where needed) 21:01
patrickb To locate those I need to have some place to start my search from. That'd be the executables location.
brrt I understand
patrickb I see two alternatives to move forward: 21:03
1. Include several files into the MoarVM build that I then use from my runner program.
2. Do it on my own using e.g. github.com/gpakosz/whereami 21:04
21:05 brrt left
patrickb I'd be grateful for opinions. I'm done for today, but will look for replies tomorrow. 21:06
'night everyone!
21:07 patrickb left 21:23 Kaiepi left 21:24 Kaiepi joined, p6bannerbot sets mode: +v Kaiepi 21:27 Kaiepi left 21:29 Kaiepi joined 21:30 p6bannerbot sets mode: +v Kaiepi 21:31 lizmat left 22:39 MasterDuke left 23:21 Ven`` joined 23:22 p6bannerbot sets mode: +v Ven`` 23:47 MasterDuke joined, p6bannerbot sets mode: +v MasterDuke, MasterDuke left, MasterDuke joined, herbert.freenode.net sets mode: +v MasterDuke, p6bannerbot sets mode: +v MasterDuke
MasterDuke timotimo: should speshresolve still be jitted though? 23:48
and if you look at a profile of the example from the issue `index $_, 0 for 1 .. 10000`, pretty much all of the index calls are un-jitted 23:49
what i don't understand is why the plugin is getting called so many times 23:50
(heh, that's just one of the many things i don't understand, don't have time to list them all) 23:51