github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
00:09 salamanderrake joined 00:10 p6bannerbot sets mode: +v salamanderrake 00:13 salamanderrake left 00:22 Lildirt joined, Lildirt left
timotimo froggs!!! 00:38
00:38 purrdeta6 joined 00:39 p6bannerbot sets mode: +v purrdeta6 00:44 purrdeta6 left 00:57 banzaikitten4 joined, Kaiepi left 00:58 p6bannerbot sets mode: +v banzaikitten4, banzaikitten4 left 02:04 Sabotender29 joined 02:05 p6bannerbot sets mode: +v Sabotender29 02:06 Sabotender29 left 02:08 Guest27421 joined 02:09 p6bannerbot sets mode: +v Guest27421 02:12 Guest27421 left 02:30 AlexDaniel joined, p6bannerbot sets mode: +v AlexDaniel 02:31 AlexDaniel left, AlexDaniel joined, barjavel.freenode.net sets mode: +v AlexDaniel, p6bannerbot sets mode: +v AlexDaniel 03:06 fossxplorer3 joined 03:07 fossxplorer3 left 03:09 Kaiepi joined, p6bannerbot sets mode: +v Kaiepi 03:26 fireworks20 joined, fireworks20 left 03:56 morsik17 joined 03:57 p6bannerbot sets mode: +v morsik17, morsik17 left 04:10 LookingGlassSec joined 04:11 p6bannerbot sets mode: +v LookingGlassSec 04:15 programmerq22 joined 04:16 p6bannerbot sets mode: +v programmerq22, programmerq22 left, LookingGlassSec left 05:18 Matrixiumn joined 05:19 p6bannerbot sets mode: +v Matrixiumn 05:20 Matrixiumn left 05:42 Kilo`byte joined 05:43 Kilo`byte left 06:07 robertle joined 06:08 p6bannerbot sets mode: +v robertle 06:33 Kaiepi left 06:44 Kaiepi joined 06:45 p6bannerbot sets mode: +v Kaiepi 07:58 zakharyas joined 07:59 p6bannerbot sets mode: +v zakharyas 08:43 dodobrain14 joined, dodobrain14 left 09:40 robertle left 09:41 robertle joined 09:42 p6bannerbot sets mode: +v robertle 09:47 andries4 joined 09:48 p6bannerbot sets mode: +v andries4, andries4 left
Geth MoarVM: 7507090328 | (Jonathan Worthington)++ | 4 files
Make spesh thread more GC-responsive

By making us able to mark the spesh graph more completely, and making the one we're currently working on available for GC marking, we can introduce GC sync points at a number of locations in the optimization process. This can notably reduce the GC latency when optimizing larger graphs.
Didn't measure repeatedly, but spectest time decreased by ~8s and CORE.setting compilation by ~2s; the latter feels a bit on the high side, so I'd take these numbers with a pinch of salt.
10:51
10:52 robertle left
jnthn timotimo: Would be interesting to see if your GC latency profiling looks any different after ^^ :) 10:53
10:54 robertle joined 10:55 p6bannerbot sets mode: +v robertle 11:08 Arirang joined, Arirang is now known as Guest34255, Guest34255 left
masak .oO( take these numbers with a punch of salt ) 11:13
11:15 travis-ci joined, travis-ci left
dogbert2 only 24 degrees centigrade outside :) 11:17
11:33 TriJetScud13 joined, zakharyas left, p6bannerbot sets mode: +v TriJetScud13, TriJetScud13 left 11:52 robertle left 12:07 dp318 joined, dp318 left
lizmat only 34.6 here outside :-( 12:32
timotimo jnthn: i find it surprising that we wouldn't have to also mark facts known values and types 12:33
jnthn We do mark them?
timotimo but i guess if something's in there, it wouldn't have died in the nursery?!
oh!
hah, i see that now
i should have expanded the code before saying that
Geth MoarVM: 2f36e2666a | (Jonathan Worthington)++ | src/gc/allocation.c
Add branch hint macros to nursery allocation

Seems to shave a little off various programs; the effect was visible with callgrind.
12:39
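For readers unfamiliar with branch hints, here is a minimal C sketch of what hinting a bump-pointer nursery allocation can look like. The `LIKELY`/`UNLIKELY` macros, the `Nursery` struct, and both functions are hypothetical names chosen for illustration, not MoarVM's actual code. `__builtin_expect` is the GCC/Clang builtin such macros are commonly built on. A real VM would trigger a collection when the nursery is full; this sketch just returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hint macros; GCC and Clang provide __builtin_expect. */
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(c)   __builtin_expect(!!(c), 1)
#define UNLIKELY(c) __builtin_expect(!!(c), 0)
#else
#define LIKELY(c)   (c)
#define UNLIKELY(c) (c)
#endif

/* Hypothetical nursery: a fixed region handed out by bumping a pointer. */
typedef struct {
    uint8_t *alloc;  /* next free byte */
    uint8_t *limit;  /* one past the end of the region */
} Nursery;

static void nursery_init(Nursery *n, void *region, size_t size) {
    n->alloc = (uint8_t *)region;
    n->limit = n->alloc + size;
}

/* Returns NULL when the region is exhausted (assumption: a real VM
 * would run a garbage collection here instead). Callers are assumed
 * to pass sizes that are already suitably aligned. */
static void *nursery_allocate(Nursery *n, size_t size) {
    assert(size > 0);
    /* Running out of space is the rare case, so mark it unlikely. */
    if (UNLIKELY((size_t)(n->limit - n->alloc) < size))
        return NULL;
    void *result = n->alloc;
    n->alloc += size;
    return result;
}
```

With GCC or Clang the hint typically influences code layout so that the expected path falls through without a taken jump. Whether that helps a given program is something to measure, as the commit message did with callgrind.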
diakopter .tell brrt well hopefully the static analysis wouldn't even need to expand the macros..? 12:42
yoleaux diakopter: I'll pass your message to brrt.
12:43 zakharyas joined 12:44 p6bannerbot sets mode: +v zakharyas 12:49 harrow` joined 12:50 p6bannerbot sets mode: +v harrow`, camelia left 12:52 shareable6 left, nativecallable6 left 12:54 reportable6 left, benchable6 left, statisfiable6 left, greppable6 left 12:56 camelia joined 12:57 p6bannerbot sets mode: +v camelia, robertle joined 12:58 p6bannerbot sets mode: +v robertle
Geth MoarVM: 67a9afef60 | (Jonathan Worthington)++ | 3 files
Have sp_fastcreate do a direct nursery allocation

Since we should never use this when we're in gen2 allocation mode. Saves a branch and a function call per fastcreate.
12:58
13:01 travis-ci joined, travis-ci left
timotimo jnthn: i tried to come up with a spesh plugin for p6assign to a Proxy, but i think i was still missing some important detail; it looks like what we actually call to invoke the proxy has to have very slightly different arguments, which is why the invokespec has two tiny subs that create a Scalar and call &!FETCH with it 13:05
got a hot tip for me?
lizmat .oO( don't go outside now )
timotimo :) 13:06
jnthn Yeah, it's doing an extra wrapping of the thing I think 13:07
timotimo so my first instinct was to say "build a little closure inside the spesh plugin's run and return that"
but i don't think a closure like that is optimizable by spesh then?
jnthn That can work, especially as we can inline closures
timotimo i thought we can only do that if it's the outer we inline it into? 13:08
jnthn No
timotimo or was that the magic trick for "getlexvia"?
jnthn Can do it more generally
Right, the via trick :)
That was one of those things where when I thought of it, I wondered why I didn't think of it 2 years beforehand... :)
timotimo i'll give it a try again, perhaps
annoyingly, we mostly emit p6store in typical situations where we have proxies 13:09
13:09 iczero11 joined
timotimo like when assigning to the result of a method call 13:09
13:10 p6bannerbot sets mode: +v iczero11
jnthn Ho, is that not yet using a spesh plugin? 13:11
Can't always remember what I did and didn't get around to :) 13:12
timotimo only p6assign i think 13:13
let me check again
13:14 iczero11 left
timotimo well, there's an "assign" spesh plugin, but not a "store" one 13:14
jnthn Aha 13:19
Time to write it! :)
timotimo i wonder if i should have a look 13:20
buses and such
13:21 travis-ci joined, travis-ci left 13:25 digitalcold15 joined, digitalcold15 left
timotimo oooh, stage parse is below 60s again 13:29
13:34 lizmat left
jnthn :) 13:37
13:43 Cronus29 joined 13:44 p6bannerbot sets mode: +v Cronus29 13:45 Cronus29 left 13:50 reportable6 joined 13:51 p6bannerbot sets mode: +v reportable6 13:53 beaky24 joined, p6bannerbot sets mode: +v beaky24 13:54 beaky24 left 14:48 brrt joined, zenguy- joined, p6bannerbot sets mode: +v brrt
timotimo ok, after having a "breakfast" i can try to figure out why my current attempt gives me No such method 'CALL-ME' for invocant of type 'Bool' 14:49
14:49 p6bannerbot sets mode: +v zenguy-, zenguy- left
timotimo MoarVM oops: Too many levels of inlining popped 14:51
when i try to MVM_SPESH_NODELAY the code
ah, the data doesn't actually land in lexicals, only registers 14:55
that makes the debugserver less helpful
OK, that's interesting. instead of getting the FETCH that i've assigned to the object in the constructor, i apparently resolved Proxy's actual FETCH method instead 14:57
oh, hah 14:58
i've been consistently messing up FETCH and STORE
i'm trying to write a spesh plugin for STORE, not for FETCH!
though FETCH will also want one
oh, no, i think i'm actually wrongfully deconting somewhere, thus causing FETCH to be run when i actually wanted to work with the proxy object itself 14:59
15:03 MartesZibellina joined 15:04 p6bannerbot sets mode: +v MartesZibellina, MartesZibellina left 15:06 ski_ joined 15:07 p6bannerbot sets mode: +v ski_, ski_ left
timotimo jnthn, do i need something like speshguardsf to ensure i'm getting the same code object? i've first tried speshguardobj, but that gets very unhappy if you "Proxy.new" many, many times 15:11
15:15 diakopter left
timotimo ooh, getstaticcode could be the right one? 15:16
hm, but i think i'd have to have that as a spesh-recorded instruction, too 15:17
otherwise i can't guard against the result
got a preliminary implementation of that guard 15:26
well, need to implement the guard, i only added the guard instruction 15:28
15:28 brrt left 15:36 diakopter joined, p6bannerbot sets mode: +v diakopter
timotimo OK, it runs, that's good. 15:47
now for timing 15:50
aaw, it's barely faster 15:52
m: say <4.34 4.47 4.38 4.36> / 4 15:54
camelia 1
timotimo m)
m: say <4.34 4.47 4.38 4.36>.sum / 4
camelia 4.3875
timotimo m: say <4.73 4.71 4.64 4.64>.sum / 4 15:55
camelia 4.68
timotimo m: say 4.3875 / 4.68
camelia 0.9375
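The arithmetic in the evals above, restated as a small C sketch. The first eval printed 1 because a Raku list used as a number evaluates to its element count, so `<...> / 4` was 4 / 4; adding `.sum` gives the intended mean. The function names `mean` and `fraction_saved` are made up for this example, and reading the first set of timings as the run with the plugin is an assumption, since the log does not label them.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double mean(const double *xs, size_t n) {
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += xs[i];
    return sum / (double)n;
}

/* Fraction of run time saved when going from old_mean to new_mean:
 * 1 - new/old. A ratio of 0.9375 therefore means 6.25% saved. */
static double fraction_saved(double old_mean, double new_mean) {
    return 1.0 - new_mean / old_mean;
}
```

Fed the two sets of timings from the log, `mean` gives 4.3875 and 4.68, and `fraction_saved(4.68, 4.3875)` gives 0.0625, which matches the 0.9375 ratio printed above.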
timotimo it didn't end up inlining the actual fetch sub that was put into the proxy, just the sub that creates the Scalar and passes that on 15:56
15:58 lizmat joined 15:59 brrt joined
timotimo maybe it could only be better if we did a second round of logging after the spesh plugin has been resolved and inlined 15:59
15:59 p6bannerbot sets mode: +v lizmat, p6bannerbot sets mode: +v brrt 16:00 Mercster24 joined, Geth left, p6bannerbot sets mode: +v Mercster24 16:01 Geth joined, p6bannerbot sets mode: +v Geth
timotimo jnthn: should i upload PRs for moar, nqp, and rakudo for this spesh plugin? 16:03
or is 7% not worth the hassle?
not even 7% 16:04
16:05 Mercster24 left
jnthn Send PRs, I can review them and see what complexity we get for the speedup 16:06
timotimo sure thing. maybe you also see what could make spesh better at figuring out the inner inline
Geth MoarVM/speshplugin_guardstaticcode: 8dabbcc01b | (Timo Paulssen)++ | 8 files
add speshguardgetstaticcode, for closures and such

lets a spesh plugin figure out if a given object, such as the $!do attribute of a Code, is "the same" across invocations - ignoring what exactly it closes over.
16:09
16:09 tigermousr22 joined 16:10 tigermousr22 left
Geth MoarVM: timo++ created pull request #932:
add speshguardgetstaticcode, for closures and such
16:10
16:11 robertle left
timotimo github.com/rakudo/rakudo/pull/2189 - also has the link to the other pull requests in the description 16:14
brrt \o 16:32
yoleaux 12:42Z <diakopter> brrt: well hopefully the static analysis wouldn't even need to expand the macros..?
timotimo o/ brrt 16:34
sp_fastalloc is simple enough to have the most likely case inlined into the expr template, IMO 16:35
there's a check for size > 0 in the function version, but on the jit level we already know the size exactly 16:36
brrt: any idea for something like MVM_LIKELY or MVM_UNLIKELY at the exprjit level?
jnthn m: say 3.68 / 7.34 16:41
camelia 0.501362
timotimo oh wow
that's a nice ratio
jnthn Yeah, it's for gist.github.com/jnthn/196263d7e888...b2e42b2008 16:42
My changes today have made that run in half the time 16:43
Some unpushed
16:44 robertle joined
brrt not yet timotimo 16:44
i'm not sure how MVM_LIKELY is compiled
16:44 p6bannerbot sets mode: +v robertle
timotimo no clue. does X86_64 encode branch predictor hints into the assembly implicitly or explicitly? 16:44
i.e. is gcc flipping the then and else branches around so the more likely one is the one that needs no goto? 16:45
jnthn I at first thought the latter
Oh, it may really just be that
Whatever it does seems to have an effect on instruction count though
timotimo hm, "instruction fetch" is what valgrind counts, isn't it? 16:46
jnthn Thing so 16:47
*Think
timotimo i imagine it perhaps has to do with fetches being not byte-per-byte, but whole cache-lines at once?
and if the unlikely then branch starts in the same cache line, it gets fetched, and the likely else branch might start in the middle of the right cache line, too
brrt hmm, that's cute 16:49
we'd need either a new node, or a node annotation (flags?) 16:58
timotimo probably having a node would be easiest 16:59
but feel free to push that off to the medium future 17:00
17:03 supercool2 joined 17:04 p6bannerbot sets mode: +v supercool2, supercool2 left 17:08 zakharyas left, zakharyas joined 17:09 p6bannerbot sets mode: +v zakharyas 17:18 Kaiepi left 17:21 zakharyas left 17:22 Kaiepi joined 17:23 p6bannerbot sets mode: +v Kaiepi
Kaiepi thoughts on implementing asm jit support for x32? 17:37
brrt sure 17:48
we hate x32
brrt = we, in this case :-P 17:49
I'm going to qualify that a bit better... 17:50
I think when we say x32, we mean the 'run 64 bit code with 32 bit addresses' - and presumably integers as well?
it's a windows thing iirc 17:51
I actually think that is a pretty sane idea, given how much better x86-64 is than x86
but, on the other hand....
17:52 nyuszika7h3 joined
brrt here's my general hypothesis on perl6, adoption, and performance 17:52
there's two places where performance is going to matter for perl6 adoption
1): on developer laptops
17:52 p6bannerbot sets mode: +v nyuszika7h3
brrt 2): on production servers 17:53
17:53 nyuszika7h3 left
brrt all other platforms, including ARMv7, AArch64, x32, MIPS, PowerPC, whatever 17:53
(production servers are going to have Xeon-style processors, typically virtualized) 17:54
all other platforms don't really matter. Not in this phase of adoption
if we ever get as big as perl5, then there will be a horde of users clamoring for their obscure platforms. And we'll have a moarvm porters group who will make sure that happens
timotimo there's also x32 for linux 17:55
it's also not very difficult for a perl6 program to balloon up to 4 gigs %) 17:56
brrt that, too
unfortunately 17:57
timotimo brrt: do you think porting the fast path of sp_fastcreate to exprjit will lose us the benefits gained from adding the branch hint macros to the allocate_nursery function? 18:00
Kaiepi so that's a "keep it on the sideburner until later?"
brrt as far as i'm concerned, yes 18:01
and maybe at that time x32 might no longer be a thing
timotimo: I'm not sure; the only way to know is to test
18:02 lucasb joined 18:03 p6bannerbot sets mode: +v lucasb
brrt an open problem is - how do I, with the current register allocator, or something else which is both fast and correct, prevent an unlikely branch from forcing a spill in a likely branch 18:05
lucasb just to note that I'm still on x86 32bit... sure my next machine will be 64bit and I'll never bother with the non-existent JIT for 32b, but until then... :) 18:08
I wonder, is it even possible to have a variant of the current JIT for 32bit? Would brrt be willying to guide someone doing that?
18:09 cebor joined
lucasb *willingly 18:09
18:09 cebor is now known as Guest18913, Guest18913 left
brrt it is totally possible 18:10
and if somebody were up for it, I'd help, sure
it's just that I'm not going to spend time on it myself
lucasb that's ok 18:11
timotimo the lego jit should not be terribly hard, right? 18:13
since it spills and loads at every conceivable point
number of registers isn't such a big problem
having to implement the calling conventions is a little bit of work i guess? 18:14
brrt there's a bunch of different ones 18:15
and you basically can't share any code
timotimo of course there are m(
brrt and I make no guarantees that any of the 'top' constructs make a lot of sense for the lower constructs 18:16
Kaiepi i'm up for it 18:19
i will need some help though
brrt obviously 18:21
how much do you know about assembly language
Kaiepi not very much, i cargo-culted it going off the other examples in src/jit/x64/emit.dasc 18:22
are there any resources you could point me to so i could learn more?
brrt hmm. there's the dynasm docs here: corsix.github.io/dynasm-doc/tutorial.html 18:28
I would suggest you start playing with that to get a feel for assembly
Kaiepi aight 18:29
thanks
lucasb Kaiepi++ I totally encourage that! 18:35
brrt if nothing else you'll learn some things :-) 18:36
lucasb I wouldn't know how to do that. By the time I acquire the skills, the platform will be long gone, and maybe possibly even Earth itself
Kaiepi it's always good to learn some new things
lucasb but I can try help Kaiepi :)
brrt assembly really isn't all that hard though
there's very few things you have to take into account 18:37
18:43 Zoffix joined, p6bannerbot sets mode: +v Zoffix, ChanServ sets mode: +o Zoffix 18:44 p6bannerbot left, p6bannerbot joined, Zoffix sets mode: +o p6bannerbot, Zoffix left 18:45 brrt left 18:48 lucasb left 19:15 Platonides17 joined, p6bannerbot sets mode: +v Platonides17 19:17 Platonides17 left 19:41 AlexDaniel left 19:42 AlexDaniel joined, AlexDaniel left, AlexDaniel joined, p6bannerbot sets mode: +v AlexDaniel 20:01 benchable6 joined, p6bannerbot sets mode: +v benchable6
Geth MoarVM: xelak6++ created pull request #934:
Get the number of bytes to be processed from the current buffer and not from the header.
20:25
20:42 drakythe0 joined 20:43 p6bannerbot sets mode: +v drakythe0 20:48 drakythe0 left 20:50 __idiot__ joined 20:51 p6bannerbot sets mode: +v __idiot__, __idiot__ left 21:03 lizmat left 21:12 robertle left
jnthn D'oh, I put beer in the freezer for a bit so it'd be nice and cool, and now I've got iced beer... 21:17
21:44 nativecallable6 joined 21:45 p6bannerbot sets mode: +v nativecallable6
timotimo better than a freezer full of "spicy" beer shards 21:46
more spiky than spicy, really
in german it works better because "scharf" means both "sharp" and "spicy"
21:59 Guest25466 joined 22:00 p6bannerbot sets mode: +v Guest25466, Guest25466 left
jnthn :) 22:08
It hadn't frozen through fully, and now I'm near the end of the glass and it's getting warm 22:09
Geth MoarVM: 448e75bd3d | (Alexius Korzinek)++ | src/strings/utf8_c8.c
Get the number of bytes to be processed from the current buffer and not from the header.

This fixes issue #2158.
22:22
MoarVM: 3e679da29a | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/strings/utf8_c8.c
Merge pull request #934 from xelak6/master

Get the number of bytes to be processed from the current buffer and not from the header.
22:27 Kaiepi left, Kaiepi joined 22:28 p6bannerbot sets mode: +v Kaiepi 22:29 bisectable6 joined 22:30 p6bannerbot sets mode: +v bisectable6 22:43 travis-ci joined, p6bannerbot sets mode: +v travis-ci
travis-ci MoarVM build errored. Jonathan Worthington 'Merge pull request #934 from xelak6/master 22:43
travis-ci.org/MoarVM/MoarVM/builds/413340536 github.com/MoarVM/MoarVM/compare/6...679da29adb
22:43 travis-ci left 22:47 arza5 joined, p6bannerbot sets mode: +v arza5, arza5 left 23:14 amar joined, amar left
timotimo hm. perhaps i should have backed up a thing or two from /tmp before rebooting 23:35
i found timo.github.io/_site/weeklychanges...thing.html 23:36