github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018. |
|||
00:09
salamanderrake joined
00:10
p6bannerbot sets mode: +v salamanderrake
00:13
salamanderrake left
00:22
Lildirt joined,
Lildirt left
|
|||
timotimo | froggs!!! | 00:38 | |
00:38
purrdeta6 joined
00:39
p6bannerbot sets mode: +v purrdeta6
00:44
purrdeta6 left
00:57
banzaikitten4 joined,
Kaiepi left
00:58
p6bannerbot sets mode: +v banzaikitten4,
banzaikitten4 left
02:04
Sabotender29 joined
02:05
p6bannerbot sets mode: +v Sabotender29
02:06
Sabotender29 left
02:08
Guest27421 joined
02:09
p6bannerbot sets mode: +v Guest27421
02:12
Guest27421 left
02:30
AlexDaniel joined,
p6bannerbot sets mode: +v AlexDaniel
02:31
AlexDaniel left,
AlexDaniel joined,
barjavel.freenode.net sets mode: +v AlexDaniel,
p6bannerbot sets mode: +v AlexDaniel
03:06
fossxplorer3 joined
03:07
fossxplorer3 left
03:09
Kaiepi joined,
p6bannerbot sets mode: +v Kaiepi
03:26
fireworks20 joined,
fireworks20 left
03:56
morsik17 joined
03:57
p6bannerbot sets mode: +v morsik17,
morsik17 left
04:10
LookingGlassSec joined
04:11
p6bannerbot sets mode: +v LookingGlassSec
04:15
programmerq22 joined
04:16
p6bannerbot sets mode: +v programmerq22,
programmerq22 left,
LookingGlassSec left
05:18
Matrixiumn joined
05:19
p6bannerbot sets mode: +v Matrixiumn
05:20
Matrixiumn left
05:42
Kilo`byte joined
05:43
Kilo`byte left
06:07
robertle joined
06:08
p6bannerbot sets mode: +v robertle
06:33
Kaiepi left
06:44
Kaiepi joined
06:45
p6bannerbot sets mode: +v Kaiepi
07:58
zakharyas joined
07:59
p6bannerbot sets mode: +v zakharyas
08:43
dodobrain14 joined,
dodobrain14 left
09:40
robertle left
09:41
robertle joined
09:42
p6bannerbot sets mode: +v robertle
09:47
andries4 joined
09:48
p6bannerbot sets mode: +v andries4,
andries4 left
|
|||
Geth | MoarVM: 7507090328 | (Jonathan Worthington)++ | 4 files Make spesh thread more GC-responsive By making us able to mark the spesh graph more completely, and making the one we're currently working on available for GC marking, we can introduce GC sync points at a number of locations in the optimization process. This can notably reduce the GC latency when optimizing larger graphs. Didn't measure repeatedly, but spectest time decreased by ~8s and CORE.setting compilation by ~2s; the latter feels a bit on the high side, so I'd take these numbers with a pinch of salt. |
10:51 | |
10:52
robertle left
|
|||
jnthn | timotimo: Would be intersting to see if your GC latency profiling looks any different after ^^ :) | 10:53 | |
10:54
robertle joined
10:55
p6bannerbot sets mode: +v robertle
11:08
Arirang joined,
Arirang is now known as Guest34255,
Guest34255 left
|
|||
masak .oO( take these numbers with a punch of salt ) | 11:13 | ||
11:15
travis-ci joined,
travis-ci left
|
|||
dogbert2 only 24 degrees centigrade outside :) | 11:17 | ||
11:33
TriJetScud13 joined,
zakharyas left,
p6bannerbot sets mode: +v TriJetScud13,
TriJetScud13 left
11:52
robertle left
12:07
dp318 joined,
dp318 left
|
|||
lizmat | only 34.6 here outside :-( | 12:32 | |
timotimo | jnthn: i find it surprising that we wouldn't have to also mark facts known values and types | 12:33 | |
jnthn | We do mark them? | ||
timotimo | but i guess if something's in there, it wouldn't have died in the nursery?! | ||
oh! | |||
hah, i see that now | |||
i should have expanded the code before saying that | |||
Geth | MoarVM: 2f36e2666a | (Jonathan Worthington)++ | src/gc/allocation.c Add branch hint macros to nursery allocation Seems to shave a little off various programs; the effect was visible with callgrind. |
12:39 | |
diakopter | .tell brrt well hopefully the static analysis wouldn't even need to expand the macros..? | 12:42 | |
yoleaux | diakopter: I'll pass your message to brrt. | ||
12:43
zakharyas joined
12:44
p6bannerbot sets mode: +v zakharyas
12:49
harrow` joined
12:50
p6bannerbot sets mode: +v harrow`,
camelia left
12:52
shareable6 left,
nativecallable6 left
12:54
reportable6 left,
benchable6 left,
statisfiable6 left,
greppable6 left
12:56
camelia joined
12:57
p6bannerbot sets mode: +v camelia,
robertle joined
12:58
p6bannerbot sets mode: +v robertle
|
|||
Geth | MoarVM: 67a9afef60 | (Jonathan Worthington)++ | 3 files Have sp_fastcreate do a direct nursery allocation Since we should never use this when we're in gen2 allocation mode. Saves a branch and a function call per fastcreate. |
12:58 | |
13:01
travis-ci joined,
travis-ci left
|
|||
timotimo | jnthn: i tried to come up with a spesh plugin for p6assign to a Proxy, but i think i was still missing some important detail; it looks like what we actually call to invoke the proxy has to have very slightly different arguments, which is why the invokespec has two tiny subs that create a Scalar and call &!FETCH with it | 13:05 | |
got a hot tip for me? | |||
lizmat | .oO( don't go outside now ) |
||
timotimo | :) | 13:06 | |
jnthn | Yeah, it's doing an extra wrapping of the thing I think | 13:07 | |
timotimo | so my first instinct was to say "build a little closure inside the spesh plugin's run and return that" | ||
but i don't think a closure like that is optimizable by spesh then? | |||
jnthn | That can work, especially as we can inline closures | ||
timotimo | i thought we can only do that if it's the outer we inline it into? | 13:08 | |
jnthn | No | ||
timotimo | or was that the magic trick for "getlexvia"? | ||
jnthn | Can do it more generally | ||
Right, the via trick :) | |||
That was one of those things where when I thought of it, I wondered why I didn't think of it 2 years beforehand... :) | |||
timotimo | i'll give it a try again, perhaps | ||
annoyingly, we mostly emit p6store in typical situations where we have proxies | 13:09 | ||
13:09
iczero11 joined
|
|||
timotimo | like when assigning to the result of a method call | 13:09 | |
13:10
p6bannerbot sets mode: +v iczero11
|
|||
jnthn | Ho, is that not yet using a spesh plugin? | 13:11 | |
Can't always remember what I did and didn't get around to :) | 13:12 | ||
timotimo | only p6assign i think | 13:13 | |
let me check again | |||
13:14
iczero11 left
|
|||
timotimo | well, there's an "assign" spesh plugin, but not a "store" one | 13:14 | |
jnthn | Aha | 13:19 | |
Time to write it! :) | |||
timotimo | i wonder if i should have a look | 13:20 | |
buses and such | |||
13:21
travis-ci joined,
travis-ci left
13:25
digitalcold15 joined,
digitalcold15 left
|
|||
timotimo | oooh, stage parse is below 60s again | 13:29 | |
13:34
lizmat left
|
|||
jnthn | :) | 13:37 | |
13:43
Cronus29 joined
13:44
p6bannerbot sets mode: +v Cronus29
13:45
Cronus29 left
13:50
reportable6 joined
13:51
p6bannerbot sets mode: +v reportable6
13:53
beaky24 joined,
p6bannerbot sets mode: +v beaky24
13:54
beaky24 left
14:48
brrt joined,
zenguy- joined,
p6bannerbot sets mode: +v brrt
|
|||
timotimo | ok, after having a "breakfast" i can try to figure out why my current attempt gives me No such method 'CALL-ME' for invocant of type 'Bool' | 14:49 | |
14:49
p6bannerbot sets mode: +v zenguy-,
zenguy- left
|
|||
timotimo | MoarVM oops: Too many levels of inlining popped | 14:51 | |
when i try to MVM_SPESH_NODELAY the code | |||
ah, the data doesn't actually land in lexicals, only registers | 14:55 | ||
that makes the debugserver less helpful | |||
OK, that's interesting. instead of getting the FETCH that i've assigned to the object in the constructor, i apparently resolved Proxy's actual FETCH method instead | 14:57 | ||
oh, hah | 14:58 | ||
i've been consistently messing up FETCH and STORE | |||
i'm trying to write a spesh plugin for STORE, not for FETCH! | |||
though FETCH will also want one | |||
oh, no, i think i'm actually wrongfully deconting somewhere, thus causing FETCH to be run when i actually wanted to work with the proxy object itself | 14:59 | ||
15:03
MartesZibellina joined
15:04
p6bannerbot sets mode: +v MartesZibellina,
MartesZibellina left
15:06
ski_ joined
15:07
p6bannerbot sets mode: +v ski_,
ski_ left
|
|||
timotimo | jnthn, do i need something like speshguardsf to ensure i'm getting the same code object? i've first tried spesguardobj, but i that gets very unhappy if you "Proxy.new" many, many times | 15:11 | |
15:15
diakopter left
|
|||
timotimo | ooh, getstaticcode could be the right one? | 15:16 | |
hm, but i think i'd have to have that as a spesh-recorded instruction, too | 15:17 | ||
otherwise i can't guard against the result | |||
got a preliminary implementation of that guard | 15:26 | ||
well, need to implement the guard, i only added the guard instruction | 15:28 | ||
15:28
brrt left
15:36
diakopter joined,
p6bannerbot sets mode: +v diakopter
|
|||
timotimo | OK, it runs, that's good. | 15:47 | |
now for timing | 15:50 | ||
aaw, it's barely faster | 15:52 | ||
m: say <4.34 4.47 4.38 4.36> / 4 | 15:54 | ||
camelia | 1 | ||
timotimo | m) | ||
m: say <4.34 4.47 4.38 4.36>.sum / 4 | |||
camelia | 4.3875 | ||
timotimo | m: say <4.73 4.71 4.64 4.64>.sum / 4 | 15:55 | |
camelia | 4.68 | ||
timotimo | m: say 4.3875 / 4.68 | ||
camelia | 0.9375 | ||
timotimo | it didn't end up inlining the actual fetch sub that was put into the proxy, just the sub that creates the Scalar and passes that on | 15:56 | |
15:58
lizmat joined
15:59
brrt joined
|
|||
timotimo | maybe it could only be better if we did a second round of logging after the spesh plugin has been resolved and inlined | 15:59 | |
15:59
p6bannerbot sets mode: +v lizmat,
p6bannerbot sets mode: +v brrt
16:00
Mercster24 joined,
Geth left,
p6bannerbot sets mode: +v Mercster24
16:01
Geth joined,
p6bannerbot sets mode: +v Geth
|
|||
timotimo | jnthn: should i upload PRs for moar, nqp, and rakudo for this spesh plugin? | 16:03 | |
or is 7% not worth the hassle? | |||
not even 7% | 16:04 | ||
16:05
Mercster24 left
|
|||
jnthn | Sends PRs, I can review them and see what complexity we get for the speedup | 16:06 | |
timotimo | sure thing. maybe you also see what could make spesh better at figuring out the inner inline | ||
Geth | MoarVM/speshplugin_guardstaticcode: 8dabbcc01b | (Timo Paulssen)++ | 8 files add speshguardgetstaticcode, for closures and such lets a spesh plugin figure out if a given object, such as the $!do attribute of a Code, is "the same" across invocations - ignoring what exactly it closes over. |
16:09 | |
16:09
tigermousr22 joined
16:10
tigermousr22 left
|
|||
Geth | MoarVM: timo++ created pull request #932: add speshguardgetstaticcode, for closures and such |
16:10 | |
16:11
robertle left
|
|||
timotimo | github.com/rakudo/rakudo/pull/2189 - also has the link to the other pull requests in the description | 16:14 | |
brrt | \o | 16:32 | |
yoleaux | 12:42Z <diakopter> brrt: well hopefully the static analysis wouldn't even need to expand the macros..? | ||
timotimo | o/ brrt | 16:34 | |
sp_fastalloc is simple enough to have the most likely case inlined into the expr template, IMO | 16:35 | ||
there's a check for size > 0 in the function version, but on the jit level we already know the size exactly | 16:36 | ||
brrt: any idea for something like MVM_LIKELY or MVM_UNLIKELY at the exprjit level? | |||
jnthn | m: say 3.68 / 7.34 | 16:41 | |
camelia | 0.501362 | ||
timotimo | oh wow | ||
that's a nice ratio | |||
jnthn | Yeah, it's for gist.github.com/jnthn/196263d7e888...b2e42b2008 | 16:42 | |
My changes today have made that run in half the time | 16:43 | ||
Some unpushed | |||
16:44
robertle joined
|
|||
brrt | not yet timotimo | 16:44 | |
i'm not sure how MVM_LIKELY is compiled | |||
16:44
p6bannerbot sets mode: +v robertle
|
|||
timotimo | no clue. does X86_64 encode branch predictor hints into the assembly implicitly or explicitly? | 16:44 | |
i.e. is gcc flipping the then and else branches around so the more likely one is the one that needs no goto? | 16:45 | ||
jnthn | I at first thought the latter | ||
Oh, it may really just be that | |||
Watever it does seems to have an effect on instruction count though | |||
timotimo | hm, "instruction fetch" is what valgrind counts, isn't it? | 16:46 | |
jnthn | Thing so | 16:47 | |
*Think | |||
timotimo | i imagine it perhaps has to do with fetches being not byte-per-byte, but whole cache-lines at once? | ||
and if the unlikely then branch starts in the same cache line, it gets fetched, and the likely else branch might start in the middle of the right cache line, too | |||
brrt | hmm, that's cute | 16:49 | |
we'd need either a new node, or a node annotation (flags?) | 16:58 | ||
timotimo | probably having a node would be easiest | 16:59 | |
but feel free to push that off to the medium future | 17:00 | ||
17:03
supercool2 joined
17:04
p6bannerbot sets mode: +v supercool2,
supercool2 left
17:08
zakharyas left,
zakharyas joined
17:09
p6bannerbot sets mode: +v zakharyas
17:18
Kaiepi left
17:21
zakharyas left
17:22
Kaiepi joined
17:23
p6bannerbot sets mode: +v Kaiepi
|
|||
Kaiepi | thoughts on implementing asm jit support for x32? | 17:37 | |
brrt | sure | 17:48 | |
we hate x32 | |||
brrt = we, in this case :-P | 17:49 | ||
I'm going to qualify that a bit better... | 17:50 | ||
I think when we say x32, we mean the 'run 64 bit code with 32 bit addresses' - and presumably integers as well? | |||
it's a windows thing iirc | 17:51 | ||
I actually think that is a pretty sane idea, given how much better x86-64 is than x86 | |||
but, on the other hand.... | |||
17:52
nyuszika7h3 joined
|
|||
brrt | here's my general hypothesis on perl6, adoption, and performance | 17:52 | |
there's two places where performance is going to matter for perl6 adoption | |||
1): on developer laptops | |||
17:52
p6bannerbot sets mode: +v nyuszika7h3
|
|||
brrt | 2): on production servers | 17:53 | |
17:53
nyuszika7h3 left
|
|||
brrt | all other platforms, including ARMv7, AArch64, x32, MIPS, PowerPC, whatever | 17:53 | |
(production servers are going to have Xeon-style processors, typically virtualized) | 17:54 | ||
all other platforms don't really matter. Not in this phase of adoption | |||
if we ever get as big as perl5, then there will be a horde of users clamoring for their obscure platforms. And we'll have a moarvm porters group who will make sure that happens | |||
timotimo | there's also x32 for linux | 17:55 | |
it's also not very difficult for a perl6 program to balloon up to 4 gigs %) | 17:56 | ||
brrt | that, too | ||
unfortunately | 17:57 | ||
timotimo | brrt: do you think porting the fast path of sp_fastcreate to exprjit will lose us the benefits gained from adding the branch hint macros to the allocate_nursery function? | 18:00 | |
Kaiepi | so that's a "keep it on the sideburner until later?" | ||
brrt | as far as i'm concerned, yes | 18:01 | |
and maybe at that time x32 might no longer be a thing | |||
timotimo: I'm not sure; the only way to know is to test | |||
18:02
lucasb joined
18:03
p6bannerbot sets mode: +v lucasb
|
|||
brrt | an open problem is - how do I, with the current register allocator, or something else which is both fast and correct, prevent an unlikely branch from forcing a spill in a likely branch | 18:05 | |
lucasb | just to note that I'm still on x86 32bit... sure my next machine will be 64bit and I'll never bother with the non-existent JIT for 32b, but until then... :) | 18:08 | |
I wonder, is it even possible to have a variant of the current JIT for 32bit? Would brrt be willying to guide someone doing that? | |||
18:09
cebor joined
|
|||
lucasb | *willingly | 18:09 | |
18:09
cebor is now known as Guest18913,
Guest18913 left
|
|||
brrt | it is totally possible | 18:10 | |
and if somebody were up for it, I'd help, sure | |||
it's just that I'm not going to spend time on it myself | |||
lucasb | that's ok | 18:11 | |
timotimo | the lego jit should not be terribly hard, right? | 18:13 | |
since it spills and loads at every conceivable point | |||
number of registers isn't such a big problem | |||
having to implement the calling conventions is a little bit of work i guess? | 18:14 | ||
brrt | there's a bunch of different ones | 18:15 | |
and you basically can't share any code | |||
timotimo | of course there are m( | ||
brrt | and I make no guarantees that any of the 'top' constructs make a lot of sense for the lower constructs | 18:16 | |
Kaiepi | i'm up for it | 18:19 | |
i will need some help though | |||
brrt | obviously | 18:21 | |
how much do you know about assembly language | |||
Kaiepi | not very much, i cargo-culted it going off the other examples in src/jit/x64/emit.dasc | 18:22 | |
are there any resources you could point me too so i could learn more? | |||
brrt | hmm. there's the dynasm docs here: corsix.github.io/dynasm-doc/tutorial.html | 18:28 | |
I woudl suggest you start playing with that to get a feel for assembly | |||
Kaiepi | aight | 18:29 | |
thanks | |||
lucasb | Kaiepi++ I totally encourage that! | 18:35 | |
brrt | if nothing else you'll learn some things :-) | 18:36 | |
lucasb | I wouldn't know how to do that. By the time I acquire the skills, the platform will be long gone, and maybe possibly even Earth itself | ||
Kaiepi | it's always good to learn some new things | ||
lucasb | but I can try help Kaiepi :) | ||
brrt | assembly really isn't all that hard though | ||
there's very few things you have to take into account | 18:37 | ||
18:43
Zoffix joined,
p6bannerbot sets mode: +v Zoffix,
ChanServ sets mode: +o Zoffix
18:44
p6bannerbot left,
p6bannerbot joined,
Zoffix sets mode: +o p6bannerbot,
Zoffix left
18:45
brrt left
18:48
lucasb left
19:15
Platonides17 joined,
p6bannerbot sets mode: +v Platonides17
19:17
Platonides17 left
19:41
AlexDaniel left
19:42
AlexDaniel joined,
AlexDaniel left,
AlexDaniel joined,
p6bannerbot sets mode: +v AlexDaniel
20:01
benchable6 joined,
p6bannerbot sets mode: +v benchable6
|
|||
Geth | MoarVM: xelak6++ created pull request #934: Get the number of bytes to be processed from the current buffer and not from the header. |
20:25 | |
20:42
drakythe0 joined
20:43
p6bannerbot sets mode: +v drakythe0
20:48
drakythe0 left
20:50
__idiot__ joined
20:51
p6bannerbot sets mode: +v __idiot__,
__idiot__ left
21:03
lizmat left
21:12
robertle left
|
|||
jnthn | D'oh, I put beer to the freezer for a bit so it'd be nice and cool, and now I've got iced beer... | 21:17 | |
21:44
nativecallable6 joined
21:45
p6bannerbot sets mode: +v nativecallable6
|
|||
timotimo | better than a freezer full of "spicy" beer shards | 21:46 | |
more spiky than spicy, really | |||
in german it works better because "scharf" means both "sharp" and "spicy" | |||
21:59
Guest25466 joined
22:00
p6bannerbot sets mode: +v Guest25466,
Guest25466 left
|
|||
jnthn | :) | 22:08 | |
It hadn't frozen through fully, and now I'm near the end of the class and it's getting warm | 22:09 | ||
Geth | MoarVM: 448e75bd3d | (Alexius Korzinek)++ | src/strings/utf8_c8.c Get the number of bytes to be processed from the current buffer and not from the header. This fixes issue #2158. |
22:22 | |
MoarVM: 3e679da29a | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/strings/utf8_c8.c Merge pull request #934 from xelak6/master Get the number of bytes to be processed from the current buffer and not from the header. |
|||
22:27
Kaiepi left,
Kaiepi joined
22:28
p6bannerbot sets mode: +v Kaiepi
22:29
bisectable6 joined
22:30
p6bannerbot sets mode: +v bisectable6
22:43
travis-ci joined,
p6bannerbot sets mode: +v travis-ci
|
|||
travis-ci | MoarVM build errored. Jonathan Worthington 'Merge pull request #934 from xelak6/master | 22:43 | |
travis-ci.org/MoarVM/MoarVM/builds/413340536 github.com/MoarVM/MoarVM/compare/6...679da29adb | |||
22:43
travis-ci left
22:47
arza5 joined,
p6bannerbot sets mode: +v arza5,
arza5 left
23:14
amar joined,
amar left
|
|||
timotimo | hm. perhaps i should have backed up a thing or two from /tmp before rebooting | 23:35 | |
i found timo.github.io/_site/weeklychanges...thing.html | 23:36 |