00:29 hoelzro joined 01:48 ilbot3 joined 03:28 zakharyas joined 05:57 domidumont joined 06:01 domidumont joined 07:20 domidumont joined
jnthn morning, #moarvm 09:04
moritz \o jnthn 09:05
nwc10 you stop reading it for 2 days because nothing happened, and then something does: blog.pyston.org/ 09:20
dalek MoarVM: ae8d383 | (Pawel Murias)++ | src/6model/serialization.c:
Serialize hll role.
09:23
MoarVM: 94ade9f | (Pawel Murias)++ | src/6model/serialization.c:
Remove an empty else branch.
MoarVM: 134ed39 | jnthn++ | src/6model/serialization.c:
Merge pull request #370 from pmurias/serialize-hll-role

Serialize hll role.
jnthn And ruling on the versioning question is in the issue. :) 09:24
nwc10 hhvm.com/blog/ still dull 09:25
jnthn I guess that kinda drives home that the choice between refcounting and tracing is not something you can abstract away for real world programs. 09:28
nwc10 yes 09:42
and I guess what you said is another way of thinking what I thought - once you've set off down the road of "reference counting", and exposed the side effects of the behaviour of it to your API users, there's really no going back 09:43
jnthn Right. 09:44
nwc10 except pypy seems to think otherwise (on the "no going back") 09:45
so I wonder how this will pan out
useful that both are re-implementations of the same language
jnthn Another thing worth noting is that it was C extension support that seemed to partly push them towards refcounting 09:50
At least iiuc
" But applying a tracing GC to a refcounting C API, such as the one that Python has, is risky and comes with many performance pitfalls." 09:51
Not sure if there's another way to read that
nwc10 yes. I'd missed that nuance.
But pypy thinks that it can do clever hacks in its GC to support the C API
maybe they can
maybe it's that they aren't starting out from the "can we run CPAN" equivalent 09:52
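For context on the "refcounting C API" being discussed: CPython extensions manipulate object reference counts directly, which is why a tracing GC is hard to retrofit behind that API. A minimal C sketch using only standard CPython calls; the make_pair helper itself is made up for illustration:

    #include <Python.h>

    /* Builds a 2-tuple (item, item). Each call below follows CPython's
     * documented reference-counting contract, which extension code is
     * written against directly. */
    static PyObject *make_pair(PyObject *item) {
        PyObject *tuple = PyTuple_New(2);   /* returns a new reference */
        if (!tuple)
            return NULL;
        Py_INCREF(item);                    /* PyTuple_SET_ITEM steals a reference */
        PyTuple_SET_ITEM(tuple, 0, item);
        Py_INCREF(item);
        PyTuple_SET_ITEM(tuple, 1, item);
        return tuple;                       /* caller owns the returned reference */
    }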
dalek MoarVM: 6627a8a | jnthn++ | src/ (6 files):
MoarVM: Avoid VMNull setup memcpy/loop in spesh'd frames.
MoarVM:
MoarVM: Just shove in `null` instructions for all object registers when we
MoarVM: compute the CFG, but before forming SSA. Then, let dead instruction
09:53 dalek joined
jnthn Feels like the Perl 6 approach (you write things with native parts using NativeCall, rather than as VM extensions) is side-stepping those things fairly nicely. 09:54
Those commits were a rebase/merge of jnthn/opts that I worked on last week but didn't want in before the release 09:55
I see a couple of seconds off Rakudo build time
And around 30 KB off Rakudo base memory
09:57 cognominal joined
arnsholt Heh, a high-performance compiler moving from tracing GC to refcounting is pretty amusing 10:03
But given that they want 100% compatibility with CPython, I can see why you'd end up doing that 10:04
dalek MoarVM/fix_cache_sc_idx: 628f734 | diakopter++ | src/6model/s (2 files):
write sc idx when deserializing, repossessing, and preparing to serialize
10:12
jnthn So, that's a rebase of a patch from diakopter++ that shaves a LOAD of Rakudo build time, but breaks all the nativecall tests on my Windows box. Going to see if I can figure out why so we can get the nice win :)
Explodes in my Linux VM too. 10:23
moritz what LOAD are we talking about? 10:36
jnthn 11s 10:40
m: say 101 / 112
camelia rakudo-moar 5638a1: OUTPUT«0.901786␤»
jnthn 10%
oh, and s/LOAD of/LOAD off/ :)
hm, I may have fixed it :) 11:07
dalek MoarVM/fix_cache_sc_idx: 3d8c9cd | jnthn++ | src/6model/sc.c:
Only used cached SC index if SC itself matches.

This fixes various mis-lookups that result from repossession.
11:08
MoarVM/fix_cache_sc_idx: 400569f | jnthn++ | src/6model/sc.c:
Add missing index update after repossession.
nwc10 jnthn: will let current test finish before testing that 11:10
I wonder if the Pyston reference counting "thing" is also part of the more general problem of "how do you wrap a C API in a language with managed memory"
because even if you have a good ffi interface, the wetware has to (correctly) parse the documentation to work out the semantics of who owns memory and how it should be freed 11:11
`char **` doesn't tell you very much
jnthn Yeah, that problem doesn't go away
And, with C's type system, can't 11:12
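To make the `char **` point concrete: the same parameter type can carry completely different ownership contracts, and nothing in the signature says which one applies (hypothetical C declarations, purely illustrative):

    /* Caller owns the array and the strings; the function only reads them. */
    int log_lines(char **lines, int count);

    /* The function allocates the array and every string; the caller must
     * free each element and then the array. */
    int split_words(const char *input, char ***out_words, int *out_count);

    /* Out-parameter pointing into the library's static storage; the caller
     * must not free it at all. */
    int get_version_string(char **out);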
I was more claiming that a different problem went away (C library bindings have a coupling to the VM) 11:14
(Because the glue code you write couples to the VM)
Yeah, seems much better after my fixes 11:17
jnthn does a panda install, given these changes affect precomp-related code 11:20
Darn. Tests for NativeHelpers::Blob explode 11:28
In the same place the ones I fixed earlier
dalek MoarVM: 306289f | (Jimmy Zhuo)++ | src/ (4 files):
add a missing prefix for some function
12:03
MoarVM/fix_cache_sc_idx: c06b519 | jnthn++ | src/6model/sc.c:
Missing index update when repossessing STable.
12:08
jnthn That fixes things pretty well 12:15
JimmyZ: That wasn't really an improvement and created me a conflict to solve :/ 12:27
And fixes post-merge 12:31
That was about the pessimal time to do that change.
Now I should probably retest things
But at least I can bitch on IRC while I wait :P
dalek MoarVM: 628f734 | diakopter++ | src/6model/s (2 files):
write sc idx when deserializing, repossessing, and preparing to serialize
12:32
MoarVM: 3d8c9cd | jnthn++ | src/6model/sc.c:
Only used cached SC index if SC itself matches.

This fixes various mis-lookups that result from repossession.
MoarVM: 400569f | jnthn++ | src/6model/sc.c:
Add missing index update after repossession.
MoarVM: c06b519 | jnthn++ | src/6model/sc.c:
Missing index update when repossessing STable.
MoarVM: b0c26d6 | jnthn++ | src/6model/s (2 files):
Merge branch 'fix_cache_sc_idx'

Remove a memset to clear the args buffer.
The idea of it was to ensure the GC never saw junk there. However, as there's no point during argument setup that GC could run, this could never happen anyway. Saves a call to memset for every single invoke.
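The shape of that change, roughly, is dropping a defensive clear of a buffer that nothing can observe before it is overwritten. A simplified C sketch of the idea, not the actual MVM_args_prepare code:

    typedef struct {
        void  *args[16];   /* stand-in for the real argument buffer */
        int    num_args;
    } ArgBuffer;

    static void args_prepare(ArgBuffer *buf, int num_args) {
        /* Previously: memset(buf->args, 0, sizeof(buf->args));
         * That clear only mattered if the GC could scan the buffer while it
         * still held stale pointers. Since no GC-triggering operation runs
         * between here and the code that fills in the arguments, the memset
         * was pure per-invocation overhead and can simply be dropped. */
        buf->num_args = num_args;
    }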
jnthn Testing welcome. Shaves a good 10% off Rakudo builds here. 12:33
nwc10 if you make this too fast, no-one will ever be able to get coffee again. 12:34
jnthn I think we've plenty of room for improvement in Rakudo build times still ;) 12:35
I mean, sure, a full Rakudo build is under 2 minutes, and an NQP build is 35s, but... :)
JimmyZ jnthn: sorry, I didn't see your commit before my push 12:36
jnthn JimmyZ: np, I fixed it all up
I'm just grumpy today.
nwc10 I hope you remembered to eat lunch 12:37
jnthn Yes :)
nwc10 good
jnthn The butcher I order lamb from (can't really find it in supermarkets here) turned out to also have potato dumplings filled with smoked meat, which you just have to drop in boiling water for 15 minutes :) 12:39
Saves me making potato dough
Which is a bit tedious :) 12:40
nwc10 you have to know where to find lamb here too
I think I've spotted more horsemeat butchers than places I know that sell lamb
jnthn Yeah...I was surprised how hard it was to track down.
lizmat our local sheep herder recently went out of business 12:41
jnthn Aww :(
lizmat mostly because he actually wasn't allowed to sell the lamb meat
from what I understand, that is true for most sheep herds in the NL 12:42
jnthn Oh...
Curious.
lizmat: btw, if you've a moment and fancy trying a build with MoarVM HEAD, you should see a nice improvement to CORE.setting build time 12:43
lizmat some health regulation based on: we don't know where the sheep have been
nwc10 cows don't have this problem?
lizmat jnthn: am in the midst of working on List/Array.iterator
jnthn lizmat: No hurry :)
lizmat nwc10: no, because they're always in the "same" meadow
nwc10 and deer aren't considered to be domestic animals, so it doesn't matter where they've been? or are they also not allowed to be sold? 12:45
lizmat most deer that you buy in a shop actually *are* domesticated
aka, lived in a single meadow for most of their lives 12:46
at least over here in the NL
nwc10 ah OK.
this sounds very much like a rule made up by folks who don't actually understand farming. Presumably it's very important to have an ISO 9000 audit trail too. Because that's what makes the difference between a quality product and rubbish 12:49
lizmat yup, something like that 12:54
13:08 domidumont joined 13:11 patrickz joined
jnthn Man, callgrind is fine but slow... :) 13:17
[Coke] (get coffee again) dammit, I'm already out. 13:18
jnthn m: say 417780695 - 416958586 13:40
camelia rakudo-moar a80414: OUTPUT«822109␤»
jnthn m: say 822109 / 417780695
camelia rakudo-moar a80414: OUTPUT«0.0019678004␤»
jnthn Hm, rather modest :)
nwc10 presumably it's still 0.2% in the right direction 13:43
jnthn Indeed 13:46
m: say 17855658600 - 17624207945 # for a program sat in a loop invoking lots
camelia rakudo-moar a80414: OUTPUT«231450655␤»
jnthn m: say 231450655 / 17855658600
camelia rakudo-moar a80414: OUTPUT«0.01296231409␤»
jnthn More than 1% there 13:47
jnthn The first number was for Rakudo startup, fwiw
14:14 brrt joined
nwc10 brrt: I think blog.pyston.org/ will interest you 14:16
brrt yes
yes it does
it was from reading the backlog that i came here, in fact 14:17
:-)
i think... i claimed, some months ago, that the end stage of Pyston is as a CPython module
jnthn ooh, brrt
brrt ohai jnthn, nwc10 :-)
jnthn Maybe you can tell me why my JIT patch is silly and fails :)
brrt which patch is that 14:18
jnthn brrt: gist.github.com/jnthn/59bcc1dd9eb3...8b678f25b4 14:19
I got rid of the memset in MVM_args_prepare, making it tiny 14:20
Inlining that C function into the interpreter was easy enough
Then I figured we can get rid of a call in the JIT too and save a bunch of instructions per invocation :)
brrt ok, got it :-) 14:21
jnthn But I think I'm doing something wrong
In the JIT bit
I don't see what :)
The symptom is almost as if the "mov FRAME:TMP5->cur_args_callsite, CALLSITE:TMP6;" doesn't happen
brrt hmmm 14:22
not sure if the CALLSITE: before TMP6 is necessary 14:23
jnthn I wasn't either, that was added to see if it was to blame :)
Well, if its omission was to blame :)
Am I doing that array lookup right above it?
brrt not 100% sure 14:24
let me check the docs
jnthn Certainly, if in
| mov TMP6, CALLSITE:TMP6[callsite_idx];
I remove CALLSITE, then the build explode
*explodes
(Fails at the dynasm stage) 14:25
brrt corsix.github.io/dynasm-doc/refere...html#_type
the thing is, CALLSITE:TMP6[callsite_idx] expands to
[reg + sizeof(ctype)*imm32]
jnthn Oh 14:26
brrt which in this case is [reg+sizeof(MVMCallSite)*imm32]
jnthn And this is an array of pointers to callsites
brrt aye
that should make some difference
and CALLSITE:TMP6 should expand to 'name:reg...[reg + (int)(ptrdiff_t)&(((ctype*)0)...)]' 14:27
jnthn Indeed, it works now I fix that :)
brrt which isn't what you want either; you just want the pointer itself
ok, great :-)
jnthn Yeah, now I got 14:28
+|.type CALLSITEPTR, MVMCallsite*
And
+ | mov TMP6, CALLSITEPTR:TMP6[callsite_idx];
:)
Thanks!
brrt great :-) yw
brrt is always happy when folks work on the JIT without my intervention :-) 14:29
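The plain-C analogue of that DynASM fix: `CTYPE:reg[imm]` scales the index by `sizeof(ctype)`, so when the array holds pointers to callsites the ctype has to be the pointer type, not the struct type. A small sketch with a stand-in struct:

    #include <stddef.h>

    typedef struct { int arg_count; } Callsite;   /* stand-in for MVMCallsite */

    /* Array of Callsite values: element i sits at base + i * sizeof(Callsite). */
    Callsite *nth_by_value(Callsite *sites, size_t i)    { return &sites[i]; }

    /* Array of Callsite pointers: element i sits at base + i * sizeof(Callsite *),
     * which is the scaling the JIT lookup actually needed. */
    Callsite *nth_by_pointer(Callsite **sites, size_t i) { return sites[i]; }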
ok, the really interesting bit about Pyston is written almost between the lines 14:30
performance in python is numpy performance
all the rest doesn't really matter 14:31
everything that can't be solved efficiently with numpy (or other compiled libraries like pygame) can be solved by using A Better Algorithm 14:32
and you know, that is pretty much true; that's precisely what I also found in using python for network analysis
so any python runtime that makes numpy slower isn't going to be used by the only people who care about python performance in the first place.... 14:33
14:34 colomon joined
brrt but, it is fascinating work, and i especially admire their (for lack of a better word) 'agile' adaptation against changing conditions 14:34
timotimo that incidentally means that perl6 is fast enough for all things python is fast enough for by using Inline::Python :P
brrt perl6 is fast enough once we port numpy? 14:36
nump6
we should totally have that
jnthn m: say "Another {416958586 - 416859701} CPU cycles off startup"
camelia rakudo-moar e5bd09: OUTPUT«Another 98885 CPU cycles off startup␤»
jnthn But I didn't expect much there :)
brrt \o/ 14:37
what kind of percentage is that
jnthn m: say "Another {17624207945 - 17534022355} off the hot invoke loop program" 14:38
camelia rakudo-moar e5bd09: OUTPUT«Another 90185590 off the hot invoke loop program␤»
jnthn m: say 90185590 / 17624207945
camelia rakudo-moar e5bd09: OUTPUT«0.00511714287␤»
brrt 0,5%
jnthn That's in addition to the earlier larger win from removing memset
brrt little bits help, at least when they multiply
jnthn m: say 17855658600 - 17534022355 14:39
camelia rakudo-moar e5bd09: OUTPUT«321636245␤»
jnthn m: say 321636245 / 17855658600
camelia rakudo-moar e5bd09: OUTPUT«0.01801312694␤»
jnthn 1.8% overall
brrt not bad, not bad
timotimo i almost didn't get to open up my laptop yesterday during the GPN
but now i get to build all these nice patches y'all have :D
dalek MoarVM: 494b4e4 | jnthn++ | src/ (4 files):
Inline args preparation into interpreter, JIT.

Shaves some more instructions off every invocation. The C compiler may have got to the inline into interp.c anyway, but the JIT one will certainly save a bunch of instructions per invocation.
14:40
jnthn Hm, wow, all the lookups of &EXHAUST being late bound costs us a bit 14:43
brrt lizmat: we still have a herder in Groningen, for grass cutting without trimmer noise 14:44
timotimo jnthn: can you tell me how you measured that/figured it out? 14:45
jnthn callgrind showed MVM_frame_find_lexical_by_name up, I added a printf 14:46
timotimo ah, makes sense 14:47
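The technique is just throwaway instrumentation: once callgrind points at an unexpectedly hot function, a temporary printf (or counter) inside it shows what it is being asked for. A hypothetical C sketch of the hack, not MoarVM's real lookup code:

    #include <stdio.h>
    #include <string.h>

    /* Stand-in for a late-bound lexical lookup such as
     * MVM_frame_find_lexical_by_name; only the instrumentation matters here. */
    static void *find_lexical_by_name(const char *name) {
        static unsigned long exhaust_lookups = 0;
        if (strcmp(name, "&EXHAUST") == 0)
            fprintf(stderr, "&EXHAUST lookup #%lu\n", ++exhaust_lookups);
        /* ... perform the actual lookup ... */
        return NULL;
    }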
brrt anyway, i'm off 14:58
just to make you jealous, i'll tell you i'm making falafel 14:59
see you :-)
jnthn ooh :) 15:00
Though talking of nice food things, I got a fresh bag of dried kashmiri chillies delivered today :)
Wow. We don't even invoke EXHAUST on Moar 15:05
Or JVM
That approach dates back to the Parrot days o.O
So that code is doing nothing at all
nwc10 what *is* EXHAUST? or shall I wait for the blog post? :-) 15:06
(google failed me)
jnthn I'm still getting to the bottom of it :) 15:08
psch i remember trying to figure out what EXHAUST does a few times
jnthn But in summary: the original way return was done on Parrot was you took a "continuation"
That when invoked would go to a return handler 15:09
psch but if it's not invoked on either current backend...
jnthn And then you stored it in a lexical
And so you could properly, lexically, resolve returns, not dynamically
Now, at the point of an actual return happening that was replaced by &EXHAUST 15:10
Which if invoked would say "you already returned from this routine"
To be a better error than "you aren't even in a routine at all" 15:11
nwc10 OK, I think that makes sense.
first reference to EXHAUST that I find is commit dc16a4fc72a1eaa8309b7de948607a9ca7c0a1e4
jnthn But it turns out EXHAUST threw the exact same error as not being in a routine at all
nwc10 +sub EXHAUST(|$) {
+ die "Attempt to return from exhausted Routine"
+}
jnthn And we didn't actually invoke it 15:12
And there's surely a lot of better ways to handle this :)
jnthn does the smallest fix first
nwc10 a ristretto? :-) 15:13
jnthn Except that small thing busts it too. o.O
Um...uncovers a spesh inline bug, it seems 15:21
Which I think I'm a bit tired to find right now :) 15:26
But generally I think `return` handling wants a larger look over
In so far as it'd be nice if we could kill off the whole lexotic thing in favor of a normal exception handler
Since we already have those resolved as static rather than dynamic 15:27
That in turn means that we can look at inlining return
Since we could turn it into a normal multi 15:28
And that in turn means spesh should be able to, maybe with a little more effort, rewrite the return into a "goto"-alike
Anyway, think that's for next week. 15:29
When I'll have a good bit more Perl 6/MoarVM time than this week.
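Stripped of all the VM machinery, the lexotic pattern jnthn described is: stash a return handler where the routine can find it, and at the moment of the actual return swap it for a thunk that reports "already returned". A hedged C sketch with made-up names:

    #include <stdio.h>
    #include <stdlib.h>

    typedef void (*ReturnHandler)(void *result);

    static void normal_return(void *result) { (void)result; /* unwind to the caller */ }

    static void exhausted(void *result) {
        (void)result;
        fprintf(stderr, "Attempt to return from exhausted Routine\n");
        exit(1);
    }

    typedef struct { ReturnHandler return_to; } Frame;

    static void leave_routine(Frame *f, void *result) {
        ReturnHandler take_return = f->return_to;
        /* At the point of the actual return, the stored handler is replaced
         * by the exhausted thunk, so a later return attempt gives a better
         * error than "you aren't in a routine at all". */
        f->return_to = exhausted;
        take_return(result);
    }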
16:32 domidumont joined
timotimo great. i know from experience that eliminating returns at the end of a sub was always a noticeable improvement in performance 17:05
and i never really understood how EXHAUST etc really worked
19:12 mohij joined 20:18 patrickz joined 21:20 janktopus joined 22:13 vendethiel joined 22:49 Ven joined, ggoebel114 joined 22:55 Ven joined