01:47
ilbot3 joined
02:26
tokuhiro_ joined
03:27
tokuhiro_ joined
03:39
tokuhiro_ joined
07:03
Ven joined
07:59
Ven joined,
zakharyas joined
JimmyZ | A very nice book 'Static Single Assignment Book': ssabook.gforge.inria.fr/latest/book.pdf # almost complete, project address gforge.inria.fr/scm/?group_id=1950 | 08:26 | |
08:48
FROGGS joined
08:50
lizmat joined
09:09
FROGGS joined
jnthn | JimmyZ++ # nice link indeed! | 09:42 | |
JimmyZ | scm.gforge.inria.fr/anonscm/svn/ss...torial.pdf # PPT for some more info :) | 09:45 | |
jnthn | m: say "SSA".flip # :P | 09:48 | |
camelia | rakudo-moar c54773: OUTPUT«ASS» | ||
timotimo | i wonder how many techniques we cannot reliably implement because older versions are not available in our ssa implementation | 09:49 | |
of course we can always allocate a temporary register and set its value right after the desired version gets written | 09:50 | ||
jnthn | "older versions"? | ||
Oh, I think I know the issue you mean | 09:51 | |
It's a trade-off. | |||
If you make the other choice you get more costly/difficult deopt | |||
timotimo | yes | 09:52 | |
i remember that | |||
hum, now i remember i got nowhere with my deopt bridges thing yet | 09:53 | ||
not getting paid enough for complicated things :) | 09:54 | ||
not actually convinced i really can deal with complicated things that much better when I'm getting paid... | 10:12 | ||
10:15
Ven joined
timotimo | i think i have the impostor syndrome | 10:18 | |
but still better than impastor or inpasta | |||
arnsholt | Is that where you make a lot of copy-pasta? | 10:26 | |
timotimo | mhhh pasta | 10:30 | |
jnthn: were you able to find out why MapIterCommon doesn't have its new method spesh'd? | 10:47 | ||
if not, do you want me to litter the code with debug statements and figure it out? | |||
jnthn | timotimo: No, if you could look into that it'd be great | ||
Because it's the kind of method that *should* spesh really well | |||
timotimo | sure, did you use a specific benchmark for it? | 10:48 | |
jnthn | 'cus it's just a bunch of binds to attributes | ||
for ^1000 { for ^1000 { } } | |||
Well, that may hit the for -> while opt maybe | |||
If that still works | |||
timotimo | i recently fixed it | 10:49 | |
it used to look for an &infix:<,> in the QAST, which was removed during GLR | |||
jnthn | OK, well, just my @a = ^1000; for @a { for @a { } } | ||
timotimo | rebuilding rakudo now | 10:50 | |
interesting | 10:55 | ||
in the profile i'm looking at it gets called 1001 | |||
1001 times, about 4/5th of those calls were even jitted | |||
jnthn | Probably something initializ-y | ||
Oh? | |||
timotimo | how did you reach the conclusion it doesn't get speshed? | ||
that piece of code initializes a crapton of IntLexRef | 11:00 | ||
3005001 | |||
2002000 of those in sink-all and 1002001 in pull-one | |||
same with a :=, but i suspect that can be eased by implementing push-exactly or something in range's iterator | 11:02 | ||
jnthn | Wow | 11:03 | |
I need to look at the lex ref issues there | |||
Maybe that's what I'll do this evening | |||
I really need to do several hours on a $other-job today | |||
timotimo | ah, sure | ||
jnthn | But good to know | ||
timotimo | there's an infix:<<> in the code you gave that gets only speshed, not jitted | ||
1002001 calls, 7.04% (155.9ms) | 11:04 | ||
jnthn | How did I conclude it wasn't? Because in the Text::CSV profiler output it isn't being | ||
timotimo | (exclusive time) | ||
jnthn | So we'll need to look deeper :( | ||
timotimo | i'll grab Text::CSV onto my laptop as well | ||
all the flaky mobile connections :| | 11:05 | ||
jnthn | if you liked it shoulda put 4G on it | 11:06 | |
timotimo | occasionally i do get 4G | ||
how do you invoke the benchmark? | 11:07 | ||
jnthn | I created a file with this 1000 times: | 11:08 | |
hello,","," ",world,"!" | |||
And then | |||
cat test-small.csv | perl6-m -Ilib --profile test-t.pl | |||
You'll need to grab Slang::Tuxic and File::Temp and File::Directory::Tree also | |||
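The input file described above can be produced with a short script. This is a minimal sketch in Python (not the tool jnthn used), assuming the line is repeated verbatim 1000 times and the file is named test-small.csv as in the command above; the function names are made up for illustration.

```python
# The single CSV line quoted in the log, repeated to form the benchmark input.
LINE = 'hello,","," ",world,"!"'


def build_csv(repeats: int = 1000) -> str:
    """Return the file contents: LINE once per row, newline-terminated."""
    return "".join(LINE + "\n" for _ in range(repeats))


def write_csv(path: str = "test-small.csv", repeats: int = 1000) -> None:
    """Write the benchmark input to `path` (name taken from the log)."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(build_csv(repeats))
```

Calling write_csv() in the Text::CSV checkout would create the file that is piped into test-t.pl.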
timotimo | just noticed that | 11:10 | |
rebootstrapping panda right now | 11:11 | ||
um ... even with test-t.pl i get almost 99% jitted new | 11:15 | ||
src/gen/m-CORE.setting:2696 | |||
perhaps more worrying is sink-all of sequential map being 100% interpreted and 4.66% exclusive time | 11:16 | ||
and 520351 BOOTCode being allocated inside BUILDALL's while loop | 11:17 | ||
there's only 5010 calls to BUILDALL according to the routines tab | 11:18 | ||
hehe. | 11:25 | ||
List's iterator method has a class :: does Iterator in it | |||
that generates code to take a whole bunch of closures | |||
it gets called 36013 times | |||
allocates 180065 BOOTCode in total | 11:26 | |
m: say 180065 / 36013 | 11:30 | ||
camelia | rakudo-moar 7c9911: OUTPUT«5» | ||
timotimo | that's how many methods that class has :P | ||
i've just moved it out of the method and i'll see if it makes a big difference | |||
yeah, though i couldn't report it due to flaky network >_> | 11:38 | ||
55 instead of 59 gc runs | 11:40 | ||
we may want to move some more classes for iterators out of the methods that use them, to prevent taking closures for all the methods | |||
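The cost pattern timotimo measured (5 code objects per call, 180065 for 36013 calls) can be illustrated with an analogy in Python, where a class defined inside a function really is rebuilt on every call. This is only a sketch of the "nested vs. hoisted" difference; it says nothing about how Rakudo or MoarVM generate code, and all names are hypothetical.

```python
def make_iter_nested(items):
    # The class statement runs on every call: a fresh class object and
    # one fresh function object per method are allocated each time.
    class NestedIter:
        def __init__(self, items):
            self._it = iter(items)

        def pull_one(self):
            return next(self._it, None)

    return NestedIter(items)


class HoistedIter:
    # Defined once at module level: methods are allocated a single time.
    def __init__(self, items):
        self._it = iter(items)

    def pull_one(self):
        return next(self._it, None)


def make_iter_hoisted(items):
    return HoistedIter(items)


a, b = make_iter_nested([1]), make_iter_nested([2])
c, d = make_iter_hoisted([1]), make_iter_hoisted([2])

nested_shares_class = type(a) is type(b)    # a new class per call
hoisted_shares_class = type(c) is type(d)   # one shared class
```

Hoisting the class removes the per-call allocations, which is the same kind of change as moving the iterator class out of the method.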
jnthn | Wait, why are they taking closures?! | ||
If they are something's up with our code-gen | 11:41 | ||
timotimo | they shouldn't be? | ||
jnthn | The only time a method should take a closure is if it's an l-value | ||
uh damn | |||
r-value | |||
In a class body it's always in sink context | 11:42 | ||
timotimo | how else would we have classes defined in inner scopes be able to refer to closed-over values? | ||
jnthn | Classes aren't closures | ||
timotimo | ah! | ||
well, then you can fix it :) | |||
jnthn | Thus why we've had and fixed various bugs where people did refer to lexicals :) | ||
OK, I'll put it on my todo list along with looking at the code-gen issues that make too many lexicalrefs | 11:43 | ||
Hm, if we can fix these two then we would get GC runs down a lot | |||
timotimo | likely (and hopefully) | 11:44 | |
jnthn | And so improve performance a whole lot | ||
nwc10 | other bloggage: morepypy.blogspot.co.at/2015/09/pyp...ments.html | 11:47 | |
timotimo | our GC isn't the fastest | 11:48 | |
we still have those bunchtons of gen2 roots that are irking me a bit | |||
i'd love common gc run times to dwindle below 2ms :| | |||
actually, if we want to ever be able to do 60fps game development or something, 2ms is still more than a single video frame | 11:52 | ||
11:55
Ven joined
timotimo | maybe incremental GC would be a thing to consider at some point? i have no idea what requirements that adds to the rest of the VM and if we can get there easily enough | 11:59 | |
jnthn | On my box I see GC times of 3ms-4ms in various cases | 12:02 | |
nwc10 | for now, I suspect that we get bigger net wins by doing other stuff, relying on KISS and the tail end of Moore's Law. | 12:03 | |
but that's just an opinion. | |||
jnthn | m: say 1/60 | 12:05 | |
camelia | rakudo-moar 7c9911: OUTPUT«0.016667» | ||
jnthn | That's a lot more than 0.002 :P | ||
nwc10 | jnthn: even if you're now in SECAM territory, surely as you're still in Europe, the correct fraction is 1/50 :-) | ||
timotimo | wow, i want some of these gc times | 12:07 | |
oh, i thought millisecond meant 1/100 second, haha, that's fail | |||
but still, i hardly ever get gc times as good as jnthn's getting :( | 12:10 | ||
or is that really just "in some cases"? | |||
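The frame-budget arithmetic from the exchange above, spelled out as a small sketch. The helper names are hypothetical; the GC times plugged in are the ones quoted in the log.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per video frame, in milliseconds."""
    return 1000.0 / fps


def gc_share(gc_ms: float, fps: float) -> float:
    """Fraction of one frame's budget consumed by a GC run of gc_ms."""
    return gc_ms / frame_budget_ms(fps)


budget_60 = frame_budget_ms(60)   # about 16.67 ms, i.e. 1/60 s
budget_50 = frame_budget_ms(50)   # 20 ms, nwc10's 1/50 s
share_2ms = gc_share(2.0, 60)     # a 2 ms GC run uses about 12% of a frame
share_7ms = gc_share(7.0, 60)     # a 7 ms GC run uses about 42% of a frame
```

So a 2 ms collection fits comfortably inside a 60 fps frame, while the 6 to 8 ms runs mentioned here would take a large part of it.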
jnthn | I had 3.x ms average GC for the for lines('file'.IO) { } | 12:11 | |
6-7ms is common in apps that are retaining more stuff | |||
timotimo | let's see. | 12:12 | |
ah, for that test-small.csv isn't big enough by far :) | |||
7 to 8 ms in that | 12:13 | ||
can our machines' performance differ this drastically? | |||
i'll teach the jit about continuationreset, that'll make pull-one jittable, which is at 15% exclusive time in the for-lines benchmark | 12:18 | ||
jnthn | Wait, are you on latest? | 12:19 | |
I made for 'foo'.IO.lines { } not use continuations | 12:20 | ||
timotimo | oh! | ||
jnthn | But sure, do that anyway :) | ||
timotimo | shall i still go ahead? | ||
jnthn | Because it'll make every gather/take thing faster :) | ||
timotimo | right. first i'll have to get off this train, though | ||
damn, and my fav song of this album just came on :( | |||
12:47
JimmyZ left,
JimmyZ joined
timotimo | if something in interp.c sets the cur_op before calling the C function in question, i'll mark it :invokish in the oplist, so that the jit doesn't explode, right? | 12:50 | |
FROGGS | sounds reasonable | ||
timotimo | though in this case it's not because it invokes stuff, but because it records the cur_op into the continuation's address | 12:51 | |
FROGGS | :throwish seems to have a similar effect | 12:52 | |
timotimo | BBL | 12:59 | |
13:18
virtualsue joined
13:37
brrt joined
brrt | \o | 13:37 | |
13:37
Ven joined
FROGGS | hi brrt | 13:41 | |
brrt | hi FROGGS | 13:42 | |
13:49
virtualsue left
14:33
tokuhiro_ joined
14:48
Ven joined
hoelzro | jnthn: regarding that string heap optimization, is the optimization that MoarVM SCs no longer have their own string heaps, and just expect the code to refer to the string heap in the bytecode itself? | 15:21 | |
I really want to fix the nqp-js problem, and I think the only way to do that is to truly understand that optimization | |||
jnthn | hoelzro: It's exactly that, yes | 15:22 | |
hoelzro: Actually all we used to do was just build a string array | |||
15:22
Ven joined
jnthn | So there was a huge push arr, "foo" sequence | 15:22 | |
And the change was just to get SCs to use identical indexes to the string heap of the bytecode file itself | 15:23 | ||
So we could save that | |||
Which saved a bunch of work at startup | |||
hoelzro | so the dependency string heap reference will almost definitely have a different index after the optimization, right? since it's referring to all strings in the compunit? | 15:25 | |
or is that wrong? a compunit with a single SC would have essentially the same string heap as the SC itself, maybe? | |||
hoelzro looks at this as a good thing, because he never really understood the serialization stuff before | 15:26 | |
jnthn | iirc, the serializer pushes the unique strings into a list | ||
And then keeps that list somewhere internal in the VM | 15:27 | ||
Oh, on the current CompUnit I think | |||
hoelzro | is there a way to get --dump to dump things like the string heap, or lower level info on the SCs in the compunit? | ||
jnthn | And then uses it when it does the MAST -> bytecode | ||
Not that I'm aware of | |||
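A toy model of the idea jnthn describes: one string heap per compilation unit, with the serialized data using the same indexes as the bytecode instead of building its own array. This is a sketch under that reading only; the class and variable names are invented and do not correspond to MoarVM's actual data structures.

```python
class StringHeap:
    """Hypothetical shared heap: each unique string gets one stable index."""

    def __init__(self):
        self._index = {}
        self.strings = []

    def intern(self, s: str) -> int:
        idx = self._index.get(s)
        if idx is None:
            idx = len(self.strings)
            self._index[s] = idx
            self.strings.append(s)
        return idx


heap = StringHeap()

# Strings referenced from the bytecode of the compilation unit.
bytecode_refs = [heap.intern(s) for s in ("foo", "bar", "foo")]

# Strings referenced from the serialized data reuse the same heap, so no
# separate array has to be rebuilt with a long run of pushes at startup.
sc_refs = [heap.intern(s) for s in ("bar", "baz")]
```

In this model an index such as the one for "bar" depends on every string interned before it in the whole unit, which matches hoelzro's expectation that indexes change once the heap is shared.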
16:04
FROGGS joined
17:14
Ven joined
18:30
arnsholt joined
timotimo | i'm puzzled | 18:45 | |
as soon as the jit kicks in on "my num @values; loop { @values.push: 0.0e1 }" it complains "expected num register!" | 18:46 | ||
18:46
vendethiel joined
timotimo | but the jit code that's responsible for what gets emitted there should really put MVM_reg_num64 into the slot that decides what happens | 18:47 | |
19:16
brrt joined
19:26
tokuhiro_ joined
19:38
Peter_R joined
20:07
Ven joined
21:03
brrt joined
brrt | holy mother of irregular instruction encoding | 21:07 | |
timotimo: is that the old JIT? and what line says that? | 21:08 | ||
apparently, kids, if and only if the register number of an indexed register is 4, then we need a second modrm byte, or something | 21:10 | |
timotimo | m) | 21:11 | |
brrt: what do you mean "what line says that"? | 21:12 | ||
brrt | what line says 'expected num register!' | ||
timotimo | ah | ||
that's from push | |||
brrt | i expect we use push_n for that? | 21:14 | |
timotimo | it's as if this line was wrong: | ||
m)+ (op == MVM_OP_push_n || op == MVM_OP_unshift_n) ? MVM_reg_num64 : | |||
(without the facepalm smiley in front) | |||
jitlog says it's been devirtualized | |||
and speshlog says it's actually a push_n | |||
brrt | hmmm | 21:15 | |
timotimo | oh, my str @foo is NYI? | ||
perhaps a GLR thing? | |||
brrt | i think .. i dunno | ||
timotimo | it was also NYI pre-glr | 21:17 | |
brrt | the bytecode generation issue is an irregularity in the encoding of rbp | 21:24 | |
timotimo | x86 is hard | ||
FROGGS | rbp? | ||
timotimo | base pointer? | ||
brrt | yes.... and also r12, since that looks just like rbp from the perspective of x86 | 21:26 | |
x86 is really, really hard | |||
21:28
tokuhiro_ joined
brrt | actually, it's rsp, not rbp | 21:30 | |
anyway... | 21:31 | ||
it looks like something i can crack | |||
FROGGS | ++brrt | ||
brrt | keep the ++'s for when the commit comes :-P | ||
FROGGS | sure :o) | ||
always got some in my pocket | 21:32 | ||
timotimo | sounds like you know what way to go and all that might stand in your way is missing infrastructure for keeping the information that modrm needs to become bigger ... or something | ||
brrt | hmm yeah, i guess | 21:34 | |
timotimo | does it seem like that's the last problem on your way? | 21:35 | |
brrt | last instruction encoding problem i'm aware of, yes | ||
timotimo | awesome :) | ||
brrt | there is a cheap-but-ugly workaround. it means giving up r12 | ||
and not use it at all | 21:36 | ||
timotimo | sure | ||
then you'll see if everything else works but that :) | |||
brrt | that'd work. but it'd only be a matter of time before somebody would choose to push dynasm over the limit again | ||
possibly me | |||
timotimo | giving up just a single register doesn't sound terrible | ||
brrt | well, it's also all stack relative stuff | 21:37 | |
r12 and rsp | |||
timotimo | oh | ||
that's more interesting, then | |||
brrt | why they are irregular, i don't know | ||
timotimo | otherwise i'd have said "if fixing this takes too long, skipping it will get us to a working code gen faster" | ||
brrt | well, i'm going to think about it more | 21:38 | |
fairly sure this can be fixed | |||
timotimo | mhm | ||
brrt | we have the 'meaning' of the vreg at runtime | ||
so we can actually add/rewrite the bytes | |||
but it's tricky | 21:39 | ||
(the good bit is, it was already broken before i started ^^) | |||
timotimo | yeah :) | ||
"we" being "inside the dynasm internals", right? | |||
brrt | yes | 21:41 | |
the runtime | |||
but it requires i study the entire bytes-meaning table | 21:42 | ||
timotimo | urgh | ||
you'll has a bachelor of ft'aghn after that | |||
brrt | wiki.osdev.org/X86-64_Instruction_E...addressing | 21:43 | |
and this beauty here: wiki.osdev.org/X86-64_Instruction_E...dressing_2 | |||
timotimo | oh, that's not gigantic | ||
it's just a lot | 21:44 | ||
brrt | yeah, it's manageable, it's just highly irregular | ||
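The irregularity discussed above can be shown for the simplest case, a memory operand of the form [base] with no index and no displacement. The sketch below is written from the general x86-64 encoding rules, not from DynASM or the MoarVM JIT; it ignores the REX prefix (only the low three bits of each register number are encoded here) and the function name is hypothetical.

```python
def encode_base_only(reg: int, base: int) -> bytes:
    """ModRM (plus SIB or disp8 when required) for the operand [base].

    Register numbers follow the usual order: rax=0, rcx=1, rdx=2, rbx=3,
    rsp=4, rbp=5, rsi=6, rdi=7, r8..r15 = 8..15.
    """
    reg_lo = reg & 7
    base_lo = base & 7

    if base_lo == 4:
        # rsp and r12: rm=100 means "SIB byte follows", so the base has
        # to be named in a SIB byte with index=100 (no index register).
        modrm = (0b00 << 6) | (reg_lo << 3) | 0b100
        sib = (0b00 << 6) | (0b100 << 3) | base_lo
        return bytes([modrm, sib])

    if base_lo == 5:
        # rbp and r13: mod=00 with rm=101 means disp32 without a base,
        # so [rbp] has to be written as [rbp + 0] with an 8-bit zero.
        modrm = (0b01 << 6) | (reg_lo << 3) | base_lo
        return bytes([modrm, 0x00])

    # Every other base register: a single ModRM byte is enough.
    return bytes([(0b00 << 6) | (reg_lo << 3) | base_lo])
```

For example, encode_base_only(0, 0) yields one byte for [rax], while base registers 4 and 12 (rsp, r12) yield two bytes, which is the "second modrm byte" brrt ran into.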
21:45
kjs_ joined
timotimo | turn off your pattern recognition brain parts and it'll feel better, eh? :D | 21:45 | |
you won't even notice there's no regularity to it! | |||
brrt | right :-) | ||
i'm going to sleep | |||
see you tomorrow! | |||
timotimo | good night brrt! | 21:46 | |
22:29
tokuhiro_ joined
22:53
kjs_ joined
23:40
tokuhiro_ joined