01:08 colomon joined 01:10 agentzh_ joined 02:10 agentzh_ joined 02:53 colomon joined 03:39 agentzh_ joined 04:54 agentzh_ joined 06:26 FROGGS joined 06:49 zakharyas joined 06:52 agentzh_ joined 06:57 Ven joined 07:32 lizmat joined 08:10 lizmat joined 08:38 lizmat joined 08:52 lizmat_ joined 08:54 lizmat joined 08:58 lizmat_ joined 09:34 lizmat_ joined 09:39 lizmat_ joined 09:43 lizmat joined 09:48 agentzh_ joined 09:51 lizmat_ joined 09:56 lizmat joined 09:58 flussence joined 10:04 lizmat_ joined 10:21 lizmat joined 10:34 lizmat_ joined 10:37 lizmat__ joined, lizmat___ joined 11:19 brrt joined 11:25 rurban joined 11:26 agentzh_ joined
brrt \o 11:28
masak o/
jnthn o/ brrt 11:29
brrt o/ jnthn, masak 11:31
you all still as osdc?
or flying back already
jnthn brrt: Still here at a small hackathon after it, but leaving by train this evening 11:33
brrt :-o that is quite a distance (google maps tells me 7 h 37 min 11:35
masak I'm not at OSDC :/ 11:37
brrt well, too bad, but neither am i :-)
jnthn brrt: I'm only going as far as Goteborg... 11:39
brrt ah, thats probably more reasonable :-0
:-)
is osdc a perl conference? 11:40
they do use act
jnthn It was Perl and QT and civic hacking and more 11:45
brrt so pretty broad actually. nice 11:46
jnthn Yeah, it was a nice event
12:16 Ven joined
masak seemed like a nice event from the online echoes... :) 12:16
jnthn With MoarVM HEAD, on Windows (64-bit), building NQP immediately explodes like this: 13:38
JIT: trying to pass arguments in local space (stack top offset: 64, size: 8) at <unknown>:1 (src\vm\moar\stage0/QRegex.moarvm:!cursor_pass:4294967295)
13:41 agentzh_ joined
brrt what, what 13:41
ow, i see
two things 13:42
a): that shouldn't be an exception, you silly persion
jnthn I don't immediately understand enough to fix it but I'm guessing it's the repr op devirt now has an arg list too long to JIT code for on Windows. 13:43
brrt s/persion/person/
yes, that is precisely what's going on
ugh, blame me for not checking on windows, i should have 13:44
theres an easy fix though
jnthn OK, I'll leave it to you :) 13:45
brrt :-) 13:47
FROGGS okay, so this is nothing about my QRegex work? 14:20
jnthn No 14:22
timotimo sorry about blowing stuff up with devirt :(
brrt np :-) 14:23
timotimo the "easy fix" isn't in moar yet, though?
brrt no
how much stack space do we want
we have 64 bytes
we should increase this to how much?
timotimo and i'm blowing that already?
brrt no 14:24
you have something with 8 arguments?
eh, 9 arguments, probably
0..8
timotimo i don't know actually
yeah, one with 9 14:25
the getattr series of ops
and the bindattr series, too
brrt getattrs and bindattrs, indeed 14:26
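[editor's note] The numbers in the error message line up with the Win64 calling convention: the first four integer arguments travel in registers, the caller reserves 0x20 bytes of shadow space for them, and the fifth argument onward sits on the stack above that. A minimal sketch of that arithmetic follows. The function names and the 64-byte limit are illustrative, taken from the chat ("we have 64 bytes"), not from the MoarVM source.

```python
# Sketch only: helper names are hypothetical; the 64-byte budget comes from the chat.
WORD = 8                # bytes per argument slot on x86-64
WIN64_REG_ARGS = 4      # first four integer arguments are passed in registers
WIN64_SHADOW = 0x20     # shadow space the caller reserves for those four


def win64_stack_offset(arg_index):
    """Offset from rsp of a stack-passed argument (0-based), or None if it is in a register."""
    if arg_index < WIN64_REG_ARGS:
        return None
    return WIN64_SHADOW + (arg_index - WIN64_REG_ARGS) * WORD


def fits(arg_count, arg_space):
    """True if every stack-passed argument lies wholly inside arg_space bytes."""
    for i in range(arg_count):
        offset = win64_stack_offset(i)
        if offset is not None and offset + WORD > arg_space:
            return False
    return True


# Argument 8 (the ninth) starts at offset 64 with size 8, as in the error message.
print(win64_stack_offset(8), fits(8, 64), fits(9, 64))
```

Under these assumptions, eight arguments just fit in 64 bytes and the ninth does not, which is consistent with the getattr/bindattr ops being the ones that trip the check.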
anyway
how much do we want to increase this to? 14:27
jnthn brrt: Is there a cost to increase it?
brrt 128 bytes?
hardly
jnthn OK, then 128
brrt 128 is 0x100 iirc
jnthn m: say 0x100 14:28
camelia rakudo-moar f9c982: OUTPUT«256␤»
jnthn :)
m: say 0x80
camelia rakudo-moar f9c982: OUTPUT«128␤»
timotimo 88 bytes per hour! 14:29
(you'll see some crazy shit)
brrt compiling
aye, you're right
hmmm
wait a minute
then my check is wrong
let me think a bit longer 14:30
timotimo if we make the stack much bigger, we'll possibly consume more cache lines with unused data?
brrt who cares :-)
but i'm checking the calculation
ok, we allocate 0x80 bytes, is 128 bytes 14:31
then we pass the arguments by offset from rsp upwards 14:32
masak m: say 128.base(16) 14:33
camelia rakudo-moar f9c982: OUTPUT«80␤»
masak m: say "0x", 128.base(16)
camelia rakudo-moar f9c982: OUTPUT«0x80␤»
brrt we store the work registers from rbp downward, in the range 0...0x20 14:34
m: say 0x20
camelia rakudo-moar f9c982: OUTPUT«32␤»
brrt so...
if we have 0x80 bytes allocated
and we write downwards as the position gets higher
m: say 0x80 - 0.20;
camelia rakudo-moar f9c982: OUTPUT«127.8␤»
brrt x
m: say 0x80 - 0x20; 14:35
camelia rakudo-moar f9c982: OUTPUT«96␤»
brrt we actually have 96 bytes over 14:36
ok, i have time for you again 14:56
:-)
oh goody, windows uses 0x20 bytes from the 0x96 for the first 4 arguments 14:57
eh not from the 0x96 but from the argument space
14:59 FROGGS joined
brrt but the other bytes, we use for scratch space 15:05
so in short
we have the following layout
[ 0x20 (save stable regs) | 0x20 (scratch space) | 0x20 (arg space on windows) | 0x20 (reserve arg space on windows) ] 15:06
for posix that is the same, except that we have 0x40 bytes of stack space near the top 15:08
because posix doesn't reserve the 0x20 bytes near the top
now, if we use not 0x80 but 0x100 bytes of stack space... 15:09
(that's quite liberal, innit)
then we have on windows
[ 0x20 a | 0x20 b | ... | 0x20 d ]
m: (0x100 - 0x60).say 15:10
camelia rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤ in block <unit> at /tmp/8wjVgHxMg7:1␤␤»
brrt m: say 0x100 - 0x20
camelia rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤ in block <unit> at /tmp/LGDkokDxYc:1␤␤»
brrt m: say (0x100 - 0x60).Str
camelia rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤ in block <unit> at /tmp/mGa4QMTP1G:1␤␤»
brrt m: 0x20.say;
camelia rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤ in block <unit> at /tmp/065b7Rp6bV:1␤␤»
brrt m: "OH HAI".say;
camelia rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤ in block <unit> at /tmp/1H9fAjXdWR:1␤␤»
brrt what
we have no fewer than 0xa0 bytes left in that case 15:11
or 160
that means space for 20 arguments in total (windows) 15:12
or 30 arguments on posix, if i'm correct 15:13
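[editor's note] The byte budget brrt works out above can be redone as plain arithmetic. This is a sketch under stated assumptions: the region names and sizes are copied from the layout in the chat, and the six register arguments on POSIX are the System V x86-64 integer argument registers. Counting them is one reading that reproduces the figure of 30; the log does not spell out how the totals were counted.

```python
# Sketch only: region names and sizes are taken from the chat, not from emit_x64.dasc.
WORD = 8        # bytes per argument slot on x86-64
TOTAL = 0x100   # proposed stack allocation, 256 bytes


def free_bytes(total, reserved):
    """Bytes left for call arguments after the reserved regions are subtracted."""
    return total - sum(reserved.values())


windows = {"saved registers": 0x20, "scratch space": 0x20, "shadow space": 0x20}
posix = {"saved registers": 0x20, "scratch space": 0x20}

win_free = free_bytes(TOTAL, windows)    # 0xa0 bytes
posix_free = free_bytes(TOTAL, posix)    # 0xc0 bytes

win_slots = win_free // WORD             # eight-byte stack slots on Windows
posix_slots = posix_free // WORD         # eight-byte stack slots on POSIX
posix_total = posix_slots + 6            # plus six register arguments (System V)

print(hex(win_free), win_slots, hex(posix_free), posix_slots, posix_total)
```

The Windows figure of 20 matches the number of eight-byte slots in 0xa0 bytes; whether the four shadowed register arguments should be added on top is left open here, as it is in the chat.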
dalek MoarVM: bec36ae | brrt++ | src/jit/emit_x64.dasc:
Increase stack space for call arguments.

This may be a costly change, because it makes our stack space two cache lines large. But it should make moar work on windows again.
15:22
15:23 agentzh_ joined
nwc10 brrt: presumably you could do that just on Windows? But, for now, I'm assuming KISS is best 15:23
(fix it between September and Christmas)
FROGGS .tell brrt I was able build: perl6 version 2015.04-221-gfce74e1 built on MoarVM version 2015.04-105-gbec36ae 15:37
nwc10 FROGGS: on what OS? (implied is something win) 15:39
FROGGS nwc10: win 7 x64
TimToady doesn't seem to run any slower here 15:40
FROGGS I'm also checking that as we "speak"
nwc10 if nothing references those cache lines, it won't matter, other than page size, will it?
by "references" I'm meaning "reads from or writes to"
FROGGS yeah, does not feel slower 15:47
15:52 Ven joined
[Coke] (can't we do a stage0 update in a branch and then when merging, only merge everything else, and redo the stage0 update in master?) 15:59
lizmat I had an unexplained slowdown late last night, that I fixed by nuking install and rebuilding 16:11
16:44 lizmat joined
jnthn waves from a train 16:44
brrt++ # fixing stuff
nwc10 jnthn: a moving train? in the right direction? 16:45
jnthn yes, but I managed to get in the carriages that only go part of the way 16:46
But I've no idea if there are many free seats in the carriages that go all the way, so will not hurry to move :) 16:47
17:12 pyrimidine joined
timotimo maybe we should strike that item off the roadmap now? the reprop devirtualization? 17:30
17:31 FROGGS joined
jnthn I tend to do it each release. 17:33
(update it)
But feel free to do it now...I think you have a commit bit to the site 17:34
timotimo ah
yes, i do
will do it later today
i still need to find some benchmark that shows a time improvement with devirt'd reprops compared to without 17:36
perl6-bench didn't show any improvement ;(
jnthn Did you try getting instruction count numbers under callgrind? 17:37
timotimo i don't think i did, no
jnthn give it a go mebbe :) 17:38
Time for a train switch here...
bbl
FROGGS [Coke]: what would that give us? (besides an even bigger repository to clone) 18:11
[Coke] FROGGS: a place where we can test your code? We already gave up on repo size when we checked in stage 0. :) 18:12
jnthn If you go through a few rebootstraps then you can squash them into one when merging a branch and then delete the branch, which can save space :) 18:13
But you have to know that they're coming :)
FROGGS [Coke]: I extensively tested my code... I only do branches for review/testing in case I'm uncertain 18:14
[Coke]: and here jnthn already reviewed it (even when he did not run/test it)
but pushing a stage0 to a branch does not give us any value
[Coke] I disagree, but ok. it's your branch. 18:15
FROGGS disagree about what detail?
[Coke] "that it doesn't give us any value"
FROGGS jnthn: true... maybe we need a 'call for new ops' besides the usual call for papers 18:16
[Coke]: I agree that branches are nice when you want to push code that needs testing by other ppl
jnthn: you can't squash jvm's stage0 though, can you? 18:18
jnthn You can't squash shared history, no
But when I was doing the moarvm support for example, I had about 6 different state 0s along the way.
*stage
And squashed those before the merge. 18:19
Think I did similar with the JVM branch
But yeah, once it's in master then...no luck.
FROGGS well, I think about the scenario where you have new ops in two branches, and these two ops are *used* in nqp...
you won't be lucky there
[Coke] yes, you don't merge the generated files in that case, you regenerate them. 18:20
FROGGS if new ops aren't used in nqp, you don't need a stage0
so there is no (easy) way to merge stage0's
[Coke] no one is suggesting that.
sorry: *I* am not suggesting that. 18:21
FROGGS [Coke]: and you only can regenerate stage0 when you can build nqp
jnthn I don't think it happens often enough to really be worrying over it.
FROGGS yes, though it would be nice if we did not have to care about stage0... and it would be also nice to be able to refactor moarvm ops, but... 18:24
like I would like to change some ops so that they branch 18:25
18:25 ggoebel2 joined
FROGGS but that seems impossible, even via creating (and afterwards deleting) a temp op 18:25
18:26 betterwo1ld joined 18:27 arnsholt joined
jnthn battery out & 18:31
18:36 brrt joined 19:38 camelia joined 20:27 lizmat joined 20:31 colomon joined 20:44 brrt joined
timotimo brrt: what's the reason behind making the stack bigger on unix as well? i thought we only need that on windows? 21:16
brrt mostly macro trickery, really 21:17
the is-windows flag is set at dynasm time
whereas we do the check at runtime, so we'd need to convert it at least to a c-preprocessor macro 21:18
timotimo mhm 21:19
brrt or at least something like that
i'm aware there are msvc-specific flags
i won't do that because people may well be cross-compiling
timotimo ah, sure 21:20
i've just been told intel has the possibility to add into a memory address rather than just a register 21:21
maybe we could get a decent size improvement if we did that throughout; and also for other opcodes?
brrt i'm actually not all that familiar with a very clear way to know we're targetting win64 abi
ehm.....
well, i'd think at least one of these has to be a register?
timotimo yes, sure 21:22
brrt then i'm not sure how we could profit (now)
timotimo but rather than load A, load B, add A B, store A
we could have load B, add
load B, add to memory of A B, done
brrt no
or
yes, if the destination register C is identical to A 21:23
may be a worthwhile optimization
timotimo usly
... 21:25
brrt hmm? 21:29
its still a minor optimization since under the cover you'll do the exact same thing
but in binary size it will help a bit
timotimo my laptop is oozing hard 21:30
oom-ing
21:33 agentzh_ joined
brrt out-of-memory? 21:33
timotimo yes
unpacked a big tarball in /tmp
brrt and /tmp is on memory? 21:34
ugh, casts to (void*), one reason not to like c++ 21:35
timotimo i just figured that out :)
ah 21:38
the rm -rf just came through in my /tmp
21:41 colomon joined
brrt :-) 21:43
brrt afk
see you tomorrow
lizmat brrt: good night! 21:44
brrt good night :-)
22:36 lizmat joined
timotimo in one case we'd have load accumulator, load argument, add/subtract/whatever, store accumulator; in the other case we'd have load argument, add/subtract/whatever in-place; that could indeed be a little win 23:01
and i even found an example where there's more adds and subtracts with the accumulator going into the same register as it was loaded from 23:02
236 accumulator_stats: sub_i: same
149 accumulator_stats: add_i: same
136 accumulator_stats: add_i: different
58 accumulator_stats: sub_i: different
this being t/spec/S05-mass/properties-script.rakudo.moar
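[editor's note] The counts above can be summarised as the share of add_i/sub_i ops whose destination register is the one the accumulator was loaded from, which are the candidates for an in-place add or subtract on memory. The sketch below is illustrative: the tuple format and the `tally` helper are hypothetical, not timotimo's actual instrumentation, and only the four counts are taken from the log.

```python
from collections import Counter


def tally(ops):
    """Classify (name, dst, lhs, rhs) ops by whether dst is the same register as lhs."""
    stats = Counter()
    for name, dst, lhs, _rhs in ops:
        kind = "same" if dst == lhs else "different"
        stats[(name, kind)] += 1
    return stats


# Counts as reported in the chat for t/spec/S05-mass/properties-script.rakudo.moar.
reported = {
    ("sub_i", "same"): 236,
    ("add_i", "same"): 149,
    ("add_i", "different"): 136,
    ("sub_i", "different"): 58,
}

same = sum(n for (_name, kind), n in reported.items() if kind == "same")
total = sum(reported.values())
share = same / total

print(f"{same} of {total} ops write back to their accumulator ({share:.1%})")

# Hypothetical usage with made-up register numbers:
example = tally([("add_i", 1, 1, 2), ("add_i", 3, 1, 2), ("sub_i", 4, 4, 5)])
```

For a commutative op such as add_i, a destination equal to the second operand could also qualify; the sketch only checks the first operand to mirror the "same register as it was loaded from" wording.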
i could imagine this is the regex engine 23:05
huh, that test now fails? :\ 23:18
23:20 agentzh_ joined
timotimo ah, confused by intel syntax 23:22
haven't asm in a long time, it seems