01:08
colomon joined
01:10
agentzh_ joined
02:10
agentzh_ joined
02:53
colomon joined
03:39
agentzh_ joined
04:54
agentzh_ joined
06:26
FROGGS joined
06:49
zakharyas joined
06:52
agentzh_ joined
06:57
Ven joined
07:32
lizmat joined
08:10
lizmat joined
08:38
lizmat joined
08:52
lizmat_ joined
08:54
lizmat joined
08:58
lizmat_ joined
09:34
lizmat_ joined
09:39
lizmat_ joined
09:43
lizmat joined
09:48
agentzh_ joined
09:51
lizmat_ joined
09:56
lizmat joined
09:58
flussence joined
10:04
lizmat_ joined
10:21
lizmat joined
10:34
lizmat_ joined
10:37
lizmat__ joined,
lizmat___ joined
11:19
brrt joined
11:25
rurban joined
11:26
agentzh_ joined
|
|||
brrt | \o | 11:28 | |
masak | o/ | ||
jnthn | o/ brrt | 11:29 | |
brrt | o/ jnthn, masak | 11:31 | |
you all still at osdc? | |||
or flying back already | |||
jnthn | brrt: Still here at a small hackathon after it, but leaving by train this evening | 11:33 | |
brrt | :-o that is quite a distance (google maps tells me 7 h 37 min | 11:35 | |
masak | I'm not at OSDC :/ | 11:37 | |
brrt | well, too bad, but neither am i :-) | ||
jnthn | brrt: I'm only going as far as Goteborg... | 11:39 | |
brrt | ah, thats probably more reasonable :-0 | ||
:-) | |||
is osdc a perl conference? | 11:40 | ||
they do use act | |||
jnthn | It was Perl and QT and civic hacking and more | 11:45 | |
brrt | so pretty broad actually. nice | 11:46 | |
jnthn | Yeah, it was a nice event | ||
12:16
Ven joined
|
|||
masak | seemed like a nice event from the online echoes... :) | 12:16 | |
jnthn | With MoarVM HEAD, on Windows (64-bit), building NQP immediately explodes like this: | 13:38 | |
JIT: trying to pass arguments in local space (stack top offset: 64, size: 8) at <unknown>:1 (src\vm\moar\stage0/QRegex.moarvm:!cursor_pass:4294967295) | |||
13:41
agentzh_ joined
|
|||
brrt | what, what | 13:41 | |
ow, i see | |||
two things | 13:42 | ||
a): that shouldn't be an exception, you silly persion | |||
jnthn | I don't immediately understand enough to fix it but I'm guessing it's the repr op devirt now has an arg list too long to JIT code for on Windows. | 13:43 | |
brrt | s/persion/person/ | ||
yes, that is precisely what's going on | |||
ugh, blame me for not checking on windows, i should have | 13:44 | ||
theres an easy fix though | |||
jnthn | OK, I'll leave it to you :) | 13:45 | |
brrt | :-) | 13:47 | |
FROGGS | okay, so this is nothing about my QRegex work? | 14:20 | |
jnthn | No | 14:22 | |
timotimo | sorry about blowing stuff up with devirt :( | ||
brrt | np :-) | 14:23 | |
timotimo | the "easy fix" isn't in moar yet, though? | ||
brrt | no | ||
how much stack space do we want | |||
we have 64 bytes | |||
we should increase this to how much? | |||
timotimo | and i'm blowing that already? | ||
brrt | no | 14:24 | |
you have something with 8 arguments? | |||
eh, 9 arguments, probably | |||
0..8 | |||
timotimo | i don't know actually | ||
yeah, one with 9 | 14:25 | ||
the getattr series of ops | |||
and the bindattr series, too | |||
brrt | getattrs and bindattrs, indeed | 14:26 | |
anyway | |||
how much do we want to increase this to? | 14:27 | ||
jnthn | brrt: Is there a cost to increase it? | ||
brrt | 128 bytes? | ||
hardly | |||
jnthn | OK, then 128 | ||
brrt | 128 is 0x100 iirc | ||
jnthn | m: say 0x100 | 14:28 | |
camelia | rakudo-moar f9c982: OUTPUT«256␤» | ||
jnthn | :) | ||
m: say 0x80 | |||
camelia | rakudo-moar f9c982: OUTPUT«128␤» | ||
timotimo | 88 bytes per hour! | 14:29 | |
(you'll see some crazy shit) | |||
brrt | compiling | ||
aye, you're right | |||
hmmm | |||
wait a minute | |||
then my check is wrong | |||
let me think a bit longer | 14:30 | ||
timotimo | if we make the stack much bigger, we'll possibly consume more cache lines with unused data? | ||
brrt | who cares :-) | ||
but i'm checking the calculation | |||
ok, we allocate 0x80 bytes, is 128 bytes | 14:31 | ||
then we pass the arguments by offset from rsp upwards | 14:32 | ||
masak | m: say 128.base(16) | 14:33 | |
camelia | rakudo-moar f9c982: OUTPUT«80␤» | ||
masak | m: say "0x", 128.base(16) | ||
camelia | rakudo-moar f9c982: OUTPUT«0x80␤» | ||
brrt | we store the work registers from rbp downward, in the range 0...0x20 | 14:34 | |
m: say 0x20 | |||
camelia | rakudo-moar f9c982: OUTPUT«32␤» | ||
brrt | so... | ||
if we have 0x80 bytes allocated | |||
and we write downwards as the position gets higher | |||
m: say 0x80 - 0.20; | |||
camelia | rakudo-moar f9c982: OUTPUT«127.8␤» | ||
brrt | x | ||
m: say 0x80 - 0x20; | 14:35 | ||
camelia | rakudo-moar f9c982: OUTPUT«96␤» | ||
brrt | we actually have 96 bytes over | 14:36 | |
ok, i have time for you again | 14:56 | ||
:-) | |||
oh goody, windows uses 0x20 bytes from the 0x96 for the first 4 arguments | 14:57 | ||
eh not from the 0x96 but from the argument space | |||
14:59
FROGGS joined
|
|||
brrt | but the other bytes, we use for scratch space | 15:05 | |
so in short | |||
we have the following layout | |||
[ 0x20 (save stable regs) | 0x20 (scratch space) | 0x20 (arg space on windows) | 0x20 (reserve arg space on windows) ] | 15:06 | ||
for posix that is the same, except that we have 0x40 bytes of stack space near the top | 15:08 | ||
because posix doesn't reserve the 0x20 bytes near the top | |||
now, if we use not 0x80 but 0x100 bytes of stack space... | 15:09 | ||
(that's quite liberal, innit) | |||
then we have on windows | |||
[ 0x20 a | 0x20 b | ... | 0x20 d ] | |||
m: (0x100 - 0x60).say | 15:10 | ||
camelia | rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤ in block <unit> at /tmp/8wjVgHxMg7:1␤␤» | ||
brrt | m: say 0x100 - 0x20 | ||
camelia | rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤ in block <unit> at /tmp/LGDkokDxYc:1␤␤» | ||
brrt | m: say (0x100 - 0x60).Str | ||
camelia | rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤ in block <unit> at /tmp/mGa4QMTP1G:1␤␤» | ||
brrt | m: 0x20.say; | ||
camelia | rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤ in block <unit> at /tmp/065b7Rp6bV:1␤␤» | ||
brrt | m: "OH HAI".say; | ||
camelia | rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤ in block <unit> at /tmp/1H9fAjXdWR:1␤␤» | ||
brrt | what | ||
we have no fewer than 0xa0 bytes left in that case | 15:11 | ||
or 160 | |||
that means space for 20 arguments in total (windows) | 15:12 | ||
or 30 arguments on posix, if i'm correct | 15:13 | |
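The layout arithmetic above can be sketched as a small calculation. This is only an illustration of the numbers quoted in the discussion, not the JIT's actual code: the region sizes (0x20 for saved registers, 0x20 for scratch space, 0x20 reserved on Windows) come from brrt's description, while the function names and the 8-byte slot size are assumptions made for the example. It counts stack slots only; arguments passed in registers are not included.

```python
SLOT_BYTES = 8  # assumed size of one argument slot on x64


def free_arg_bytes(frame_bytes, windows):
    """Bytes left for stack-passed arguments in a JIT frame (hypothetical helper)."""
    saved_regs = 0x20                   # "save stable regs" region from the log
    scratch = 0x20                      # "scratch space" region from the log
    reserved = 0x20 if windows else 0   # space Windows reserves for the first 4 args
    return frame_bytes - saved_regs - scratch - reserved


def stack_arg_slots(frame_bytes, windows):
    """Number of 8-byte stack slots that fit in the remaining space."""
    return free_arg_bytes(frame_bytes, windows) // SLOT_BYTES


if __name__ == "__main__":
    for frame_bytes in (0x80, 0x100):
        for windows in (True, False):
            name = "windows" if windows else "posix"
            left = free_arg_bytes(frame_bytes, windows)
            slots = stack_arg_slots(frame_bytes, windows)
            print(f"frame {frame_bytes:#x} on {name}: {left:#x} bytes left, {slots} stack slots")
```

With a 0x80 frame on Windows only four stack slots remain, so together with the four register arguments a ninth argument no longer fits, which matches the failing offset of 64 in the error message. With a 0x100 frame the sketch gives 0xa0 bytes (20 slots) on Windows. On POSIX it gives 24 slots; brrt's figure of 30 would follow if the six System V register arguments are counted as well, though the log does not say so.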
dalek | MoarVM: bec36ae | brrt++ | src/jit/emit_x64.dasc: Increase stack space for call arguments. This may be a costly change, because it makes our stack space two cache lines large. But it should make moar work on windows again. |
15:22 | |
15:23
agentzh_ joined
|
|||
nwc10 | brrt: presumably you could do that just on Windows? But, for now, I'm assuming KISS is best | 15:23 | |
(fix it between September and Christmas) | |||
FROGGS | .tell brrt I was able to build: perl6 version 2015.04-221-gfce74e1 built on MoarVM version 2015.04-105-gbec36ae | 15:37 | |
nwc10 | FROGGS: on what OS? (implied is something win) | 15:39 | |
FROGGS | nwc10: win 7 x64 | ||
TimToady | doesn't seem to run any slower here | 15:40 | |
FROGGS | I'm also checking that as we "speak" | ||
nwc10 | if nothing references those cache lines, it won't matter, other than page size, will it? | ||
by "references" I'm meaning "reads from or writes to" | |||
FROGGS | yeah, does not feel slower | 15:47 | |
15:52
Ven joined
|
|||
[Coke] | (can't we do a stage0 update in a branch and then when merging, only merge everything else, and redo the stage0 update in master?) | 15:59 | |
lizmat | I had an unexplained slowdown late last night, that I fixed by nuking install and rebuilding | 16:11 | |
16:44
lizmat joined
|
|||
jnthn waves from a train | 16:44 | ||
brrt++ # fixing stuff | |||
nwc10 | jnthn: a moving train? in the right direction? | 16:45 | |
jnthn | yes, but I managed to get in the carriages that only go part of the way | 16:46 | |
But I've no idea if there are many free seats in the carriages that go all the way, so will not hurry to move :) | 16:47 | ||
17:12
pyrimidine joined
|
|||
timotimo | maybe we should strike that item off the roadmap now? the reprop devirtualization? | 17:30 | |
17:31
FROGGS joined
|
|||
jnthn | I tend to do it each release. | 17:33 | |
(update it) | |||
But feel free to do it now...I think you have a commit bit to the site | 17:34 | ||
timotimo | ah | ||
yes, i do | |||
will do it later today | |||
i still need to find some benchmark that shows a time improvement with devirt'd reprops compared to without | 17:36 | ||
perl6-bench didn't show any improvement ;( | |||
jnthn | Did you try getting instruction count numbers under callgrind? | 17:37 | |
timotimo | i don't think i did, no | ||
jnthn | give it a go mebbe :) | 17:38 | |
Time for a train switch here... | |||
bbl | |||
FROGGS | [Coke]: what would that give us? (besides an even bigger repository to clone) | 18:11 | |
[Coke] | FROGGS: a place where we can test your code? We already gave up on repo size when we checked in stage 0. :) | 18:12 | |
jnthn | If you go through a few rebootstraps then you can squash them into one when merging a branch and then delete the branch, which can save space :) | 18:13 | |
But you have to know that they're coming :) | |||
FROGGS | [Coke]: I extensively tested my code... I only do branches for review/testing in case I'm uncertain | 18:14 | |
[Coke]: and here jnthn already reviewed it (even when he did not run/test it) | |||
but pushing a stage0 to a branch does not give us any value | |||
[Coke] | I disagree, but ok. it's your branch. | 18:15 | |
FROGGS | disagree about what detail? | ||
[Coke] | "that it doesn't give us any value" | ||
FROGGS | jnthn: true... maybe we need a 'call for new ops' besides the usual call for papers | 18:16 | |
[Coke]: I agree that branches are nice when you want to push code that needs testing by other ppl | |||
jnthn: you can't squash jvm's stage0 though, can you? | 18:18 | ||
jnthn | You can't squash shared history, no | ||
But when I was doing the moarvm support for example, I had about 6 different state 0s along the way. | |||
*stage | |||
And squashed those before the merge. | 18:19 | ||
Think I did similar with the JVM branch | |||
But yeah, once it's in master then...no luck. | |||
FROGGS | well, I think about the scenario where you have new ops in two branches, and these two ops are *used* in nqp... | ||
you won't be lucky there | |||
[Coke] | yes, you don't merge the generated files in that case, you regenerate them. | 18:20 | |
FROGGS | if new ops aren't used in nqp, you don't need a stage0 | ||
so there is no (easy) way to merge stage0's | |||
[Coke] | no one is suggesting that. | ||
sorry: *I* am not suggesting that. | 18:21 | ||
FROGGS | [Coke]: and you only can regenerate stage0 when you can build nqp | ||
jnthn | I don't think it happens often enough to really be worrying over it. | ||
FROGGS | yes, though it would be nice if we did not have to care about stage0... and it would be also nice to be able to refactor moarvm ops, but... | 18:24 | |
like I would like to change some ops so that they branch | 18:25 | ||
18:25
ggoebel2 joined
|
|||
FROGGS | but that seems impossible, even via creating (and afterwards deleting) a temp op | 18:25 | |
18:26
betterwo1ld joined
18:27
arnsholt joined
|
|||
jnthn | battery out & | 18:31 | |
18:36
brrt joined
19:38
camelia joined
20:27
lizmat joined
20:31
colomon joined
20:44
brrt joined
|
|||
timotimo | brrt: what's the reason behind making the stack bigger on unix as well? i thought we only need that on windows? | 21:16 | |
brrt | mostly macro trickery, really | 21:17 | |
the is-windows flag is set at dynasm time | |||
whereas we do the check at runtime, so we'd need to convert it at least to a c-preprocessor macro | 21:18 | ||
timotimo | mhm | 21:19 | |
brrt | or at least something like that | ||
i'm aware there are msvc-specific flags | |||
i won't do that because people may well be cross-compiling | |||
timotimo | ah, sure | 21:20 | |
i've just been told intel has the possibility to add into a memory address rather than just a register | 21:21 | ||
maybe we could get a decent size improvement if we did that throughout; and also for other opcodes? | |||
brrt | i'm actually not all that familiar with a very clear way to know we're targetting win64 abi | ||
ehm..... | |||
well, i'd think at least one of these has to be a register? | |||
timotimo | yes, sure | 21:22 | |
brrt | then i'm not sure how we could profit (now) | ||
timotimo | but rather than load A, load B, add A B, store A | ||
we could have load B, add | |||
load B, add to memory of A B, done | |||
brrt | no | ||
or | |||
yes, if the destination register C is identical to A | 21:23 | ||
may be a worthwhile optimization | |||
timotimo | usly | ||
... | 21:25 | ||
brrt | hmm? | 21:29 | |
its still a minor optimization since under the cover you'll do the exact same thing | |||
but in binary size it will help a bit | |||
timotimo | my laptop is oozing hard | 21:30 | |
oom-ing | |||
21:33
agentzh_ joined
|
|||
brrt | out-of-memory? | 21:33 | |
timotimo | yes | ||
unpacked a big tarball in /tmp | |||
brrt | and /tmp is on memory? | 21:34 | |
ugh, casts to (void*), one reason not to like c++ | 21:35 | ||
timotimo | i just figured that out :) | ||
ah | 21:38 | ||
the rm -rf just came through in my /tmp | |||
21:41
colomon joined
|
|||
brrt | :-) | 21:43 | |
brrt afk | |||
see you tomorrow | |||
lizmat | brrt: good night! | 21:44 | |
brrt | good night :-) | ||
22:36
lizmat joined
|
|||
timotimo | in one case we'd have load accumulator, load argument, add/subtract/whatever, store accumulator; in the other case we'd have load argument, add/subtract/whatever in-place; that could indeed be a little win | 23:01 | |
and i even found an example where there's more adds and subtracts with the accumulator going into the same register as it was loaded from | 23:02 | ||
236 accumulator_stats: sub_i: same | |||
149 accumulator_stats: add_i: same | |||
136 accumulator_stats: add_i: different | |||
58 accumulator_stats: sub_i: different | |||
this being t/spec/S05-mass/properties-script.rakudo.moar | |||
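To put rough numbers on the idea, here is a minimal sketch that models the two instruction sequences timotimo describes and applies them to the counts listed above. The emitted strings are illustrative pseudo-assembly, not what the MoarVM JIT actually generates; `emit_add`, the register names, and the slot names are hypothetical. The counts are static occurrences from this one test file, so the result says nothing about run time.

```python
def emit_add(dst, a, b):
    """Return an illustrative instruction list for dst = a + b on memory slots."""
    if dst == a:
        # Accumulator case: add straight into the memory slot of a.
        return [f"mov rax, [{b}]", f"add [{a}], rax"]
    # General case: load both operands, add in a register, store the result.
    return [
        f"mov rax, [{a}]",
        f"mov rcx, [{b}]",
        "add rax, rcx",
        f"mov [{dst}], rax",
    ]


# Counts copied from the accumulator_stats lines in the log.
stats = {
    "sub_i": {"same": 236, "different": 58},
    "add_i": {"same": 149, "different": 136},
}

same_total = sum(op["same"] for op in stats.values())
all_total = sum(op["same"] + op["different"] for op in stats.values())

# Instructions saved per op when the in-place form applies.
saved_per_op = len(emit_add("r1", "r0", "r2")) - len(emit_add("r0", "r0", "r2"))
saved_total = same_total * saved_per_op

if __name__ == "__main__":
    print(f"{same_total} of {all_total} add/sub ops write back to their first operand")
    print(f"modelled saving: {saved_total} instructions")
```

As brrt notes earlier, the processor does much the same work either way, so any gain from this would mainly be in code size.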
i could imagine this is the regex engine | 23:05 | ||
huh, that test now fails? :\ | 23:18 | ||
23:20
agentzh_ joined
|
|||
timotimo | ah, confused by intel syntax | 23:22 | |
haven't asm in a long time, it seems |