|
01:08
colomon joined
01:10
agentzh_ joined
02:10
agentzh_ joined
02:53
colomon joined
03:39
agentzh_ joined
04:54
agentzh_ joined
06:26
FROGGS joined
06:49
zakharyas joined
06:52
agentzh_ joined
06:57
Ven joined
07:32
lizmat joined
08:10
lizmat joined
08:38
lizmat joined
08:52
lizmat_ joined
08:54
lizmat joined
08:58
lizmat_ joined
09:34
lizmat_ joined
09:39
lizmat_ joined
09:43
lizmat joined
09:48
agentzh_ joined
09:51
lizmat_ joined
09:56
lizmat joined
09:58
flussence joined
10:04
lizmat_ joined
10:21
lizmat joined
10:34
lizmat_ joined
10:37
lizmat__ joined,
lizmat___ joined
11:19
brrt joined
11:25
rurban joined
11:26
agentzh_ joined
|
|||
| brrt | \o | 11:28 | |
| masak | o/ | ||
| jnthn | o/ brrt | 11:29 | |
| brrt | o/ jnthn, masak | 11:31 | |
| you all still at osdc? | |||
| or flying back already | |||
| jnthn | brrt: Still here at a small hackathon after it, but leaving by train this evening | 11:33 | |
| brrt | :-o that is quite a distance (google maps tells me 7 h 37 min) | 11:35 | |
| masak | I'm not at OSDC :/ | 11:37 | |
| brrt | well, too bad, but neither am i :-) | ||
| jnthn | brrt: I'm only going as far as Goteborg... | 11:39 | |
| brrt | ah, thats probably more reasonable :-0 | ||
| :-) | |||
| is osdc a perl conference? | 11:40 | ||
| they do use act | |||
| jnthn | It was Perl and QT and civic hacking and more | 11:45 | |
| brrt | so pretty broad actually. nice | 11:46 | |
| jnthn | Yeah, it was a nice event | ||
|
12:16
Ven joined
|
|||
| masak | seemed like a nice event from the online echoes... :) | 12:16 | |
| jnthn | With MoarVM HEAD, on Windows (64-bit), building NQP immediately explodes like this: | 13:38 | |
| JIT: trying to pass arguments in local space (stack top offset: 64, size: 8) at <unknown>:1 (src\vm\moar\stage0/QRegex.moarvm:!cursor_pass:4294967295) | |||
|
13:41
agentzh_ joined
|
|||
| brrt | what, what | 13:41 | |
| ow, i see | |||
| two things | 13:42 | ||
| a): that shouldn't be an exception, you silly persion | |||
| jnthn | I don't immediately understand enough to fix it but I'm guessing it's the repr op devirt now has an arg list too long to JIT code for on Windows. | 13:43 | |
| brrt | s/persion/person/ | ||
| yes, that is precisely what's going on | |||
| ugh, blame me for not checking on windows, i should have | 13:44 | ||
| theres an easy fix though | |||
| jnthn | OK, I'll leave it to you :) | 13:45 | |
| brrt | :-) | 13:47 | |
| FROGGS | okay, so this is nothing about my QRegex work? | 14:20 | |
| jnthn | No | 14:22 | |
| timotimo | sorry about blowing stuff up with devirt :( | ||
| brrt | np :-) | 14:23 | |
| timotimo | the "easy fix" isn't in moar yet, though? | ||
| brrt | no | ||
| how much stack space do we want | |||
| we have 64 bytes | |||
| we should increase this to how much? | |||
| timotimo | and i'm blowing that already? | ||
| brrt | no | 14:24 | |
| you have something with 8 arguments? | |||
| eh, 9 arguments, probably | |||
| 0..8 | |||
| timotimo | i don't know actually | ||
| yeah, one with 9 | 14:25 | ||
| the getattr series of ops | |||
| and the bindattr series, too | |||
| brrt | getattrs and bindattrs, indeed | 14:26 | |
| anyway | |||
| how much do we want to increase this to? | 14:27 | ||
| jnthn | brrt: Is there a cost to increase it? | ||
| brrt | 128 bytes? | ||
| hardly | |||
| jnthn | OK, then 128 | ||
| brrt | 128 is 0x100 iirc | ||
| jnthn | m: say 0x100 | 14:28 | |
| camelia | rakudo-moar f9c982: OUTPUT«256␤» | ||
| jnthn | :) | ||
| m: say 0x80 | |||
| camelia | rakudo-moar f9c982: OUTPUT«128␤» | ||
| timotimo | 88 bytes per hour! | 14:29 | |
| (you'll see some crazy shit) | |||
| brrt | compiling | ||
| aye, you're right | |||
| hmmm | |||
| wait a minute | |||
| then my check is wrong | |||
| let me think a bit longer | 14:30 | ||
| timotimo | if we make the stack much bigger, we'll possibly consume more cache lines with unused data? | ||
| brrt | who cares :-) | ||
| but i'm checking the calculation | |||
| ok, we allocate 0x80 bytes, is 128 bytes | 14:31 | ||
| then we pass the arguments by offset from rsp upwards | 14:32 | ||
| masak | m: say 128.base(16) | 14:33 | |
| camelia | rakudo-moar f9c982: OUTPUT«80␤» | ||
| masak | m: say "0x", 128.base(16) | ||
| camelia | rakudo-moar f9c982: OUTPUT«0x80␤» | ||
| brrt | we store the work registers from rbp downward, in the range 0...0x20 | 14:34 | |
| m: say 0x20 | |||
| camelia | rakudo-moar f9c982: OUTPUT«32␤» | ||
| brrt | so... | ||
| if we have 0x80 bytes allocated | |||
| and we write downwards as the position gets higher | |||
| m: say 0x80 - 0.20; | |||
| camelia | rakudo-moar f9c982: OUTPUT«127.8␤» | ||
| brrt | x | ||
| m: say 0x80 - 0x20; | 14:35 | ||
| camelia | rakudo-moar f9c982: OUTPUT«96␤» | ||
| brrt | we actually have 96 bytes over | 14:36 | |
| ok, i have time for you again | 14:56 | ||
| :-) | |||
| oh goody, windows uses 0x20 bytes from the 0x96 for the first 4 arguments | 14:57 | ||
| eh not from the 0x96 but from the argument space | |||
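The reported failure ("stack top offset: 64, size: 8") can be reproduced as plain arithmetic. This is a minimal sketch, assuming the Win64 convention that argument `i` of a call lives at `rsp + 8*i` (the first four slots being the home space for the register arguments) and assuming the JIT rejects any argument whose slot does not fit in the reserved area. `fits_in_arg_area` is a hypothetical helper for illustration, not MoarVM code.

```python
SLOT_SIZE = 8  # bytes per argument slot on x86-64


def fits_in_arg_area(arg_index, area_bytes):
    """True if the slot for argument `arg_index` lies fully inside the area.

    Assumption: the slot for argument i starts at offset i * SLOT_SIZE
    from the stack top, as in the Win64 calling convention.
    """
    offset = arg_index * SLOT_SIZE
    return offset + SLOT_SIZE <= area_bytes


# With 64 bytes of argument space, find the first argument that no longer fits.
first_failing = next(i for i in range(32) if not fits_in_arg_area(i, 64))
print(first_failing, first_failing * SLOT_SIZE)  # prints "8 64"
```

Under these assumptions arguments 0..7 fit and the ninth argument (index 8) lands at offset 64, which matches the error message and the nine-argument getattr/bindattr ops mentioned in the discussion.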
|
14:59
FROGGS joined
|
|||
| brrt | but the other bytes, we use for scratch space | 15:05 | |
| so in short | |||
| we have the following layout | |||
| [ 0x20 (save stable regs) | 0x20 (scratch space) | 0x20 (arg space on windows) | 0x20 (reserve arg space on windows) ] | 15:06 | ||
| for posix that is the same, except that we have 0x40 bytes of stack space near the top | 15:08 | ||
| because posix doesn't reserve the 0x20 bytes near the top | |||
| now, if we use not 0x80 but 0x100 bytes of stack space... | 15:09 | ||
| (that's quite liberal, innit) | |||
| then we have on windows | |||
| [ 0x20 a | 0x20 b | ... | 0x20 d ] | |||
| m: (0x100 - 0x60).say | 15:10 | ||
| camelia | rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤  in block <unit> at /tmp/8wjVgHxMg7:1␤␤» | ||
| brrt | m: say 0x100 - 0x20 | ||
| camelia | rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤  in block <unit> at /tmp/LGDkokDxYc:1␤␤» | ||
| brrt | m: say (0x100 - 0x60).Str | ||
| camelia | rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤  in block <unit> at /tmp/mGa4QMTP1G:1␤␤» | ||
| brrt | m: 0x20.say; | ||
| camelia | rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤  in block <unit> at /tmp/065b7Rp6bV:1␤␤» | ||
| brrt | m: "OH HAI".say; | ||
| camelia | rakudo-moar 4c1d2d: OUTPUT«write string requires an object with REPR MVMOSHandle␤  in block <unit> at /tmp/1H9fAjXdWR:1␤␤» | ||
| brrt | what | ||
| we have no fewer than 0xa0 bytes left in that case | 15:11 | ||
| or 160 | |||
| that means space for 20 arguments in total (windows) | 15:12 | ||
| or 30 arguments on posix, if i'm correct | 15:13 | ||
| dalek | MoarVM: bec36ae | brrt++ | src/jit/emit_x64.dasc: Increase stack space for call arguments. This may be a costly change, because it makes our stack space two cache lines large. But it should make moar work on windows again. | 15:22 | |
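The argument counts quoted for the 0x100-byte frame follow from the numbers in the chat. A rough sketch of that arithmetic, not taken from the MoarVM source: the 0x60 reserved bytes on Windows and the 0xa0 remainder come from the discussion, while the POSIX split (0x40 bytes reserved, plus six register arguments under the System V AMD64 convention) is an assumption made here to show one way the figure of 30 can arise. `slots_left` is a hypothetical helper.

```python
SLOT_SIZE = 8  # bytes per argument slot on x86-64
FRAME = 0x100  # enlarged frame size discussed in the chat


def slots_left(frame_bytes, reserved_bytes):
    """Number of 8-byte argument slots remaining after the reserved part."""
    return (frame_bytes - reserved_bytes) // SLOT_SIZE


# Windows: 0x100 - 0x60 = 0xa0 bytes left (figures from the chat).
win_total = slots_left(FRAME, 0x60)

# POSIX (assumption): only saved registers + scratch (0x40) are reserved,
# and the first six integer arguments travel in registers (System V AMD64).
posix_stack_slots = slots_left(FRAME, 0x40)
posix_total = posix_stack_slots + 6

print(hex(FRAME - 0x60), win_total, posix_total)  # prints "0xa0 20 30"
```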
|
15:23
agentzh_ joined
|
|||
| nwc10 | brrt: presumably you could do that just on Windows? But, for now, I'm assuming KISS is best | 15:23 | |
| (fix it between September and Christmas) | |||
| FROGGS | .tell brrt I was able to build: perl6 version 2015.04-221-gfce74e1 built on MoarVM version 2015.04-105-gbec36ae | 15:37 | |
| nwc10 | FROGGS: on what OS? (implied is something win) | 15:39 | |
| FROGGS | nwc10: win 7 x64 | ||
| TimToady | doesn't seem to run any slower here | 15:40 | |
| FROGGS | I'm also checking that as we "speak" | ||
| nwc10 | if nothing references those cache lines, it won't matter, other than page size, will it? | ||
| by "references" I'm meaning "reads from or writes to" | |||
| FROGGS | yeah, does not feel slower | 15:47 | |
|
15:52
Ven joined
|
|||
| [Coke] | (can't we do a stage0 update in a branch and then when merging, only merge everything else, and redo the stage0 update in master?) | 15:59 | |
| lizmat | I had an unexplained slowdown late last night, that I fixed by nuking install and rebuilding | 16:11 | |
|
16:44
lizmat joined
|
|||
| jnthn waves from a train | 16:44 | ||
| brrt++ # fixing stuff | |||
| nwc10 | jnthn: a moving train? in the right direction? | 16:45 | |
| jnthn | yes, but I managed to get in the carriages that only go part of the way | 16:46 | |
| But I've no idea if there are many free seats in the carriages that go all the way, so will not hurry to move :) | 16:47 | ||
|
17:12
pyrimidine joined
|
|||
| timotimo | maybe we should strike that item off the roadmap now? the reprop devirtualization? | 17:30 | |
|
17:31
FROGGS joined
|
|||
| jnthn | I tend to do it each release. | 17:33 | |
| (update it) | |||
| But feel free to do it now...I think you have a commit bit to the site | 17:34 | ||
| timotimo | ah | ||
| yes, i do | |||
| will do it later today | |||
| i still need to find some benchmark that shows a time improvement with devirt'd reprops compared to without | 17:36 | ||
| perl6-bench didn't show any improvement ;( | |||
| jnthn | Did you try getting instruction count numbers under callgrind? | 17:37 | |
| timotimo | i don't think i did, no | ||
| jnthn | give it a go mebbe :) | 17:38 | |
| Time for a train switch here... | |||
| bbl | |||
| FROGGS | [Coke]: what would that give us? (besides an even bigger repository to clone) | 18:11 | |
| [Coke] | FROGGS: a place where we can test your code? We already gave up on repo size when we checked in stage 0. :) | 18:12 | |
| jnthn | If you go through a few rebootstraps then you can squash them into one when merging a branch and then delete the branch, which can save space :) | 18:13 | |
| But you have to know that they're coming :) | |||
| FROGGS | [Coke]: I extensively tested my code... I only do branches for review/testing in case I'm uncertain | 18:14 | |
| [Coke]: and here jnthn already reviewed it (even when he did not run/test it) | |||
| but pushing a stage0 to a branch does not give us any value | |||
| [Coke] | I disagree, but ok. it's your branch. | 18:15 | |
| FROGGS | disagree about what detail? | ||
| [Coke] | "that it doesn't give us any value" | ||
| FROGGS | jnthn: true... maybe we need a 'call for new ops' besides the usual call for papers | 18:16 | |
| [Coke]: I agree that branches are nice when you want to push code that needs testing by other ppl | |||
| jnthn: you can't squash jvm's stage0 though, can you? | 18:18 | ||
| jnthn | You can't squash shared history, no | ||
| But when I was doing the moarvm support for example, I had about 6 different state 0s along the way. | |||
| *stage | |||
| And squashed those before the merge. | 18:19 | ||
| Think I did similar with the JVM branch | |||
| But yeah, once it's in master then...no luck. | |||
| FROGGS | well, I think about the scenario where you have new ops in two branches, and these two ops are *used* in nqp... | ||
| you won't be lucky there | |||
| [Coke] | yes, you don't merge the generated files in that case, you regenerate them. | 18:20 | |
| FROGGS | if new ops aren't used in nqp, you don't need a stage0 | ||
| so there is no (easy) way to merge stage0's | |||
| [Coke] | no one is suggesting that. | ||
| sorry: *I* am not suggesting that. | 18:21 | ||
| FROGGS | [Coke]: and you only can regenerate stage0 when you can build nqp | ||
| jnthn | I don't think it happens often enough to really be worrying over it. | ||
| FROGGS | yes, though it would be nice if we did not have to care about stage0... and it would be also nice to be able to refactor moarvm ops, but... | 18:24 | |
| like I would like to change some ops so that they branch | 18:25 | ||
|
18:25
ggoebel2 joined
|
|||
| FROGGS | but that seems impossible, even via creating (and afterwards deleting) a temp op | 18:25 | |
|
18:26
betterwo1ld joined
18:27
arnsholt joined
|
|||
| jnthn | battery out & | 18:31 | |
|
18:36
brrt joined
19:38
camelia joined
20:27
lizmat joined
20:31
colomon joined
20:44
brrt joined
|
|||
| timotimo | brrt: what's the reason behind making the stack bigger on unix as well? i thought we only need that on windows? | 21:16 | |
| brrt | mostly macro trickery, really | 21:17 | |
| the is-windows flag is set at dynasm time | |||
| whereas we do the check at runtime, so we'd need to convert it at least to a c-preprocessor macro | 21:18 | ||
| timotimo | mhm | 21:19 | |
| brrt | or at least something like that | ||
| i'm aware there are msvc-specific flags | |||
| i won't do that because people may well be cross-compiling | |||
| timotimo | ah, sure | 21:20 | |
| i've just been told intel has the possibility to add into a memory address rather than just a register | 21:21 | ||
| maybe we could get a decent size improvement if we did that throughout; and also for other opcodes? | |||
| brrt | i'm actually not all that familiar with a very clear way to know we're targeting win64 abi | ||
| ehm..... | |||
| well, i'd think at least one of these has to be a register? | |||
| timotimo | yes, sure | 21:22 | |
| brrt | then i'm not sure how we could profit (now) | ||
| timotimo | but rather than load A, load B, add A B, store A | ||
| we could have load B, add | |||
| load B, add to memory of A B, done | |||
| brrt | no | ||
| or | |||
| yes, if the destination register C is identical to A | 21:23 | ||
| may be a worthwhile optimization | |||
| timotimo | usly | ||
| ... | 21:25 | ||
| brrt | hmm? | 21:29 | |
| its still a minor optimization since under the cover you'll do the exact same thing | |||
| but in binary size it will help a bit | |||
| timotimo | my laptop is oozing hard | 21:30 | |
| oom-ing | |||
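The add-to-memory idea can be illustrated with a toy emitter. This is only a sketch under stated assumptions: the instruction strings are illustrative x86-64 in Intel syntax, `emit_add` and the operand names are hypothetical, and nothing here reflects the actual DynASM templates in MoarVM. It shows the point made in the chat, that the shorter sequence only applies when the destination is the same VM register as the first operand.

```python
def emit_add(dst, a, b):
    """Return pseudo-assembly for dst = a + b, with VM registers kept in memory.

    dst, a and b are symbolic names of memory-backed VM registers.
    """
    if dst == a:
        # Destination equals first operand: load b, then add into memory in place.
        return [f"mov rcx, [{b}]", f"add [{a}], rcx"]
    # General case: load both operands, add in registers, store the result.
    return [
        f"mov rax, [{a}]",
        f"mov rcx, [{b}]",
        "add rax, rcx",
        f"mov [{dst}], rax",
    ]


in_place = emit_add("r0", "r0", "r1")
general = emit_add("r2", "r0", "r1")
print(len(in_place), len(general))  # prints "2 4"
```

As brrt notes, the processor still performs the load, add and store internally, so the expected gain is mainly in code size.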
|
21:33
agentzh_ joined
|
|||
| brrt | out-of-memory? | 21:33 | |
| timotimo | yes | ||
| unpacked a big tarball in /tmp | |||
| brrt | and /tmp is on memory? | 21:34 | |
| ugh, casts to (void*), one reason not to like c++ | 21:35 | ||
| timotimo | i just figured that out :) | ||
| ah | 21:38 | ||
| the rm -rf just came through in my /tmp | |||
|
21:41
colomon joined
|
|||
| brrt | :-) | 21:43 | |
| brrt afk | |||
| see you tomorrow | |||
| lizmat | brrt: good night! | 21:44 | |
| brrt | good night :-) | ||
|
22:36
lizmat joined
|
|||
| timotimo | in one case we'd have load accumulator, load argument, add/subtract/whatever, store accumulator; in the other case we'd have load argument, add/subtract/whatever in-place; that could indeed be a little win | 23:01 | |
| and i even found an example where there's more adds and subtracts with the accumulator going into the same register as it was loaded from | 23:02 | ||
| 236 accumulator_stats: sub_i: same | |||
| 149 accumulator_stats: add_i: same | |||
| 136 accumulator_stats: add_i: different | |||
| 58 accumulator_stats: sub_i: different | |||
| this being t/spec/S05-mass/properties-script.rakudo.moar | |||
| i could imagine this is the regex engine | 23:05 | ||
| huh, that test now fails? :\ | 23:18 | ||
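The accumulator_stats tallies quoted above can be summed to see how often the in-place form would apply in that one test file. A small sketch using only the four counts from the log; the dictionary layout is an arbitrary choice for illustration.

```python
# Counts copied from the accumulator_stats lines in the log.
counts = {
    ("sub_i", "same"): 236,
    ("add_i", "same"): 149,
    ("add_i", "different"): 136,
    ("sub_i", "different"): 58,
}

same = sum(n for (_, kind), n in counts.items() if kind == "same")
total = sum(counts.values())

print(f"{same}/{total} = {same / total:.1%}")  # prints "385/579 = 66.5%"
```

So in this sample roughly two thirds of the add_i/sub_i ops write back to the register they read from, which is the case where an in-place add or subtract could be used.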
|
23:20
agentzh_ joined
|
|||
| timotimo | ah, confused by intel syntax | 23:22 | |
| haven't asm in a long time, it seems | |||