dalek | MoarVM: cfd7fd6 | jimmy++ | src/jit/emit_x64.dasc: jit cmp_i |
03:10 | |
03:11
jimmy_ joined
|
|||
JimmyZ | ^^ I don't know how to test it. nqp::cmp_i doesn't help me :( | 03:12 | |
dalek | MoarVM: e5379eb | jimmy++ | src/jit/graph.c: forgot to add cmp_i to jgb_consume_ins |
03:36 | |
07:21
FROGGS joined
|
|||
dalek | MoarVM: 4ee566d | jimmy++ | src/jit/graph.c: jit isnanorinf |
07:51 | |
MoarVM: 20b7d1c | jimmy++ | src/jit/ (2 files): jit cmp_n |
08:34 | ||
jnthn | If you don't know how to test stuff, putting it in master is probably not the best idea, is it? | 08:35 | |
JimmyZ | you're right, but it seems I didn't break roast; I suspect nothing gets jitted at all | 09:37 | |
JimmyZ waits for timotimo++ :) | |||
10:07
zakharyas joined
|
|||
dalek | MoarVM: dbc4776 | jimmy++ | src/jit/emit_x64.dasc: small fix |
10:29 | |
13:02
brrt joined
|
|||
brrt | \o | 13:02 | |
i had two ideas while cycling today :-) | 13:04 | ||
one is to try and debug our seemingly mac os x-only issues on opendarwin | |||
however, opendarwin has died, and puredarwin (its successor) isn't really very active either | |||
so i'll have to see about that | 13:05 | ||
the other was about WHICH | |||
iirc, jnthn fixed WHICH instability by pre-allocating a gen2 spot for gen1 objects if they ever call WHICH | |||
but there's another, simpler and (i think) cheaper way to fix it | 13:06 | ||
which is to allocate a number to each nursery round, and use the combination of (nursery_round << 32 | nursery_offset) as the basis for WHICH | 13:07 | ||
hmm... only one problem there, that's not going to fly well for stuff that is allocated directly into gen2 | |||
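The packing brrt proposes can be sketched in C as below. This is only an illustration of the bit layout from the message above; the function names and the choice of 32-bit fields are assumptions, not MoarVM code, and the sketch does nothing about the gen2 problem brrt raises himself.

```c
#include <stdint.h>

/* Hypothetical helper: combine the nursery round counter (high 32 bits)
 * with the object's offset inside the nursery (low 32 bits). */
static uint64_t which_id(uint32_t nursery_round, uint32_t nursery_offset) {
    return ((uint64_t)nursery_round << 32) | (uint64_t)nursery_offset;
}

/* Hypothetical accessors that take the id apart again. */
static uint32_t which_round(uint64_t id)  { return (uint32_t)(id >> 32); }
static uint32_t which_offset(uint64_t id) { return (uint32_t)(id & 0xFFFFFFFFu); }
```

Two objects that land at the same nursery offset in different rounds get different ids, because the round number occupies the upper half of the value.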
lizmat | fwiw, jnthn is commuting atm | 13:41 | |
brrt | i see | 13:42 | |
any progress on the os-x based issues yet? | |||
lizmat | no, not yet | 14:04 | |
jnthn | brrt: The trouble is, objects that make it into gen2 would then need an entry in some table saying where they used to be in the nursery | 14:08 | |
brrt | that's easy, just prefix it to the object header | 14:10 | |
i.e. instead of copying it directly, first copy its 'WHICH' number, and then the object header | |||
eh object structure | |||
not just the header | |||
:-) | 14:11 | ||
\o jnthn by the way :-) | |||
14:11
Util joined
|
|||
timotimo | thank you, jimmyz for implementing these ops in the jit :) | 14:12 | |
jnthn | The whole point of this design was to not have to make objects bigger. | ||
timotimo | oh no, the object doesn't get bigger! it just takes up more space than you'd think! :P | 14:13 | |
also, code that scans the gen2, like the collector or my gdb plugin, would need to intuit if the first 64 bits are a number or part of the object header ... | |||
jnthn | Right. And you'd have to conditionally add it on to get the size bin to allocate in... | 14:14 | |
And then what if nobody does .WHICH until the thing is in gen2? | |||
So many ways for things to go wrong. | |||
timotimo | jimmyz, in my experience a nqp build and a rakudo build will exercise things like _i ops well enough | 14:16 | |
and a spectest run will exercise _n ops well, usually | 14:17 | ||
i think the nqp test suite will also exercise _n stuff a fair bit | |||
can always gather jit logs to see if the ops ended up in compiled output | |||
brrt | right timotimo :-) | 14:18 | |
jnthn | Another good way is to write a loop that uses the op you want to exercise LOADS of times :) | ||
brrt | true jnthn | ||
jnthn | That'll get it jitted :) | ||
brrt is reviewing the jit additions | |||
i thought we had cmp_i already? | |||
timotimo | hehe | ||
we had cmp_I | 14:19 | ||
brrt | ah i see :-) | ||
14:20
JimmyZ joined
|
|||
brrt | hmmm | 14:20 | |
JimmyZ | timotimo: I don't think any test triggered the cmp_i :( | ||
I mean the jitted cmp_i | |||
timotimo | hm, OK | 14:21 | |
i'm going to have a look | |||
JimmyZ | and cmp_n | ||
brrt | hmmm | ||
JimmyZ | and maybe isnanorinf | ||
brrt | i don't think jitted cmp_i is right as it is written | ||
please confirm this for me: cmp_i does: | 14:22 | ||
setg al; movzx rax, al; setl al; movzx r8, al; sub rax, tmp3; | 14:23 | ||
al is part of rax | |||
tmp3 = r8 | |||
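The register clash brrt points at can be mimicked in plain C by treating rax as a 64-bit variable whose low byte stands in for al. This is a simulation written for illustration, not the dynasm source; it assumes the flag-setting compare has already happened, and all function names are hypothetical.

```c
#include <stdint.h>

/* Reference semantics of a three-way integer compare: -1, 0 or 1. */
static int64_t cmp_i_ref(int64_t a, int64_t b) {
    return (int64_t)(a > b) - (int64_t)(a < b);
}

/* Model of writing the low byte (al / r8b) of a 64-bit register. */
static uint64_t set_low_byte(uint64_t reg, uint8_t v) {
    return (reg & ~(uint64_t)0xFF) | v;
}

/* Simulates: setg al; movzx rax, al; setl al; movzx r8, al; sub rax, r8 */
static int64_t cmp_i_clobbered(int64_t a, int64_t b) {
    uint64_t rax = 0, r8;
    rax = set_low_byte(rax, (uint8_t)(a > b)); /* setg al */
    rax = rax & 0xFF;                          /* movzx rax, al */
    rax = set_low_byte(rax, (uint8_t)(a < b)); /* setl al: overwrites the setg result */
    r8  = rax & 0xFF;                          /* movzx r8, al */
    return (int64_t)(rax - r8);                /* sub rax, r8 */
}

/* Same sequence, but the second flag goes into r8b instead of al. */
static int64_t cmp_i_fixed(int64_t a, int64_t b) {
    uint64_t rax = 0, r8 = 0;
    rax = set_low_byte(rax, (uint8_t)(a > b)); /* setg al */
    rax = rax & 0xFF;                          /* movzx rax, al */
    r8  = set_low_byte(r8, (uint8_t)(a < b));  /* setl r8b */
    r8  = r8 & 0xFF;                           /* movzx r8, r8b */
    return (int64_t)(rax - r8);                /* sub rax, r8 */
}
```

In the clobbered version both operands of the final subtraction hold the "less than" flag, so the result is 0 for every input; keeping the second flag in a separate byte register restores the reference behaviour.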
JimmyZ | so I can use rcx ? | 14:24 | |
brrt | hmmm | 14:26 | |
yes, for example | |||
better yet | |||
r8b | |||
known as 'TMP3b' iirc in renamed-registers | |||
JimmyZ | brrt: BTW I did `objdump interp.o` and saw something like seta %sil, does dynasm support sil ? | 14:27 | |
brrt | sil is rsi lower byte | ||
i think so, yes | |||
corsix.github.io/dynasm-doc/instructions.html | |||
JimmyZ | sometimes I want to just copy the asm code from objdump interp.o to the jit code | ||
brrt | please be careful with that | 14:28 | |
14:28
kjs_ joined
|
|||
brrt | :-) | 14:28 | |
JimmyZ | and with a bit modification | ||
brrt | but it's a good basis to start from, i think | ||
[Coke] | moar sees a slight improvement on OS X today. | 14:30 | |
JIT failing now 75 tests, non-jit only 8 | |||
JimmyZ | ok, feel free to patch it or I will patch it tomorrow :) | ||
timotimo | in what way is cmp_i currently wrong? | 14:31 | |
[Coke] | (ah, the jit #'s are fluctuating. was at 14 jit failures saturday…) | ||
jnthn | [Coke]: Sadly, I think that's just the heisenbug being less heisen... | ||
Or less bug | |||
:) | |||
brrt | :-) i'll fix it | ||
jnthn | I got access to OSX and reproduced the bug, anyways. | ||
JimmyZ | Oh, don't say my wrong code makes OSX happy | ||
brrt | JimmyZ++ for working on JIT :-) | ||
not all that wrong, just not right yet | |||
JimmyZ | and how about cmp_n? | 14:32 | |
JimmyZ hasn't writting asm code for 10 years... | 14:33 | ||
*written | |||
[Coke] | jnthn: true, could just be noise. Please let me know if there's more I can do to help track this down. (aside from, you know, tracking it down and fixing it. :) | 14:35 | |
jnthn | [Coke]: Well, you could teach my TDD in Java class tomorrow if you like... :P | ||
[Coke] | That's fine, just need someone to cover my coldfusion dayjob here. | 14:39 | |
jnthn | So dependencies... :) | ||
[Coke] | but there's no coffee until they fix the water, so that's probably the hardest ask. :) | ||
jnthn | No coffee at work? That's probably illegal in Sweden... :P | 14:40 | |
brrt | i haven't looked at cmp_n and tbh i don't know the whole sse stuff that well | 14:49 | |
timotimo | using sse means our code is extra fast!! | ||
that's all i know. | |||
brrt | ... :-) | ||
why not run it on CUDA :-P | |||
timotimo | can we outsource it into amazon's elastic cloud? | 14:50 | |
JimmyZ | Didn't you write eq_n and etc? :P | 14:52 | |
brrt | eq_n is much simpler iirc | ||
perhaps not much simpler | 14:53 | ||
but simple in comparison | |||
JimmyZ | I compared gt_n in emit_x64.dasc and cmp_n in interp.o, and think cmp_n is simple too | 14:54 | |
jnthn | .oO( "in comparison" :D ) |
14:57 | |
brrt | :-) | ||
i wonder how to emit a cmp_i instruction | |||
JimmyZ | ;) | 14:59 | |
[Coke] | jnthn: I am making do with diet soda with extra caffeine. | 15:03 | |
timotimo | over the recent months i've seen more and more articles about diet soda being worse than non-diet soda | 15:05 | |
brrt | hang on, /me has fix. | ||
timotimo | allegedly the artificial sweetness signalizes "sugar's coming!" to the body, but no sugar is in fact coming in | ||
OSLT | |||
brrt finds that nonsense | |||
your brain is not that stupid | |||
as in, your brain actually senses the actual availability of glucose | 15:06 | ||
timotimo | ah? interesting :) | ||
brrt | and controls its production via the autonomic neural pathways in the liver | ||
yes. long story. i did my bachelors thesis about it | |||
timotimo | that's excellent! | 15:07 | |
brrt | :-) | 15:08 | |
jnthn | .oO( Beer is way simpler, doesn't come in diet variety at all... ) |
||
timotimo | it comes in a "root" variety, though | ||
brrt | gist.github.com/bdw/aa900db7e31128cd98a3 | 15:09 | |
timotimo | seems like a strange idea to use cmp_i and then compare > 0 %) | ||
brrt | see if i care :-P | 15:10 | |
timotimo | :) | ||
brrt | but yeah, you could do something like: my int $j := nqp::cmp_i($x, 50); - that would give you a somewhat easier readout | 15:11 | |
15:11
kjs_ joined
|
|||
brrt | anyway, which of the patches do you like better | 15:11 | |
subtract bytes and sign extends | |||
or | |||
zero extend bytes and subtract | |||
the former is one fewer operation :-) | |||
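The two orderings brrt is choosing between can be written out in C to check that they agree. This is a sketch with hypothetical function names; the casts stand in for the byte-sized sub, movsx and movzx steps, and the narrowing cast to int8_t relies on the usual two's-complement behaviour of common compilers.

```c
#include <stdint.h>

/* Variant 1: subtract the two flag bytes, then sign-extend the 8-bit result. */
static int64_t cmp_sub_then_sext(int64_t a, int64_t b) {
    uint8_t g = (uint8_t)(a > b);
    uint8_t l = (uint8_t)(a < b);
    uint8_t diff = (uint8_t)(g - l);  /* 8-bit sub: 0x00, 0x01 or 0xFF */
    return (int64_t)(int8_t)diff;     /* sign extension: 0xFF becomes -1 */
}

/* Variant 2: zero-extend both flag bytes, then do a 64-bit subtract. */
static int64_t cmp_zext_then_sub(int64_t a, int64_t b) {
    uint64_t g = (uint64_t)(a > b);   /* zero extension of the first flag */
    uint64_t l = (uint64_t)(a < b);   /* zero extension of the second flag */
    return (int64_t)(g - l);          /* 64-bit sub wraps to -1 when a < b */
}
```

Variant 1 needs one subtraction and one extension after the two set instructions, while variant 2 needs two extensions and one subtraction, which is the "one fewer operation" difference.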
anyway, i can push the patch :-) | 15:13 | ||
(the earlier, that is) | |||
dalek | MoarVM: 23c16b4 | (Bart Wiegmans)++ | src/jit/emit_x64.dasc: Fix cmp_i rax and al share registers. I think it is better to explicitly use the 'renamed registers' so that it is clear when they conflict |
||
15:19
brrt left,
brrt joined
|
|||
timotimo | brrt: which operations are cheaper on processors? :P | 15:19 | |
brrt | i have no idea | 15:20 | |
my suspicion is that sign extending is more expensive than zero extending, and that there is no difference between 64 bit and 8 bit subtract | 15:21 | ||
so that the first version might pay off | |||
JimmyZ | timotimo: Oh I know sub/add are cheaper than inc/dec, so it's good to change them | 15:23 | |
timotimo | wait what? | 15:24 | |
JimmyZ | brrt: Will you commit the gist one? which is cheaper? | ||
:P | |||
brrt | no idea | 15:25 | |
the thing is clear enough though | |||
JimmyZ | timotimo: github.com/openresty/sregex/commit...ca5e3b099a | 15:27 | |
timotimo | whaaaaaat | ||
brrt | thats lame | ||
but that may be workload dependent | 15:28 | ||
jnthn | wtf | ||
Mebbe inc is spec'd in some interesting way that makes it harder to do fast. | 15:29 | ||
timotimo | yeah, it must have some semantic difference | ||
like overflowing vs not overflowing | |||
setting exception flags or not | |||
no clue | |||
JimmyZ | I'm sure he won't only test it one time ;) | 15:30 | |
timotimo | yeah, it's very surprising | ||
if there's no significant difference between inc and add $foo, 1, inc would be implemented the exact same way as add $foo, 1 :) | 15:31 | ||
JimmyZ | +1 | 15:33 | |
brrt | could be an edge case in the microcode interpreter? | 15:35 | |
timotimo | wait | 15:36 | |
we're changing our interpreter to become a machine code compiler | |||
just so that that machine code gets interpreted again? | |||
:P | |||
JimmyZ | by different interpreters :P | 15:37 | |
brrt | yes | 15:38 | |
turtles all the way down, until you get to electrons :-) | 15:39 | ||
JimmyZ | hmm, looks like we have inc_u/dec_u but no sub_u and add_u? | 15:58 | |
brrt | does that differ much? | 15:59 | |
JimmyZ | I don't think we use inc_u/dec_u in nqp and rakudo | 16:01 | |
jnthn: ^^ | |||
timotimo | brrt: would you like to explain how to build invokewithcapture in the jit? or do it yourself? | 16:02 | |
brrt | oh, yes, i can do both, but it has its complexities | ||
timotimo | ok, then you implement it :P | 16:03 | |
brrt | wow man :-) | 16:06 | |
timotimo | :P | ||
brrt | basically, see MVM_jit_emit_invoke in src/jit/emit_x64.dasc? | 16:07 | |
jnthn | JimmyZ: *yet* | ||
JimmyZ | and will? | ||
brrt | well, that most of that all is nunecissary | ||
unnecissary | 16:08 | ||
unnecessary | |||
*brains* | |||
jnthn | JimmyZ: Yeah, expect so | ||
JimmyZ | ok :) | ||
00:10 here , good night all | 16:09 | ||
jnthn | 'night, JimmyZ | ||
brrt | sleep well JimmyZ | 16:11 | |
in fact, invokewithcapture looks sufficiently unlike other invokes that it might be best not to share code generation between them at all | 16:12 | ||
and rather treat it like a primitive | |||
17:00
pmichaud joined
|
|||
dalek | MoarVM: da0ec71 | (Timo Paulssen)++ | src/profiler/ (2 files): with less naive log_allocated, annotate many more ops |
17:03 | |
timotimo | jnthn: would be glad if you could review my choices of ops to annotate | ||
jnthn | Looks sane | 17:10 | |
timotimo | jnthn: how much worth do you see in splicing BBs together if there's a bunch of BBs in a row that only have no extra successors/predecessors among them? | 17:43 | |
i'm suspecting not having to look too closely at BB boundaries and PHIs would benefit some of the more "simple minded" optimizations | 17:44 | ||
.o( and i'm still wondering how best to "normalize" register versions, so that a write to r5(2) followed by a read from r5(7) could be identified as "working with the same register" ... ) | 17:46 | ||
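The basic-block splicing timotimo asks about could look roughly like the C sketch below. The `Block` struct and `splice_chain` are hypothetical stand-ins, not MoarVM's spesh graph types; the sketch only tracks instruction counts and edge counts, and it ignores PHI nodes on the assumption that a block with a single predecessor has only trivial ones.

```c
#include <stddef.h>

#define MAX_SUCC 2

/* Hypothetical, heavily simplified basic block. */
typedef struct Block Block;
struct Block {
    int    num_ins;          /* stand-in for the instruction list */
    int    num_pred;         /* number of incoming edges */
    int    num_succ;         /* number of outgoing edges */
    Block *succ[MAX_SUCC];   /* outgoing edges */
    int    merged;           /* set once folded into its predecessor */
};

/* Fold successors into b for as long as b has exactly one successor
 * and that successor has exactly one predecessor (namely b).
 * Returns the number of blocks that were spliced in. */
static int splice_chain(Block *b) {
    int spliced = 0;
    while (b->num_succ == 1) {
        Block *s = b->succ[0];
        if (s == b || s->num_pred != 1)
            break;                      /* self loop or join point: stop */
        b->num_ins += s->num_ins;       /* append s's instructions to b */
        b->num_succ = s->num_succ;      /* b inherits s's outgoing edges */
        for (int i = 0; i < s->num_succ; i++)
            b->succ[i] = s->succ[i];
        s->merged   = 1;
        s->num_ins  = 0;
        s->num_succ = 0;
        spliced++;
    }
    return spliced;
}
```

For a chain a, b, c where c branches to two further blocks, calling `splice_chain` on a folds b and c into it and leaves a with c's two successors; a block that is a join point (more than one predecessor) is never absorbed.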
hoelzro | an MVMObject's contents (ex. the slots of a P6opaque) are stored after the STable, yes? | 17:53 | |
ah, I see: MVMP6opaqueBody | 17:57 | ||
timotimo | the STable is always in the header of any object | 17:59 | |
18:11
kjs_ joined
18:30
kjs_ joined
18:38
FROGGS joined
|
|||
dalek | MoarVM: 71a7dd0 | TimToady++ | src/6model/reprs/NFA.c: guard special cases with a single test |
19:25 | |
MoarVM: 89a13e2 | TimToady++ | src/6model/reprs/NFA. (2 files): calculate longlit offset at last litchar JIT invokewithcapture This is JITted as a primitive since important parts of its logic are not shared with other invoke ops. |
|||
19:26
dalek joined
19:52
vendethiel joined
|
|||
FROGGS .oO( The Last Litchar - Summer 2015 In Your Cinema ) | 20:05 | ||
timotimo | gamerhorizon.com/wp-content/uploads...ack_EN.jpg The Litchar 3 - Wild Parse | 20:12 | |
FROGGS | :D | 20:21 | |
21:05
kjs_ joined
21:16
brrt joined
21:28
kjs_ joined
22:40
kjs_ joined
23:06
tadzik joined,
pmichaud joined,
Util joined,
dalek joined
23:08
synopsebot joined
23:11
[Coke] joined
23:46
timo joined
|