| dalek | MoarVM: cfd7fd6 | jimmy++ | src/jit/emit_x64.dasc: jit cmp_i |
03:10 | |
|
03:11
jimmy_ joined
|
|||
| JimmyZ | ^^ I don't know how to test it. nqp::cmp_i doesn't help me :( | 03:12 | |
| dalek | MoarVM: e5379eb | jimmy++ | src/jit/graph.c: forgot to add cmp_i to jgb_consume_ins |
03:36 | |
|
07:21
FROGGS joined
|
|||
| dalek | MoarVM: 4ee566d | jimmy++ | src/jit/graph.c: jit isnanorinf |
07:51 | |
| MoarVM: 20b7d1c | jimmy++ | src/jit/ (2 files): jit cmp_n |
08:34 | ||
| jnthn | If you don't know how to test stuff, putting it in master is probably not the best idea, is it? | 08:35 | |
| JimmyZ | you're right, but it seems I didn't break roast; I suspect nothing gets jitted at all | 09:37 | |
| JimmyZ waits for timotimo++ :) | |||
|
10:07
zakharyas joined
|
|||
| dalek | MoarVM: dbc4776 | jimmy++ | src/jit/emit_x64.dasc: small fix |
10:29 | |
|
13:02
brrt joined
|
|||
| brrt | \o | 13:02 | |
| i had two ideas while cycling today :-) | 13:04 | ||
| one is to try and debug our seemingly mac os x-only issues on opendarwin | |||
| however, opendarwin has died, and puredarwin (its successor) isn't really very active either | |||
| so i'll have to see about that | 13:05 | ||
| the other was about WHICH | |||
| iirc, jnthn fixed WHICH instability by pre-allocating a gen2 spot for gen1 objects if they ever call WHICH | |||
| but there's another, simpler and (i think) cheaper way to fix it | 13:06 | ||
| which is to allocate a number to each nursery round, and use the combination of (nursery_round << 32 | nursery_offset) as the basis for WHICH | 13:07 | ||
| hmm... only one problem there, that's not going to fly well for stuff that is allocated directly into gen2 | |||
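The idea brrt describes here amounts to packing a per-collection round counter and the object's nursery offset into one 64-bit value. A minimal C sketch, assuming both values fit in 32 bits; the names `which_key`, `which_round`, and `which_offset` are hypothetical and are not MoarVM API:

```c
#include <stdint.h>

/* Hypothetical helper (not MoarVM API): combine a nursery round counter
 * and a byte offset within the nursery into one 64-bit identity key.
 * The round is widened to 64 bits before shifting, since shifting a
 * 32-bit value by 32 is undefined behaviour in C. */
static uint64_t which_key(uint32_t nursery_round, uint32_t nursery_offset) {
    return ((uint64_t)nursery_round << 32) | (uint64_t)nursery_offset;
}

/* Recover the round counter from the upper 32 bits. */
static uint32_t which_round(uint64_t key) {
    return (uint32_t)(key >> 32);
}

/* Recover the nursery offset from the lower 32 bits. */
static uint32_t which_offset(uint64_t key) {
    return (uint32_t)(key & 0xFFFFFFFFu);
}
```

Two objects at the same offset in different rounds get different keys. As brrt points out, this sketch has nothing to say about objects allocated directly into gen2, which never have a nursery offset.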
| lizmat | fwiw, jnthn is commuting atm | 13:41 | |
| brrt | i see | 13:42 | |
| any progress on the os-x based issues yet? | |||
| lizmat | no, not yet | 14:04 | |
| jnthn | brrt: The trouble is, objects that make it into gen2 would then need an entry in some table saying where they used to be in the nursery | 14:08 | |
| brrt | that's easy, just prefix it to the object header | 14:10 | |
| i.e. instead of copying it directly, first copy its 'WHICH' number, and then the object header | |||
| eh object structure | |||
| not just the header | |||
| :-) | 14:11 | ||
| \o jnthn by the way :-) | |||
|
14:11
Util joined
|
|||
| timotimo | thank you, jimmyz for implementing these ops in the jit :) | 14:12 | |
| jnthn | The whole point of this design was to not have to make objects bigger. | ||
| timotimo | oh no, the object doesn't get bigger! it just takes up more space than you'd think! :P | 14:13 | |
| also, code that scans the gen2, like the collector or my gdb plugin, would need to intuit if the first 64 bits are a number or part of the object header ... | |||
| jnthn | Right. And you'd have to conditionally add it on to get the size bin to allocate in... | 14:14 | |
| And then what if nobody does .WHICH until the thing is in gen2? | |||
| So many ways for things to go wrong. | |||
| timotimo | jimmyz, in my experience a nqp build and a rakudo build will exercise things like _i ops well enough | 14:16 | |
| and a spectest run will exercise _n ops well, usually | 14:17 | ||
| i think the nqp test suite will also exercise _n stuff a fair bit | |||
| can always gather jit logs to see if the ops ended up in compiled output | |||
| brrt | right timotimo :-) | 14:18 | |
| jnthn | Another good way is to write a loop that uses the op you want to exercise LOADS of times :) | ||
| brrt | true jnthn | ||
| jnthn | That'll get it jitted :) | ||
| brrt is reviewing the jit additions | |||
| i thought we had cmp_i already? | |||
| timotimo | hehe | ||
| we had cmp_I | 14:19 | ||
| brrt | ah i see :-) | ||
|
14:20
JimmyZ joined
|
|||
| brrt | hmmm | 14:20 | |
| JimmyZ | timotimo: I don't think any test triggered the cmp_i :( | ||
| I mean the jitted cmp_i | |||
| timotimo | hm, OK | 14:21 | |
| i'm going to have a look | |||
| JimmyZ | and cmp_n | ||
| brrt | hmmm | ||
| JimmyZ | and maybe isnanorinf | ||
| brrt | i don't think jitted cmp_i is right as it is written | ||
| please confirm this for me: cmp_i does: | 14:22 | ||
| setg al; movzx rax, al; setl al; movzx r8, al; sub rax, tmp3; | 14:23 | ||
| al is part of rax | |||
| tmp3 = r8 | |||
| JimmyZ | so I can use rcx ? | 14:24 | |
| brrt | hmmm | 14:26 | |
| yes, for example | |||
| better yet | |||
| r8b | |||
| known as 'TMP3b' iirc in renamed-registers | |||
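The aliasing problem can be modelled in plain C. The sketch below simulates the sequence as quoted in the chat (not the committed code itself), treating a write to `al` as replacing the low byte of `rax`; the function names are hypothetical. With both `set` instructions targeting `al`, the second one clobbers the first result and the subtraction always yields 0; writing the second flag to a separate byte register such as `r8b` gives the expected -1, 0, or 1.

```c
#include <stdint.h>

/* Model a write to the low byte of a 64-bit register (e.g. al within rax). */
static uint64_t set_low_byte(uint64_t reg, uint8_t value) {
    return (reg & ~(uint64_t)0xFF) | value;
}

/* Sequence as quoted: both set instructions write al, which aliases rax. */
static int64_t cmp_i_aliased(int64_t a, int64_t b) {
    uint64_t rax = 0, r8 = 0;
    rax = set_low_byte(rax, a > b);  /* setg al */
    rax = rax & 0xFF;                /* movzx rax, al */
    rax = set_low_byte(rax, a < b);  /* setl al: overwrites the setg result */
    r8  = rax & 0xFF;                /* movzx r8, al */
    return (int64_t)(rax - r8);      /* sub rax, r8: always 0 */
}

/* Same computation with the second flag kept in its own byte register. */
static int64_t cmp_i_separate(int64_t a, int64_t b) {
    uint64_t rax = 0, r8 = 0;
    rax = set_low_byte(rax, a > b);  /* setg al */
    rax = rax & 0xFF;                /* movzx rax, al */
    r8  = set_low_byte(r8, a < b);   /* setl r8b */
    r8  = r8 & 0xFF;                 /* movzx r8, r8b */
    return (int64_t)(rax - r8);      /* sub rax, r8 */
}
```

Calling `cmp_i_aliased(5, 3)` and `cmp_i_aliased(3, 5)` both give 0 in this model, while `cmp_i_separate` gives 1 and -1 respectively.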
| JimmyZ | brrt: BTW I did `objdump interp.o` and saw something like seta %sil, does dynasm support sil ? | 14:27 | |
| brrt | sil is rsi lower byte | ||
| i think so, yes | |||
| corsix.github.io/dynasm-doc/instructions.html | |||
| JimmyZ | sometimes I want to just copy the asm code from objdump interp.o to the jit code | ||
| brrt | please be careful with that | 14:28 | |
|
14:28
kjs_ joined
|
|||
| brrt | :-) | 14:28 | |
| JimmyZ | and with a bit modification | ||
| brrt | but it's a good basis to start from, i think | ||
| [Coke] | moar sees a slight improvement on OS X today. | 14:30 | |
| JIT failing now 75 tests, non-jit only 8 | |||
| JimmyZ | ok, feel free to patch it or I will patch it tomorrow :) | ||
| timotimo | in what way is cmp_i currently wrong? | 14:31 | |
| [Coke] | (ah, the jit #'s are fluctuating. was at 14 jit failures saturday…) | ||
| jnthn | [Coke]: Sadly, I think that's just the heisenbug being less heisen... | ||
| Or less bug | |||
| :) | |||
| brrt | :-) i'll fix it | ||
| jnthn | I got access to OSX and reproduced the bug, anyways. | ||
| JimmyZ | Oh, don't say my wrong code makes OSX happy | ||
| brrt | JimmyZ++ for working on JIT :-) | ||
| not all that wrong, just not right yet | |||
| JimmyZ | and how about cmp_n? | 14:32 | |
| JimmyZ hasn't writting asm code for 10 years... | 14:33 | ||
| *written | |||
| [Coke] | jnthn: true, could just be noise. Please let me know if there's more I can do to help track this down. (aside from, you know, tracking it down and fixing it. :) | 14:35 | |
| jnthn | [Coke]: Well, you could teach my TDD in Java class tomorrow if you like... :P | ||
| [Coke] | That's fine, just need someone to cover my coldfusion dayjob here. | 14:39 | |
| jnthn | So dependencies... :) | ||
| [Coke] | but there's no coffee until they fix the water, so that's probably the hardest ask. :) | ||
| jnthn | No coffee at work? That's probably illegal in Sweden... :P | 14:40 | |
| brrt | i haven't looked at cmp_n and tbh i don't know the whole sse stuff that well | 14:49 | |
| timotimo | using sse means our code is extra fast!! | ||
| that's all i know. | |||
| brrt | ... :-) | ||
| why not run it on CUDA :-P | |||
| timotimo | can we outsource it into amazon's elastic cloud? | 14:50 | |
| JimmyZ | Didn't you write eq_n etc.? :P | 14:52 | |
| brrt | eq_n is much simpler iirc | ||
| perhaps not much simpler | 14:53 | ||
| but simple in comparison | |||
| JimmyZ | I compared gt_n in emit_x64.dasc and cmp_n in interp.o, and think cmp_n is simple too | 14:54 | |
| jnthn | .oO( "in comparison" :D ) |
14:57 | |
| brrt | :-) | ||
| i wonder how to emit a cmp_i instruction | |||
| JimmyZ | ;) | 14:59 | |
| [Coke] | jnthn: I am making do with diet soda with extra caffeine. | 15:03 | |
| timotimo | over the recent months i've seen more and more articles about diet soda being worse than non-diet soda | 15:05 | |
| brrt | hang on, /me has fix. | ||
| timotimo | allegedly the artificial sweetness signals "sugar's coming!" to the body, but no sugar is in fact coming in | ||
| OSLT | |||
| brrt finds that nonsense | |||
| your brain is not that stupid | |||
| as in, your brain actually senses the actual availability of glucose | 15:06 | ||
| timotimo | ah? interesting :) | ||
| brrt | and controls its production via the autonomic neural pathways in the liver | ||
| yes. long story. i did my bachelors thesis about it | |||
| timotimo | that's excellent! | 15:07 | |
| brrt | :-) | 15:08 | |
| jnthn | .oO( Beer is way simpler, doesn't come in diet variety at all... ) |
||
| timotimo | it comes in a "root" variety, though | ||
| brrt | gist.github.com/bdw/aa900db7e31128cd98a3 | 15:09 | |
| timotimo | seems like a strange idea to use cmp_i and then compare > 0 %) | ||
| brrt | see if i care :-P | 15:10 | |
| timotimo | :) | ||
| brrt | but yeah, you could do something like: my int $j := nqp::cmp_i($x, 50); - that would give you a somewhat easier readout | 15:11 | |
|
15:11
kjs_ joined
|
|||
| brrt | anyway, which of the patches do you like better | 15:11 | |
| subtract bytes and sign extend | |||
| or | |||
| zero extend bytes and subtract | |||
| the former is one fewer operation :-) | |||
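The two orderings being weighed here can be written out in C to check that they agree. This is a sketch of the arithmetic only, with hypothetical function names; it says nothing about which ordering is faster on real hardware.

```c
#include <stdint.h>

/* Variant 1: subtract the two flag bytes, then sign extend the result.
 * Roughly: setg, setl, 8-bit sub, one sign extension. */
static int64_t cmp_i_sub_then_sext(int64_t a, int64_t b) {
    uint8_t gt = (uint8_t)(a > b);
    uint8_t lt = (uint8_t)(a < b);
    int8_t diff = (int8_t)(gt - lt);  /* -1, 0 or 1 fits in a signed byte */
    return (int64_t)diff;             /* sign extend to 64 bits */
}

/* Variant 2: zero extend both flag bytes, then do a 64-bit subtract.
 * Roughly: setg, zero extension, setl, zero extension, 64-bit sub. */
static int64_t cmp_i_zext_then_sub(int64_t a, int64_t b) {
    uint64_t gt = (uint8_t)(a > b);   /* zero extend */
    uint64_t lt = (uint8_t)(a < b);   /* zero extend */
    return (int64_t)(gt - lt);        /* wraps to all ones when a < b */
}
```

Both functions return -1, 0, or 1 for the same inputs. The first needs a single extension after the subtract, which matches brrt's count of one fewer operation.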
| anyway, i can push the patch :-) | 15:13 | ||
| (the earlier, that is) | |||
| dalek | MoarVM: 23c16b4 | (Bart Wiegmans)++ | src/jit/emit_x64.dasc: Fix cmp_i rax and al share registers. I think it is better to explicitly use the 'renamed registers' so that it is clear when they conflict |
||
|
15:19
brrt left,
brrt joined
|
|||
| timotimo | brrt: which operations are cheaper on processors? :P | 15:19 | |
| brrt | i have no idea | 15:20 | |
| my suspicion is that sign extending is more expensive than zero extending, and that there is no difference between 64 bit and 8 bit subtract | 15:21 | ||
| so that the first version might pay off | |||
| JimmyZ | timotimo: Oh I know sub/add are cheaper than inc/dec, so it's good to change them | 15:23 | |
| timotimo | wait what? | 15:24 | |
| JimmyZ | brrt: Will you commit the gist one? which is cheaper? | ||
| :P | |||
| brrt | no idea | 15:25 | |
| the thing is clear enough though | |||
| JimmyZ | timotimo: github.com/openresty/sregex/commit...ca5e3b099a | 15:27 | |
| timotimo | whaaaaaat | ||
| brrt | thats lame | ||
| but that may be workload dependent | 15:28 | ||
| jnthn | wtf | ||
| Mebbe inc is spec'd in some interesting way that makes it harder to do fast. | 15:29 | ||
| timotimo | yeah, it must have some semantic difference | ||
| like overflowing vs not overflowing | |||
| setting exception flags or not | |||
| no clue | |||
| JimmyZ | I'm sure he won't only test it one time ;) | 15:30 | |
| timotimo | yeah, it's very surprising | ||
| if there's no significant difference between inc and add $foo, 1, inc would be implemented the exact same way as add $foo, 1 :) | 15:31 | ||
| JimmyZ | +1 | 15:33 | |
| brrt | could be an edge case in the microcode interpreter? | 15:35 | |
| timotimo | wait | 15:36 | |
| we're changing our interpreter to become a machine code compiler | |||
| just so that that machine code gets interpreted again? | |||
| :P | |||
| JimmyZ | by different interpreters :P | 15:37 | |
| brrt | yes | 15:38 | |
| turtles all the way down, until you get to electrons :-) | 15:39 | ||
| JimmyZ | hmm, looks like we have inc_u/dec_u but no sub_u and add_u? | 15:58 | |
| brrt | does that differ much? | 15:59 | |
| JimmyZ | I don't think we use inc_u/dec_u in nqp and rakudo | 16:01 | |
| jnthn: ^^ | |||
| timotimo | brrt: would you like to explain how to build invokewithcapture in the jit? or do it yourself? | 16:02 | |
| brrt | oh, yes, i can do both, but it has its complexities | ||
| timotimo | ok, then you implement it :P | 16:03 | |
| brrt | wow man :-) | 16:06 | |
| timotimo | :P | ||
| brrt | basically, see MVM_jit_emit_invoke in src/jit/emit_x64.dasc? | 16:07 | |
| jnthn | JimmyZ: *yet* | ||
| JimmyZ | and will? | ||
| brrt | well, that most of that all is nunecissary | ||
| unnecissary | 16:08 | ||
| unnecessary | |||
| *brains* | |||
| jnthn | JimmyZ: Yeah, expect so | ||
| JimmyZ | ok :) | ||
| 00:10 here , good night all | 16:09 | ||
| jnthn | 'night, JimmyZ | ||
| brrt | sleep well JimmyZ | 16:11 | |
| in fact, invokewithcapture looks sufficiently unlike other invokes that it might be best not to share code generation between them at all | 16:12 | ||
| and rather treat it like a primitive | |||
|
17:00
pmichaud joined
|
|||
| dalek | MoarVM: da0ec71 | (Timo Paulssen)++ | src/profiler/ (2 files): with less naive log_allocated, annotate many more ops |
17:03 | |
| timotimo | jnthn: would be glad if you could review my choices of ops to annotate | ||
| jnthn | Looks sane | 17:10 | |
| timotimo | jnthn: how much worth do you see in splicing BBs together if there's a bunch of BBs in a row that only have no extra successors/predecessors among them? | 17:43 | |
| i'm suspecting not having to look too closely at BB boundaries and PHIs would benefit some of the more "simple minded" optimizations | 17:44 | ||
| .o( and i'm still wondering how best to "normalize" register versions, so that a write to r5(2) followed by a read from r5(7) could be identified as "working with the same register" ... ) | 17:46 | ||
| hoelzro | an MVMObject's contents (ex. the slots of a P6opaque) are stored after the STable, yes? | 17:53 | |
| ah, I see: MVMP6opaqueBody | 17:57 | ||
| timotimo | the STable is always in the header of any object | 17:59 | |
|
18:11
kjs_ joined
18:30
kjs_ joined
18:38
FROGGS joined
|
|||
| dalek | MoarVM: 71a7dd0 | TimToady++ | src/6model/reprs/NFA.c: guard special cases with a single test |
19:25 | |
| MoarVM: 89a13e2 | TimToady++ | src/6model/reprs/NFA. (2 files): calculate longlit offset at last litchar JIT invokewithcapture This is JITted as a primitive since important parts of its logic are not shared with other invoke ops. |
|||
|
19:26
dalek joined
19:52
vendethiel joined
|
|||
| FROGGS .oO( The Last Litchar - Summer 2015 In Your Cinema ) | 20:05 | ||
| timotimo | gamerhorizon.com/wp-content/uploads...ack_EN.jpg The Litchar 3 - Wild Parse | 20:12 | |
| FROGGS | :D | 20:21 | |
|
21:05
kjs_ joined
21:16
brrt joined
21:28
kjs_ joined
22:40
kjs_ joined
23:06
tadzik joined,
pmichaud joined,
Util joined,
dalek joined
23:08
synopsebot joined
23:11
[Coke] joined
23:46
timo joined
|
|||