dalek MoarVM: cfd7fd6 | jimmy++ | src/jit/emit_x64.dasc:
jit cmp_i
03:10
03:11 jimmy_ joined
JimmyZ ^^ I don't know how to test it. nqp::cmp_i doesn't help me :( 03:12
dalek MoarVM: e5379eb | jimmy++ | src/jit/graph.c:
forgot to add cmp_i to jgb_consume_ins
03:36
07:21 FROGGS joined
dalek MoarVM: 4ee566d | jimmy++ | src/jit/graph.c:
jit isnanorinf
07:51
MoarVM: 20b7d1c | jimmy++ | src/jit/ (2 files):
jit cmp_n
08:34
jnthn If you don't know how to test stuff, putting it in master is probably not the best idea, is it? 08:35
JimmyZ you're right, but it seems I didn't break roast; I suspect nothing gets jitted at all 09:37
JimmyZ waits for timotimo++ :)
10:07 zakharyas joined
dalek MoarVM: dbc4776 | jimmy++ | src/jit/emit_x64.dasc:
small fix
10:29
13:02 brrt joined
brrt \o 13:02
i had two ideas while cycling today :-) 13:04
one is to try and debug our seemingly mac os x-only issues on opendarwin
however, opendarwin has died, and puredarwin (its successor) isn't really very active either
so i'll have to see about that 13:05
the other was about WHICH
iirc, jnthn fixed WHICH instability by pre-allocating a gen2 spot for gen1 objects if they ever call WHICH
but there's another, simpler and (i think) cheaper way to fix it 13:06
which is to allocate a number to each nursery round, and use the combination of (nursery_round << 32 | nursery_offset) as the basis for WHICH 13:07
hmm... only one problem there, that's not going to fly well for stuff that is allocated directly into gen2
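A minimal sketch of the packing brrt describes, assuming a 32-bit round counter and a 32-bit byte offset into the nursery. The function names are hypothetical and are not MoarVM API; the sketch also does nothing about the gen2 problem he raises in the next line.

```c
#include <stdint.h>

/* Hypothetical helper, not MoarVM API: pack a nursery round counter and a
 * byte offset into that nursery into one 64-bit identity value. */
uint64_t which_from_nursery(uint32_t nursery_round, uint32_t nursery_offset) {
    /* Widen before shifting so the round number is not shifted out of a
     * 32-bit value. */
    return ((uint64_t)nursery_round << 32) | (uint64_t)nursery_offset;
}

/* Recover the two halves again, e.g. for debugging output. */
uint32_t which_round(uint64_t which) {
    return (uint32_t)(which >> 32);
}

uint32_t which_offset(uint64_t which) {
    return (uint32_t)(which & 0xFFFFFFFFu);
}
```

Two objects allocated at the same offset in different nursery rounds get different values, which is the property the idea relies on.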
lizmat fwiw, jnthn is commuting atm 13:41
brrt i see 13:42
any progress on the os-x based issues yet?
lizmat no, not yet 14:04
jnthn brrt: The trouble is, objects that make it into gen2 would then need an entry in some table saying where they used to be in the nursery 14:08
brrt that's easy, just prefix it to the object header 14:10
i.e. instead of copying it directly, first copy its 'WHICH' number, and then the object header
eh object structure
not just the header
:-) 14:11
\o jnthn by the way :-)
14:11 Util joined
timotimo thank you, jimmyz for implementing these ops in the jit :) 14:12
jnthn The whole point of this design was to not have to make objects bigger.
timotimo oh no, the object doesn't get bigger! it just takes up more space than you'd think! :P 14:13
also, code that scans the gen2, like the collector or my gdb plugin, would need to intuit if the first 64 bits are a number or part of the object header ...
jnthn Right. And you'd have to conditionally add it on to get the size bin to allocate in... 14:14
And then what if nobody does .WHICH until the thing is in gen2?
So many ways for things to go wrong.
timotimo jimmyz, in my experience a nqp build and a rakudo build will exercise things like _i ops well enough 14:16
and a spectest run will exercise _n ops well, usually 14:17
i think the nqp test suite will also exercise _n stuff a fair bit
can always gather jit logs to see if the ops ended up in compiled output
brrt right timotimo :-) 14:18
jnthn Another good way is to write a loop that uses the op you want to exercise LOADS of times :)
brrt true jnthn
jnthn That'll get it jitted :)
brrt is reviewing the jit additions
i thought we had cmp_i already?
timotimo hehe
we had cmp_I 14:19
brrt ah i see :-)
14:20 JimmyZ joined
brrt hmmm 14:20
JimmyZ timotimo: I don't think any test triggered the cmp_i :(
I mean the jitted cmp_i
timotimo hm, OK 14:21
i'm going to have a look
JimmyZ and cmp_n
brrt hmmm
JimmyZ and maybe isnanorinf
brrt i don't think jitted cmp_i is right as it is written
please confirm this for me: cmp_i does: 14:22
setg al; movzx rax, al; setl al; movzx r8, al; sub rax, tmp3; 14:23
al is part of rax
tmp3 = r8
JimmyZ so I can use rcx ? 14:24
brrt hmmm 14:26
yes, for example
better yet
r8b
known as 'TMP3b' iirc in renamed-registers
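A rough C model of why the quoted sequence goes wrong, assuming the instructions are exactly as brrt lists them. Registers are modelled as plain 64-bit integers and `set_low_byte` is a hypothetical stand-in for a `setcc` into an 8-bit register; none of this is the actual DynASM code.

```c
#include <stdint.h>

/* Model of "setcc <8-bit reg>": only the low byte of the 64-bit register
 * changes, the upper 56 bits are left alone. */
static uint64_t set_low_byte(uint64_t reg, int flag) {
    return (reg & ~(uint64_t)0xFF) | (uint64_t)(flag ? 1 : 0);
}

/* The sequence as quoted: both setcc instructions write al, and al is the
 * low byte of rax, so the second one overwrites the first result. */
int64_t cmp_i_as_quoted(int64_t a, int64_t b) {
    uint64_t rax = 0, r8 = 0;
    rax = set_low_byte(rax, a > b);   /* setg al        */
    rax = rax & 0xFF;                 /* movzx rax, al  */
    rax = set_low_byte(rax, a < b);   /* setl al        */
    r8  = rax & 0xFF;                 /* movzx r8, al   */
    return (int64_t)(rax - r8);       /* sub rax, r8    */
}

/* Same idea, but the second flag goes into r8b, so rax is not disturbed. */
int64_t cmp_i_separate_regs(int64_t a, int64_t b) {
    uint64_t rax = 0, r8 = 0;
    rax = set_low_byte(rax, a > b);   /* setg al        */
    rax = rax & 0xFF;                 /* movzx rax, al  */
    r8  = set_low_byte(r8, a < b);    /* setl r8b       */
    r8  = r8 & 0xFF;                  /* movzx r8, r8b  */
    return (int64_t)(rax - r8);       /* sub rax, r8    */
}
```

In this model the quoted version subtracts a value from itself and always yields 0, while the version using r8b yields 1, 0 or -1.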
JimmyZ brrt: BTW I did `objdump interp.o` and saw something like seta %sil, does dynasm support sil ? 14:27
brrt sil is rsi lower byte
i think so, yes
corsix.github.io/dynasm-doc/instructions.html
JimmyZ sometimes I want to just copy the asm code from objdump interp.o to the jit code
brrt please be careful with that 14:28
14:28 kjs_ joined
brrt :-) 14:28
JimmyZ and with a bit modification
brrt but it's a good basis to start from, i think
[Coke] moar sees a slight improvement on OS X today. 14:30
JIT failing now 75 tests, non-jit only 8
JimmyZ ok, feel free to patch it or I will patch it tomorrow :)
timotimo in what way is cmp_i currently wrong? 14:31
[Coke] (ah, the jit #'s are fluctuating. was at 14 jit failures saturday…)
jnthn [Coke]: Sadly, I think that's just the heisenbug being less heisen...
Or less bug
:)
brrt :-) i'll fix it
jnthn I got access to OSX and reproduced the bug, anyways.
JimmyZ Oh, don't say my wrong code makes OSX happy
brrt JimmyZ++ for working on JIT :-)
not all that wrong, just not right yet
JimmyZ and how about cmp_n? 14:32
JimmyZ hasn't writting asm code for 10 years... 14:33
*written
[Coke] jnthn: true, could just be noise. Please let me know if there's more I can do to help track this down. (aside from, you know, tracking it down and fixing it. :) 14:35
jnthn [Coke]: Well, you could teach my TDD in Java class tomorrow if you like... :P
[Coke] That's fine, just need someone to cover my coldfusion dayjob here. 14:39
jnthn So dependencies... :)
[Coke] but there's no coffee until they fix the water, so that's probably the hardest ask. :)
jnthn No coffee at work? That's probably illegal in Sweden... :P 14:40
brrt i haven't looked at cmp_n and tbh i don't know the whole sse stuff that well 14:49
timotimo using sse means our code is extra fast!!
that's all i know.
brrt ... :-)
why not run it on CUDA :-P
timotimo can we outsource it into amazon's elastic cloud? 14:50
JimmyZ Didn't you write eq_n etc.? :P 14:52
brrt eq_n is much simpler iirc
perhaps not much simpler 14:53
but simple in comparison
JimmyZ I compared gt_n in emit_x64.dasc and cmp_n in interp.o, and think cmp_n is simple too 14:54
jnthn .oO( "in comparison" :D ) 14:57
brrt :-)
i wonder how to emit a cmp_i instruction
JimmyZ ;) 14:59
[Coke] jnthn: I am making do with diet soda with extra caffeine. 15:03
timotimo over the recent months i've seen more and more articles about diet soda being worse than non-diet soda 15:05
brrt hang on, /me has fix.
timotimo allegedly the artificial sweetness signals "sugar's coming!" to the body, but no sugar is in fact coming in
OSLT
brrt finds that nonsense
your brain is not that stupid
as in, your brain actually senses the actual availability of glucose 15:06
timotimo ah? interesting :)
brrt and controls its production via the autonomic neural pathways in the liver
yes. long story. i did my bachelors thesis about it
timotimo that's excellent! 15:07
brrt :-) 15:08
jnthn .oO( Beer is way simpler, doesn't come in diet variety at all... )
timotimo it comes in a "root" variety, though
brrt gist.github.com/bdw/aa900db7e31128cd98a3 15:09
timotimo seems like a strange idea to use cmp_i and then compare > 0 %)
brrt see if i care :-P 15:10
timotimo :)
brrt but yeah, you could do something like: my int $j := nqp::cmp_i($x, 50); - that would give you a somewhat easier readout 15:11
15:11 kjs_ joined
brrt anyway, which of the patches do you like better 15:11
subtract bytes and sign extend
or
zero extend bytes and subtract
the former is one fewer operation :-)
anyway, i can push the patch :-) 15:13
(the earlier, that is)
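The two variants brrt is choosing between can be sketched in C as follows. This is only a model of the arithmetic, not the patch from the gist; the function names are hypothetical, and the cast to `int8_t` assumes the usual two's complement behaviour.

```c
#include <stdint.h>

/* Variant 1: subtract the two flag bytes, then sign-extend the 8-bit
 * result to 64 bits (roughly setg, setl, 8-bit sub, movsx). */
int64_t cmp_sub_then_sign_extend(int64_t a, int64_t b) {
    uint8_t gt = (uint8_t)(a > b);
    uint8_t lt = (uint8_t)(a < b);
    uint8_t diff = (uint8_t)(gt - lt);   /* 0x01, 0x00 or 0xFF */
    return (int64_t)(int8_t)diff;        /* 0xFF sign-extends to -1 */
}

/* Variant 2: zero-extend each flag to 64 bits, then subtract
 * (roughly setg, movzx, setl, movzx, 64-bit sub). */
int64_t cmp_zero_extend_then_sub(int64_t a, int64_t b) {
    uint64_t gt = (uint64_t)(a > b);
    uint64_t lt = (uint64_t)(a < b);
    return (int64_t)(gt - lt);
}
```

Both give the same -1, 0 or 1; the first needs one widening step instead of two, which matches the "one fewer operation" remark.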
dalek MoarVM: 23c16b4 | (Bart Wiegmans)++ | src/jit/emit_x64.dasc:
Fix cmp_i

rax and al share registers. I think it is better to explicitly use the 'renamed registers' so that it is clear when they conflict
15:19 brrt left, brrt joined
timotimo brrt: which operations are cheaper on processors? :P 15:19
brrt i have no idea 15:20
my suspicion is that sign extending is more expensive than zero extending, and that there is no difference between 64 bit and 8 bit subtract 15:21
so that the first version might pay off
JimmyZ timotimo: Oh I know sub/add are cheaper than inc/dec, so it's good to change them 15:23
timotimo wait what? 15:24
JimmyZ brrt: Will you commit the gist one? which is cheaper?
:P
brrt no idea 15:25
the thing is clear enough though
JimmyZ timotimo: github.com/openresty/sregex/commit...ca5e3b099a 15:27
timotimo whaaaaaat
brrt thats lame
but that may be workload dependent 15:28
jnthn wtf
Mebbe inc is spec'd in some interesting way that makes it harder to do fast. 15:29
timotimo yeah, it must have some semantic difference
like overflowing vs not overflowing
setting exception flags or not
no clue
JimmyZ I'm sure he won't only test it one time ;) 15:30
timotimo yeah, it's very surprising
if there's no significant difference between inc and add $foo, 1, inc would be implemented the exact same way as add $foo, 1 :) 15:31
JimmyZ +1 15:33
brrt could be an edge case in the microcode interpreter? 15:35
timotimo wait 15:36
we're changing our interpreter to become a machine code compiler
just so that that machine code gets interpreted again?
:P
JimmyZ by different interpreters :P 15:37
brrt yes 15:38
turtles all the way down, until you get to electrons :-) 15:39
JimmyZ hmm, looks like we have inc_u/dec_u but no sub_u and add_u? 15:58
brrt does that differ much? 15:59
JimmyZ I don't think we use inc_u/dec_u in nqp and rakudo 16:01
jnthn: ^^
timotimo brrt: would you like to explain how to build invokewithcapture in the jit? or do it yourself? 16:02
brrt oh, yes, i can do both, but it has its complexities
timotimo ok, then you implement it :P 16:03
brrt wow man :-) 16:06
timotimo :P
brrt basically, see MVM_jit_emit_invoke in src/jit/emit_x64.dasc? 16:07
jnthn JimmyZ: *yet*
JimmyZ and will?
brrt well, that most of that all is nunecissary
unnecissary 16:08
unnecessary
*brains*
jnthn JimmyZ: Yeah, expect so
JimmyZ ok :)
00:10 here, good night all 16:09
jnthn 'night, JimmyZ
brrt sleep well JimmyZ 16:11
in fact, invokewithcapture looks sufficiently unlike other invokes that it might be best not to share code generation between them at all 16:12
and rather treat it like a primitive
17:00 pmichaud joined
dalek MoarVM: da0ec71 | (Timo Paulssen)++ | src/profiler/ (2 files):
with less naive log_allocated, annotate many more ops
17:03
timotimo jnthn: would be glad if you could review my choices of ops to annotate
jnthn Looks sane 17:10
timotimo jnthn: how much worth do you see in splicing BBs together if there's a bunch of BBs in a row that only have no extra successors/predecessors among them? 17:43
i'm suspecting not having to look too closely at BB boundaries and PHIs would benefit some of the more "simple minded" optimizations 17:44
.o( and i'm still wondering how best to "normalize" register versions, so that a write to r5(2) followed by a read from r5(7) could be identified as "working with the same register" ... ) 17:46
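A small sketch of the basic-block splicing timotimo asks about, assuming a block can be merged into its predecessor when it is that predecessor's only successor and has no other predecessors. `DemoBB` and the functions below are hypothetical and much simpler than the real spesh graph; PHI handling and predecessor lists are left out.

```c
#include <stddef.h>

#define DEMO_MAX_SUCC 2

/* Hypothetical, heavily simplified basic block: just an instruction count,
 * a predecessor count and up to two successors. */
typedef struct DemoBB {
    int            num_ins;
    int            num_pred;
    int            num_succ;
    struct DemoBB *succ[DEMO_MAX_SUCC];
} DemoBB;

/* Fold every block of a straight-line chain starting at bb into bb.
 * Returns the number of blocks that were spliced in. */
int demo_splice_chain(DemoBB *bb) {
    int merged = 0;
    while (bb->num_succ == 1
            && bb->succ[0] != bb
            && bb->succ[0]->num_pred == 1) {
        DemoBB *next = bb->succ[0];
        int i;
        bb->num_ins += next->num_ins;        /* append next's instructions */
        bb->num_succ = next->num_succ;       /* take over next's successors */
        for (i = 0; i < next->num_succ; i++)
            bb->succ[i] = next->succ[i];
        next->num_ins  = 0;                  /* next is now empty and unlinked */
        next->num_succ = 0;
        merged++;
    }
    return merged;
}

/* Usage example: a -> b -> c, where c branches to d and e. */
int demo_chain_example(void) {
    DemoBB d = {1, 1, 0, {NULL, NULL}};
    DemoBB e = {1, 1, 0, {NULL, NULL}};
    DemoBB c = {1, 1, 2, {&d, &e}};
    DemoBB b = {3, 1, 1, {&c, NULL}};
    DemoBB a = {2, 0, 1, {&b, NULL}};
    demo_splice_chain(&a);
    return a.num_ins;                        /* instructions of a, b and c */
}
```

After the splice, a holds the instructions of all three blocks and branches directly to d and e, so a later pass sees one block where there were three.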
hoelzro an MVMObject's contents (ex. the slots of a P6opaque) are stored after the STable, yes? 17:53
ah, I see: MVMP6opaqueBody 17:57
timotimo the STable is always in the header of any object 17:59
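As an illustration of the layout being discussed, here is a simplified sketch with the STable pointer in a common object header and the representation-specific body placed after it. All `Demo*` names and field choices are hypothetical stand-ins, not the real MVMObject or MVMP6opaqueBody definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a shared type table. */
typedef struct DemoSTable {
    const char *debug_name;
} DemoSTable;

/* Hypothetical common header: every object starts with this, and the
 * STable pointer is part of it. */
typedef struct DemoObject {
    uint32_t    owner;
    uint16_t    flags;
    uint16_t    size;
    DemoSTable *st;
} DemoObject;

/* Hypothetical body of an opaque object; the slots would live here. */
typedef struct DemoOpaqueBody {
    void *replaced;
} DemoOpaqueBody;

/* Header first, body directly after it. */
typedef struct DemoOpaque {
    DemoObject     common;
    DemoOpaqueBody body;
} DemoOpaque;

/* Byte offset of the body within an opaque object. */
size_t demo_body_offset(void) {
    return offsetof(DemoOpaque, body);
}

/* Given a pointer to the common header, find the body behind it. */
void *demo_body_of(DemoObject *obj) {
    return (char *)obj + demo_body_offset();
}
```

Because the header is the first member, a pointer to any such object can be treated as a pointer to the header, and the body sits at a fixed offset behind it.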
18:11 kjs_ joined 18:30 kjs_ joined 18:38 FROGGS joined
dalek MoarVM: 71a7dd0 | TimToady++ | src/6model/reprs/NFA.c:
guard special cases with a single test
19:25
MoarVM: 89a13e2 | TimToady++ | src/6model/reprs/NFA. (2 files):
calculate longlit offset at last litchar

JIT invokewithcapture
This is JITted as a primitive since important parts of its logic are not shared with other invoke ops.
19:26 dalek joined 19:52 vendethiel joined
FROGGS .oO( The Last Litchar - Summer 2015 In Your Cinema ) 20:05
timotimo gamerhorizon.com/wp-content/uploads...ack_EN.jpg The Litchar 3 - Wild Parse 20:12
FROGGS :D 20:21
21:05 kjs_ joined 21:16 brrt joined 21:28 kjs_ joined 22:40 kjs_ joined 23:06 tadzik joined, pmichaud joined, Util joined, dalek joined 23:08 synopsebot joined 23:11 [Coke] joined 23:46 timo joined