timotimo huh 00:04
why is there an implementation of istype in the jit?
1) it turns into a call to MVM_6model_istype, but pretends the return value is void and
2) it can be invokish
dalek MoarVM/jit-moar-ops: 8c79da9 | (Timo Paulssen)++ | src/jit/graph.c:
istype is invokish and its return value ought to be INT, not VOID

so i just comment it out for the time being.
00:07
MoarVM/jit-moar-ops: d25255a | (Timo Paulssen)++ | src/jit/graph.c:
support for bindpos_i is trivial to add
MoarVM/jit-moar-ops: f5229ae | (Timo Paulssen)++ | src/ (3 files):
add isint/isnum/isstr/islist/ishash

in the interest of finding out what other ops will cause bails down the road. These ops ought to be implemented by generating a tiny bit of assembly code instead of a
15cb79e | (Timo Paulssen)++ | src/ (3 files):
add isint/isnum/isstr/islist/ishash

in the interest of finding out what other ops will cause bails down the road. These ops ought to be implemented by generating a tiny bit of assembly code instead of a C call, but i leave that for brrt++ to do.
timotimo wtf 00:11
out of nowhere, 128 bails for jumplist have appeared
with jit-moar-ops, i cannot build nqp any more. i get a strange error: 00:19
/usr/bin/perl tools/build/gen-cat.pl moar src/QRegex/P5Regex/Grammar.nqp src/QRegex/P5Regex/Actions.nqp src/QRegex/P5Regex/Compiler.nqp > gen/moar/stage2/NQPP5QRegex.nqp
./nqp-m --target=mbc --output=NQPP5QRegex.moarvm \
gen/moar/stage2/NQPP5QRegex.nqp
Unhandled exception: Missing or wrong version of dependency 'gen/moar/stage1/nqpmo.nqp'
at <unknown>:1 (./NQPHLL.moarvm::1070)
well, jumplist is getting more and more interesting :) 00:31
dalek MoarVM/jit-moar-ops: 7483c74 | (Timo Paulssen)++ | src/jit/graph.c:
implement iscclass
00:37
MoarVM/jit-moar-ops: 55a74f0 | (Timo Paulssen)++ | src/jit/graph.c:
implement nfarunalt
MoarVM/jit-moar-ops: f5cd5be | (Timo Paulssen)++ | src/jit/graph.c:
implement substr_s and index_s
timotimo sadly can't implement ne_s, as it'd need a negation in addition to the call to MVM_string_equal ... 00:39
i think that's pretty much all the hot ops i can implement properly 00:42
the rest is just ops that we run into fewer than 10 times 00:43
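A minimal sketch of why ne_s is not a plain 1:1 call: the equality result has to be inverted before it reaches the destination register. `str_equal` and `str_not_equal` are hypothetical stand-ins working on C strings; they are not MVM_string_equal or its real signature.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a VM string-equality helper: returns 1 when
 * the two strings are equal and 0 otherwise. */
int64_t str_equal(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/* An "ne" op needs one extra step on top of the call: negate the result. */
int64_t str_not_equal(const char *a, const char *b) {
    return !str_equal(a, b);
}
```

In a JIT that only knows how to emit "call this C function and store the result", the negation is the part that needs extra emitted code or a separate helper like the wrapper above.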
hoelzro I would say "cool", but I don't really understand what you're doing right now timotimo =) 00:45
dalek MoarVM/jit-moar-ops: cc9e819 | (Timo Paulssen)++ | src/jit/graph.c:
fix setelemspos
00:47
MoarVM/jit-moar-ops: bd0c3b9 | (Timo Paulssen)++ | src/jit/graph.c:
push_n and push_s are also easy.
timotimo well
the jit will be invoked for every frame that spesh generates and that is considered even hotter still 00:48
it will just optimistically run and as soon as it sees an op that it doesn't know, it'll bail
it also writes "BAIL" and the name of the op into the jitlog
hoelzro ah, I read about that on your blog
timotimo yup
hoelzro pretty cool stuff =)
timotimo these ops i'm implementing are ops that i can just map 1:1 to C function calls
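The behaviour described above (run optimistically, bail at the first unknown op, log it) can be sketched roughly as follows. The op table, the function names and the exact log line format are assumptions made for illustration; this is not the code in src/jit/graph.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical list of ops that map 1:1 onto a C function call. */
static const char *const SUPPORTED_OPS[] = {
    "bindpos_i", "push_s", "substr_s", "index_s"
};

/* Returns 1 when the op is in the table above, 0 otherwise. */
int op_is_supported(const char *op) {
    size_t count = sizeof SUPPORTED_OPS / sizeof SUPPORTED_OPS[0];
    for (size_t i = 0; i < count; i++)
        if (strcmp(SUPPORTED_OPS[i], op) == 0)
            return 1;
    return 0;
}

/* Walks a frame's ops in order. Returns NULL when every op is supported;
 * otherwise logs the first unsupported op (if a log is given) and returns it. */
const char *try_compile_frame(const char *const *ops, size_t n_ops, FILE *jitlog) {
    assert(ops != NULL || n_ops == 0);
    for (size_t i = 0; i < n_ops; i++) {
        if (!op_is_supported(ops[i])) {
            if (jitlog != NULL)
                fprintf(jitlog, "BAIL: op <%s>\n", ops[i]); /* assumed format */
            return ops[i];
        }
    }
    return NULL;
}
```

Only the first unsupported op of a frame is ever reported, which matters for how the bail counts are read later in the log.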
hoelzro but something about how the JIT doesn't generate faster code yet? 00:49
timotimo well, there's multiple parts to that problem
for one, it doesn't know how to deal with perl6's extops, like p6bind, p6bool, ...
so it wouldn't be used in any of the benchmarks we have
hoelzro ah ha
timotimo the other part is that the "mainline" of programs have variadic arguments, which makes spesh itself bail before even trying to do anything 00:50
hoelzro so it may benefit NQP, but not Rakudo (yet)?
timotimo so neither the perl6 benchmarks nor the nqp benchmarks will show any difference at all
except slower run-time, because it tries to jit things here and there
hoelzro ah ha
timotimo it can benefit perl6 programs, but only for frames where we generate super tight code and also don't hit any extops
hoelzro I see
timotimo i have no idea how many that would be
gist.github.com/timo/f92ff69eb4045.../revisions ← look at these diffs, it's fun to look at how implementing one op removes it completely, but ups the counter on a bunch of other ops 00:51
hoelzro yiks 00:52
*yikes
timotimo refresh for one more
ah, look!
i implemented push_s and 4 bails disappeared, no bails were added
that means we were able to jit-compile 4 more frames! :)
oh, interesting 00:53
all of these 4 frames were versions of !dba
hoelzro \o/ 00:55
timotimo the sum of the "bytecode size" lines is 429597
whatever that means :)
hoelzro shrugs 00:56
I burned myself out on Moar stuff last night
so timotimo++ for soldiering on 00:57
timotimo gist.github.com/timo/f13611f6d587bb1e9188 - lookie here 00:59
hoelzro whoa 01:00
timotimo turning off the jit actually makes it faster at the moment
hoelzro 40 second parse?
timotimo but it's still way faster than without spesh at all
yeah, this is just my laptop
hoelzro I think I get 120 seconds
timotimo huh? 01:01
when was that %)
hoelzro tries again
timotimo well, okay, this laptop is only half a year old
hoelzro gah, I'll wait until after nqp-js tests finish
timotimo and was pretty up-to-date at that time
hoelzro I built my desktop like 3 months ago o_O
and it's the fastest machine I've built perl6 on!
ok, let's see... 01:02
timotimo there must be something wrong, surely
hoelzro ok, I'm on drugs 01:05
46 seconds
maybe I was thinking about parrot speeds?
hoelzro should really resume S26 work
timotimo i like S26 01:06
hoelzro you wanna help? =)
timotimo i've already done a lot way back when :S 01:07
the code was strenuous to wade through, tbh
i had hoped i could refactor the parsing of pod completely
make it more "parametric"
hoelzro ugh, I had the same hope
at this point, I feel like I've done an "ok" job
timotimo (>^_^)> 01:08
hoelzro there's still much to be done
so I should stop chasing Moar bugs, playing with nqp-js, and messing with lib/Test.pm =) 01:09
timotimo now i implemented tc, lc, uc and it split into le_s (which is now at 1) and ordat (which is now at 6)
hoelzro tbf, the first was a result of the last!
huh
it's like trying to plug a leaky boat
timotimo ordat wants a bit more than just a call to the C, it wants to verify the string's length, too
well, yeah :) 01:10
but it's kind of a fun boat
hoelzro =)
timotimo tomorrow brrt will implement sp_findmeth, which will cause an explosion of new bails
hoelzro oh joy 01:11
timotimo it'll potentially spread out into 358 individual ops and ++ all of them :P
hoelzro .oO( bailing out a boat? )
timotimo (i don't think we have that many ops)
oh, huh
650 is the last line of the oplist file that contains non-sp ops
hoelzro so, how does that work? I figured once you implemented the JITing for one op, it wouldn't show up again?
timotimo 40 is the first one
so we have 610 individual ops? wow.
that's right 01:12
hoelzro so everytime you implement one, you potentially incr the counts of other unimpl'd ones?
timotimo yup
maybe even from 0
hoelzro grr 01:13
timotimo look at the very first diff on that page
that was when i implemented bindpos_i
32 less for bindpos_i
1 more iscclass, 1 more indexat, 20 more islist, 10 more sp_findmeth
hoelzro nuts 01:14
timotimo so not a single extra frame got compiled after that
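A small sketch of why the counts move around like this, assuming each frame is charged to the first unsupported op it contains. The `Frame` type and the function names are hypothetical; the op names are borrowed from the discussion only as labels.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPS 4

/* Hypothetical frame: a short ordered list of op names. */
typedef struct {
    const char *ops[MAX_OPS];
    size_t n_ops;
} Frame;

/* Returns 1 if `op` appears in the `supported` list. */
int in_list(const char *op, const char *const *supported, size_t n_supported) {
    for (size_t i = 0; i < n_supported; i++)
        if (strcmp(supported[i], op) == 0)
            return 1;
    return 0;
}

/* Counts the frames whose first unsupported op is `target`. */
size_t count_bails_on(const char *target, const Frame *frames, size_t n_frames,
                      const char *const *supported, size_t n_supported) {
    assert(target != NULL && frames != NULL);
    size_t bails = 0;
    for (size_t f = 0; f < n_frames; f++) {
        for (size_t i = 0; i < frames[f].n_ops; i++) {
            const char *op = frames[f].ops[i];
            if (in_list(op, supported, n_supported))
                continue;
            if (strcmp(op, target) == 0)
                bails++;
            break; /* only the first unsupported op is ever counted */
        }
    }
    return bails;
}
```

With three frames `{bindpos_i, islist}`, `{bindpos_i, sp_findmeth}` and `{islist}`, supporting bindpos_i moves two bails onto islist and sp_findmeth, while the total number of bailing frames stays at three.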
but later i implemented islist
that's the next diff, one up
that diff confused me, though
didn't actually do the counts there 01:15
oh, i also un-implemented istype there
i'm surprised it didn't blow up majorly; istype would always leave the value in the register untouched
oh, hold on 01:16
i think i was wrong about *that* part
but still, istype is an invokish op, the jit has to be extra careful around these
hoelzro I see
it's pretty cool
I can't wait to see how well Moar does after the JIT is fully in place 01:17
timotimo aye.
i wonder how long the core setting compilation would take if we subtracted the time it takes to attempt a compilation and bail in the middle of it 01:18
i.e. if it'd be worth it already 01:19
the thing is, we still read and write from the working memory before and after every single translated op
brrt is going to implement a nontrivial feature in dynasm for that to improve 01:20
TimToady you might want some kind of pragma to say "Don't attempt to jit this bit for now."
timotimo i.e. keeping values in registers between ops
TimToady if you know it's going to bail
timotimo i'd assume it's not worth the time it'd take to implement :) 01:21
TimToady 'course then you fix the optimizer to work with that bit, and then it doesn't jit, d'oh
timotimo gotta run for now :) 01:22
hoelzro later timotimo
timotimo :) 01:26
02:01 btyler joined 02:05 btyler joined 02:29 jimmyz joined
jimmyz Stage parse : 39.273, before: 44 02:29
since yesterday 02:30
timotimo sweet :) 02:31
jnthn made some pretty radical improvements in nqp, and a few in moarvm as well
i'm assuming you're on nom/master/master?
02:35 btyler joined
jimmyz timotimo: yeah 02:37
timotimo yeah what exactly? :)
lizmat: just don't try the moarvm/jit-moar-ops branch with --enable-jit; it's slower than what we have with spesh, but it's still faster than without spesh 02:38
i only really wanted to benchmark the json thing, but i ended up kicking off a complete benchmark run ... oh well 02:39
when it's done with rakudo-moar, nqp-moar will only take about half the time ... :)
jimmyz timotimo: I'm on nom/master/master :P 02:40
02:41 btyler joined
timotimo ah 02:44
03:42 itz joined 04:19 ventica joined
ventica o/ 04:19
05:36 jnap joined 05:59 FROGGS joined
nwc10 jnthn: setting... 06:02
was: cmd: Rounded run time per iteration: 7.3881e+01 +/- 9.3e-02 (0.1%)
now: cmd: Rounded run time per iteration: 7.1820e+01 +/- 4.2e-02 (0.1%)
win. (about 2.5% win)
06:14 brrt joined
brrt \o 06:15
timotimo: istype is correct i believe, the function is actually marked as invokish in oplist (which means it's handled), and it's passed the address of the destination register, thus the return value is void
thanks for bringing it to my attention, though :-) 06:16
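The point about the destination register can be illustrated with an out-parameter: the callee writes its answer through a pointer, so a void return type is correct. `Register` and `check_type` are hypothetical and use plain integer type ids; they do not mirror MVM_6model_istype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register slot that can hold an integer or a number. */
typedef union {
    int64_t i64;
    double  n64;
} Register;

/* Returns nothing: the 1/0 answer is written through `result`, which is
 * the address of the destination register supplied by the caller. */
void check_type(int obj_type_id, int wanted_type_id, Register *result) {
    assert(result != NULL);
    result->i64 = (obj_type_id == wanted_type_id);
}
```

A caller that passes `&reg` and then reads `reg.i64` gets the answer even though the call itself yields no value.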
sergot hi o/ 06:17
brrt also, seeing that core.setting benchmark makes me sad 06:18
o/ sergot
oh, you said that already 06:21
much timotimo++ for hard work, though 06:22
06:40 jnap joined 06:54 zakharyas joined
nwc10 brrt: which benchmark, and why sad? 07:18
brrt nwc10: gist.github.com/timo/f13611f6d587bb1e9188 this one
and although i think there's a logical enough explanation for it, it still hurts
nwc10 I didn't say, but I didn't see faster setting compiles with JIT 07:19
07:40 oetiker joined 07:46 jnap joined, oetiker joined
nwc10 ==30641== in use at exit: 776,094,561 bytes in 2,625,245 blocks 08:02
==30641== total heap usage: 17,741,594 allocs, 15,116,349 frees, 3,670,670,682 bytes allocated
not quite a fair comparison with irclog.perlgeek.de/moarvm/2014-07-29#i_9101664 as the former is the JIT, the latter is not, and JIT will allocate more 08:03
but quite a lot less allocation
08:03 FROGGS joined
brrt is further triangulating the regex bug 08:06
hmm, very interesting 08:42
brrt is now hoping this is some sort of OSR bug 08:44
correction, a JIT-OSR bug 08:47
and, it's not that 08:51
/me brb 08:52
09:03 jnap joined 09:25 brrt joined
brrt ok, long story short, it isn't OSR 09:37
jnthn tries to find yesterday's number 09:38
(for allocated)
brrt o/ jnthn 09:39
jnthn nwc10: oh wow, that's like, a third to a quarter of what it was. 09:40
o/ brrt
In other news, yay, I have a working keyboard
On my laptop
brrt \o/
jnthn :))))) <== I can smile again and it works
brrt in other news, i'm hardly a step further in the regex bug business 09:41
jnthn Regex bug?
It was a mis-code-gen, or?
brrt probably 09:42
basically, the t/qregex/01-qregex.nqp test breaks reliably with iter, but not without 09:43
09:43 japhb_ joined
brrt because the QRegex compiler doesn't compile the regex with jit 09:43
and.. i have /still/ no idea why exactly this is 09:44
09:44 [Coke]_ joined, TimToady_ joined 09:45 cognome joined
brrt boxing ops aren't normally invokish, are they? 10:06
jnthn No
Never
brrt hmm no, you're right 10:08
that'd be weird
ok, no need to look there 10:09
jnthn lunch; bbiab 10:12
10:25 jnap joined
brrt back from lunch :-) 10:37
y this no work 10:50
nwc10 jnthn: as stated in #perl6, MoarVM about 6.5 times faster than parrot at compiling the setting 10:53
will take 27 more 2.5% speedups to get to a factor of 10 10:56
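For anyone wanting to redo this kind of estimate, here is a small sketch that counts successive speedups under one specific assumption: each 2.5% win multiplies the current factor by 1.025. The function name is hypothetical, and the model behind the figure of 27 is not stated in the log; under this compounding assumption the loop stops at 18 for a start of 6.5 and a target of 10.

```c
#include <assert.h>

/* Counts how many successive speedups of `step` (1.025 for 2.5%) are
 * needed to lift `current` to at least `target`, assuming they compound. */
int speedups_needed(double current, double target, double step) {
    assert(current > 0.0 && step > 1.0);
    int count = 0;
    while (current < target) {
        current *= step;
        count++;
    }
    return count;
}
```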
brrt \o/
brrt wonders if JIT will get us there once it ACTUALLY WORKS
and what an actual optimizing JIT could make of regexes
well, it isn't istrue combined with iter 11:18
because that looks just dandy
tl;dr 11:19
'ok frame is ok'
11:28 jnap joined
brrt y u compile so much 11:44
timotimo brrt: do you feel like jit-moar-ops could be merged into master? it seems to compile rakudo and nqp just fine 12:28
12:30 Ven joined
timotimo also, i would revert the "omg istype is probably broken!" commit 12:30
oh, i was wrong, there's still the "missing or wrong version of dependency nqpmo.nqp" thing when building nqp
brrt make clean 12:31
:-)
i'll check it in a bit 12:32
i'm stepping through the frame now
timotimo make clean doesn't help 12:34
brrt hmm what
weird
32 bytes seems a bit small for fastcreate, no? 12:43
timotimo don't know?
don't understand the context
i was wondering ... if we have a string that's mostly ascii-encodable, but has a few multibyte-utf8-chars here and there, we could pick it apart into multiple strands and have direct jump-to-correct-byte-for-offset access to big parts of the string 12:58
nwc10 you can do that by using any fixed width encoding, without needing strands 12:59
timotimo yeah, but if we have a 20k character string with a single ö in it, we'd blow it up to 4x the size (if we use ucs-32) 13:00
nwc10 possibly at the cost of more memory. But the overhead of tracking the multiple strands will not be free
agree. but implementing the 8 bit NFG would also address this
timotimo next thing i'll do is get a c-level profile of parse-json
nwc10 mostly it's only going to be 2x the size, if we can use ucs-16 13:01
timotimo except ucs-16 isn't a fixed-width encoding 13:02
well, not in the strict sense
if we do analyze the string up front, then yeah
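A minimal sketch of the "analyze the string up front" idea: scan the code points once and pick the narrowest fixed width that holds all of them, so that character N always lives at byte N times the width. The thresholds (ASCII, 16 bits, 32 bits) and the function names are assumptions for illustration, not MoarVM's actual string storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the narrowest fixed width, in bytes per code point, that can
 * hold every code point: 1 for pure ASCII, 2 up to 0xFFFF, 4 otherwise. */
size_t fixed_width_for(const uint32_t *codepoints, size_t count) {
    assert(codepoints != NULL || count == 0);
    size_t width = 1;
    for (size_t i = 0; i < count; i++) {
        if (codepoints[i] > 0xFFFF)
            return 4;
        if (codepoints[i] > 0x7F)
            width = 2;
    }
    return width;
}

/* Total storage for the string at the chosen fixed width. */
size_t fixed_width_bytes(const uint32_t *codepoints, size_t count) {
    return count * fixed_width_for(codepoints, count);
}
```

Under these thresholds a 20,000 character string with a single ö would need 40,000 bytes rather than 80,000, at the price of one pass over the string before choosing the width.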
set and decont are kinda hot 13:07
interesting, guardconc is also considered hot 13:09
well, all these "hot" things are probably just "warm" 13:10
another warm instruction is in getspeshslot
brrt getspeshslot? i thought i did that already
as well as guardconc
timotimo i'm probably reading this wrong 13:11
this is trying (and failing) to interpret a report from perf, not a jit bail log
MVM_frame_decref is warm, too 13:12
are you interested in a bail log for parse-json? 13:14
gist.github.com/timo/89aafc240d13748e4278 - this is from the perl6 version of that benchmark 13:15
AFK
brrt yes, i am 13:20
paramnameused is a big one
I CAN'T FIND THE SOURCE OF THIS BLOODY BUG
and i'm not starting anything until i found it 13:23
jnthn brrt: Hm, I should probably take a look this evening :) 13:35
See if I can spot anything of note :) 13:36
brrt i'd be delighted :-) 13:37
this feels something between Real Work and a Complete Waste Of Time
13:38 klaas-janstol joined
brrt because i'm truly so little further 13:38
jnthn What do we know so far? Where do you have to disable JIT to make it work? 13:39
brrt what i know is that disabling iter and friends make it run, if only because it disables many if not all regex methods 13:40
jnthn ah, ok
Well, every regex method uses jumptable too 13:41
So it won't be a regex method per se
It may well be something in the regex engine though
brrt timotimo also reported not being able to run because of missing or wrong dependency complaints, but i typically get that when moarvm and nqp no longer feel like they agree
jnthn like MATCH or CAPHASH
brrt well, i now have the exact frame that was compiled just the moment before the errors start
so i'm hoping that should give some insight 13:42
and yes, it uses iter
but i've not seen anything weird yet
timotimo brrt: fwiw, disabling the jit for only the single call that fails with "blah dependency" doesn't fix it; disabling the jit completely does, however 14:32
brrt you've made quite a few problems for me this week :-) 14:33
timotimo the good kind of problem, i hope 14:35
brrt pff 14:36
lets just say you've uncovered a boat load of jit bugs :-)
timotimo :)
brrt waitaminute 14:38
why... why does shift_o bottom out as an u8? 14:40
timotimo oh noes! did i do that wrong? ;( 14:45
i don't see anything wrong with my implementation off-hand
dalek MoarVM/jit-moar-ops: e4f28ab | (Timo Paulssen)++ | src/jit/graph.c:
istype is actually totally correct; i didn't see the REG_ADDR

also, it's marked as "invokish", so the jit knows how to deal with that part of the problem at least.
14:47
brrt no, probably not
hmm.. gdb is just funky i guess 14:52
timotimo that's not a good sign ... 14:54
brrt that's optimization for you 14:55
dalek MoarVM/jit-moar-ops: 542fe59 | (Timo Paulssen)++ | src/jit/graph.c:
implement uc, lc, tc.
15:05
MoarVM/jit-moar-ops: 5239e3e | (Timo Paulssen)++ | src/ (3 files):
implement splice.
15:06
MoarVM/jit-moar-ops: bca161a | (Timo Paulssen)++ | src/jit/graph.c:
implement split.
15:24
timotimo implementing split caused 2 more frames to be compiled in the core setting! :)
1.5k frames still bailing out, though
[Coke] q; is having other folks committing to the jit stuff at all confusing for final grading? 15:26
(it's awesome for the community and I entirely support it; just don't want to screw up grading)
timotimo the contributions i make are all trivial
brrt \o/ 15:39
not so trivial, they have to be correct too
git can get you a log of all my commits and changes
15:43 ventica joined
TimToady brrt++'s work is already awesome; it would be difficult to ruin his grade :) 15:49
just having the framework in place is a great thing, even if there are still enough opcodes and/or bugs at the end to mask the eventual performance improvements 15:51
JIT by definition tends to perform poorly when there are scattered weak links in the chain of opcodes, which can result when you have code in a language that is not well designed for JIT :) 15:53
brrt i would add that if you didn't know you were going to do a JIT right at the start (e.g., LuaJIT-2.0), it can be difficult in any language
TimToady I think the JIT performance vs effort curve will typically look like a hockey stick. 15:54
brrt and i would further add that some things which are o so simple in plain-old-c can be devillish in ASM :-)
brrt certainly hopes so
until we hit another barrier
and then it'll be flat again for a while, and given enough effort, might increase again 15:55
TimToady pity we don't have a language designer around to fix the language... :)
brrt so perhaps sigmoidal?
TimToady prolly
carlin sounds like a stair-case 15:57
TimToady "It'f fubconfiouf!" --Sigmoid Frund
brrt dinner & 15:59
16:45 brrt joined
brrt we should be able to dump the moarvm call stack using moar-gdb.py 16:46
jnthn Can you not just MVM_dump_backtrace(tc); in gdb?
timotimo you should be able to 16:47
16:50 klaas-janstol joined 17:02 FROGGS joined 17:33 vendethiel joined
brrt i'll try, but tc has been... optimized out :-( 17:39
why, that does explain a lot 17:41
although it puzzles me why there's an invoke there 17:51
jnthn "there"?
jnthn has nommed and will attempt to work in his unwanted free home sauna
brrt there is this, it seems 17:53
github.com/perl6/nqp/blob/master/s...r.nqp#L353 17:54
jnthn I'm extremely surprised if you're in a JIT-compiled cclass_elem in so far as we don't compile jumptabl yet... 17:55
brrt well, the code fragment within is compilable :-)
jnthn ah, yes
brrt anyway, bbi2h or so :-) 17:56
jnthn ok
brrt i think my %seen creates a bindlex?
jnthn It's just a decl afaik
brrt there's a bindlex i can't really explain otherwise 17:57
but, i'll be really off now
jnthn OK. Lemme know when you're back; I can look through the disassembly. 17:58
m: say 1041064 * 64 18:05
camelia rakudo-moar 11e193: OUTPUT«66628096␤»
jnthn m: say (1041064 * 64) / 4194304 18:06
camelia rakudo-moar 11e193: OUTPUT«15.885376␤»
timotimo what awesome patch do you have for us now? :) 18:11
also ... aaw, no brrt for 2 hours? :(
18:14 Ven joined
jnthn Will let you know in 5 mins if I have one :) 18:24
TimToady is always disappointed when he reads that jnthn will not be available till evening, till he remembers that jnthn's evening comes, like, nine hours earlier :) 18:35
jnthn :)
I realized that every alternation and protoregex evaluation ended up taking a closure for a needless reason.
FROGGS I believe there are a lot of spots in our codebase like that 18:36
TimToady sugoi! 18:37
which, as in english awesome/awful or terrific/terrible, can mean both really good and really bad, though usually good 18:38
cf awfully good and terribly good
not that STD's regex engine isn't chockablock full of closures to emulate laziness... 18:41
jnthn m: say 5524580 - 4483251 18:48
camelia rakudo-moar 509b1a: OUTPUT«1041329␤»
jnthn Yeah, I'll happily make a million less closures, thanks.
TimToady .oO(we merely have to call the slow-path binder a million times instead...) 18:49
jnthn There's no slow-path binder in NQP :) 18:50
18:59 cognome joined, cognominal__ joined 19:02 Ven joined
japhb_ Waiting for the day when jnthn can say "There's no slow path in NQP" ... 19:06
19:35 brrt joined, cognome joined
brrt jnthn, timotimo: i'm back 19:35
timotimo yay :) 19:37
turns out i was able to occupy myself with friends & food until brrt came back :)
brrt i'll have fully annotated disassembly in about an hour or 2
(annotating disassembly is expensive) 19:38
dalek MoarVM: d51a9cf | jnthn++ | src/ (2 files):
Cache fates array rather than re-allocating.
MoarVM: a4ac569 | jnthn++ | / (3 files):
Cache frame index, to avoid a linear scan.

Turns out linear scans for frame indexes dominated assembly time. With this, stage mbc for Rakudo's CORE.setting is a third of what it was.
jnthn m: say 35 + 73 19:39
camelia rakudo-moar c9ad80: OUTPUT«108␤»
jnthn Well under 2 minutes for a full NQP and Rakudo re-build on my box these days. :) 19:40
vendethiel you got a good box :P 19:41
what was it like, with last year's parrot ? :) 19:42
jnthn I forget. I suspect CORE.setting alone was in the 2 minute region though.
timotimo holy hell! 19:43
down to a third %)
that is pretty fantastic
jnthn Stage mbc was the shortest stage. But still, it's a nice win. :) 19:44
timotimo i always thought it ought to be faster than that 19:45
the closure thing ought to make all our parsing faster, no? 19:47
btyler 'Stage mbc : 0.539' :)
hoelzro great scott 19:48
timotimo er ... huh
oh!
stage mbc
i thought we were talking about stage mast
well, that's still a nice little win :)
jnthn btyler: Heh, 0.287 here :) 19:50
timotimo 0.275 19:51
i win :)
jnthn :P 19:55
tadzik "stage mbc for Rakudo's CORE.setting is a third of what it was" ( ͡° ͜ʖ ͡°) 20:00
timotimo what does that face mean? :\ 20:02
jnthn "I know more weird chars than you" 20:03
tadzik yeah :) 20:04
timotimo oke 20:06
brrt ok, apparently now i'll be iterating a hash, lets see if that does anything strange 20:30
oh, my dumb ass just stepped over the rest of the frame, completely not finding why it doesn't work 20:42
jnthn I normally use a horse rather than an ass, tbh... :P
Is the bug certainly in that frame? 20:43
Also, do you know exactly which frame it is?
brrt what i know is that there is no bug before that frame is compiled :-)
i used the break-on-print / break-on-compile technique 20:44
basically, most of the time the compiler JITs quite a few frames of itself, right? so i need to skip these before getting to the frame that i'm really interested in 20:45
i can try the reverse as well, checking if the bug persists if i put something uncompileable in that frame
20:46 Ven joined
brrt anyway, iterating on a VMArray works perfectly 20:46
related question: what's still really uncompileable? 20:48
(plenty of things, of course.. but :-) 20:54
jnthn Well, since we're actually hitting jumplist now sometimes, that means fixing that would get us compiling various regexes. 20:55
Well, and tokens/rules
brrt shivers
exciting, also very scary 20:56
right now, i'm suspicious of sp_fastcreate 20:57
jnthn Oh?
It is JITted to quite a bit of code.
brrt basically, it creates an object of only 32 bytes large 20:58
thats big enough for a header, but nothing more
it's a hash iter all right 21:00
and the object is a VMHash 21:01
so far, so good
also, the somewhat funky loop body is transformed into a routine call for reasons unbeknownst to me 21:03
if they hadn't, i wouldn't have seen this bug, so...
21:04 klaas-janstol joined
brrt further spesh opportunities would be - i'd say - creating sp_istrue_iter (and preferably, sp_istrue_iterhash / sp_istrue_iterarray), and transforming if_o for iters into those + if_i 21:06
that should help a bundle wrt to making a fast implementation 21:07
and it should be easy since iter is almost always followed by unless_o or if_o
jnthn Yeah
brrt but that's just, like, my opinion :-) 21:08
jnthn It makes sense.
brrt (in fact, spesh gives us an opportunity to disintermediate so many things)
it's really exciting
i should not forget that, probably, as i'm struggling to understand this 21:09
jnthn Yes, spesh is really about taking away various bits of late binding.
brrt and it should also remove the completely unnecessary invokish guard in this case, since istrue for an iter never invokes 21:10
jnthn yeah 21:12
brrt MVMIter shift always returns the hash for a return value? that seems.. odd 21:18
jnthn Not the hash 21:22
The iterator.
brrt you're quite right 21:23
jnthn Did investigating sp_fastcreate go any further? 21:28
brrt no, not really 21:29
the object behaves exactly as intended
it's a VMHash btw 21:30
nwc10 jnthn: ASAN barfage
I need to go to bed
==8038==ERROR: AddressSanitizer: heap-use-after-free on address 0x619000356b80 at pc 0x7fa6838f49bc bp 0x7fff6e2d9900 sp 0x7fff6e2d98f8
WRITE of size 8 at 0x619000356b80 thread T0
    #0 0x7fa6838f49bb in nqp_nfa_run src/6model/reprs/NFA.c:406
]blah 21:31
0x619000356b80 is located 0 bytes inside of 1088-byte region [0x619000356b80,0x619000356fc0)
freed by thread T0 here:
    #0 0x7fa6841778e6 in __interceptor_realloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:93
    #1 0x7fa6838f4970 in nqp_nfa_run src/6model/reprs/NFA.c:403
brrt and it is precisely as large as it needs to be in fastcreate, namely 32 bytes
nwc10 without using a debugger, I guess that the problem is
fates = (MVMint64 *)realloc(fates, sizeof(MVMint64) * fate_arr_len);
and I would hazard a guess that fate_arr_len is 0
t/spec/S05-metasyntax/longest-alternative.rakudo.moar 21:32
Nope 21:33
(gdb) p total_fates
$1 = 1
(gdb) p fate_arr_len
$2 = 1
jnthn Huh, what on earth frees it...
nwc10 #1 0x7ffff66d6970 in nqp_nfa_run src/6model/reprs/NFA.c:403 21:34
the realloc() 3 lines before that kaboom
jnthn ah
Darn, yes 21:35
nwc10 oh, it never assigns back to tc->fates 21:36
I mean tc->nfa_fates 21:37
jnthn yeah, testing a fix here now
And wondering if it's actually to blame for the SEGV I thought a different refactor had caused.
Yes, it was 21:38
nwc10++
nwc10 win!
can I go to bed please? :-)
jnthn Thanks, that's saved me quite a headache.
Yes. Sleep well. :)
nwc10 glad to be of assistance 21:39
brrt sleep well nwc10 :-) 21:40
you're always of assistance :-)
(really, hash lookups flatten strings? i can only say... wow) 21:41
jnthn Yeah, for the moment.
brrt :-)
jnthn We use uthash but need to moarph it some more to handle our strandy strings. 21:42
brrt all strings will get their own brand new normalization right?
hmm i see
then using strandy strings might sometimes be much more expensive than imagined
jnthn Yeah, for the moment.
The string opts so far aren't the end of the work :) 21:43
Just enough of an improvement for some previously painful benchmarks.
brrt is suposse VMIter has a value method 21:44
s/is/i/ 21:45
dalek MoarVM: 2f16928 | jnthn++ | src/6model/reprs/NFA.c:
Fix the fates allocation optimization.

It failed to update a code path that reallocated, leading to a use after free bug. Found by nwc10++ using ASAN.
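The class of bug described here (a cached buffer is grown with realloc but the cache keeps the old pointer) can be sketched like this. `FateCache` and `fates_reserve` are hypothetical names; the real fix lives in src/6model/reprs/NFA.c and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-thread cache holding a reusable buffer. */
typedef struct {
    int64_t *fates;
    size_t   fates_len;
} FateCache;

/* Returns a buffer with room for at least `needed` entries, growing the
 * cached one when required. The key step is storing the pointer returned
 * by realloc back into the cache: the old pointer may have been freed. */
int64_t *fates_reserve(FateCache *cache, size_t needed) {
    assert(cache != NULL);
    if (needed > cache->fates_len) {
        int64_t *grown = realloc(cache->fates, needed * sizeof *grown);
        if (grown == NULL)
            return NULL; /* the old buffer is still valid and still cached */
        cache->fates = grown;
        cache->fates_len = needed;
    }
    return cache->fates;
}
```

If the two assignments after the realloc are left out, the next caller reads or writes through the stale `cache->fates`, which is the heap-use-after-free pattern that ASAN reports.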
jnthn brrt: Well, VMIter is just a representation
brrt: But yeah, it can be on a type with a .value method.
brrt: NQP defines an NQPHashIter type with one, iirc 21:46
brrt hmm ok
i suppose my understanding of 6model is still rather limited
i.e. i treat a REPR as a class
but that's pretty much wrong :-)
jnthn Right
REPR is about memory layout 21:47
Type = meta-object + REPR
An STable is per type (the "S" meaning "Shared", as in "stuff things of the same type share in common")
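A rough sketch of the relationship jnthn describes, with a REPR for memory layout, a meta-object that is itself an ordinary object, and one shared table per type. The struct names and fields are hypothetical and much simpler than MoarVM's real 6model structures.

```c
#include <assert.h>
#include <stddef.h>

/* REPR: describes memory layout only. */
typedef struct {
    const char *name;
    size_t      instance_size;
} Repr;

typedef struct Object Object;

/* One shared table per type: Type = meta-object + REPR. */
typedef struct {
    const Repr *repr;
    Object     *meta_object;
} STable;

/* Every object, including a meta-object, points at its type's table. */
struct Object {
    const STable *st;
};

/* Two objects have the same type when they share the same table. */
int same_type(const Object *a, const Object *b) {
    assert(a != NULL && b != NULL);
    return a->st == b->st;
}
```

Two types can share one REPR and still be distinct types because their meta-objects, and therefore their shared tables, differ; that is why treating a REPR as a class is misleading.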
brrt ok, i think i get that 21:49
premature optimizer me would say 'how nice would it be if these (meta-obj + repr) could be aligned in memory 21:50
jnthn Well, a meta-obj is just a normal object
brrt i wonder why i'm starting to like nqp 21:55
f..... 21:56
oh bloody hell
this can't be serious
call MVM_dump_backtrace, ASAN kicks in, crashburn 21:57
jnthn :(
Did it point out some intresting kind of corruption?
brrt no, not really, probably just because clang optimized something away
well, i'm pretty sure now the problem is /somewhere/ in that frame 22:03
except that it seems to work precisely as advertised
as in, /precisely/ as advertised
why am i so sure? well, i inserted something that i knew wouldn't compile - namely, eq_n, and lo and behold, test no longer crashes 22:04
the secret is in there 22:05
also, despite being not american and never will be a customer of such a data provider, this scares me terribly: blogs.wsj.com/digits/2014/07/30/spr...less-plan/ 22:07
imagine an internet with only 4 sites 22:09
jnthn ugh
brrt if i had that when i was young, i never ever would have gotten even as far as i have
no irc
no mailing lists
no msdn, no wikipedia, no random guys blogs 22:10
(or gals)
no slackware, fedora, debian
jnthn Yeah. Wow. 22:12
Quite a step backwards.
brrt btw, i find it ironic that it's easier to debug a JIT frame for me than to debug interpreted frame 22:13
jnthn o.O :) 22:14
brr 22:17
brrt: oh no, I just spotted something and you're not going to like it :(
for %seen { 22:18
next if $_.value < 2;
self.worry("Repeated character (" ~ $_.key ~ ") unexpectedly found in character class");
}
Could it be that "next" there?
That's implemented as an exception. 22:19
I shoulda seen it before, but I was looking at the previous loop :(
brrt o really? 22:21
well, that's certainly possible, i guess
jnthn Yeah. Stick a breakpoint in throwcatdyn or so 22:22
brrt ok
if that's it, you are my hero
gist.github.com/bdw/899799631900fab733fa this is by the way, everything i've collected so far 22:26
that... seems to be it, yet, throwcatdyn is actually called 22:30
jnthn Yes, trace into it and see what it does... 22:31
It'll unwind a frame and then...do something...with the PC
brrt well,, the caller frame indeed has a handler 22:38
oh, lord
timotimo ooooh, are we close to a fix for the hang issues? :)
brrt well.... we're close to the /reason/ for the hang issues, that much seems certain 22:39
brrt should have debugged unoptimized code long ago
i suppose the 'next' exception should've been caught by the frame above it? 22:45
jnthn Yes 22:46
exception.c attempts to put the PC in the right place...
...then I don't quite know what will happen.
brrt well, it seems to jump way above that
i mean way way above that
why would that be?
jnthn Well, it will unwind the current frame 22:47
frame.c has that logic
Maybe it's then finding some outdated JIT re-entry address?
And re-entering the JITted code at the wrong place 22:48
Rather than at the place the exception handler points to
TimToady long run this one oughta just turn into a goto and never throw
jnthn TimToady: Yeah, that relies on the loop body being inlined...which I don't know why it isn't, tbh 22:49
TimToady basically needs to go to whatever NEXTish logic there is
p5 jumps to the continue block, but optimizes that to jump directly to the while condition, loosely speaking 22:51
(in the absence of a continue)
maybe lexical reentry conditions are a problem though 22:52
jnthn Yeah
It's already on the Moar todo list this month to optimize various throws into gotos. 22:53
TimToady for %seen has an implicit $_
jnthn Right
I'm a bit surprised it hasn't managed to optimize that away, though.
TimToady but if you know you're gonna clobber it anyway
no need to reclone
or whatever it does on entry to the block 22:54
could become a state var, really
jnthn Well, creates a frame in theory, but just re-uses the frame from last time in reality.
TimToady waves hands encourageingly :)
*ging 22:55
jnthn I thought I'd taught NQP how to turn such for loops into not needing to invoke, though.
I'll have to look into why the opt didn't work out this time.
TimToady maybe it's does, and that's why there's no catcher for 'next'?
*it 22:56
just a wag
about all I do anymore :)
23:03 ventica joined
brrt as it seems to be, it never re-enters the JIT at all 23:04
which is what i experience here, too 23:05
oh, i see what's the problem
timotimo yay!
brrt basically, search_frame_handlers searches handlers on basis of the current bytecode offset 23:06
i.e. i suppose the handler is a mapping between a throwpoint and a catchpoint
jnthn Ahhhh
brrt clearly, when in the JIT, the current bytecode is the magic bytecode 23:07
jnthn Right.
brrt so that doesn't appear in any handler
so it walks the stack all the way until it finds something that /does/ have a handler, and continues from there
the quick fix is to disable JIT compilation on graphs that have handlers, i think? 23:08
and then tomorrow work on adding handlers to the JIT
somehow, at least
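[Illustration] The lookup brrt describes can be sketched roughly as follows. This is a simplified model, not MoarVM's actual code: the `Handler` and `Frame` structs, the `JIT_MAGIC_PC` value, and all function names are hypothetical stand-ins. It assumes a handler covers a half-open range of bytecode offsets and that a frame running JITted code reports an offset that no range contains.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the offset of the "magic bytecode" that a
 * frame reports while it is executing JITted code. */
#define JIT_MAGIC_PC 0xFFFFFFFFu

/* Hypothetical handler record: covers offsets [start, end). */
typedef struct {
    uint32_t start;   /* first covered bytecode offset (inclusive) */
    uint32_t end;     /* end of the covered range (exclusive) */
    uint32_t target;  /* offset to resume at when the handler fires */
} Handler;

typedef struct Frame {
    const Handler *handlers;
    size_t num_handlers;
    uint32_t pc;                 /* current bytecode offset */
    const struct Frame *caller;  /* NULL at the bottom of the stack */
} Frame;

/* Return the handler covering this frame's current offset, or NULL. */
static const Handler *handler_in_frame(const Frame *f) {
    for (size_t i = 0; i < f->num_handlers; i++) {
        const Handler *h = &f->handlers[i];
        if (f->pc >= h->start && f->pc < h->end)
            return h;
    }
    return NULL;
}

/* Walk outward from the throwing frame. Returns how many frames were
 * skipped before a handler matched, or -1 if none matched. */
int frames_unwound(const Frame *f) {
    int depth = 0;
    for (; f != NULL; f = f->caller, depth++) {
        if (handler_in_frame(f) != NULL)
            return depth;
    }
    return -1;
}

/* Two-frame demo: the loop frame has a handler for offsets 10..19,
 * its caller has one for 0..99 and is currently at offset 50. */
int unwind_depth_for_loop_pc(uint32_t loop_pc) {
    static const Handler outer_handlers[] = { { 0, 100, 90 } };
    static const Handler loop_handlers[]  = { { 10, 20, 30 } };
    Frame outer = { outer_handlers, 1, 50, NULL };
    Frame loop  = { loop_handlers, 1, loop_pc, &outer };
    return frames_unwound(&loop);
}
```

In this model, `unwind_depth_for_loop_pc(12)` matches the loop frame's own handler, while `unwind_depth_for_loop_pc(JIT_MAGIC_PC)` misses it and falls through to the caller's handler, which is the "jumps way above" behaviour seen in the debugger.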
brrt ponders a bit on how that should look, but it will probably involve dynamic labels much like OSR 23:10
jnthn brrt: Yes, that could work though might rule out a lot of things
brrt well, it /should/ rule out a lot of things 23:13
it's broken :-)
i'll fix it, but for now, it won't work 23:14
jnthn *nod*
Yes, agree with the way forward.
timotimo will you put in that hotfix before bed or tomorrow? 23:15
brrt right now is when i'm testing it
again, lizmat++ and woolfy++ got me a computer that can speedwise compete with jnthn's :-) so i don't have to wait for compilations so long anymoe
more
no more hanging during building 23:16
nqp
jnthn \o/ 23:17
brrt btw, jnthn, it might interest you that the block within for %seen is invoked by invoke_o after JIT (and spesh) 23:18
i.e., not even fastinvoke
jnthn Yeah, I know why :) 23:20
dalek MoarVM/moar-jit: 7d17cb0 | (Bart Wiegmans)++ | src/jit/graph.c:
Don't compile frames that have handlers

Handlers and the JIT don't play so nicely together, yet. This is because searching a frame handler looks for the current bytecode offset, which is all off in the JIT because it uses a magic bytecode.
So I'll disable the handlers for now. This fixes a long-standing issue whereby adding the iter ops would cause infinite loops and other weird behavior.
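[Illustration] The shape of such a guard might look like the sketch below. It is not the code from the commit: `FrameInfo`, `JitGraph`, and `try_build_graph` are hypothetical names, and the only assumption is that the graph builder can see how many handlers a frame declares and can refuse to build a graph, leaving the frame to the interpreter.

```c
#include <stddef.h>

/* Hypothetical summary of a frame as seen by the graph builder. */
typedef struct {
    size_t num_handlers;  /* exception handlers declared by the frame */
    size_t num_ops;       /* number of ops in the frame's bytecode */
} FrameInfo;

/* Hypothetical JIT graph; only the op count is modelled here. */
typedef struct {
    size_t num_ops;
} JitGraph;

/* Returns 1 and fills *out if the frame may be compiled.
 * Returns 0 (bail) if the frame has any handlers, so that handler
 * lookup keeps working on real bytecode offsets in the interpreter. */
int try_build_graph(const FrameInfo *frame, JitGraph *out) {
    if (frame->num_handlers > 0)
        return 0;
    out->num_ops = frame->num_ops;
    return 1;
}
```

The cost of this approach is the one jnthn raises: every frame with a handler falls back to interpretation until the JIT learns to map its own code positions to handlers.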
brrt jnthn++ for actually finding the cause 23:23
timotimo that was tough. 23:25
jnthn aye
brrt++ # persistence
brrt :-)
we're not out of the woods yet
jit-moar-ops, even with the current fix, can't compile nqp
timotimo darn 23:26
brrt current error: "No applicable candidates found to dispatch to for 'compile_node'."
timotimo uh ... huh? 23:27
brrt hopes something went wrong in jit-moar-ops
but i doubt it
what's wrong with that, you think?
23:27 itz_ joined
brrt oh, that's not a JIT bug 23:31
still happens with MVM_JIT_DISABLE=1 :-)
and not a spesh bug
maybe a corrupted-install bug
let's check that first 23:32
that's not it 23:33
jnthn brrt: Do you have a latest NQP? 23:34
brrt i think i do
hmm 23:37
seems like a jit bug after all
or, you know, i just don't know
maybe this is the bug timotimo was talking about 23:38
jnthn Sleep on it? :)
Or are you too zoned in? :)
dalek MoarVM/jit-moar-ops: 7d17cb0 | (Bart Wiegmans)++ | src/jit/graph.c:
Don't compile frames that have handlers

Handlers and the JIT don't play so nicely together, yet. This is because searching a frame handler looks for the current bytecode offset, which is all off in the JIT because it uses a magic bytecode.
So I'll disable the handlers for now. This fixes a long-standing issue whereby adding the iter ops would cause infinite loops and other weird behavior.
23:39
MoarVM/jit-moar-ops: d14351c | (Bart Wiegmans)++ | src/jit/graph.c:
Merge branch 'moar-jit' into jit-moar-ops
timotimo may need a clean and then a JIT_DISABLE run from the very beginning
that's how one of my problems behaved
brrt yeah, i suspect that will fix it
but that's nasty
it means something in the compilation went wrong
timotimo yes 23:40
in a way that doesn't make it crash
brrt i mean that's which circle of debugging hell?
jnthn tbh, I'd pull the patches from timotimo++'s branch in one or two at a time until you hit the one that breaks 23:41
brrt yep, that seems like a plan
again :-)
:-)
brrt is going to sleep on it
jnthn 'night o/
brrt timotimo: if you're still awake, btw, when did this first appear? 23:42
'night :-) 23:43
timotimo oof 23:44
last night? dunno :(
brrt well, doesn't matter, i'll find it eventually
23:44 brrt left
timotimo good luck! 23:46