timotimo | huh | 00:04 | |
why is there an implementation of istype in the jit? | |||
1) it turns into a call to MVM_6model_istype, but pretends the return value is void and | |||
2) it can be invokish | |||
dalek | arVM/jit-moar-ops: 8c79da9 | (Timo Paulssen)++ | src/jit/graph.c: istype is invokish and its return value ought to be INT, not VOID so i just comment it out for the time being. |
00:07 | |
arVM/jit-moar-ops: d25255a | (Timo Paulssen)++ | src/jit/graph.c: support for bindpos_i is trivial to add |
|||
arVM/jit-moar-ops: f5229ae | (Timo Paulssen)++ | src/ (3 files): add isint/isnum/isstr/islist/ishash in the interest of finding out what other ops will cause bails down the road. These ops ought to be implemented by generating a tiny bit of assembly code instead of a 15cb79e | (Timo Paulssen)++ | src/ (3 files): add isint/isnum/isstr/islist/ishash in the interest of finding out what other ops will cause bails down the road. These ops ought to be implemented by generating a tiny bit of assembly code instead of a C call, but i leave that for brrt++ to do. |
|||
timotimo | wtf | 00:11 | |
out of nowhere, 128 bails for jumplist have appeared | |||
with jit-moar-ops, i cannot build nqp any more. i get a strange error: | 00:19 | ||
/usr/bin/perl tools/build/gen-cat.pl moar src/QRegex/P5Regex/Grammar.nqp src/QRegex/P5Regex/Actions.nqp src/QRegex/P5Regex/Compiler.nqp > gen/moar/stage2/NQPP5QRegex.nqp | |||
./nqp-m --target=mbc --output=NQPP5QRegex.moarvm \ | |||
gen/moar/stage2/NQPP5QRegex.nqp | |||
Unhandled exception: Missing or wrong version of dependency 'gen/moar/stage1/nqpmo.nqp' | |||
at <unknown>:1 (./NQPHLL.moarvm::1070) | |||
well, jumplist is getting more and more interesting :) | 00:31 | ||
dalek | arVM/jit-moar-ops: 7483c74 | (Timo Paulssen)++ | src/jit/graph.c: implement iscclass |
00:37 | |
arVM/jit-moar-ops: 55a74f0 | (Timo Paulssen)++ | src/jit/graph.c: implement nfarunalt |
|||
arVM/jit-moar-ops: f5cd5be | (Timo Paulssen)++ | src/jit/graph.c: implement substr_s and index_s |
|||
timotimo | sadly can't implement ne_s, as it'd need a negation in addition to the call to MVM_string_equal ... | 00:39 | |
i think that's pretty much all the hot ops i can implement properly | 00:42 | ||
the rest is just ops that we run into fewer than 10 times | 00:43 | ||
hoelzro | I would say "cool", but I don't really understand what you're doing right now timotimo =) | 00:45 | |
dalek | arVM/jit-moar-ops: cc9e819 | (Timo Paulssen)++ | src/jit/graph.c: fix setelemspos |
00:47 | |
arVM/jit-moar-ops: bd0c3b9 | (Timo Paulssen)++ | src/jit/graph.c: push_n and push_s are also easy. |
|||
timotimo | well | ||
the jit will be invoked for every frame that spesh generates and that is considered even hotter still | 00:48 | ||
it will just optimistically run and as soon as it sees an op that it doesn't know, it'll bail | |||
it also writes "BAIL" and the name of the op into the jitlog | |||
hoelzro | ah, I read about that on your blog | ||
timotimo | yup | ||
hoelzro | pretty cool stuff =) | ||
timotimo | these ops i'm implementing are ops that i can just map 1:1 to C function calls | ||
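The bail behaviour described above can be sketched as a toy model. This is not MoarVM's code: `try_jit`, the table layout and the `BAIL: <op>` log format are assumptions made for illustration. The two op-to-function pairs are inferred from function names mentioned in this log.

```python
# Toy model of an optimistic JIT pass that bails on the first unknown op.
# Hypothetical table: op name -> C function it maps to 1:1.
C_CALL_FOR_OP = {
    "istype": "MVM_6model_istype",
    "eq_s": "MVM_string_equal",
}

def try_jit(frame_ops, table=C_CALL_FOR_OP, log=None):
    """Return the C calls for a frame, or None if the JIT has to bail."""
    calls = []
    for op in frame_ops:
        target = table.get(op)
        if target is None:
            # Unknown op: record it (assumed log format) and give up on the frame.
            if log is not None:
                log.append(f"BAIL: {op}")
            return None
        calls.append(target)
    return calls

jitlog = []
compiled = try_jit(["eq_s", "istype"], log=jitlog)        # every op is known
bailed = try_jit(["eq_s", "ne_s", "istype"], log=jitlog)  # ne_s is not in the table
```

In this model a frame is compiled only if every one of its ops has a table entry, so a single unknown op is enough to lose the whole frame.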
hoelzro | but something about how the JIT doesn't generate faster code yet? | 00:49 | |
timotimo | well, there's multiple parts to that problem | ||
for one, it doesn't know how to deal with perl6's extops, like p6bind, p6bool, ... | |||
so it wouldn't be used in any of the benchmarks we have | |||
hoelzro | ah ha | ||
timotimo | the other part is that the "mainline" of programs have variadic arguments, which makes spesh itself bail before even trying to do anything | 00:50 | |
hoelzro | so it may benefit NQP, but not Rakudo (yet)? | ||
timotimo | so neither the perl6 benchmarks nor the nqp benchmarks will show any difference at all | ||
except slower run-time, because it tries to jit things here and there | |||
hoelzro | ah ha | ||
timotimo | it can benefit perl6 programs, but only for frames where we generate super tight code and also don't hit any extops | ||
hoelzro | I see | ||
timotimo | i have no idea how many that would be | ||
gist.github.com/timo/f92ff69eb4045.../revisions - look at these diffs, it's fun to look at how implementing one op removes it completely, but ups the counter on a bunch of other ops | 00:51 | ||
hoelzro | yiks | 00:52 | |
*yikes | |||
timotimo | refresh for one more | ||
ah, look! | |||
i implemented push_s and 4 bails disappeared, no bails were added | |||
that means we were able to jit-compile 4 more frames! :) | |||
oh, interesting | 00:53 | ||
all of these 4 frames were versions of !dba | |||
hoelzro | \o/ | 00:55 | |
timotimo | the sum of the "bytecode size" lines is 429597 | ||
whatever that means :) | |||
hoelzro shrugs | 00:56 | ||
I burned myself out on Moar stuff last night | |||
so timotimo++ for soldiering on | 00:57 | ||
timotimo | gist.github.com/timo/f13611f6d587bb1e9188 - lookie here | 00:59 | |
hoelzro | whoa | 01:00 | |
timotimo | turning off the jit actually makes it faster at the moment | ||
hoelzro | 40 second parse? | ||
timotimo | but it's still way faster than without spesh at all | ||
yeah, this is just my laptop | |||
hoelzro | I think I get 120 seconds | ||
timotimo | huh? | 01:01 | |
when was that %) | |||
hoelzro tries again | |||
timotimo | well, okay, this laptop is only half a year old | ||
hoelzro | gah, I'll wait until after nqp-js tests finish | ||
timotimo | and was pretty up-to-date at that time | ||
hoelzro | I built my desktop like 3 months ago o_O | ||
and it's the fastest machine I've built perl6 on! | |||
ok, let's see... | 01:02 | ||
timotimo | there must be something wrong, surely | ||
hoelzro | ok, I'm on drugs | 01:05 | |
46 seconds | |||
maybe I was thinking about parrot speeds? | |||
hoelzro should really resume S26 work | |||
timotimo | i like S26 | 01:06 | |
hoelzro | you wanna help? =) | ||
timotimo | i've already done a lot way back when :S | 01:07 | |
the code was strenuous to wade through, tbh | |||
i had hoped i could refactor the parsing of pod completely | |||
make it more "parametric" | |||
hoelzro | ugh, I had the same hope | ||
at this point, I feel like I've done an "ok" job | |||
timotimo | (>^_^)> | 01:08 | |
hoelzro | there's still much to be done | ||
so I should stop chasing Moar bugs, playing with nqp-js, and messing with lib/Test.pm =) | 01:09 | ||
timotimo | now i implemented tc, lc, uc and it split into le_s (which is now at 1) and ordat (which is now at 6) | ||
hoelzro | tbf, the first was a result of the last! | ||
huh | |||
it's like trying to plug a leaky boat | |||
timotimo | ordat wants a bit more than just a call to the C, it wants to verify the string's length, too | ||
well, yeah :) | 01:10 | ||
but it's kind of a fun boat | |||
hoelzro | =) | ||
timotimo | tomorrow brrt will implement sp_findmeth, which will cause an explosion of new bails | ||
hoelzro | oh joy | 01:11 | |
timotimo | it'll potentially spread out into 358 individual ops and ++ all of them :P | ||
hoelzro .oO( bailing out a boat? ) | |||
timotimo | (i don't think we have that many ops) | ||
oh, huh | |||
650 is the last line of the oplist file that contains non-sp ops | |||
hoelzro | so, how does that work? I figured once you implemented the JITing for one op, it wouldn't show up again? | ||
timotimo | 40 is the first one | ||
so we have 610 individual ops? wow. | |||
that's right | 01:12 | ||
hoelzro | so everytime you implement one, you potentially incr the counts of other unimpl'd ones? | ||
timotimo | yup | ||
maybe even from 0 | |||
hoelzro | grr | 01:13 | |
timotimo | look at the very first diff on that page | ||
that was when i implemented bindpos_i | |||
32 less for bindpos_i | |||
1 more iscclass, 1 more indexat, 20 more islist, 10 more sp_findmeth | |||
hoelzro | nuts | 01:14 | |
timotimo | so not a single extra frame got compiled after that | ||
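Why implementing one op only moves the bail counts around can be shown with a small tally. The frames below are made up for illustration, and `bail_counts` is a hypothetical helper. The only assumption about the JIT is the one stated in the log: a frame bails at the first op it does not know.

```python
from collections import Counter

def bail_counts(frames, supported):
    """Count, per op, how many frames bail on that op first."""
    counts = Counter()
    for ops in frames:
        for op in ops:
            if op not in supported:
                counts[op] += 1
                break  # only the first unknown op in a frame is ever reported
    return counts

# Made-up frames: each list is the op sequence of one frame.
frames = [
    ["bindpos_i", "islist"],
    ["bindpos_i", "sp_findmeth"],
    ["bindpos_i", "islist"],
    ["islist"],
]

before = bail_counts(frames, supported=set())
after = bail_counts(frames, supported={"bindpos_i"})
```

Here `bindpos_i` disappears from the tally once it is supported, but the three frames that used to stop there now stop on `islist` or `sp_findmeth`. The total number of bailing frames stays the same.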
but later i implemented islist | |||
that's the next diff, one up | |||
that diff confused me, though | |||
didn't actually do the counts there | 01:15 | ||
oh, i also un-implemented istype there | |||
i'm surprised it didn't blow up majorly; istype would always leave the value in the register untouched | |||
oh, hold on | 01:16 | ||
i think i was wrong about *that* part | |||
but still, istype is an invokish op, the jit has to be extra careful around these | |||
hoelzro | I see | ||
it's pretty cool | |||
I can't wait to see how well Moar does after the JIT is fully in place | 01:17 | ||
timotimo | aye. | ||
i wonder how long the core setting compilation would take if we subtracted the time it takes to attempt a compilation and bail in the middle of it | 01:18 | ||
i.e. if it'd be worth it already | 01:19 | ||
the thing is, we still read and write from the working memory before and after every single translated op | |||
brrt is going to implement a nontrivial feature in dynasm for that to improve | 01:20 | ||
TimToady | you might want some kind of pragma to say "Don't attempt to jit this bit for now." | ||
timotimo | i.e. keeping values in registers between ops | ||
TimToady | if you know it's going to bail | ||
timotimo | i'd assume it's not worth the time it'd take to implement :) | 01:21 | |
TimToady | 'course then you fix the optimizer to work with that bit, and then it doesn't jit, d'oh | ||
timotimo | gotta run for now :) | 01:22 | |
hoelzro | later timotimo | ||
timotimo | :) | 01:26 | |
02:01
btyler joined
02:05
btyler joined
02:29
jimmyz joined
|
|||
jimmyz | Stage parse : 39.273, before: 44 | 02:29 | |
since yesterday | 02:30 | ||
timotimo | sweet :) | 02:31 | |
jnthn made some pretty radical improvements in nqp, and a few in moarvm as well | |||
i'm assuming you're on nom/master/master? | |||
02:35
btyler joined
|
|||
jimmyz | timotimo: yeah | 02:37 | |
timotimo | yeah what exactly? :) | ||
lizmat: just don't try the moarvm/jit-moar-ops branch with --enable-jit; it's slower than what we have with spesh, but it's still faster than without spesh | 02:38 | ||
i only really wanted to benchmark the json thing, but i ended up kicking off a complete benchmark run ... oh well | 02:39 | ||
when it's done with rakudo-moar, nqp-moar will only take about half the time ... :) | |||
jimmyz | timotimo: I'm on nom/master/master :P | 02:40 | |
02:41
btyler joined
|
|||
timotimo | ah | 02:44 | |
03:42
itz joined
04:19
ventica joined
|
|||
ventica | o/ | 04:19 | |
05:36
jnap joined
05:59
FROGGS joined
|
|||
nwc10 | jnthn: setting... | 06:02 | |
was: cmd: Rounded run time per iteration: 7.3881e+01 +/- 9.3e-02 (0.1%) | |||
now: cmd: Rounded run time per iteration: 7.1820e+01 +/- 4.2e-02 (0.1%) | |||
win. (about 2.5% win) | |||
06:14
brrt joined
|
|||
brrt | \o | 06:15 | |
timotimo: istype is correct i believe, the function is actually marked as invokish in oplist (which means it's handled), and it's passed the address of the destination register, thus the return value is void | |||
thanks for bringing it to my attention, though :-) | 06:16 | ||
sergot | hi o/ | 06:17 | |
brrt | also, seeing that core.setting benchmark makes me sad | 06:18 | |
o/ sergot | |||
oh, you said that already | 06:21 | ||
much timotimo++ for hard work, though | 06:22 | ||
06:40
jnap joined
06:54
zakharyas joined
|
|||
nwc10 | brrt: which benchmark, and why sad? | 07:18 | |
brrt | nwc10: gist.github.com/timo/f13611f6d587bb1e9188 this one | ||
and although i think there's a logical enough explanation for it, it still hurts | |||
nwc10 | I didn't say, but I didn't see faster setting compiles with JIT | 07:19 | |
07:40
oetiker joined
07:46
jnap joined,
oetiker joined
|
|||
nwc10 | ==30641== in use at exit: 776,094,561 bytes in 2,625,245 blocks | 08:02 | |
==30641== total heap usage: 17,741,594 allocs, 15,116,349 frees, 3,670,670,682 bytes allocated | |||
not quite a fair comparison with irclog.perlgeek.de/moarvm/2014-07-29#i_9101664 as the former is the JIT, the latter is not, and JIT will allocate more | 08:03 | ||
but quite a lot less allocation | |||
08:03
FROGGS joined
|
|||
brrt is further triangulating the regex bug | 08:06 | ||
hmm, very interesting | 08:42 | ||
brrt is now hoping this is some sort of OSR bug | 08:44 | ||
correction, a JIT-OSR bug | 08:47 | ||
and, it's not that | 08:51 | ||
/me brb | 08:52 | ||
09:03
jnap joined
09:25
brrt joined
|
|||
brrt | ok, long story short, it isn't OSR | 09:37 | |
jnthn tries to find yesterday's number | 09:38 | ||
(for allocated) | |||
brrt | o/ jnthn | 09:39 | |
jnthn | nwc10: oh wow, that's like, a third to a quarter of what it was. | 09:40 | |
o/ brrt | |||
In other news, yay, I have a working keyboard | |||
On my laptop | |||
brrt | \o/ | ||
jnthn | :))))) <== I can smile again and it works | ||
brrt | in other news, i'm hardly a step further in the regex bug business | 09:41 | |
jnthn | Regex bug? | ||
It was a mis-code-gen, or? | |||
brrt | probably | 09:42 | |
basically, the t/qregex/01-qregex.nqp test breaks reliably with iter, but not without | 09:43 | ||
09:43
japhb_ joined
|
|||
brrt | because the QRegex compiler doesn't compile the regex with jit | 09:43 | |
and.. i have /still/ no idea why exactly this is | 09:44 | ||
09:44
[Coke]_ joined,
TimToady_ joined
09:45
cognome joined
|
|||
brrt | boxing ops aren't normally invokish, are they? | 10:06 | |
jnthn | No | ||
Never | |||
brrt | hmm no, you're right | 10:08 | |
that'd be weird | |||
ok, no need to look there | 10:09 | ||
jnthn | lunch; bbiab | 10:12 | |
10:25
jnap joined
|
|||
brrt back from lunch :-) | 10:37 | ||
y this no work | 10:50 | ||
nwc10 | jnthn: as stated in #perl6, MoarVM about 6.5 times faster than parrot at compiling the setting | 10:53 | |
will take 27 more 2.5% speedups to get to a factor of 10 | 10:56 | ||
brrt | \o/ | ||
brrt wonders if JIT will get us there once it ACTUALLY WORKS | |||
and what an actual optimizing JIT could make of regexes | |||
well, it isn't istrue combined with iter | 11:18 | ||
because that looks just dandy | |||
tl;dr | 11:19 | ||
'ok frame is ok' | |||
11:28
jnap joined
|
|||
brrt | y u compile so much | 11:44 | |
timotimo | brrt: do you feel like jit-moar-ops could be merged into master? it seems to compile rakudo and nqp just fine | 12:28 | |
12:30
Ven joined
|
|||
timotimo | also, i would revert the "omg istype is probably broken!" commit | 12:30 | |
oh, i was wrong, there's still the "missing or wrong version of dependency nqpmo.nqp" thing when building nqp | |||
brrt | make clean | 12:31 | |
:-) | |||
i'll check it in a bit | 12:32 | ||
i'm stepping through the frame now | |||
timotimo | make clean doesn't help | 12:34 | |
brrt | hmm what | ||
weird | |||
32 bytes seems a bit small for fastcreate, no? | 12:43 | ||
timotimo | don't know? | ||
don't understand the context | |||
i was wondering ... if we have a string that's mostly ascii-encodable, but has a few multibyte-utf8-chars here and there, we could pick it apart into multiple strands and have direct jump-to-correct-byte-for-offset access to big parts of the string | 12:58 | ||
nwc10 | you can do that by using any fixed-width encoding, without needing strands | 12:59 | |
timotimo | yeah, but if we have a 20k character string with a single ö in it, we'd blow it up to 4x the size (if we use ucs-32) | 13:00 | |
nwc10 | possibly at the cost of more memory. But the overhead of tracking the multiple strands will not be free | ||
agree. but implementing the 8 bit NFG would also address this | |||
timotimo | next thing i'll do is get a c-level profile of parse-json | ||
nwc10 | mostly it's only going to be 2x the size, if we can use ucs-16 | 13:01 | |
timotimo | except ucs-16 isn't a fixed-width encoding | 13:02 | |
well, not in the strict sense | |||
if we do analyze the string up front, then yeah | |||
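The size trade-off in this exchange can be checked with Python's built-in codecs. This is only a sketch of the arithmetic, not of MoarVM's string representation. Python offers UTF-16 and UTF-32 rather than "ucs-16" and "ucs-32", so those codecs stand in here.

```python
# 20k characters, exactly one of them outside ASCII.
text = "a" * 19_999 + "\u00f6"

utf8_bytes = len(text.encode("utf-8"))       # 1 byte per ASCII char, 2 for the last one
utf32_bytes = len(text.encode("utf-32-le"))  # 4 bytes per code point, no BOM

# UTF-16 is only fixed-width inside the Basic Multilingual Plane.
utf16_bmp = len("\u00f6".encode("utf-16-le"))         # one 16-bit unit
utf16_astral = len("\U0001F600".encode("utf-16-le"))  # surrogate pair, two units
```

The fixed-width form costs roughly four times the memory of the mostly-ASCII form. A 16-bit encoding stays fixed-width only if an up-front scan shows the string has no characters beyond the BMP.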
set and decont are kinda hot | 13:07 | ||
interesting, guardconc is also considered hot | 13:09 | ||
well, all these "hot" things are probably just "warm" | 13:10 | ||
another warm instruction is in getspeshslot | |||
brrt | getspeshslot? i thought i did that already | ||
as well as guardconc | |||
timotimo | i'm probably reading this wrong | 13:11 | |
this is trying (and failing) to interpret a report from perf, not a jit bail log | |||
MVM_frame_decref is warm, too | 13:12 | ||
are you interested in a bail log for parse-json? | 13:14 | ||
gist.github.com/timo/89aafc240d13748e4278 - this is from the perl6 version of that benchmark | 13:15 | ||
AFK | |||
brrt | yes, i am | 13:20 | |
paramnameused is a big one | |||
I CAN'T FIND THE SOURCE OF THIS BLOODY BUG | |||
and i'm not starting anything until i found it | 13:23 | ||
jnthn | brrt: Hm, I should probably take a look this evening :) | 13:35 | |
See if I can spot anything of note :) | 13:36 | ||
brrt | i'd be delighted :-) | 13:37 | |
this feels something between Real Work and a Complete Waste Of Time | |||
13:38
klaas-janstol joined
|
|||
brrt | because i'm truly so little further | 13:38 | |
jnthn | What do we know so far? Where do you have to disable JIT to make it work? | 13:39 | |
brrt | what i know is that disabling iter and friends make it run, if only because it disables many if not all regex methods | 13:40 | |
jnthn | ah, ok | ||
Well, every regex method uses jumptable too | 13:41 | ||
So it won't be a regex method per se | |||
It may well be something in the regex engine though | |||
brrt | timotimo also reported not being able to run because of missing or wrong dependency complaints, but i typically get that when moarvm and nqp no longer feel like they agree | ||
jnthn | like MATCH or CAPHASH | ||
brrt | well, i now have the exact frame that was compiled just the moment before the errors start | ||
so i'm hoping that should give some insight | 13:42 | ||
and yes, it uses iter | |||
but i've not seen anything weird yet | |||
timotimo | brrt: fwiw, disabling the jit for only the single call that fails with "blah dependency" doesn't fix it; disabling the jit completely does, however | 14:32 | |
brrt | you've made quite a few problems for me this week :-) | 14:33 | |
timotimo | the good kind of problem, i hope | 14:35 | |
brrt | pff | 14:36 | |
lets just say you've uncovered a boat load of jit bugs :-) | |||
timotimo | :) | ||
brrt | waitaminute | 14:38 | |
why... why does shift_o bottom out as an u8? | 14:40 | ||
timotimo | oh noes! did i do that wrong? ;( | 14:45 | |
i don't see anything wrong with my implementation off-hand | |||
dalek | arVM/jit-moar-ops: e4f28ab | (Timo Paulssen)++ | src/jit/graph.c: istype is actually totally correct; i didn't see the REG_ADDR also, it's marked as "invokish", so the jit knows how to deal with that part of the problem at least. |
14:47 | |
brrt | no, probably not | ||
hmm.. gdb is just funky i guess | 14:52 | ||
timotimo | that's not a good sign ... | 14:54 | |
brrt | that's optimization for you | 14:55 | |
dalek | arVM/jit-moar-ops: 542fe59 | (Timo Paulssen)++ | src/jit/graph.c: implement uc, lc, tc. |
15:05 | |
arVM/jit-moar-ops: 5239e3e | (Timo Paulssen)++ | src/ (3 files): implement splice. |
15:06 | ||
arVM/jit-moar-ops: bca161a | (Timo Paulssen)++ | src/jit/graph.c: implement split. |
15:24 | ||
timotimo | implementing split caused 2 more frames to be compiled in the core setting! :) | ||
1.5k frames still bailing out, though | |||
[Coke] | q; is having other folks committing to the jit stuff at all confusing for final grading? | 15:26 | |
(it's awesome for the community and I entirely support it; just don't want to screw up grading) | |||
timotimo | the contributions i make are all trivial | ||
brrt | \o/ | 15:39 | |
not so trivial, they have to be correct too | |||
git can get you a log of all my commits and changes | |||
15:43
ventica joined
|
|||
TimToady | brrt++'s work is already awesome; it would be difficult to ruin his grade :) | 15:49 | |
just having the framework in place is a great thing, even if there are still enough opcodes and/or bugs at the end to mask the eventual performance improvements | 15:51 | ||
JIT by definition tends to perform poorly when there are scattered weak links in the chain of opcodes, which can result when you have code in a language that is not well designed for JIT :) | 15:53 | ||
brrt | i would add that if you didn't know you were going to do a JIT right at the start (e.g., LuaJIT-2.0), it can be difficult in any language | ||
TimToady | I think the JIT performance vs effort curve will typically look like a hockey stick. | 15:54 | |
brrt | and i would further add that some things which are o so simple in plain-old-c can be devillish in ASM :-) | ||
brrt certainly hopes so | |||
until we hit another barrier | |||
and then it'll be flat again for a while, and given enough effort, might increase again | 15:55 | ||
TimToady | pity we don't have a language designer around to fix the language... :) | ||
brrt | so perhaps sigmoidal? | ||
TimToady | prolly | ||
carlin | sounds like a stair-case | 15:57 | |
TimToady | "It'f fubconfiouf!" --Sigmoid Frund | ||
brrt dinner & | 15:59 | ||
16:45
brrt joined
|
|||
brrt | we should be able to dump the moarvm call stack using moar-gdb.py | 16:46 | |
jnthn | Can you not just MVM_dump_backtrace(tc); in gdb? | ||
timotimo | you should be able to | 16:47 | |
16:50
klaas-janstol joined
17:02
FROGGS joined
17:33
vendethiel joined
|
|||
brrt | i'll try, but tc has been... optimized out :-( | 17:39 | |
why, that does explain a lot | 17:41 | ||
although it puzzles me why there's an invoke there | 17:51 | ||
jnthn | "there"? | ||
jnthn has nommed and will attempt to work in his unwanted free home sauna | |||
brrt | there is this, it seems | 17:53 | |
github.com/perl6/nqp/blob/master/s...r.nqp#L353 | 17:54 | ||
jnthn | I'm extremely surprised if you're in a JIT-compiled cclass_elem in so far as we don't compile jumptabl yet... | 17:55 | |
brrt | well, the code fragment within is compilable :-) | ||
jnthn | ah, yes | ||
brrt | anyway, bbi2h or so :-) | 17:56 | |
jnthn | ok | ||
brrt | i think my %seen creates a bindlex? | ||
jnthn | It's just a decl afaik | ||
brrt | there's a bindlex i can't really explain otherwise | 17:57 | |
but, i'll be really off now | |||
jnthn | OK. Lemme know when you're back; I can look through the disassembly. | 17:58 | |
m: say 1041064 * 64 | 18:05 | ||
camelia | rakudo-moar 11e193: OUTPUT«66628096␤» | ||
jnthn | m: say (1041064 * 64) / 4194304 | 18:06 | |
camelia | rakudo-moar 11e193: OUTPUT«15.885376␤» | ||
timotimo | what awesome patch do you have for us now? :) | 18:11 | |
also ... aaw, no brrt for 2 hours? :( | |||
18:14
Ven joined
|
|||
jnthn | Will let you know in 5 mins if I have one :) | 18:24 | |
TimToady is always disappointed when he reads that jnthn will not be available till evening, till he remembers that jnthn's evening comes, like, nine hours earlier :) | 18:35 | ||
jnthn | :) | ||
I realized that every alternation and protoregex evaluation ended up taking a closure for a needless reason. | |||
FROGGS | I believe there are a lot of spots in our codebase like that | 18:36 | |
TimToady | sugoi! | 18:37 | |
which, as in english awesome/awful or terrific/terrible, can mean both really good and really bad, though usually good | 18:38 | ||
cf awfully good and terribly good | |||
not that STD's regex engine isn't chockablock full of closures to emulate laziness... | 18:41 | ||
jnthn | m: say 5524580 - 4483251 | 18:48 | |
camelia | rakudo-moar 509b1a: OUTPUT«1041329␤» | ||
jnthn | Yeah, I'll happily make a million less closures, thanks. | ||
TimToady | .oO(we merely have to call the slow-path binder a million times instead...) |
18:49 | |
jnthn | There's no slow-path binder in NQP :) | 18:50 | |
18:59
cognome joined,
cognominal__ joined
19:02
Ven joined
|
|||
japhb_ | Waiting for the day when jnthn can say "There's no slow path in NQP" ... | 19:06 | |
19:35
brrt joined,
cognome joined
|
|||
brrt | jnthn, timotimo: i'm back | 19:35 | |
timotimo | yay :) | 19:37 | |
turns out i was able to occupy myself with friends & food until brrt came back :) | |||
brrt | i'll have fully annotated disassembly in about an hour or 2 | ||
(annotating disassembly is expensive) | 19:38 | ||
dalek | arVM: d51a9cf | jnthn++ | src/ (2 files): Cache fates array rather than re-allocating. |
||
arVM: a4ac569 | jnthn++ | / (3 files): Cache frame index, to avoid a linear scan. Turns out linear scans for frame indexes dominated assembly time. With this, stage mbc for Rakudo's CORE.setting is a third of what it was. |
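The idea behind the second commit, replacing a repeated linear scan with a cached index, can be sketched as follows. The frame names and both helper functions are hypothetical, and a dictionary is only one way to cache the index. The actual commit may store the index elsewhere, for example on the frame itself.

```python
# Hypothetical frame names standing in for the frames of a compilation unit.
frames = [f"frame_{i}" for i in range(1000)]

def index_by_scan(frames, wanted):
    """Linear scan: O(n) per lookup, O(n^2) if done once per frame."""
    for i, frame in enumerate(frames):
        if frame == wanted:
            return i
    raise KeyError(wanted)

# Build the cache once; each later lookup is a constant-time dict access.
index_cache = {frame: i for i, frame in enumerate(frames)}

def index_by_cache(wanted):
    return index_cache[wanted]

scan_result = index_by_scan(frames, "frame_742")
cache_result = index_by_cache("frame_742")
```

Both lookups return the same index. Only the cost per lookup differs, which matters when the lookup runs once for every frame being assembled.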
|||
jnthn | m: say 35 + 73 | 19:39 | |
camelia | rakudo-moar c9ad80: OUTPUT«108␤» | ||
jnthn | Well under 2 minutes for a full NQP and Rakudo re-build on my box these days. :) | 19:40 | |
vendethiel | you got a good box :P | 19:41 | |
what was it like, with last year's parrot ? :) | 19:42 | ||
jnthn | I forget. I suspect CORE.setting alone was in the 2 minute region though. | ||
timotimo | holy hell! | 19:43 | |
down to a third %) | |||
that is pretty fantastic | |||
jnthn | Stage mbc was the shortest stage. But still, it's a nice win. :) | 19:44 | |
timotimo | i always thought it ought to be faster than that | 19:45 | |
the closure thing ought to make all our parsing faster, no? | 19:47 | ||
btyler | 'Stage mbc : 0.539' :) | ||
hoelzro | great scott | 19:48 | |
timotimo | er ... huh | ||
oh! | |||
stage mbc | |||
i thought we were talking about stage mast | |||
well, that's still a nice little win :) | |||
jnthn | btyler: Heh, 0.287 here :) | 19:50 | |
timotimo | 0.275 | 19:51 | |
i win :) | |||
jnthn | :P | 19:55 | |
tadzik | "stage mbc for Rakudo's CORE.setting is a third of what it was" ( ͡° ͜ʖ ͡°) | 20:00 | |
timotimo | what does that face mean? :\ | 20:02 | |
jnthn | "I know more weird chars than you" | 20:03 | |
tadzik | yeah :) | 20:04 | |
timotimo | oke | 20:06 | |
brrt | ok, apparently now i'll be iterating a hash, let's see if that does anything strange | 20:30 | |
oh, my dumb ass just stepped over the rest of the frame, completely not finding why it doesn't work | 20:42 | ||
jnthn | I normally use a horse rather than an ass, tbh... :P | ||
Is the bug certainly in that frame? | 20:43 | ||
Also, do you know exactly which frame it is? | |||
brrt | what i know is that there is no bug before that frame is compiled :-) | ||
i used the break-on-print / break-on-compile technique | 20:44 | ||
basically, most of the time the compiler JITs quite a few frames of itself, right? so i need to skip these before getting to the frame that i'm really interested in | 20:45 | ||
i can try the reverse as well, checking if the bug persists if i put something uncompileable in that frame | |||
20:46
Ven joined
|
|||
brrt | anyway, iterating on a VMArray works perfectly | 20:46 | |
related question: what's still really uncompileable? | 20:48 | ||
(plenty of things, of course.. but :-) | 20:54 | ||
jnthn | Well, since we're actually hitting jumplist now sometimes, that means fixing that would get us compiling various regexes. | 20:55 | |
Well, and tokens/rules | |||
brrt shivers | |||
exciting, also very scary | 20:56 | ||
right now, i'm suspicious of sp_fastcreate | 20:57 | ||
jnthn | Oh? | ||
It is JITted to quite a bit of code. | |||
brrt | basically, it creates an object of only 32 bytes large | 20:58 | |
thats big enough for a header, but nothing more | |||
it's a hash iter all right | 21:00 | ||
and the object is a VMHash | 21:01 | ||
so far, so good | |||
also, the somewhat funky loop body is transformed into a routine call for reasons unbeknownst to me | 21:03 | ||
if they hadn't, i wouldn't have seen this bug, so... | |||
21:04
klaas-janstol joined
|
|||
brrt | further spesh opportunities would be - i'd say - creating sp_istrue_iter (and preferably, sp_istrue_iterhash / sp_istrue_iterarray), and transforming if_o for iters into those + if_i | 21:06 | |
that should help a bundle wrt to making a fast implementation | 21:07 | ||
and it should be easy since iter is almost always followed by unless_o or if_o | |||
jnthn | Yeah | ||
brrt | but that's just, like, my opinion :-) | 21:08 | |
jnthn | It makes sense. | ||
brrt | (in fact, spesh gives us an opportunity to disintermediate so many things) | ||
it's really exciting | |||
i should not forget that, probably, as i'm struggling to understand this | 21:09 | ||
jnthn | Yes, spesh is really about taking away various bits of late binding. | ||
brrt | and it should also remove the completely unnecessary invokish guard in this case, since istrue for an iter never invokes | 21:10 | |
jnthn | yeah | 21:12 | |
brrt | MVMIter shift always returns the hash for a return value? that seems.. odd | 21:18 | |
jnthn | Not the hash | 21:22 | |
The iterator. | |||
brrt | you're quite right | 21:23 | |
jnthn | Did investigating sp_fastcreate go any further? | 21:28 | |
brrt | no, not really | 21:29 | |
the object behaves exactly as intended | |||
it's a VMHash btw | 21:30 | ||
nwc10 | jnthn: ASAN barfage | ||
I need to go to bed | |||
==8038==ERROR: AddressSanitizer: heap-use-after-free on address 0x619000356b80 a | |||
t pc 0x7fa6838f49bc bp 0x7fff6e2d9900 sp 0x7fff6e2d98f8WRITE of size 8 at 0x619000356b80 thread T0 #0 0x7fa6838f49bb in nqp_nfa_run src/6model/reprs/NFA.c:406 | |||
]blah | 21:31 | ||
0x619000356b80 is located 0 bytes inside of 1088-byte region [0x619000356b80,0x619000356fc0)freed by thread T0 here: #0 0x7fa6841778e6 in __interceptor_realloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:93 #1 0x7fa6838f4970 in nqp_nfa_run src/6model/reprs/NFA.c:403 | |||
brrt | and it is precisely as large as it needs to be in fastcreate, namely 32 bytes | ||
nwc10 | without using a debugger, I guess that the problem is | ||
fates = (MVMint64 *)realloc(fates, sizeof(MVMint64) * fate_arr_len); | |||
and I would hazard a guess that fate_arr_len is 0 | |||
t/spec/S05-metasyntax/longest-alternative.rakudo.moar | 21:32 | ||
Nope | 21:33 | ||
(gdb) p total_fates | |||
$1 = 1 | |||
(gdb) p fate_arr_len | |||
$2 = 1 | |||
jnthn | Huh, what on earth frees it... | ||
nwc10 | #1 0x7ffff66d6970 in nqp_nfa_run src/6model/reprs/NFA.c:403 | 21:34 | |
the realloc() 3 lines before that kaboom | |||
jnthn | ah | ||
Darn, yes | 21:35 | ||
nwc10 | oh, it never assigns back to tc->fates | 21:36 | |
I mean tc->nfa_fates | 21:37 | ||
jnthn | yeah, testing a fix here now | ||
And wondering if it's actually to blame for the SEGV I thought a different refactor had caused. | |||
Yes, it was | 21:38 | ||
nwc10++ | |||
nwc10 | win! | ||
can I go to bed please? :-) | |||
jnthn | Thanks, that's saved me quite a headache. | ||
Yes. Sleep well. :) | |||
nwc10 | glad to be of assistance | 21:39 | |
brrt | sleep well nwc10 :-) | 21:40 | |
you're always of assistance :-) | |||
(really, hash lookups flatten strings? i can only say... wow) | 21:41 | ||
jnthn | Yeah, for the moment. | ||
brrt | :-) | ||
jnthn | We use uthash but need to moarph it some more to handle our strandy strings. | 21:42 | |
brrt | all strings will get their own brand new normalization right? | ||
hmm i see | |||
then using strandy strings might sometimes be much more expensive than imagined | |||
jnthn | Yeah, for the moment. | ||
The string opts so far aren't the end of the work :) | 21:43 | ||
Just enough of an improvement for some previously painful benchmarks. | |||
brrt | is suposse VMIter has a value method | 21:44 | |
s/is/i/ | 21:45 | ||
dalek | arVM: 2f16928 | jnthn++ | src/6model/reprs/NFA.c: Fix the fates allocation optimization. It failed to update a code path that reallocated, leading to a use after free bug. Found by nwc10++ using ASAN. |
||
jnthn | brrt: Well, VMIter is just a representation | ||
brrt: But yeah, it can be on a type with a .value method. | |||
brrt: NQP defines an NQPHashIter type with one, iirc | 21:46 | ||
brrt | hmm ok | ||
i suppose my understanding of 6model is still rather limited | |||
i.e. i treat a REPR as a class | |||
but that's pretty much wrong :-) | |||
jnthn | Right | ||
REPR is about memory layout | 21:47 | ||
Type = meta-object + REPR | |||
An STable is per type (the "S" meaning "Shared", as in "stuff things of the same type share in common" | |||
) | |||
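The relationship "Type = meta-object + REPR", with one shared STable per type, can be pictured with a toy model. All class and field names below are invented for illustration and do not mirror MoarVM's C structures. The `VMIter` REPR with a `value` method follows the NQPHashIter example mentioned above.

```python
from dataclasses import dataclass, field

@dataclass
class STable:
    """Per-type shared data: one meta-object plus one REPR."""
    how: object      # meta-object, itself just a normal object
    repr_name: str   # REPR: decides memory layout only

@dataclass
class Obj:
    st: STable                                # every instance points at its type's STable
    body: dict = field(default_factory=dict)  # stand-in for the REPR-managed storage

class ToyMetaObject:
    """Stand-in meta-object that knows the type's methods."""
    def __init__(self, methods):
        self.methods = methods

    def find_method(self, name):
        return self.methods[name]

# A type whose layout comes from the VMIter REPR and whose behaviour
# (a .value method) comes from its meta-object.
iter_how = ToyMetaObject({"value": lambda self: self.body["value"]})
hash_iter_type = STable(how=iter_how, repr_name="VMIter")

a = Obj(hash_iter_type, {"value": 1})
b = Obj(hash_iter_type, {"value": 2})
b_value = b.st.how.find_method("value")(b)
```

Both instances share the same `STable`, so the method lookup goes through the type's meta-object while the REPR name only describes how the body is stored. This is why treating a REPR as a class is misleading.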
brrt | ok, i think i get that | 21:49 | |
premature optimizer me would say 'how nice would it be if these (meta-obj + repr) could be aligned in memory | 21:50 | ||
jnthn | Well, a meta-obj is just a normal object | ||
brrt | i wonder why i'm starting to like nqp | 21:55 | |
f..... | 21:56 | ||
oh bloody hell | |||
this can't be serious | |||
call MVM_dump_backtrace, ASAN kicks in, crashburn | 21:57 | ||
jnthn | :( | ||
Did it point out some interesting kind of corruption? | |||
brrt | no, not really, probably just because clang optimized something away | ||
well, i'm pretty sure now the problem is /somewhere/ in that frame | 22:03 | ||
except that it seems to work precisely as advertised | |||
as in, /precisely/ as advertised | |||
why am i so sure? well, i inserted something that i knew wouldn't compile - namely, eq_n, and lo and behold, test no longer crashes | 22:04 | ||
the secret is in there | 22:05 | ||
also, despite not being american and never going to be a customer of such a data provider, this scares me terribly: blogs.wsj.com/digits/2014/07/30/spr...less-plan/ | 22:07 | |
imagine an internet with only 4 sites | 22:09 | ||
jnthn | ugh | ||
brrt | if i had that when i was young, i never ever would have gotten even as far as i have | ||
no irc | |||
no mailing lists | |||
no msdn, no wikipedia, no random guys blogs | 22:10 | ||
(or gals) | |||
no slackware, fedora, debian | |||
jnthn | Yeah. Wow. | 22:12 | |
Quite a step backwards. | |||
brrt | btw, i find it ironic that it's easier to debug a JIT frame for me than to debug interpreted frame | 22:13 | |
jnthn | o.O :) | 22:14 | |
brr | 22:17 | ||
brrt: oh no, I just spotted something and you're not going to like it :( | |||
for %seen { | 22:18 | ||
next if $_.value < 2; | |||
self.worry("Repeated character (" ~ $_.key ~ ") unexpectedly found in character class"); | |||
} | |||
Could it be that "next" there? | |||
That's implemented as an exception. | 22:19 | ||
I shoulda seen it before, but I was looking at the previous loop :( | |||
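A loose C analogy for why "next" has to be an exception when the loop body runs as a separate call: the body cannot simply jump to the loop header, so it unwinds to a handler the loop installed. This sketch uses `setjmp`/`longjmp` purely as a stand-in; it is not how MoarVM implements control exceptions, and `next_handler`, `body` and `count_repeats` are hypothetical names.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical: the handler the loop installs to catch "next". */
static jmp_buf next_handler;

/* The loop body is a separate call, so "next" cannot be a plain jump.
 * It "throws" by unwinding to the handler installed by the loop. */
static void body(int value, volatile int *worries) {
    if (value < 2)
        longjmp(next_handler, 1); /* next if $_.value < 2 */
    (*worries)++;                 /* stands in for self.worry(...) */
}

/* Counts entries seen at least twice, mirroring the %seen loop. */
int count_repeats(const int *counts, size_t n) {
    assert(counts != NULL || n == 0);
    volatile int worries = 0;
    for (volatile size_t i = 0; i < n; i++) {
        if (setjmp(next_handler) == 0)
            body(counts[i], &worries);
        /* after a longjmp we land here and move to the next iteration */
    }
    return worries;
}
```

If the handler lookup goes wrong, the unwind lands somewhere other than this loop, which is the kind of misbehavior being chased here.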
brrt | o really? | 22:21 | |
well, that's certainly possible, i guess | |||
jnthn | Yeah. Stick a breakpoint in throwcatdyn or so | 22:22 | |
brrt | ok | ||
if that's it, you are my hero | |||
gist.github.com/bdw/899799631900fab733fa this is by the way, everything i've collected so far | 22:26 | ||
that... seems to be it, yet, throwcatdyn is actually called | 22:30 | ||
jnthn | Yes, trace into it and see what it does... | 22:31 | |
It'll unwind a frame and then...do something...with the PC | |||
brrt | well, the caller frame indeed has a handler | 22:38 | |
oh, lord | |||
timotimo | ooooh, are we close to a fix for the hang issues? :) | ||
brrt | well.... we're close to the /reason/ for the hang issues, that much seems certain | 22:39 | |
brrt should have debugged unoptimized code long ago | |||
i suppose the 'next' exception should've been caught by the frame above it? | 22:45 | ||
jnthn | Yes | 22:46 | |
exception.c attempts to put the PC in the right place... | |||
...then I don't quite know what will happen. | |||
brrt | well, it seems to jump way above that | ||
i mean way way above that | |||
why would that be? | |||
jnthn | Well, it will unwind the current frame | 22:47 | |
frame.c has that logic | |||
Maybe it's then finding some outdated JIT re-entry address? | |||
And re-entering the JITted code at the wrong place | 22:48 | ||
Rather than at the place the exception handler points to | |||
TimToady | long run this one oughta just turn into a goto and never throw | ||
jnthn | TimToady: Yeah, that relies on the loop body being inlined...which I don't know why it isn't, tbh | 22:49 | |
TimToady | basically needs to go to whatever NEXTish logic there is | ||
p5 jumps to the continue block, but optimizes that to jump directly to the while condition, loosely speaking | 22:51 | ||
(in the absence of a continue) | |||
maybe lexical reentry conditions are a problem though | 22:52 | ||
jnthn | Yeah | ||
It's already on the Moar todo list this month to optimize various throws into gotos. | 22:53 | ||
TimToady | for %seen has an implicit $_ | ||
jnthn | Right | ||
I'm a bit surprised it hasn't managed to optimize that away, though. | |||
TimToady | but if you know you're gonna clobber it anyway | ||
no need to reclone | |||
or whatever it does on entry to the block | 22:54 | ||
could become a state var, really | |||
jnthn | Well, creates a frame in theory, but just re-uses the frame from last time in reality. | ||
TimToady waves hands encourageingly :) | |||
*ging | 22:55 | ||
jnthn | I thought I'd taught NQP how to turn such for loops into not needing to invoke, though. | ||
I'll have to look into why the opt didn't work out this time. | |||
TimToady | maybe it's does, and that's why there's no catcher for 'next'? | ||
*it | 22:56 | ||
just a wag | |||
about all I do anymore :) | |||
23:03
ventica joined
|
|||
brrt | as it seems to be, it never re-enters the JIT at all | 23:04 | |
which is what i experience here, too | 23:05 | |
oh, i see what's the problem | |||
timotimo | yay! | ||
brrt | basically, search_frame_handlers searches handlers on basis of the current bytecode offset | 23:06 | |
i.e. i suppose the handler is a mapping between a throwpoint and a catchpoint | |||
jnthn | Ahhhh | ||
brrt | clearly, when in the JIT, the current bytecode is the magic bytecode | 23:07 | |
jnthn | Right. | ||
brrt | so that doesn't appear in any handler | ||
so it walks the stack all the way until it finds something that /does/ have a handler, and continues from there | |||
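A minimal sketch of the lookup brrt describes, assuming a handler is just a half-open range of bytecode offsets plus a resume offset. `Handler`, `find_handler` and `JIT_MAGIC_OFFSET` are hypothetical names for illustration, not MoarVM's actual API; the sentinel merely stands in for whatever offset the JIT's magic bytecode produces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the offset seen while running JIT code. */
#define JIT_MAGIC_OFFSET UINT32_MAX

/* Hypothetical handler: maps a range of throw points to a catch point. */
typedef struct Handler {
    uint32_t start;       /* first bytecode offset covered */
    uint32_t end;         /* one past the last offset covered */
    uint32_t goto_offset; /* where execution resumes when caught */
} Handler;

/* Returns the first handler covering pc, or NULL if none does. */
const Handler *find_handler(const Handler *hs, size_t n, uint32_t pc) {
    assert(hs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pc >= hs[i].start && pc < hs[i].end)
            return &hs[i];
    }
    return NULL;
}
```

With a real bytecode offset inside the range the handler is found; with the magic offset the search in this frame comes back empty, so the caller of such a lookup would keep walking outward to some other frame's handler.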
the quick fix is to disable JIT compilation on graphs that have handlers, i think? | 23:08 | ||
and then tomorrow work on adding handlers to the JIT | |||
somehow, at least | |||
brrt ponders a bit on how that should look, but it will probably involve dynamic labels much like OSR | 23:10 | ||
jnthn | brrt: Yes, that could work though might rule out a lot of things | ||
brrt | well, it /should/ rule out a lot of things | 23:13 | |
it's broken :-) | |||
i'll fix it, but for now, it won't work | 23:14 | ||
jnthn | *nod* | ||
Yes, agree with the way forward. | |||
timotimo | will you put in that hotfix before bed or tomorrow? | 23:15 | |
brrt | right now is when i'm testing it | ||
again, lizmat++ and woolfy++ got me a computer that can speedwise compete with jnthn's :-) so i don't have to wait for compilations so long anymoe | |||
more | |||
no more hanging during building | 23:16 | ||
nqp | |||
jnthn | \o/ | 23:17 | |
brrt | btw, jnthn, it might interest you that the block within for %seen is invoked by invoke_o after JIT (and spesh) | 23:18 | |
i.e., not even fastinvoke | |||
jnthn | Yeah, I know why :) | 23:20 | |
dalek | arVM/moar-jit: 7d17cb0 | (Bart Wiegmans)++ | src/jit/graph.c: Don't compile frames that have handlers. Handlers and the JIT don't play so nicely together, yet. This is because searching a frame handler looks for the current bytecode offset, which is all off in the JIT because it uses a magic bytecode. So I'll disable the handlers for now. This fixes a long-standing issue whereby adding the iter ops would cause infinite loops and other weird behavior. |
||
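The hotfix in that commit amounts to a guard of roughly this shape. This is a sketch under assumptions, not the actual change to src/jit/graph.c: `FrameInfo`, `num_handlers` and `can_jit_frame` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of a frame as seen by the JIT front end. */
typedef struct FrameInfo {
    size_t num_handlers; /* exception handlers declared in the frame */
} FrameInfo;

/* Refuse to JIT any frame that declares handlers, so such frames keep
 * running in the interpreter where bytecode offsets are meaningful. */
int can_jit_frame(const FrameInfo *frame) {
    assert(frame != NULL);
    return frame->num_handlers == 0;
}
```

The cost, as jnthn notes, is that this rules out many frames until the JIT learns to handle handlers itself.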
brrt | jnthn++ for actually finding the cause | 23:23 | |
timotimo | that was tough. | 23:25 | |
jnthn | aye | ||
brrt++ # persistence | |||
brrt | :-) | ||
we're not out of the woods yet | |||
jit-moar-ops, even with the current fix, can't compile nqp | |||
timotimo | darn | 23:26 | |
brrt | current error: "No applicable candidates found to dispatch to for 'compile_node'." | ||
timotimo | uh ... huh? | 23:27 | |
brrt hopes something went wrong in jit-moar-ops | |||
but i doubt it | |||
what's wrong with that, you think? | |||
23:27
itz_ joined
|
|||
brrt | oh, that's not a JIT bug | 23:31 | |
still happens with MVM_JIT_DISABLE=1 :-) | |||
and not a spesh bug | |||
maybe a corrupted-install bug | |||
let's check that first | 23:32 | ||
that's not it | 23:33 | |
jnthn | brrt: Do you have a latest NQP? | 23:34 | |
brrt | i think i do | ||
hmm | 23:37 | ||
seems like a jit bug after all | |||
or, you know, i just don't know | |||
maybe this is the bug timotimo was talking about | 23:38 | ||
jnthn | Sleep on it? :) | ||
Or are you too zoned in? :) | |||
dalek | arVM/jit-moar-ops: 7d17cb0 | (Bart Wiegmans)++ | src/jit/graph.c: Don't compile frames that have handlers. Handlers and the JIT don't play so nicely together, yet. This is because searching a frame handler looks for the current bytecode offset, which is all off in the JIT because it uses a magic bytecode. So I'll disable the handlers for now. This fixes a long-standing issue whereby adding the iter ops would cause infinite loops and other weird behavior. |
23:39 | |
MoarVM/jit-moar-ops: d14351c | (Bart Wiegmans)++ | src/jit/graph.c: | |||
MoarVM/jit-moar-ops: Merge branch 'moar-jit' into jit-moar-ops | |||
timotimo | may need a clean and then a JIT_DISABLE run from the very beginning | ||
that's how one of my problems behaved | |||
brrt | yeah, i suspect that will fix it | ||
but that's nasty | |||
it means something in the compilation went wrong | |||
timotimo | yes | 23:40 | |
in a way that doesn't make it crash | |||
brrt | i mean, which circle of debugging hell is that? | ||
jnthn | tbh, I'd pull the patches from timotimo++'s branch in one or two at a time until you hit the one that breaks | 23:41 | |
brrt | yep, that seems like a plan | ||
again :-) | |||
:-) | |||
brrt is going to sleep on it | |||
jnthn | 'night o/ | ||
brrt | timotimo: if you're still awake, btw, when did this first appear? | 23:42 | |
'night :-) | 23:43 | ||
timotimo | oof | 23:44 | |
last night? dunno :( | |||
brrt | well, doesn't matter, i'll find it eventually | ||
23:44
brrt left
|
|||
timotimo | good luck! | 23:46 |