timotimo for some reason, my findmeth_s → findmeth optimization causes two getspeshslots in a row to be generated that write to the same register, but read from different spesh slots ... 00:25
dalek MoarVM: dfa0007 | (Timo Paulssen)++ | src/spesh/optimize.c:
code-gen builds a bunch of const_s + findmeth_s

we can turn this into just findmeth, which can further become a sp_getspeshslot and turn invoke_o into fastinvoke further down the line
00:35
MoarVM: 583710e | (Timo Paulssen)++ | tools/graph_spesh.p6:
pull a tiny bit more info out of spesh logs:

callsite address and argument counts as well as named argument names, but also file name, line number and cuid
00:36
timotimo ^- this findmeth_s optimization could make a noticeable difference because in the past it has been preventing invoke_o → sp_fastinvoke transformations as well 00:37
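The const_s + findmeth_s → findmeth rewrite described in the commit message can be pictured with a small toy model. This is only a sketch under stated assumptions: `Ins` and `fold_findmeth_s` are hypothetical names, registers are plain integers, the code is straight-line, and none of this is MoarVM's actual C code in src/spesh/optimize.c.

```python
from collections import namedtuple

# Toy instruction (hypothetical, not MoarVM's instruction struct).
# Registers are ints; string constants are str.
Ins = namedtuple("Ins", "op dst args")

def fold_findmeth_s(code):
    """Turn findmeth_s into findmeth when the name register holds a known const_s."""
    known = {}  # register -> constant string it currently holds
    out = []
    for ins in code:
        if ins.op == "findmeth_s" and ins.args[1] in known:
            # args = (object register, name register)
            ins = Ins("findmeth", ins.dst, (ins.args[0], known[ins.args[1]]))
        # Any write to a register invalidates what was known about it.
        known.pop(ins.dst, None)
        if ins.op == "const_s":
            known[ins.dst] = ins.args[0]
        out.append(ins)
    # Drop const_s instructions whose register nothing reads any more.
    read = {a for ins in out for a in ins.args if isinstance(a, int)}
    return [i for i in out if not (i.op == "const_s" and i.dst not in read)]

code = [
    Ins("const_s", 2, ("push",)),
    Ins("findmeth_s", 3, (1, 2)),
    Ins("invoke_o", 4, (3,)),
]
for ins in fold_findmeth_s(code):
    print(ins)
```

Once the method name is a literal operand rather than a register, a later pass has a chance to resolve the method statically, which is the precondition for the invoke_o rewrite mentioned above.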
dalek MoarVM/spesh_box_tracking: 17bc9e1 | (Timo Paulssen)++ | / (4 files):
WIP on tracking the data inside box containers
13:45
MoarVM/spesh_box_tracking: 5de844b | (Timo Paulssen)++ | src/spesh/optimize.c:
if_o/unless_o on a boxed int/num/str could be much cheaper.
MoarVM/spesh_box_tracking: c3ccfae | (Timo Paulssen)++ | src/spesh/optimize.c:
much safer, but segfaults a few spec tests still
timotimo ^- if anybody wants to have a look 13:46
i'm not finding suspicious things 14:03
perl6 t/spec/S03-operators/bag.rakudo.moar; perl6 t/spec/S03-sequence/nonnumeric.rakudo.moar; perl6 t/spec/S05-mass/charsets.t; perl6 t/spec/S05-transliteration/trans.rakudo.moar
these fail on my machine with that branch of moarvm
dalek MoarVM/spesh_box_tracking: f5110fb | (Timo Paulssen)++ | src/spesh/optimize.c:
the right line in the wrong place can make all the difference in the world
16:28
timotimo ^- feel free to try this out :)
japhb timotimo: Did this fix the spec failures you had earlier? 16:30
timotimo yes 16:33
but i'm looking at a spesh output right now and it's still looking pretty stupid :(
lots of occurrences of "box a constant in a p6bool, unbox an int from the p6bool and branch conditionally" 16:34
my code should *at least* change the code to skip the unbox and use the int register that's set from the constant directly
and then i could just recurse and try for a constant branch elimination
oh, huh 16:42
the flags don't have the "known box source" bit set
jnthn: what could cause an operation - in this case p6bool - to not have its facts discovered? 16:53
aaaah 16:54
i'm not setting the flag in the discover sub inside rakudo
silly me. to the max.
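The failure mode here (an optimization that silently does nothing because fact discovery never set the flag it looks for) can be shown with a toy model. This is a sketch only: `FactFlag`, `discover_facts` and `rewrite_branches` are hypothetical names, registers are plain integers, and it does not reproduce the real discovery code in Rakudo or MoarVM.

```python
from collections import namedtuple
from enum import Flag, auto

Ins = namedtuple("Ins", "op dst args")  # toy instruction, hypothetical

class FactFlag(Flag):
    NONE = 0
    KNOWN_VALUE = auto()
    KNOWN_BOX_SRC = auto()  # hypothetical name for the "known box source" bit

def discover_facts(code):
    """Walk the code once and record (flags, payload) per written register."""
    facts = {}
    for ins in code:
        if ins.op == "const_i64":
            facts[ins.dst] = (FactFlag.KNOWN_VALUE, ins.args[0])
        elif ins.op == "p6bool":
            # Leaving this branch out keeps the flag unset, and the
            # rewrite below then never fires.
            facts[ins.dst] = (FactFlag.KNOWN_BOX_SRC, ins.args[0])
    return facts

def rewrite_branches(code, facts):
    """Rewrite if_o on a register with a known box source into if_i on that source."""
    out = []
    for ins in code:
        if ins.op == "if_o":
            flags, source = facts.get(ins.args[0], (FactFlag.NONE, None))
            if FactFlag.KNOWN_BOX_SRC in flags:
                ins = Ins("if_i", None, (source, ins.args[1]))
        out.append(ins)
    return out

code = [
    Ins("const_i64", 0, (1,)),
    Ins("p6bool", 1, (0,)),
    Ins("if_o", None, (1, "L1")),
]
print(rewrite_branches(code, discover_facts(code))[-1])
```

Testing the int register directly keeps the branch's truthiness, since p6bool maps any non-zero int to True and zero to False.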
timotimo brrt, would you like to give me hints on how to implement nan, inf and isnanorinf? 18:27
they only cause a single bail each in my test case here, but i'd like to have it jitted still
it seems like loading a nan into a floating point register is commonly accomplished by setting all bits to 1? 18:29
hum. it seems like i'd want to be careful WRT signaling vs quiet NaNs? 18:31
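For the bit-pattern question, a quick check with IEEE 754 binary64 values: the all-ones pattern is indeed a NaN (a quiet one, with the sign bit set), and both NaN and infinity share an all-ones exponent field. The helper names are made up for illustration, and the quiet-bit convention shown is the one used on x86 and ARM; some older architectures invert it.

```python
import math
import struct

def bits_to_double(bits):
    """Reinterpret a 64-bit unsigned integer as an IEEE 754 binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def double_to_bits(value):
    """Reinterpret a binary64 value as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]

EXP_MASK = 0x7FF0000000000000   # 11 exponent bits
FRAC_MASK = 0x000FFFFFFFFFFFFF  # 52 fraction bits
QUIET_BIT = 0x0008000000000000  # top fraction bit; set means quiet (x86/ARM)

def is_nan_or_inf(value):
    # NaN and infinity both have every exponent bit set.
    return (double_to_bits(value) & EXP_MASK) == EXP_MASK

def is_quiet_nan(value):
    bits = double_to_bits(value)
    return ((bits & EXP_MASK) == EXP_MASK      # exponent all ones
            and (bits & FRAC_MASK) != 0        # non-zero fraction: NaN, not inf
            and (bits & QUIET_BIT) != 0)       # quiet bit set

all_ones = bits_to_double(0xFFFFFFFFFFFFFFFF)
print(math.isnan(all_ones), is_nan_or_inf(math.inf), is_nan_or_inf(1.0))
```

A signaling NaN would have the exponent all ones, a non-zero fraction and the quiet bit clear, so a JIT that materializes NaN from an all-ones pattern gets a quiet NaN on these platforms.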
timotimo on top of that ... can has invokewithcapture? :3 19:02
jnthn timotimo: Note that p6bool gives back singletons 19:28
So it's not actually an allocation you're avoiding
timotimo ah, damn :)
but we have no reason to unbox_i the p6bool, if we just "boxed" it :)
jnthn Correct, or if_o when we could if_i 19:30
Esp under JIT
timotimo correct
that's what i'm thinking will give us the very most
i'm now spec testing a change where i recurse into optimize_iffy if i just turned an if_o into an unbox-skipping thingie so that perhaps it can benefit from known values 19:31
because i've seen enough instances of const + box + unbox + if_o
er, i mean:
const + box + if_o
my previous optimizations turned that if_o into an unbox + if_i
doesn't seem to break anything at all this time %) 19:33
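The recursion idea (turn if_o into an unbox-skipping if_i, then immediately retry so known values can fold the branch) might look roughly like this in a toy model. The function borrows the name optimize_iffy from the chat but is not MoarVM's C function; the `facts` dictionary and its `box_source` and `value` keys are assumptions made for the example.

```python
from collections import namedtuple

Ins = namedtuple("Ins", "op args")  # toy branch: args = (register, target label)

def optimize_iffy(ins, facts):
    """facts maps a register to a dict that may hold 'box_source' or 'value'."""
    reg, target = ins.args
    info = facts.get(reg, {})
    if ins.op in ("if_o", "unless_o") and "box_source" in info:
        # Skip the unbox: test the int register the box was made from,
        # then recurse so the new if_i/unless_i can use known values.
        rewritten = Ins(ins.op[:-1] + "i", (info["box_source"], target))
        return optimize_iffy(rewritten, facts)
    if ins.op in ("if_i", "unless_i") and "value" in info:
        truthy = info["value"] != 0
        taken = truthy if ins.op == "if_i" else not truthy
        # A branch with a known outcome is either always or never taken.
        return Ins("goto", (target,)) if taken else None
    return ins

# const + box + if_o: register 0 holds constant 1, register 1 is its box.
facts = {0: {"value": 1}, 1: {"box_source": 0}}
print(optimize_iffy(Ins("if_o", (1, "L1")), facts))
```

Returning None stands for deleting a never-taken branch; a real pass would also have to drop the now-unreachable successor edge.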
hm, actually ... i could put a check at the very end of the spesh dispatch, check if the next is not a set, but the previous link goes through multiple sets 19:34
and that could trigger set squishing
that seems robust enough
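One way to picture set squishing is as copy-chain collapsing. The sketch below is a toy, not the conservative version discussed here: it assumes straight-line code in SSA style (each register written once), registers as integers, labels as strings, and no register being live after the end of the list. `Ins` and `squish_sets` are hypothetical names.

```python
from collections import namedtuple

Ins = namedtuple("Ins", "op dst args")  # toy instruction, hypothetical

def squish_sets(code):
    """Collapse chains of 'set' copies so readers use the original register."""
    origin = {}  # register -> register at the start of its copy chain
    rewritten = []
    for ins in code:
        # Redirect every operand through any known copy chain first.
        args = tuple(origin.get(a, a) for a in ins.args)
        if ins.op == "set":
            origin[ins.dst] = args[0]
        rewritten.append(Ins(ins.op, ins.dst, args))
    # With all readers redirected, a 'set' whose target is never read is dead.
    read = {a for ins in rewritten for a in ins.args}
    return [i for i in rewritten if i.op != "set" or i.dst in read]

code = [
    Ins("set", 1, (0,)),
    Ins("set", 2, (1,)),
    Ins("set", 3, (2,)),
    Ins("if_i", None, (3, "L1")),
]
print(squish_sets(code))
```

The hard part in a real optimizer is exactly what the toy assumes away: control flow, and registers that stay live past the chain.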
nwc10 ASAN happy. I disabled the fixed size allocator and ASAN still happy 19:51
timotimo \o/ 19:52
that's definitely good news
nwc10 not *totally*.
there was barfage a few days ago
but then it has gone
and I'm suspicious that it depends a lot on some sort of buffer size, or spesh threshold, or something
so I think that there's at least one bug still 19:53
but it's a rare heisenbug
timotimo blerh :\
nwc10 that pretty much sums it up 19:54
jnthn Yeah, I fear we have such a thing too :( 19:58
bbiab 20:01
japhb jnthn: What else can I do to help you figure out the threading stability issues? perl6-bench stress testing support is in pretty good shape (delta html_plot output of the diagnoses, which I'm frankly procrastinating on a bit, since text and html output already work). There's only one test tagged 'stress', but that's because I don't have any other *small* code snippets that tickle failure easily. 20:04
If someone points me to snippets that are known to cause r-m or r-j to fail or crash as problem size or concurrency level increases, I'm happy to do the monkey work of turning those into perl6-bench tests. 20:06
timotimo i only managed to build a super conservative version of set squishing :\ 21:25
my head feels kind of blocked for some reason
it should just be a simple-ish unification, right?
jnthn japhb: Well, my main blocker at the moment is tuits 21:27
japhb jnthn: :-( 21:43
Sadly, that can be the hardest block to overcome. 21:44
.oO( Speed of light? Pffft, no problem. Full calendar? Insurmountable. )
japhb is frustrated that his tuits come in smaller blocks than the size needed to go deep on this problem. 21:45
jnthn Yeah, I'm needing some undisturbed time also