00:01 lizmat joined 01:19 woosley joined 01:29 jnap joined 01:30 FROGGS_ joined 01:47 ilbot3 joined 02:20 btyler joined 02:30 jnap joined 02:56 benabik joined 03:31 jnap joined 04:32 jnap joined 05:33 jnap joined 06:23 woolfy1 joined 06:33 jnap joined 07:06 zakharyas joined 07:20 brrt joined 07:29 FROGGS_ joined 07:34 jnap joined 08:35 jnap joined
jnthn makes coffee and digs into some coding 09:07
brrt morning 09:11
JimmyZ morning 09:12
09:13 lizmat joined
jnthn o/ 09:16
moritz \o 09:17
yawnt o/ 09:18
FROGGS_ \o 09:23
nwc10 o/
TimToady \o 09:35
09:36 jnap joined 09:37 lizmat joined 10:02 woolfy joined
masak o/ 10:17
10:18 woosley left 10:20 woolfy left 10:36 jnap joined 11:37 jnap joined 11:50 FROGGS_ joined 12:17 colomon joined 12:38 jnap joined
dalek Heuristic branch merge: pushed 38 commits to MoarVM/spesh by jnthn 13:02
13:07 brrt joined
jnthn So, what I just pushed to the spesh branch is the work I've been doing to build a framework for optimization. It knows how to turn MoarVM bytecode into a CFG in SSA form, turn that back into bytecode, and can do a few very simple optimizations so far. 13:11
It currently has a crack at anything that is called at least 10 times.
nwc10 \o/
I think
jnthn It likely won't make much difference *yet* because it's doing a lot of analysis and then barely using it so far. 13:12
It is smart enough to remove hash lookups for most method dispatches in any hot-path code. 13:13
And can do some check tossing on parameter handling. 13:14
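The first step jnthn mentions, turning linear bytecode into a CFG, begins by splitting the instruction stream into basic blocks. The following is a generic sketch of that textbook step, not MoarVM's code: the `Ins` struct, the `OpKind` values and `mark_leaders` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified instruction; not MoarVM's real representation. */
typedef enum { OP_PLAIN, OP_GOTO, OP_BRANCH_IF } OpKind;

typedef struct {
    OpKind kind;
    size_t target;   /* index of the jump target; unused for OP_PLAIN */
} Ins;

/* Sets leader[i] = 1 for every instruction that starts a basic block and
 * returns the number of blocks. Leaders are: the first instruction, every
 * jump target, and every instruction that directly follows a jump. */
size_t mark_leaders(const Ins *code, size_t n, unsigned char *leader) {
    size_t i, blocks = 0;
    for (i = 0; i < n; i++)
        leader[i] = 0;
    if (n > 0)
        leader[0] = 1;
    for (i = 0; i < n; i++) {
        if (code[i].kind == OP_PLAIN)
            continue;
        assert(code[i].target < n);   /* jumps must stay inside the code */
        leader[code[i].target] = 1;
        if (i + 1 < n)
            leader[i + 1] = 1;
    }
    for (i = 0; i < n; i++)
        blocks += leader[i];
    return blocks;
}
```

Each run of instructions from one leader up to the next becomes a basic block; the jumps then supply the edges between blocks.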
nwc10 OK, this optimises bytecode. So it's completely independent of any optimisation the compiler does?
(stupid question, but follows up with)
jnthn Right, different levels.
nwc10 what sorts of optimisations exist that can be done at runtime, but not at compile time? 13:15
jnthn Anything based on knowing what types things have at runtime.
Which can't be inferred statically.
Also things you can't convey over the bytecode boundary to the satisfaction of "proof", or things that aren't safe to assume until post-"link". 13:16
nwc10 ah OK. The former is stuff that it has to be ready to undo at runtime if types change. The latter is not things that I would have thought of. 13:17
jnthn Latter things include escape analysis (which goes on beneath the API of the VM). 13:18
And inlinings we can't figure out at compile time, due to insufficient info.
Also, the graph it builds up to do this should be useful for JIT work. 13:20
13:24 jnap joined
JimmyZ does it mean parts can also be done at compile time if we have types for things? 13:25
jnthn JimmyZ: Yes, that's what the inlining stuff in Rakudo's Optimizer.pm is doing. 13:26
uh. Optimizer.nqp
JimmyZ Nice! 13:27
nwc10 my sort-of-implied question is, can't that also be done on the AST? Or at least, earlier than the bytecode?
jnthn nwc10: The bytecode is what you have at runtime.
nwc10: And QAST trees are OK for some kinds of optimization, but hardly suited to all analyses. 13:28
nwc10 yes, I more meant, "inlining stuff in optimiser" - it did seem that some of the things could also be done at compile time
I was assuming that only the bytecode was present at runtime. Else you'd have to trust that a supplied AST was truthfully the same as the bytecode. 13:29
jnthn my int $a = 1; my int $b = 2; say $a + $b; # the infix:<+> here is already inlined at compile time.
nwc10 which seems, um, less than secure
jnthn Yes, the VM shouldn't trust the bytecode too much either :)
And that compile-time inline I showed above happens on the AST. 13:30
It's worth noting also that C# and Java compilers do comparatively little optimization work, and leave the JVM/CLR to do the heavy lifting. 13:31
I believe they do *no* inlining, for example. 13:32
nwc10 the assumption is that there's a JIT, and the JIT gets it all? 13:33
jnthn Right
13:46 btyler joined 13:55 cognominal joined
dalek MoarVM/spesh: e7ed01e | jnthn++ | src/spesh/args.c:
Fix off-by-one in args optimization.
MoarVM/spesh: 5d311de | jnthn++ | src/ (4 files):
If MVM_SPESH_LOG is in env, log spesh work to it.
brrt wait wait 14:36
i missed the conversation
jnthn There's lacklog :) 14:39
uh, bcklog
...yeah, I can type.
brrt seeing backlog 14:43
JimmyZ jnthn: I have a stupid question: Why does only effective_spesh_slots in tc->cur_frame need a lock... 14:46
jnthn JimmyZ: Because it's not thread local 14:47
JimmyZ: It's a global cache
The lock prevents us racing to write into it.
timotimo ooooh spesh 14:48
jnthn Just about everything on cur_frame is thread local or immutable.
timotimo can you recommend something specific for me to look at?
to perhaps come up with cool new stuff?
JimmyZ the pdf? 14:49
timotimo you mentioned something about strength reduction
there's a pdf?
14:49 woolfy joined
jnthn There's a PDF? :D 14:49
JimmyZ yes, two
jnthn Oh...those two :)
timotimo well, i kind of know about SSA, i don't know much about CFG except i have a basic idea how you would make one
jnthn timotimo: Well, to start with I suggest reading/comprehending spesh/graph.h. 14:50
That's *the* data structure
spesh/facts.h is also important
timotimo thanks :) 14:51
ooh this is exciting :3
any easy optimization opportunity you've left out so far due to lack of time? something i could use as a way to get to know the system? 14:52
brrt what timotimo said :-) 14:53
jnthn timotimo: Oh, there's almost no optimizations implemented yet.
The work has all been building analysis.
And then just a few opts.
cognominal I know what an SSA is but not a CFG :( 14:54
brrt control flow graph
cognominal ho
brrt i had to google / ask :-)
jnthn cognominal: Just a graph of the basic blocks and how flow goes between them.
cognominal I should have guessed
jnthn cognominal: Like a conditional can branch two ways.
brrt could've been many things
cognominal I did not know the acronym
jnthn Basically, every basic block has a few relations to others in the graph
brrt i should spend some time, too, to get to know it 14:55
jnthn Its predecessors, successors, and (dominance) children
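A minimal sketch of the three relations jnthn lists, assuming a tiny graph of at most 32 blocks in which every block is reachable from block 0. `MiniCFG`, `add_edge` and `compute_idoms` are hypothetical names (the real structure is in spesh/graph.h), and the simple iterative dominator computation here is a textbook method, not necessarily the one spesh uses.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BB 32   /* sketch limit: one bit per block in a uint32_t */

/* Hypothetical miniature CFG, using bitmasks for the edge sets. */
typedef struct {
    int      num_bbs;
    uint32_t succ[MAX_BB];  /* bit j set in succ[i]: edge i -> j */
    uint32_t pred[MAX_BB];  /* bit i set in pred[j]: edge i -> j */
    int      idom[MAX_BB];  /* immediate dominator; -1 for the entry block */
} MiniCFG;

void add_edge(MiniCFG *g, int from, int to) {
    assert(from >= 0 && from < g->num_bbs && to >= 0 && to < g->num_bbs);
    g->succ[from] |= (uint32_t)1 << to;
    g->pred[to]   |= (uint32_t)1 << from;
}

static int count_bits(uint32_t x) {
    int c = 0;
    while (x) { x &= x - 1; c++; }
    return c;
}

/* Block 0 is the entry. Iterates dom(n) = {n} | AND of dom(p) over all
 * predecessors p until nothing changes, then picks, for each block, the
 * strict dominator with the largest dominator set (the closest one). */
void compute_idoms(MiniCFG *g) {
    uint32_t dom[MAX_BB];
    uint32_t all = g->num_bbs == MAX_BB
        ? 0xFFFFFFFFu
        : (((uint32_t)1 << g->num_bbs) - 1);
    int n, p, changed = 1;

    assert(g->num_bbs >= 1 && g->num_bbs <= MAX_BB);
    dom[0] = 1;
    for (n = 1; n < g->num_bbs; n++)
        dom[n] = all;
    while (changed) {
        changed = 0;
        for (n = 1; n < g->num_bbs; n++) {
            uint32_t d = all;
            for (p = 0; p < g->num_bbs; p++)
                if (g->pred[n] & ((uint32_t)1 << p))
                    d &= dom[p];
            d |= (uint32_t)1 << n;
            if (d != dom[n]) { dom[n] = d; changed = 1; }
        }
    }
    g->idom[0] = -1;
    for (n = 1; n < g->num_bbs; n++) {
        uint32_t strict = dom[n] & ~((uint32_t)1 << n);
        int best = -1;
        for (p = 0; p < g->num_bbs; p++)
            if ((strict & ((uint32_t)1 << p)) &&
                (best < 0 || count_bits(dom[p]) > count_bits(dom[best])))
                best = p;
        g->idom[n] = best;
    }
}
```

The dominance children of a block are then simply the blocks whose `idom` entry points at it. For a conditional that branches two ways and rejoins, the join block has two predecessors but is a dominance child of the block holding the branch.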
brrt is the CFG parallel to the SSA?
cognominal good slides about the subject : www.cs.utexas.edu/~pingali/CS380C/2...slides.pdf
timotimo brrt: the CFG is a bunch of blocks, each of the blocks is instructions in SSA
brrt ah… i see
timotimo is my understanding
brrt yes
jnthn brrt: Well, that's a curious way to put it... 14:56
brrt: You need a CFG before you can compute SSA
brrt … yes
exciting stuff
jnthn timotimo: Well, except that things have lifetimes longer than blocks so you have to insert phi functions to cope with that.
dalek MoarVM/spesh: fb0f63c | jnthn++ | src/spesh/optimize.c:
Turn known-unrequired decont into set.
MoarVM/spesh: cec2a2b | jnthn++ | src/ (3 files):
Make MVM_SPESH_DISABLE env var disable spesh.

For debugging, or easily trying with/without it. Note that it's about if the envvar is defined/non-empty; it doesn't care for the value.
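The "defined/non-empty, value ignored" rule from the commit message can be sketched as below. Only the variable name `MVM_SPESH_DISABLE` comes from the log; `flag_value_is_set` and `env_flag_is_set` are hypothetical helpers, not MoarVM functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if the value counts as "set": defined and non-empty.
 * The content is deliberately ignored, so even "0" counts as set. */
int flag_value_is_set(const char *value) {
    return value != NULL && value[0] != '\0';
}

/* Looks the variable up in the process environment with getenv. */
int env_flag_is_set(const char *name) {
    assert(name != NULL);
    return flag_value_is_set(getenv(name));
}
```

Under this rule `env_flag_is_set("MVM_SPESH_DISABLE")` would be true for `MVM_SPESH_DISABLE=0` as well as `MVM_SPESH_DISABLE=1`, and false only when the variable is unset or empty.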
jnthn If you define MVM_SPESH_LOG=somefile and then run some NQP or Perl 6 program, it will dump *loads* of info.
timotimo jnthn: ah, so that's the magical phi instruction
jnthn: should i see if i can parse the log and create a graphviz visualization or something? 14:59
jnthn timotimo: that would be cute, though probably one of the other nice things to do is to take the before/after output and produce/show diffs. 15:00
timotimo: So you can easily see what changed.
timotimo i assume you'll allow me to change the output of the log to make it easy to parse?
jnthn So long as it's still easy to diff
It's already quite structured
However, there's a bunch of <nyi> 15:01
Where it doesn't know how to dump certain things
timotimo src/spesh/manipulate.h:1:1: warning: data definition has no type or storage class [enabled by default]
i get this a bazillion times during compilation
jnthn whee
oh, I see...
timotimo a missing typedec perhaps?
jnthn fixing 15:02
timotimo oh
missing a void in front
dalek MoarVM/spesh: aceb5dd | jnthn++ | src/spesh/manipulate. (2 files):
Missing return type; timotimo++.
timotimo (in two files)
will it support MVM_SPESH_LOG=- ? :) 15:03
jnthn It calls fopen
Seems not on Windows. 15:04
timotimo ah. well, /dev/stdout works :)
and yeah, that's a lot of output :)
jnthn Yeah; it's before/after of everything it specialized.
timotimo the "goto <nyi>" i see, does that mean that the output of the goto op is not yet implemented?
jnthn Yeah, it's purely that dump.c is missing a bunch of things
I wrote the ones I needed to debug the problems I had :)
Instructions are actually always wanting output like BB42 or so 15:05
(that is, basic block 42)
The info is all there in the graph.
It's just adding more cases to the switch/case for operand dumping.
There's a few things we need besides more optimizations to let this be more effective... 15:06
1) It currently only knows how to deal with locals, not lexicals. This makes lexical => local lowering even more valuable. 15:07
2) It doesn't know how to de-opt yet.
timotimo oof, that opt ...
i've had quite a lot of trouble with that optimization in the past :)
lexical to local i mean
jnthn Yeah, I don't mind trying to pick it up and work on it.
Maybe I should do that.
There are a few things that can force de-opt. 15:08
Mixins are *the* huge pain because they're the only time an object whose type we think we already know can change type.
I thought about trying to track it cleverly
Then in the end was like, "you know, screw it, just walk the whole stack and de-opt the lot" 15:09
timotimo probably good enough
jnthn It's far safer.
timotimo i wonder if my newest code for lex2loc is up on github yet
jnthn Anyway, implementing the de-opt is...a todo. :) 15:10
Basically, you always need a map of return address in optimized code to return address in original code. 15:11
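One possible shape for such a map, purely as an illustration: at this point in the log de-opt is still a todo, so nothing below reflects an actual MoarVM design. `DeoptEntry` and `deopt_lookup` are hypothetical, and addresses are modelled as bytecode offsets in a table sorted by the optimized offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a return address (as a bytecode offset) in the
 * specialized code and the matching offset in the original code. */
typedef struct {
    uint32_t optimized_offset;
    uint32_t original_offset;
} DeoptEntry;

/* Binary search over entries sorted by optimized_offset. Returns 1 and
 * stores the original offset on a hit; returns 0 if there is no mapping. */
int deopt_lookup(const DeoptEntry *map, size_t n, uint32_t optimized,
                 uint32_t *original_out) {
    size_t lo = 0, hi = n;
    assert(original_out != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (map[mid].optimized_offset == optimized) {
            *original_out = map[mid].original_offset;
            return 1;
        }
        if (map[mid].optimized_offset < optimized)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

Walking the whole stack and de-opting the lot, as jnthn suggests, would then amount to one such lookup per frame that is running specialized code.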
timotimo there it is.
jnthn Thanks.
timotimo i'm not proud of the state of the code ... 15:12
but i'll be happy when you drag it kicking and screaming into a working and beautiful state
jnthn OK.
timotimo the basic structure is probably sound; a friend gave me the helpful advice that i'll need to do two descents 15:14
first i descend the whole tree downwards so i'm closest to the leaves
then, while "going outwards" (that is: after visit_children() has finished) i need to go towards the leaves again to do the check what variables can be local'd and whether or not any remain that would prevent block-to-stmts transformation 15:15
jnthn Good to know. I'll take a look at it later on.
timotimo yay 15:16
aaw 15:20
build_cfg clears out ins_to_bb
if i want to dump the output of a jumping op, i'll need to access that information later on, won't i?
hm, though it should stash the address of the basic block somewhere as well, shouldn't it?
jnthn The info is in the spesh graph still 15:21
timotimo ah. just not in the same way. okay 15:22
jnthn See MVMSpeshOperand
MVMSpeshBB *ins_bb;
That one
If you grab that and ->idx it, you get the index.
timotimo oh! 15:23
i thought i'd find an ins_offset there
but if it'll be a _bb, that's the best thing that can happen
jnthn No; it replaces all those once the bbs exist.
The offsets are meaningless as soon as you mutate the graph, but bbs are forever :)
Well, unless you make them unreachable. :) 15:24
timotimo best buddies forever <3
o_O 15:25
these phi instructions will not end up in actual code, right?
jnthn Correct. 15:26
Before the stuff got shoved into academic papers, they were called phoney instructions.
timotimo hehehe. 15:27
yeah, a greek letter is much more academic
unless_i r4(2), BB(3) 15:29
doesn't look half bad 15:30
jnthn \o/
Note that r4(2) means "r4 in the original bytecode, 2nd version"
timotimo also, now there's "<nyi lit>" in addition to "<nyi>"
maybe i'll put a _ there
yeah, i should do that
brrt will check this out @home after dinner 15:31
jnthn You can dump the literals quite easily though, I think :)
o/ brrt
timotimo bon appetit!
oh, he's gone
yeah, i'll dump them next.
oh, oops 15:32
seems like i forgot to ->info in between and it still worked?
cur_ins->operands vs cur_ins->info->operands
oh, the code i saw before used both, too
jnthn They are...different :) 15:33
->info->operands is the flags telling you what the operand is (register, literal, etc.) 15:34
timotimo i use one &ed to the operand masks to figure out what the type is and the other ... ah
jnthn ->operands are the actual values
timotimo well, that's perfect then :)
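The split between "what kind of operand is this" and "what is its value" can be sketched as a small dumper with one switch case per kind, producing the same shapes seen in the log (`r4(2)`, `liti64(16)`, `BB(3)`). `OperandKind`, `OperandValue` and `dump_operand` are hypothetical; MoarVM's real flag bits, masks and union layout are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical operand kinds, standing in for the per-op flags. */
typedef enum { KIND_REG, KIND_LIT_I64, KIND_BB } OperandKind;

/* The value half: what one operand slot actually holds. */
typedef union {
    struct { uint16_t orig; uint16_t version; } reg;  /* SSA register */
    int64_t lit_i64;                                  /* literal integer */
    int32_t bb_idx;                                   /* jump target block */
} OperandValue;

/* Formats one operand, choosing the union member from the separately
 * stored kind. Kinds without a case fall back to "<nyi>". Returns the
 * length reported by snprintf. */
int dump_operand(char *buf, size_t size, OperandKind kind, OperandValue v) {
    assert(buf != NULL && size > 0);
    switch (kind) {
    case KIND_REG:
        return snprintf(buf, size, "r%u(%u)",
                        (unsigned)v.reg.orig, (unsigned)v.reg.version);
    case KIND_LIT_I64:
        return snprintf(buf, size, "liti64(%lld)", (long long)v.lit_i64);
    case KIND_BB:
        return snprintf(buf, size, "BB(%d)", (int)v.bb_idx);
    default:
        return snprintf(buf, size, "<nyi>");
    }
}
```

Covering a new operand type in the dump is then one more enum value plus one more case, which matches jnthn's earlier remark that it is just a matter of adding cases to the switch.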
throwcatdyn r5(1), liti64(16) 15:35
jnthn :)
Poor cat...
timotimo what are these FH related annotations?
jnthn FH = Frame Handler 15:36
Consider a try, or a loop with control exceptions
timotimo ah, right
jnthn We need to know where the start/end of the exception handler is in the specialized code.
dalek MoarVM/spesh: 720e905 | (Timo Paulssen)++ | src/spesh/dump.c:
dump out BBs for jump targets
MoarVM/spesh: 1490226 | (Timo Paulssen)++ | src/spesh/dump.c:
dump literal ints and nums
timotimo should i dump strings literally (actually, escaped)?
or as an index to the string table?
or both? 15:38
jnthn Hmm
Good question. Maybe the string itself is more useful for reading the code.
timotimo yes, but it may disturb the flow of the reader :P
if it's a ten-line heredoc or something :)
jnthn True :) 15:39
But if somebody gives me a program to debug spesh on with a 10 line heredoc in, I might just remove the 10 line heredoc :P
timotimo :D
i could implement it as a fuse filesystem on linux that gives you a folder strings/ and that has text files in it for each number! :P 15:40
or a HTTP server that'll serve these strings to any curl that asks!!
this is brilliant!
all the possibilities forever!
jnthn o.O 15:42
Crazy ideas are crazy :)
15:45 colomon joined
FROGGS_ a less crazy idea would be to upload the star release tarballs :o) 15:47
15:50 FROGGS[mobile] joined
jnthn FROGGS[mobile]: I don't seem to find an SSH key..hmm 15:58
16:29 benabik joined
dalek MoarVM/spesh: a32a5f1 | jnthn++ | src/spesh/graph.c:
Set ->prev correctly during phi insertion.
MoarVM/spesh: ed9db72 | jnthn++ | / (4 files):
Add :pure annotations to side-effect-free ops.

These are ones we know we can safely delete if they are unused.
MoarVM/spesh: 677a5dc | jnthn++ | src/spesh/facts. (2 files):
Start tracking usage per SSA local.
MoarVM/spesh: 9ef4e75 | jnthn++ | src/spesh/optimize.c:
Back-propagate usage; eliminate unused pure ops.
benabik … phi insertion? Excellent.
jnthn :) 16:38
timotimo should the log only output diffs or should it also log whenever it successfully applies some transformation? 16:45
jnthn I think the diff between before and after will tell us enough. 16:46
timotimo oh, also: the diff between before and after will also contain blocks being re-ordered and perhaps dropped or merged, right?
jnthn It shows them in linear_next order 16:49
They may get dropped I guess...
Though not yet. 16:50
timotimo will transformations likely reorder BBs? 16:51
jnthn Not in the output. 16:52
For dump, I mean.
timotimo OK, that's nice
jnthn The most immediate elimination we're gonna see is on optional params.
timotimo oh, that's nice
and with it, it'll probably also be able to do some dead code removal? 16:53
jnthn Right.
timotimo i like that
is the name "spesh" significant in any other ways than short and cute for "specializer"? 16:54
jnthn Short and cute for specializer. 16:55
"spec" would give idea of "specification"
And "specializer" was too long
timotimo sounds good to me 16:56
cognominal sounds like yiddish 17:01
FROGGS[mobile] shmok! 17:02
timotimo would you be okay with a diff display tool that's written in python?
jnthn timotimo: hm, I'd prefer it in Perl 6 ;) 17:05
timotimo: tbh I was gonna dig the bits out and then shell out to git diff :P
timotimo heh. 17:06
jnthn No need to re-invent the wheel too much :)
timotimo aye
git diff also makes it pretty colorful
and it has an implementation of patience diff
jnthn aye, the colors are nice 17:08
Time for some dinner. :) 17:11
timotimo i now realize the problem i imagined with using perl6 to parse the output of a perl6 script is not even a problem
i was like "but if i pipe it to a moarvm based perl6, it'll inherit the log env var and clobber the log!"
but it would make more sense to output the log to an actual file.
timotimo heads out 17:13
17:31 FROGGS joined
timotimo i'm getting "malformed UTF8" from my tool now :| 18:11
ah 18:12
somehow something weird came into the file
possibly from me hitting ctrl-c or so
wat. 18:13
more bogus output
where do these weird characters come from. 18:14
17851786After: 18:15
that's not actually helpful output
jnthn hmm
I was gonna say did it fail to null terminate it, but it looks OK to me... 18:16
append(&ds, "\n\0");
timotimo a well placed + 1 seems to have fixed it 18:17
in the memcpy of append 18:18
jnthn I just noticed we don't free the before/after
oh :)
timotimo len + 1 causes it to no longer explode
jnthn oh, hang on...
timotimo the output of this simple script is quite huuuuge 18:19
jnthn Well, remember it's optimizing the compiler too :)
timotimo i was about to complain that some cuids get optimized multiple times 18:20
but that totally makes sense.
jnthn timotimo: OK, here's the actual issue 18:21
The memcpy is fine normally
It just doesn't append the \0 'cus it uses strlen to figure out how much to append 18:22
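The issue jnthn describes can be reproduced in miniature: a length taken from `strlen` stops at the first NUL, so `"\n\0"` contributes a single byte and the buffer ends up unterminated. The sketch below shows one way to terminate explicitly. `DumpStr`, `dump_init`, `dump_append` and `dump_finish` are hypothetical names, and this is not necessarily how the actual fix in MoarVM was done.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable dump buffer. */
typedef struct {
    char  *buffer;
    size_t alloc;
    size_t pos;
} DumpStr;

void dump_init(DumpStr *ds) {
    ds->alloc  = 32;
    ds->pos    = 0;
    ds->buffer = malloc(ds->alloc);
    if (ds->buffer == NULL)
        abort();
}

/* Copies strlen(to_add) bytes. A "\0" written inside the literal is never
 * copied, because strlen stops at the first NUL byte. One spare byte is
 * always kept in reserve for the terminator. */
void dump_append(DumpStr *ds, const char *to_add) {
    size_t len;
    assert(to_add != NULL);
    len = strlen(to_add);
    if (ds->pos + len + 1 > ds->alloc) {
        char *grown;
        while (ds->pos + len + 1 > ds->alloc)
            ds->alloc *= 2;
        grown = realloc(ds->buffer, ds->alloc);
        if (grown == NULL)
            abort();
        ds->buffer = grown;
    }
    memcpy(ds->buffer + ds->pos, to_add, len);
    ds->pos += len;
}

/* Writes the terminator explicitly and returns the finished C string.
 * The caller owns the buffer and should free() it. */
char *dump_finish(DumpStr *ds) {
    ds->buffer[ds->pos] = '\0';
    return ds->buffer;
}
```

Copying `len + 1` bytes, as timotimo tried, happens to work when the source is an ordinary C string, since the extra byte is its terminator; terminating once at the end avoids relying on that for every call.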
timotimo ah, hehehe :)
okay, i'm getting somewhere 18:23
i haven't done actual proper perl6 code in a long while, it seems
i'm surprised by how clean the code seems to be
it's a for lines() with a bunch of whens that change state and push a completed thingie into an array once it's done
jnthn Building a fix now for the dump thing 18:24
timotimo do you mean it isn't always safe to just copy an additional byte?!?!?!?!!kkkkk 18:25
jnthn :P
timotimo (capslock + k gives ! on my layout, capslock+shift + k gives κ) 18:26
dalek MoarVM/spesh: e53001b | jnthn++ | src/spesh/ (2 files):
Fix spesh dump missing null and memory leak.

  timotimo++ for discovering the former.
timotimo time for foods 18:30
18:48 lizmat joined
timotimo and also groceries 18:50
cognominal jnthn, is spesh dynamic or static optimizations? 19:25
jnthn dynamic; it factors things we can't (or can't easily) see at compile time 19:26
*factors in
And will increasingly come to do so. 19:27
cognominal like v8 as described in the self papers?
jnthn Well, some of the same techniques, sure 19:31
Though nothing so advanced yet.
cognominal yet is the operative word :) 19:33
timotimo is back 19:45
gist.github.com/timo/c32f6687135849294aa4 - what do you think? 19:59
apart from it being after -> before rather than before -> after m)
jnthn nice :) 20:00
timotimo ok, there were changes in the mean time o_O 20:01
i just accidentally reset spesh to master 20:02
git diff only accepts two filesystem files at once 20:03
should i just shell out to git-diff a bunch of times in sequence for all pairs? 20:04
in order to get the colors, i need to shell out :|
otherwise i could output it to files
or perhaps offer a menu
jnthn Could do, yeah. And if the script can take an optional arg which is a way to grep for names, it's even better.
timotimo oh, yeah
how about a -e that'll get eval'd and given to a smartmatchy grep? 20:05
jnthn sounds OK
timotimo -e "name => /foo/" will give you "search for /foo/ in the name field" for example
jnthn Though plain text would do me :)
timotimo is that how it works?
jnthn oh wow
That's nice :)
timotimo i've got to try that first
before i promise anything :)
if you f5 the gist, you'll get the right direction, -before +after 20:07
(need to upload it first)
i promised too much, sorry 20:10
name => /foo/ wouldn't "just work"
but i could make it work :)
jnthn I don't especially need it ;) 20:11
timotimo well, then you could search for stuff in before and after 20:12
maybe it'd be most interesting to search for befores that have one string and afters that have another string 20:13
actually, in that case you'd want something like the pickaxe search where only lines with + or - in the diff will get considered
the specializer does a good job getting rid of PHI instructions :P 20:14
20:34 brrt joined
timotimo i now have a sub called supersmartmatch 20:43
jnthn timotimo: Yeah - it looks silly at first glance to do so, until you realize that getting rid of them means it also decrements the usage counts on the things that they bring together 20:47
timotimo: Which may in turn allow getting rid of real instructions
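The cascade jnthn describes can be sketched with usage counts over straight-line SSA code, where every value is defined before it is used. This is an illustration under that assumption, not spesh's implementation: `MiniIns` and `eliminate_dead` are hypothetical, and a phi is modelled simply as a pure instruction with two sources.

```c
#include <assert.h>

#define NO_REG (-1)

/* Hypothetical SSA instruction: one destination, up to two sources. */
typedef struct {
    int dest;     /* SSA value written, or NO_REG */
    int src[2];   /* SSA values read, or NO_REG */
    int pure;     /* 1 if safe to delete when the result is unused */
    int deleted;  /* set to 1 by eliminate_dead */
} MiniIns;

/* Counts uses of every SSA value, then sweeps backwards deleting pure
 * instructions whose result is unused. Each deletion decrements the use
 * counts of its sources, which can make earlier instructions dead within
 * the same sweep. Returns the number of instructions deleted. */
int eliminate_dead(MiniIns *ins, int n, int *uses, int num_values) {
    int i, s, removed = 0;

    for (i = 0; i < num_values; i++)
        uses[i] = 0;
    for (i = 0; i < n; i++) {
        ins[i].deleted = 0;
        assert(ins[i].dest == NO_REG ||
               (ins[i].dest >= 0 && ins[i].dest < num_values));
        for (s = 0; s < 2; s++) {
            if (ins[i].src[s] == NO_REG)
                continue;
            assert(ins[i].src[s] >= 0 && ins[i].src[s] < num_values);
            uses[ins[i].src[s]]++;
        }
    }
    for (i = n - 1; i >= 0; i--) {
        if (!ins[i].pure || ins[i].dest == NO_REG || uses[ins[i].dest] > 0)
            continue;
        ins[i].deleted = 1;
        removed++;
        for (s = 0; s < 2; s++)
            if (ins[i].src[s] != NO_REG)
                uses[ins[i].src[s]]--;
    }
    return removed;
}
```

With two pure definitions feeding an unused phi, deleting the phi drops both of their use counts to zero, so the same sweep removes them as well, while a value read by an impure instruction stays.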
timotimo oh god this is glorious 20:55
it's working
jnthn ooh :) 21:00
timotimo gist.github.com/timo/8d0187163cdbebc35d70 21:01
you see the * in the list of cuids? :)
jnthn yes? 21:05
That means multiple? 21:06
Or no difference?