timotimo the problem is we really want to mprotect jit code regions read-only and executable 00:02
that means we'd have to at some point decide "we should move some jit segments together" and memcpy them into a big chunk and switch over
00:07 lizmat_ joined
timotimo so we won't be faster at setting up stuff 00:07
we'll just end up using fewer mmaps all in all
er, not even "all in all", just "eventually" 00:15
other JITs must have something for this 00:18
oh, actually, maybe we can mprotect pieces of an mmap? 00:19
then we can map big chunks at a time and fill them with jitcode, making parts read-only as we go
yes, that should work 00:21
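A minimal sketch of the approach timotimo describes above, assuming POSIX mmap/mprotect on x86-64; this is illustration only, not MoarVM code:

    /* Reserve one large writable mapping, copy JIT output into it, then
     * flip only the filled, page-aligned prefix to read+execute. */
    #include <stdint.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define REGION_SIZE (16 * 1024 * 1024)

    int main(void) {
        long page = sysconf(_SC_PAGESIZE);

        /* One big anonymous mapping, initially writable but not executable. */
        uint8_t *region = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (region == MAP_FAILED)
            return 1;

        /* Stand-in for machine code produced by the JIT (0xC3 = ret on x86-64). */
        uint8_t fake_code[128] = { 0xC3 };
        size_t used = 0;
        memcpy(region + used, fake_code, sizeof fake_code);
        used += sizeof fake_code;

        /* mprotect works at page granularity, so round the filled part up to a
         * page boundary and make just that prefix read+execute; later code is
         * appended beyond it while earlier pages stay read-only. */
        size_t protect_len = (used + page - 1) & ~(size_t)(page - 1);
        if (mprotect(region, protect_len, PROT_READ | PROT_EXEC) != 0)
            return 1;
        return 0;
    }

This also matches brrt's point later in the log: protection changes only apply to whole pages, so "making parts read-only as we go" really means freezing page-aligned chunks.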
japhb I still don't quite understand why we don't consider persisting JIT output, so we can just mmap it back in when we detect it's safe. Or perhaps that last bit is too hard right now. 00:40
timotimo we have references to objects (types, stables) in our jit objects that would have to magically point to the same objects afterwards as well 01:06
japhb That's not a hard blocker, that's just a matter of having e.g. a fixup table. But I understand your basic point that this makes things more difficult. 02:14
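For illustration, a hedged sketch of the kind of fixup table japhb mentions; the struct and the resolver are invented names, not anything MoarVM actually has:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Persisted JIT code would record every offset that holds a raw pointer,
     * plus a stable key for what it pointed at; on load, each offset gets
     * patched with the address that object has in the new process. */
    typedef struct {
        size_t   code_offset;   /* where in the code blob the pointer lives */
        uint32_t symbol_id;     /* stable identifier for the target object  */
    } FixupEntry;

    typedef void *(*ResolveFn)(uint32_t symbol_id);  /* hypothetical resolver */

    static void apply_fixups(uint8_t *code, const FixupEntry *fixups,
                             size_t n_fixups, ResolveFn resolve) {
        for (size_t i = 0; i < n_fixups; i++) {
            void *target = resolve(fixups[i].symbol_id);
            /* Patch the pointer bytes in place while the code is still writable. */
            memcpy(code + fixups[i].code_offset, &target, sizeof target);
        }
    }

As brrt explains further down, the difficulty is that today these targets (functions, and sometimes non-moving objects) are linked as hard pointers at JIT time.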
03:30 ingy joined 03:52 ilbot3 joined 07:30 sivoais joined 08:06 FROGGS joined 08:21 Ven joined 08:25 kjs_ joined 08:32 Ven joined
jnthn On the cost of JIT affecting startup, that probably means we should look at why we do enough work at startup to trigger JIT, or whether we've set JITting thresholds too low. 09:02
09:13 kjs_ joined 09:45 tadzik joined 09:55 brrt joined
brrt timotimo - i'm not sure if you can mprotect pieces-of-an-mmap, and even if you could, i'm less sure windows can 09:58
i believe mprotect relies on the paging hardware of the CPU? and those have page granularity
and moreover
i don't care about writing a clever allocator-for-rx-memory without exhaustive proof of its necessity 09:59
nwc10 This is partly what I'm wondering (and I don't know how to benchmark it, or figure it out from docs/code inspection)
jnthn I'd rather try the other things I suggested first :)
nwc10 is the syscall to change memory protection about as slow as mmap itself? 10:00
both have to fiddle with page tables.
brrt well, one can check if the # of mmaps goes down significantly with MVM_DISABLE_JIT=1
nwc10 but yes, if JIT thresholds are too low, change them and startup shouldn't hit the JIT
and if they aren't, but startup is hitting the JIT, why is the relevant setting code doing work at runtime anyway? 10:01
brrt parsing the file is done at runtime at least?
nwc10 true, but how much parsing does 'say 1' take? 10:02
brrt and regexes / grammars can be compiled
i don't know
good point
nwc10 me neither. but it's a good question.
brrt fwiw, python has 0.010s sys time in a -e 'print "hi"' 10:03
oh, that'd be 0.030 sys time
ruby has 0.006s sys 10:04
and perl5 has 0.002s
with JIT the sys time is 0.047s, without it it's 0.03s 10:07
so, ok, significant
however, both are dwarfed by user time (0.29s w/o JIT, 0.30 with JIT) 10:08
japhb - the main reason is linking 10:09
we typically insert hard pointers to functions, sometimes to nonmoving objects (when this can be proven safe) 10:10
(for not persisting JIT output, btw)
another reason is that frames would need to be related to one another across VM runs, which is pretty hard actually 10:11
if it weren't i wouldn't have such problems giving a name to a frame
oh, timotimo already answered that :-) 10:12
another solution would be to batch JIT compilations so a single mmap can be made for all of them 10:13
but that is.. not necessarily awesome
[Coke] brrt - I can't see the picture in brrt-to-the-future.blogspot.com/201...FstRrW7PkQ 12:50
FROGGS hmmm, I can 13:00
3.bp.blogspot.com/-QP33McMn3qQ/VQ_e...erview.png 13:01
[Coke]: does that work? 13:03
[Coke] ERR_NAME_NOT_RESOLVED 13:09
nwc10 does resolve for me (to different places, from machines on different continents) 13:10
[Coke] mmm, probably just my work proxy & firewall. 13:11
nwc10 [Coke]: freethoughtblogs.com/lousycanuck/20...orkaround/ :-) 13:13
[Coke] by default, I'm not even using DNS - it's going through work's http proxy, so it's -that- dns. I'd have to do the lookup separately, plug the IP in, still subject to the proxy letting me see anything... 13:21
was hoping it was either broken for everyone or there was a copy somewhere. no worries. 13:22
13:39 colomon joined
timotimo asking around, it seems like many jits actually don't W^X yet 13:48
14:16 zakharyas joined
timotimo "locking" jit-destination-pages seems like a bad idea, too 14:42
15:05 colomon joined 15:25 ShimmerFairy joined 15:42 kjs_ joined
timotimo if the jit-destination-pages hang off of the TC, that wouldn't be a problem 15:49
then we'd never be trying to run code in a given segment and appending to the segment at the same time
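A sketch of the per-thread bump allocation timotimo is describing, with hypothetical ThreadContext fields; for brevity it maps the arena RWX rather than flipping pages to RX as in the earlier sketch:

    #include <stddef.h>
    #include <stdint.h>
    #include <sys/mman.h>

    #define JIT_ARENA_SIZE (4 * 1024 * 1024)

    typedef struct {
        uint8_t *arena;   /* this thread's code arena; only its owner appends */
        size_t   used;    /* bytes already handed out */
    } ThreadContext;

    static uint8_t *tc_alloc_code(ThreadContext *tc, size_t bytes) {
        if (!tc->arena) {
            void *p = mmap(NULL, JIT_ARENA_SIZE,
                           PROT_READ | PROT_WRITE | PROT_EXEC,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (p == MAP_FAILED)
                return NULL;
            tc->arena = p;
            tc->used  = 0;
        }
        if (tc->used + bytes > JIT_ARENA_SIZE)
            return NULL;            /* a real allocator would chain arenas */
        uint8_t *out = tc->arena + tc->used;
        tc->used += bytes;
        return out;
    }

Note there is no free operation here, which is exactly where jnthn's objection below bites once JIT code turns out to be GC-able.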
16:03 Ven joined 17:01 FROGGS[mobile] joined
timotimo pointers are hard sometimes 17:17
nwc10 sometimes? :-) 17:28
pointers are hard, lets go Java! 17:29
17:31 rurban_ joined
timotimo structs make the whole thing much less annoying to handle 17:35
timotimo is building a prototype mmap-consolidation-allocator-thingie 17:38
and it doesn't work at all m( 17:39
jnthn You can't really easily hang JIT pages off tc 17:58
Or if you try, your allocator is going to be harder than you imagine
Consider a hot EVAL'd bit of code
That gets GC later.
18:06 AndChat|228864 joined
AndChat|228864 aloha 18:06
timotimo wow, it runs now 18:11
do we actually throw out jit code? 18:12
oh, we do 18:14
yeah, my idea wasn't such a good one, apparently 18:15
jnthn Yeah, they hang off MVMStaticCode objects which are GC-able 18:20
Specializations too
timotimo ah
well, damn
jnthn So your allocator would need a free operation also, and you don't know which thread will finalize an MVMStaticFrame, so it'd need to be a thread-safe allocator off instance rather than per tc 18:21
So I'm afraid it quickly gets complex :(
Better might be to adjust the thresholds for when we spesh
And use perl6-m --profile-compile -e '1' to see the % of time spent specializing/JIT-compiling at startup
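To make jnthn's allocator point above concrete, a rough sketch of an instance-wide, mutex-guarded code allocator with a free operation; all names are hypothetical and this is not MoarVM's design:

    #include <pthread.h>
    #include <stddef.h>

    typedef struct CodeBlock {
        struct CodeBlock *next;
        size_t            size;
        /* executable bytes would follow in a real allocator */
    } CodeBlock;

    typedef struct {
        pthread_mutex_t lock;
        CodeBlock      *free_list;   /* reusable blocks; size-bucketed in practice */
    } CodeAllocator;

    /* First-fit reuse of a previously freed block; NULL means the caller
     * must carve a fresh block out of a mapping instead. */
    static CodeBlock *code_alloc(CodeAllocator *a, size_t size) {
        pthread_mutex_lock(&a->lock);
        CodeBlock **p = &a->free_list, *hit = NULL;
        while (*p) {
            if ((*p)->size >= size) { hit = *p; *p = hit->next; break; }
            p = &(*p)->next;
        }
        pthread_mutex_unlock(&a->lock);
        return hit;
    }

    /* Any thread may return a block when GC finalizes the frame that owned it. */
    static void code_free(CodeAllocator *a, CodeBlock *blk) {
        pthread_mutex_lock(&a->lock);
        blk->next    = a->free_list;
        a->free_list = blk;
        pthread_mutex_unlock(&a->lock);
    }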
18:25 Peter_R joined
timotimo gaaahh 18:25
my life: have idea, build it (mostly), find out it's wrong from the start
18:50 FROGGS[mobile] joined
timotimo maybe we should just malloc the jit pages and just mark our whole program heap as executable 18:52
19:07 FROGGS[mobile] joined 19:18 FROGGS joined 19:21 kjs_ joined
japhb timotimo: You forget step 4: Learn a ton in the process 19:29
Also, if you push a commit to mark all of heap executable, I shall whip you with a wet noodle until you revert. 19:30
timotimo :)
20:03 brrt joined
brrt timotimo: allocators are hard, but trying is always helpful 20:08
jnthn wonders if a wet noodle can actually inflict pain... 20:12
brrt i wonder if pain was the point :-)
japhb I think it's more a matter of "Eww, would you please STOP THAT?!" "Sure, just as soon as you revert your commit!" "FINE, SHEESH." 20:16
Now, a wet mackerel is more likely going to combine *both* the ickiness and the pain. 20:18
20:19 colomon joined
japhb Don't make me pull out the fermented foods. Because I will. You just try me. ;-) 20:19
.oO( News flash: Hacker threatened with food, story at 11 ... )
20:21
jnthn Noo....not the pickled walnuts! 20:22
brrt blog.erratasec.com/2015/03/x86-is-h...RMUs-nd-V5 interesting stuff about the x86 20:24
jnthn Aye :) 20:27
I was reading up on a bunch of that stuff recently. It's pretty fun :)
brrt yes, rather 20:29
it's also funny that in many cases the old 'high level' ops like push and pop are really slower than the lower-level ops 20:30
jnthn I suspect the clever handling of MOV *may* mitigate some of Moar's currently sub-optimal code-gen, but the WORK fetches it really can't help much with, and I suspect those are what cost us the most. 20:33
nwc10 WORK fetch?
jnthn frame->work 20:35
The VM-level registers 20:36
nwc10 aha
jnthn It's fetched into a register at frame entry known as WORK in the JIT, iirc
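An illustrative, deliberately simplified layout of what fetching WORK means; the types here are hypothetical and MoarVM's real structures differ:

    #include <stdint.h>

    typedef union {      /* a VM-level register */
        int64_t i64;
        double  n64;
        void   *o;
    } Register;

    typedef struct {
        Register *work;  /* the frame's working register set */
    } Frame;

    static int64_t read_reg(Frame *frame, uint16_t idx) {
        Register *WORK = frame->work;  /* the one-time fetch at frame entry */
        return WORK[idx].i64;          /* later accesses are fixed offsets from WORK */
    }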
nwc10 (in the enterprise edition, this will be called MEETINGS?) 20:38
jnthn Nah, we'll embrace agile and call it STANDUP :P 20:39
20:53 FROGGS[mobile] joined
brrt lol :-D 20:55
i was wondering about calling them all just REGISTER or REG 20:56
jnthn WORK nicely matches what's in frame, though :)
brrt true 20:57
dalek MoarVM: e2e908b | jnthn++ | src/spesh/threshold.c:
Tweak dynamic optimization thresholds.

These are tweaked by considering profiles of NQP and Rakudo startup. We now do barely any specialization during NQP startup, and only about a third as much as we used to during Rakudo startup. This improves the startup time of both a little (measurable outside the profiler in both cases too, including NQP "make test" taking about 10% less time).
21:06
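The mechanism being tuned, in a hedged sketch with invented names and numbers (the real logic lives in src/spesh/threshold.c): a frame only becomes a specialization/JIT candidate once it has been invoked often enough, so raising the threshold keeps code that only runs a handful of times during startup interpreted:

    #include <stdint.h>

    #define HYPOTHETICAL_SPESH_THRESHOLD 100  /* invocations before considering spesh */

    typedef struct {
        uint32_t invocations;
        int      specialized;
    } FrameStats;

    static int should_specialize(FrameStats *fs) {
        fs->invocations++;
        return !fs->specialized
            && fs->invocations >= HYPOTHETICAL_SPESH_THRESHOLD;
    }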
jnthn timotimo: If you have a chance to kick off an easy perl6-bench run, could be good to check that this doesn't make things worse there.
timotimo OK 21:08
jnthn hold on a mo
dalek MoarVM: a752064 | jnthn++ | src/spesh/osr.h:
Bump OSR threshold also.

Cuts the number of OSRs done during Rakudo startup from 4 to 2.
21:09
jnthn timotimo: That one might as well be included too :)
timotimo k 21:10
21:22 colomon joined 21:34 FROGGS[mobile]2 joined
timotimo jnthn: is going to happen 21:40
starting the benchmarks right now 21:41
jnthn Very happen :) 21:44
timotimo++
21:50 tgt joined, tgt left 21:57 travis-ci joined
travis-ci MoarVM build passed. jnthn 'Bump OSR threshold also. 21:57
travis-ci.org/MoarVM/MoarVM/builds/55861712 github.com/MoarVM/MoarVM/compare/e...52064501a7
21:57 travis-ci left
jnthn Travis seems a little confused... 22:00
I got emails saying both of my last commits got the build passing again :) 22:01
timotimo guess what, jnthn 22:33
SUMMARY SCORE 108.5 100.0
^- after tuning gets a better score
t.h8.lv/p6bench/2015-03-25-threshold_tuning.html 22:34
i wouldn't have thought the improvements would be this noticable 22:36
oh, wait
derp.
i compared moarvm-2013.03 vs moarvm-master
that's rather dumb
let's try that again... 22:37
jnthn d'oh :) 22:40
timotimo a full-on refresh will give you the new data 23:02
jnthn How on earth has zero got worse, but hello got better?! :) 23:07
I wonder why while_hash_set got notably better 23:08
Same with for_hash_set. 23:09
timotimo: Thanks. I think seeing this, I'm a bit curious about a couple of things, but the change can stay. 23:19
jnthn gets some much needed rest
timotimo maybe we've been occupying slots for any given callsite/target combination with stuff that's only useful in the very beginning of startup and that wastes some time during the "real" program run? 23:20
i don't really know how all that works :)
23:52 colomon joined