github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
00:30 patrickz left 01:13 AlexDaniel left 02:20 Kaiepi joined
MasterDuke .ask timotimo i just tried profiling lizmat's code in github.com/rakudo/rakudo/issues/1421 and it's saying `push SETTING::native_array:682` isn't getting jitted. but a spesh log just shows two `# expr bail: Cannot handle DEOPT_ONE (ins=sp_guard)` for the after of that function. any idea what's up? 02:34
yoleaux MasterDuke: I'll pass your message to timotimo.
03:19 dalek left 04:26 Kaiepi left 06:22 Kaiepi joined 06:27 domidumont joined
timotimo does it run into any deopts? 06:30
06:32 brrt joined
brrt good * #moarvm 06:34
timotimo MasterDuke: liz' version of the sieve for 5mil takes 13s on my machine, 2.3 on hers? i'm a little surprised 06:36
brrt timotimo: you just have a weak machine, obviously 06:37
:-P
timotimo clearly 06:38
brrt istr you were looking for a new laptop recently
timotimo yeah, this is my desktop though 06:40
brrt haven't had a desktop in years
decades, maybe 06:41
timotimo i like to play video games every now and then, and that includes ones that require some gpu power :)
i've never had a laptop with an integrated discrete gpu 06:42
06:42 domidumont left
tadzik I used to only use my laptop for work and my desktop for gaming, until one day I tried my desktop for work and noticed how much faster everything is 06:45
you really don't see how slow things are on your machine until you see it on some other
timotimo mhm
also, lots of ram tends to eat a lot of battery 06:46
or at least that was the justification for not putting more than whatever-many-that-was into that one apple laptop at some point or other
(as you can see, my claim is backed up by thorough research)
tadzik :) 06:47
timotimo though that was before ddr4 06:48
i think?
is hack acting up again? :( 06:56
yup. 06:57
06:59 Geth left, synopsebot left, dalek joined, Geth joined, synopsebot joined, p6lert joined 07:03 domidumont joined 07:21 brrt left 07:33 Kaiepi left 07:42 patrickb joined 07:56 brrt joined 08:21 zakharyas joined
lizmat timotimo: that version now takes 4.8 seconds on my machine 08:29
08:46 brrt left
timotimo resumes the afl-fuzzy run 09:02
"Oops, the program crashed with one of the test cases provided.", d'oh 09:18
09:30 Kaiepi joined 09:51 brrt joined 10:05 Kaiepi left, domidumont left
Guest2775 lizmat: so performance has regressed then 10:14
lizmat that would be a conclusion, yes
Guest2775 I wonder when that regression showed up 10:17
lizmat too
Guest2775 snoops around a little 10:18
lizmat: there's a large perf drop somewhere between 2018.03 - 2018.06. Will investigate a bit more 10:37
lizmat could be due to jnthn's scalar refactor work 10:38
Guest2775 we'll see in a bit (I hope) 10:39
jnthn Not sure quite how scalar refactor would affect pulling data out of a buf :)
lizmat I was merely looking at the period 10:43
10:43 brrt left 10:52 benchable6 left 10:53 benchable6 joined
Guest2775 narrowed down to the period 2018.03 -> 2018.04 10:57
11:01 zakharyas left 11:03 benchable6 left, benchable6 joined 11:44 domidumont joined 12:02 Kaiepi joined
Guest2775 lizmat, jnthn: wrt to Lizmat's sieve code. Everything is fast until commit github.com/rakudo/rakudo/commit/4b...5e6fa36cbb which caused massive slowdown. 12:25
After that things improve a bit but it never recovers back to what it was. 12:26
jnthn ah, interesting 12:31
Though probably nothing for it other than looking carefully at the generated code :)
timotimo afl found out how to get MVM_frame_vivify_lexical to drive a pointer off of a cliff 12:33
12:35 zakharyas joined
patrickb oh these visuals in my head... 12:45
timotimo Unhandled exception: Bytecode corruption: illegal sc dependency of lexical: 6656 > 3 13:05
this is now an error that we can throw
it'd of course be better to put this in the bytecode verifier
but what can i say, better than segfaulting 13:06
Geth MoarVM: c8a44b8d1b | (Timo Paulssen)++ | src/core/frame.c
make sure invalid indices for lexical vivify throw

rather than segfaulting with an invalid object later on
this would be better in the bytecode validator, though.
13:12
timotimo afl found out how to infinitely recurse autoclose 13:47
maybe it made a frame that has itself as its outer 13:48
13:52 domidumont1 joined 13:54 domidumont left
timotimo nqp: nqp::setelems(nqp::clone(nqp::bootintarray), 100000) 14:06
m: use nqp; nqp::setelems(nqp::clone(nqp::bootintarray), 100000)
evalable6 (signal SIGSEGV)
timotimo m: use nqp; nqp::elems(nqp::bootintarray) 14:07
evalable6
timotimo ^- reading invalid memory
looks like none of the repr ops in VMArray do a concreteness check, and neither do the ones in the interpreter 14:08
needless to say, afl is having a field day :) :)
i bet it changed a create op into that clone op and got a type object instead of a concrete object there 14:09
jnthn: should concreteness checks go into the interpreter? or into the repr ops themselves? 14:11
14:23 domidumont joined 14:26 domidumont1 left
jnthn timotimo: Interp preferably, because then when we spesh/JIT them we can remove those 14:55
Though I want to try and restructure things to make that lot much more regular
(But that won't happen extremely soon) 14:56
14:58 brrt joined
brrt ohai #moarvm 15:04
jnthn o/ brrt 15:05
brrt \o jnthn 15:31
I'm going to ask a question that I already kind of know the answer to... 15:32
but first, context
you know how a cpu can have multiple register classes 15:33
e.g. for floating point operations, vector operations, integer operations
jnthn *nod*
brrt let's ignore, for the time being, the legacy x87 FPU register stack
So, I've decided to map each register to a separate integers, and to map register classes to integer ranges 15:34
Practically, that's implemented by making a set (i.e., a bitmap) of registers 15:35
Now, I have a bunch of operations that are (potentially) polymorphic on the register class
things like loading and spilling, and selecting the spare register
I have a design choice to make: 15:36
- I can make the register allocator pick the right operation out of a set (e.g., a spill-from-xmm, load-to-gpr, etc)
- I can make the register allocator pick a 'spill' operation and have that operation be polymorphic 15:37
- And then, I can make that polymorphism based on an extra paramter (register class), or, I can derive the register class from the register number (because each register class maps to a separate range) 15:38
the path of least resistance is the last of these options
I'm not sure that makes me very happy though.
Ideas?
I need to decommute... speak you later :-) 15:41
15:42 brrt left 15:47 patrickb left 15:56 patrickb joined 16:07 zakharyas left 16:09 domidumont left 18:18 lucasb joined
timotimo huh. atkey_o returns null if it's called on a null object 18:52
afl managed to set a static frame's outer to a null pointer and then invoke it, leading to a nice little null deref 19:06
Unhandled exception: When invoking 6 '', provided outer frame 0x6b6450 (5 '<mainline>') does not match expected static frame (nil) ( '<anonymous static frame>') 19:11
much better
hah, we check for the current read offset of the serialization reader being above the end, but not whether the offset is negative 19:15
whoa 19:34
==20196== Invalid read of size 8
==20196== at 0x513155F: go_to_first_inline (frame_walker.c:74)
==20196== Address 0x80 is not stack'd, malloc'd or (recently) free'd
(large parts of the backtrace omitted) 19:35
it's called via getdynlex 19:36
the frame walker has both cur_caller_frame and cur_outer_frame nulled 19:39
tc->cur_frame->caller is null; i wonder what afl did to make that so
Geth MoarVM/master: 5 commits pushed by (Timo Paulssen)++ 20:05
20:07 patrickb left
timotimo haha, d'oh, i had my fuzzing stuff mirrored to /tmp, then i installed updates and rebooted 20:07
timotimo starts afl over with the new binary 20:09
20:11 patrickb joined
jnthn brrt: Hmm, if the register allocator picks a polymorphic spill operation, will that not cause issues in so far as the register allocator wants to understand precisely which registers are going to be involved? 22:00
brrt: Maybe for spill not, but for other polymorphic operations, maybe yes
MasterDuke Guest2775++ for bisecting the sieve slowdown 22:30
timotimo yeah, it looked like there was a correctness/speed trade-off and we didn't get the speed impact flattened since
22:33 AlexDaniel joined
MasterDuke timotimo: btw, did you see my question about profiles and spesh logs from last night? 22:34
timotimo yeah, i didn't see what was wrong there 22:37
MasterDuke you mean you didn't understand my question? or did, but don't have an idea of the answer? 22:38
timotimo i tried the code myself and couldn't figure out what happened
MasterDuke ah, you also got the 0% jitted with no indication why? 22:40
timotimo yeah 22:42
MasterDuke huh
think i should file an issue? 22:43
timotimo perhaps, yeah
MasterDuke btw, now that perl6 is an executable, not a wrapper shell script, does that change anything about using afl on it? 22:46
timotimo don't think it'd change much 22:47
ideally we'd build a harness that'd allow us to execute up to a certain point, and then fork for new input
that'll get startup down by a bit 22:48
maybe even eval a few lines of code before forking to warm up spesh on some compiler parts
MasterDuke github.com/MoarVM/MoarVM/issues/1096 22:50
timotimo we should hook up synopsebot to also grab issue titles and status for links like those ^
(though it should probably not repeat the url then)
MasterDuke MVM_SPESH_NODELAY should help with that, right?
timotimo in theory yes, but once the actual programs start, it'd be more troublesome from a performance standpoint 22:51
we don't yet have something to turn that on or off maybe? shouldn't be hard if you're already writing that harness thing, though
MasterDuke github.com/perl6/synopsebot/issues/2 22:52
timotimo perhaps first try getting nqp to work in afl, though
23:47 lucasb left