nwc10 "good" *, #moarvm 08:50
more ASAN - paste.scsys.co.uk/565293 t/spec/S17-promise/nonblocking-await.t
jnthn Arse. In that there's a release next week so maybe I should look at these but I really, really don't want to be distracted yet again from working on hyper/race, which stuff always drags me away from. :/ 09:07
nwc10 there's also ASAN in repl.t (yesterday's paste) but it might be the same cause 09:08
sorry to be the bearer of bad news (and not really able to help fix it)
brrt good * #moarvm
yoleaux 9 Oct 2017 17:30Z <MasterDuke> brrt: nice talk. now which templates are the most needed?
brrt MasterDuke: thanks :-). ehm, depends on the testcase, but i think i can whip up a quick script 09:09
timotimo FFS, locally my moarvm segfaults in NFA's unmanaged_size, but only with --optimize=3 14:23
so in gdb i hardly see anything 14:24
timotimo and putting an fprint there, it won't crash 14:34
timotimo just cuts out the part that's explodey and goes on with life 14:44
jnthn Seems that at least the new semaphore.t crash goes away on reverting 4092ccebcac7bf9848029d068ed361235cab2d8f
timotimo "ubnlocked" :)
jnthn At first I thought "hmm, condvar not detecting spurious wakeup?" but it's actually doing it in a loop already 14:45
oh wait, grr 14:49
I blew away the GC stressing while reverting, I think
timotimo you know ... now that our nurseries grow dynamically anyway, we might get away with enabling gc stressing at run time 14:51
jnthn Indeed... 14:52
It was that
So, wasn't that commit at all
timotimo gist.github.com/timo/7fc21c4aedd16...8bd98d7b9a - i wonder ... this doesn't look like the thing nwc saw 14:55
jnthn Ugh, the crashes are all over the map 15:02
timotimo on the other hand, on my system the unmanaged_size function reliably segfaults and i see no reason for it to do that 15:03
maybe my system's in some sort of bad state
timotimo huh, i had a commit on my master branch that does nothing but jit MVM_decode 15:07
and without that the crash from the mutex in nonblocking-await goes away?!?
brrt oh
oops
jnthn timotimo: I still have a crash in that locally 15:09
Without any local patches
Doesn't occur every time 15:10
timotimo OK, but probably requires gc torture?
or stress or what
jnthn Yeah 15:11
timotimo i just recompiled with asan
ugexe btw PR#719 should -only- be used with libuv 1.15.0 or higher. uv_fs_copyfile didn't exist pre 1.14, and didn't work right until 1.15. so if a user supplies their own libuv during Configure it probably needs a version check 15:12
ugexe there is a libuv PR to add copy on write option to uv_fs_copyfile (github.com/libuv/libuv/pull/1491) that looks like it will land in the next release 15:21
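
A minimal sketch in C of the kind of version check ugexe suggests for a user-supplied libuv. It assumes the check compares a packed version number laid out like libuv's UV_VERSION_HEX (major << 16 | minor << 8 | patch). The helper names pack_uv_version and copyfile_is_usable are hypothetical; they are not part of libuv or of MoarVM's Configure.

```c
#include <assert.h>
#include <stdbool.h>

/* Pack a version the same way libuv's UV_VERSION_HEX does:
 * (major << 16) | (minor << 8) | patch. Hypothetical helper. */
static unsigned int pack_uv_version(unsigned int major,
                                    unsigned int minor,
                                    unsigned int patch) {
    assert(major < 256 && minor < 256 && patch < 256);
    return (major << 16) | (minor << 8) | patch;
}

/* Per the discussion: uv_fs_copyfile did not exist before 1.14 and
 * did not work right until 1.15, so require at least 1.15.0. */
static bool copyfile_is_usable(unsigned int version_hex) {
    return version_hex >= pack_uv_version(1, 15, 0);
}
```

In a real build the value passed to copyfile_is_usable would presumably come from UV_VERSION_HEX at compile time or from uv_version() at run time.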
dogbert2 jnthn: sorry for reporting a lot of bugs 15:31
jnthn They may all work out to be the same thing
Looks like at some point we have a ->caller that is outdated 15:32
And all kinds of things go bad from that
timotimo that'd be bad, yeah 15:33
jnthn Yeah, sticking a fromspace check in remove_one_frame does it 15:38
Geth MoarVM: f9975f8893 | (Jonathan Worthington)++ | src/core/frame.c
An extra GC debug assert in remove_one_frame

Catches some problems with an outdated ->caller cropping up
15:40
timotimo Unhandled exception in code scheduled on thread 4 15:52
continuationinvoke expects an MVMContinuation
that's a new failure mode. interesting.
but i need that newest commit
brrt i need some help in deciding my next course of action
brrt i can focus on: 15:53
brrt - eliminating the 'dynamic label markers' using a stack walker 15:53
- starting the optimizer
- improving tiling by having an iteration order we can apply per node type 15:54
timotimo jnthn: with a running valgrind, we can put a command into our c code that finds memory locations that point at a given memory location; can you imagine a good spot where that'd be useful?
brrt hmm, i'm wondering if the iteration order thing would even work
jnthn brrt: Expanding the range of ops/tiles covered by the expr JIT would also be a worthwhile task, I guess 15:55
timotimo but brrt wants *others* to contribute that :D
jnthn brrt: In that, that'll get it used more, meaning that the optimizer will be tested better
Yeah, fair point :) 15:56
brrt well, i'm not against doing that myself
:-)
hmmm 15:57
timotimo is there still much to be gained from tuning cost functions or something?
brrt in the expr jit?
well
i need to calculate the cost of tiles more properly
but that's a difficultish job
timotimo i seem to recall something about that from a long time ago
brrt and it's … not really super essential in that the current thing mostly work 15:58
*works
timotimo fair enough
will optimizations happen on the same level the graphviz output lives at? 15:59
or the same level we have these list expressions at?
if someone were so inclined, they could build tiles for our bigint ops that implement fast paths for smallbigint arithmetic 16:01
the first one would be the hardest, as it'd have to have macros for "is smallbigint" and for properly storing the result 16:03
and perhaps building versions of the arithmetic that can assume parameters are already real big ints 16:04
hm. not sure if there's really much to be gained from making smaller functions for that
Geth MoarVM: ac471af4ba | (Jonathan Worthington)++ | 4 files
Further GC debug checks to help find ->caller bug
16:16
timotimo could anything be wrong with the special return mechanism? 16:19
jnthn Don't think so
It seems to never crash with spesh disabled 16:20
Disabling inline makes no difference
timotimo you're running nonblocking-await.t? 16:21
jnthn Yeah
oh wow
That's an interesting clue if I got it right 16:22
MVM_SPESH_LIMIT=1 and it still goes bad
timotimo was it in some paste that showed that extra_data was being use-after-freed? 16:23
jnthn Maybe
Hm, so if MVM_SPESH_LIMIT is 1 then it's never actually installing any specializations 16:24
Well, I guess it's installing 1 of them
timotimo do we keep collecting logs when spesh limit was reached? 16:26
jnthn yeah 16:27
Thus my wondering if it's somehow about logging
timotimo right
timotimo with the headache i'm cultivating, i won't be of much use ... but i better not keyboard any more today anyway :\ 16:46
Geth MoarVM: 895b217230 | (Jonathan Worthington)++ | src/core/interp.c
Missing MVMROOT of return value

Spesh logging may allocate a new log. This could cause us to return outdated return values.
16:48
jnthn Good news: that decreases the number of failure modes
Bad news: it doesn't decrease them to zero, so there's still something not right
Geth MoarVM: 49e5882510 | (Jonathan Worthington)++ | src/core/frame.c
Check caller chain before promotion

So if there's a problem after promotion, then we know for sure that it was the promotion code that caused it.
16:51
MoarVM: 29250d60f1 | (Jonathan Worthington)++ | src/spesh/log.c
Fix possible access to moved object body

Could happen only on spurious wakeup of the condvar, and only in a codepath that happens with MVM_SPESH_BLOCKING=1 set, so this would not impact normal use. Still, good to fix it.
MoarVM: dc9a338b23 | (Jonathan Worthington)++ | src/core/frame.c
Ensure ->caller/->static_info can't get outdated

Recent changes moved ->caller and ->static_info to be set up earlier, to make sure a frame has them. However, in the case that we have a frame allocated on the callstack rather than the heap, then MVMROOT of that frame does nothing. This is normally fine; a frame not on the heap will be hanging off something's `tc->cur_frame` and be marked if GC is triggered. But in this particular case, we're making a new frame, and it's not yet in that chain, so the two could go unmarked and thus become outdated when the spesh log got sent.
17:06
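
The failure mode behind these commits (a pointer that is not rooted becomes outdated once the GC moves the object) can be illustrated with a toy two-space copying collector. This is a sketch under stated assumptions, not MoarVM's GC: ToyGC and the toy_* functions are invented for illustration, and toy_in_fromspace only mimics the idea of the fromspace debug check mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SPACE_SIZE 8
#define MAX_ROOTS  8

typedef struct { int value; } Obj;

/* Toy collector state; every name here is hypothetical. */
typedef struct {
    Obj    spaces[2][SPACE_SIZE];
    int    active;              /* space that holds live objects */
    size_t used;                /* slots used in the active space */
    Obj  **roots[MAX_ROOTS];    /* addresses of rooted pointer variables */
    size_t num_roots;
} ToyGC;

static Obj *toy_alloc(ToyGC *gc, int value) {
    assert(gc->used < SPACE_SIZE);
    Obj *o = &gc->spaces[gc->active][gc->used++];
    o->value = value;
    return o;
}

/* Register a pointer variable so the collector can update it. */
static void toy_root(ToyGC *gc, Obj **slot) {
    assert(gc->num_roots < MAX_ROOTS);
    gc->roots[gc->num_roots++] = slot;
}

static void toy_unroot(ToyGC *gc) {
    assert(gc->num_roots > 0);
    gc->num_roots--;
}

/* Copy each rooted object to the other space and fix up the rooted
 * variable. Pointers that were never rooted are left untouched. */
static void toy_collect(ToyGC *gc) {
    int to = 1 - gc->active;
    size_t next = 0;
    for (size_t i = 0; i < gc->num_roots; i++) {
        Obj **slot = gc->roots[i];
        gc->spaces[to][next] = **slot;
        *slot = &gc->spaces[to][next++];
    }
    gc->active = to;
    gc->used = next;
}

/* Debug check: does this pointer still point into the old space? */
static bool toy_in_fromspace(const ToyGC *gc, const Obj *o) {
    const Obj *from = gc->spaces[1 - gc->active];
    return o >= from && o < from + SPACE_SIZE;
}
```

If one pointer to an object is rooted and a second copy of that pointer is not, then after toy_collect the rooted one points at the moved object while the unrooted one still points into fromspace, which is exactly what a fromspace assert is meant to catch.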
jnthn I think those commits fix all the issues dogbert2 and nwc10 reported today-ish 17:08
timotimo nice! 17:12
jnthn oops, what time...
jnthn goes home to make dinner
o/
timotimo hope it turns out well!
dogbert2 jnthn+++ 17:21
travis-ci MoarVM build errored. Jonathan Worthington 'Further GC debug checks to help find ->caller bug' 18:01
travis-ci.org/MoarVM/MoarVM/builds/286116404 github.com/MoarVM/MoarVM/compare/f...471af4ba72
Zoffix Travis glitches on two jobs. Nothing to do with our code. 18:03
Geth MoarVM: jsimonet++ created pull request #726:
Typo symobls -> symbols
19:39
MoarVM: e4a9b84907 | (Julien Simonet)++ (committed using GitHub Web editor) | docs/jit/overview.org
Typo
19:52
MoarVM: 599eb46642 | (Zoffix Znet)++ (committed using GitHub Web editor) | docs/jit/overview.org
Merge pull request #726 from jsimonet/patch-1

Typo symobls -> symbols
lizmat good *, #moarvm 20:02
yoleaux 17:45Z <Zoffix> lizmat: do you know if the `:view` can be removed? It doesn't appear to be used in core, undocumented, and unspecced. I'm making the method shove Nils into holes to resolve RT#132261 and user would still get Mus with `:view` arg. github.com/rakudo/rakudo/commit/e9...d32736c739
synopsebot RT#132261 [resolved]: rt.perl.org/Ticket/Display.html?id=132261 Unclear what a hole in a List is
lizmat Zoffix: will get back about that
so what is the best way to see if something is Callable in nqp proper ?
Note: Callable doesn't exist there (afaik) 20:03
lizmat jnthn: a thought re startup times 21:34
timotimo ears twitch
lizmat would it make sense to start the spesh thread let's say after 500 msecs ? 21:35
if the program is done within 500 msecs, spesh probably wouldn't have made much difference anyway
timotimo hmm, i wonder
if the compiler part takes a bunch of time it'll be worth having it in
like, having spesh be active for that duration 21:36
with the way logging now works we no longer keep frames around if they started logging and then were never run again 21:37
lizmat well, it feels to me that we may be doing too much at startup that may not be needed for short running programs
timotimo but specialized frames that don't get called any more after some point will still stick around; shouldn't have a big impact on run time, though
since spesh runs on its own thread, it only increases cpu time, not wallclock time 21:38
lizmat ok, lemme put it this way: for me, bare startup is around 136 msecs atm
about 16 msecs of that is running the mainline of the setting 21:39
what is it doing the other 120 msecs (wallclock!) ?
also: sometimes (about 1/30 times) I have a startup of 110 msecs 21:40
timotimo how does disabling spesh change the wallclock on your end?
lizmat why would that be? maybe something didn't get started and therefore didn't have to be stopped ?
a few msecs at most
timotimo the telemetry log can show you how things behave wrt GC pauses 21:41
like, worst case spesh could kick in right before other threads want to GC, then it'll do its work before it allows the GC to happen
spesh is generally rather fast, though, and it checks for gc in between every little job
(have not actually measured, just my gut feeling)
lizmat ok, well, it was just an idea.. :-) 21:42
timotimo i get around 130% cpu with spesh on and 98% with spesh off
lizmat so: what do you think happens in those 120 msecs ? Is there a way to find out ?
timotimo 0.1 seconds elapsed either way
we can callgrind it to see what internal functions spend how many cpu cycles
lizmat ok, will keep that in mind 21:43
meanwhile, /me is going to get some shuteye :-)
timotimo good night!
lizmat good night, #moarvm!
timotimo perhaps a little optimization in the bytecode validator could pay off 21:44
Zoffix \o 21:45
timotimo it's not very expensive, but MVM_proc_getenvhash gets called 6.3k times
that spends a bunch of time doing utf8-c8 decoding
timotimo runs again without spesh 21:46
timotimo might want to cache the result of getenvhash in a few places, or maybe even inside moarvm 21:49
timotimo pretty much all the calls to getenvhash come from nqp's moduleloader where it looks if there's an NQP_LIB env var 22:06
i'll now compare Ir counts with what it looks like when i cache that hash inside the moduleloader
timotimo it's really worth a lot 22:09
463,878,397 before the change, 405,717,806 after
it doesn't translate 1:1 to time, of course 22:10
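
To put the two Ir counts above in proportion, here is a small C sketch of the arithmetic. The function names ir_saved and ir_relative_reduction are made up for this illustration; only the two counts come from the log.

```c
#include <assert.h>
#include <stdint.h>

/* Instructions saved between two callgrind Ir counts. */
static uint64_t ir_saved(uint64_t before, uint64_t after) {
    assert(after <= before);
    return before - after;
}

/* Fraction of the original instruction count that was removed. */
static double ir_relative_reduction(uint64_t before, uint64_t after) {
    assert(before > 0);
    return (double)ir_saved(before, after) / (double)before;
}
```

Called with 463878397 and 405717806, this gives 58,160,591 fewer instructions, or about 12.5% of the original count. As noted, that is an instruction count and not a wallclock saving.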
a really big portion of startup is still deserialization 22:14
jnthn Huh, I was sure the env hash was already cached o.O 22:18
timotimo:
timotimo++ for catching that one
timotimo we'd want to cache it in the instance?
currently i'm caching it in nqp hll code
jnthn Yeah, in instance should do it 22:19
timotimo should we ever invalidate that hash?
jnthn Can the environment ever change "from the outside"?
timotimo if you nativecall to setenv 22:20
but user code will always be looking at %*ENV
rather than nqp::getenvhash
jnthn OK, well, if you're doing that, you can nativecall to getenv so far as I care :P
Right
And that's 'cus setenv/getenv are, iirc, a multi-threading disaster
timotimo yeah, ugh.
i don't know how to get a more accurate wallclock measurement from perl6 startup 22:25
timotimo caching it in moarvm gets the count down to 403,548,367 22:28
anyway, it's like two strings for each env var and another for the = we're searching for each time 22:29
so we're also pushing back the first GC run a little bit
jnthn Nice :) 22:30
timotimo hm. it might be beneficial for the spesh thread to check some "is moarvm currently quitting?" flag or something? 22:31
here i see "moarvm teardown" at 316857586 ticks, but spesh worker finishes up at 334677178
m: say (334677178 - 316857586) / 3392386327 22:32
camelia 0.00525281919
timotimo so 0.005 seconds? that's not so terrible i guess
MasterDuke look, i'm a very important person and i need those 0.005s! 22:33
timotimo you truly are
Geth MoarVM/cache_env_hash: f2deedfc96 | (Timo Paulssen)++ | 3 files
cache the env var hash after first access

turns out the nqp module loader asked for the whole hash to maybe find an NQP_LIB var about 6k times during bare startup of rakudo.
22:39
timotimo should be fine, right? even with concurrent first accesses it shouldn't crash? 22:40
the threads will just race to install it and whichever wins doesn't get cleaned up by gc later? 22:41
jnthn Yeah
We do that trick in a few places
timotimo good
jnthn So long as you only install it once it's fully populated
And it's immutable from then on 22:42
Then it's fine
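
The "race to install" trick described here can be sketched in C11 roughly as follows. This is an illustration under assumptions, not the code from the commit: EnvSnapshot, build_snapshot and get_env_cached are hypothetical names. Because plain C has no GC to clean up after the loser of the race, the sketch frees the losing copy by hand.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical immutable snapshot of "KEY=value" strings. */
typedef struct { size_t count; char **entries; } EnvSnapshot;

static _Atomic(EnvSnapshot *) cached_env = NULL;

static char *copy_string(const char *src) {
    size_t len = strlen(src) + 1;
    char *dst = malloc(len);
    assert(dst != NULL);
    memcpy(dst, src, len);
    return dst;
}

/* Build the whole snapshot privately before anyone else can see it. */
static EnvSnapshot *build_snapshot(char *const *env) {
    size_t n = 0;
    while (env[n] != NULL) n++;
    EnvSnapshot *s = malloc(sizeof *s);
    assert(s != NULL);
    s->count = n;
    s->entries = malloc((n + 1) * sizeof *s->entries);
    assert(s->entries != NULL);
    for (size_t i = 0; i < n; i++) s->entries[i] = copy_string(env[i]);
    s->entries[n] = NULL;
    return s;
}

static void free_snapshot(EnvSnapshot *s) {
    for (size_t i = 0; i < s->count; i++) free(s->entries[i]);
    free(s->entries);
    free(s);
}

/* Return the cached snapshot, installing one on first access.
 * Concurrent first callers may each build a copy; only one wins. */
static EnvSnapshot *get_env_cached(char *const *env) {
    EnvSnapshot *existing = atomic_load(&cached_env);
    if (existing != NULL) return existing;

    EnvSnapshot *fresh = build_snapshot(env);  /* fully populated first */
    EnvSnapshot *expected = NULL;
    if (atomic_compare_exchange_strong(&cached_env, &expected, fresh))
        return fresh;                          /* we won the race */

    free_snapshot(fresh);                      /* we lost; discard ours */
    return expected;                           /* winner's snapshot */
}
```

The two conditions jnthn names map onto the sketch directly: the snapshot is published only after build_snapshot has finished, and nothing writes to it afterwards.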
timotimo there's the merge
Geth MoarVM: f2deedfc96 | (Timo Paulssen)++ | 3 files
cache the env var hash after first access

turns out the nqp module loader asked for the whole hash to maybe find an NQP_LIB var about 6k times during bare startup of rakudo.
timotimo well, it's not really immutable; you can still bindkey and such to the result of nqp::getenvhash
but since it doesn't have an effect otherwise, why would you. and if you do, you're on your own i guess?
jnthn Yeah 22:43
timotimo a nice little win.
jnthn Indeed
timotimo++
timotimo wonder how many msecs it'll turn out to be on liz' machine
(not enough to satisfy us, of course) 22:44
timotimo out of the 445,664,567 Ir for a -e 'say "hi"', about 131,290,357 is under MVM_6model_get_how, i.e. causing metaobjects to be deserialized 22:46
timotimo in total, work_loop is responsible for 245,266,376 Ir 22:47
gets invoked 1.5k times
i think we run work_loop once every time we want to have an object and need to get deserializin'?
jnthn Yeah 22:48
1.5k things is a lot less than all the things :)
timotimo OK, how about MVM_validate_static_frame: responsible for 50,380,527 Ir; called 1,843 times in total
does it sound bad that we call "repossess" 438 times? 22:49
jnthn It might be interesting to see which types we need the HOW
timotimo i can surely instrument that
jnthn Easy thanks to debug_name
timotimo yay 22:50
jnthn I've poked at the bytecode validation a few times and struggled to see any more obvious speedups there 22:50
But if you can find them, then that's surely a good thing 22:51
timotimo hm. 6model_get_how calls sc_get_object for these: gist.github.com/timo/9d90da147191d...7e52d4bcfa
added a file for total times get_how is called 22:52
unsurprisingly with -e '' we have no s1, h1, f1, c1, or a1, and no qq or b1 either
and no Perl6::QGrammar+* 22:53
jnthn gist.github.com/timo/9d90da147191d...e1-txt-L43
Where'd the hi come from? :)
timotimo that's -e 'say "hi"' for you
jnthn Oh 22:54
I thought you meant -e ''
timotimo nah, but i can put the data in there for that, too
jnthn Oh, right, you were comparing it with the stuff in the gist :)
timotimo yup
jnthn I'm a bit surprised by some of them
NQPFileHandle for example 22:55
timotimo the very first how we ask for
jnthn Oh, we could log the backtraces of where we ask too :)
timotimo i just put an uv_hrtime before and after to see which one takes how long 22:56
though i probably want it in front for better readability 22:57
updated 22:58
not terribly surprising that Perl6::Grammar takes the longest
by far, it looks like
jnthn Lots of methods. Lots of NFAs. 23:00
timotimo i mean, we still have the thing where every NFA deserializes all its guts followed (or preceded?) by a bunch of nqp arrays with the same data again
jnthn Yeah, there's a big memory and startup time win to be had for whoever takes that task on 23:01
timotimo wasn't developer enough the last time 23:01
2 years 1 month ago
jnthn Probably one that's in need of either patience or stubbornness :) 23:02
timotimo with the way the sc work loop works it's not terribly easy to account for what object accounts for what amount of time spent deserializing once the work loop has been entered 23:05
i had a potentially silly idea in the shower today: should we have a mode (ifdefd) that saves a pointer to "which object was the first to put me into a worklist" when doing GC? 23:06
our problem is usually more that things are not put into worklists, right?
though if we encounter an object in the fromspace we'd have a pointer into the fromfromspace for an object that used to hang on to us? 23:07
i think it's clear i haven't thought this through
jnthn Hm, not quite sure I follow, but it is kinda late... 23:12
We already do trap on worklist addition of something that is in a fromspace though
(In debug mode, that is)
timotimo right 23:14
it's not clear if my idea is a win in any situation
rest time o/ 23:21
jnthn me too o/ 23:29