01:53
ilbot3 joined
02:07
vendethiel joined
04:37
AlexDaniel joined
06:18
domidumont joined
07:06
vendethiel joined
07:22
zakharyas joined
07:48
domidumont joined
|
|||
jnthn | morning, #moarvm | 08:02 | |
I couldn't reproduce the ASAN barfage that nwc10++ reported using valgrind | 08:19 | ||
But I did spot a bug where it pointed | |||
Still a load of test failures alas | 08:20 | ||
Geth | MoarVM/spesh-worker: 9abca38a03 | (Jonathan Worthington)++ | src/spesh/osr.c More robust OSR frame resize calculations. Of note, this copes with the case where we deopt, but OSR returns us to the optimized version again at a later point. |
08:23 | |
jnthn | Oh, I think that I also managed a pointer type screwup in the arithmetic of the previous code | 08:24 | |
Which this also fixes | |||
Hmm. A Rakudo built with spesh disabled passes make test even with spesh enabled. | 08:31 | ||
And a load more spectests | |||
I'm guessing that some OSR bug is causing a mis-compilation. | 08:32 | ||
08:41
robertle joined
|
|||
Geth | MoarVM/spesh-worker: 7a3ca8c4ff | (Jonathan Worthington)++ | src/spesh/osr.c Make MVM_SPESH_OSR_DISABLE work again. |
08:47 | |
MoarVM/spesh-worker: c6d586f931 | (Jonathan Worthington)++ | src/spesh/osr.c Add a way to get a simple log of performed OSR. |
|||
jnthn | Looking at spectests that explode, it seems that the issue is indeed that when we enter OSR'd code, we may be after guards which are then used | 08:48 | |
This should have been a problem in the past always however | |||
So I can only assume we're somehow doing a lot more OSR now | 08:49 | ||
And got lucky enough before now | |||
lizmat | .oO( moar OSR is good for you ) |
09:14 | |
jnthn: fwiw, looking at --profile output, in many cases code that *is* OSR'd according to MVM_JIT_LOG, is not marked as OSR'd with --profile | 09:15 | ||
this is on HEAD, not your branch, BTW | |||
jnthn | Hm, curious | 09:20 | |
Geth | MoarVM/spesh-worker: b5f2b4b77e | (Jonathan Worthington)++ | src/spesh/graph.c Fix bad guard assumptions around OSR points. Prior to this, it was possible that a condition checked before an OSR point would always hold after it. This was not a good assumption, as at the point OSR happens the current value in the interpreted code may not have met the guard. ... (5 more lines) |
09:39 | |
jnthn | The good news is that fixes a lot. | 09:41 | |
The bad news is that it doesn't fix the mis-compile of Rakudo | 09:42 | ||
Worse, spectest seems to pass cleanly | |||
With a Rakudo compiled without OSR, and then OSR enabled | |||
So at the moment the haystack is "all of Rakudo" :/ | 09:43 | ||
Apparently, CORE.setting can be excluded though | 09:44 | ||
nine | That doesn't leave all that much anymore? | 09:45 | |
jnthn | nine: Only all of the grammar/actions/world :) | ||
09:53
domidumont joined
|
|||
Geth | MoarVM/spesh-worker: 1d799dd4c7 | (Jonathan Worthington)++ | src/spesh/stats.c Detect another case of incomplete type tuples. When we have a container type, but no information on its contents. |
10:08 | |
10:18
dogbert17 joined
|
|||
samcv | jnthn, what's OSR stand for? | 10:23 | |
jnthn | On Stack Replacement | ||
samcv | ah ok | 10:24 | |
jnthn | It's where you're in a hot loop, and produce an optimized version of that code, and then replace the version currently running "on the call stack" with the optimized version | ||
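Editor's sketch: the description of OSR above can be modelled with a toy loop. This is not MoarVM code; `HOT_THRESHOLD`, `slow_step`, `fast_step` and `run_loop` are hypothetical names, and the "optimized version" is just a simpler Python function. The point it tries to show is that the live state of the loop is carried over at the switch rather than the loop being restarted.

```python
# Toy model of on-stack replacement (OSR); not MoarVM code.
# All names here are hypothetical and chosen for illustration only.

HOT_THRESHOLD = 100  # iteration count after which the loop counts as "hot"


def slow_step(total, i):
    """Stand-in for running one loop iteration in the generic interpreter."""
    return total + int(str(i))  # deliberately roundabout


def fast_step(total, i):
    """Stand-in for the optimized version of the same iteration."""
    return total + i


def run_loop(n):
    """Sum 0..n-1, swapping in the fast step once the loop is hot.

    The live state (total, i) is kept as-is at the switch, which is the
    "replacement on the stack": the loop continues from where it was.
    """
    step = slow_step
    total = 0
    osr_at = None  # iteration at which the swap happened, if any
    for i in range(n):
        if step is slow_step and i >= HOT_THRESHOLD:
            step = fast_step  # OSR point: switch versions mid-loop
            osr_at = i
        total = step(total, i)
    return total, osr_at
```

In this sketch a short loop never reaches the threshold and stays on the slow path, while a long one switches once and finishes on the fast path with the same result it would have had otherwise.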
hah, found somehting | 10:39 | ||
*something | |||
nine | Memory used in 8649 for localhost:3000/search_results: 6092 KiB | 10:52 | |
Geth | MoarVM/spesh-worker: ec1ed15fbf | (Jonathan Worthington)++ | src/spesh/osr.c More detailed logging of what OSR does. |
11:02 | |
MoarVM/spesh-worker: 21122e1aab | (Jonathan Worthington)++ | 2 files Make native references OSR-safe. Previously, they could end up pointing into freed memory. This is not a new bug, just one that recent spesh refactors have brought to the surface. |
|||
11:03
lizmat joined
|
|||
jnthn | And the mis-compile remains | 11:04 | |
This is gonna be tedious, I fear. :S | 11:06 | ||
dogbert17 | jnthn: I could build rakudo with your branch, am I missing something? | 11:13 | |
jnthn | dogbert17: make test? | ||
dogbert17 | one test failed | 11:14 | |
t/04-nativecall/21-callback-other-thread.t | |||
jnthn | Huh, odd | ||
(that others don't fail too, I mean) | 11:15 | ||
You may want to try building with MVM_SPESH_BLOCKING=1 which may un-hide it | |||
I'm getting failures in most NativeCall tests and some spectests | 11:16 | ||
dogbert17 | perhaps this might be of interest: gist.github.com/dogbert17/dffb56a8...65e68bb9da | ||
jnthn | That one fails 'cus it's missing a patch from master :) | ||
dogbert17 | from master | 11:17 | |
so where does MVM_SPESH_BLOCKING hide | |||
jnthn | It's an env var | ||
We do specializations on a background thread | 11:18 | ||
This means that a bug may hide or show up depending on timing | |||
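Editor's sketch: why a background specializer makes bugs timing-dependent, and why a blocking mode helps reproduce them. This is a toy Python model, not MoarVM's implementation; `ToyVM` and its methods are hypothetical, and it assumes that a blocking mode simply makes the requesting thread wait until the worker has installed the specialization.

```python
import queue
import threading


class ToyVM:
    """Toy VM whose 'specializations' are installed by a worker thread."""

    def __init__(self, blocking):
        self.blocking = blocking
        self.optimized = {}            # name -> specialized callable
        self.requests = queue.Queue()
        worker = threading.Thread(target=self._work, daemon=True)
        worker.start()

    def _work(self):
        # Worker loop: install each requested specialization, then signal.
        while True:
            name, func, done = self.requests.get()
            self.optimized[name] = func
            done.set()

    def request_specialization(self, name, func):
        """Ask the worker to install func; wait for it only if blocking."""
        done = threading.Event()
        self.requests.put((name, func, done))
        if self.blocking:
            done.wait()  # deterministic: installed before we continue
        return done

    def call(self, name, generic, *args):
        """Run the specialized version if installed, else the generic one."""
        return self.optimized.get(name, generic)(*args)
```

With `blocking=True`, a call made right after the request always runs the specialized version, so a bug in that version shows up every time. With `blocking=False`, the same call may run either version depending on how quickly the worker gets scheduled.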
dogbert17 | ok, I'll rebuild rakudo with MVM_SPESH_BLOCKING=1 | ||
or is it enough to run the tests with it | 11:19 | ||
jnthn | No | ||
So far as I can tell we do something wrong when compiling Rakudo that then causes the tests to fail whatever you run them with | |||
I thought it was an OSR thing | |||
But...just did a build with that disabled and still got a busted build | 11:20 | ||
I mean, the build is OK, but the result is bad | |||
Just got that again | |||
Grr | |||
lunch time, bbiab | 11:22 | ||
11:27
domidumont joined
12:06
AlexDani` joined
|
|||
jnthn back | 12:09 | ||
dogbert17 | jnthn: exported MVM_SPESH_BLOCKING=1, applied your patch, rebuilt moarvm and rakudo, no spectest fails but t/04-nativecall/21-callback-other-thread.t still fails albeit differently this time | 12:16 | |
jnthn | Very strange | 12:18 | |
dogbert17 | could it be because I'm on 32 bit? | ||
jnthn | I wonder what's different | ||
Maybe but...hard to see how | |||
Just done a rebuild of NQP to make sure it wasn't something there | 12:19 | ||
dogbert17 | have you had to make any changes to frame.c? | 12:22 | |
jnthn | Various | ||
I wonder | |||
dogbert17 | anything here? gist.github.com/dogbert17/f97eea42...696525ef91 | 12:23 | |
jnthn nukes his install directory in case he polluted it in the past | 12:24 | ||
dogbert17: Just to confirm: you're on the spesh-worker branch of MoarVM? | 12:25 | ||
dogbert17 | On branch spesh-worker | 12:26 | |
Your branch is up-to-date with 'origin/spesh-worker'. | |||
jnthn | And HEAD NQP/Rakudo? | 12:27 | |
jnthn is updating his now | |||
And 21-callback-other-thread.t is the only failure you see in `make test`? | 12:28 | ||
dogbert17 | that's the only failure. as for nqp I'm on HEAD detached at 3e8089404 | 12:29 | |
12:29
zakharyas joined
|
|||
jnthn | ah, that's HEAD anyway | 12:30 | |
And same as me now | |||
OK, then we're certainly seeing different things | |||
(I'm seeing a lot more of make test fail) | 12:31 | ||
dogbert17 | odd | ||
I built Moar with '--debug --no-optimize --asan' | |||
jnthn | Just --debug here | ||
I nuked all I could think of that I coulda corrupted, but still it seems sensitive to spesh being enabled or disabled when building Rakudo | 12:32 | ||
And, interestingly, not sensitive to OSR or inlining | 12:33 | ||
Spesh disabled for everything up to CORE.setting, then enabled for CORE.setting, seems to show the bug | 12:36 | ||
dogbert17 | how did you do that | 12:37 | |
jnthn | Just Ctrl+C at the appropriate point then "make" again with the other flags | 12:38 | |
Well, env vars, not flags | |||
If you're not seeing the problem with spesh enabled the whole way through, and even with MVM_SPESH_BLOCKING=1 set, though, I don't think you're likely to reproduce it | |||
dogbert17 | that limits my possibilities :( | 12:39 | |
jnthn | OK, spesh enabled until metamodel, then metamodel, BOOTSTRAP, and CORE.setting with it fully disabled makes things work | 12:40 | |
dogbert17 | cool | ||
jnthn | So next question is if I include metamodel and bootstrap | ||
Then it also passes | 12:43 | ||
So darn, it's a CORE.setting mis-compile | |||
dogbert17 | interesting | ||
nine | now that's a huge chunk :/ | 12:45 | |
jnthn | Yeah, if I rm CORE.setting and make it again and then test, then I get the failures | ||
Includes if I do it with OSR and inlining disabled, so can rule those out | |||
Though now I'm curious how I didn't see this before | 12:46 | ||
12:46
domidumont joined
|
|||
jnthn | Trying now with JIT disabled | 12:46 | |
dogbert17 wouldn't be surprised if that worked | |||
jnthn | Since when I brought back kicking out osrpoint suddenly we could JIT loads of stuff again | ||
nine | jnthn: could you narrow it down by moving parts of CORE.setting to CORE.d.setting? | 12:47 | |
jnthn | hah, pass | ||
So it's something that only happens when CORE.setting is compiled and the compiler is JITted | |||
Hm, I just wonder... | 12:49 | ||
dogbert17: oh, and this explains why you don't see it on 32-bit: we only JIT on x64, not x86 | 12:52 | ||
dogbert17 | indeed, so it seems that I won't be able to help fix this particular problem :( | 12:54 | |
the fail in t/04-nativecall/21-callback-other-thread.t remains though, but that's possibly something different | |||
jnthn | Yeah, that's 'cus the fix for it is in master but not merged into spesh-worker | 12:56 | |
dogbert17 | ahh | ||
so I'm essentially bug free then :) | 12:57 | ||
jnthn | Wow, all the failures are variants on the theme "Expected Callable but got <something here that you'd expect to be callable>" | ||
dogbert17 tries a stresstest instead | |||
jnthn | And always with things mixed in on the RHS | ||
13:09
buggable joined
13:11
buggable joined
|
|||
Zoffix | That sounds similar to a ticket | 13:12 | |
m: say Any.^can("push")[0] ~~ Callable; | |||
camelia | False? | ||
Zoffix | rt.perl.org/Ticket/Display.html?id...et-history | ||
jnthn | Seems that it's that the JIT didn't spit out the SC write barriers | 13:19 | |
It'd be rather nice if we could specialize on lack of need for them | |||
But for now I'll pop them in all the time | |||
Seems that does it | 13:20 | ||
yay, clean tests | 13:24 | ||
Now to stress it some :) | |||
Geth | MoarVM/spesh-worker: b4ad1dcce9 | (Jonathan Worthington)++ | 3 files JIT should include SC write barriers. Otherwise, when we are compiling things that rely on repossession of serialized objects, that will not happen, resulting in mis-compiles. It would be nice in the future to only conditionally include these (and specialize on us not being in the context of compilation). For now, this at least gets things correct. |
13:26 | |
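Editor's sketch: a toy model of what the commit message above describes. It is not MoarVM's data model; `SC`, `Obj` and `bind_attr` are hypothetical, and "repossession" is reduced to recording that an object owned by one serialization context (SC) was modified while another unit was being compiled. The `barrier=False` path stands in for a fast path that forgets the barrier.

```python
class SC:
    """Toy serialization context: objects owned by one compilation unit."""

    def __init__(self, name):
        self.name = name
        self.objects = []       # objects created by this unit
        self.repossessed = []   # objects taken over from other units

    def add(self, obj):
        obj.sc = self
        self.objects.append(obj)


class Obj:
    def __init__(self):
        self.sc = None
        self.attrs = {}


def bind_attr(obj, key, value, compiling_sc=None, barrier=True):
    """Write an attribute, optionally running the SC write barrier.

    With the barrier, a write to an object owned by a different SC while
    compiling_sc is being built hands the object over to compiling_sc,
    so the modified state would be serialized along with that unit.
    """
    obj.attrs[key] = value
    if not barrier or compiling_sc is None:
        return
    if obj.sc is not None and obj.sc is not compiling_sc:
        compiling_sc.repossessed.append(obj)
        obj.sc = compiling_sc
```

In this model, a write made with `barrier=False` still changes the object in memory, but the unit being compiled never learns that it needs to serialize the change, which is the kind of silent loss the commit message attributes to the missing barriers.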
jnthn | Another bug that's been there a good while | ||
Rakudo builds and make tests with MVM_SPESH_BLOCKING=1 MVM_SPESH_NODELAY=1 | 13:27 | ||
As does NQP | 13:28 | ||
And make spectest with those flags now running | |||
timotimo | man, it really is fascinating that it never b0rked before from those bugs you're finding here | 13:29 | |
jnthn | Yeah, really | ||
Though I think we knew that MVM_SPESH_NODELAY=1 broke things | |||
timotimo | we did know that | 13:30 | |
jnthn | But that was never really dub into | ||
*dub | |||
*dug! | |||
I suspect with NQP and Rakudo make test-ing clean with it, we're in a better shape than we've been before | |||
Alas, make spectest isn't clean | 13:31 | ||
But only 6 files with issues | 13:32 | ||
timotimo | that sounds good already | ||
jnthn | Taking the SEGV first :) | 13:37 | |
timotimo | oh, huh | 13:39 | |
did i miss something | |||
./perl6-m tools/build/install-core-dist.pl /home/timo/perl6/install/share/perl6 | |||
===SORRY!=== | |||
At Frame 123, Instruction 56, op 'sp_guard' has invalid number (4) of operands; needs 2. | |||
jnthn | wat | ||
works for me fwiw | 13:40 | ||
timotimo | i did a fresh configure.pl in rakudo, so it's not the ops.c that coulda b0rk | ||
jnthn | huh, I'm getting MVM_gc_debug_find_region not producing any output | 13:42 | |
Even though there's no code paths where it would not produce output | |||
timotimo | it can't throw exceptions, either? | ||
jnthn | No | ||
timotimo | oh, i did have a gdb still running | 13:43 | |
that could have kept moar or libmoar from being updated | |||
jnthn | ah, if I run fflush(stdout) it does it | 13:45 | |
oh, I see why | |||
Geth | MoarVM/spesh-worker: 304605c14d | (Jonathan Worthington)++ | src/gc/debug.c Add missing newline in debug output. |
||
jnthn | So apparently the bad pointer we're following is in gen2 bin of thread 1 | 13:46 | |
Huh, with a junk replaced pointer | 13:48 | ||
timotimo | well, spesh creates a sp_guard op and only gives it two parameters | 13:52 | |
on the other hand | |||
sp_guard is also defined to only have two arguments | |||
oooh | 13:53 | ||
i had been working on a nativecallglobalwrite op | |||
just adding it caused nativecall tests to segfault | 13:56 | ||
jnthn | d'oh, shoulda noticed the flags | ||
Turns out it's a type object | |||
timotimo | ah, there's no valid data there, then | 13:57 | |
jnthn | yeah | ||
timotimo | callback-other-thread is the one that's "allowed" to crash? | 13:59 | |
it prints "MoarVM panic:" and immediately segfaults, that might be what you're hunting now, then | 14:00 | ||
however, it also does that without the flags set | |||
Geth | MoarVM/spesh-worker: b351124ae1 | (Jonathan Worthington)++ | src/spesh/optimize.c Stronger validation of code objects in optimizer. |
14:08 | |
jnthn | timotimo: yeah, that one is fixed in master, not in spesh-worker | ||
Just needs a merge/rebase | |||
Down to 5 after that patch | 14:09 | ||
timotimo | oooh | ||
Geth | MoarVM/spesh-worker: 217334e2b6 | (Jonathan Worthington)++ | src/spesh/osr.c NULL out registers on re-OSR. If we previously ran the OSR'd code for this frame, then we don't need to resize work/env. However, they were unused while the deopt code was being run, and so may contain outdated pointers that will upset the GC if it comes across them. Fix this by making sure that space is cleared out. |
14:21 | |
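Editor's sketch: the stale-register problem from the commit above, as a toy. None of this is MoarVM code; `Freed`, `gc_scan` and `reenter_osr` are hypothetical, and a Python list stands in for the frame's work registers. The assumption is that the unoptimized code only uses the first `num_unopt_locals` slots, so the remaining slots keep whatever the optimized code last left there.

```python
class Freed:
    """Marker for something the toy GC has already reclaimed."""


def gc_scan(work):
    """Toy GC scan: fail if any register still refers to reclaimed memory."""
    for slot in work:
        if isinstance(slot, Freed):
            raise RuntimeError("GC followed a stale pointer")


def reenter_osr(work, num_unopt_locals):
    """Clear the slots only the optimized code uses before re-entering it.

    The frame is already big enough from the previous OSR, so no resize is
    needed, but the extra slots were not maintained while the deoptimized
    code ran and must not be trusted.
    """
    for i in range(num_unopt_locals, len(work)):
        work[i] = None
    return work
```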
jnthn | And now down to 4. | 14:22 | |
japhb | w00t! | ||
jnthn | Which give the same kinda error, so may be the same thing. | ||
japhb | Would be nice if so | 14:23 | |
jnthn | Also they go away with OSR disabled | ||
Who'd have thought replacing running code with an optimized version of that code could be so tricky :P | 14:24 | ||
timotimo | nobody knew code care would be this complicated - Dolan Drumpf, probably | 14:25 | |
jnthn | oh, interesting | 14:30 | |
MVM_JIT_DISABLE=1 also fixes it | |||
So more than likely to be a JIT issue | 14:31 | ||
It's the JIT of can_meta in Perl6::Grammar that's at issue, it seems | 14:34 | ||
If I disable JITting that, it works out | 14:35 | ||
timotimo | disabling jitting that, as in, a jit limit? or a non-jittable op right in the middle? | ||
jnthn | I just check for the name can_meta :) | ||
timotimo | ah, heh | 14:37 | |
jnthn | oh | ||
I bet the JIT doesn't do github.com/MoarVM/MoarVM/blob/mast...rp.c#L2354 | 14:38 | ||
timotimo | you're right | 14:40 | |
jnthn | What an annoying thing to need to do | 14:41 | |
timotimo | i wonder how hard it is to insert labels into the jit graph directly | ||
yes, it is | |||
much easier in the new jit, for sure | |||
let me investigate | |||
jnthn | I guess the easy option is for us to not devirt it for now and to put the check into MVM_repr_at_key_o | 14:42 | |
timotimo | we only devirt if the type is known to spesh facts | ||
but not that it's known to be concrete | |||
jnthn | Oh | ||
If we add the "know it's concrete" | |||
Then we're good :) | |||
timotimo | hm, you know | 14:43 | |
since the jit uses the reprconv | |||
jnthn | Well, provided we stick the check into the function too | ||
Right, that's the function I meant to tweak :) | |||
timotimo | ah, yes | ||
jnthn | oh, it already *is* in there | 14:44 | |
Except it returns a real NULL. | |||
D'oh | |||
timotimo | aha! | 14:45 | |
is this part of the repr contract? if so, we could actually throw the check out from the interpreter if the repr op function does it again anyway | |||
jnthn | Seems to do it | ||
timotimo | though of course that'll save us a function pointer call | ||
jnthn | The interpreter just doesn't call the convenience function | 14:46 | |
timotimo | oh, you mean *that* has the check | ||
jnthn | right :) | ||
timotimo | d'oh, i was looking at the first one in the file, which was for atkey_i | ||
of course that doesn't have the null check | 14:47 | ||
jnthn | Seems I've a fix :) | ||
timotimo | habemus fixem | ||
jnthn lets another spectest with blocking/nodelay rip | 14:48 | ||
Geth | MoarVM/spesh-worker: d2adde4e23 | (Jonathan Worthington)++ | src/6model/reprconv.c Never return a real NULL. |
14:52 | |
MoarVM/spesh-worker: e44e7b21d4 | (Jonathan Worthington)++ | src/jit/graph.c Only devirtualize when concreteness is known. |
|||
jnthn | ENOCIGAR, alas | ||
That accounted for 3 out of the remaining 4 test files with issues | |||
t/spec/S32-io/io-spec-win.t remains unhappy | 14:53 | ||
Oh. | |||
It's gonna be another JIT one | |||
Invalid string index: max 1, got 2 at SETTING::src/core/IO/Spec/Win32.pm:81 (./CORE.setting.moarvm:is-absolute) | 14:55 | ||
ah, it's 'cus the JIT tries to be clever rather than calling ord_at | 15:07 | ||
But the two do different handling of out of bounds | |||
And I ain't sure that it's a win anyway | 15:08 | ||
timotimo | ah | ||
jnthn | In that the C compiler probably inlines get_grapheme_at_nocheck | ||
Alright, here we go again for a blocking/nodelay test run :) | 15:10 | ||
Geth | MoarVM/spesh-worker: 53277743d5 | (Jonathan Worthington)++ | src/jit/emit_x64.dasc Fix JIT of nqp::ordat and nqp::ordfirst. They have different rules on out-of-bounds than get_grapheme_at. |
15:18 | |
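Editor's sketch: how two accessors that agree in bounds can disagree out of bounds, so that substituting one for the other changes behaviour. This is a toy in Python, not the MoarVM functions. The error text mirrors the "Invalid string index" message quoted above; the `-1` sentinel for the lenient accessor is an assumption made for illustration, and Python code points stand in for graphemes.

```python
def get_grapheme_at(s, index):
    """Strict accessor: an out-of-bounds index is an error."""
    if index < 0 or index >= len(s):
        raise IndexError(
            f"Invalid string index: max {len(s) - 1}, got {index}"
        )
    return ord(s[index])


def ord_at(s, index):
    """Lenient accessor: out-of-bounds yields a sentinel (assumed -1)."""
    if index < 0 or index >= len(s):
        return -1
    return ord(s[index])
```

Code written against the lenient accessor can legitimately probe one position past the end of a string; if a compiler swaps in the strict accessor as a shortcut, that same probe becomes an exception.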
jnthn | There we go, spectest happy with MVM_SPESH_BLOCKING=1 MVM_SPESH_NODELAY=1 | ||
Run with a NQP/Rakudo built with the same | 15:21 | ||
Even better | 15:23 | ||
make stresstest is *also* happy \o/ | |||
timotimo | man, this is good stuff | 15:24 | |
jnthn | So in terms of correctness, we're OK to merge. | 15:25 | |
The bad news is that somewhere along the way doing these improvements we've lost some performance on CORE.setting compilation | 15:27 | ||
The perf output is kinda nuts | 15:34 | ||
Guess I'm using the wrong options | 15:36 | ||
hah, yes | 15:40 | ||
Well, more performance analysis tomorrow, I guess | 15:52 | ||
cooking time; bbl | 15:53 | ||
15:57
colomon joined
|
|||
timotimo | 2.61% MVM_spesh_arg_guard_run, is that expected? | 16:13 | |
[Coke] | jnthn++ | 16:18 | |
timotimo | interestingly the core setting compilation takes only 99% cpu time | 16:22 | |
whereas i'd've expected it to have more than 100% because of the spesh worker thread | |||
nwc10 | jnthn: didn't get a chance to say that something you did fixed the ASAN failure | 16:30 | |
ASAN was a bit bored, but now it is very excited by t/04-nativecall/21-callback-other-thread.t | |||
timotimo | oh, that's the thing that's fixed on master but not rebased for spesh-worker yet | 16:32 | |
so can you rebase or merge master and try that one again? | |||
nwc10 | paste.scsys.co.uk/564676 | 16:33 | |
timotimo: I should rebase it locally? | |||
timotimo | yes, or merge locally | ||
nwc10 | yes, that fixes it | 16:34 | |
ASAN thinks that you are no fun :-) | |||
timotimo | :) | 16:35 | |
sometimes you can't be the fun dad, you have to be the boring dad instead | |||
jnthn | :) | ||
Well, there's the bolognese going... | |||
timotimo: It's..."not bad" I guess. It's quite a hot path | |||
timotimo: Before this logic was sat inside of invoke, which was always very costly | |||
timotimo | indeed it was | 16:36 | |
17:15
domidumont joined
17:17
domidumont joined
18:03
robertle joined
18:41
lizmat joined
20:01
ggoebel joined
|
|||
lizmat | jnthn: would it be a good thing to explain in one paragraph what you've done the past week ? | 20:51 | |
jnthn | lizmat: Yeah, sounds like a good idea. | 20:52 | |
(Do I need to write said paragraph? :)) | 20:53 | ||
lizmat | yes please :-) | ||
jnthn | Jonathan continued working on the first step of his overhaul of the MoarVM dynamic optimizer, which optimizes hot code based on collected type information. He now has optimization and JIT compilation running on a background thread rather than interrupting code, and a new means of data collection that will allow for smarter optimization decisions in the future. Along the way, he has fixed a range of optimization bugs that existed prior to his | 20:57 | |
Maybe better: s/now has/now has a branch with/ | 20:58 | ||
[Coke] | (stopped on "prior to his") | ||
jnthn | ah | ||
prior to his changes and were driven out by stresstesting. With everything working again, he will now switch to tuning it ahead of a merge. | 20:59 | ||
lizmat | jnthn++ :-) | ||
(not only making things faster, but also easier for me :-) | 21:00 | ||
jnthn | Trouble is that at this point it's slower :P | ||
(Thus the tuning to come.) | |||
Also this branch isn't really doing any of the clever new things this work aims to enable. | |||
timotimo | btw, we don't put osrpoints into regexes, i don't think. should we? | 21:01 | |
lizmat | jnthn: BTW, looks like moarvm.org doesn't know about 2017.07 yet | ||
jnthn | oops | ||
Thought I'd done that. D'oh. | |||
jnthn blames the hot weather last week :P | |||
timotimo | it's pleasantly cold today | ||
below 20 degC i believe | 21:02 | ||
jnthn | Yes, now it is | ||
Last week it wasn't | |||
Right now it's great | |||
21:10
colomon_ joined
|
|||
lizmat | and another Perl 6 Weekly hits the Net: p6weekly.wordpress.com/2017/07/24/...h-produce/ | 22:33 | |
samcv | more work going on with the collation-arrays. since i got things working pretty great yesterday now i'm working on making certain sections of it more correct. such as when i start tracing the linked list of possibilities of a codepoint, making sure that i push the last seen node with collation elements to the stack then forwarding the rest of the codepoints back into the function another time | 23:37 | |
always feels good to delete a bunch of code and write much nicer code in its place. though it took a lot of work to get to that ugly code i had written in certain areas | 23:43 | ||
but it's (usually) never a waste because i would never have been able to write the nice code without writing the ugly code first :) |