| IRC logs at
Set by AlexDaniel on 12 June 2018.
01:53 pamplemousse joined 02:15 pamplemousse left 02:16 pamplemousse joined 02:20 pamplemousse left 02:26 pamplemousse joined, AlexDani` joined 02:28 AlexDaniel left 04:13 AlexDani` is now known as AlexDaniel, AlexDaniel left, AlexDaniel joined 04:18 pamplemousse left 04:38 farcas1982regreg joined
nwc10 good *, #moarvm 06:33
07:07 Kaiepi left 07:08 Kaiepi joined 07:09 farcas1982regreg left
nine MasterDuke: oh, sorry, I missed the message with the URL. But it wouldn't have helped anyway. Looked at it now and I'm stumped. 07:30
MasterDuke: at this point I'd start poking at it with gdb and see if the values we get make sense
MasterDuke nine: ok, i'll see if anything looks off 07:38
btw, while you're here, any further comments on ?
nine MasterDuke: yeah....thanks for getting us rid of those warnings! 07:41
Geth MoarVM/master: 5 commits pushed by (Daniel Green)++, MasterDuke17++ 07:47
MasterDuke and thanks for reviewing
07:51 zakharyas joined
Geth MoarVM: 5863b02355 | (Stefan Seifert)++ | src/spesh/arg_guard.c
Silence a compiler warning in add_nodes_for_typed_argument

The compiler is actually wrong here. The update_node variable will in fact only be used when it's properly initialized, but the logic is hard to follow as the initialization state can only be tracked via two other variables.
nine This leaves only my longjmp issue 08:17
nwc10 I remember dislikeling longjmp. Or more "dammit, why does *that* also need to ve volatile even though it's const and never assigned to?" 08:20
08:27 sena_kun joined 08:56 Altai-man_ joined 08:58 sena_kun left 08:59 Geth left, Geth joined 09:11 AlexDaniel` joined 09:13 farcas1982regreg joined
nine Looks like I finally have a fix 09:36
lizmat nine++ 09:38
09:46 farcas1982regreg left 10:09 lizmat_ joined 10:11 lizmat left 10:23 lizmat_ is now known as lizmat
Geth MoarVM: c0fe97a68a | (Stefan Seifert)++ | 5 files
Fix variable might be clobbered by ‘longjmp’ or ‘vfork’ warnings in interp.c

Moving the affected variables out of the scope of the setjmp instruction avoids potential clobbering. They are only used for nested interpreters anyway, so they may as well live in MVM_interp_run_nested. The important part is that we don't NULL them in MVM_interp_run.
10:57 sena_kun joined 10:58 Altai-man_ left 12:08 evalable6 left, linkable6 left 12:09 evalable6 joined 12:10 linkable6 joined
MasterDuke nine++, now i don't see any warnings at all when building 12:19
timotimo sounds like i haven't contributed code in a long time! 12:20
MasterDuke heh. how about your in-situ string branch? needs much work? 12:25
timotimo i don't remember :( :( 12:30
12:49 pamplemousse joined 12:56 Altai-man_ joined 12:58 sena_kun left
jnthn There's further updates on which fold in the new dispatch instruction ideas, lay out an overall implementation approach (and some more detailed steps on the first round of the work). 14:49
I think I'm probably at the point where I've done enough planning and need to do some coding. :)
14:57 sena_kun joined 14:58 Altai-man_ left 14:59 farcas1982regreg joined
timotimo ForCegC 14:59
.o( i wonder if all this will be succ'd by a bag of nqp::dx's ) 15:00
that pun was quite an enormous stretch :S 15:02
nine It's kinda funny how spesh plugins were the hot new stuff that did wonders for speeding up some operations, while now they're becoming that legacy thing which require twice as many steps as necessary. The wheel turns... 15:26
15:28 farcas1982regreg left
timotimo put your Fs in chat 15:30
15:33 farcas1982regreg joined
AlexDaniel hey, I submitted it as a Rakudo ticket but it should be interesting to moarvm enthusiasts :) R#3648 15:40
linkable6 R#3648 [open]: [MoarVM] Gradual performance slowdown when pushing into a Channel
15:44 zakharyas left
nine AlexDaniel: ConcBlockingQueue is a linked list. We do keep pointers to both head and tail, so pushing is not an issue. But I'd bet that GC times grow a lot when we have to follow 2 million pointers. 15:46
MasterDuke: we calloc MVMConcBlockingQueueNodes. I guess that'd be a good place to switch to the FSA as well? 15:47
MasterDuke could be 15:48
we have a lot of random places where we don't free stuff that was alloced before throwing an exception 15:49
nine Sounds like something where the GCC plugin could help? 15:51
MasterDuke i have no idea what is possible, but seems likely
AlexDaniel nine: yeah, well, pushing 2 million values into a Channel without reading any is far from being a real world scenario 15:55
although, hm, maybe it can happen accidentally
nine Well given a list of functions that allocate and free, it can go through all possible control flow paths of a function and keep track of whether values that result from an allocation are passed to a freeing function before we arrive at a throw
MasterDuke nice 15:56
nine AlexDaniel: if it's accidental, I'd wager that the slow down is the least of your problems :)
AlexDaniel nine: why?
that's just a few megabytes of data
nine But it's also 2 million items in a queue which you expected to get processed in some way but aren't 15:57
AlexDaniel could be some sort of logging gone wrong or similar, I see no problem 15:58
nine MasterDuke: it's pretty straight forward actually:
AlexDaniel I use channels for realtime data so if the reading side dies I might end up in this situation quite easily 15:59
2 million values over 10+ hours isn't that much 16:00
nine It's just to me "reading side dieing" sounds worse than "sending side gets slowed down a bit" 16:01
AlexDaniel that's just 56 per second, easy
nine: could be one of the non-essential features, but the whole program slowing down can cause all kinds of nuisance 16:02
I use raku for camera motion control and stuff, and honestly this sounds like an everyday scenario to me :X 16:03
but then most of it is hacked-on-the-fly throwaway scripts put together, so there's that too :) 16:04
MasterDuke ugh. 15s to run some regexes on a 1.1m line spesh log 16:08
sticking `.contains("duplicated string from the regext")` before each of the three regexes drops the time to 2s 16:12
lizmat MasterDuke: there's a reason I created that method :-) 16:13
MasterDuke yeah. just wish our regex engine would do it for me 16:14
m: say "a test" ~~ /q|test|/ 16:20
camelia 5===SORRY!5=== Error while compiling <tmp>
Null regex not allowed
at <tmp>:1
------> 3say "a test" ~~ /q|test|7⏏5/
expecting any of:
infix stopper
MasterDuke i can't use q strings in regexes?
nine Well, regexes are a different (sub-)language 16:21
16:56 Altai-man_ joined 16:58 sena_kun left
MasterDuke nine: everything builds ok with MVMConcBlockingQueueBodies and MVMConcBlockingQueueNodes using the FSA. think it'll help AlexDaniel's benchmark? 17:05
AlexDaniel: want to try a patch?
AlexDaniel MasterDuke: sure
MasterDuke 17:06
you'll need `-p0` if you `git apply` that 17:07
AlexDaniel MasterDuke: should I apply it on 2020.02.1 or master head?
MasterDuke it should work on either. i guess use 2020.02.1 since that's what your previous measurements were with, right? 17:08
AlexDaniel yeah
nine MasterDuke: I don't think it will help with that particular benchmark since I expect GC time to be the issure here. But it can help others that put lots of stuff into a queue (and take it out again) 17:09
MasterDuke fwiw, nqp and rakudo pass all their tests 17:11
nine: while we're on the subject, any thoughts on ? 17:13
AlexDaniel MasterDuke: actually, it is significantly faster 17:22
17:22 Geth left
AlexDaniel oops 17:23
17:24 Geth joined
MasterDuke oh? 17:24
AlexDaniel ok sorry was distracted by bot eating 17:25
MasterDuke: yep, it fixes it 17:30
MasterDuke have some pretty new graphs?
AlexDaniel yes, leaving a comment now
MasterDuke cool 17:31
AlexDaniel MasterDuke: 17:33
MasterDuke nice! 17:34
AlexDaniel MasterDuke: you can confirm it yourself by running the code in question piped into /dev/null 17:38
it should be noticeably faster after the patch 17:40
by the way, is it a quirk of libuv that stdout comes in batches? Or why am I seeing this? 17:42
lizmat buffering? 17:47
AlexDaniel lizmat: do we have it?
lizmat we shouldn't if .t is true on the handle 17:48
AlexDaniel lizmat: how can I disable it? 17:49
tbh I thought we simply have no buffering at all
lizmat well, the OS buffers any writes to files, afaik 17:50
in Perl you can call autoflush on the IO::Handle, I don't think we have that in Raku yet 17:51
AlexDaniel but it's not a file, it's another process
but not a TTY
in perl5 `$| = 1;` works
lizmat jnthn moritz nine might know 17:52
jnthn $*OUT.out-buffer = False
AlexDaniel oh. 17:56
lizmat TIL 17:57
AlexDaniel yeah, me too
yeah, that's much better 18:01 18:02
now that's maybe even more interesting 18:03
so it stutters, then prints some values, then stutters again for a shorter amount of time
and it does that consistently throughout the whole execution 18:04
MasterDuke and takes the same amount of time as before my patch? 18:17
AlexDaniel MasterDuke: well, with no output buffering it is of course slower 18:18
MasterDuke yeah 18:19
is no buffering without the patch even slower?
AlexDaniel yes
I'm rerunning the benchmarks, maybe I'll update the ticket 18:20
Geth MoarVM: MasterDuke17++ created pull request #1280:
Use the FSA for ConcBlockingQueue
18:35 squashable6 left 18:38 squashable6 joined
nine Incredible....though I'm curious to know why this fixes the issue. 18:45
MasterDuke: sorry, been doing our fitness program and now it's dinner time
MasterDuke i'm not too terribly surprised. those should be much faster allocations 18:46
AlexDaniel MasterDuke: new graphs without output buffering: 18:48
18:56 sena_kun joined
nine MasterDuke: I'm not surprised that it's faster. I'm surprised that it scales better. GC time should still grow linearily and noticably with the queue's size. 18:57
And I'd have bet that GC time is what slows the benchmark down over time.
18:58 Altai-man_ left
nine My only hypothesis is that with the FSA those queue nodes are packed tightly in memory, so many of the ->next dereferences can be satisfied from the CPU cache 18:59
timotimo wow, where do those graphs come from 19:01
AlexDaniel how often is GC called? On the last graph, the stutters do become longer over time, but is it really GC? So often?
nine timotime: click on "How do I take these measurements?" :)
MasterDuke yeah. it probably wouldn't be quite so much better if there were (many) other FSA allocations happening inbetween the Channel.push's 19:02
nine AlexDaniel: after allocating 4 megs
MasterDuke AlexDaniel: take a profile. or compile with --telemeh. those would give you gc timings 19:03
timotimo oh, those are clickable!
AlexDaniel: telemeh also uses rdtscp and runs on a separate thread and gives a bit more than just "when was a print", for example GC timings 19:04
AlexDaniel oh 19:07
timotimo where can i look at the tool that makes the graphs? they are very pretty
AlexDaniel timotimo: thanks! Yeah, well, the graphs are much prettier that the code that generates them :) 19:09
it's 400 lines of pure filth 19:10
we'll see, if this proves to be more useful than fancy then I'll fix it up and publish it 19:15
timotimo haha 19:17
AlexDaniel I'll try to use telemeh later, if it's possible to mark GC runs on these graphs without messing up the timings… woah that'd be cool 19:19
then what, telemehable? x) 19:20
AlexDaniel sleep & 19:21
20:56 Altai-man_ joined 20:58 Kaiepi left, sena_kun left, Kaiepi joined 21:14 pamplemousse left
Geth MoarVM: c59514234f | (Daniel Green)++ | src/6model/reprs/ConcBlockingQueue.c
Use the FSA for ConcBlockingQueue

This makes sending things into a Channel much quicker.
MoarVM: 3d8ff61e20 | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/6model/reprs/ConcBlockingQueue.c
Merge pull request #1280 from MasterDuke17/use_fsa_for_cbq

Use the FSA for ConcBlockingQueue
22:40 Kaiepi left 22:49 Altai-man_ left 23:15 Kaiepi joined