03:30 lizmat joined 05:43 domidumont joined 05:50 domidumont joined 06:34 brrt1 joined 06:36 domidumont joined
brrt1 .ask jnthn, i need two operators, get_lex_type and get_reg_type, for a spesh graph 07:25
yoleaux brrt1: What kind of a name is "jnthn,"?!
timotimo brrt: what, run-time instructions? 07:45
the spesh graph has a local_types and lexical_types that are valid if there has been inlining 07:47
otherwise you'd probably go via the staticframe
brrt hmm yeah 07:48
timotimo yeah, staticframe also has local_types and lexical_types 07:51
gotta go~
brrt see you
Geth MoarVM/even-moar-jit: 5d0db25c7b | (Bart Wiegmans)++ | 6 files
Collect types for expr nodes

We need this to ensure that we can mark object / string values as such when we spill them to memory. Also, unify handling of read/write operands, bailing if we have a non-zero write operand. bindlex is a destructive template.
08:25
jnthn wonders if brrt is still in need of anything 08:42
brrt nah, i'm good 08:43
well, maybe check if my assumptions are correct
especially: github.com/MoarVM/MoarVM/commit/5d...3975f86R33
jnthn Looks sensible to me 08:44
Geth MoarVM: gerd++ created pull request #607:
add help text for environment variables
08:47
brrt thanks for checking 08:48
Geth MoarVM: 755404c915 | gerd++ | Configure.pl
add help text for environment variables
08:56
MoarVM: 093ae97e7b | (Jonathan Worthington)++ (committed using GitHub Web editor) | Configure.pl
Merge pull request #607 from gerd/master

add help text for environment variables
jnthn m: say 16226602756 - 16068523935 09:27
camelia Resource temporarily unavailable 09:28
timotimo that looks like a cycle count
jnthn wat
m: say 16226602756 - 16068523935
camelia 158078821
timotimo c: 16068523935 / 16226602756
committable6 timotimo, ¦1606852: «Cannot find this revision (did you mean “ff63582”?)»
timotimo c: say 16068523935 / 16226602756
committable6 timotimo, Seems like you forgot to specify a revision (will use “v6.c” instead of “say”)
timotimo c: HEAD say 16068523935 / 16226602756
committable6 timotimo, ¦v6.c (19 commits): «0.990258045792»
timotimo, ¦HEAD(5facb26): «0.990258045792»
timotimo *cough*
Geth MoarVM: 0a83384545 | (Jonathan Worthington)++ | 3 files
Move NFG initialization into nfg.c.
MoarVM: 5814822f13 | (Jonathan Worthington)++ | 3 files
Cache CRLF grapheme.

Saves us resolving it every single time. This, on a read a million lines benchmark in Perl 6, saves 158 million cycles.
timotimo BBL again 09:29
jnthn m: say 16068523935 - 15989524161 09:41
camelia 78999774
Geth MoarVM: f0854ac46d | (Jonathan Worthington)++ | 2 files
Cache maximum separator length.

Meaning that we need not calculate it every time we want to look for a line separator. On a read a million lines benchmark in Perl 6, saves 79 million instructions.
09:42
dogbert17_ How fast are we now? 09:44
is it perl5 speed? 09:47
jnthn Not yet
m: 1.793 / 1.134
camelia WARNINGS for <tmp>:
Useless use of "/" in expression "1.793 / 1.134" in sink context (line 1)
jnthn m: say 1.793 / 1.134
camelia 1.581129 09:48
jnthn We still take 1.6x longer
m: say 15989524161 - 15569412274 10:02
camelia 420111887
dogbert17_ not too bad
Geth MoarVM: a8f2ac74fa | (Jonathan Worthington)++ | 2 files
Cache a list of final separator graphemes.

This makes it cheaper to check if the current grapheme we have just decoded/normalized is a stopper. On a read a million lines benchmark in Perl 6, this saves 420 million instructions.
10:13
10:18 geekosaur joined 10:19 geekosaur joined
jnthn Turns out trying to use memmem as a filter makes things worse 10:23
Incredibly so
m: say 15569412274 - 14724412255 10:39
camelia 845000019
Geth MoarVM: 8fa1857837 | (Jonathan Worthington)++ | 2 files
Introduce a max final grapheme codepoint filter.

This means we can usually avoid a separator search loop entirely in the common case where separators are control chars. In the Perl 6 million line file benchmark, where each line is 60 chars long, this saves 845 million cycles. (How much it helps will depend on line length; the longer the line, the more it will help.)
10:41
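A rough sketch of the filter idea described in that commit, for readers following along. It is not MoarVM's code: `SepSpec`, `sep_spec_init` and `could_be_stopper` are hypothetical names, and the layout is an assumption. The point is only that a cached maximum lets most graphemes skip the separator search loop with a single comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical separator table; names are illustrative, not MoarVM's. */
typedef struct {
    const int32_t *final_graphemes; /* last grapheme of each separator */
    size_t         num_seps;
    int32_t        max_final;       /* cached maximum of final_graphemes */
} SepSpec;

/* Compute the cached maximum once, when the separators are configured. */
static void sep_spec_init(SepSpec *spec, const int32_t *finals, size_t n) {
    assert(n > 0);
    spec->final_graphemes = finals;
    spec->num_seps        = n;
    spec->max_final       = finals[0];
    for (size_t i = 1; i < n; i++)
        if (finals[i] > spec->max_final)
            spec->max_final = finals[i];
}

/* Returns 1 if grapheme g is the final grapheme of some separator. */
static int could_be_stopper(const SepSpec *spec, int32_t g) {
    /* Cheap filter: anything above the cached maximum cannot match,
     * so ordinary text never reaches the loop below when the
     * separators are control characters. */
    if (g > spec->max_final)
        return 0;
    for (size_t i = 0; i < spec->num_seps; i++)
        if (spec->final_graphemes[i] == g)
            return 1;
    return 0;
}
```

With separators ending in control characters, letters and digits are rejected by the first comparison, which matches the remark that longer lines benefit more.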
jnthn That helped rather more 10:45
Geth MoarVM: 8972ab2ee2 | (Jonathan Worthington)++ | 4 files
We never have both separator and char limit.

So use an `else if` to save an extra check/branch. Gets 56 million instructions off the Perl 6 million lines/60 chars per line benchmark.
11:03
jnthn And that one's a case of "every little helps", I guess :)
Remembering to have lunch will also help...bbl 11:04
11:31 brrt joined 11:42 AlexDaniel joined
jnthn Righty, let's try and squeeze some more out of this :) 12:07
timotimo i have a local patch on my laptop that moves a "if (!result.o) { result.o = tc->instance->VMNull }" from one if branch inside MVMIter's MVM_iterval out to the end 12:11
i think i did that to fix a crashbug involved with iterating over something - likely an array? 12:12
but i never committed it?
but that'd require an atpos to be able to return a low-level null; is that possible? 12:13
jnthn I don't think that happens anywhere
timotimo yeah, doesn't seem so 12:14
i have no idea from what date/time that patch is %)
13:06 robertle joined
jnthn Grrr 13:08
jnthn managed to achieve "fast and wrong"
lizmat .oO( now for the sequel: quick fix and right :-) 13:09
jnthn feeds attempt two to the test suite 13:30
timotimo i wonder if it'd be helpful to use the FSA for the buffers we have in decodestream and such 13:32
jnthn Depends which ones you mean 13:33
timotimo i don't know what the buffer sizes typically are
i'm just seeing a bunch of malloc calls from MVM_decodestream :)
jnthn I'm aware of one possible change in that area that may help, though
timotimo cool
jnthn m: say 1.365 / 1.691 13:37
camelia 0.807215
jnthn Think I've got another 20% off
timotimo wow
gah, i can no longer just "nqp::sayfh(nqp::getstderr(), "foo bar")" in nqp code because the char API is gone :) 13:40
jnthn But NQP has a note(...) sub :P
eveo note("foo bar")
jnthn And has for a long while :P
timotimo that's fair, i suppose 13:41
i was afraid of using something as high level as a sub inside the methods method of MethodContainer :P
jnthn m: say 14668412249 - 11451760612
camelia 3216651637
timotimo can't use note, it's VMNull at that time 13:43
nwc10 jnthn: 20% off, but then taxes, booking fee, admin fee and all the small print and it's actually still slower? :-/
(or the result arrives at a terminal 200km from where you actually wanted it, and after you factor in the cost of travel from there to here, it's more expensive) 13:44
timotimo with a try { } around the note it actually never even runs once 13:45
Geth MoarVM: a6abd3c665 | (Jonathan Worthington)++ | src/strings/utf8.c
Add a UTF-8 decoding fast-path.

In many cases, we are reading things that are already in normal form thanks to being in the ASCII/Latin-1 range, which we can very cheaply check. Add a fast path to the UTF-8 decoder for this case. It doesn't actually skip anything UTF-8 related, but rather can take a shortcut on the full normalizer. On the benchmark reading a million lines using ... (8 more lines)
13:49
13:49 brrt joined
jnthn nwc10: No, that was after paying the taxes. The first attempt was a bit more off but also wrong. 13:50
timotimo cool. i was able to put atan2_n in the function and count occurrences with a breakpoint in gdb :D
by virtue of gdb counting up from $1 for each "print 'methods called'" i also get a nice little sum at the end 13:51
huh, i expected more than three calls to method methods during the core setting compilation 13:52
timotimo BBL
[Coke] jnthn: nifty commit. 13:54
jnthn [Coke]: Yeah, was a bit fiddly to get right, though. Glad we have lots of tests :) 14:04
timotimo nice. 14:21
Geth MoarVM: 9c5ea41cc7 | (Jonathan Worthington)++ | 2 files
Keep last freed chars buffer handy for re-use.

In steady state, we will often have just one of these at a time (for example, when reading lines or chunks). By keeping it around for re-use, we can save quite a lot of malloc/free. Gets another 446 million instructions off the Perl 6 million lines benchmark.
14:26
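A minimal sketch of the "keep the last freed buffer around" pattern from that commit, under stated assumptions: `BufCache`, `buf_acquire` and `buf_release` are made-up names, and the real decode stream code may track sizes and ownership differently. It holds at most one spare buffer, which is enough for the steady state described.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical one-slot cache; not MoarVM's actual structure. */
typedef struct {
    void  *spare;      /* last released buffer, or NULL */
    size_t spare_size; /* capacity of spare in bytes */
} BufCache;

/* Hand out the spare buffer if it is big enough, else malloc. */
static void *buf_acquire(BufCache *c, size_t size) {
    if (c->spare != NULL && c->spare_size >= size) {
        void *buf = c->spare;
        c->spare      = NULL;
        c->spare_size = 0;
        return buf;
    }
    return malloc(size);
}

/* Keep the buffer for re-use if the slot is empty, else free it. */
static void buf_release(BufCache *c, void *buf, size_t size) {
    assert(buf != NULL);
    if (c->spare == NULL) {
        c->spare      = buf;
        c->spare_size = size;
    }
    else {
        free(buf);
    }
}
```

When reading line after line, each release fills the slot and the next acquire empties it, so the malloc/free pair drops out of the loop.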
timotimo yes! \o/ 14:36
Geth MoarVM: aaffa714af | (Jonathan Worthington)++ | 6 files
Make a smarter guess at decode result buffer size.

Previously we took the bytes available to decode, but when reading lines there may be a thousand lines worth in the buffer. This avoids allocating huge chunks of memory only to throw them away again soon afterwards. It also is a prerequisite for just using those buffers in the result string rather than doing copying.
15:14
MoarVM: e48266b5e7 | (Jonathan Worthington)++ | src/strings/decode_stream.c
Don't copy when we can steal decoder output.

When reading lines or chunks of chars, the decoder will often produce a single buffer with the required content. In this case, just take it and use it as the string's body, instead of copying it to another buffer. This saves around 400 million instructions on the Perl 6 read a million lines benchmark.
15:16
MoarVM: ea1f506170 | (Jonathan Worthington)++ | src/strings/utf8.c
Lift read of lagging codepoint out of hot loop.

This saves 237 million instructions in the Perl 6 read 100 lines benchmark, and arguably makes the code clearer, if a little longer, by handling the lack of lagging codepoint up front rather than mixed in with the rest of the decode logic. It also saves a branch on the hot path.
15:36
timotimo ooooh 15:48
okay we have about ten thousand million instructions 15:49
so half a thousand isn't even bad at all
Geth MoarVM: b885d996d6 | (Jonathan Worthington)++ | src/strings/utf8.c
Save a deref by introducing a local variable.

The C compiler can't see (somewhat reasonably so) that it can avoid the dereference every time. Saves 57 million instructions from the benchmark, so effectively one instruction per decoded char.
15:52
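For illustration only, the general shape of "save a deref with a local variable" in a decode loop. `Chunk`, `Stream` and `widen_bytes` are hypothetical and do not mirror the MoarVM structs. Reading the pointer and length once before the loop means the compiler does not have to prove that the stores inside the loop leave them unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; the real decode stream structs differ. */
typedef struct {
    const uint8_t *bytes;
    size_t         len;
} Chunk;

typedef struct {
    Chunk *cur; /* chunk currently being decoded */
} Stream;

/* Widen each byte of the current chunk into out; returns the count. */
static size_t widen_bytes(const Stream *s, int32_t *out) {
    assert(s != NULL && s->cur != NULL);
    /* Load through s->cur once, instead of on every iteration. */
    const uint8_t *bytes = s->cur->bytes;
    const size_t   len   = s->cur->len;
    for (size_t i = 0; i < len; i++)
        out[i] = bytes[i];
    return len;
}
```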
eveo Soon enough we'll start reading lines in negative time, at this rate :)
hmmm 15:53
jnthn We're really edging in on the Perl 5 time by now. Like, our lowest outlier and its highest outlier are now equal. 15:55
On average we're still a tad above
Geth MoarVM: 369c0c5ca6 | (Jonathan Worthington)++ | src/strings/utf8.c
Only update last accepted position when needed.

Saves a bunch of writes during the UTF-8 hot loop. Saves another 180 million instructions, or 2 per character decoded in the fast path.
16:08
eveo That's great. Overtaking perl5 in perf for some common operation would make some noise 16:11
jnthn++
timotimo we'll be able to read lines as fast as perl5, but not do stuff with 'em as fast for the most part 16:12
i mean, some things we'll be able to do faster i expect 16:13
jnthn We're at 41 instructions per char we decode and normalize and stopper-filter by this point.
Well, amortizing in the fixed overhead and so on 16:14
And assuming we can take the fast path :)
timotimo right. it's probably a whole lot more when our file's only got 100 lines
not complainin' ;)
jnthn No, I'm just talking about instructions actually spent in the MVM_string_utf8_decodestream function :) 16:15
timotimo oh!
okay, that's nicer than i thought
cool beans
jnthn Anyways, I think I'll call it a day working on speedups for this :) 16:23
More next week, I guess :)
Though we're getting to the point where spesh will need some deeper attention to be able to make a better job of the loop itself 16:24
Also worth noting that if you take off the UTF-8 decoding, Perl 5 still beats us handily 16:25
eveo Cool. I'll bump versions and start pre-release stresstesting then after I hunt down that windows bug
nwc10 jnthn++ 16:26
jnthn eveo: About the proc stuff - this month's MoarVM release did not actually drop support for the old synchronous proc ops 16:27
eveo: So if you feel that you'd prefer the Proc issues from before over the Proc issues now, a revert is possible 16:28
eveo Noted. So far I prefer Proc issues now :) 16:29
jnthn Yeah, the deadlocks from before were pretty icky
I'll address the fd plumbing thing that was reported on RT, probably next week or so 16:30
Wasn't keen to try and rush in a fix beforehand as it'd end up, well, rushed
jnthn heads home; bbl 16:33
17:02 buggable joined 17:29 robertle joined 17:46 Geth joined 18:11 robertle joined 19:29 Geth joined 19:38 Geth joined 19:44 AlexDaniel joined 19:46 colomon joined 19:48 Geth joined 19:51 ZofBot joined
japhb jnthn++ # So many plusses for so many major perf improvements! 21:14
jnthn :) 21:15
Hoping to keep them coming :)
japhb Ditto that. 21:16
jnthn Fingers crossed news.perlfoundation.org/2017/06/gra...erl-6.html goes through 21:21
timotimo why did i just see this for the first time today? 21:22
jnthn I dunno, I've linked it at least once before :)
timotimo posted a comment 21:24
jnthn Thanks...I figure the more of those the merrier 21:25
21:33 ilmari[m] joined
lizmat refrains from commenting, but will vote :-) 21:33
jnthn I hope I'll like the results more than in the last couple of votes I've participated in :P 21:36
eveo m: my @a = 1..3; my @b = 4..5; my %h = :42a, :70b; my @d; @d.append: %h; dd @d 21:40
camelia Array @d = [:a(42), :b(70)]
eveo m: my @a = 1..3; my @b = 4..5; my %h = :42a, :70b; my @d; @d.append: @b; dd @d
camelia Array @d = [4, 5]
eveo Well, I'm kinda leaning towards calling it a bug that .append doesn't just use normal .flat semantics 21:41
m: my @d; my @b = (1, 2, 3), (4, 5, 6); @d.append: @b; dd @d
camelia Array @d = [(1, 2, 3), (4, 5, 6)]
eveo But probably a ton of stuff now depends on that :(
m: my @d; my @b = (1, 2, 3), (4, 5, 6); @d.append: @b».List; dd @d # this I mean. I'd expect this to just get flattened 21:42
camelia Array @d = [(1, 2, 3), (4, 5, 6)]
eveo Ooops. Wrong channel
23:21 leedo_ joined, mst_ joined 23:28 jnthn joined