01:52
ilbot3 joined
02:44
http_GK1wmSU joined
02:47
http_GK1wmSU left
04:12
vendethiel joined
04:56
lizmat joined
06:01
brrt joined
07:11
brrt joined
|
|||
brrt | good * #moarvm | 08:06 | |
08:13
zakharyas joined
08:30
vendethiel- joined
08:33
harrow joined
08:48
vendethiel joined
08:57
edehont joined
|
|||
brrt | interestingly, my bug is insensitive to gc debugging | 08:59 | |
so i think i'm adding a pointer somewhere | |||
to garbage | |||
jnthn | moarning o/ | 09:10 | |
Second to last day of the heatwave... | 09:13 | ||
brrt | moarning jnthn | 09:20 | |
jnthn | o/ brrt | ||
How's your pointer to garbage hunt going? | |||
brrt | not super good | 09:21 | |
my second bisect had a slightly different frame (without ASAN) that is responsible for breakage | 09:22 | ||
although, it might arguably still be the same frame, given that the bb numbers match up | |||
yeah, it's the same frame after all | 09:23 | ||
but it has a different frame number | |||
i wonder if i changed something silly | |||
i'm going to run a third bisect, and if it comes up the same, then i'm happy | 09:24 | ||
(i tried installing gc debugging, but it made virtually no difference) | |||
09:40
vendethiel joined
|
|||
Geth | MoarVM: 25419ccf97 | (Jonathan Worthington)++ | 7 files When logging invokes, don't store closures. This can, in some cases, seriously extend the lifetimes of objects that were closed over. Instead, just log what we'll really want: the static frame of the invocation target (which actually simplifies the handling in the stats code) and whether the caller was its outer (this will be useful for future optimizations). |
09:54 | |
jnthn | Nice. In the read a million lines benchmark that wins us a bit back by meaning we no longer do 2 full GC runs due to things getting promoted | 09:58 | |
brrt | well, third bisect agrees with first | 10:14 | |
jnthn | lunch; bbiab | 11:23 | |
11:28
markmont joined
11:34
TimToady joined
11:52
AlexDani` joined
|
|||
jnthn back | 11:57 | ||
12:16
brrt joined
12:18
dogbert17 joined
12:22
AlexDani` joined
13:00
brrt joined
13:06
zakharyas joined
|
|||
Geth | MoarVM: f66c82641a | (Jonathan Worthington)++ | 3 files Move stack simulation types into header. So we'll be able to refer to them outside of the spesh stats source file. |
13:53 | |
MoarVM: e354d1f901 | (Jonathan Worthington)++ | 4 files Start keeping simulated stack around between logs. So we can continue incrementing OSR points and accumulating data in long-lived frames. At least, that's the theory; it doesn't seem to quite manage that yet. It does, however, seem to avoid various bad stack depth values, so the ordering of specializations works out looking more like it should. |
|||
MoarVM: e12d52c251 | (Jonathan Worthington)++ | 4 files Record and insert rw-ness into type tuples. Gets us able to resolve more multis and do more inlining again. |
14:13 | ||
14:15
zakharyas joined
14:31
brrt joined
|
|||
Geth | MoarVM: d865eb5172 | (Jonathan Worthington)++ | src/spesh/optimize.c Fix "close enough to inline" check. 99% was meant to be sufficient, but never was due to > instead of >=. |
14:35 | |
MoarVM: 9f9c4b4673 | (Jonathan Worthington)++ | src/spesh/stats.c Fix accounting error in callsite stats. Led to spurious planning of unrequired certain specializations. |
|||
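A minimal sketch of the kind of boundary bug d865eb5172 describes, where an exclusive comparison rejects a value that sits exactly on the threshold. The function names, the integer formulation, and the 99-out-of-100 sample are assumptions made for illustration; the actual check in src/spesh/optimize.c is not reproduced here.

```python
# Hypothetical helpers; not the MoarVM implementation.
def close_enough_strict(hits, total, threshold_percent=99):
    # Exclusive comparison: a share of exactly 99% is rejected.
    return 100 * hits > threshold_percent * total


def close_enough_inclusive(hits, total, threshold_percent=99):
    # Inclusive comparison: a share of exactly 99% is accepted.
    return 100 * hits >= threshold_percent * total


if __name__ == "__main__":
    print(close_enough_strict(99, 100))     # exactly on the threshold
    print(close_enough_inclusive(99, 100))  # exactly on the threshold
```

Both versions agree everywhere except on the threshold itself, which is why this sort of slip can go unnoticed for a long time.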
14:49
Slaash joined
14:51
Slaash left
14:56
travis-ci joined
|
|||
travis-ci | MoarVM build errored. Jonathan Worthington 'Fix accounting error in callsite stats. | 14:56 | |
travis-ci.org/MoarVM/MoarVM/builds/261031926 github.com/MoarVM/MoarVM/compare/e...9c4b4673f2 | |||
14:56
travis-ci left
|
|||
lizmat | just as I was about to do a version bump | 14:58 | |
Zoffix | github error | 15:01 | |
Zoffix is getting angry unicorns in the browser too | |||
15:07
brrt joined
|
|||
jnthn | ah, just a git clone error | 15:16 | |
japhb wonders from what source data (if any) github.com/status is computed from | 15:18 | ||
brrt | okay, i may have found the source of my weird, weird bug | ||
???? and it may be embarrassing | 15:19 | |
japhb | brrt: Congrats on finding it nonetheless! | ||
brrt | maybe :-P | ||
jnthn | brrt: haha, can't beat my d865eb5172 in terms of embarrassing find today :P | ||
brrt | anyway, turns out, carg will put arguments on the stack for anything that isn't a int or a ptr | 15:20 | |
which is ehm, surprising | |||
Geth | MoarVM/even-moar-jit: 82ca7faf91 | (Bart Wiegmans)++ | 3 files ARGLIST on should not be fussy about type I think a casual user could be forgiven for not caring about the difference about int and ptr and reg (although if portability is a concern, it is a real difference). So the code for selecting arg locations should not be fussy about that difference (the ABI isn't). This appears to fix the bugs in CORE.setting compilation reported by nwc10++, but I'm not 100% positive since ASAN on OS X does not quite meet my expectations. |
15:30 | |
japhb | brrt: Is it ready for merge after that fix? | 15:34 | |
brrt | if it passes ASAN, then i'd hope so | 15:35 | |
strictly speaking that was a template bug, not a compiler bug, but the unfussing should help a bit | |||
but i mean, we've got to draw the line somewhere :-) | 15:37 | ||
japhb | Indeed! | 15:38 | |
jnthn | m: say 1741 / 349 | 15:41 | |
camelia | 4.988539␤ | ||
jnthn | m: say 1741 / 351 | ||
camelia | 4.960114␤ | ||
jnthn | Hmm | ||
A bit of dodgy arithmetic somewhere in the stats, methinks... | |||
lizmat | .oO( off by two ) |
15:42 | |
jnthn | Off by a factor of 5 actually | 15:44 | |
lizmat | 349 vs 351 :-) | 15:46 | |
jnthn | Oh, I was trying to see if 1741 was an exact multiple of the other two | ||
brrt | (351 is 3^3*13, 349 is prime) | 15:53 | |
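The arithmetic in this exchange can be checked directly. This is a small sketch using only the three numbers quoted above; `is_prime` is a plain trial-division helper written for this note.

```python
def is_prime(n):
    """Trial division; fine for small numbers like the ones in the log."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# Quotient and remainder: a non-zero remainder means no exact multiple.
q349, r349 = divmod(1741, 349)
q351, r351 = divmod(1741, 351)

if __name__ == "__main__":
    print(q349, r349)
    print(q351, r351)
    print(3 ** 3 * 13 == 351)
    print(is_prime(349))
```

Neither division is exact, so 1741 is close to, but not exactly, five times either count.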
Geth | MoarVM: 8153063fa0 | (Jonathan Worthington)++ | src/spesh/stats.c Bump spesh stats version on updates. Otherwise we'll never do planning for them, and so never produce new specializations. This was the final issue blocking OSR of long-lived frames that are hot loops, but highly costly on their first call (by, for example, spending time in the multi-dispatch slow path for some calls). |
15:56 | |
jnthn | Hmmm | 15:58 | |
Well, weekends pondering: can we do a better job of exception handler representation in the CFG than anchoring them all from the entry block? | 16:00 | ||
16:01
sivoais joined
|
|||
jnthn | I think we surely can | 16:05 | |
Though at a cost | |||
We'd need to insert actual BB breaks at all invokish/throwish (which arguably we should anyway) | |||
timotimo | those are a lot of ops | ||
jnthn | Then when we're in the region of a handler, we make all such BBs have the exception handler as a successor | ||
timotimo: It is, but this is just a graph representation, so the only real cost is at spesh time | 16:06 | ||
timotimo | i'm not sure how many of our optimizations require there to be no bb break between things | ||
right | |||
jnthn | Well, and for us reading the graph | ||
I don't think many of them do | |||
Because we have an SSA representation | |||
The problem is that if you consider something like | |||
my $iterator := $seq.iterator; | |||
while (my $val := $iterator.pull-one) !=:= IterationEnd { ...stuff... } | 16:07 | ||
We'd really like to be able to spesh that pull-one call | |||
But we can't | 16:08 | ||
Because $iterator isn't known because of a PHI merge from the 0 node | |||
Because "redo" exceptions take you to the start of the loop | |||
But we don't have a precise representation of such things | |||
timotimo | ah, redo, right | ||
lizmat | jnthn: and $iterator, aka a lexical ? | ||
ah, eyes :-( | |||
jnthn | lizmat: Actually they're not lexicals in the code I'm considering here :) | 16:09 | |
(Lowered to registers) | 16:10 | ||
Anyway, because we stick an edge from 0 from all handlers, we conservatively figure we don't know what's in the register | |||
So we can't do anything about that findmethod op except the usual monomorphic cache trick | 16:11 | ||
Which of course means we don't know the coderef so we can't sp_fastinvoke_o it (or inline it were it small enough) | 16:12 | ||
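A toy model of the problem jnthn describes: at a PHI merge a register's type is only known if every incoming edge agrees, so one conservative edge from the entry block (where the register has not been assigned yet) erases the fact. The names `merge_at_phi`, `UNKNOWN`, and the `"Iterator"` type label are hypothetical; spesh's real facts handling is richer than this.

```python
UNKNOWN = None  # stands for "no type fact available"


def merge_at_phi(incoming_types):
    """Return the type all predecessors agree on, else UNKNOWN."""
    if not incoming_types:
        return UNKNOWN
    first = incoming_types[0]
    if first is not UNKNOWN and all(t == first for t in incoming_types):
        return first
    return UNKNOWN


# Loop header with an extra edge via the entry block (handlers anchored
# at block 0): the register is unassigned along that edge.
conservative = merge_at_phi(["Iterator", "Iterator", UNKNOWN])

# Loop header with only the edges that can really reach it, as a more
# precise handler representation would give.
precise = merge_at_phi(["Iterator", "Iterator"])

if __name__ == "__main__":
    print(conservative)
    print(precise)
```

With the precise edge set the type survives the merge, which is the precondition for resolving the method and invoking or inlining it directly.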
Geth | MoarVM: 8325f0117b | (Jonathan Worthington)++ | src/spesh/stats.c Better no-arg and no-object-arg callsite handling. Give these all a single type specialization. This means that we can now use logged information in performing the optimization of them. |
16:15 | |
16:16
TimToady joined
|
|||
jnthn | Well, tests seem happy and stuff :) | 16:19 | |
So, I think that's my spesh hackery for this week. | 16:21 | ||
Overall status now is that we have (a) a much better data set to work on, (b) a LOT less spesh bugs and various historical workarounds cleared up, (c) the ability to inline/fastinvoke more things already (and potential for a bunch more), (d) are doing the work on a background thread | 16:24 | ||
16:25
travis-ci joined
|
|||
travis-ci | MoarVM build passed. Jonathan Worthington 'Bump spesh stats version on updates. | 16:25 | |
travis-ci.org/MoarVM/MoarVM/builds/261063197 github.com/MoarVM/MoarVM/compare/9...53063fa0b2 | |||
16:25
travis-ci left
|
|||
nwc10 | so even single threaded "traditional" code can exploit both cores of one's smartwatch... | 16:25 | |
Zoffix | \o/ | ||
jnthn | This has been...a lot of work. | ||
Zoffix | jnthn++ | ||
nwc10 | jnthn++ | ||
I hope that past jnthn has rewarded future jnthn with a suitably filled beer fridge | 16:26 | ||
timotimo | nwc10: not terribly likely that the usage of the second processor will go up very far | ||
unless we make our spesh optimizer do a whole lot more stuff | |||
nwc10 | which is now a more sensible option to choose/explore, because it's not a direct slowdown on (under loaded) multi-core machines | 16:27 | |
but "make the CPU hot because we can" isn't an end in itself :-) | 16:28 | ||
We could make the spesh thread mine bitcoin when it would otherwise be idle, and donate to www.perlfoundation.org/perl_6_core_...pment_fund | 16:29 | ||
this is neither cost effective nor ethical | |||
jnthn | Talking of TPF... | 16:30 | |
Well, or said fund | |||
Around 4 weeks ago I was told the grant was approved, at which point I put aside all other work so I could focus full time on getting this spesh stuff done, which it really needed. Then a couple of days after I'd dug in, I was told they would announce it after completing one internal legal procedure. | 16:31 | ||
So anyway, I've actually been working for about 4 weeks full time before TPF put up the announcement. | 16:32 | ||
Which means there's rather *less* than 200 hours left by this point. | |||
nwc10 | "yay" | ||
Zoffix | Understandable. | 16:36 | |
We need to get jnthn++ more money \o/ | |||
I think I know of a way to get a bit, actually. Gonna think about it over the weekend. | 16:37 | ||
like 1-2 grand | |||
16:41
edehont joined
|
|||
jnthn | :) | 16:41 | |
jnthn wonders how many weeks escape analysis and related opts would need... | 16:42 | ||
nwc10 | 1) what would be the estimated benefit | 16:43 | |
2) does most of the benefit only happen near the end of the work, or is some of it incremental? | |||
and i guess even | |||
3) does it open up more stuff that other folks could do | |||
jnthn | Good question. My guess on (1) is "fairly high" for Perl 6 in that at the moment: | 16:44 | |
nwc10 | (not that it's obvious who might be other folks) | ||
jnthn | my $a = foo(); bar($a); | ||
Where bar is sub bar($b) { } | |||
GC allocates a Scalar container | |||
That kind of thing is very common | 16:45 | ||
nwc10 | .tell brrt 'works on "my" machine' - ie ASAN views origin/even-moar-jit as boring. | ||
jnthn | So I suspect a *lot* of Scalars would be possible to avoid allocating using the GC | ||
Probably a number of temporary box/unbox too | 16:46 | ||
2 - probably partly incremental in that you can do the inter-procedural bit later, but given nearly everything in Perl 6 is a procedure call... :) | |||
(Before inlining) | 16:47 | ||
Granted you'd do this after inlining | |||
timotimo | well, we already do inlining :) | ||
jnthn | Right :) | 16:49 | |
I mean you'd inline *then* perform the analysis | |||
timotimo | oh, yes indeed | ||
16:50
travis-ci joined
|
|||
travis-ci | MoarVM build passed. Jonathan Worthington 'Better no-arg and no-object-arg callsite handling. | 16:50 | |
travis-ci.org/MoarVM/MoarVM/builds/261069689 github.com/MoarVM/MoarVM/compare/8...25f0117b78 | |||
16:50
travis-ci left
|
|||
timotimo | the spvm author wrote a post "spvm is now 6x faster than perl 5.26" and says that'll go up to 20x when a jit is implemented | 16:50 | |
16:50
brrt joined
|
|||
timotimo | i'm hoping to see the actual code they used to get to that number | 16:51 | |
jnthn | Yeah, I was curious about the claim, but figured I can wait for the results :) | 16:52 | |
timotimo | i see moarvm.org/measurements has entries for recent days, too. they are all 0 bytes big, though | ||
nwc10 | I think that it might be "6x faster than Perl 5.26 for the things that spvm can do" but this part is unclear | 16:53 | |
jnthn | I figured it was "for a very specific set of benchmarks" | 16:54 | |
timotimo | yeah, spvm is "for fast computations", which i assume means arithmetic and such | ||
brrt | nwc10: thats good news | 16:55 | |
timotimo | i wonder if there's anything we can do to get faster random numbers | ||
nwc10 | the impression I got (not sure if I read this, inferred it, or plain got it wrong) was that the intent was to work outwards and increase the scope of the things that spvm can do | ||
timotimo | building a list from ^100 .roll(1_000_000) takes us about 6.3 seconds on my machine | ||
running [+] on that afterwards takes only 0.8s which is nice, but not stellar | 16:56 | ||
brrt | i did see some broken spectests when running | ||
timotimo | aha, 67% time spent in a push-all inside range.pm, which is osr'd + speshed, but not jitted | 16:57 | |
brrt | on even-moar-jit | ||
timotimo | hah, we don't jit rand_I | 16:59 | |
that's easy! | |||
jnthn | nwc10: Yeah, keeping the speed up while adding features is the tricky part :) | 17:01 | |
nwc10: I glanced at how scope handling worked and it looked like it'd need a good re-work for closures, for example. | |||
timotimo | i wonder if we should have a specific non-bigint rand function that can do completely without Int | 17:03 | |
jnthn | Righty, I'm off to cook | 17:04 | |
bbl | |||
timotimo | good cookin' | ||
nwc10 | that's a lot more thorough than my level of inspection. (clearly I slack) | ||
timotimo | actually, "just" a special case inside bigint_rand for smallbigint would be a big win already | 17:06 | |
oh, we use mp_rand, though | |||
jitting this made hardly a difference, clearly because it spends most of its time doing other stuff | 17:17 | ||
fascinating. even though $!min is 0, adding that to the result of rand_I takes about 16.8% of the total time inside push-all | 17:19 | ||
it's probably rather bad that we force the bigint to full big int before every single rand_I | 17:21 | ||
hum. we're using mp_mod to get a number in the range that the user was asking for | 17:36 | ||
isn't that a bad idea for distributions of random numbers? | |||
gtg, but i'll continue making rand_I faster inside moarvm | 17:41 | ||
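A sketch of the concern timotimo raises about reducing a random value with a modulo. It enumerates every value of a tiny source range instead of sampling, so the bias is exact. The function names and the 8-value source are assumptions for illustration. How much this matters for rand_I depends on how many random bits are drawn relative to the requested maximum, which the log does not say. The second function shows rejection sampling, a standard alternative that is not claimed to be what MoarVM does.

```python
from collections import Counter


def modulo_counts(source_size, n):
    """How often each result occurs if every x in [0, source_size) is
    reduced with x % n."""
    return Counter(x % n for x in range(source_size))


def rejection_counts(source_size, n):
    """Same, but values at or above the largest multiple of n are
    discarded, as a rejection sampler would redraw them."""
    limit = source_size - source_size % n
    return Counter(x % n for x in range(source_size) if x < limit)


biased = modulo_counts(8, 3)
unbiased = rejection_counts(8, 3)

if __name__ == "__main__":
    print(sorted(biased.items()))
    print(sorted(unbiased.items()))
```

When the source range is much larger than `n`, the skew shrinks to at most one extra hit per result out of many, so a wide source makes the bias small without removing it.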
17:42
hoelzro joined
|
|||
Geth | MoarVM: MasterDuke17++ created pull request #624: Fix spelling in comments |
17:42 | |
MoarVM: d1951981ef | MasterDuke17++ (committed using GitHub Web editor) | src/math/bigintops.c Fix spelling in comments |
17:43 | ||
MoarVM: 604da4d062 | lizmat++ (committed using GitHub Web editor) | src/math/bigintops.c Merge pull request #624 from MasterDuke17/patch-2 Fix spelling in comments |
|||
nwc10 | jnthn: ASAN does not wish to comment on MoarVM at this time. (neither master nor even-moar-jit generate any excitement) | 18:13 | |
lizmat wonders if this is a good moment to bump | 18:35 | ||
nwc10 | while jnthn is eating? :-) | ||
[Coke] | GO GO GO GO | ||
18:38
brrt joined
|
|||
japhb | lizmat: Given that jnthn++ called it a week, I'd say go for it. | 18:43 | |
brrt | if you bump, i'll merge master, and create a pull request | 18:56 | |
jnthn | OMG. even-moar-jit time? :D | 19:00 | |
19:00
robertle joined
|
|||
timotimo | \o/ | 19:03 | |
brrt | uhuh | ||
i can't promise a completely clean spectest | |||
jnthn | Is it dirtier than before? :) | 19:05 | |
Zoffix | :o | 19:07 | |
wow cool :D | |||
brrt | to be honest, i don't run master so often :-) | 19:08 | |
Geth | MoarVM/even-moar-jit: 26 commits pushed by MasterDuke17++, (Jonathan Worthington)++, lizmat++, (Bart Wiegmans)++ review: github.com/MoarVM/MoarVM/compare/8...14c3f6820a |
||
jnthn | 2017.06 did a huge I/O overhaul. 2017.07 did a ton of string/Unicode stuff. 2017.08 will be huge amount of spesh changes + huge amount of JIT changes + re-implemented GC sync up :P | 19:09 | |
Crazy times ;) | 19:10 | ||
Zoffix | :) | ||
timotimo | so ... with my changes to sidestep bigint arithmetic for small rand_I max values i get different numbers for the same srand initialization value | 19:13 | |
should i bother trying to make that compatible? | |||
jnthn: opinions? | 19:14 | ||
i mean, you can switch tommath between rand and arc4random with a define flag | 19:15 | ||
jnthn | Hmmm | ||
I guess it behaves predictably still, just different values? | 19:16 | ||
timotimo | yeah | ||
jnthn | Probably it's OK | 19:17 | |
timotimo | thanks, let me measure the speedup now | ||
huh, i thought this was faster before | 19:19 | ||
oh, because of the .say | |||
ah, yes | |||
Geth | MoarVM: bdw++ created pull request #625: Merge new 'expression' JIT backend |
19:20 | |
timotimo | m: say 1.5 / 5.5 | 19:21 | |
camelia | 0.272727␤ | ||
timotimo | almost but not quite 4x | ||
m: say (87512 + 87472 + 94660 + 94552 + 94692) / (103676 + 103752 + 103564 + 104184 + 104068) | 19:22 | ||
camelia | 0.8837618␤ | ||
timotimo | that's the memory usage difference | 19:23 | |
[Coke] | timotimo: between what and what? | ||
timotimo | srand(1); say [+] ^1000 .roll(1_000_000) | 19:24 | |
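The two ratios above, recomputed as a sketch. It assumes that 5.5 and 1.5 are the before and after timings and that the first group of five numbers is the patched run's memory readings. The log gives neither the units nor an explicit labelling, so the variable names are guesses.

```python
# Assumed labelling: "patched" is the numerator quoted in the log,
# "baseline" the denominator.
patched = [87512, 87472, 94660, 94552, 94692]
baseline = [103676, 103752, 103564, 104184, 104068]

memory_ratio = sum(patched) / sum(baseline)
speedup = 5.5 / 1.5  # inverse of the 1.5 / 5.5 ratio quoted above

if __name__ == "__main__":
    print(round(memory_ratio, 7))
    print(round(speedup, 2))
```

A speedup of roughly 3.67 matches "almost but not quite 4x", and the memory ratio corresponds to about 12% less memory across the five runs.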
brrt | okay, weekend for me | 19:25 | |
Geth | MoarVM/speed_up_rand_I: db4997d87c | (Timo Paulssen)++ | 4 files jit rand_I and also give it a smallbigint path almost 4x faster for max values that fit in 32bit |
||
timotimo | [Coke]: ^- this is the patch; the jitting itself made hardly a difference | 19:26 | |
jnthn | brrt++ # now I've got my weekend reading, I guess :) | 19:30 | |
timotimo | huh, infix:<+> is 50% interped, 50% jitted | 19:36 | |
how does that happen | |||
Zoffix | brrt, how long have you worked on that branch? (github shows earliest as Jan 2016, but says it's showing only latest 250 commits) | ||
timotimo | no bails in the jitlog at least | 19:40 | |
19:45
brrt joined
|
|||
brrt | Zoffix: June 2015 iirc | 19:45 | |
jnthn: have fun :-) | |||
Zoffix | wow. brrt++ | 19:47 | |
brrt | and its been over 300 commits for sure | ||
20:39
AlexDaniel joined
|
|||
nwc10 | ASAN considers origin/even-moar-jit to be unworthy of comment | 21:44 | |
(we do not live in interesting times) | |||
21:44
markmont joined
|
|||
japhb | Speak for yourself, nwc10. ;-) | 21:59 | |
22:05
yoleaux joined
|
|||
timotimo | my branch survives spectests | 22:09 | |
but i haven't tried asan | |||
trying now | 22:11 | ||
oh | |||
nwc10: you turn off the leak detector, right? | |||
cool, it's happy | 22:31 | ||
Geth | MoarVM: db4997d87c | (Timo Paulssen)++ | 4 files jit rand_I and also give it a smallbigint path almost 4x faster for max values that fit in 32bit |
22:33 | |
MoarVM: 600c2e9cf2 | (Timo Paulssen)++ | 4 files Merge branch 'speed_up_rand_I' adds a smallbigint path to MVM_bigint_rand so we can get around allocating an mp_int to hold the max value as real bigint and doing a mod on the mp_int to get the real value. |
|||
MasterDuke | timotimo, brrt: is this something that the jit could do directly in assembly? github.com/MoarVM/MoarVM/blob/mast...#L620-L639 | 22:37 | |
timotimo | the expr jit can surely do this | 22:38 | |
is pow_i currently just unjitted and thus breaks some jit compilation? | |||
MasterDuke | currently unjitted, haven't checked any jit logs | 22:39 |