github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018. |
|||
00:06
Kaiepi joined
00:34
sivoais left,
sivoais joined
02:26
squashable6 left
02:30
squashable6 joined
05:40
Woodi left
05:41
domidumont joined
05:59
squashable6 left
06:02
squashable6 joined
06:05
Elronnd joined
06:18
brrt joined
06:23
Woodi joined
07:08
robertle joined
07:10
lizmat joined
07:20
lizmat left
08:25
zakharyas joined
|
|||
brrt | \o | 08:45 | |
nwc10 | o/ | 09:11 | |
brrt | our memmem.h header isn't really acting like a proper header | 09:13 | |
09:23
domidumont1 joined,
brrt left
09:26
domidumont left
|
|||
jnthn | So while looking at the profile of a Rat benchmark, I spotted some easy wins (provided spectest agrees) in our bigint stuff | 09:31 | |
09:47
brrt joined
|
|||
Geth | MoarVM: c5545770ca | (Jonathan Worthington)++ | src/math/bigintops.c Use mp_set_int when we know we have a 32-bit int It's quite a lot cheaper. |
09:49 | |
MoarVM: aad6fdc0bd | (Jonathan Worthington)++ | 3 files Greatly reduce allocation churn on bigint ops When we get an operation where we have a big integer and a small big integer, we have to upgrade the small one to do the operation. We used to `malloc` and `mp_init` a fresh `mp_int` to use every time, and then `mp_clear` and `free` it at the end. We can get rid of this cost by holding a few `mp_int` objects on the `ThreadContext` and re-using them for all of these operations. This shaves around a third off a Rat addition benchmark. |
|||
MoarVM/more-pea: 38 commits pushed by (Jonathan Worthington)++ review: github.com/MoarVM/MoarVM/compare/e...91f9c14d5d |
10:01 | ||
timotimo | whoa | 10:17 | |
10:17
sena_kun joined
10:26
klapperl left
10:33
klapperl joined
|
|||
jnthn | Might have found another 15% off that improved time too | 10:37 | |
brrt | :-o | 10:38 | |
jnthn | And that's before further improvements in the EA to devirt some more bigint operations :) | 10:39 | |
brrt | apparently, memmem segv's if it's argument is null | 10:40 | |
... I think if the argument to MVM_VECTOR_CONTAINS is NULL, and MVM_VECTOR_SIZE is 0, then we should not crash | 10:41 | ||
timotimo | yeah | ||
brrt | which is what happens now | 10:42 | |
but in my patch it does crash | |||
oh, hang on, I see the problem | |||
Geth | MoarVM: 2fe2f588fd | (Jonathan Worthington)++ | src/math/bigintops.c Use a cheaper way to force_bigint This does rely on mp_digit being at least 32-bit, but we rely on that elsewhere in multiple other places already. Gives ~15% off a Rat addition benchmark. |
10:47 | |
MoarVM/more-pea: 38 commits pushed by (Jonathan Worthington)++ review: github.com/MoarVM/MoarVM/compare/e...32709e8988 |
10:48 | ||
brrt | - problem nr. 1: I used MVM_VECTOR_ELEM_SIZE for the haystack size, rather than MVM_VECTOR_SIZE | ||
jnthn | oops :) | 10:49 | |
brrt | - problem nr. 2: memmem will treat the needle as a pointer to the data it'll compare, whereas the current implementation was looking for the needle-as-value (with ==) | 10:52 | |
so I need to pass &t rather than t | 10:53 | ||
timotimo | hm. are you sure that memmem is the right thing for MVM_VECTOR_CONTAINS though? since it'll advance by 1 byte during search, if i'm not mistaken? | 10:57 | |
brrt | jnthn: gist.github.com/bdw/993188860117ef...f454ffaa1d | ||
timotimo: is that so.... well... what's the alternative? | 10:58 | ||
I thought memmem was optimized and stuff | |||
timotimo | it is optimized! | ||
brrt | but idk how :-D | ||
timotimo | we do have memmem32, which is a memmem that goes by 4 bytes instead of 1 bytes | ||
brrt | hmm | 10:59 | |
I'll leave that for you to discuss | |||
timotimo | the thing is | ||
brrt is listening | |||
timotimo | the optimization of memmem relies on the needle being able to mismatch at a given position in an "interesting" way | 11:00 | |
i.e. if you have a needle "bbbbbbbb" and a haystack of "aaaaaaaaaaaabbbbbbbb" you can look at the 8th byte, see that it's a, and know that the match can't be at position 8 or earlier at all | |||
but - unless i'm misunderstanding what VECTOR_CONTAINS is supposed to do - you're supposed to get back the index of an element | 11:01 | ||
brrt | ok.... well.. is there a default, library optimized, version of 'find this exact match at these offsets' | ||
timotimo | those don't overlap AFAIK? | ||
11:01
zakharyas left
|
|||
brrt | no, MVM_VECTOR_SEARCH is supposed to return a pointer | 11:01 | |
timotimo | right, a pointer to an element | 11:02 | |
brrt | ok, my understanding is wrong, then | ||
because I need to search only the boundaries | |||
timotimo | and since VECTOR_SEARCH is passing MVM_VECTOR_ELEM_SIZE, it can't be used to search for two consecutive elements either | ||
brrt | I think that's normal, though | ||
but what I don't want, is a half-item | 11:03 | ||
so memmem is, indeed, not what I want | |||
timotimo | that's just a peculiarity of the API we have | ||
i mean the API you're offering | |||
maximum shrug ;) | |||
brrt | :-) | ||
timotimo | i don't think you can get much faster than what the compiler turns MVM_VECTOR_CONTAINS into. that said | ||
the previous version of MVM_VECTOR_CONTAINS uses ==, does that work properly with structs in all cases? | 11:04 | ||
does it compile to something memcmp-like? | |||
i sometimes get confused as to what you're allowed to do with structs when it comes to assignment and comparison | |||
11:08
brrt left
|
|||
jnthn | lunch; bbl | 11:10 | |
jnthn back | 11:54 | ||
timotimo | can't wait to see where that "another 15%" could be from | 11:59 | |
Kaiepi | the stuff that gets discussed here always seems like arcane wizardry | 12:00 | |
timotimo | you can tell by the fact that i'm a part of it that this is all completely learnable | ||
Kaiepi | lol | ||
i haven't really looked into moarvm much outside of bsd compatibility | 12:01 | ||
jnthn | timotimo: It was from the "use a cheaper way to force_bigint" | 12:03 | |
timotimo | oh, i see | ||
yeah, the commit message even mentions the 15% figure again | |||
jnthn | grmbl, I was going to make us devirt/avoid boxing with gcd_I, but turns out I've discovered a case where we're too conservative in BB merging... | 12:05 | |
And so refuse to do things that would be safe enough | |||
timotimo | aaw | 12:16 | |
but that also means when that is fixed, other optimizations might be able to step in in other, similar situations? | 12:17 | ||
along with exprjit boundaries being disappered | |||
jnthn | Yeah | ||
Hmmmm. | |||
Implemented what I thought would be a solution and it does something awful. | |||
12:23
Guest5089 joined
|
|||
jnthn | (It was 'cus I did something silly.) | 12:29 | |
12:37
zakharyas joined
|
|||
Guest5089 | hmm, SEGV in several spectests | 12:49 | |
12:52
brrt joined
|
|||
Guest5089 | perhaps that's what jnthn is grumbling about above | 12:52 | |
disappears with MVM_JIT_DISABLE=1 | 12:54 | ||
timotimo | you're on the more-pea branch? | ||
Guest5089 | master | ||
timotimo | oh | ||
this actually landed on master | |||
Guest5089 | do you think that was a mistake? | 12:55 | |
timotimo | i don't think so, but if it segfaults, that's not so good | ||
Guest5089 | Thread 1 "moar" received signal SIGSEGV, Segmentation fault. | 12:57 | |
jnthn | The bigint stuff went into master, since it's not related to PEA | ||
Guest5089 | 0x00007ffff766e1cd in force_bigint (tc=0x557e4360, body=0x555556f65e80, idx=1) at src/math/bigintops.c:170 | ||
brrt | timotimo: I think you're allowed to use '==' on structs | 12:58 | |
timotimo | that's cool | ||
so it surely even ignores padding bytes? | |||
Guest5089 | jnthn: try running t/spec/S32-temporal/DateTime.t | ||
brrt | not sure if it does ignore padding bytes, no | 12:59 | |
jnthn | Guest5089: I had clean spectest | ||
Distracted with some other things for now | |||
Guest5089 | that's odd, perhaps something is wrong with my install, hmm | ||
timotimo | can you find what exact access makes the thing go asplode? | 13:00 | |
Guest5089 | sec | ||
timotimo | i have a dirty moarvm locally, so rebuilding would be quite a hassle | ||
actually, i might have to go afk for a bit soon | |||
Guest5089 | timotimo: gist.github.com/dogbert17/83ae9222...873328c3bb | 13:01 | |
timotimo | huh, how does that segfault? | 13:05 | |
13:05
MasterDuke left
|
|||
timotimo | the tc seems odd | 13:05 | |
Guest5089 | indeed | 13:06 | |
timotimo | i mean, it's the jit, so the stack gets a little screwed up | 13:07 | |
but still | |||
Guest5089 | (gdb) p *tc | ||
Cannot access memory at address 0x557e4360 | |||
brrt | that does $r14 say | 13:08 | |
Guest5089 | sigh, valgrind seems to hide the problem | 13:10 | |
(gdb) p $r14 | 13:11 | ||
$1 = 93824994348208 | |||
timotimo | m: say 93824994348208.base(16) | ||
camelia | 555555758CB0 | ||
timotimo | that's the proper tc pointer | ||
jnthn | /* Call the bigint comparison function. */ | 13:13 | |
| mov ARG1, tc; | |||
| mov ARG2, WORK[reg_b]; | |||
| mov ARG3, WORK[reg_c]; | |||
Should that tc not be TC ? | |||
e.g. the register holding it? | |||
brrt: Can you spot anything silly in gist.github.com/jnthn/faa05a408bd9...97740f4da0 ? | 13:15 | ||
timotimo | you're working directly with eax, ecx, edx? | 13:16 | |
jnthn | Yeah. 'cus cdq and div work on specific registers | 13:17 | |
timotimo | ah | ||
jnthn | TMP3 and TMP4 are r8 and r9, so no conflicts there | ||
timotimo | of course they do | ||
right, TMP1 and TMP2 are rcx and rdx | 13:18 | ||
jnthn | Yeah, ain't using those :) | ||
timotimo | OK, that's fine at least | ||
i assume the code above the last hunk is also fine, since calling MVM_bigint_fallback_mul probably already worked fine? | |||
i wonder if | jo >1 is fine without braces around it | 13:19 | ||
jnthn | ~Yeah | ||
timotimo | good | ||
good to hear dynasm does The Right Thing there | |||
with regards to amount of statements | |||
jnthn | Wrapping it in braces doesn't help. no | 13:20 | |
timotimo | OK | ||
jnthn | The erorr is utterly strange | ||
timotimo | i've seen worse back when i didn't know about floating point registers in the calling convention :P | ||
jnthn | Cannot invoke null object | 13:21 | |
at SETTING::src/core/Rat.pm6:49 (/home/jnthn/dev/MoarVM/install/bin/../share/perl6/runtime/CORE.setting.moarvm:DIVIDE_NUMBERS) | |||
from SETTING::src/core/Rat.pm6:133 (/home/jnthn/dev/MoarVM/install/bin/../share/perl6/runtime/CORE.setting.moarvm:infix:<+>) | |||
timotimo | where sometimes it just so happened to have the right value in the register regardless | ||
does turning off the exprjit make a difference btw? | |||
jnthn | No | ||
timotimo | OK | 13:23 | |
just to make sure there are no gremlins; make clean everywhere? rakudo does have some extops, but i don't think they access MVMThreadContext fields directly | 13:30 | ||
brrt | jnthn: will look | 13:34 | |
jnthn | I've been putting jmp >1 at various places (that is, esape to the fallback) | ||
brrt | why `xor eax, edx` | 13:35 | |
and, it may be the case that the '4:' label is reused somewhere? | 13:36 | ||
jnthn | brrt: Technique taken from helloacm.com/optimized-abs-functio...-assembly/ | 13:37 | |
Do labels have to be numbers, or can they be names? | |||
But yeah, I've been putting in a jmp >1 in various places (e.g. escape to the slow path) | |||
brrt | global labels can be names | ||
what I typically do is put an 'int 3' and step through | 13:38 | ||
but that works in lldb, not in gdb | |||
jnthn | And if I put it before the jnz >4 all is well | ||
If put it after the jnz >4 (e.g. when we should have fallen out of the loop) I get the error | |||
How do the label numbers actually work? | |||
brrt | luajitish. | ||
but to be more precise, there's a 10-item array of offsets that are computed in one of the three dynasm phases | 13:39 | ||
and updated | |||
jnthn | If I use the label 5 instead of 4 I get a SEGV | ||
brrt | Yes, I think I made label 5 special | 13:40 | |
jnthn | 6 also segvs | ||
timotimo | 4 non-special labels should be enough for anybody | ||
jnthn | 7 too :P | ||
Well, 4 is busted also ;) | |||
timotimo | d'oh :) | ||
brrt | hmm | 13:41 | |
jnthn | Just differently | ||
brrt | :-D | ||
jnthn | But yeah, it seems to be something about the labels | ||
brrt | actually, I think label 4 should be safe to use | ||
oh, hang on | |||
this is a loop backwards, isn't it | |||
jnthn | I don't have to do anything special 'cus it's a back branch, do I? | ||
Yes | |||
brrt | yes, you do | 13:42 | |
no, I think label 4 should be | |||
corsix.github.io/dynasm-doc/instructions.html | |||
under 'Jump Targets' | |||
a backwards jump is spelled 'jmp <4' | 13:43 | ||
jnthn | oh! | ||
brrt | there's a mechanical (i'm going to avoid callit it 'sane') reason for that | ||
jnthn | And there we have it | ||
Thanks, brrt :) | 13:44 | ||
Thanks timotimo for ideas also :) | |||
brrt | :-) | ||
timotimo | glad to see it was figured out :) | ||
jnthn | The devirt of that is worth something; I didn't do the div_I yet so it's materializing it sharpish | 13:45 | |
Geth | MoarVM/more-pea: f7c7db200a | (Jonathan Worthington)++ | src/spesh/pea.c Avoid some spurious PEA bailouts When we reach a merge point in the graph, if the allocation was only seen on one side and not the other, then it cannot possibly need a materialization on the side where it wasn't yet created. We don't insert such materializations yet, but rather bail out; still, fixing this means we bail out in some cases where we didn't need to do so. |
13:53 | |
jnthn | That was the thing that I needed to do so we even attempted gcd_I devirt :) | 13:54 | |
Now, dare I attempt div_I... :) | 13:55 | ||
hah, it won't help yet, for reasons | 14:04 | ||
Geth | MoarVM/more-pea: 0e2be43fc8 | (Jonathan Worthington)++ | src/spesh/facts.c A couple more facts |
14:05 | |
14:30
domidumont joined
14:33
domidumont1 left
14:36
mornfall left
14:37
mornfall joined
14:38
lizmat joined
14:40
lizmat_ joined
14:43
lizmat left
14:47
lizmat_ left
14:49
robertle left
14:58
brrt left
15:23
domidumont left
15:55
zakharyas left
|
|||
Geth | MoarVM/more-pea: 5eff90b3ce | (Jonathan Worthington)++ | 10 files Devirtualization and EA for further bigint ops * gcd_I, with the nice tight GCD algorithm inlined direclty into the machine code * cmp_I, eq_I, ne_I, lt_I, le_I, gt_i, and ge_I |
15:55 | |
16:27
lizmat joined
|
|||
jnthn | I think we do Rat + in about half the time we did before I started on this bigint EA work, plus my early bigint opts. | 16:27 | |
*earlier | |||
And not finished yet :) | 16:28 | ||
Though we do have a slight problem in that I was hoping entire Rat objects would go away, not just the calcs inside them | |||
But it all goes rather awkwardly because of the Num failover | |||
Or at least, that's what'll block it later | 16:29 | ||
16:29
lizmat left
|
|||
jnthn | Still, if we can eliminate all the intermediate allocations during the calc, we're a long way ahead. | 16:29 | |
16:30
lizmat joined
16:55
robertle joined
16:56
domidumont joined
|
|||
Geth | MoarVM/more-pea: da088863a4 | (Jonathan Worthington)++ | 10 files Devirtualize and EA for neg_I, abs_I |
17:09 | |
MoarVM/more-pea: 201e8e6b0b | (Jonathan Worthington)++ | src/spesh/pea.c Log which bigint ops were optimized in EA |
|||
jnthn | I guess the only common ops left to go are div and mod | 17:11 | |
But not today :) | |||
17:13
domidumont left
|
|||
Geth | MoarVM: 248e2980af | (Jonathan Worthington)++ | src/jit/x64/emit.dasc Fix wrong argument in JIT of cmp_I |
17:39 | |
jnthn | That should fix the SEGV that was reported earlier | ||
It's been wrong for ages, and nobody noticed until today when I used the tc argument :P | 17:40 | ||
m: say 306 / 491 | 17:43 | ||
camelia | 0.623218 | ||
jnthn | Quite a lot less GC runs in that Rat benchmark now also :) | ||
timotimo | very nice | 17:45 | |
jnthn | Next week I'll take a look at inserting materializations at merge points that require them, and also at fixing up the representation we produce to get rid of the broken SSA issues | 17:48 | |
And probably put in div_I and mod_I devirt/EA too, though if anyone wants to steal that part (or indeed any of the remaning operators) then go ahead. :) | 17:49 | ||
Note that the branch isn't in a very good state in terms of stuff working :) | |||
I'm getting towards the point where I've done the features I want in it, and then the focus will switch to fixing it up and preparing for merge. | |||
There's at least one problem with deopt left. | 17:50 | ||
17:50
zakharyas joined
|
|||
jnthn | Off to rest :) o/ | 17:53 | |
18:56
zakharyas left
19:31
lizmat left
19:41
scovit left
19:48
scovit joined
22:20
MasterDuke joined,
MasterDuke left,
MasterDuke joined
23:50
Elronnd left
|