github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018. |
|||
00:16
Kaeipi left,
Kaeipi joined
00:33
Kaeipi left
00:34
Kaeipi joined
00:50
leont left
02:01
lucasb left
04:01
squashable6 left
04:03
squashable6 joined
04:37
raku-bridge1 joined,
raku-bridge1 left,
raku-bridge1 joined
04:38
raku-bridge left
04:39
raku-bridge1 is now known as raku-bridge
06:15
sena_kun joined
07:07
Kaeipi left
07:17
Kaiepi joined
07:29
Kaiepi left
07:30
Kaiepi joined
07:35
domidumont joined
08:10
Altai-man joined
08:13
sena_kun left
08:38
zakharyas joined
08:53
domidumont left
|
|||
nine | MasterDuke: still no joy. Even on those precise commits now, I just can't get it to segfault. I don't even get an error. It seems to work just fine. | 08:54 | |
tellable6 | nine, I'll pass your message to MasterDuke | ||
08:57
domidumont joined
09:24
domidumont left
09:38
Kaiepi left
09:39
domidumont joined
09:46
domidumont1 joined
09:48
domidumont left
|
|||
timotimo | hm. so would the WHICH of an Int have to be human-readable? because if we didn't render it in base10, but in whatever we like that's faster, that could be a good improvement on some workloads that check ints against presence in a hash or set | 09:50 | |
nine | I don't think WHICH is meant to be human-readable | 09:56 | |
09:59
domidumont1 left
10:02
domidumont joined
|
|||
jnthn | No, it doesn't need to be | 10:20 | |
nine | Actually it doesn't even have to be a string. We expect an ObjAt. We assume that to be unboxable to a string in some places but I guess that's not strictly required | 10:23 | |
10:40
Kaiepi joined
|
|||
lizmat | Stiil, I'm pretty sure Int.WHICH will be inlined | 10:42 | |
if it would be possible to dispatch on *exact* type versus type match, it could be made even simpler | 10:43 | ||
nine | It will probably be inlined, yes. But avoiding stringification altogether would be even nicer | ||
lizmat | Int:D:X: would match an *exact* Int:D, not a subtype ? | 10:44 | |
Int:U:X: would match the Int type object and not a subtype of Int | |||
jnthn | I don't think ObjAt should be string based really | ||
nine | lizmat: why is type matching a problem? | 10:45 | |
jnthn | lizmat: If it's on the invocant, then submethod :) | ||
lizmat | hmmmm.... | ||
intriguiing | |||
jnthn: if not a str, then what ? | 10:46 | ||
nine | WHICH is used for defining equality of objects. An equality test includes same type and same WHICH value. So the ObjAt implementation may actually depend on the type. | 10:48 | |
lizmat | can you have a submethod Foo *and* a multi method Foo in the same class? | 10:49 | |
if not, how would I handle subclasses inside the class ? | |||
jnthn | lizmat: No, you can't, so it's only useful in some cases | ||
timotimo | anyway, yeah, i was refering entirely to the amount of cpu time spent inside moarvm when turning an int into a string | ||
jnthn | It's not realy designed for that :) | 10:50 | |
lizmat | jnthn: ok, so not useful in the Int.WHICH case then | ||
nqp::tostr_I(self) is pretty fast, is it not? | |||
timotimo | i don't know if stringifying to base 16 is alredy a lot faster than 10 | ||
lizmat | ah, like that... | 10:51 | |
timotimo | let me dig out the collatz code again | ||
lizmat | timotimo: looks like there is no difference in base_I(foo,16) and tostr_I(foo) | 10:53 | |
nine | But there could be | 10:54 | |
An optimized base 16 stringification must be faster than an optimized base 10 stringification | |||
lizmat | yeah, looks like it *does* make a difference with --optimize=0 | 10:55 | |
timotimo | whoops that froze my computer for a little bit ... | ||
lizmat | so maybe my benchmark is just measuring the loop | ||
timotimo | try measuring it in nqp land | ||
jnthn | If you don't use the result then it may be optimized out :) | 10:56 | |
lizmat | yeah... :-) | 10:57 | |
he... I just made a selfi :-) | 10:58 | ||
(vim artefact of insertion after "self" :-) | 10:59 | ||
timotimo | 0.91% perl6 libmoar.so [.] MVM_coerce_i_s | ||
hmm | |||
did i misremember? | 11:00 | ||
42.20% perl6 libmoar.so [.] MVMHash_gc_mark | |||
nine | Do we miss an HASH_ITER_FAST equivalent | 11:04 | |
? | |||
Looks like gc_mark could benefit from hash iteration where we know that the hash cannot be modified during the iteration | |||
lizmat | there are plenty of situations in the settings where we know the hash cannot be modified during iteration | 11:06 | |
if there's some flag that can be set... tell me :-) | 11:07 | ||
or an nqp::hash_fixed op :-) | |||
timotimo | inside gc_mark it's also impossible for a hash to be modified during the iteration | 11:14 | |
nine | oh, how_ | ||
_ | |||
timotimo | hm? | 11:15 | |
nine | How can that hash be modified during gc_mark? | ||
timotimo | it can not | 11:16 | |
nine | oh, "impossible" not "possible" | 11:17 | |
timotimo | need more coffee? :) | ||
nine | apparently :) | ||
timotimo | also, it's not that great of an idea to have such a short particle invert the meaning of a word | ||
lizmat | yeah, it's anormal | 11:18 | |
timotimo | and amoral | 11:19 | |
lizmat | TIL anormal is not english :-) | ||
nwc10 | abnormal | ||
lizmat | yeah, that :-) | ||
nwc10 | no, don't ask me which of (maybe 10) negation prefixes is the right one for any given word | ||
lizmat | I was thinking of a shorter negation prefix :-) | ||
works in Dutch and German :-) | 11:20 | ||
nwc10 | atonal is the negation of tonal | ||
lizmat | :-) | ||
nwc10 | achromatic | ||
lizmat | .oO( ouch! ) |
||
jnthn | The number of negating prefixes in English is one of those "how does anyone ever learn this" moments :) | 11:21 | |
lizmat | nwc10: it's not all black and white :-) | ||
timotimo | black and ablack | 11:22 | |
jnthn | For all its horrors, Czech is at least simple in this regard :) | ||
.oO( Nice waterbed you have there... ) |
|||
nwc10 | immoral illogical irrational insane | ||
and yes, amoral and immoral are different things :-) | 11:23 | ||
lizmat | re base_I vs tostr_I: github.com/rakudo/rakudo/commit/69f3e959b5 | 11:25 | |
timotimo | whoa, orders of magnitude faster? | 11:30 | |
have we mentioned flammable and inflammable yet? | 11:32 | ||
can has measurements for base_I and tostr_I? :3 | 11:33 | ||
lizmat | ah, I guess I had one 0 less in one of the benchmarks :-( | 11:34 | |
yeah, it's more like a few % :-( | 11:35 | ||
timotimo | d'oh | ||
lizmat | meh | ||
nine | well faster is faster... | 11:36 | |
Geth_ | MoarVM/thread_local-doh: ec50786290 | (Nicholas Clark)++ | 3 files D'oh! We need to set MVM_running_threads_context. Add `MVM_set_running_threads_context` which sets the value to be returned from `MVM_get_running_threads_context`. This fixes a bug that I introduced in commits 4cfde6edf15b0bc0 and ac941c0d59286528. It was not spotted because the only place that currently calls `MVM_get_running_threads_context` doesn't ever actually need the value. |
11:37 | |
nine | Since the resulting strings will be shorter, string comparison will be a bit faster, too | ||
timotimo | we do fastpath comparison with the cached hash values though | ||
Geth_ | MoarVM: nwc10++ created pull request #1389: D'oh! We need to set MVM_running_threads_context. |
||
timotimo | we do mp_radix_size for Int in non-10 bases | 11:39 | |
er | |||
more exactly, for small bigints we do, otherwise it all goes through libtommath | |||
no mp_radix_size is just for calculating how big the number would be in the given base | 11:44 | ||
mp_to_radix is the function that does the actual stringification | |||
that doesn't seem to have a very optimized implementation for any specific bases | 11:45 | ||
but also, we can use base 64 for WHICH if we want | 11:46 | ||
it will at least do fewer divisions of the number | |||
Geth_ | MoarVM/exception-in-spesh-should-oops: 998ea76a17 | (Nicholas Clark)++ | src/core/exceptions.c Calling `MVM_exception_throw_adhoc` in the spesh worker should be an oops. Similarly no exceptions should be thrown in the event loop thread. Also report these two special threads in `MVM_oops`, and treat a NULL thread context pointer in oops as a bug worthy of a panic. (Previously it would have "panic"ed because it would have crash due to the NULL pointer dereference. This way, we give slightly better diagnostics.) |
12:06 | |
MoarVM/exception-in-spesh-should-oops: fa9b6659a1 | (Nicholas Clark)++ | src/core/exceptions.c Use simpler stdio calls in exceptions.c where possible. fputc('\n', stderr) is simpler than fprintf(...) or fwrite(...) Because `fputs` is not varargs, the caller has a simpler ABI than `fprintf` on some architectures (eg x86_64). (Well, that's the theory anyway. In practice, it seems that a new enough gcc optimises simple cases of either into calls to `fwrite`.) |
|||
timotimo | nwc10: so is str_hash_first + !str_hash_at_end + str_hash_next_nocheck already the fastest we can do? | 12:10 | |
can we be any faster by knowing the number of entries up front? i imagine not | |||
wow github went a little wild and shows me five copies of lizmat's comment on the commit | 12:11 | ||
12:12
sena_kun joined
|
|||
Geth_ | MoarVM: nwc10++ created pull request #1390: Exceptions in spesh should oops |
12:12 | |
12:13
Altai-man left
|
|||
nwc10 | I think it is. And end is 0, so that's a fast chcek | 12:14 | |
timotimo | OK | ||
we iterate backwards through the hash storage? | |||
nwc10 | yes. | ||
timotimo | can we prefetch the whole body? | 12:15 | |
nwc10 | ask your CPU :-) | ||
in that, (a) I don't exactly know how to do this, and certainly not portable | |||
and (b) wouldn't the CPU spot this after a while | |||
timotimo | don't know actually | 12:17 | |
Geth_ | MoarVM/thread_local-nativecall: 0c3a38fad8 | (Nicholas Clark)++ | 8 files Move MVM_{set,get}_running_threads_context to threadcontext.h As this now includes "platform/threads.h" remove that include from several C source files. |
12:19 | |
MoarVM/thread_local-nativecall: 34e0686435 | (Nicholas Clark)++ | 2 files Simplify and inline `MVM_nativecall_find_thread_context` Use `MVM_get_running_threads_context` to get the thread context directly, instead of needing to take a mutex and iterate over the linked list of threads. As the function is only used in one place, and is now much smaller, convert it to an inline function, which means moving it into the header file. |
|||
timotimo | gcc.gnu.org/projects/prefetch.html - looking at this page now | ||
nwc10 | I'm not quite sure how to open this sentance, but right now the hash code on master isn't the finished product, so it's a bit awkward if we think we want to go patch *it* | 12:21 | |
timotimo | OK, no prob :) | ||
nwc10 | there are two pull requests (sort-of) patiently waiting for review | ||
Geth_ | MoarVM: nwc10++ created pull request #1391: Implement MVM_nativecall_find_thread_context using MVM_get_running_threads_context |
12:23 | |
nwc10 | and the version in the PRs uses a single block of memory, so likely it is already more cache (and maybe prefetch) friendly than what is in master | 12:27 | |
12:34
leont joined
|
|||
timotimo | i wish i had the brain capacity to review them | 12:46 | |
i seem to recall they are rather big? | |||
nwc10 | if you squash them, yes. But not if you don't. (How contrary) :-) | 12:52 | |
sequence of smaller commits. But yes, quite a lot | |||
12:53
zakharyas left
|
|||
Geth_ | MoarVM: ec50786290 | (Nicholas Clark)++ | 3 files D'oh! We need to set MVM_running_threads_context. Add `MVM_set_running_threads_context` which sets the value to be returned from `MVM_get_running_threads_context`. This fixes a bug that I introduced in commits 4cfde6edf15b0bc0 and ac941c0d59286528. It was not spotted because the only place that currently calls `MVM_get_running_threads_context` doesn't ever actually need the value. |
14:17 | |
nwc10 | you've seen that already. That's a fast forward of master | 14:18 | |
14:18
MasterDuke joined
|
|||
MasterDuke | nine: odd. i actually don't get the segv on my use_fsa_for_vmarray branch, but i get it every time on master | 14:23 | |
tellable6 | 2020-11-24T08:54:45Z #moarvm <nine> MasterDuke: still no joy. Even on those precise commits now, I just can't get it to segfault. I don't even get an error. It seems to work just fine. | ||
MasterDuke | but if you're trying to chase it down i could send an rr recording | ||
timotimo | mhh, invoking code from the debugserver crashes when it had been suspended when it came out of "sleep", i assume since sleep isn't :invokish | 14:31 | |
MasterDuke | re converting strings to integers, i know gmp is much faster than tommath, but i think i was mainly testing with biggish integers, don't remember if i checked more normal sized ones. it has some optimizations for bases that are powers of 2 | 14:40 | |
sena_kun | Was migration to gmp blocked by something? | 14:47 | |
MasterDuke | uh, am i crazy? i see nqp::base_I($i, 16) as being 10x slower than nqp::base_I($i, 10) | ||
sena_kun: yeah, i can't get it to build a libgmp.a, it fails during linking. the conversion of the MoarVM source code is done (well, except for catching up to some recent fixes nwc10++ made) | 14:49 | ||
no reply yet to gmplib.org/list-archives/gmp-discu...06587.html | 14:50 | ||
sena_kun | :/ | ||
MasterDuke | i can run by using my system libgmp, and it passes spectest, it's just building that's a blocker | 14:51 | |
m: use nqp; my $a; my $s = now; for ^10_000_000 -> Int $i { $a = nqp::base_I($i, 16) }; say now - $s; say $a | |||
camelia | 5.85934862 98967F |
||
MasterDuke | m: use nqp; my $a; my $s = now; for ^10_000_000 -> Int $i { $a = nqp::base_I($i, 10) }; say now - $s; say $a | 14:52 | |
camelia | 1.66136848 9999999 |
||
MasterDuke | m: use nqp; my $a; my $s = now; for ^10_000_000 -> Int $i { $a = nqp::tostr_I($i) }; say now - $s; say $a | ||
camelia | 1.6944929 9999999 |
||
MasterDuke | lizmat: you benched base_I as faster than tostr_I? | 14:54 | |
timotimo | m: use nqp; my $a; my $s = now; for ^10_000_000 -> Int $i { $a = nqp::base_I($i,16) }; say now - $s; say $a | 14:55 | |
camelia | 5.748907 98967F |
||
timotimo | huh. odd | ||
MasterDuke | we currently have a fast path for base-10 | ||
timotimo | ah | 14:56 | |
dang. | |||
MasterDuke | oh wait, maybe that's just for small bigints converting to base 10 | 14:59 | |
github.com/MoarVM/MoarVM/blob/mast...1076-L1078 yep, otherwise everything uses mp_to_radix | 15:02 | ||
timotimo | yes | ||
MasterDuke | but i'm pretty sure tommath doesn't have any optimizations/fast paths for different bases, it's all just division by the base. so why in the world is 16 slower? | 15:05 | |
oh, because those are mostly small bigints | 15:07 | ||
15:08
zakharyas joined
|
|||
MasterDuke | yeah, all the time is spent in coerce_i_s compared to mp_clear+mp_div_2d+mp_div_d+mp_set_u64 | 15:09 | |
m: use nqp; my $a; my $s = now; for 10_000_000_000..10_001_000_000 -> Int $i { $a = nqp::base_I($i, 10) }; say now - $s; say $a | 15:12 | ||
camelia | 3.91260929 10001000000 |
||
MasterDuke | m: use nqp; my $a; my $s = now; for 10_000_000_000..10_001_000_000 -> Int $i { $a = nqp::base_I($i, 16) }; say now - $s; say $a | ||
camelia | 2.7878906 2541B2640 |
||
MasterDuke | results as expected with not-small bigints | ||
timotimo | m: use nqp; my $a; my $s = now; for 10_000_000_000..10_001_000_000 -> Int $i { $a = nqp::base_I($i, 64) }; say now - $s; say $a | 15:16 | |
camelia | 2.66220716 9K6oP0 |
||
MasterDuke | but that does mean that tostr_I (i.e., base-10) will be faster for most of our uses of WHICH, right? | 15:22 | |
15:58
MasterDuke left
16:11
Altai-man joined,
MasterDuke joined
16:13
sena_kun left
16:44
zakharyas left
16:45
zakharyas joined
18:02
domidumont left
18:16
domidumont joined
18:20
domidumont left
18:51
zakharyas left
20:12
sena_kun joined
20:13
Altai-man left
|
|||
MasterDuke | this comment github.com/MoarVM/MoarVM/blob/mast...e.h#L8-L12 seems relevant to what i was planning to do with MVMArrayBody. do we want those characteristics for arrays? | 20:30 | |
20:43
zakharyas joined
20:51
zakharyas1 joined,
zakharyas left
21:00
MasterDuke left
21:21
Kaiepi left
21:22
MasterDuke joined,
Kaiepi joined
|
|||
Geth_ | MoarVM: 0c3a38fad8 | (Nicholas Clark)++ | 8 files Move MVM_{set,get}_running_threads_context to threadcontext.h As this now includes "platform/threads.h" remove that include from several C source files. |
21:23 | |
MoarVM: 34e0686435 | (Nicholas Clark)++ | 2 files Simplify and inline `MVM_nativecall_find_thread_context` Use `MVM_get_running_threads_context` to get the thread context directly, instead of needing to take a mutex and iterate over the linked list of threads. As the function is only used in one place, and is now much smaller, convert it to an inline function, which means moving it into the header file. |
|||
MoarVM: bc219078d8 | (Jonathan Worthington)++ (committed using GitHub Web editor) | 9 files Merge pull request #1391 from MoarVM/thread_local-nativecall Implement MVM_nativecall_find_thread_context using MVM_get_running_threads_context |
|||
MoarVM: 3838247ea7 | (Ben Davies)++ | src/jit/x64/emit.dasc Add word and byte-sized return value macros to the lego JIT This makes it so shorthand exists for any size of return value. |
21:30 | ||
MoarVM: bfce05394d | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/jit/x64/emit.dasc Merge pull request #1331 from Kaiepi/lego-jit-rv Add word and byte-sized return value macros to the lego JIT |
|||
21:31
sena_kun left
21:33
zakharyas1 left
|
|||
jnthn | MasterDuke: The motivation is a little more involved there, but at least partly the same, yes | 21:36 | |
MasterDuke | ok, good. i was comparing with MVMHash because i remembered nwc10's comments about it being two chunks currently, but planned for one, and it looked a bit different. wanted to make sure what i was planning would work ok | 21:59 | |
22:31
MasterDuke left
23:22
travis-ci joined
|
|||
travis-ci | MoarVM build errored. Nicholas Clark 'D'oh! We need to set MVM_running_threads_context. | 23:22 | |
travis-ci.org/MoarVM/MoarVM/builds/745645098 github.com/MoarVM/MoarVM/compare/a...507862902e | |||
23:22
travis-ci left
23:38
leont left
|