00:23 vendethiel joined
timotimo | in my particles benchmark, the "put VMNull everywhere" thing is quite hot, and allocate_frame is at 4.84% Self time :\ | 00:41 | |
hm, i've gotta check if it's compiled with optimizations at all | 00:43 | ||
nope, --optimize=0 | 00:44 | ||
that explains a little bit, i s'pose | |||
oh, did i never merge the "avoid atomic ops when no threads are involved" thing? | 00:48 | ||
actually, i'm not sure acquire_ref and friends are still interesting for frames when they become collectables on the heap | |||
damn, of course the function that is hottest is the one i didn't write an optimized version for | 00:52 | ||
huh, looks like the fence instructions (yes, two of them in a row for some reason) that get inlined from (what i guess to be) gc_worklist_add_frame are quite hot indeed for MVMCode's gc_mark | 00:57 | ||
in the gc_mark function that itself makes up 4.5% Self time | |||
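(A minimal C11 sketch of the "avoid atomic ops when no threads are involved" idea mentioned above; acquire_ref, refcount and the multi_threaded flag are illustrative names for this sketch, not MoarVM's actual API.)

    #include <stdatomic.h>

    typedef struct {
        atomic_uint refcount;
    } Obj;

    /* When only one thread has ever been started, a plain load/store
     * pair is enough; the locked read-modify-write (and the fences it
     * implies) is only needed once a second thread can see the object. */
    static inline void acquire_ref(Obj *o, int multi_threaded) {
        if (multi_threaded)
            atomic_fetch_add_explicit(&o->refcount, 1, memory_order_acq_rel);
        else
            atomic_store_explicit(&o->refcount,
                atomic_load_explicit(&o->refcount, memory_order_relaxed) + 1,
                memory_order_relaxed);
    }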
diakopter | which profiler are you using | ||
timotimo | that's "perf" | 01:00 | |
diakopter | I guess that's a linux thing? | ||
timotimo | yeah, a probabilistic instruction-level profiler | ||
diakopter | those CPU-interrupt-based samplers don't work in VMs, usually :( | 01:01 | |
except in the flagship ones (VMWare, HyperV), I think, if you pay a lot | 01:02 | ||
I mention that only because I have linux in a VM | |||
timotimo | right, i feared as much | 01:03 | |
the inner loop for my particle system has 93.12% inclusive time, and 50.11% self-time | 01:04 | ||
%) | 01:08 | ||
decont the thing, check concreteness, assert parameter check based on the result, decont the original thing again to use it | 01:09 | ||
classic moar bytecode | |||
diakopter | uh huh | 01:10 | |
how much time did you spend with parrot instructions | |||
PIR, I guess I mean | 01:11 | ||
(my memory sadly fails me) | 01:13 | ||
timotimo | not terribly much, i'm afraid | 01:15 | |
why? | 01:16 | ||
diakopter | well just about all our codegen .. wasn't great | ||
timotimo | :) | 01:18 | |
the code gen is still at that quality level in a few isolated places, i'm guessing | 01:20 | ||
in my particle system i've made extra sure to write the code so it'd do the least amount of indirect stuff possible ... and it's now inlining two subs that aren't jitted | 01:21 | ||
diakopter | quite a few places, I'm sure | ||
timotimo | actually, a few more subs are only jitted for a tiny percentage | ||
but also not entered often at all | |||
i wonder how to interpret this correctly | 01:22 | ||
one of them is floor(Num:D:), the other is ASSIGN-POS | |||
floor_n is still unimplemented :R | |||
:) | |||
diakopter | heh | 01:23 | |
timotimo | huh | 01:24 | |
apparently calling .Int on a num (native num?) gives an error | |||
it's not supposed to coerce, then? | |||
no, looked at the wrong thing | |||
cool, got it to jit by using .Int instead of .floor | 01:26 | ||
but .Int isn't inlined any more | |||
diakopter | m: say 9.9e101.Int | 01:28 | |
camelia | rakudo-moar a45224: OUTPUT«989999999999999971062477677470550235220096190889648004812994130017827049653182301025734968880029237248» | ||
diakopter | m: say 9.9e101.Rat | 01:29 | |
camelia | rakudo-moar a45224: OUTPUT«989999999999999971062477677470550235220096190889648004812994130017827049653182301025734968880029237248» | ||
diakopter | I thought for sure Rat would do something smarter | ||
timotimo | m: say 9.9e101.Rat.nude | 01:30 | |
camelia | rakudo-moar a45224: OUTPUT«(989999999999999971062477677470550235220096190889648004812994130017827049653182301025734968880029237248 1)» | ||
timotimo | like what? | ||
have a denominator of 0.00000001 perhaps :D | |||
diakopter | well it could get the raw significant digits from the num | 01:31 | |
timotimo | mhm | ||
i suppose at that point the value already underwent num semantics | |||
m: say (9.9 * 10 ** 101).nude | |||
camelia | rakudo-moar a45224: OUTPUT«(990000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 1)» | ||
timotimo | i mean: floating point semantics | ||
using the e in the middle is how you ask for a floating point value, after all | 01:32 | ||
diakopter | right, but it's still stored as something that displays 9.9 | ||
timotimo | m: say 9.9e101 | 01:33 | |
camelia | rakudo-moar a45224: OUTPUT«9.9e+101» | ||
timotimo | hm. | ||
diakopter | there's a few bits for the value, a few bits for where the decimal point is in the value, a few bits for the exponent | 01:34 | |
probably a bit for the sign, probably a bit for Inf/NaN | 01:35 | ||
a bit or two, I mean | |||
timotimo | nah, Inf and NaN are stored in otherwise redundant cases | ||
diakopter | oh | ||
timotimo | the wikipedia article about IEEE floating point doesn't actually say anything specific about how they look at the bit level | 01:37 | |
which is really strange to me | |||
"sign in or purchase" | 01:38 | ||
fuck you, ieee. i'm poor. | |||
m: use NativeCall; my $foo = Pointer[num64].new(NaN); say $foo; say nativecast(Pointer[int64], $foo).perl | 01:39 | ||
camelia | rakudo-moar a45224: OUTPUT«Default constructor for 'NativeCall::Types::Pointer[num64]' only takes named arguments in block <unit> at /tmp/RNolUHRsRE line 1» | ||
timotimo | oh, whoops :) | ||
m: use NativeCall; my $foo = CArray[num64].new(NaN); say $foo; say nativecast(CArray[int64], $foo)[0].perl | 01:40 | ||
camelia | rakudo-moar a45224: OUTPUT«NativeCall::Types::CArray[num64].new9221120237041090560» | ||
timotimo | m: use NativeCall; my $foo = CArray[num64].new(NaN); say $foo; say nativecast(CArray[int64], $foo)[0].fmt("%b") | ||
camelia | rakudo-moar a45224: OUTPUT«NativeCall::Types::CArray[num64].new111111111111000000000000000000000000000000000000000000000000000» | ||
timotimo | m: use NativeCall; sub bits_of_float($f) { my $foo = CArray[num64].new($f); say $foo; say nativecast(CArray[int64], $foo)[0].fmt("%b") }; bits_of_float(Inf); bits_of_float(NaN); bits_of_float(-Inf) | ||
camelia | rakudo-moar a45224: OUTPUT«NativeCall::Types::CArray[num64].new111111111110000000000000000000000000000000000000000000000000000NativeCall::Types::CArray[num64].new111111111111000000000000000000000000000000000000000000000000000NativeCall::Types::CArray[num64].new-100000…» | ||
timotimo | m: use NativeCall; sub bits_of_float($f) { my $foo = CArray[num64].new($f); say nativecast(CArray[int64], $foo)[0].fmt("%b") }; bits_of_float(Inf); bits_of_float(NaN); bits_of_float(-Inf) | 01:41 | |
camelia | rakudo-moar a45224: OUTPUT«111111111110000000000000000000000000000000000000000000000000000111111111111000000000000000000000000000000000000000000000000000-10000000000000000000000000000000000000000000000000000» | ||
timotimo | oh, sweet, it has the minus sign there as well! | ||
m: use NativeCall; sub bits_of_float($f) { my $foo = CArray[num64].new($f); say nativecast(CArray[int64], $foo)[0].fmt("%x") }; bits_of_float(Inf); bits_of_float(NaN); bits_of_float(-Inf) | |||
camelia | rakudo-moar a45224: OUTPUT«7ff00000000000007ff8000000000000-10000000000000» | ||
timotimo | i don't know by heart where exactly the signature and mantissa live, but i suspect the "11111111111" part is the signature | 01:42 | |
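(For reference, a small self-contained C program decoding the three IEEE 754 double fields being poked at above: 1 sign bit, 11 exponent bits, 52 fraction bits. An all-ones exponent with a zero fraction encodes Inf; with a non-zero fraction it encodes NaN, so the "11111111111" run is the exponent field.)

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <math.h>

    /* Decode sign (1 bit), exponent (11 bits) and fraction (52 bits)
     * of an IEEE 754 double, like the nativecast trick above. */
    static void bits_of_double(double d) {
        uint64_t bits;
        memcpy(&bits, &d, sizeof bits);          /* bit-for-bit copy */
        unsigned sign     = (unsigned)(bits >> 63);
        unsigned exponent = (unsigned)((bits >> 52) & 0x7FF);
        uint64_t fraction = bits & 0xFFFFFFFFFFFFFULL;
        printf("%-12g sign=%u exponent=%03x fraction=%013llx\n",
               d, sign, exponent, (unsigned long long)fraction);
    }

    int main(void) {
        bits_of_double(INFINITY);   /* exponent all ones, fraction 0        */
        bits_of_double(NAN);        /* exponent all ones, fraction non-zero */
        bits_of_double(-INFINITY);  /* same as Inf, but with the sign bit set */
        bits_of_double(9.9e101);
        return 0;
    }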
i find it quite surprising to see gc_root_add_frame_roots_to_worklist to be at 4.11% Self time | 02:01 | ||
02:24 vendethiel joined
06:08 cognominal_ joined
06:31 cognominal joined
07:26 vendethiel joined
07:31 Ven joined
masak | 'signature' == 'exponent'? | 08:28 | |
09:44 lizmat joined
09:56 vendethiel joined
10:00 Ven joined
10:40 Ven joined
10:51 vendethiel joined
12:31 Ven joined
13:48 Ven joined
14:14 colomon joined
14:59 colomon joined
16:07 synopsebot6 joined
16:43 colomon joined
17:03 zakharyas joined
17:21 Ven joined
17:58 domidumont joined
18:11 Ven joined
18:16 colomon joined
18:18 domidumont1 joined
timotimo | if we had a pre-made work blob for every frame that we could just memcpy, that'd probably be neat | 19:02 | |
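(A hedged sketch of that "pre-made work blob" idea: build the VMNull-filled register block once per static frame and memcpy it in at invocation time instead of storing VMNull into each slot. The struct and field names here are illustrative, not MoarVM's real layout.)

    #include <stddef.h>
    #include <string.h>

    typedef struct MVMObject MVMObject;
    typedef union { MVMObject *o; long long i; double n; } Register;

    typedef struct {
        size_t          num_work;       /* registers in the work area              */
        const Register *work_template;  /* built once: every object slot == VMNull */
    } StaticFrameInfo;

    typedef struct {
        Register *work;
    } Frame;

    /* One bulk memcpy instead of N dependent "store VMNull" writes
     * every time a frame is allocated. */
    static void init_work(Frame *f, const StaticFrameInfo *sf) {
        memcpy(f->work, sf->work_template, sf->num_work * sizeof(Register));
    }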
19:07 Ven joined
19:29 colomon joined
19:39 Ven joined
[Coke] | If I had a pre-made work blob for every time someone asked me that... | 19:51 | |
19:59 TimToady joined
20:04 colomon joined
21:45 leont joined
22:24 zakharyas joined
23:33 colomon joined