github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018. |
|||
00:18
Merfont joined
00:19
Kaeipi left
01:18
Merfont left,
Merfont joined
01:27
Merfont left,
Merfont joined
04:14
Merfont left
04:15
Merfont joined
05:23
Util left
05:35
Util joined
07:00
bingos_ is now known as bingos
07:01
bingos left,
bingos joined
07:03
bingos is now known as BinGOs
07:25
MasterDuke joined
|
|||
nwc10 | good *, #moarvm | 07:32 | |
07:52
brrt joined
|
|||
brrt | good * #moarvm | 07:54 | |
07:54
zakharyas joined
|
|||
nwc10 | good *, brrt | 07:54 | |
07:55
zakharyas1 joined
|
|||
nwc10 | brrt: I've been staring a lot at compiler assembler output. Is this somethign that will make me go blind? :-) | 07:56 | |
07:58
zakharyas left
|
|||
brrt | no, will not | 08:04 | |
mad, maybe | |||
then again, might be a good lockdown time spender | 08:05 | ||
nwc10 | Yes, I think it makes me mad, maybe in two ways :-/ | ||
08:05
sena_kun joined
|
|||
nwc10 | at least, a bit disappointed | 08:05 | |
clang on arm manages to use LRDB to read one value | |||
and then the other one, LDR | |||
and then some ARM6 instrcution that I can't remmeber the mnemonic for to truncate *that* read down to 8 bits | 08:06 | ||
dude, you just showed me that your code generator knows about LDRB - why are you now usign 2 instructions to do the other byte load when we both know you could use one? | |||
gcc on ARM was not that stupid | |||
probably not goign to change - I'm assuming that most clang ARM development comes from Apple, and Apple has gone 64 bit. | 08:07 | ||
08:14
zakharyas joined
|
|||
brrt | probably yes | 08:15 | |
08:17
zakharyas1 left
08:39
sena_kun left
08:46
domidumont joined
09:31
sena_kun joined
10:13
brrt left
|
|||
timotimo | could be a memory alignment thing, so that the assembly fits more neatly on some boundary somewhere further down in the code | 11:11 | |
11:12
sena_kun left
11:17
zakharyas left
|
|||
lizmat | github.com/rakudo/rakudo/issues/39...-714389210 # posting here on Altai-man's behalf | 11:22 | |
12:25
zakharyas joined
|
|||
Geth | MoarVM/plain-char-can-be-unsigned: a989f7b553 | (Nicholas Clark)++ | src/strings/ascii.c `char` can be unsigned. Rewrite the "is it ASCII?" test to handle this too. Without this gcc correctly warns that the "comparison is always false" on platforms where plain char is unsigned, and the code (as was) was buggy. On x86_64 (where plain char is signed) this commit makes no difference to the generated code. |
13:01 | |
MoarVM: nwc10++ created pull request #1363: `char` can be unsigned. Rewrite the "is it ASCII?" test to handle thiā¦ |
13:02 | ||
14:18
sena_kun joined
15:14
domidumont left
15:34
patrickb joined
17:38
linkable6 left,
evalable6 left
17:40
linkable6 joined,
evalable6 joined
17:42
zakharyas left
18:39
rypervenche left
|
|||
Geth | MoarVM: a989f7b553 | (Nicholas Clark)++ | src/strings/ascii.c `char` can be unsigned. Rewrite the "is it ASCII?" test to handle this too. Without this gcc correctly warns that the "comparison is always false" on platforms where plain char is unsigned, and the code (as was) was buggy. On x86_64 (where plain char is signed) this commit makes no difference to the generated code. |
18:42 | |
18:42
rypervenche joined
18:43
sena_kun left
|
|||
samcv | timotimo, MVM_string_buf32_can_fit_into_8bit ... MVMGrapheme32 val2 = ((active_blob[i] & 0xffffff80) + 0x80) & (0xffffff80-1); | 19:21 | |
so we do it 32 bits at a time. dunno how many bits we can do. 128? forget how wide simd are | |||
timotimo | there's even 256 wide ops | 19:22 | |
samcv | well i mean it depends on the instruction but. i guess it could make sense for that i guess | ||
though. actually i think it DOES vectorize it. i am remembering now | |||
it has multiple alternatives based on how long the string is, and already does vectorize for wider vectors | 19:23 | ||
assuming it still optimizes the same | |||
timotimo | it has the benefit that it doesn't have to check for \r\n and friends | ||
samcv | i remember it checks the length, does it x bits at a time, then if any remain does 32 bits for the rest | 19:24 | |
timotimo | right, that makes sense | ||
that's what our code is written for, or that's what gcc and clang do with it? | |||
samcv | well the code is written to do 32 bits at a time, though i wrote it with many tweaks until it actually did that properly. so i guess a bit of both | 19:25 | |
but verbatim it only says to do 32 bits at a time | |||
timotimo | OK | 19:26 | |
samcv | MasterDuke, i don't think we get to memory speed limits unless we are just doing straight memcpy | ||
19:30
brrt joined
|
|||
MasterDuke | not sure what you mean? | 19:31 | |
19:44
patrickb left
19:53
zakharyas joined
19:59
zakharyas left
20:19
brrt left
|
|||
samcv | sorry. referring to this comment "<MasterDuke> how many strings are we validating that don't fit in l2?" | 20:30 | |
also i'm curious about the FSA. what benchmarks show the greatest performance benefit from that? I am trying to compare using FSA to just using normal malloc, on linux and getting pretty much dead even. But guessing i'm not doing a good test case | 20:31 | ||
MasterDuke | well, i think our arenas might not be big enough. so the FSA falls back to malloc too frequently | 20:35 | |
but there was a recent change i made to concblockingques from malloc to fsa that made a noticeable difference | 20:36 | ||
there is a rakudo issue with some commentary, don't remember the # | 20:37 | ||
samcv: github.com/rakudo/rakudo/issues/3648 | 20:50 | ||
timotimo | we can trace with systemtap or something how often what sizes happen in the FSA | 21:02 | |
MasterDuke | i've logged it before, don't remember at all | 21:04 | |
most common value requested is 88, then 656, then 104, then 256, then 200, then 192, then 8 | 21:10 | ||
samcv: if you're interested, github.com/MoarVM/MoarVM/pull/1277 | 21:12 | ||
timotimo | ideally we'd reduce the total amount of allocations | 21:15 | |
MasterDuke | yeah, fewer allocations would be great | 21:18 | |
22:03
Kaeipi joined
|
|||
MasterDuke | most common bin is 10, then 81, the 31, then 12, then 24 | 22:06 | |
22:08
Merfont left
22:11
Kaiepi joined
22:15
Kaeipi left
|
|||
timotimo | if we also count deallocations we can see how many of what size survive until the program exuts | 22:22 | |
MasterDuke | most common bytes value for free is 88, then 656, then 256, then 104, then 200, then 192, then 8 | 22:35 | |
timotimo | macroify-allocs (or something) in the tools folder tries to give the allocations names on top of just sizes | 22:38 | |
it will make a very big diff that goes into most files | 22:41 | ||
MasterDuke | might get a chance to try tomorrow, off to sleep now | 22:45 |