[00:42] MasterDuke joined
[00:43] <MasterDuke> timo: i had `-march=native` set for my desktop (x86-64) but not the laptop (aarch64); setting it on the laptop *slightly* improved times
[00:45] <MasterDuke> adding `&& !writing_32bit` to the loop condition (to break out early) meant clang couldn't vectorize it (didn't try gcc)
[00:49] <timo> OK, presumably there's a way to get it to not have trouble with that. do you think putting the definition of the loop variable inside the `for (***; ;)` could help? it shouldn't keep the variable alive afterwards, so it wouldn't matter exactly which spot it stopped at
[00:51] <MasterDuke> nope, clang still says it can't vectorize
[00:52] <MasterDuke> also, i have a version that checks for CRLF in that same loop, and that speeds things up even more (but makes an early exit even harder to do)
[00:54] <MasterDuke> let me push that so you can see
[00:57] <MasterDuke> done
[00:59] <MasterDuke> my test corpus is perhaps not very good though
[01:00] <MasterDuke> count of lengths (count  length):
                     172     10
                     255     65
                     588     57
                     1243    49
                     2389    41
                     4821    33
                     8560    25
                     17693   17
                     45023   9
                     205321  1
[01:01] <MasterDuke> i really should just stick a print in and capture what's decoded during a rakudo build...
[01:12] MasterDuke left
[01:25] <timo> do you think a special case for a single character could be worth it?
[01:26] <timo> i wonder what percentage comes from the string heap in .moarvm files
[02:47] MasterDuke joined
[02:48] <MasterDuke> timo: i got a print of what's decoded during a rakudo build; the characteristics are a bit different:
                     121992  11
                     135586  12
                     142789  9
                     150224  10
                     158566  8
                     179682  7
                     183802  3
                     534090  4
                     561218  5
                     612082  6
[02:49] <MasterDuke> so based on this, no, i don't think special-casing a single character is a good idea
[02:49] <MasterDuke> that's the problem with benchmarks...
[02:52] <MasterDuke> btw, this is my benchmark, very open to suggestions: `MVM_SPESH_BLOCKING=1 raku -e 'use nqp; my str $f; for ^10 { for "latin1".IO.slurp.lines -> str $s { my $a := nqp::list_i; nqp::encode($s, "iso-8859-1", $a); $f = nqp::decode($a, "iso-8859-1") }; }; say now - INIT now; say $f'`
[02:53] <MasterDuke> i don't like how i'm also encoding everything, but i don't really grok encoding/decoding, so i'm not sure if there's a way to avoid it in this case
[03:02] <timo> you can slurp the file as binary, then you get a Buf out. haven't tried how .lines works with binary handles. with just \n as the line-end byte i imagine it would work fine?
[03:05] <MasterDuke> hm. so `.slurp(:bin).lines -> $a { ...` ?
[03:06] <MasterDuke> `No such method 'lines' for invocant of type 'Buf[uint8]'`
[03:10] <MasterDuke> my branch doesn't seem to be as much faster with this corpus as the input to the benchmark
[03:14] <timo> ah sorry, it would have to be .IO.lines or so
[03:14] <timo> slurp gives the buffer; you want to use the lines iterator on the IO::Handle or IO::Path
[03:15] <timo> maybe make the array of bufs eagerly up front so it doesn't mix with the decoding work
[03:15] <MasterDuke> ? in the past i've seen IO.lines be quite a bit slower than slurping and then .lines
[03:17] <timo> may want to put the "read file into array of bufs" outside the `for ^10` for the measurement then?
[03:18] <MasterDuke> well, given this file is quite a bit bigger than the previous one, i've just ditched the `for ^10` part
[03:18] <MasterDuke> but i don't know how to avoid doing the encode each time
[03:20] <timo> encode is what you do to go from Str to Buf
[03:21] <MasterDuke> yeah. it would be nice to slurp/.lines directly into a Buf
[03:23] <timo> check what IO::Handle.get does when the handle is opened without an encoding
[03:24] <timo> depending on whether you set chomp on or off, you'll either have strings with a newline at the end every time, or not a single time
[03:25] <MasterDuke> but then i'll still need to turn the string into a buf
[03:27] <timo> i was hoping IO::Handle without encoding (or i guess with :bin?) gives you a Buf instead of a Str
[03:28] <MasterDuke> `Cannot do 'get' on a handle in binary mode`
[03:29] <timo> >:(
[03:29] <timo> you know, i think we should just put in support for that, for people who really know that it's what they want and what can go wrong %)
[03:30] <timo> i need to go to bed, i have a headache :(
[03:31] <MasterDuke> good luck with that. about to go to bed here too
[03:42] MasterDuke left
[09:01] sena_kun joined
[09:56] sugarbee1 left
[09:57] sugarbeet joined
[10:39] El_Che left
[18:42] japhb left
[18:49] japhb joined
[18:56] <[Coke]> looks like we have a mix of 3rdparty https:// and git@ submodules. Any interest in standardizing?
[19:33] <ugexe> since we don't update those repositories, we should use https
[19:34] <ugexe> https is generally recommended as well
[19:36] <ugexe> at $work we've explicitly gone through everything and changed them to https. we use github and github enterprise
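The standardization itself is just the `url` lines in `.gitmodules`. A sketch of what an entry would look like in the https form (libuv used as an illustrative module name; check the repository's actual `.gitmodules` for the real list and URLs):

```ini
# .gitmodules entry using the anonymous-clone-friendly https form,
# instead of the SSH form git@github.com:libuv/libuv.git
[submodule "3rdparty/libuv"]
	path = 3rdparty/libuv
	url = https://github.com/libuv/libuv.git
```

Existing checkouts pick up edited URLs with `git submodule sync --recursive`, which copies them from `.gitmodules` into `.git/config`.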
[19:47] [Coke] left, [Coke]_ joined
[21:34] [Coke]_ is now known as [Coke]
[21:34] <[Coke]> what is 3rdparty/sha1? it has a remote of moarvm itself?
[21:34] <[Coke]> ah, it's not third-party. why are we storing it there?
[21:38] <timo> probably because it's so small? and unlikely to need to change?
[21:48] <[Coke]> why not keep it with the other source?
[21:48] <[Coke]> ah, to keep the licensing smoother, probably
[21:50] <timo> that at least sounds like a good reason to me
[22:29] sena_kun left