This channel is logged for the purpose of history keeping about its development | Logs available at irclogs.raku.org/raku-dev/live.html
Set by lizmat on 8 June 2022.
MasterDuke ok, so i'm in a module's directory. what should be faster? the first `raku -I. -e 'use foo'`? or subsequent invocations also/instead? 02:04
ugexe assuming you `rm -rf .precomp/` first, the second invocation should be faster 02:06
MasterDuke by first you mean before any of the invocations? 02:07
ugexe yeah
although i imagine it will call nqp::sha1 the same number of times regardless 02:08
also ++ to anyone who can get us e.g. sha2 02:09
MasterDuke ok, it is always faster. but i mean if i'm using a rakudo with `map  { nqp::sha1_b($_) },` instead of `map  { nqp::sha1($_) },` in src/core.c/CompUnit/Repository/FileSystem.pm6, what do i hope is faster compared to stock rakudo?
well, it would be nice, but i'm pretty sure that for our use case, sha1 is unbroken
ugexe i'm not so sure about that. if i could create a collision then when the precomp repository goes to load the file named nqp::sha1(...) then the wrong code would be loaded 02:11
but i agree it's not very practical and thus arguably not a problem worth prioritizing 02:12
I think just `raku -I./ -e ''` might be sufficient 02:13
using whatever module is in that directory for sure would exercise it, but that value is cached 02:14
if you have a directory with a bunch of different modules you could do `raku -Ifolder1/ -Ifolder2/ -Ifolder3/ -e '$*REPO.repo-chain.grep(CompUnit::Repository::FileSystem).map(*.id).sink'` 02:18
MasterDuke which .precomp should i remove? 02:19
ugexe whatever one is in the path of each -I, so -I. would be .precomp, and -Ilib would be lib/.precomp
MasterDuke doing that for the 10 largest modules i have around takes ~8s. both with my patch and without 02:33
ugexe did you change github.com/rakudo/rakudo/blob/0204...em.pm6#L63 to not encode? 02:34
MasterDuke yeah, changed it to just `open(:bin)...` 02:35
`map  { $distribution.content($_).open(:bin).slurp(:close, :bin) },`
ugexe and you changed that nqp::sha1($parts,...)? 02:36
nqp::sha1($parts.join('')); rather
MasterDuke no. isn't that just the sha1 of the individual sha1s? 02:37
ugexe ah yeah, i guess that would be pointless
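A rough Python sketch of why switching the outer hash would gain little, assuming (as the exchange above suggests) that each file is hashed on its own and the hex digests are then joined and hashed once more. `repo_id` and the sample file contents are hypothetical names for illustration, not rakudo code.

```python
import hashlib
from typing import Sequence


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 hex digest of some bytes."""
    return hashlib.sha1(data).hexdigest()


def repo_id(file_contents: Sequence[bytes]) -> str:
    """Hash every file, then hash the joined hex digests (hash of hashes)."""
    parts = [sha1_hex(content) for content in file_contents]
    return sha1_hex("".join(parts).encode("ascii"))


# Hypothetical file contents standing in for a distribution's modules.
files = [b"unit module Foo;\n", b"unit module Bar;\n"]

# The outer hash only ever sees 40 hex characters per file.
outer_input_len = len("".join(sha1_hex(f) for f in files))
print(outer_input_len)  # 80
print(repo_id(files))
```

The per-file hashes see the full file contents, while the outer hash sees only 40 characters per file, so any saving would have to come from the inner calls.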
MasterDuke it seems we just don't spend a lot of time hashing. all the time is spent elsewhere. MVM_string_gi_get_grapheme, knuth_morris_pratt_string_index, MVM_string_utf8_c8_encode_substr, MVM_get_lexical_by_name are just some of the names in the top 20 or so of a perf report 02:39
have to go a while down before you get to a sha1-related function 02:40
the binary sha1 is definitely faster when done in a loop. but it's not a bottleneck for our module loading 02:41
hashing src/core.c/Int.pm6 10k times takes ~0.66s for nqp::sha1, but ~0.24s for nqp::sha1_b 02:43
ugexe how are you hashing it?
MasterDuke MVM_SPESH_BLOCKING=1 ./rakudo-m -e 'use nqp; my $a := "src/core.c/Int.pm6".IO.slurp(); my $b; my $s = now; $b := nqp::sha1($a) for ^10_000; say now - $s; say $b' 02:43
ugexe specifically are you using open(:enc<iso-8859-1>) ?
MasterDuke and 02:44
MVM_SPESH_BLOCKING=1 ./rakudo-m -e 'use nqp; my $a := "src/core.c/Int.pm6".IO.slurp(:bin); my $b; my $s = now; $b := nqp::sha1_b($a) for ^10_000; say now - $s; say $b'
oh, right...
MasterDuke that's actually a tiny bit slower 02:45
MasterDuke small enough difference it could be noise though 02:47
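The shape of the comparison above (hashing text that has to be encoded first versus hashing raw bytes) can be sketched in Python. This is only an analogy: `hashlib.sha1` stands in for the nqp ops, the sample text is made up, and the timings it prints say nothing about MoarVM.

```python
import hashlib
import time

# Made-up stand-in for a source file; not the real Int.pm6.
text = "my $x = 42; # \u00e9\n" * 4096
raw = text.encode("utf-8")


def hash_text(s: str) -> str:
    """Encode the string on every call, then hash it."""
    return hashlib.sha1(s.encode("utf-8")).hexdigest()


def hash_bytes(b: bytes) -> str:
    """Hash bytes that are already in their final form."""
    return hashlib.sha1(b).hexdigest()


def time_it(fn, arg, rounds=1000):
    """Return (elapsed seconds, last digest) for repeated calls of fn(arg)."""
    digest = ""
    start = time.perf_counter()
    for _ in range(rounds):
        digest = fn(arg)
    return time.perf_counter() - start, digest


text_secs, text_digest = time_it(hash_text, text)
bytes_secs, bytes_digest = time_it(hash_bytes, raw)
print(f"text:  {text_secs:.3f}s")
print(f"bytes: {bytes_secs:.3f}s")
```

Both paths produce the same digest here, so the only thing being measured is the encoding step in front of the hash.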
ugexe github.com/rakudo/rakudo/blob/f203...le.pm6#L16 is the other potentially useful one related to module loading 02:51
MasterDuke this was a pretty simple change, but i think we won't see any difference until we get the stuff in the profile above it faster. string encoding seems like the major area to tackle 03:30
nine Makes me wonder what strings we are encoding in the first place 10:21
Especially with utf8_c8 10:22
lizmat fwiw, for this particular purpose, I don't think any encoding is needed... 13:26
the worst thing that could happen is that an extra recompile is done if something like a composed é is changed to a decomposed é 13:27
and how often does that happen?
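A small Python sketch of the case lizmat describes, assuming the hash is taken over the raw bytes of the file with no normalization. The sample string is made up, and `unicodedata` plus `hashlib` stand in for whatever the runtime does.

```python
import hashlib
import unicodedata

# The same visible text, once with a composed é and once decomposed.
composed = "caf\u00e9"
decomposed = unicodedata.normalize("NFD", composed)


def sha1_of_bytes(s: str) -> str:
    """Hash the UTF-8 bytes exactly as they would sit on disk."""
    return hashlib.sha1(s.encode("utf-8")).hexdigest()


same_text = unicodedata.normalize("NFC", decomposed) == composed
same_hash = sha1_of_bytes(composed) == sha1_of_bytes(decomposed)
print(same_text)  # True
print(same_hash)  # False
```

The two forms compare equal after normalization but hash differently as bytes, so the cost of skipping the encoding step would be one extra recompile when a file's normalization form changes.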
MasterDuke nine: interesting, i hadn't noticed it was _c8 13:59
lizmat: i think that's why iso-8859-1 is used currently, as i believe it's the lightest-weight of the encodings 14:00
lizmat ack 14:01
MasterDuke i'll keep playing around, but have very scattered free time right now. feel free to make any suggestions in chat, i'll backlog 14:02
looks like it's mostly (all?) paths. which makes sense, github.com/rakudo/rakudo/blob/main...#L267-L277 is one of the very few uses of utf8-c8 in the core 14:07
and is pretty much the exact use-case for utf8-c8
almost 360k calls to MVM_string_utf8_c8_encode_substr 14:08
afk 14:09
Nemokosch Who can release community modules on zef and how? 14:42
(Who can one even ask these questions to get an answer? I have asked at least 5 times, with literally no form of response.) 14:43
Also github.com/rakudo/rakudo/pull/5133 seems like this is a 5-minute bugfix; what can I do to make progress? 14:45
MasterDuke Nemokosch: any idea why the windows tests failed? can you force push to get them to rerun? i tried to rerun them from the PR, but it's too old and it won't let me 15:22
tellable6 MasterDuke, I'll pass your message to Nemokosch
MasterDuke Nemokosch: btw, were you asking about Actions.nqp? i think it will eventually go away when rakuast fully takes over 15:24
tellable6 MasterDuke, I'll pass your message to Nemokosch
MasterDuke but jnthn and nine could give a more definitive answer
huh. a bunch of the strings returned by MVM_string_utf8_c8_encode_substr do seem to have weird bytes in them, even ones that i wouldn't think should 15:29
e.g., `/home/dan/Source/perl6/install/share/nqp/lib/Perl6/Actions.moarvmÐ}8ÿ` 15:30
Nemokosch rebuilt 15:57
Geth rakudo/main: f078896c77 | (Márton Polgár)++ (committed using GitHub Web editor) | src/core.c/Seq.pm6
Fix swapped iterators in ACCEPTS

Resolves github.com/rakudo/rakudo/issues/2468.
rakudo/main: 480fe2bdd3 | (Márton Polgár)++ (committed using GitHub Web editor) | 47 files
Merge branch 'rakudo:main' into patch-1
rakudo/main: cdc9aa9879 | MasterDuke17++ (committed using GitHub Web editor) | src/core.c/Seq.pm6
Merge pull request #5133 from 2colours/patch-1

Resolves github.com/rakudo/rakudo/issues/2468
MasterDuke hm, none of the paths with weird bytes do actually seem to have weird bytes in the filesystem 16:14
github.com/rakudo/rakudo/blob/main...m.pm6#L324 i believe is doing a bit more work than is necessary 16:20
lizmat: optimizing that might be up your alley 16:28
but i'm still not sure why those weird bytes are showing up 16:30
it does look like using `$!abspath` instead of `$!prefix.absolute` on that line is a tiny bit faster. ~7.6s instead of ~7.8s 16:34
hm, maybe the weird bytes are an artifact of multi-threaded printing to one file...is any of this compunit::filesystem/module loading stuff multi-threaded? 16:38
because i also see strings like `/home/dan/Source/perl6/modules/raku-bench/components/perl5/perl5/cpan/Encode/KR/KR.pmhell.pmm`, which looks suspicious 16:39
well, running with RAKUDO_MAX_THREADS=1 didn't change anything 16:43
huh. if i also print out the string regular utf8 encoded, sometimes both strings have weird bytes, sometimes just the utf8-c8 version does, and sometimes just the utf8 version does 16:51
ugexe MasterDuke: with runtime requires it can be multithreaded. also precompilation is multithreaded 16:57
that being said, module loading is not thread safe 16:58
MasterDuke oh, looks like some of this weirdness is because we're not reallocing the final string down to its right size
ugexe github.com/rakudo/rakudo/issues/1920
MasterDuke well, we get weird bytes when the realloc does *not* change the size, and when it both shrinks and expands it 17:03
based on a strlen before and after the realloc 17:04
doh. i'm not in MVM_string_utf8_c8_encode_C_string, so it's not zero terminated 17:13
so ignore *some* of the things i've just said
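A Python sketch of the mistake described above: treating a buffer that is not zero-terminated as a C string, so the read runs past the encoded data into whatever memory happens to follow. The path, the leftover bytes and both helper names are hypothetical, and a `bytearray` only imitates the heap.

```python
# Encoded result: exactly len(path) bytes, with no terminating NUL.
path = b"/tmp/lib/Actions.moarvm"

# Hypothetical leftover bytes that happen to follow it in memory,
# ending at the first NUL byte.
leftover = b"\xd0}8\xff"
heap = bytearray(path + leftover + b"\x00")

encoded_len = len(path)


def read_like_strlen(buf: bytearray) -> bytes:
    """Read up to the first NUL, the way strlen/printf("%s") would."""
    return bytes(buf[: buf.index(0)])


def read_with_length(buf: bytearray, n: int) -> bytes:
    """Read exactly the number of bytes the encoder reported."""
    return bytes(buf[:n])


by_nul = read_like_strlen(heap)
by_length = read_with_length(heap, encoded_len)
print(by_nul)
print(by_length)
```

Reading by the reported length gives the clean path, while reading to the first NUL picks up the trailing junk, which would be consistent with output like `KR.pmhell.pmm` without the filesystem containing any odd bytes.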
