00:16
sena_kun left
01:37
MasterDuke joined
|
|||
MasterDuke | if anyone is curious, here's a commit that slows down MVM_string_find_cclass github.com/MasterDuke17/MoarVM/com...ee00db79f3 | 01:41 | |
but if it inspires any suggestions to instead make it faster...i'm all ears | 01:42 | ||
`MVMCodepoint cp = 0 <= g ? g : MVM_nfg_get_synthetic_info(tc, g)->codes[0];` is there any way to get rid of the branch/conditional, given we know we're looking for newlines? | 01:49 | ||
02:24
MasterDuke left
03:16
guifa left
04:24
guifa joined
04:28
MasterDuke joined
|
|||
MasterDuke | even slower github.com/MasterDuke17/MoarVM/com...b4017b7a26 | 04:29 | |
04:55
MasterDuke left
05:52
guifa left
07:22
[Coke] left
07:49
[Coke] joined
|
|||
timo | uhh `MVMCodepoint cp = 0 <= g ? g : MVM_nfg_get_synthetic_info(tc, g)->codes[0];` is this ternary the right way around? | 08:58 | |
... it's 0 <= g and not g <= 0, i just haven't woken up yet | 09:30 | ||
i think due to the grapheme cluster algorithm a few of the things we are checking for in that check after getting the codes[0] from the synthetic grapheme can never happen, like you can't make a grapheme cluster out of vertical tab and something or form feed and something (that's 0x0b and 0x0c) and i think the only grapheme cluster that can have \r as its first entry maybe cannot be anything other | 09:35 | ||
than our crlf synthetic | |||
10:25
sena_kun joined
11:54
lizmat left
12:18
lizmat joined
12:28
lizmat_ joined
12:30
lizmat left
12:37
sena_kun left
12:45
lizmat_ left,
lizmat joined
12:54
sena_kun joined
|
|||
lizmat | perhaps of interest for MoarVM: www.theregister.com/2025/02/13/has...akthrough/ | 19:01 | |
jdv | where's jnthn? i thought he likes that stuff. | 19:10 | |
just kidding. wasnt it nic clark that did the recent hash stuff? | 19:13 | ||
?ghr? | |||
ugexe | github.com/MWARDUNI/ElasticHashing...Hashing.py | 19:15 | |
19:22
rakkable left,
rakkable joined
19:23
rakkable left,
rakkable joined
|
|||
timo | there seems to be a big caveat that the bounds they have found are not possible to achieve when you can also delete elements from a hash, if i read this correctly, which i'm quite possibly not | 20:28 | |
lizmat | ok, so a MoarVM / NQP "map" like structure would be needed to take advantage of this | 20:34 | |
on which a nqp::deletekey would fail | 20:35 | ||
timo | i'm not following that python implementation | 20:36 | |
for one, the example uses a delta of 0.1 and the paper says 1/delta is supposed to be a power of two, but 10 isn't a power of two? maybe i've not read the paper far enough | 20:38 | ||
lizmat | 10 is a power of 2 in binary ? | 20:42 | |
.tell MasterDuke17 looks like the find_cclass update is causing HTTP::Tiny to fail | 20:43 | ||
tellable6 | lizmat, I'll pass your message to MasterDuke | ||
lizmat | .tell MasterDuke17 symptoms are having newlines at the end of a string where they were not expected according to the test | 20:44 | |
tellable6 | lizmat, I'll pass your message to MasterDuke | ||
ugexe | probably a \r\n issue | 20:46 | |
timo | i guess the code may just be incomplete? | 20:49 | |
lizmat: you mean an unmerged branch, yeah? | 20:50 | ||
lizmat | no, in bumped Rakudo in main, will break in 2025.02 as it stands now | 20:51 | |
timo | lizmat: yes, but not when you divide 1 / 0.1 when you write 0.1 like that in python code, then it's actually 10 in decimal and not a power of two | ||
do you perhaps mean utf8 decoding changes? i don't see any changes to find_cclass merged to moarvm/main | 20:52 | ||
lizmat | github.com/rakudo/rakudo/commit/3b...94aeab140b | 20:53 | |
I think a0b70918771669555e635ec ? | 20:54 | ||
linkable6 | (2025-01-26) github.com/MoarVM/MoarVM/commit/a0b7091877 Speedup MVM_string_find_cclass by adding cases... | ||
timo | oh, i guess i just didn't search back long enough | 20:57 | |
well, scroll rather than search | |||
looks like CI was pretty red for that. was that perhaps one of the CI runs where quick bumps broke half the runs? | 20:58 | ||
oh ... oh no ... | 21:01 | ||
... disregard my last message | 21:02 | ||
lizmat | what message? :-) | ||
timo | yes, exactly | 21:06 | |
the reproducing example is just "zef install HTTP::Tiny" and have it run its test cases? | 21:07 | ||
lizmat | yup | ||
timo | aha, the shared code for GRAPHEME_8 and GRAPHEME_ASCII don't look up synthetic info | 21:14 | |
so they don't understand that -1 is crlf | |||
m: .raku.say for "Hello there\r\nHow are you".lines | 21:16 | ||
camelia | "Hello there" "How are you" |
||
timo | m: .raku.say for "Hello there\r\nHow are you".encode("latin-1").decode("latin-1").lines | 21:17 | |
camelia | "Hello there\r\nHow are you" | ||
timo | it also doesn't do it for IN_SITU_8 | 21:20 | |
got a patch | 21:27 | ||
Geth | MoarVM/give_find_cclass_synthetics_back: 23cc069152 | (Timo Paulssen)++ | src/strings/ops.c Re-instate lookup of synthetic codepoint info for find_cclass It got left out accidentally. Since GRAPHEME_8 and IN_SITU_8 can store negative numbers, and especially newlines are a case of this because of the crlf synth, we need to go through the synthetic info lookup. |
21:42 | |
MoarVM: timo++ created pull request #1913: Re-instate lookup of synthetic codepoint info for find_cclass |
21:44 | ||
timo | 02-rakudo/15-gh_1202.t crashed again huh. | 21:54 | |
a commenter on The Register (in the article about elastic hashing) points out that this is also only for hashing where the size of the table is known ahead of time | 22:00 | ||
22:16
sena_kun left
|
|||
lizmat | timo: am going to merge that PR, looks green enough to me :-) | 22:20 | |
Geth | MoarVM/main: 21d96b374e | timo++ (committed using GitHub Web editor) | src/strings/ops.c Re-instate lookup of synthetic codepoint info for find_cclass (#1913) It got left out accidentally. Since GRAPHEME_8 and IN_SITU_8 can store negative numbers, and especially newlines are a case of this because of the crlf synth, we need to go through the synthetic info lookup. |
22:21 | |
lizmat | HTTP::Tiny clean! | 22:33 | |
ugexe | seems odd only 1 module broke | 22:39 | |
22:39
lizmat_ joined
22:43
lizmat left
|
|||
timo | we should revive the C level coverage reports | 22:47 | |
22:57
guifa joined
23:39
lizmat_ left
23:40
lizmat joined
|