samcv dropped my laptop and the screen went all blocked up and making bad noises 01:07
maybe dropping it tapped the ram? idk
turned on fine once i turned it off
timotimo, with 1/100 the size i didn't see collapse strands btw
timotimo interesting! 01:09
01:14 vendethiel joined 01:48 ilbot3 joined
MasterDuke and --profile shows the most expensive routine (by exclusive time) is `infix:<~> (SETTING::src/core/`, at 30%, next most is 5.8% 01:52
that's the Str:D multi 01:54
samcv yeah i bet it is 02:01
that thing always sucks
MasterDuke, what are you using to measure? and how big is your file?
MasterDuke perf record and --profile, of `use JSON::Fast; spurt("otherCache", from-json(to-json(slurp("cache").split("\0"))).join("\0"))`, with the first 2.5Mb of the file timotimo was using 02:03
samcv argh complaining i don't have enough permissio 02:05
it's set to -1
at 502017, unexpected \ inside list of things in an array 02:09
whatever i just made a coverage report of it 02:21
will check that function
ok i see 02:22
uploading now 02:23
MasterDuke, this must be G as in giga 02:25
but the whole function itself runs 4.5k times 02:26
for loop is pretty hot
but looks like turn_32bit_into_8bit_unchecked is never hit. so that's not causing the issue
though it does still hit that function 25.8k times 02:27
MasterDuke this is heaptrack's overview 02:29
samcv if (g < -127 || g > 127) probably (hopefully?) gets optimized to not check the sign and just check the bits, since that's all that's needed
MasterDuke oh, oops, all that measuring and profiling was with my modified version 02:31
samcv modified how
MasterDuke 02:32
samcv but 02:34
why would you change it like that. won't it possibly break
oh i see i guess you already collapse it
then check there. any change in time?
MasterDuke i just time it once before and once after. 0.1s slower after my change (out of 31s) 02:35
samcv ok i've made it faster 02:41
02:41 agentzh joined
samcv let me calculate how much. but this is only measuring differences between a single change i made, and i made a couple other changes that may have a minor effect idk 02:42
average of 5 times made it go from 22.188 to 23.43 seconds 02:44
so 1.2s shorter
5.4% faster 02:45
Geth MoarVM: 7beffd390c | (Samantha McVey)++ | src/strings/ops.c
collapse_strands 5.4% speed boost under some workloads

This loop gets ran a huge number of times under collapsing strands workloads. If we're already set can_use_8bit, don't bother checking if the following graphemes are < -127 and > 127
samcv pushing that change MasterDuke
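The idea in the commit message above can be sketched roughly as follows. This is a minimal illustration and not the MoarVM source: `copy_graphemes_flagged` and its parameters are hypothetical names, and the bounds are the ones quoted in the discussion. Once the flag has been cleared, the range comparison is skipped for all remaining graphemes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not MoarVM's collapse_strands): copies n 32-bit
 * graphemes from src to dst and reports whether all of them would fit
 * into 8-bit storage. */
int copy_graphemes_flagged(const int32_t *src, int32_t *dst, size_t n) {
    int can_use_8bit = 1;
    size_t i;
    for (i = 0; i < n; i++) {
        int32_t g = src[i];
        dst[i] = g;
        /* Only test the range while the answer is still undecided; after
         * the first wide grapheme the comparison is short-circuited. */
        if (can_use_8bit && (g < -127 || g > 127))
            can_use_8bit = 0;
    }
    return can_use_8bit;
}
```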
samcv i'm thinking of making that MVMint8 a plain old int, so it'll use the fastest type on whichever cpu it's ran on. it's bool so doesn't matter the size of it 02:51
MasterDuke you mean 23.43 to 22.188? 02:59
samcv yes 03:00
i'm making MVM_string_graphs_nocheck too, so that should help make string things faster cause that gets called a ton of times 03:01
uhm and got 21.408s with some more of my changes 03:08
03:15 vendethiel joined
samcv i think i found a bug 03:19
here. if it's an ascii string, we should access blob_ascii 03:20
which is what is used in every other place
we can compare them with memcmp but we can't point to the same place 03:23
i don't think we run that command anywhere on ascii strings since they're hardly used (or something) 03:24
imo it would be easier to deal with if they both used blob_8 and we still have the string be of type ASCII, but they are both stored in blob_8's 03:25
then we won't have to have branching when we're really just comparing two things that are subsets of each other, and can be memcmp'd fine 03:26
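A rough sketch of the point being made here, assuming (as stated later in the log) that both the ASCII and the 8-bit storage are arrays of signed 8-bit values. The struct, enum, and function names below are hypothetical and much simpler than MoarVM's real string body. The only branch is in picking the right blob pointer; the comparison itself is a plain `memcmp`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified string body: two storage kinds, both int8_t. */
enum storage_kind { STORAGE_ASCII, STORAGE_8 };

struct small_string {
    enum storage_kind kind;
    size_t num_graphs;
    union {
        int8_t *blob_ascii;
        int8_t *blob_8;
    } storage;
};

/* Select the union member that matches the string's storage kind. */
const int8_t *blob_bytes(const struct small_string *s) {
    return s->kind == STORAGE_ASCII ? s->storage.blob_ascii
                                    : s->storage.blob_8;
}

/* Returns 1 if len graphemes starting at a_start in a equal those starting
 * at b_start in b, else 0. Out-of-range requests compare as unequal. */
int substrings_equal(const struct small_string *a, size_t a_start,
                     const struct small_string *b, size_t b_start,
                     size_t len) {
    if (a_start > a->num_graphs || len > a->num_graphs - a_start)
        return 0;
    if (b_start > b->num_graphs || len > b->num_graphs - b_start)
        return 0;
    return memcmp(blob_bytes(a) + a_start, blob_bytes(b) + b_start, len) == 0;
}
```

If both kinds shared a single blob member, as proposed in the discussion, `blob_bytes` would not need to branch at all.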
05:51 brrt joined 05:53 domidumont joined 05:59 domidumont joined 06:00 domidumont joined 06:36 geekosaur joined 06:44 brrt joined
brrt merge pushed 07:14
the 'has-dynasm' thingy has been removed, because, a): we have our own dynasm fork, and it's not going to be the one that's installed, b): dynasm is a *build-time-dependency*, damnit, and it doesn't make sense to use some other guy's version 07:15
07:36 domidumont joined 07:41 zakharyas joined 09:05 domidumont joined 10:43 vendethiel- joined
timotimo it occurs to me that collapse_strands uses the grapheme iterator in all cases. we could probably special-case one or two different storage types there 11:17
MasterDuke timotimo: ah, interesting idea 11:26
timotimo i'm doing a run on a quarter (elementwise) of the whole cache 11:27
MasterDuke timotimo: did you notice samcv's possible bug find in MVM_string_substrings_equal_nocheck? it does look odd 11:28
timotimo typedef MVMint8 MVMGraphemeASCII; 11:29
typedef MVMint8 MVMGrapheme8; /* Future use */
i imagine if we split it the compiler will merge it again until we change one of these typedefs 11:36
so since reducing it down to like a tenth makes it complete in mere minutes, i was hoping a quarter would give me a nice hour or so run time 12:00
like, just a tiny bit up the hockey stick 12:01
13:27 spebern joined 14:10 nine_ joined 14:13 SmokeMachine joined 14:33 spebern joined
timotimo almost 4 hours now 15:06
15:08 brrt joined
brrt good * #moarvm 15:09
15:47 brrt joined 16:39 domidumont joined
samcv good * 17:02
where do you get future use timotimo 17:03
<timotimo> typedef MVMint8 MVMGrapheme8; /* Future use */
it is used currently though
timotimo that's just part of the code :) 17:04
i think i was Future Man who got to Use that
geekosaur ^
samcv heh
well it's used currently :P 17:05
geekosaur couple weeks(?) ago timotimo implemented that optimization for strings restricted to the ISO8859 subset
so yes, the future is now(tm)
timotimo the future is so last week
samcv was longer than couple weeks ago 17:06
also timotimo what are your thoughts on the ascii thing. they're both stored in int8's and that place looks like a bug right? in MVM_string_substrings_equal_nocheck. nowhere else do i see an ascii type string have blob_8 17:08
but i wouldn't be against making ascii strings just use blob_8 anyway or something
timotimo we should decide that blob_8 is supposed to be signed, just like ascii is. and then we can throw it out :) 17:09
samcv they'd still be 8 bit strings and still be distinguishable, but would make comparison of ascii and others less prone to errors
they are both signed
8 bit numbers
timotimo yeah, it's important to have signedness here 17:10
for our synthetics
samcv yeah
timotimo also, ascii is - of course - only defined up to 127
samcv 12. what. 17:11
timotimo if you try to store latin-1 in "ascii", we'll not be in agreement
samcv we store latin-1 in blob_8 17:12
geekosaur enh, question is whether synthetics or latin-1 make more sense there. I'm not sure there are that many useful cases for synthetics in that range
timotimo we do? :o 17:13
samcv yes timotimo
they're just numbers...
timotimo but but but but
samcv so now we want another blob format for no reason XD
blob_latin1 XD
TimToady well, even ASCII can have CRLF in it...
timotimo yeah, and we turn that into a negative number 17:14
samcv yep
would be easier if ascii, latin-1 and 8bit all stored in blob_8. we don't have a Latin-1 type of string btw 17:15
just ascii, grapheme 8, grapheme 32, and strand
can still have the string be of type ascii, but no need to confuse things and make comparing 8 bit strings more work than it needs to be 17:16
programming errors etc
also i kind of think CRLF should be a constant synthetic grapheme. that could save some having to check what crlf is in the trie 17:21
Geth MoarVM: affec75b9c | (Samantha McVey)++ | 4 files
Add MVM_string_graphs_nocheck funct, use it places we prev. already check

There are many places where we check arguments with MVM_string_check_arg, and then will later on call MVM_string_graphs. This is redundant because MVM_string_graphs runs the same checks every time it runs that MVM_string_check_arg has already done.
Shows a minor, but measurable speed increase.
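The pattern described in this commit message might look something like the sketch below. All `toy_` names are hypothetical, and the real functions report a failed check differently than by returning 0. What the sketch shows is a checked entry point layered on an unchecked one, so callers that have already validated their arguments can skip the repeated check.

```c
#include <stddef.h>

/* Hypothetical toy string; stands in for a real string object. */
struct toy_string { size_t num_graphs; };

/* Validation that callers are expected to run once per argument. */
int toy_check_arg(const struct toy_string *s) {
    return s != NULL;
}

/* Unchecked accessor: the caller guarantees s has already been validated. */
size_t toy_graphs_nocheck(const struct toy_string *s) {
    return s->num_graphs;
}

/* Checked accessor: validates on every call, then defers to the fast one. */
size_t toy_graphs(const struct toy_string *s) {
    if (!toy_check_arg(s))
        return 0;
    return toy_graphs_nocheck(s);
}

/* A caller that validates both arguments up front and then uses the
 * unchecked accessor, avoiding a second round of checks. */
size_t toy_combined_graphs(const struct toy_string *a,
                           const struct toy_string *b) {
    if (!toy_check_arg(a) || !toy_check_arg(b))
        return 0;
    return toy_graphs_nocheck(a) + toy_graphs_nocheck(b);
}
```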
samcv is this supposed to be called MVN_unicode_normalizer_form MVN? and not MVM? 17:29
18:17 AlexDaniel joined
samcv hmm. i. want to steal some code 18:22
Knuth-Morris-Pratt search, memmem (searching for memory within memory, so we can use it on whatever size grapheme we want) 18:23
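For reference, a minimal sketch of Knuth-Morris-Pratt search over arrays of 32-bit values. `kmp_search_i32` is a hypothetical name; this is not the code under discussion and makes no claim about how any libc implements `memmem`. Because it compares whole elements instead of bytes, a match can only start on an element boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: index of the first occurrence of needle in hay,
 * or -1 if absent. Also returns -1 if the failure table cannot be
 * allocated, so a caller that cares must distinguish that separately. */
ptrdiff_t kmp_search_i32(const int32_t *hay, size_t hay_len,
                         const int32_t *needle, size_t needle_len) {
    size_t *fail;
    size_t i, k;
    ptrdiff_t result = -1;

    if (needle_len == 0)
        return 0;
    if (needle_len > hay_len)
        return -1;

    fail = (size_t *)malloc(needle_len * sizeof *fail);
    if (fail == NULL)
        return -1;

    /* fail[i] = length of the longest proper prefix of needle[0..i]
     * that is also a suffix of it. */
    fail[0] = 0;
    k = 0;
    for (i = 1; i < needle_len; i++) {
        while (k > 0 && needle[i] != needle[k])
            k = fail[k - 1];
        if (needle[i] == needle[k])
            k++;
        fail[i] = k;
    }

    /* Scan the haystack once; on mismatch fall back via the table
     * instead of re-reading haystack elements. */
    k = 0;
    for (i = 0; i < hay_len; i++) {
        while (k > 0 && hay[i] != needle[k])
            k = fail[k - 1];
        if (hay[i] == needle[k])
            k++;
        if (k == needle_len) {
            result = (ptrdiff_t)(i + 1 - needle_len);
            break;
        }
    }

    free(fail);
    return result;
}
```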
19:02 zakharyas joined
TimToady wondered whether CRLF should always be -1 19:08
timotimo we can easily make that happen inside moarvm 19:10
by just creating -1 as soon as moarvm is started
samcv yeah what timotimo said 19:29
19:44 dalek joined, SourceBaby joined
samcv ugh i can't get `memmem` working 19:50
i thought it would return a pointer to the location in the haystack the substring starts at. which is what the manpage says 19:51
but it returns numbers that are inconsistently different than haystack 19:52
i thought if i did `haystack - memmemresult` i would get the size_t from the start of the haystack to the found result and it should be consistent. but it isn't
and i'm very confused
whatever it's pointing to seems to hold the same data every time i run it. but it isn't the right memory region... 20:00
timotimo, help 20:01
man page if anybody needs it... argh.
if we can get it to work we can get knuth-morris-pratt optimized string search tho 20:02
then include the source for mac/windows since it's not a standard c lib function 20:04
timotimo hmm 20:22
samcv: seems correct to me 20:24
you're telling it to look for "a" and it finds it right at the beginning
when you make needlelen 2 instead of 1, it'll find it a bit later
haystack[4195968] needle[4195976] found[4195972]
i.e. 4 bytes in
samcv ok that's not what i get 20:25
i get something totally not that
haystack[94595572664408] needle[94595572664416] found[3212937304]
timotimo did you #define _GNU_SOURCE?
it doesn't compile otherwise
32bit machine perchance?
samcv it compiles for me. without that. it's 64bit 20:26
timotimo the char output is b0rked on my end
samcv and adding #define _GNU_SOURCE does not help me
that's okay though. as long as it's giving the right answer...
i just get random different found values.... 20:27
that are like much smaller than anything else
and it's not even consistent when i do haystack - found
timotimo oh
you're storing the result of memmem into an int
int isn't defined to be able to store a pointer
samcv yes 20:28
timotimo that's probably why you're getting such a bogus result. and i have no idea why i'm getting a correct one. perhaps it's storing these things in a memory location much closer to 0 so that it fits into 32bit?
samcv it still does not work XD that's what i tried FIRST
if i set it void * or char * i get even CRAZIER ranging values
haystack[94913660565544] needle[94913660565552] found[18446744072887842856]
T_T 20:30
ok now it's working.
idk what options my in editor C compiler uses. but it works fine compiling myself
weird....... 20:31
will have to look into that
ok well at least i have a working example. will try and make it work in mvm now 20:33
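To illustrate the pointer handling discussed above: the search result should be kept in a pointer type, and the offset is the found pointer minus the haystack pointer, held in a `ptrdiff_t` instead of an `int`. The sketch below uses a hypothetical naive `find_bytes` as a portable stand-in for `memmem`, so it does not depend on any particular libc or feature macro.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical portable stand-in for memmem: naive byte-wise search.
 * Returns a pointer to the first match, or NULL if there is none. */
const unsigned char *find_bytes(const unsigned char *hay, size_t hay_len,
                                const unsigned char *needle,
                                size_t needle_len) {
    size_t i;
    if (needle_len == 0)
        return hay;
    if (needle_len > hay_len)
        return NULL;
    for (i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    return NULL;
}

/* Byte offset of needle within hay, or -1 if not found. The result of the
 * search stays a pointer; only the difference is converted to an integer. */
ptrdiff_t byte_offset_of(const void *hay, size_t hay_len,
                         const void *needle, size_t needle_len) {
    const unsigned char *base = (const unsigned char *)hay;
    const unsigned char *found =
        find_bytes(base, hay_len, (const unsigned char *)needle, needle_len);
    if (found == NULL)
        return -1;
    return found - base; /* found minus haystack, not the other way round */
}
```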
timotimo sorry, i was just out on the balcony to see an iridium flare 20:35
but isn't memmem using byte-granularity?
so we'd have to continue searching until we hit a properly aligned one, right?
samcv ok i think i fixed it 20:36
in mvm 20:37
oh you mean if by some magic the byte is found between bytes?
err for grapheme32's i guess. but yeah i think it does by byte 20:38
i don't think that will happen. but we will have to have a check to make sure that does not occur
however unlikely it may be. it is possible 20:39
timotimo mhm
hm, say ...
samcv (maybe possible)
timotimo if it does a fancy algorithm, it may discover by itself that it can skip almost all alignments, except of course the proper one
samcv but until we know for sure we need to check
well it does do fancy
which is why i want it
timotimo how much fancier than boyer-moore is this? 20:40
samcv maybe less? idk 20:41
timotimo it's not mentioned once in the wikipedia article
samcv i do know it's a lot faster than what we have now
timotimo right
it's sad we hardly spend any time searching for a needle in a haystack when compiling the core setting
samcv i've heard of it before. it's different than boyer moore. not quite as complex i think
gonna run spectest now 20:43
yay spectest pass 20:46
ok. now. so can a 32 bit number exist in an array of 32 bit numbers, not at a normal offset 20:47
i am not sure how to prove or disprove that 20:48
well easiest would be to try to construct one synthetically i guess 20:49
nice timotimo. it's 2.2x faster :O 20:51
doing 'a' x 100000 ~~ /b/;
worst case
that is huge 20:52
timotimo nice 21:03
it's easy for such a 32bit number to exist inside two 32bit numbers 21:04
how many f is 32bit again %)
two is 8bit, so 8 is 32bit 21:05
0000ffff, ffff0000 has ffffffff in it at a non-aligned offset
samcv yeah. wondering what the probability of that is
timotimo hm 21:07
imagine you have a number that has its lowest byte full
m: say 0x100.uniname
timotimo okay, so imagine we have two nullbytes and then this, and we're looking for \1 21:08
m: say ("A".ord +< 8).uniname
camelia <CJK Ideograph Extension A>
timotimo hmm
samcv ok i need a break. bbs 21:35
i'm getting very frustrated trying to use the memmem function standalone. 21:39
i try and rename it and import it. and then it just says it can't find it 21:40
timotimo, let me know if you come up with some way i could test the bug and check when it triggers. would be nice to write a test for it
timotimo, 21:51
timotimo that's the page i looked at 21:54
samcv timotimo, maybe you can at least write some code to detect if it's not on the boundary. or help me here's the commit
it works fine. but for the 32bit graphemes has that rare bug with overlap 21:55
so the pointer subtraction works as expected since i store the result of memmem in a MVMGrapheme32* 21:56
so uh. do i cast as char * or what and then subtract them? not 100% sure
to find out if it's not cleanly divisible by 32 bits (4 bytes)
(cast both memmemrtrn32 and haystack-> as char * i mean) 21:57
22:26 agentzh_ joined
timotimo okay so i quartered the workload and it's still going after 11h30m 22:57
samcv well it compiled fine on mac on travis
bsd has a non optimized memmem thing. not sure how it compares in speed to what we previously did 22:58
timotimo maybe it's better to cast to intptr_t, but char* should be fine 22:59
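One way the boundary check being asked about could be written, as a sketch under stated assumptions: `find_bytes` is a hypothetical naive stand-in for `memmem`, and `find_u32_aligned` is a hypothetical wrapper, not the code in the commit. Both pointers are cast to `unsigned char *`, the byte offset is tested against `sizeof(uint32_t)`, and a misaligned hit causes the search to resume one byte later.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical portable stand-in for memmem: naive byte-wise search. */
const unsigned char *find_bytes(const unsigned char *hay, size_t hay_len,
                                const unsigned char *needle,
                                size_t needle_len) {
    size_t i;
    if (needle_len == 0)
        return hay;
    if (needle_len > hay_len)
        return NULL;
    for (i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    return NULL;
}

/* Index (in 32-bit units) of the first occurrence of needle that starts on
 * a 32-bit boundary of hay, or -1 if there is none. */
ptrdiff_t find_u32_aligned(const uint32_t *hay, size_t hay_count,
                           const uint32_t *needle, size_t needle_count) {
    const unsigned char *base = (const unsigned char *)hay;
    const unsigned char *nbytes = (const unsigned char *)needle;
    size_t hay_bytes = hay_count * sizeof(uint32_t);
    size_t needle_bytes = needle_count * sizeof(uint32_t);
    size_t start = 0;

    if (needle_count == 0)
        return 0;
    while (start <= hay_bytes && needle_bytes <= hay_bytes - start) {
        const unsigned char *found =
            find_bytes(base + start, hay_bytes - start, nbytes, needle_bytes);
        size_t off;
        if (found == NULL)
            return -1;
        off = (size_t)(found - base);
        if (off % sizeof(uint32_t) == 0)
            return (ptrdiff_t)(off / sizeof(uint32_t));
        start = off + 1; /* byte-level hit straddles two elements: retry */
    }
    return -1;
}
```

A test case can be built from explicit bytes so it does not depend on endianness: copying the eight bytes `00 00 ff ff ff ff 00 00` into a two-element `uint32_t` array gives a haystack in which a byte-wise search for `0xffffffff` hits at byte offset 2, while neither element equals `0xffffffff`. On a little-endian machine that byte layout corresponds to the pair `0xffff0000, 0x0000ffff`, the reverse of the order written in the example earlier in the log.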
the file is 11 gigs big %)
samcv nice 23:00
timotimo just waiting for it to finish writing it out
oh, i think it just finished
i wonder how long it'll take to perf report that :D 23:01
[ perf record: Woken up 43617 times to write data ] 23:02
[ perf record: Captured and wrote 10904.887 MB (156696959 samples) ]
samcv timotimo, i think there must be a bug 23:33
at 502017, unexpected \ inside list of things in an array 23:34
in sub parse-array
in JSON::Fast
can you compress perf data 23:38
also working on making that loop even tighter. i love now i can run roast and have it finish in 3.5mins 23:46
Geth MoarVM: samcv++ created pull request #573:
Have two part loop in collapse strands to make loop tighter when possible
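A guess at what a two-part loop of this kind could look like, written as a standalone sketch and not taken from the pull request. `copy_graphemes_two_part` is a hypothetical name and the bounds are the ones quoted earlier in the log. The first loop copies and range-checks; as soon as one wide grapheme is seen, the second loop copies the remainder with no per-element check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies n graphemes from src to dst and returns 1 if
 * all of them fit the 8-bit range used in the discussion, else 0. */
int copy_graphemes_two_part(const int32_t *src, int32_t *dst, size_t n) {
    size_t i = 0;
    int can_use_8bit = 1;

    /* Part 1: copy and check until the first out-of-range grapheme. */
    for (; i < n; i++) {
        int32_t g = src[i];
        dst[i] = g;
        if (g < -127 || g > 127) {
            can_use_8bit = 0;
            i++; /* this element is already copied */
            break;
        }
    }

    /* Part 2: the answer is known, so copy the remainder unchecked. */
    for (; i < n; i++)
        dst[i] = src[i];

    return can_use_8bit;
}
```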