02:11 tokuhirom joined 03:48 tokuhirom joined 04:42 vendethiel joined 05:58 FROGGS[mobile] joined 06:10 FROGGS joined 06:38 Ven joined 06:55 zakharyas joined 07:03 domidumont joined 08:14 Ven joined 08:27 kjs_ joined 09:01 kjs_ joined
jnthn lizmat: yes, they look fine; thanks 09:33
m: say uniname(0x6E) 10:07
camelia rakudo-moar f3aace: OUTPUT«LATIN SMALL LETTER N␤»
jnthn m: say uniname(0x2BC)
camelia rakudo-moar f3aace: OUTPUT«MODIFIER LETTER APOSTROPHE␤»
jnthn m: say chr(0x2BC)
camelia rakudo-moar f3aace: OUTPUT«ʼ␤»
jnthn m: say chr(0x149) 10:08
camelia rakudo-moar f3aace: OUTPUT«ʼn␤»
jnthn m: say chr(0x1F0); say uniname(0x30C) 10:10
camelia rakudo-moar f3aace: OUTPUT«ǰ␤COMBINING CARON␤»
jnthn m: say Uni.new(0x6A, 0x30C).NFC 10:12
camelia rakudo-moar f3aace: OUTPUT«NFC:0x<01f0>␤»
jnthn m: say Uni.new(0x03B9, 0x0308, 0x0301).NFC 10:15
camelia rakudo-moar f3aace: OUTPUT«NFC:0x<0390>␤»
jnthn m: say Uni.new(0x03B9, 0x0308, 0x0301).Str
camelia rakudo-moar f3aace: OUTPUT«ΐ␤»
10:46 Ven joined
dalek arVM: 12c5262 | jnthn++ | src/strings/normalize. (2 files):
Add normalizer API to push many codes to buffer.

To ease implementing post-case-change renormalization.
11:07
arVM: 741cb9c | jnthn++ | src/strings/ops.c:
Enforce NFG over growing case folds.
11:52 brrt joined
brrt good * #moarvm 11:57
timotimo heyo brrt 11:59
brrt i have established, among other things, that our dear frame at the very least does not return from the JIT code with its entry label NULLed 12:01
(the EXPR frame, that is)
consequently, it must be NULLed somewhere else 12:02
or, I must have made a mistake in my gdb
timotimo it's a very curious thing to happen in any case! 12:03
brrt setting a breakpoint on jit field entry is actually extremely expensive
s/field entry/leaving code/ 12:04
timotimo maybe you can find a conditional you can append to the breakpoint that'll make it fire only when we've reached the right frame?
brrt wonders, nay, is sure that adding a sequence number to JITted frames will make debugging easier
yeah, that's what i do
but it's probably significantly cheaper to create the branch in code and have the breakpoint there 12:05
otherwise it breakpoints all the time, has to check the condition, and continue
12:07 Ven joined 12:13 Ven joined
jnthn brrt: One other thing to check: does a deopt happen at any point? 12:31
(Between the entering of the frame with the inlined lexical and the failed lookup)
Bit of a long shot though as we shoulda found the thing in the uninlined frame...
brrt not sure jnthn, i'll check that too 12:32
timotimo we could just turn deopt off via an env var! /s
brrt great idea :-P 12:33
jnthn Given that deopt stops us going ahead and reading bogus memory...maybe not :P 12:34
But there *are* deopt logging things
Which could be turned on by env var
brrt oh, that is probably an excellent idea 12:36
the dynvar log was also very helpful :-) 12:37
timotimo mhm
brrt deopt is actually pretty unlikely, since i'd have to explain why the first 200 or so iterations we don't have deopt, and in the last one we do 12:50
i still suspect that something is compiled underneath the old frame, and that the jit label is reinitialized somehow 12:53
timotimo we don't have to deopt if we're not opted in the first place 12:56
remember that spesh has multiple candidates per call site
brrt aye, that is true 13:06
13:13 Ven joined 13:48 synbot6 joined 13:51 Ven joined 14:52 domidumont joined 14:55 domidumont joined 15:16 Ven joined 15:24 Ven joined
flussence I decided to mess around with callgrind... a full quarter of the CORE.setting compile is spent in MVM_sc_find_object_idx (twice as much as MVM_interp_run!) 15:48
jnthn o.O 15:51
That's...*really* odd, we're meant to be annotating the objects with that index so it's O(1)...
flussence++
flussence just ooc, I tried making it iterate over that array in the opposite direction. That shaves a few tenths of a second off most of the compile stages, but doubles the MAST part... 15:54
timotimo sort and binary search? 15:57
16:00 Peter_R joined
jnthn But we should only be searching it as the "fallback" case 16:01
16:04 Ven joined
timotimo well spotted in any case 16:05
flussence oh, MVM_get_idx_in_sc is inlined, so it doesn't show counts for that, but that hardly looks expensive to begin with.
timotimo is that run time spent in there? could be that it spends a lot of cycles there, but has hardly any cache traffic? 16:07
flussence that's exclusive-time in cycles (52 billion out of 211 billion)
good question though, how big *is* that array it's scanning? 16:08
timotimo cachegrind would be able to give you more idata on that, i believe 16:09
i mean
on how it behaves in relation to the cache
jnthn flussence: It grows over the setting compile, it reaches over 64,000 items 16:10
Which is why we're meant to also cache the index on objects, but apparently that's not working as it should
flussence 64k sounds like it should fit into my CPU, but I'll re-run with --cache-sim=yes anyway. Might take a while... 16:11
jnthn We still shouldn't be scanning it though 16:12
timotimo i *think* --cache-sim=yes only has to simulate the L3 cache - or something like that? 16:13
16:26 tokuhirom joined 17:10 FROGGS[mobile] joined 17:22 leont joined 17:28 cygx joined
cygx o/ 17:28
nwc10 \o
timotimo sup cygx :) 17:30
cygx Well, I looked at what I did back in February to see if there's anything that might be useful to push upstream 17:31
the JVM open modes have already been merged
I've 2 other branches of more questionable usefullness
github.com/MoarVM/MoarVM/compare/m...rtualfiles and github.com/MoarVM/MoarVM/compare/m...x:embedapi
the former allows to register virtual bytecode files, which is how I created a single-file NQP executable, embedding the .moarvm files 17:32
the latter is a vtable-based API, useful eg for a MoarVM plugin 17:33
17:42 FROGGS joined
flussence that run with --cache-sim *finally* finished... and it shows pretty much nothing in terms of misses. /me shrugs 17:45
timotimo oh
no misses! that's perfect! ;)
flussence well, not zero, but around 0-5%... 17:46
timotimo that's surprisingly good 17:47
i mean, the whole program's gotta go into cache via memory at least once
well, all the parts we run anyway
flussence for all the time MVM_sc_find_object_idx spends, only 2% of that is cache misses, 20% is cache hits, and I guess the remainder is it looping over that list 17:49
FROGGS jnthn: about serialization: gist.github.com/FROGGS/cf2a24d89361e760e7bc 17:52
18:20 dalek joined 18:21 FROGGS[mobile] joined 18:25 vendethiel joined 18:48 dalek joined
dalek arVM: 3b09c21 | jnthn++ | tools/ucd2c.pl:
Add SpecialCasing to Unicode DB generation script.
19:20
arVM: 0981b5e | jnthn++ | src/strings/unicode_ (2 files):
Update Unicode database with SpecialCasing.
arVM: eea3ed8 | jnthn++ | src/strings/unicode_ops.c:
First pass at handling special casings.

Covers most of the cases, though some further work will be needed to properly handle these in combination with synthetic codepoints.
19:26
leont o/
FROGGS hi leont
jnthn The synthetics aside, there's only that darned final sigma greek thing to deal with 19:27
timotimo is the "final sigma" the only letter in greek that does that? 19:31
jnthn Yeah, but it *seems* that to implement it you have to look ahead one char to see if it's the "end of the word"
jnthn is struggling to find a rigourous definition of what that means
timotimo ugh. and that's always unhappy when we're not sure if any more data is going to come in
jnthn Well, this is for uc/tc/lc etc. so we'll just have to go on the string you give us 19:32
omg, it seems you maybe have to look behind also 19:37
timotimo urgh 19:39
but why
jnthn Because the rule only applies if it's the final letter in the word, but not the only letter in the word 19:40
timotimo but wait ... this is only about changing case on a string we already have, right?
jnthn knight666.com/blog/the-curious-case...nal-sigma/ is helpful
Right
It's not a probably to implement the lookaround
timotimo OK, so not necessarily a problem with regards to data coming through a tcp connection or something
jnthn Just want to get it right
timotimo it's also a plus if we don't take a performance hit for every single string we case-change just because there could be a sigma somewhere 19:41
jnthn Aha. That uses the GeneralCategory of Letter.
And then start/end of string bound checks
And I was thinking \w and then the latter, but Letter is righter 19:42
timotimo: I'll later special-case ASCII to avoid the property lookups completely
Anyway, think I did enough for today, but will deal with final sigma and some tests tomorrow :) 19:44
timotimo \o/ 19:52
food is done! 19:54
[Coke] jnthn++ 20:00
lizmat jnthn++ indeed :-) 20:03
20:23 cygx left 20:28 tokuhirom joined 21:30 LLamaRider joined 22:24 _itz_ joined 23:19 tokuhirom joined