samcv callgrind so slow 00:34
it's been like 20 minutes almost. and at least i think it's almost done with what used to take like 7 secs to run 00:35
MasterDuke jnthn++ 01:23
Geth MoarVM: f2acad4215 | (Samantha McVey)++ | src/strings/ops.c
MVM_string_chr: Only allocate and normalize for cp's that require it

Makes nqp::chr 2x faster for codepoints that don't require normalization.
01:55
MasterDuke cool 02:03
02:51 evalable6 joined 02:52 unicodable6 joined, bisectable6 joined, bloatable6 joined, committable6 joined, benchable6 joined, statisfiable6 joined 04:17 mst joined 05:58 vendethiel joined 07:19 dalek joined 07:37 unicodable6 joined 08:00 domidumont joined 08:09 domidumont joined 09:07 domidumont joined 10:56 vendethiel joined
Geth MoarVM: 8f9325b8ec | (Samantha McVey)++ | src/strings/ops.c
chr: For cp < 0x300 short circuit a unicode property test

If the codepoint decomposes we may need to normalize it. The first cp that decomposes is U+0340. To be on the safe side, for now we go with the first significant character, which at the time of writing is COMBINING GRAVE ACCENT U+0300.
This saves us a property check on the Decomposition_Type property.
11:46
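The short circuit described in the commit above can be sketched roughly as follows. This is a Python illustration, not the C code in src/strings/ops.c: NFC is used as a stand-in for MoarVM's own normalization, and the names `FAST_PATH_LIMIT` and `chr_normalized` are hypothetical.

```python
import unicodedata

# Hypothetical cutoff mirroring the commit message: COMBINING GRAVE ACCENT.
FAST_PATH_LIMIT = 0x300

def chr_normalized(cp: int) -> str:
    """Return the string for codepoint cp, normalizing only when needed."""
    if cp < FAST_PATH_LIMIT:
        # Fast path: skip the property check and the normalization step.
        return chr(cp)
    # Slow path: NFC stands in for MoarVM's normalization (an assumption).
    return unicodedata.normalize("NFC", chr(cp))

# Check the assumption behind the fast path against this Python's Unicode data:
# no single codepoint below the cutoff changes under NFC.
fast_path_is_safe = all(
    unicodedata.normalize("NFC", chr(cp)) == chr(cp)
    for cp in range(FAST_PATH_LIMIT)
)
print(fast_path_is_safe)

# U+0340 is the codepoint the commit message names as the first that changes.
print(hex(ord(chr_normalized(0x340))))
```

The cutoff is deliberately lower than strictly necessary, which matches the "safe side" reasoning in the commit message.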
12:03 travis-ci joined
travis-ci MoarVM build errored. Samantha McVey 'chr: For cp < 0x300 short circuit a unicode property test' 12:03
travis-ci.org/MoarVM/MoarVM/builds/214953601 github.com/MoarVM/MoarVM/compare/f...9325b8ec20
12:03 travis-ci left
samcv looks like travis stalled 12:09
restarting test
Geth MoarVM: samcv++ created pull request #563:
UTF-8 Provide a better error when failing to encode surrogates
12:14
14:06 AlexDaniel joined 14:07 unicodable6 joined 14:16 domidumont joined 15:25 vendethiel joined
timotimo ha 16:35
i just took a pretty thinkful shower
not only did i figure out a way to probably make JSON::Fast do its strings more quickly and hopefully also with much less memory overhead
but also i figured out a new class of json blobs that will asplode violently in our parsers but not in others 16:36
consider: "hello\ñworld" 16:37
MasterDuke you should get a waterproof laptop and take it with you whenever you shower
timotimo depending on whether that's a composed or decomposed ñ, it'll have to be rejected or allowed
MasterDuke tricky 16:38
timotimo easy-ish
also, i'm wondering if anybody would enjoy having a :path argument to parse that'll only turn parts of the whole json document into actual objects 16:39
if you're writing code like from-json($foo)[3]<bloop>[1] anyway, why create objects for the whole rest? 16:42
but that might be better in a JSON::Fast::Lazy or something
JSON::Fast really, really wants the impose-no-normalization stuff in place %) 16:44
MasterDuke for `perl6 --target=mast -e 'class :: { has int64 $.x; }.new( x => 9223372036854775807 ).x.say' | grep bindattr`, would you expect to see a bindattr_i? 16:52
samcv good *
lol timotimo
please provide samples :) 16:53
timotimo samples of what now?
samcv oh the thing that make things explode
timotimo ["a\ñb"] - invalid json 16:54
["a\ñb"] - valid json
both are considered invalid by JSON::Fast
samcv heh
that should not happen
well. give me the codepoints involved and i will see if mvm is doing anything wrong or could be better
timotimo no, it's my code that's doing it wrong 16:55
it sees the \ and then looks for an n
but instead of an n it gets ñ no matter whether it was n + combiner or n-with-combiner in the input
the only other option for JSON::Fast right now is to accept both as valid 16:56
even though the other one is supposed to be rejected
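The problem timotimo describes can be reproduced outside JSON::Fast. A minimal sketch in Python, assuming the standard `json` module as the "other parser" and NFC normalization as a stand-in for what happens to the input string before the parser sees it (the helper `try_parse` is hypothetical):

```python
import json
import unicodedata

# Backslash followed by precomposed U+00F1: not a legal JSON escape.
composed = '["a\\\u00f1b"]'
# Backslash, plain "n", then U+0303 COMBINING TILDE: a legal \n escape
# followed by a lone combiner.
decomposed = '["a\\n\u0303b"]'

def try_parse(text):
    """Return the parsed value, or None if the text is rejected."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

print(try_parse(composed))     # rejected
print(try_parse(decomposed))   # newline followed by the combiner

# Normalizing first fuses "n" + U+0303 into U+00F1, so the valid
# document becomes indistinguishable from the invalid one.
normalized = unicodedata.normalize("NFC", decomposed)
print(normalized == composed)
print(try_parse(normalized))   # rejected
```

A parser that only ever sees the normalized string has to treat both inputs the same way, which is why the choice is between rejecting both and accepting both.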
samcv rejected? wouldn't it just not be a newline and still be valid? 17:07
dogbert17 anyone else getting spectest fails? 17:08
yoleaux2 24 Mar 2017 01:28Z <MasterDuke> dogbert17: do you get a SIGABRT in t/spec/S17-supply/supplier-preserving.t if you run it with valgrind?
samcv on which dogbert17 ?
dogbert17 t/spec/S02-types/pair.rakudo.moar and t/spec/integration/advent2011-day23.t 17:09
timotimo samcv: no, the railroad diagram for string says a \ must be followed by either ", \, /, b, f, n, r, t, or u + 4 hexdigits 17:10
dogbert17 # Failed test 'List.invert maps via a required Pair binding'
# at t/spec/S02-types/pair.rakudo.moar line 382
samcv ah i see timotimo
timotimo so \ followed by a n-with-tilde codepoint is wrong and is to be rejected
samcv yeah i think i got that one maybe dogbert17
timotimo though the verbiage of the standard doesn't actually say that other things are disallowed 17:11
oh, no, actually it does
"All characters may be placed within the quotation marks except for the characters that must be escaped"
samcv ah kk
timotimo and \ must be escaped
i think i understood that right
samcv would be nice if it warned but didn't like totally break in cases like that. idk. being able to decode something is useful 17:12
even if it may not be 100% compliant
but regardless. that should be a secondary concern
to not having things be broken
timotimo i can live with accepting both and resulting in a newline followed by a lone combiner
for the time being, that is
i'd also like to give it a switch that'd allow trailing , inside [] and a} 17:13
sorry, {}
samcv dogbert17, yeah i get test 178 failing for pair.t 17:14
List.invert maps via a required Pair binding
17:21 domidumont joined
dogbert17 samcv, I got the same 17:21
timotimo hey samcv, what are "unicode noncharacters"?
code.google.com/archive/p/json-tes...e/issues/1 ?!?! 17:22
like, codepoints outside the specified range? or something?
dogbert17 test 91 fails as well 'hash stringification'
timotimo seriot.ch/parsing_json.php this ought to be good for me 17:23
The previous section discussed non-Unicode codepoints that appear in strings, such as "\uDEAD", which is valid Unicode in its u-escaped form, but doesn't decode into a Unicode character. 17:25
^- probably that
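A small Python sketch of what the quoted sentence means, assuming Python's `json` module behaves like the permissive parsers the article discusses: the escape "\uDEAD" is accepted, but the result is a lone surrogate that cannot be encoded as UTF-8. The helper name `utf8_or_none` is hypothetical.

```python
import json

# The JSON text is a string containing only the escape \uDEAD.
decoded = json.loads('"\\uDEAD"')

# The parser hands back a single code point in the surrogate range.
is_lone_surrogate = len(decoded) == 1 and 0xD800 <= ord(decoded) <= 0xDFFF
print(is_lone_surrogate)

def utf8_or_none(text):
    """Encode text as UTF-8, or return None if it cannot be encoded."""
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        return None

# Strict UTF-8 encoders refuse surrogates, so this yields None.
print(utf8_or_none(decoded))
```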
samcv noncharacter?. uh. does that mean nonvisible characters? 17:26
timotimo i don't think that's what they mean 17:27
the standard seems to use "character" and "codepoint" interchangeably
samcv yes timotimo 17:30
timotimo maybe when they say "noncharacter" i have to substitute "code point" for "character" and it becomes "noncode point"
samcv when unicode in their docs say character. they mean *unicode thing that has an assigned codepoint*
timotimo so ... like how 420 is code for weed
so 421 is probably a noncode point? 17:31
samcv maybe
timotimo that must be it
samcv haha
timotimo does the ECMA accept 420?
samcv m: say 420.uniname 17:32
camelia LATIN CAPITAL LETTER P WITH HOOK
samcv P for pot.
this is intentional obviously
though 421 is small letter p with hook so idk
timotimo omg 17:33
m: say 0x420.uniname
camelia CYRILLIC CAPITAL LETTER ER
timotimo m: say 0x420.chr
camelia Р
timotimo OMFG
samcv LOL 17:34
timotimo i'd like to tweet this discovery, is that okay with you? do you want credit? :P
samcv no you deserve full credit
timotimo FakeUnicode might just love this
hm 17:35
did i copypaste that correctly?
m: say "РP".ords
camelia (1056 80)
timotimo looks like
samcv neither of those numbers are 420 though 17:36
timotimo twitter.com/loltimo/status/845690720098963457
m: say 0x420
camelia 1056
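The mix-up in this exchange is decimal 420 versus hexadecimal 0x420. A short Python equivalent of the camelia one-liners above, using only the standard `unicodedata` module (the variable name `names` is just for illustration):

```python
import unicodedata

# Decimal 420, hexadecimal 0x420 (decimal 1056), and the ASCII letter P (80).
names = {cp: unicodedata.name(chr(cp)) for cp in (420, 0x420, ord("P"))}

for cp, name in names.items():
    print(f"{cp} = U+{cp:04X} {name}")
```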
samcv oh 17:37
but 80 is not 420 though
timotimo yeah, the other one is what i get when i press P
samcv ye
Zoffix but... 0x420.chr is "r" :/
timotimo wait, what? 17:38
m: say 0x420.chr
camelia Р
Zoffix 0x420 is the cyrillic version of "R"
timotimo so it's code for Rot?
samcv there's gotta be at least one russian slang word for pot that starts with that letter. i mean. you could probably take half the latin alphabet and come up with slang synonyms 17:41
so it's really just an effort of statistics 17:42
Zoffix can't think of any... only K and T 17:44
timotimo perfect
Zoffix And well, The russian "P"; Basically the slang is "plan"
samcv use Test; plan 420; 17:46
Geth MoarVM/even-moar-jit: 1012a46f65 | (Bart Wiegmans)++ | 7 files
Split value-nodes from void nodes

IF, DO and CALL would return a value only if their child nodes were returning a value. (CALL had a parameter signifying the 'return type'). But because the tiler does not take such parameters into account, this leads to confusing results. By splitting the value-yielding from void uses of the nodes, each node now corresponds to exactly one type, and no conflicts can occur. Although this increases the number of nodes, it reduces the number of mechanisms, which is pretty interesting.
18:06
samcv timotimo, can we use macros for printf? like for bit sizes? 18:10
how do we ensure it is the correct size 18:11
18:18 unicodable6 joined 19:09 bartolin joined
bartolin the other day I stumbled across unicode noncharacters while looking at RT #130914 19:11
synopsebot6 Link: rt.perl.org/rt3/Public/Bug/Display...?id=130914
bartolin there are some faq for those: www.unicode.org/faq/private_use.html#nonchar1
samcv: do you have an opinion about that ticket, perhaps? 19:13
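For reference, the noncharacters covered by that FAQ are a fixed set: U+FDD0..U+FDEF plus the last two code points of each of the 17 planes. A minimal Python sketch (the function name `is_noncharacter` is hypothetical, and this says nothing about how Rakudo or MoarVM treat them):

```python
def is_noncharacter(cp: int) -> bool:
    """True if cp is one of the Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        return False
    # The contiguous block U+FDD0..U+FDEF.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # U+nFFFE and U+nFFFF at the end of every plane.
    return (cp & 0xFFFE) == 0xFFFE

noncharacter_count = sum(is_noncharacter(cp) for cp in range(0x110000))
print(noncharacter_count)

# Noncharacters are distinct from surrogates such as U+DEAD.
print(is_noncharacter(0xFFFE), is_noncharacter(0xDEAD))
```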
timotimo samcv: i'm not sure what you mean? like tell the compiler that the arguments to our function behave like printf? 19:14
19:19 FROGGS joined
timotimo we do get warnings about format strings and too-small identifiers all the time 19:19
samcv: you know, "hookah" is a device you can use to consume pot 19:23
samcv: so latin capital letter P with hook(ah) is also very good 19:24
19:25 zakharyas joined
lizmat .tell jnthn how threadsafe is nqp::p6firstflag ? 21:12
yoleaux2 lizmat: I'll pass your message to jnthn.