samcv | callgrind so slow | 00:34 | |
it's been like 20 minutes almost. and at least i think it's almost done with what used to take like 7 secs to run | 00:35 | ||
MasterDuke | jnthn++ | 01:23 | |
Geth | MoarVM: f2acad4215 | (Samantha McVey)++ | src/strings/ops.c MVM_string_chr: Only allocate and normalize for cp's that require it Makes nqp::chr 2x faster for codepoints that don't require normalization. |
01:55 | |
MasterDuke | cool | 02:03 | |
02:51
evalable6 joined
02:52
unicodable6 joined,
bisectable6 joined,
bloatable6 joined,
committable6 joined,
benchable6 joined,
statisfiable6 joined
04:17
mst joined
05:58
vendethiel joined
07:19
dalek joined
07:37
unicodable6 joined
08:00
domidumont joined
08:09
domidumont joined
09:07
domidumont joined
10:56
vendethiel joined
Geth | MoarVM: 8f9325b8ec | (Samantha McVey)++ | src/strings/ops.c chr: For cp < 0x300 short circuit a unicode property test If the codepoint decomposes we may need to normalize it. The first cp that decomposes is U+0340, To be on the safe side, for now we go with the first significant character which at the time of writing is COMBINING GRAVE ACCENT U+300 This saves us a property check on the Decomposition_Type property. |
11:46 | |
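A minimal Python sketch of the short circuit described in the commit above. It is not the MoarVM C code: `chr_nfc` and `FIRST_COMBINING` are hypothetical names, and Python's `unicodedata.normalize` stands in for MoarVM's own normalizer. The idea is only that codepoints below U+0300 can be returned without any normalization work.

```python
import unicodedata

# Hypothetical constant: COMBINING GRAVE ACCENT, the cutoff named in the commit.
FIRST_COMBINING = 0x300


def chr_nfc(cp: int) -> str:
    """Return the NFC form of a single codepoint, skipping work when possible."""
    if cp < FIRST_COMBINING:
        # Fast path: a lone codepoint below U+0300 is unchanged by NFC.
        return chr(cp)
    # Slow path: let the normalizer decide whether anything changes.
    return unicodedata.normalize("NFC", chr(cp))


# Sanity check of the assumption behind the fast path.
assert all(
    unicodedata.normalize("NFC", chr(cp)) == chr(cp)
    for cp in range(FIRST_COMBINING)
)
```

The cutoff is deliberately conservative, as the commit message says: U+0340 is the first codepoint that actually changes, but testing against U+0300 is just as cheap.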
12:03
travis-ci joined
travis-ci | MoarVM build errored. Samantha McVey 'chr: For cp < 0x300 short circuit a unicode property test | 12:03 | |
travis-ci.org/MoarVM/MoarVM/builds/214953601 github.com/MoarVM/MoarVM/compare/f...9325b8ec20 | |||
12:03
travis-ci left
samcv | looks like travis stalled | 12:09 | |
restarting test | |||
Geth | MoarVM: samcv++ created pull request #563: UTF-8 Provide a better error when failing to encode surrogates |
12:14 | |
14:06
AlexDaniel joined
14:07
unicodable6 joined
14:16
domidumont joined
15:25
vendethiel joined
timotimo | ha | 16:35 | |
i just took a pretty thinkful shower | |||
not only did i figure out a way to probably make JSON::Fast do its strings more quickly and hopefully also with much less memory overhead | |||
but also i figured out a new class of json blobs that will asplode violently in our parsers but not in others | 16:36 | ||
consider: "hello\ñworld" | 16:37 | ||
MasterDuke | you should get a waterproof laptop and take it with you whenever you shower | ||
timotimo | depending on whether that's a composed or decomposed ñ, it'll have to be rejected or allowed | ||
MasterDuke | tricky | 16:38 | |
timotimo | easy-ish | ||
also, i'm wondering if anybody would enjoy having a :path argument to parse that'll only turn parts of the whole json document into actual objects | 16:39 | ||
if you're writing code like from-json($foo)[3]<bloop>[1] anyway, why create objects for the whole rest? | 16:42 | ||
but that might be better in a JSON::Fast::Lazy or something | |||
JSON::Fast really, really wants the impose-no-normalization stuff in place %) | 16:44 | ||
MasterDuke | for `perl6 --target=mast -e 'class :: { has int64 $.x; }.new( x => 9223372036854775807 ).x.say' | grep bindattr`, would you expect to see a bindattr_i? | 16:52 | |
samcv | good * | ||
lol timotimo | |||
please provide samples :) | 16:53 | ||
timotimo | samples of what now? | ||
samcv | oh the thing that make things explode | ||
timotimo | ["a\ñb"] - invalid json | 16:54 | |
["a\ñb"] - valid json | |||
both are considered invalid by JSON::Fast | |||
samcv | heh | ||
that should not happen | |||
well. give me the codepoints involved and i will see if mvm is doing anything wrong or could be better | |||
timotimo | no, it's my code that's doing it wrong | 16:55 | |
it sees the \ and then looks for an n | |||
but instead of an n it gets ñ no matter whether it was n + combiner or n-with-combiner in the input | |||
the only other option for JSON::Fast right now is to accept both as valid | 16:56 | ||
even though the other one is supposed to be rejected | |||
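The effect timotimo describes can be reproduced outside Perl 6. The sketch below is only an illustration under an assumption: Python's NFC normalization stands in for the normalization Perl 6 strings go through, and the variable names are made up for the example. Once the input is normalized, the `n` after the backslash has merged with the combining tilde, so the two inputs can no longer be told apart.

```python
import unicodedata

decomposed = "a\\n\u0303b"  # a, backslash, n, COMBINING TILDE, b
composed = "a\\\u00f1b"     # a, backslash, LATIN SMALL LETTER N WITH TILDE, b

# Before normalization the two inputs differ at the codepoint level.
assert decomposed != composed

# After NFC the n and the tilde are one codepoint, so the parser
# never sees a plain 'n' following the backslash.
normalized = unicodedata.normalize("NFC", decomposed)
follows_backslash = normalized[normalized.index("\\") + 1]
```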
samcv | rejected? wouldn't it just not be a newline and still be valid? | 17:07 | |
dogbert17 | anyone else getting spectest fails? | 17:08 | |
yoleaux2 | 24 Mar 2017 01:28Z <MasterDuke> dogbert17: do you get a SIGABRT in t/spec/S17-supply/supplier-preserving.t if you run it with valgrind? | ||
samcv | on which dogbert17 ? | ||
dogbert17 | t/spec/S02-types/pair.rakudo.moar and t/spec/integration/advent2011-day23.t | 17:09 | |
timotimo | samcv: no, the railroad diagram for string says a \ must be followed by either ", \, /, b, f, n, r, t, or u + 4 hexdigits | 17:10 | |
dogbert17 | # Failed test 'List.invert maps via a required Pair binding' | ||
# at t/spec/S02-types/pair.rakudo.moar line 382 | |||
samcv | ah i see timotimo | ||
timotimo | so \ followed by a n-with-tilde codepoint is wrong and is to be rejected | ||
samcv | yeah i think i got that one maybe dogbert17 | ||
timotimo | though the verbiage of the standard doesn't actually say that other things are disallowed | 17:11 | |
oh, no, actually it does | |||
"All characters may be placed within the quotation marks except for the characters that must be escaped" | |||
samcv | ah kk | ||
timotimo | and \ must be escaped | ||
i think i understood that right | |||
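The escape rule quoted from the railroad diagram can be written as a small check. This is a sketch, not JSON::Fast's code: `escapes_are_valid` and `SIMPLE_ESCAPES` are hypothetical names, and the check works on raw, un-normalized codepoints, which is exactly what makes the two ñ inputs come out differently.

```python
import string

# Characters allowed directly after a backslash, besides 'u' + 4 hex digits.
SIMPLE_ESCAPES = set('"\\/bfnrt')


def escapes_are_valid(body: str) -> bool:
    """Check every backslash in a JSON string body (the text between the quotes)."""
    i = 0
    while i < len(body):
        if body[i] != "\\":
            i += 1
            continue
        nxt = body[i + 1 : i + 2]  # empty string if the backslash is last
        if nxt == "u":
            digits = body[i + 2 : i + 6]
            if len(digits) != 4 or any(c not in string.hexdigits for c in digits):
                return False
            i += 6
        elif nxt in SIMPLE_ESCAPES:
            i += 2
        else:
            return False
    return True
```

With raw codepoints, backslash + `n` + combining tilde passes (a newline escape followed by a lone combiner), while backslash + precomposed ñ is rejected.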
samcv | would be nice if it warned but didn't like totally break in cases like that. idk. being able to decode something is useful | 17:12 | |
even if it may not be 100% compliant | |||
but regardless. that should be a secondary concern | |||
to not having things be broken | |||
timotimo | i can live with accepting both and resulting in a newline followed by a lone combiner | ||
for the time being, that is | |||
i'd also like to give it a switch that'd allow trailing , inside [] and a} | 17:13 | ||
sorry, {} | |||
samcv | dogbert17, yeah i get test 178 failing for pair.t | 17:14 | |
List.invert maps via a required Pair binding | |||
17:21
domidumont joined
dogbert17 | samcv, I got the same | 17:21 | |
timotimo | hey samcv, what are "unicode noncharacters"? | ||
code.google.com/archive/p/json-tes...e/issues/1 ?!?! | 17:22 | ||
like, codepoints outside the specified range? or something? | |||
dogbert17 | test 91 fails as well 'hash stringification' | ||
timotimo | seriot.ch/parsing_json.php this ought to be good for me | 17:23 | |
The previous section discussed non-Unicode codepoints that appear in strings, such as "\uDEAD", which is valid Unicode in its u-escaped form, but doesn't decode into a Unicode character. | 17:25 | ||
^- probably that | |||
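The "\uDEAD" case from the quoted article can be tried in Python, used here purely as an illustration and not as a statement about how MoarVM or JSON::Fast behave. Python's `json` module accepts the escape and produces a lone surrogate, which then cannot be encoded as UTF-8. That is the same kind of failure the surrogate pull request mentioned earlier is about.

```python
import json

# The escape is well-formed JSON text, but it names a lone surrogate.
decoded = json.loads('"\\uDEAD"')

# Encoding the result as UTF-8 fails, because surrogates are not
# valid Unicode scalar values.
try:
    decoded.encode("utf-8")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc
```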
samcv | noncharacter?. uh. does that mean nonvisible characters? | 17:26 | |
timotimo | i don't think that's what they mean | 17:27 | |
the standard seems to use "character" and "codepoint" interchangeably | |||
samcv | yes timotimo | 17:30 | |
timotimo | maybe when they say "noncharacter" i have to substitute "code point" for "character" and it becomes "noncode point" | ||
samcv | when unicode in their docs say character. they mean *unicode thing that has an assigned codepoint* | ||
timotimo | so ... like how 420 is code for weed | ||
so 421 is probably a noncode point? | 17:31 | ||
samcv | maybe | ||
timotimo | that must be it | ||
samcv | haha | ||
timotimo | does the ECMA accept 420? | ||
samcv | m: say 420.uniname | 17:32 | |
camelia | LATIN CAPITAL LETTER P WITH HOOK | ||
samcv | P for pot. | ||
this is intentional obviously | |||
though 421 is small letter p with hook so idk | |||
timotimo | omg | 17:33 | |
m: say 0x420.uniname | |||
camelia | CYRILLIC CAPITAL LETTER ER | ||
timotimo | m: say 0x420.chr | ||
camelia | Р | ||
timotimo | OMFG | ||
samcv | LOL | 17:34 | |
timotimo | i'd like to tweet this discovery, is that okay with you? do you want credit? :P | ||
samcv | no you deserve full credit | ||
timotimo | FakeUnicode might just love this | ||
hm | 17:35 | ||
did i copypaste that correctly? | |||
m: say "РP".ords | |||
camelia | (1056 80) | ||
timotimo | looks like | ||
samcv | neither of those numbers are 420 though | 17:36 | |
timotimo | twitter.com/loltimo/status/845690720098963457 | ||
m: say 0x420 | |||
camelia | 1056 | ||
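The mix-up above is decimal 420 versus hexadecimal 0x420. A small Python check of the same lookups, using the standard `unicodedata` module in place of `.uniname` (the variable names are just for the example):

```python
import unicodedata

# Decimal 420 is U+01A4; hexadecimal 0x420 is decimal 1056.
decimal_name = unicodedata.name(chr(420))
hex_name = unicodedata.name(chr(0x420))
```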
samcv | oh | 17:37 | |
but 80 is not 420 though | |||
timotimo | yeah, the other one is what i get when i press P | ||
samcv | ye | ||
Zoffix | but... 0x420.chr is "r" :/ | ||
timotimo | wait, what? | 17:38 | |
m: say 0x420.chr | |||
camelia | Р | ||
Zoffix | 0x420 is the cyrillic version of "R" | ||
timotimo | so it's code for Rot? | ||
samcv | there's gotta be at least one russian slang word for pot that starts with that letter. i mean. you could probably take half the latin alphabet and come up with slang synonyms | 17:41 | |
so it's really just an effort of statistics | 17:42 | ||
Zoffix can't think of any... only K and T | 17:44 | ||
timotimo | perfect | ||
Zoffix | And well, The russian "P"; Basically the slang is "plan" | ||
samcv | use Test; plan 420; | 17:46 | |
Geth | MoarVM/even-moar-jit: 1012a46f65 | (Bart Wiegmans)++ | 7 files Split value-nodes from void nodes IF, DO and CALL would return a value only if their child nodes were returning a value. (CALL had an parameter signifiying the 'return type'). But because the tiler does not take such parameters into account, this leads to confusing results. By splitting the value-yielding from void uses of the nodes, each node now corresponds to exactly one type, and no conflicts can occur. Although this increases the number of nodes, it reduces the number of mechanisms, which is pretty interesting. |
18:06 | |
samcv | timotimo, can we use macros for printf? like for bit sizes? | 18:10 | |
how do we ensure it is the correct size | 18:11 | ||
18:18
unicodable6 joined
19:09
bartolin joined
bartolin | the other day I stumbled across unicode noncharacters while looking at RT #130914 | 19:11 | |
synopsebot6 | Link: rt.perl.org/rt3/Public/Bug/Display...?id=130914 | ||
bartolin | there are some faq for those: www.unicode.org/faq/private_use.html#nonchar1 | ||
samcv: do you have an opinion about that ticket, perhaps? | 19:13 | ||
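For reference, the Unicode Standard defines exactly 66 noncharacters: U+FDD0..U+FDEF plus the last two codepoints of each of the 17 planes. That definition comes from the standard, not from this discussion. A minimal Python sketch of the test, where `is_noncharacter` is a hypothetical helper name:

```python
def is_noncharacter(cp: int) -> bool:
    """True for the 66 codepoints Unicode reserves as noncharacters."""
    if 0xFDD0 <= cp <= 0xFDEF:
        # The contiguous block of 32 noncharacters in the BMP.
        return True
    # U+nFFFE and U+nFFFF at the end of each of the 17 planes.
    return 0 <= cp <= 0x10FFFF and (cp & 0xFFFE) == 0xFFFE


# Count them across the whole codespace.
noncharacter_count = sum(is_noncharacter(cp) for cp in range(0x110000))
```

Note that a lone surrogate such as U+DEAD is not a noncharacter in this sense; surrogates are a separate category.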
timotimo | samcv: i'm not sure what you mean? like tell the compiler that the arguments to our function behave like printf? | 19:14 | |
19:19
FROGGS joined
timotimo | we do get warnings about format strings and too-small identifiers all the time | 19:19 | |
samcv: you know, "hookah" is a device you can use to consume pot | 19:23 | ||
samcv: so latin capital letter P with hook(ah) is also very good | 19:24 | ||
19:25
zakharyas joined
lizmat | .tell jnthn how threadsafe is nqp::p6firstflag ? | 21:12 | |
yoleaux2 | lizmat: I'll pass your message to jnthn. |