|
01:20
tokuhiro_ joined
02:12
colomon joined
02:18
BinGOs_ joined,
btyler_ joined,
leedo_ joined,
[Coke]_ joined
02:19
jnthn_ joined
|
|||
| timotimo | i think we threw the frame pool out because it no longer helped performance | 03:19 | |
| but it wasn't replaced by malloc; it was replaced by the Fixed Size Allocator | |||
|
03:22
tokuhiro_ joined
05:24
tokuhiro_ joined
|
|||
| diakopter | yah, but from the fixed sized allocator, it spends 10% of time in malloc | 06:20 | |
|
07:26
tokuhiro_ joined
07:47
domidumont joined
07:52
domidumont joined
08:24
arnsholt_ joined
09:27
tokuhiro_ joined
09:30
domidumont joined
09:45
Peter_R joined
10:10
kjs_ joined
10:14
BinGOs joined
10:35
vendethiel joined
11:17
tokuhiro_ joined
11:25
domidumont joined
12:03
FROGGS joined
12:23
tokuhiro_ joined
12:24
TimToady joined
13:28
leont joined
|
|||
| timotimo | oh | 13:36 | |
|
14:24
tokuhiro_ joined
14:53
vendethiel joined
16:11
kjs_ joined
16:18
colomon joined
16:26
tokuhiro_ joined
16:37
colomon joined
17:08
zakharyas joined
|
|||
| hoelzro | o/ #moarvm | 17:32 | |
| japhb | o/ | ||
|
18:02
colomon joined
|
|||
| hoelzro | o/ japhb | 18:05 | |
| I've been digging around in MVM_string_utf16_encode_substr, and afaict, it doesn't specify if the output is UTF-16BE or UTF-16LE | 18:06 | ||
| it seems to just use native encoding | |||
| er, endianness | |||
| is that something that should be well defined for that function? | |||
| timotimo | needs a BOM :P | 18:13 | |
| hoelzro | should we set us up the bom in that function, then? | 18:14 | |
| that feels like it belongs at a higher level, maybe | |||
| timotimo | dunno, what does the spec say about its power level? | ||
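Editor's note: hoelzro's point about unspecified byte order can be made concrete with a small sketch. This is not MoarVM code; `utf16_units` and `encode_utf16` are hypothetical names chosen for illustration. The sketch splits code points into 16-bit code units first and only picks a byte order (and an optional BOM) when writing the units out.

```python
import struct
import sys

BOM = 0xFEFF  # U+FEFF, written as the first code unit when a BOM is requested


def utf16_units(text):
    """Split each code point into one or two 16-bit code units."""
    units = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # Inside the BMP: one code unit, the code point itself.
            units.append(cp)
        else:
            # Outside the BMP: a high/low surrogate pair.
            cp -= 0x10000
            units.append(0xD800 | (cp >> 10))
            units.append(0xDC00 | (cp & 0x3FF))
    return units


def encode_utf16(text, byteorder=sys.byteorder, bom=False):
    """Serialize code units in the requested byte order ("little" or "big").

    The default mirrors the "native endianness" behaviour described above,
    which is exactly what makes the output platform-dependent.
    """
    fmt = "<H" if byteorder == "little" else ">H"
    units = ([BOM] if bom else []) + utf16_units(text)
    return b"".join(struct.pack(fmt, u) for u in units)
```

With an explicit `byteorder` the result is the same on every machine; without one, and without a BOM, a reader has no way to tell which order was used.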
|
18:15
colomon joined
|
|||
| hoelzro | > 9000 | 18:15 | |
| timotimo | WHAT NINE THOUSAND | ||
| well, the BMP is a bit more than nine thousand, isn't it? | |||
| hoelzro | 16K, right? | 18:16 | |
| er, no | |||
| 64K | |||
| hoelzro can't math today | |||
| timotimo | math! what is it good for? | 18:17 | |
| dalek | MoarVM: 47ab6f3 | hoelzro++ | src/strings/utf16.c: Resize buffers as needed when taking a UTF-16 substring | 18:18 | |
| MoarVM: 05ad276 | hoelzro++ | src/strings/utf16.c: Initialize repl_length to 0 Otherwise we depend on uninitialized values for growing the buffer | |||
| hoelzro | timotimo: do you think the endianness thing is RT worthy? | 18:22 | |
| timotimo | no clue | 18:24 | |
| i'ven't seen an UTF-16 thing in a long time | |||
| isn't it quite common in asian parts of the world? | 18:25 | ||
| hoelzro | I thought it was just MS stuff | ||
| and Java, but Java uses UCS-2 | |||
| leont | I think so does Oracle | ||
| hoelzro | I'm not sure about asian countries, but I thought that Japan, for example, has stuck with Shift-JIS | ||
| oh, I didn't know that | 18:26 | ||
| timotimo | what is Shift-JIS? | ||
| hoelzro | it's an encoding that was (is?) popular in Japan | ||
|
18:27
tokuhiro_ joined
|
|||
| timotimo | let's see ... | 18:27 | |
| leont | Or at least it's producing CESU-8, which is an eldritch horror | 18:28 | |
| (UTF-8, but with surrogate pairs…) | 18:29 | ||
| hoelzro | wtf | ||
| timotimo | so just like json? | 18:30 | |
| leont | Almost | ||
| AFAIK JSON is Modified UTF-8, which is the same except that a null character is encoded as 0xC0,0x80… | 18:31 | ||
| Which is a Java thing | 18:32 | ||
| Don't see it mentioned in the JSON RFC, I may be mistaken there | 18:33 | ||
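Editor's note: leont's one-line description of CESU-8 ("UTF-8, but with surrogate pairs") can be sketched as follows. `cesu8_encode` is a hypothetical helper written for this note, not a library function. It assumes the input contains no lone surrogates.

```python
def cesu8_encode(text):
    """Encode text as CESU-8.

    BMP code points are encoded exactly as in UTF-8. Code points above
    U+FFFF are first split into a UTF-16 surrogate pair, and each
    surrogate is then encoded as its own 3-byte sequence.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            out += ch.encode("utf-8")
        else:
            cp -= 0x10000
            high = 0xD800 | (cp >> 10)
            low = 0xDC00 | (cp & 0x3FF)
            for unit in (high, low):
                # "surrogatepass" lets Python emit the 3-byte form of a
                # surrogate, which strict UTF-8 forbids.
                out += chr(unit).encode("utf-8", "surrogatepass")
    return bytes(out)
```

For U+1F600, UTF-8 uses the four bytes `F0 9F 98 80`, while this sketch yields the six bytes `ED A0 BD ED B8 80`. That is why a UTF-16 producer can emit CESU-8 without ever recombining surrogate pairs.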
|
18:39
colomon joined
|
|||
| jnthn | hoelzro: I think we should probably have UTF-16 write a BOM and mean native, and add UTF-16-LE and UTF-16-BE | 18:41 | |
| Which can re-use the same code near enough | 18:42 | ||
| And just twiddle the endianness on the way out | |||
| Or in | |||
| leont | "twiddle the endianness on the way out" | 18:43 | |
| arnsholt | leont: What on Earth is the rationale for something like CESU-8? If you're restricted to bytes, wouldn't UTF-8 be simpler? | ||
| leont | ? | ||
| arnsholt: it's cheaper to convert UTF-16 to CESU-8 than to UTF-8, I guess | |||
| jnthn | leont: As in, after grabbing codepoints, doing the surrogate pair split, and so forth | ||
| arnsholt | True, I guess | ||
| hoelzro | jnthn: should I make a ticket for that? | ||
| jnthn | Heck, can even pass in a function pointer | ||
| hoelzro: Yeah, can do | 18:44 | ||
| leont | endianness and surrogates have a clear order in my head | ||
| hoelzro | rt.perl.org/Ticket/Display.html?id=126704 | 18:45 | |
| leont | (possibly I'm misunderstanding what you just said and we're in agreement) | ||
| jnthn | leont: You write the surrogates in a different order too? | ||
| I thought you just wrote the 16-bit values in a different order... | |||
| hoelzro | rt.perl.org/Ticket/Display.html?id=126705 | ||
| leont | No, I don't think so | ||
| We're probably talking past each other, just ignore what I said :-) | 18:46 | |
| hoelzro | jnthn: re: a BOM, though; I would think that would be the responsibility of a higher layer? ex. what if a protocol *always* uses UTF-16BE; does it make sense to throw a BOM on? | ||
| leont | Depends on the protocol | 18:47 | |
| jnthn | leont: en.wikipedia.org/wiki/UTF-16#U.2BD...o_U.2BDFFF seem to agree with what I mean... :) | ||
| hoelzro | leont: right, so why force the BOM if the programmer doesn't need it? | ||
| leont | jnthn: indeed that's the obvious thing | 18:49 | |
| jnthn | bah | 18:51 | |
| "If the BOM is missing, RFC 2781 says that big-endian encoding should be assumed. (In practice, due to Windows using little-endian order by default, many applications similarly assume little-endian encoding by default.)" | |||
| Standards... :/ | |||
| leont | Little-Endian is a bit silly, but given that's how all architectures work nowadays (even ARM switched) it seems a fait accompli | 18:52 | |
| ilmari | however, because the first character of JSON must be < 127, you can tell by the pattern of nulls | ||
| RFC 7159 says «Implementations MUST NOT add a byte order mark to the beginning of a JSON text.» | 18:54 | |
| leont | UTF-16 has all the disadvantages of UCS-2 with all the disadvantages of UTF-8, and adds one of its own: it isn't binary sortable (even UTF-16BE) due to surrogate pairs. It's a mess really. | 18:56 | |
| ilmari | 00 00 00 xx: UTF-32BE, xx 00 00 00: UTF-32LE, 00 xx: UTF-16BE, xx 00: UTF-16LE, xx: UTF-8 | 18:58 | |
| hoelzro | 9^/win3 | 19:07 | |
| oops | |||
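Editor's note: ilmari's "pattern of nulls" table translates fairly directly into code. The sketch below is an illustration only; `sniff_json_encoding` is a hypothetical name. It assumes the input is a JSON text with no BOM whose first character is ASCII, as stated in the discussion.

```python
def sniff_json_encoding(data):
    """Guess the Unicode encoding of a BOM-less JSON text.

    Relies on the first character being ASCII, so the position of the
    zero bytes in the first few bytes reveals both the code unit width
    and the byte order. "xx" in the comments means a non-zero byte.
    """
    head = data[:4]
    # The 4-byte patterns must be checked first: "xx 00 00 00" would
    # otherwise also match the 2-byte UTF-16LE pattern "xx 00".
    if len(head) == 4:
        if head[0] == 0 and head[1] == 0 and head[2] == 0 and head[3] != 0:
            return "utf-32-be"   # 00 00 00 xx
        if head[0] != 0 and head[1] == 0 and head[2] == 0 and head[3] == 0:
            return "utf-32-le"   # xx 00 00 00
    if len(head) >= 2:
        if head[0] == 0 and head[1] != 0:
            return "utf-16-be"   # 00 xx
        if head[0] != 0 and head[1] == 0:
            return "utf-16-le"   # xx 00
    return "utf-8"               # xx
```

The returned names are valid Python codec names, so a caller could pass the result straight to `data.decode(...)`.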
|
19:29
tokuhiro_ joined
19:36
kjs_ joined
19:44
domidumont joined
19:45
vendethiel- joined
19:57
kjs_ joined
20:07
kjs_ joined
20:22
tokuhiro_ joined
20:27
lizmat joined
21:34
vendethiel joined
|
|||
| diakopter | here's a CORE.setting compilation profile output using XCode Instruments: imgur.com/5LJOuf7 | 21:43 | |
| in case anyone wants to find some low-hanging fruitzies | 21:45 | ||
| that's at a 40-microsecond sample rate | 21:48 | ||
| and sorted by Self (ms) if you're interested: i.imgur.com/uW1KBZ6.png | 21:54 | ||
| jnthn | Nice | 21:58 | |
| jnthn drops them in browser tabs for when he's not tired :) | |||
| Rest time for now... o/ | |||
| diakopter | o/ | 21:59 | |
|
22:02
Ven joined
22:24
tokuhiro_ joined
22:31
Ven_ joined
|
|||
| diakopter | in core setting compilation, MVM_sc_find_object_idx hits its cache 127855 times, but misses the cache 667817 times. Each time it misses the cache, it does a linear search through possibly thousands of objects to find the match.. | 23:28 | |
| 667817 linear searches is not good | |||
| timotimo | that seems like a good catch | 23:32 | |
| i'm definitely looking forward to when moar's jit builds a /tmp/perf-PID.map | 23:50 | ||
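Editor's note: the cost diakopter describes comes from falling back to a linear scan on every cache miss. One common way to avoid that, shown here purely as an illustration, is to keep a hash index from object identity to position alongside the list. `ObjectIndex` and its methods are hypothetical names; this is not how MVM_sc_find_object_idx is implemented, and the log does not say which fix (if any) was adopted.

```python
class ObjectIndex:
    """Hypothetical stand-in for a serialization context's object list."""

    def __init__(self):
        self.objects = []   # position -> object
        self._index = {}    # id(object) -> position

    def add(self, obj):
        # Record the position at insertion time so later lookups
        # never need to scan the list. Holding obj in self.objects
        # keeps it alive, so its id() stays unique.
        self._index[id(obj)] = len(self.objects)
        self.objects.append(obj)

    def find_linear(self, obj):
        """O(n) lookup: the kind of scan described in the discussion."""
        for i, candidate in enumerate(self.objects):
            if candidate is obj:
                return i
        raise KeyError("object not in this context")

    def find_indexed(self, obj):
        """O(1) expected lookup through the hash index."""
        return self._index[id(obj)]
```

The trade-off is extra memory for the index and the work of keeping it consistent whenever the list changes.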