github.com/moarvm/moarvm | IRC logs at irclog.perlgeek.de/moarvm/today
Set by moderator on 15 May 2018.
00:17 lizmat joined 01:57 ilbot3 joined
03:32 AlexDaniel joined 06:07 domidumont joined 06:13 domidumont joined 06:49 robertle joined 06:50 Ven`` joined 06:52 AlexDaniel joined, shareable6 joined 06:57 AlexDani` joined 07:26 AlexDani` joined 07:27 Ven`` joined 07:28 Ven`` joined 09:11 squashable6 joined 09:28 shareable6 joined 10:22 lizmat joined 10:34 Ven`` joined 10:42 domidumont joined 11:12 Ven`` joined 12:17 leedo joined 12:26 Ven`` joined 12:33 Ven`` joined 12:43 domidumont joined 13:03 domidumont joined 13:10 shareable6 joined 13:12 robertle joined 14:12 domidumont joined 14:31 Ven`` joined 16:13 robertle joined 17:32 domidumont joined 21:21 MasterDuke joined
MasterDuke jnthn, samcv, timotimo: you might find this interesting lemire.me/blog/2018/05/16/validati...-per-byte/ 21:21
timotimo i just saw it, too
if we didn't have to load our incoming utf8 strings into arrays with larger entries, that'd apply 21:22
we do, however, have an operation that checks if a string would fit into 8-bit entries
but it runs over 32-bit integers rather than over 8-bit integers
we did put a bit of work into vectorizing these functions recently, maybe the techniques used here also apply to our code 21:23
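A minimal sketch of the kind of check timotimo describes, scanning 32-bit entries to see whether they would all fit into 8-bit storage. The function name `fits_in_8bit`, the unsigned element type, and the 0..0xFF range are assumptions made for illustration; MoarVM's actual function, types, and accepted range may differ. The loop has no early exit, which tends to make it easy for a compiler to auto-vectorize.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not MoarVM's actual function.
 * Returns 1 if every 32-bit entry is in 0..0xFF, else 0. */
static int fits_in_8bit(const uint32_t *entries, size_t count) {
    assert(entries != NULL || count == 0);
    uint32_t seen = 0;
    /* OR all entries together; any bit set above bit 7 in any
     * entry survives into 'seen'. */
    for (size_t i = 0; i < count; i++)
        seen |= entries[i];
    return (seen & ~UINT32_C(0xFF)) == 0;
}
```

Accumulating with OR and testing once at the end trades an early exit for a branch-free loop body, which is one common way to get such a scan vectorized.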
MasterDuke well, we are using the FSM he benchmarks against 21:24
github.com/MoarVM/MoarVM/blob/mast...8.c#L3-L63 21:25
timotimo ah 21:26
if we don't do streaming decoding, we can certainly quick-check the whole string up front with the new thing
not sure how applicable it is to streaming decoding; if the string comes in at like 16 bytes at a time, it will hardly make a difference 21:27
samcv well, we vectorized the code that checks if it fits in 8 bits
timotimo but if it were, say, 4k bytes at a time, that'd be helpful there, too
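To illustrate why the chunk size matters for a quick up-front check, here is a small sketch of a word-at-a-time scan. It is not the SIMD validator from the linked post and not MoarVM code; it only tests whether a chunk is pure ASCII, and the name `chunk_is_ascii` is hypothetical. With a 16-byte chunk the wide loop runs only twice, so per-call overhead dominates; with a 4k chunk it runs 512 times and the wide path pays off.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if every byte in buf[0..len) is ASCII
 * (high bit clear), else 0. */
static int chunk_is_ascii(const unsigned char *buf, size_t len) {
    size_t i = 0;
    /* Wide loop: load 8 bytes at once and test all high bits together.
     * memcpy avoids alignment and aliasing problems. */
    for (; i + 8 <= len; i += 8) {
        uint64_t word;
        memcpy(&word, buf + i, sizeof word);
        if (word & UINT64_C(0x8080808080808080))
            return 0;
    }
    /* Tail: the remaining 0..7 bytes, one at a time. */
    for (; i < len; i++)
        if (buf[i] & 0x80)
            return 0;
    return 1;
}
```

A chunk that passes this check needs no further UTF-8 validation; a chunk that fails it would still have to go through a full decoder, such as the FSM mentioned above.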
22:36 Kaiepi joined 22:45 AlexDaniel joined