github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
00:15 sena_kun joined 00:16 Altai-man_ left 01:23 lucasb left 02:13 Altai-man_ joined 02:16 sena_kun left 04:09 leont joined 04:14 sena_kun joined 04:15 sivoais_ left, sivoais joined 04:16 Altai-man_ left 06:13 Altai-man_ joined 06:16 sena_kun left 08:14 sena_kun joined 08:16 Altai-man_ left 09:13 MasterDuke joined 10:13 Altai-man_ joined 10:16 sena_kun left 12:14 sena_kun joined 12:17 Altai-man_ left 12:47 Ven`` joined 12:59 sena_kun1 joined 14:05 sena_kun1 left 14:14 Altai-man_ joined 14:16 sena_kun left 15:45 Voldenet left 15:57 Voldenet joined, Voldenet left, Voldenet joined 16:15 sena_kun joined 16:16 Altai-man_ left 16:26 zakharyas joined 17:18 Ven`` left 17:36 Ven`` joined 18:14 Altai-man_ joined 18:16 sena_kun left 20:15 sena_kun joined 20:17 Altai-man_ left 20:51 zakharyas left 21:03 Kaeipi joined, Kaiepi left
timotimo i wonder if the internal structure of NFA should be changed a bit 21:12
currently we have an array of arrays for the state infos
i.e. the "kind of edge", the target state, and the parameters, like what grapheme or some such 21:13
MasterDuke array of classes instead?
timotimo hm?
MasterDuke didn't jnthn say recently that accessing class attributes was even faster than arrays? 21:14
timotimo i'm talking about C code right now 21:15
21:16 leont left
timotimo my thought is to put all the MVMNFAStateInfo entries in one long array and instead of having an array with "how many edges does each state have" we'd have "at what index do this state's edges start?" in one array, and one last entry for where the total array ends 21:18
whenever we're "at" a state, we go through all the outgoing edges, checking what their "act" is to see how the check should work, then the "extra data" for whatever kind of check it is, and only when the check is successful do we use the "to" value 21:19
i think that the percentage of successful checks can be relatively small in general 21:20
and hopefully there's also cases where none of the checks succeed
so my thinking is, if the "to" values live in a separate array, there would be fewer accesses to that array, and going through the rest of the data could be faster
in other words, the stuff we always need to look at should be the only stuff that gets fetched always 21:21
and an array of arrays could very well become a little fragmented in memory 21:22
plus, for states that have only one or two outgoing edges anyway, the allocation overhead of malloc (or do we use the fixed size allocator already here?) could add up?
OK, we're already using the fixed size allocator 21:24
MasterDuke interesting 21:25
timotimo MVMNFAStateInfo is 24 bytes big already 21:27
moving the "to" out of it will drop that down to 16, which i imagine is a lovely tiny win 21:28
MasterDuke how many of them are there? 21:31
timotimo also, to is 64bit, but since it's an index into the states array, having 32bit could be enough
MasterDuke are there ever more than 65k states? 21:32
timotimo gist.github.com/timo/308debec42f59...796c5e545e - here's a little excerpt from the beginning of compiling the core setting 21:34
not all nfas are deserialized from the very beginning, of course
MasterDuke so maybe 16bit would be sufficient? 21:35
timotimo seems like we already reach the 20k every now and then
to be totally safe, unsigned 32bit integer should definitely be fine 21:36
21:39 Ven`` left
timotimo mhhh ... huh? NFA's deserialize function isn't being called? 21:39
updated the gist with what nfa_from_statelist creates 21:43
MasterDuke hm. yeah, 16 might be too small 21:53
22:04 Kaeipi left
timotimo at some point we could generate some mvm bytecode for extremely small nfas 22:04
i'm not sure if there would be any significant winnings, though 22:05
aaaaaanyway, if we have a big one like the 37k edges over 21.8k states, we'd have up to 21.8k calls to fixed_size_free to free it up, as well as the same amount of calls to fixed_size_alloc t create these arrays in the first place 22:08
if it all goes into one big flat array, that'd just be one call, that i guess often hits malloc directly because it's too big for even the biggest fsa bins 22:09
MasterDuke tradeoffs 22:10
timotimo not sure there's a downside here :)
one big malloc is better than a hundred small fsa's
MasterDuke yeah, probably 22:11
22:15 Altai-man_ joined 22:16 sena_kun left
timotimo some of the arguments are just an MVMGrapheme32, other times it's a MVMString *, an MVMint64, or two MVMGrapheme32 22:17
i wonder how often it's the 32bit one, because that'd be a waste of 32bits every time
gist.github.com/timo/308debec42f59...796c5e545e 22:22
at the expense of one number per state, a "buffer" of mixed 64bit and 32bit values could be indexed, and then traversed one-by-one, since we're going through all edges anyway; there's a binary search optimization in there, though, which may make this a bit mre difficult 22:23
22:33 Kaiepi joined
timotimo oh, hmm, doing the binary search thing adds edges in the middle of the nfa, though 22:49
so i'd have to memmove a bunch of stuff ...
Geth MoarVM/nfa_flat_statelist: 1f182b48cf | (Timo Paulssen)++ | Configure.pl
new Configure.pl flag: --dtrace

When this is turned on, we can put DTrace points in different places in the code.
These DTrace trace points show up as just a single nop in the resulting binary, but an additional section in the ELF lists the available trace points along with how to get at whatever is in the arguments.
22:53
MoarVM/nfa_flat_statelist: 60edb07c81 | (Timo Paulssen)++ | 2 files
WIP on getting nfa to use a flat state array

instead of having one array per state that has arrays for each state's outgoing edges in it, we store the index into the array where the first edge lives, and use the next number in that array to calculate the amount. ... (7 more lines)
timotimo oh, that other commit belongs to master actually 22:54
Geth MoarVM: 1f182b48cf | (Timo Paulssen)++ | Configure.pl
new Configure.pl flag: --dtrace

When this is turned on, we can put DTrace points in different places in the code.
These DTrace trace points show up as just a single nop in the resulting binary, but an additional section in the ELF lists the available trace points along with how to get at whatever is in the arguments.
23:30