github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
timotimo i wonder if the internal structure of NFA should be changed a bit 21:12
currently we have an array of arrays for the state infos
i.e. the "kind of edge", the target state, and the parameters, like what grapheme or some such 21:13
MasterDuke array of classes instead?
timotimo hm?
MasterDuke didn't jnthn say recently that accessing class attributes was even faster than arrays? 21:14
timotimo i'm talking about C code right now 21:15
timotimo my thought is to put all the MVMNFAStateInfo entries in one long array and instead of having an array with "how many edges does each state have" we'd have "at what index do this state's edges start?" in one array, and one last entry for where the total array ends 21:18
whenever we're "at" a state, we go through all the outgoing edges, checking what their "act" is to see how the check should work, then the "extra data" for whatever kind of check it is, and only when the check is successful do we use the "to" value 21:19
i think that the percentage of successful checks can be relatively small in general 21:20
and hopefully there's also cases where none of the checks succeed
so my thinking is, if the "to" values live in a separate array, there would be fewer accesses to that array, and going through the rest of the data could be faster
in other words, the stuff we always need to look at should be the only stuff that gets fetched always 21:21
and an array of arrays could very well become a little fragmented in memory 21:22
plus, for states that have only one or two outgoing edges anyway, the allocation overhead of malloc (or do we use the fixed size allocator already here?) could add up?
OK, we're already using the fixed size allocator 21:24
MasterDuke interesting 21:25
timotimo MVMNFAStateInfo is 24 bytes big already 21:27
moving the "to" out of it will drop that down to 16, which i imagine is a lovely tiny win 21:28
MasterDuke how many of them are there? 21:31
timotimo also, to is 64bit, but since it's an index into the states array, having 32bit could be enough
MasterDuke are there ever more than 65k states? 21:32
timotimo gist.github.com/timo/308debec42f59...796c5e545e - here's a little excerpt from the beginning of compiling the core setting 21:34
not all nfas are deserialized from the very beginning, of course
MasterDuke so maybe 16bit would be sufficient? 21:35
timotimo seems like we already reach the 20k every now and then
to be totally safe, unsigned 32bit integer should definitely be fine 21:36
timotimo mhhh ... huh? NFA's deserialize function isn't being called? 21:39
updated the gist with what nfa_from_statelist creates 21:43
MasterDuke hm. yeah, 16 might be too small 21:53
timotimo at some point we could generate some mvm bytecode for extremely small nfas 22:04
i'm not sure if there would be any significant winnings, though 22:05
aaaaaanyway, if we have a big one like the 37k edges over 21.8k states, we'd have up to 21.8k calls to fixed_size_free to free it up, as well as the same amount of calls to fixed_size_alloc t create these arrays in the first place 22:08
if it all goes into one big flat array, that'd just be one call, that i guess often hits malloc directly because it's too big for even the biggest fsa bins 22:09
MasterDuke tradeoffs 22:10
timotimo not sure there's a downside here :)
one big malloc is better than a hundred small fsa's
MasterDuke yeah, probably 22:11
timotimo some of the arguments are just an MVMGrapheme32, other times it's a MVMString *, an MVMint64, or two MVMGrapheme32 22:17
i wonder how often it's the 32bit one, because that'd be a waste of 32bits every time
gist.github.com/timo/308debec42f59...796c5e545e 22:22
at the expense of one number per state, a "buffer" of mixed 64bit and 32bit values could be indexed, and then traversed one-by-one, since we're going through all edges anyway; there's a binary search optimization in there, though, which may make this a bit mre difficult 22:23
timotimo oh, hmm, doing the binary search thing adds edges in the middle of the nfa, though 22:49
so i'd have to memmove a bunch of stuff ...
Geth MoarVM/nfa_flat_statelist: 1f182b48cf | (Timo Paulssen)++ | Configure.pl
new Configure.pl flag: --dtrace

When this is turned on, we can put DTrace points in different places in the code.
These DTrace trace points show up as just a single nop in the resulting binary, but an additional section in the ELF lists the available trace points along with how to get at whatever is in the arguments.
22:53
MoarVM/nfa_flat_statelist: 60edb07c81 | (Timo Paulssen)++ | 2 files
WIP on getting nfa to use a flat state array

instead of having one array per state that has arrays for each state's outgoing edges in it, we store the index into the array where the first edge lives, and use the next number in that array to calculate the amount. ... (7 more lines)
timotimo oh, that other commit belongs to master actually 22:54
Geth MoarVM: 1f182b48cf | (Timo Paulssen)++ | Configure.pl
new Configure.pl flag: --dtrace

When this is turned on, we can put DTrace points in different places in the code.
These DTrace trace points show up as just a single nop in the resulting binary, but an additional section in the ELF lists the available trace points along with how to get at whatever is in the arguments.
23:30