Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021. |
01:39
MasterDuke joined
| MasterDuke | timo, et al.: two thoughts: 1) have you seen iev.ee/blog/the-quadratic-problem-nobody-fixed/ ? i don't know enough about the regex engine in nqp, but maybe we can steal something from them. 2) unrelated, but if we reorder MVMStringBody (github.com/MoarVM/MoarVM/blob/main...#L40-L56), and put storage_type | 01:43 | |
| before the storage union, could we include num_strands in the in_situ members to make them a little bigger? | |||
| tellable6 | 2026-03-08T23:55:17Z #moarvm <lizmat> MasterDuke: I can't get rakudo to build anymore after the NQP bump | ||
02:06
MasterDuke left
| patrickb | Sanity check: The debugserver does not seem to null out its struct during initialization, but also doesn't initialize every field (e.g. debugspam_protocol isn't nulled when not requested). | 10:03 | |
| Is this somehow relying on calloc nulling out requested memory? | |||
11:20
librasteve_ joined
| lizmat | timo might know | 14:19 | |
| timo | for a quick first approximation, you can use malloc instead of calloc and run a representative workload under valgrind (or with asan I would assume) which can tell you if anything depends on uninitialised memory | 16:46 | |
| [Coke] | WHOOP. is that "can't get rakudo to build" message still valid? Release in 3 days | 16:51 | |
| lizmat | no, it isn't | ||
| there's some issue *if* you are rebuilding from an existing checkout that did builds before | |||
| timo | MasterDuke: I've thought in the past about expanding the in-situ bit by including num_strands as well. not sure why I never actually tried implementing it | 16:52 | |
| lizmat | [Coke]: you need to pull, build (fails), pull again, then it builds (in my recollection) | ||
| [Coke] | oh, that one. *whew* | 16:54 | |
| lizmat | yeah | 16:55 | |
| timo | on the other question, I would have to spend some time investigating how well we do in the case of getting "all matches" instead of just "the first match", though in order for "ok give me the next match" to correctly work we already have to be smart and keep backtracking information around and whatnot. there's possibly stuff we can do to optimize for cases where we know upfront that the user wants | 16:56 | |
| not just the next match, but every match? | |||
| lizmat | does that imply that a cursor object has already done the whole source, and it's just producing values from an internal array? | 16:57 | |
| timo | I'm thinking of introducing a tiny "sanity check" .moarvm file in moarvm's repo as a test case to run after compiling but before installing so that we catch issues like "the jit literally explodes on the first frame it tries to run" before we go on to NQP, which should give better diagnostics in such cases (i.e. moar doesn't install in the first place, so you don't waste time looking for a problem | ||
| with nqp) | |||
| lizmat: not exactly "the whole source", just "until the first successful match" and when you ask for the next one it's effectively the same behaviour as if an assertion had given "failed match" after the last step | 16:58 | ||
| if you match "abcdaaaaaaabcd" against / a+? / you'll get the lonely "a" first, so the cursor hasn't looked further than the second character, but for the second a and then aa aaa aaaa etc the cursor would have of course seen a bit more of the string | 16:59 | ||
| lizmat | ah, ok... understood | ||
| timo | if we know up front that you want every possible match, "a+?" could immediately take the stretch of a's and generate all combinations of position + length that sit inside that stretch | 17:00 | |
| the question is, how hard is it to do the same kind of optimization when the regex is more complicated than just a single quantifier | 17:01 | ||
| if the regex is /a+? b/ then there's no reason to generate any results for the a stretch that don't end at the "b", so the naive approach would immediately be very wasteful | |||
| lizmat | ack | 17:02 | |
| timo | the regex optimizer (static, part of perl6/optimizer or raku/optimizer) could look for patterns where something like that is useful, which of course is a little cost for every regex and a bit of waste every time it doesn't help, which we would want to weigh against the benefits etc etc etc :) | 17:04 | |
| but also, i haven't read the article yet, it may be about something else than i've been thinking of entirely | |||
| someone i follow on the fediverse is quite vocal about how good the approach of hyperscan (presumably also vectorscan?) is for how an application would use it, it's kind of callback based AIUI | 17:07 | ||
| the way action methods and code assertions / embedded curly blocks inside raku rules work are very slightly similar | |||
| but also, we don't promise linear time matching in the first place. that doesn't mean the optimizations done in RE# can't apply for us though | 17:09 | ||
| the first example given is matching / .* a || b / (this is the raku equivalent) against a string of bs and getting all results; 16k bs in a row take 5.4s but 32k bs in a row takes 21.7s, so that's a quadrupling of time for doubling of input, i.e. almost exactly quadratic | 17:18 | ||
| using | instead of || funny enough is almost 2x faster, but also grows quadratically (5.9s for 32k, 23.4s for 64k) | 17:20 | ||
| in any case, one of the bits in the beginning of the article is that a user-supplied pattern can DOS your application by easily giving you exponential run time, and even regex engines that promise linear time matches get quadratic if you go from "give me the first match" to "give me all matches" when they don't necessarily need to | 17:25 | ||
| taking a raku regex from a possibly adversarial user is already immediately obviously a really really bad idea, and we never ever tell people that's a fine thing to want to do | 17:26 | ||
| among other things that's because raku regexes don't pretend not to be code | |||
| lizmat | well, if a regex is received as a string, the only way to work with it is to interpolate it as a string, no? | 17:29 | |
| so at least the code bit would be disarmed? or not? am I missing something? | |||
| timo | so treating the regex as if it were just a piece of text? | 17:30 | |
| lizmat | hmm... yeah, you're right, that wouldn't make sense if you ask people to enter a regex | 17:32 | |
| timo | yep | ||
| you could have a whitelist approach for allowing a very limited range of regexes as input | |||
| remember when I was musing about the case where possibly-zero-length matches were being quantified, leading to an infinite loop? I think one of the things that came up was checking at run time if we're in a pathological case right now and aborting | 17:34 | ||
| lizmat | actually, I have been working on just that in RakuAST at CHECK time | 17:35 | |
| timo | i feel like it shouldn't be so hard to generate regex code in a way that can call back into user code regularly to let the user decide if aborting is a good idea | ||
| ah? something like a way to go from a string with a regex source code in it to a RakuAST of the regex without the danger of code from inside the input being executed? | 17:36 | ||
| lizmat | my initial goal was simpler, actually, but related | ||
| / :i / e.g. will match anything | 17:37 | ||
| timo | isn't raku meant to reject null regexes? | 17:38 | |
| lizmat | the modifier disables the empty regex compile time check | ||
| "it's not empty if it has a modifier" :-) | |||
| timo | i always thought of that as a trap-door if what you really really want is a regex that just immediately returns "matched" | ||
| i have been hoping that I could look into generating quick-rejection bits for regexes in an optimizer pass or something ... I've been hoping to do that for like, a decade now it almost feels like | 17:40 | ||
| there's many cool ideas in my head that probably only ever work in extremely synthetic situations | 17:42 | ||
| lizmat | please put those ideas in a gist somewhere | ||
| timo | oof. | ||
| lizmat | hehe... simple question :-) | 17:43 | |
| timo | like, if your regex looks something like / (blabla complicated stuff) "foobar" / then instead of going and doing the complicated stuff you could first look if there's literally any "foobar" in the target string and if not, you can immediately say "no match" | 17:44 | |
| if the "blabla complicated stuff" can be determined to have a specific minimum and maximum length that's maybe not too far apart, you can go from a match of "foobar" backwards the maximum match length and start searching there, and when you're closer than minimum match length away from the foobar you saw, you can skip past the "foobar" | 17:45 | ||
| in theory, if we've already been spending some amount of time in a regex match attempt against a string, we could kick off additional work to analyse the string in question and when the results are in, switch strategies and use results from the analysis | 17:47 | ||
| simplest idea, if you have a histogram of all characters in the string, you may find that some parts of the entire regex you're working with are literally unable to ever match | |||
| but making use of that kind of information also has a run time cost of course | 17:48 | ||
| lizmat | so, I assume we have a regex opt to look for a static string inside a larger string | 17:49 | |
| timo | we have an opt that turns "scan" followed by a literal into using the "index" op, yeah, that's fast string-in-string search | 17:50 | |
| but as soon as it's not a literal after a scan, that opt is out of the window :) | |||
| if it's more than one literal, also tough luck IIRC | |||
| lizmat | why? the opt would consist of failing the match if the string isn't there | 17:51 | |
| ? | |||
| timo | so / .* "hello" / → fast | ||
| / .* a? "hello" / → not as fast | 17:52 | ||
| / .* ["hello" | "goodbye"] / → not as fast | |||
| and it doesn't have to be literally .* in the regex, since regexes are by default not anchored | |||
| lizmat | ok, I think I'm going to frobnicate a bit on that :-) | 17:53 | |
| timo | i've been interested in getting a multi-needle haystack search op into moarvm | 17:55 | |
| lizmat | would that be multi-threaded? or [foo|bar|baz] checking "f" and "b" at a cursor position, then on "f" move to check on "o", and on "b" move to check for "a", etc ? | 18:00 | |
| and all of that wrapped in an NFA ? | |||
| timo | there's faster approaches to multi-needle haystack search than NFA | 18:01 | |
| but no, it wouldn't be multithreaded, i don't think you need that | 18:02 | ||
| there's an algorithm for needle-haystack-search called teddy that is also mentioned by name in the article MasterDuke linked | 18:37 | ||
| some cool SSE-related tricks in there to search a whole bunch of bytes at once | 18:38 | ||
| lizmat | which can be expressed in C-code without actually having to write machine code ? | 18:42 | |
| timo | quite possibly wouldn't be trivial to apply to our 32-bit-per-character storage type, though | 18:53 | |
| you can write assembly intrinsics in C as if they were function calls so you don't have to write inline assembly | |||
| I have no idea if you can write the algorithm with regular boring C and get the compiler to give you the optimal instructions using SSE | 18:54 | ||
21:14
harrow left
21:25
[Coke] left
| Voldenet | you can attempt to use auto-vectorization, but in practice it's very brittle and you end up writing code that's even worse to read | 21:37 | |
| iirc in both gcc and clang you get loop + slp parallelization with -O2 | 21:39 | ||
| in gcc SLP may work better in O3 thanks to aggressive loop unrolling | 21:40 | ||
21:41
harrow joined
| Voldenet | but O3 may produce slower code overall (also thanks to aggressive loop unrolling :>) | 21:41 | |
| so imo it's better to use intrinsics explicitly if that's performance-critical | 21:43 | ||
| doesn't depend on compiler optimizations at all | 21:44 | ||
23:50
librasteve_ left