patrickb I think I have an idea how to still make it work. I'll postpone the validation of whether the checksum is ok till the first breakpoint hit. 00:02
Downside: this won't work for files whose source I don't have locally. How would a user be expected to usefully set a breakpoint without the source on display? 00:03
timo we'll want to offer display of bytecode or something at some point 00:08
if someone is interested enough to build not just a disassembler (which is easy enough) but an actual decompiler, that could be helpful too 00:09
patrickb Can we possibly return a context handle (of the unit context) in the file loaded notification? 05:54
Ah, no, we obviously can't. If there is no frame with that context yet, then the context doesn't exist yet either... 06:06
Phew. I'm out of ideas...
Can we somehow make the debugserver break on all lines in a given file? Then we could do the business in the first breakpoint hit. 06:18
In the new file notification, is there a line number set? If not, we'd be able to differentiate. 06:19
I've had a quick look. It seems the line number is 1 for that notify_new_file suspend. :-( 12:22
lizmat looking at the number of levenshtein implementations that we have in core, I wonder why we don't have that as a MoarVM op 16:07
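For reference, the Levenshtein distance being discussed is the classic dynamic-programming edit distance. A minimal sketch (Python used for illustration only; the core implementations are in Raku/NQP, and a MoarVM version would be an op or syscall):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    # prev[j] holds the distance between a[:i-1] and b[:j] from the previous row
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # prints 3
```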
21:08 librasteve_ joined
[Coke] B+1 21:36
er, +1
Voldenet Does it have to be levenshtein distance? I feel like something like `top_n_suggestions` could be more generalized 21:47
and wouldn't put any specific constraints on numeric ranges used for string distance (e.g. sift4 could be used) 21:49
timo could really be anything 21:50
i think we have some additional scoring stuff in there already compared to traditional levenshtein based on like case difference for example 21:51
like, foo_bar vs fooBar vs foo-bar vs FOOBAR vs FOO_BAR could all be considered a low distance from each other compared to changing a letter to a different letter 21:52
if we want to go fancy with it, we could also consider foo_bar vs bar_foo a smaller difference than it normally would be with just the regular algorithm
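One way to sketch timo's case/separator-insensitive idea (Python for illustration; the splitting rule and scoring are assumptions, not what rakudo actually does): split identifiers into lowercase tokens at `_`, `-` and camelCase boundaries, then treat "same tokens, same order" and "same tokens, different order" as very small distances before falling back to counting token differences.

```python
import re
from collections import Counter

def tokens(ident: str) -> list[str]:
    # split at _ and -, and at lowercase-to-uppercase camelCase boundaries
    parts = re.split(r"[_-]|(?<=[a-z0-9])(?=[A-Z])", ident)
    return [p.lower() for p in parts if p]

def token_distance(a: str, b: str) -> int:
    ta, tb = tokens(a), tokens(b)
    if ta == tb:
        return 0  # only case/separator differences: foo_bar vs fooBar vs FOO-BAR
    if sorted(ta) == sorted(tb):
        return 1  # same tokens, reordered: foo_bar vs bar_foo
    # fall back: 1 + size of the symmetric difference of the token multisets
    ca, cb = Counter(ta), Counter(tb)
    return 1 + sum(((ca - cb) + (cb - ca)).values())

print(token_distance("foo_bar", "fooBar"))   # prints 0
print(token_distance("foo_bar", "bar_foo"))  # prints 1
```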
Voldenet that makes sense – it's not a typo that normal edit distance would get, but tokenized distance would 21:55
so every identifier would need to be converted into individual tokens, then per-token matching would have to be done, then combinations of these top matches could result in scores 21:57
> identifiers = 'foo_bar' -> <foo bar>; scoring = create_scoring(); for identifiers.tokens -> token { score_add(scoring, 'bar', token) }; score_finalize(scoring) 22:05
pseudocode of what I mean
heh, I feel like I've skipped a step where 'bar' gets extracted from initial identifier 22:06
> identifiers = tokenize('foo_bar') -> <foo bar>; scoring = create_scoring(); for tokenize('bar-foo') -> input_token { for identifiers.tokens -> token { score_add(scoring, input_token, token) }}; score_finalize(scoring) 22:07
I feel like tokenizing should be rakudo-implemented and scoring should be a bunch of ops 22:09
so the scoring object really is useless in any context outside of the ops, but scorings of various tokens in the same instance can be compared 22:12
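Voldenet's pseudocode above could look roughly like this as a runnable sketch (Python for illustration; `tokenize`, `Scoring.add` and `finalize` are all hypothetical names modeling the proposed `score_add`/`score_finalize` ops, and the scoring rule here is just exact-token hit rate):

```python
from itertools import product

def tokenize(ident: str) -> list[str]:
    # hypothetical tokenizer: split on common identifier separators
    return [t for t in ident.replace("-", "_").lower().split("_") if t]

class Scoring:
    """Accumulates per-token comparisons; finalize() yields an overall score."""
    def __init__(self):
        self.hits = 0
        self.total = 0

    def add(self, input_token: str, candidate_token: str):
        self.total += 1
        if input_token == candidate_token:
            self.hits += 1

    def finalize(self) -> float:
        return self.hits / self.total if self.total else 0.0

# mirrors: for tokenize('bar-foo') -> input_token { for identifiers.tokens -> token { ... } }
scoring = Scoring()
for input_token, token in product(tokenize("bar-foo"), tokenize("foo_bar")):
    scoring.add(input_token, token)
print(scoring.finalize())  # prints 0.5
```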
timo why wouldn't we just have a single syscall that takes a target string, a list of candidates, and something to put results into 22:17
Voldenet well, the tokenization would require pre-building a table larger than the string (since it'd need N \0-terminated strings), but it depends on the use case 22:19
the initial idea I had would just take a string, a list of candidates, and the number of results requested 22:20
but then it can't be pre-optimized in any way for distance-matching
so, input and targets would need to get tokenized on every attempt 22:22
maybe `my $ctx := matching_table_new('foo_bar'); matching_table_add($ctx, 'another_identifier'); my $iterator := matching_table_top($ctx); while $iterator { my $item = matching_table_pop($iterator); say $item }` 22:27
low-level, doesn't make any assumptions about datatypes; the iterator can hold any data it needs to get the N top results 22:29
timo how often do we expect to re-do distance scoring in a given process's lifetime? 22:31
Voldenet typically typo-suggestion emits the best guess and everything exits, but practically, every scope would need its matching table 22:36
perhaps linked to another one 22:37
in case user catches the error, maybe I'm overthinking this design 22:39
I'm looking at the github.com/rakudo/rakudo/blob/74da...umod#L1405 22:40
and perhaps suggestions should also have priority
timo ideally if the user catches the error we don't generate any suggestions at all unless they are explicitly accessed, for example by printing the exception out 22:44
Voldenet `my $ctx := matching_table_new(3); matching_table_add($ctx, 'foo_bar', 0); matching_table_add($ctx, 'another_identifier', 0); my $iterator := matching_table_top('some_typo', $ctx); while $iterator { my $item = matching_table_pop($iterator); say $item }` 22:49
in current case $ctx could be equivalent to `my @candidates := [[], [], []];`
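The `matching_table_*` API above could be sketched like this (Python for illustration; all function names are hypothetical, and `difflib.SequenceMatcher` stands in for whatever string-distance metric the ops would actually use, with the integer argument as a priority tie-breaker):

```python
import heapq
from difflib import SequenceMatcher

def matching_table_new(n: int) -> dict:
    # n = number of results requested; candidates are collected before matching
    return {"n": n, "candidates": []}

def matching_table_add(ctx: dict, ident: str, priority: int = 0):
    ctx["candidates"].append((ident, priority))

def matching_table_top(typo: str, ctx: dict) -> list[str]:
    # score every candidate against the typo; higher ratio = more similar,
    # priority breaks ties between equally similar candidates
    def score(item):
        ident, priority = item
        return (SequenceMatcher(None, typo, ident).ratio(), priority)
    best = heapq.nlargest(ctx["n"], ctx["candidates"], key=score)
    return [ident for ident, _ in best]

ctx = matching_table_new(3)
matching_table_add(ctx, "foo_bar")
matching_table_add(ctx, "another_identifier")
for item in matching_table_top("some_typo", ctx):
    print(item)
```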
I like the idea of having similarity matching inside the core, because it can be used outside of the core code 22:50
timo we do have StrDistance for use in the tr operator i think 23:12
which is something else, but at least tangentially related
23:18 librasteve_ left