Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021.
01:43 MasterDuke joined
06:47 sena_kun joined
07:01 sena_kun left
07:07 sena_kun joined
07:30 sena_kun left
08:32 harrow left
08:55 harrow joined
lizmat | timo: I think the difference between .match(/foo/) and /foo/ is that in the .match case, the $/ is looked up by the .match method, | 09:11 | |
whereas with /foo/ it is passed as a raw argument to Regex.clone | |||
timo | somewhere in there something must be trying to $/.Bool or something | 09:12 | |
lizmat | I wouldn't be surprised if the $/ in that case gets unhinged and writing to it will just randomly write into some memory location | ||
I will try this theory by adding another method to Regex, not pass $/ but have it looked up like with .match, and adapt codegenning that way | 09:13 | ||
timo | well, it shouldn't be just some memory location, but if the Scalar gets shared around, maybe the type between one getlex and another changes and we wrongfully assume in speshed code that the type is known and use some low-level accesses on the value?? | ||
we should have enough guards for that though | |||
lizmat | so it basically codegens: if / lizmat /.clone(:topic($_), :slash($/)); | 09:16 | |
and running it like that, also fails | |||
timo | hmm. all of that is strange | 09:22 | |
does the result change with spesh turned off? | |||
lizmat | MVM_SPESH_DISABLE=1 right? | 09:23 | |
in that case, no | 09:24 | ||
timo | that's the right var yeah | 09:27 | |
you got different errors from me right? actual segfaults and panics and such? because with your reproduction example i only got the dispatcher errors and wrong total counts | |||
lizmat | no segfaults or panics in my sample code, no | 09:29 | |
an occasional hang | |||
ok, I just tried the different approach with $/ and $_ being looked up, and that also fails | 09:30 | ||
I just realized that another difference between / foo / and .match(/foo/) is that in the former case the match is being done because the Regex object is being sunk | 09:31 | ||
nvm... even with an explicit .Bool it crashes | 09:38 | ||
timo: I've golfed the code down to 30 lines in gist.github.com/lizmat/d0f1eb60e77...91ba670daa | 09:47 | ||
changing the use of ParaQueue to Channel, also produces the same errors | 09:53 | ||
so I'd say it's not related to ParaQueue | |||
managed to get the golf down to 14 lines | 10:22 | ||
what appears to be important, is the outer .cue | 10:23 | ||
also, just got this error: | |||
Invocant of method 'Bool' must be a type object of type 'Mu', not an | |||
object instance of type 'Match'. Did you forget a 'multi'? | |||
which would indicate indeed a dispatch issue | |||
what also appears to be important, is that the code *should* have some matches | 10:24 | ||
changing the pattern to something that isn't found, and all is well | 10:25 | ||
so a recap: | |||
timo | yeah i imagine the error "must be a ..." means the dispatch code first decided with the given arguments it must be one particular candidate, then tries to call the candidate, and dies because the arguments changed in between | ||
lizmat | the problem shows itself when: | 10:26 | |
- calling Bool on a Regex | |||
- inside cued code inside cued code | |||
- there must be an occasional match | 10:27 | ||
- adding a "my $/" in either the for scope, or the inner .cue scope, evades the issue | 10:28 | ||
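The recap above can be sketched as a hypothetical reproduction (names, counts, and data invented here for illustration; the real golf lives in lizmat's gist):

```raku
# Hypothetical reproduction shape (not the actual gist): a block that
# boolifies a bare / foo / is cued many times, so every cued run shares
# the single $/ container of the enclosing scope. Adding a "my $/" in
# the for scope (or the cued block) gives each run its own container
# and makes the problem go away.
my atomicint $matches = 0;
for <foo bar foo baz> xx 50 -> @batch {
    for @batch -> $_ {
        $*SCHEDULER.cue: {
            $matches⚛++ if / foo /;   # Regex.Bool writes into the shared $/
        }
    }
}
sleep 1;
say $matches;   # occasionally wrong, or dies with a dispatch error
```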
further datapoint: this appears only on .Bool, not on .defined | 10:53 | ||
aha! Regex.Bool turns out to be interesting reading | 10:55 | ||
timo | it checks $/? | 10:58 | |
lizmat | ($!slash = topic.match(self)).Bool | ||
updated gist again as to not need the REA file | 11:00 | ||
timo | oh, whoops, it modifies an attribute of the Regex object? but we clone that first, right? | 11:01 | |
lizmat | yes | ||
/ foo / codegens as /foo/.clone($_, $/) | |||
timo | oh, does it take the $/ as a Scalar and binds it, and the $/ Scalar is what's shared? | 11:02 | |
so the clone isn't helping for this particular case | |||
lizmat | Regex.clone sets $!topic and $!slash in the clone of the Regex object | 11:03 | |
raw, so basically its containers | 11:04 | ||
but every scope should get its own, fresh $/ should it not? | 11:05 | ||
hmm. intriguing! | 11:07 | ||
m: my $/; { "foo" ~~ / foo / }; say $/ | |||
camelia | Potential difficulties: Redeclaration of symbol '$/'. at <tmp>:1 ------> my $/⏏; { "foo" ~~ / foo / }; say $/ 「foo」 |
lizmat | m: { my $/; { "foo" ~~ / foo / }; say $/ } | ||
camelia | 「foo」 | ||
lizmat | so the inner / / is setting the $/ in the outer scope | ||
and apparently the compunit has a hard definition of $/ that can be "found" | 11:08 | ||
and it won't be found if the code is inside a .cue { } block | |||
m: { "foo" ~~ / foo / }; say $/ | |||
camelia | 「foo」 | ||
lizmat | m: { my $/; { "foo" ~~ / foo / } }; say $/ | 11:09 | |
camelia | Nil | ||
timo | oh, i seem to recall something about taking the outer's $/, or was that $_? | 11:10 | |
lizmat | $_ and $/ both, actually | ||
timo | i guess it does that for every scope that doesn't get its own $/, plus some optimization that leaves it out when it's proven to not be needed? | 11:16 | |
lizmat | adding a "my $/" to ThreadPoolScheduler!run-one doesn't make the problem go away | 11:23 | |
timo | i think it's probably more about lexical scopes than dynamic scopes? | ||
so it would be a change near where the regex lives or is cloned or executed that determines it? | |||
lizmat | well, since it's only Regex.Bool that's causing the issues, and not Regex.defined, I'd say there's something in Regex.Bool doing it | 11:24 | |
aha! | 11:25 | ||
m: $_ = "foo"; with / foo / { .say } | |||
camelia | / foo / | ||
lizmat | Regex.Bool is special! | ||
m: $_ = "foo"; if / foo / { .say; say $/ } | 11:26 | ||
camelia | foo 「foo」 |
lizmat | m: $_ = "foo"; with / foo / { .say; say $/ } | ||
camelia | / foo / Nil |
11:32 MasterDuke left
timo | makes sense that .Bool does more work than .defined | 11:32 | |
Regex.Bool actually runs the regex against $_ or something, doesn't it? | |||
lizmat | ($!slash = topic.match(self)).Bool | 11:33 | |
where: my \topic = $!topic | |||
ok, it turns out the outer .cue is not needed to cause the issue | 11:37 | ||
golf now 11 lines | 11:38 | ||
timo | so we're getting the $/ from the for loop that's the same scalar every go-around so everything cued on the scheduler tries to share one $/? | 11:39 | |
because the block that we're cue-ing doesn't generate its own $/ and therefore just binds the parent scope's $/? | 11:40 | ||
we should be able to put a $/.VAR.WHERE or $/.VAR.WHICH in a few places to see if we're looking at the same, or at different $/ scalars | 11:48 | ||
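A concrete form of that probe, as a hypothetical sketch (`note` rather than `say` so the output from other threads reaches stderr unbuffered):

```raku
# Sketch of timo's probe: print the address of the Scalar container
# behind $/ from each cued run. Identical numbers on every line mean
# all the cued blocks are bound to one shared container.
for ^3 {
    $*SCHEDULER.cue: { note '$/ container at ', $/.VAR.WHERE }
}
sleep 0.5;
```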
lizmat | will do that | 11:49 | |
meanwhile: | |||
looking at: ($!slash = $!topic.match(self)).Bool | |||
the last Bool is a Match.Bool or a Nil.Bool | |||
so the theory that: | 11:50 | ||
Cannot resolve caller Bool(Nil: ); none of these signatures matches: | |||
(Match:U $: *%_ --> Bool::False)
(Match:D $: *%_)
timo | .o( unless the topic in use is not a Str but some custom user class ) | ||
lizmat | is caused by $/ having been changed when the dispatcher thinks it's static, makes a lot of sense | ||
timo | yeah that's my working theory | ||
m: for ^1000 { my $lol = 99; await start { $lol = "hi"; $lol = [1, 2, 3]; $lol = Nil; $lol = %(:1a, :2b, :3c); }, start { for ^10 { my $result = so $lol } }; } | 11:52 | ||
camelia | ( no output ) | ||
timo | m: for ^1000 { my $lol = 99; await start { for ^50 { $lol = "hi"; $lol = [1, 2, 3]; $lol = Nil; $lol = %(:1a, :2b, :3c); } }, start { for ^50 { my $result = so $lol } }; } | ||
camelia | An operation first awaited: in block <unit> at <tmp> line 1 Died with the exception: Type check failed in binding to parameter '<anon>'; expected Str but got Hash ({:a(1), :b(2), :c(3)}) in block at <tmp> line 1 |
timo | ^- this is a big no-no do not do that you will get much pain | 11:53 | |
but the $lol here is explicitly shared and the responsibility of the user | |||
with the $/ problem that you're experiencing, it's not so clear | |||
lizmat | right, so $/ is breaking the async contract | 11:54 | |
timo | m: my @ex; for ^1000 { my $lol = 99; await start { for ^50 { $lol = "hi"; $lol = [1, 2, 3]; $lol = Nil; $lol = %(:1a, :2b, :3c); } }, start { for ^50 { my $result = so $lol } }; CATCH { default { @ex.push($_) } } }; say @ex.map({ .WHAT.^name }).Bag | ||
camelia | Bag(X::Multi::Ambiguous+{X::Await::Died}(30) X::Multi::NoMatch+{X::Await::Died}(25) X::Parameter::InvalidConcreteness+{X::Await::Died}(870) X::TypeCheck::Binding::Parameter+{X::Await::Died}(35)) | ||
timo | i wouldn't call it "async" necessarily | ||
m: my @ex; for ^1000 { my $lol = 99; await start { for ^50 { $lol = "hi"; $lol = [1, 2, 3]; $lol = Nil; $lol = %(:1a, :2b, :3c); } }, start { for ^50 { my $ldc = $lol<>; my $result = so $ldc } }; CATCH { default { @ex.push($_) } } }; say @ex.map({ .WHAT.^name }).Bag | 11:55 | ||
camelia | Bag() | ||
timo | ^- an early decont can help, since we then only read the scalar once at the start and not again later | ||
but we do want to operate with the Scalar accessible most of the time when dealing with $/ | 11:56 | ||
lizmat | well, $/ needs to be set with the result of the match | 11:58 | |
otherwise a *lot* of spectest will fail :-) | |||
timo | one thing we can do is hold on to the result of $!topic.match(self) in a local variable, assign it into $!slash to make it available, and check the local variable for its .Bool, instead of the return value of the assignment which will be the scalar container with "what we think we just assigned" in it | 11:59 | |
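timo's suggestion, as a hedged sketch of what a reworked Regex.Bool could look like (attribute names taken from the snippets quoted above; this is not the actual Rakudo source):

```raku
# Hedged sketch: hold the Match in a lexical, publish it via $!slash
# so $/ still gets set, but boolify the lexical -- not the Scalar
# returned by the assignment, whose contents another thread may have
# replaced in the meantime.
method Bool(--> Bool:D) {
    my \result = $!topic.match(self);
    $!slash    = result;
    result.Bool
}
```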
lizmat | ok so maybe a decont would work | 12:00 | |
timo | it will at least not blow up | ||
but | |||
we may get the .Bool of what some other thread stored into the same $/ in the mean time if we're unlucky | 12:01 | ||
lizmat | aaaahhh... that would explain the higher than possible values that we've seen | ||
timo | right | 12:02 | |
it's probably a good idea to do it like that just in general | |||
lizmat | ok, the decont makes the dispatch issue go away | ||
github.com/rakudo/rakudo/commit/29a032138c | 12:18 | ||
and: github.com/rakudo/rakudo/issues/5626 | 12:21 | ||
timo | right. the question i have now is, is it actually specified by the language definition that the { } block that you give to cue there is supposed to have a fresh $/ or not | 12:25 | |
lizmat | good question: the design of $/ predates async thoughts :-) | 12:26 | |
timo | when you manually cue a block on the scheduler, that's a bit more low-level than using start, which we would expect users to use when they want the scheduler to do something at some point, or the other APIs like Supply.interval or Promise.after or what it's called | ||
as in, is using "cue" without knowing absolutely for certain what you're doing an expected footgun or not | 12:29 | ||
of course raku shouldn't be full of footguns | |||
lizmat | hmmm indeed, the problem doesn't exist with start... | 12:32 | |
but it *does* with Promise.start... | |||
timo | start does a lot of work for its variables and also for dynamics | ||
lizmat | intriguing | ||
timo | but only the start { ...} syntax, not the Promise.start method | ||
lizmat | actually, Promise.start does the dynamics bits as well | 12:33 | |
timo | also a potential problem when someone innocently refactors start {...} into Promise.start({...}) | ||
lizmat | ok, looks like start { } does a Promise.start( nqp::p6capturelex(Block.clone) ) | 12:37 | |
timo | is capturelex + Block.clone also something we do in code-gen when we just naturally enter a block that has inner blocks? | 12:43 | |
because i think it might be | |||
lizmat | ah, actually we do | 12:44 | |
timo | i'm fuzzy on how this works exactly. i imagine nine knows a lot more about scopes and blocks and such | ||
now i'm wondering if we have to make spesh a lot more conservative about removing type checks when deconting | 13:23 | ||
even when we know the decont type of an object, if another thread goes and modifies it concurrently, we can corrupt memory, and corrupting memory is Very Bad. but how much do we actually want to sacrifice so that code that unsafely multithreads doesn't crash and burn | 13:24 | ||
lizmat | good question... with all of the experience of the past day, I'm adapting ParaSeq and Needle::Compile, and see if that results in a stabler environment | 13:26 | |
I mean, generally it is understood that you shouldn't be setting / reading the same lexical from multiple threads | |||
it's just that $/ is one that goes under everybody's radar | 13:27 | ||
timo | the funny thing is that spesh is a lot more careful about lexicals | ||
but here we're getting the Scalar directly passed around in arguments and attributes | |||
nine | I'd argue that if you write code that unsafely multithreads it *should* crash and burn - explosively | ||
timo | so many more of spesh's "check once, then run more optimized code" stuff applies | 13:28 | |
ideally the explosions are also reproducible under "rr record" :D | |||
lizmat | nine: many people would not consider / foo / unsafe code | 13:29 | |
timo | yeah, the unsafe part is giving a single { } to cue multiple times where the outer defines the $/ that's shared | ||
is there some involvement of the "lazily create $/ Scalar" optimization perhaps? | 13:30 | ||
which i think we also have for $_ | |||
nine | lizmat: it does not matter what they consider. It *is* unsafe. And as long as it is, it's better that people are aware of it early and can educate themselves on safe ways to use it. Much better than have us wipe that problem under the rug and have it appear only once every couple of months in a giant code base. | 13:36 | |
lizmat | so you're saying I should revert github.com/rakudo/rakudo/commit/29a032138c ? | 13:38 | |
afk for a bit& | 13:39 | ||
nine | So we cannot easily get rid of the implicit $/. I do wonder however if we could give every block a specific $/ instead of only those with statement prefixes? | 14:52 | |
lizmat | this used to be the case, if I remember correctly, and that was causing a significant slowdown *then* | ||
but that was before we have newdisp | 14:53 | ||
nine | That sounds surprising. Why would having a potentially unused $/ variable declaration slow things down? | 14:58 | |
lizmat | memory churn | 15:01 | |
is what I remember | 15:02 | ||
actually, we can NOT give each block its own $/ | |||
m: { "foo" ~~ / foo / }; say $/ | |||
camelia | 「foo」 | ||
lizmat | m: { my $/; "foo" ~~ / foo / }; say $/ | 15:03 | |
camelia | Nil | ||
lizmat | and that is spec | ||
nine | Maybe it would even help. If we mandate that *every* block has a $/ variable, then we could use the absence of such a variable a the sign that the Match object is not needed. I.e. we create a $/ lexical on every block, but only if it is actually accessed. And if the block does not contain that variable, we do not have to set it. | ||
That is a spec that's worth re-visiting I'd say. I totally get "foo" ~~ /<foo>/; say $<foo>; But why the nested block thing? | 15:04 | ||
lizmat | m: if "foo" ~~ / foo / { say $/ } | 15:06 | |
camelia | 「foo」 | ||
lizmat | m: if "foo" ~~ / foo / { my $/; say $/ } | 15:07 | |
camelia | (Any) | ||
nine | That is quite the killer argument | 15:10 | |
lizmat | yeah :-( | ||
15:53 japhb left
15:54 japhb joined
timo | yeah making that a special case that makes the block take $/ instead of $_ when a regex is involved sounds like a very WAT | 16:38 | |
asking people to write -> $/ { ... } everywhere also kind of sucks | 16:39 | ||
even though that would immediately work | |||
Geth | MoarVM/coolroot: b4bb554f9f | (Timo Paulssen)++ | 99 files rename the new MVMROOT to MVM_ROOT, keep old MVMROOT |
17:57 | |
MoarVM/coolroot: 41a44241d8 | (Timo Paulssen)++ | 100 files An alternative to MVMROOT macros where we don't have to put the code block inside the macro's argument list, which means the whole block is no longer considered a single statement by tools like gdb, profilers, and so on. |
18:05 | ||
timo | squished down into a single commit, rebased on top of latest main branch state, i think it's good to merge. rakudo doesn't need the patch that changes the code using MVMROOT any more since MVMROOT is now as it was before and the new one is MVM_ROOT | 18:06 | |
Geth | MoarVM/coolroot: 233326cd1f | (Timo Paulssen)++ | 99 files An alternative to MVMROOT macros where we don't have to put the code block inside the macro's argument list, which means the whole block is no longer considered a single statement by tools like gdb, profilers, and so on. |
18:07 | |
timo | and now i tossed the change to azure-pipelines.yml out, which should not be merged to main | ||
i want to add a field to MVMInstance, but i'd like to put it where it fits better than at the end, but on the other hand it's kind of public API huh | 18:42 | ||
patrickb | The $/ misery keeps hitting hard. I remember a group whining session about this at the last RCS. The "-> $/ { ... } everywhere" idea also came up and was dismissed as "we can't do this, too late". | ||
I acknowledge that it kills a bit of the whipuptitude of raku, but the $/ thing is one of the most magical corners in Raku that could use some demagicalization. | 18:45 | |
timo | oh what luck! there is literally a 1 byte hole right where i want to put my 1 bit | ||
nine | I could warm up to the idea of having to write -> $/ there. It's 6 characters more that solve a lot of problems. | 18:46 | |
patrickb | I tend to agree. I welcome ideas that make it work with less, but if we can't come up with a shorter solution, I'd welcome those 6 chars. From a learning-Raku perspective having $/ explicitly there is a good entry point to understand $0,... and Match objects. It'd be a nice on-ramp. | 18:48 | |
[Coke] | Seems a reasonable thing to add to a new language version. | 19:29 | |
timo | let's bring "use strict" back :P | 19:42 | |
[Coke] | that does not seem like a reasonable thing to add to a new language version. | 19:43 | |
timo | i would like to push the button on the coolroot pull request | 19:56 | |
[Coke] | IANA core developer, but isn't it going to be confusing having two similarly named macros? Is there a plan to migrate to the new one eventually? | 20:01 | |
timo | yeah i'd like to get rid of the old one at some point | 20:02 | |
[Coke] | perhaps dumb question: Should the "get rid of" happen in this same PR? | ||
timo | no, the reason this doesn't get rid of it immediately is so that any code that #include "moar.h" for some reason can continue compiling without intervention | 20:03 | |
otherwise i'd just have named it the same | |||
[Coke] | ok. | ||
I appreciate the caution, but: do we promise any stability in moarVM (vs. raku language level) ? | 20:04 | ||
timo | oh, now i can remove the comment again about the error message | ||
i'm not sure | |||
[Coke] | ok. abundance of caution is good. | 20:06 | |
nine | I don't think we have ever promised anything there. | ||
timo | nine: your Inline::something module is pretty much the only thing outside raku that we found | ||
how do you feel about just having MVMROOT change how it's used? | |||
nine | I'm not sure how much usage Inline::Perl6 has seen so far. | 20:07 | |
timo | are you testing it against newer rakudo versions regularly? | 20:13 | |
nine | nope | ||
I haven't looked at it in years | 20:14 | ||
timo | well, i'd like to just replace MVMROOT with coolroot immediately, and inside rakudo do a commit that bumps and immediately changes the tiny amount of code that uses the old MVMROOT syntax to the new one | 20:15 | |
nine | Would be fine with me. | 20:19 | |
Let's find out whether someone actually uses Inline::Perl6 and upgrades rakudo regularly :D | 20:20 | ||
timo | i've been changing debug log stuff in spesh around to put more stuff into a single zstd compression call and got the compression ratio from around 9 to about 15.3 | 20:37 | |
when compressing it with just the zstd commandline utility, which can use the entire file for context, it gets a lot better, but of course then i can't efficiently seek in the file | 20:41 | ||
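The frame property being relied on can be demonstrated with the stock zstd CLI (assuming it is installed; file names invented for the example):

```shell
# Concatenated zstd frames form one valid stream: a reader can index
# the frame boundaries and seek per-frame, which is what makes a
# per-frame-compressed spesh log seekable without decompressing it all.
printf 'frame one\n' | zstd -q > part1.zst
printf 'frame two\n' | zstd -q > part2.zst
cat part1.zst part2.zst > both.zst
zstd -dqc both.zst   # decompresses both frames in sequence
```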
ab5tract | I convinced a colleague to bundle Perl 6 (as it was back then) as an installable package based on the justification that it was needed to package Inline::Perl6 | ||
That module never made it into production, but for a good while I was able to yum install perl6 on any box and I considered that a significant win :) | 20:42 | ||
timo | funny enough level 0 gets better compression than level 1 at faster compression *and* decompression speed | 20:46 | |
Frames Skips Compressed Uncompressed Ratio Check Filename | 20:49 | ||
21397 1 104 MiB 1.56 GiB 15.342 None /tmp/speshlog.blorb.3303317.txt.zst | |||
this is how it comes out of spesh right now. the ratios i'm seeing while benchmarking range from 17x to 56x | 20:50 | ||
gist.github.com/timo/490f0e2394f9b...7531b76e39 | 20:52 | ||
ab5tract | timo: sick! | 21:00 | |
timo | spesh logs are highly, highly repetitive | 21:01 |
21:25 sena_kun joined
Geth | MoarVM/zstd_speshlog: a59905fd23 | (Timo Paulssen)++ | 4 files compress spesh log output with zstd |
21:46 | |
MoarVM/zstd_speshlog: 5e2cf9cdc1 | (Timo Paulssen)++ | 8 files check SPESH_LOG for .zst, store flag, fix truncated bits |
MoarVM/zstd_speshlog: 1c145b0474 | (Timo Paulssen)++ | 6 files DumpStr->MVMDumpStr, keep more stuff in one ds to zstd at once This gives drastically better compression rate. We can use the spots where one frame ends and another starts to efficiently skip through the compressed file and identify semantically relevant pieces even when loading really big spesh logs (like the core setting compilation generating a ~1.5GiB file) |
timo | oops, that says "rate" but i mean "ratio" | ||
Geth | MoarVM/zstd_speshlog: 487dd75cf9 | (Timo Paulssen)++ | src/spesh/optimize.c Add a append_null i forgot |
21:49 | |
timo | neat, my "skip over zstd frames" script takes just 0.38s (including rakudo startup, and module loading) to skip through (and of course record the start/end positions of the frames in) the 105MB big file (that's really representing 1.5 gigabytes of data) | 21:57 | |
correction, i might not actually be getting everything? | 22:03 | ||
actually takes 0.9s | 22:07 | ||
23:09 sena_kun left