Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
01:43 MasterDuke joined 06:47 sena_kun joined 07:01 sena_kun left 07:07 sena_kun joined 07:30 sena_kun left 08:32 harrow left 08:55 harrow joined
lizmat timo: I think the difference between .match(/foo/) and /foo/ is that in the .match case, the $/ is looked up by the .match method, 09:11
whereas with /foo/ it is passed as a raw argument to Regex.clone
timo somewhere in there something must be trying to $/.Bool or something 09:12
lizmat I wouldn't be surprised if the $/ in that case gets unhinged and writing to it will just randomly write into some memory location
I will try this theory by adding another method to Regex, not pass $/ but have it looked up like with .match, and adapt codegenning that way 09:13
timo well, it shouldn't be just some memory location, but if the Scalar gets shared around, maybe the type between one getlex and another changes and we wrongfully assume in speshed code that the type is known and use some low-level accesses on the value??
we should have enough guards for that though
lizmat so it basically codegens: if / lizmat /.clone(:topic($_), :slash($/)); 09:16
and running it like that, also fails
timo hmm. all of that is strange 09:22
does the result change with spesh turned off?
lizmat MVM_SPESH_DISABLE=1 right? 09:23
in that case, no 09:24
timo that's the right var yeah 09:27
you got different errors from me right? actual segfaults and panics and such? because with your reproduction example i only got the dispatcher errors and wrong total counts
lizmat no segfaults or panics in my sample code, no 09:29
an occasional hang
ok, I just tried the different approach with $/ and $_ being looked up, and that also fails 09:30
I just realized that another difference between / foo / and .match(/foo/) is that in the former case the match is being done because the Regex object is being sunk 09:31
nvm... even with an explicit .Bool it crashes 09:38
timo: I've golfed the code down to 30 lines in gist.github.com/lizmat/d0f1eb60e77...91ba670daa 09:47
changing the use of ParaQueue to Channel, also produces the same errors 09:53
so I'd say it's not related to ParaQueue
managed to get the golf down to 14 lines 10:22
what appears to be important, is the outer .cue 10:23
also, just got this error:
Invocant of method 'Bool' must be a type object of type 'Mu', not an
object instance of type 'Match'. Did you forget a 'multi'?
which would indicate indeed a dispatch issue
what also appears to be important, is that the code *should* have some matches 10:24
changing the pattern to something that isn't found, and all is well 10:25
so a recap:
timo yeah i imagine the error "must be a ..." means the dispatch code first decided with the given arguments it must be one particular candidate, then tries to call the candidate, and dies because the arguments changed in between
lizmat the problem shows itself when: 10:26
- calling Bool on a Regex
- inside cued code inside cued code
- there must be an occasional match 10:27
- adding a "my $/" in either the for scope, or the inner .cue scope, evades the issue 10:28
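The recap above can be sketched as a hypothetical reproduction. This is illustrative only, not lizmat's actual golf (which lives in the gist): the counter name, iteration counts, and word list are made up, but the shape matches the recap: a Regex sunk in boolean context inside cued code, occasional matches, no fresh $/ in the cued block.

```raku
my atomicint $seen = 0;
for ^100 {
    $*SCHEDULER.cue: {
        for <foo bar foo baz> {
            if / foo / { $seen⚛++ }   # sinking the Regex calls Regex.Bool, which assigns to $/
        }
    }
}
sleep 1;
say $seen;  # dispatch errors or impossible counts may show up here
# a "my $/;" at the top of the cued block makes the problem go away
```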
further datapoint: this appears only on .Bool, not on .defined 10:53
aha! Regex.Bool turns out to be interesting reading 10:55
timo it checks $/? 10:58
lizmat ($!slash = topic.match(self)).Bool
updated gist again as to not need the REA file 11:00
timo oh, whoops, it modifies an attribute of the Regex object? but we clone that first, right? 11:01
lizmat yes
/ foo / codegens as /foo/.clone($_, $/)
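As a hedged illustration of that codegen equivalence, assuming the :topic/:slash named parameters to Regex.clone quoted earlier in the discussion:

```raku
$_ = "food";
# a bare / foo / in boolean/sink context behaves roughly like:
my $cloned = / foo /.clone(:topic($_), :slash($/));
say $cloned.Bool;  # runs the match against the captured topic
say $/;            # the match result lands in the captured $/ container
```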
timo oh, does it take the $/ as a Scalar and binds it, and the $/ Scalar is what's shared? 11:02
so the clone isn't helping for this particular case
lizmat Regex.clone sets $!topic and $!slash in the clone of the Regex object 11:03
raw, so basically it binds the containers themselves 11:04
but every scope should get its own, fresh $/ should it not? 11:05
hmm. intriguing! 11:07
m: my $/; { "foo" ~~ / foo / }; say $/
camelia Potential difficulties:
Redeclaration of symbol '$/'.
at <tmp>:1
------> my $/⏏; { "foo" ~~ / foo / }; say $/
「foo」
lizmat m: { my $/; { "foo" ~~ / foo / }; say $/ }
camelia 「foo」
lizmat so the inner / / is setting the $/ in the outer scope
and apparently the compunit has a hard definition of $/ that can be "found" 11:08
and it won't be found if the code is inside a .cue { } block
m: { "foo" ~~ / foo / }; say $/
camelia 「foo」
lizmat m: { my $/; { "foo" ~~ / foo / } }; say $/ 11:09
camelia Nil
timo oh, i seem to recall something about taking the outer's $/, or was that $_? 11:10
lizmat $_ and $/ both, actually
timo i guess it does that for every scope that doesn't get its own $/, plus some optimization that leaves it out when it's proven to not be needed? 11:16
lizmat adding a "my $/" to ThreadPoolScheduler!run-one doesn't make the problem go away 11:23
timo i think it's probably more about lexical scopes than dynamic scopes?
so it would be a change near where the regex lives or is cloned or executed that determines it?
lizmat well, since it's only Regex.Bool that's causing the issues, and not Regex.defined, I'd say there's something in Regex.Bool doing it 11:24
aha! 11:25
m: $_ = "foo"; with / foo / { .say }
camelia / foo /
lizmat Regex.Bool is special!
m: $_ = "foo"; if / foo / { .say; say $/ } 11:26
camelia foo
「foo」
lizmat m: $_ = "foo"; with / foo / { .say; say $/ }
camelia / foo /
Nil
11:32 MasterDuke left
timo makes sense that .Bool does more work than .defined 11:32
Regex.Bool actually runs the regex against $_ or something, doesn't it?
lizmat ($!slash = topic.match(self)).Bool 11:33
where: my \topic = $!topic
ok, it turns out the outer .cue is not needed to cause the issue 11:37
golf now 11 lines 11:38
timo so we're getting the $/ from the for loop that's the same scalar every go-around so everything cued on the scheduler tries to share one $/? 11:39
because the block that we're cue-ing doesn't generate its own $/ and therefore just binds the parent scope's $/? 11:40
we should be able to put a $/.VAR.WHERE or $/.VAR.WHICH in a few places to see if we're looking at the same, or at different $/ scalars 11:48
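A minimal sketch of that diagnostic, using .VAR.WHERE to compare container identities (illustrative, not lizmat's actual instrumentation):

```raku
say "outer $/ container: ", $/.VAR.WHERE;
$*SCHEDULER.cue: {
    # the same number here would mean the cued block shares the
    # outer scope's $/ Scalar instead of getting a fresh one
    say "cued  $/ container: ", $/.VAR.WHERE;
};
sleep 0.5;
```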
lizmat will do that 11:49
meanwhile:
looking at: ($!slash = $!topic.match(self)).Bool
the last Bool is a Match.Bool or a Nil.Bool
so the theory that: 11:50
Cannot resolve caller Bool(Nil: ); none of these signatures matches:
(Match:U $:: *%_ --> Bool::False)
(Match:D $:: *%_)
timo .o( unless the topic in use is not a Str but some custom user class )
lizmat is caused by $/ having been changed when the dispatcher thinks it's static, makes a lot of sense
timo yeah that's my working theory
m: for ^1000 { my $lol = 99; await start { $lol = "hi"; $lol = [1, 2, 3]; $lol = Nil; $lol = %(:1a, :2b, :3c); }, start { for ^10 { my $result = so $lol } }; } 11:52
camelia ( no output )
timo m: for ^1000 { my $lol = 99; await start { for ^50 { $lol = "hi"; $lol = [1, 2, 3]; $lol = Nil; $lol = %(:1a, :2b, :3c); } }, start { for ^50 { my $result = so $lol } }; }
camelia An operation first awaited:
in block <unit> at <tmp> line 1

Died with the exception:
Type check failed in binding to parameter '<anon>'; expected Str but got Hash ({:a(1), :b(2), :c(3)})
in block at <tmp> line 1
timo ^- this is a big no-no do not do that you will get much pain 11:53
but the $lol here is explicitly shared and the responsibility of the user
with the $/ problem that you're experiencing, it's not so clear
lizmat right, so $/ is breaking the async contract 11:54
timo m: my @ex; for ^1000 { my $lol = 99; await start { for ^50 { $lol = "hi"; $lol = [1, 2, 3]; $lol = Nil; $lol = %(:1a, :2b, :3c); } }, start { for ^50 { my $result = so $lol } }; CATCH { default { @ex.push($_) } } }; say @ex.map({ .WHAT.^name }).Bag
camelia Bag(X::Multi::Ambiguous+{X::Await::Died}(30) X::Multi::NoMatch+{X::Await::Died}(25) X::Parameter::InvalidConcreteness+{X::Await::Died}(870) X::TypeCheck::Binding::Parameter+{X::Await::Died}(35))
timo i wouldn't call it "async" necessarily
m: my @ex; for ^1000 { my $lol = 99; await start { for ^50 { $lol = "hi"; $lol = [1, 2, 3]; $lol = Nil; $lol = %(:1a, :2b, :3c); } }, start { for ^50 { my $ldc = $lol<>; my $result = so $ldc } }; CATCH { default { @ex.push($_) } } }; say @ex.map({ .WHAT.^name }).Bag 11:55
camelia Bag()
timo ^- an early decont can help, since we then only read the scalar once at the start and not again later
but we do want to operate with the Scalar accessible most of the time when dealing with $/ 11:56
lizmat well, $/ needs to be set with the result of the match 11:58
otherwise a *lot* of spectest will fail :-)
timo one thing we can do is hold on to the result of $!topic.match(self) in a local variable, assign it into $!slash to make it available, and check the local variable for its .Bool, instead of the return value of the assignment which will be the scalar container with "what we think we just assigned" in it 11:59
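A sketch of what timo describes here, with a simplified method signature (not the actual commit):

```raku
# Before (simplified): the .Bool is taken from the assignment's return
# value, i.e. the shared $!slash container, which another thread may
# have overwritten between the assignment and the .Bool call:
#     ($!slash = $!topic.match(self)).Bool
# After: keep the Match in a local, publish it, Bool the local:
method Bool(Regex:D: --> Bool:D) {
    my \result = $!topic.match(self);
    $!slash    = result;
    result.Bool      # reads the local, not the shared container
}
```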
lizmat ok so maybe a decont would work 12:00
timo it will at least not blow up
but
we may get the .Bool of what some other thread stored into the same $/ in the mean time if we're unlucky 12:01
lizmat aaaahhh... that would explain the higher than possible values that we've seen
timo right 12:02
this is probably a good idea to do it like that just in general
lizmat ok, the decont makes the dispatch issue go away
github.com/rakudo/rakudo/commit/29a032138c 12:18
and: github.com/rakudo/rakudo/issues/5626 12:21
timo right. the question i have now is, is it actually specified by the language definition that the { } block that you give to cue there is supposed to have a fresh $/ or not 12:25
lizmat good question: the design of $/ predates async thoughts :-) 12:26
timo when you manually cue a block on the scheduler, that's a bit more low-level than using start, which we would expect users to use when they want the scheduler to do something at some point, or the other APIs like Supply.interval or Promise.after or what it's called
as in, is using "cue" without knowing absolutely for certain what you're doing an expected footgun or not 12:29
of course raku shouldn't be full of footguns
lizmat hmmm indeed, the problem doesn't exist with start... 12:32
but it *does* with Promise.start...
timo start does a lot of work for its variables and also for dynamics
lizmat intriguing
timo but only the start { ...} syntax, not the Promise.start method
lizmat actually, Promise.start does the dynamics bits as well 12:33
timo also a potential problem when someone innocently refactors start {...} into Promise.start({...})
lizmat ok, looks like start { } does a Promise.start( nqp::p6capturelex(Block.clone) ) 12:37
timo is capturelex + Block.clone also something we do in code-gen when we just naturally enter a block that has inner blocks? 12:43
because i think it might be
lizmat ah, actually we do 12:44
timo i'm fuzzy on how this works exactly. i imagine nine knows a lot more about scopes and blocks and such
now i'm wondering if we have to make spesh a lot more conservative about removing type checks when deconting 13:23
even when we know the decont type of an object, if another thread goes and modifies it concurrently, we can corrupt memory, and corrupting memory is Very Bad. but how much do we actually want to sacrifice so that code that unsafely multithreads doesn't crash and burn 13:24
lizmat good question... with all of the experience of the past day, I'm adapting ParaSeq and Needle::Compile, and see if that results in a stabler environment 13:26
I mean, generally it is understood that you shouldn't be setting / reading the same lexical from multiple threads
it's just that $/ is one that goes under everybody's radar 13:27
timo the funny thing is that spesh is a lot more careful about lexicals
but here we're getting the Scalar directly passed around in arguments and attributes
nine I'd argue that if you write code that unsafely multithreads it *should* crash and burn - explosively
timo so many more of spesh's "check once, then run more optimized code" stuff applies 13:28
ideally the explosions are also reproducible under "rr record" :D
lizmat nine: many people would not consider / foo / unsafe code 13:29
timo yeah, the unsafe part is giving a single { } to cue multiple times where the outer defines the $/ that's shared
is there some involvement of the "lazily create $/ Scalar" optimization perhaps? 13:30
which i think we also have for $_
nine lizmat: it does not matter what they consider. It *is* unsafe. And as long as it is, it's better that people are aware of it early and can educate themselves on safe ways to use it. Much better than have us wipe that problem under the rug and have it appear only once every couple of months in a giant code base. 13:36
lizmat so you're saying I should revert github.com/rakudo/rakudo/commit/29a032138c ? 13:38
afk for a bit& 13:39
nine So we cannot easily get rid of the implicit $/. I do wonder however if we could give every block a specific $/ instead of only those with statement prefixes? 14:52
lizmat this used to be the case, if I remember correctly, and that was causing a significant slowdown *then*
but that was before we have newdisp 14:53
nine That sounds surprising. Why would having a potentially unused $/ variable declaration slow things down? 14:58
lizmat memory churn 15:01
is what I remember 15:02
actually, we can NOT give each block its own $/
m: { "foo" ~~ / foo / }; say $/
camelia 「foo」
lizmat m: { my $/; "foo" ~~ / foo / }; say $/ 15:03
camelia Nil
lizmat and that is spec
nine Maybe it would even help. If we mandate that *every* block has a $/ variable, then we could use the absence of such a variable as the sign that the Match object is not needed. I.e. we create a $/ lexical on every block, but only if it is actually accessed. And if the block does not contain that variable, we do not have to set it.
That is a spec that's worth re-visiting I'd say. I totally get "foo" ~~ /<foo>/; say $<foo>; But why the nested block thing? 15:04
lizmat m: if "foo" ~~ / foo / { say $/ } 15:06
camelia 「foo」
lizmat m: if "foo" ~~ / foo / { my $/; say $/ } 15:07
camelia (Any)
nine That is quite the killer argument 15:10
lizmat yeah :-(
15:53 japhb left 15:54 japhb joined
timo yeah making that a special case that makes the block take $/ instead of $_ when a regex is involved sounds like a very WAT 16:38
asking people to write -> $/ { ... } everywhere also kind of sucks 16:39
even though that would immediately work
Geth MoarVM/coolroot: b4bb554f9f | (Timo Paulssen)++ | 99 files
rename the new MVMROOT to MVM_ROOT, keep old MVMROOT
17:57
MoarVM/coolroot: 41a44241d8 | (Timo Paulssen)++ | 100 files
An alternative to MVMROOT macros

where we don't have to put the code block inside the macro's argument list, which means the whole block is no longer considered a single statement by tools like gdb, profilers, and so on.
18:05
timo squished down into a single commit, rebased on top of latest main branch state, i think it's good to merge. rakudo doesn't need the patch that changes the code using MVMROOT any more since MVMROOT is now as it was before and the new one is MVM_ROOT 18:06
Geth MoarVM/coolroot: 233326cd1f | (Timo Paulssen)++ | 99 files
An alternative to MVMROOT macros

where we don't have to put the code block inside the macro's argument list, which means the whole block is no longer considered a single statement by tools like gdb, profilers, and so on.
18:07
timo and now i tossed the change to azure-pipelines.yml out, which should not be merged to main
i want to add a field to MVMInstance, but i'd like to put it where it fits better than at the end, but on the other hand it's kind of public API huh 18:42
patrickb The $/ misery keeps hitting hard. I remember a group whining session about this at the last RCS. The "-> $/ { ... } everywhere" idea also came up and was dismissed as "we can't do this, too late".
I acknowledge that it kills a bit of the whipuptotude of raku, but the $/ thing is one of the most magical corners in Raku that could use some demagicalization. 18:45
timo oh what luck! there is literally a 1 byte hole right where i want to put my 1 bit
nine I could warm up to the idea of having to write -> $/ there. It's 6 characters more that solve a lot of problems. 18:46
patrickb I tend to agree. I welcome ideas that make it work with less, but if we can't come up with a shorter solution, I'd welcome those 6 chars. From a learning-Raku perspective, having $/ explicitly there is a good entry point to understand $0, ... and Match objects. It'd be a nice on-ramp. 18:48
[Coke] Seems a reasonable thing to add to a new language version. 19:29
timo let's bring "use strict" back :P 19:42
[Coke] that does not seem like a reasonable thing to add to a new language version. 19:43
timo i would like to push the button on the coolroot pull request 19:56
[Coke] IANA core developer, but isn't it going to be confusing having two similarly named macros? Is there a plan to migrate to the new one eventually? 20:01
timo yeah i'd like to get rid of the old one at some point 20:02
[Coke] perhaps dumb question: Should the "get rid of" happen in this same PR?
timo no, the reason this doesn't get rid of it immediately is so that any code that #include "moar.h" for some reason can continue compiling without intervention 20:03
otherwise i'd just have named it the same
[Coke] ok.
I appreciate the caution, but: do we promise any stability in moarVM (vs. raku language level) ? 20:04
timo oh, now i can remove the comment again about the error message
i'm not sure
[Coke] ok. abundance of caution is good. 20:06
nine I don't think we have ever promised anything there.
timo nine: your Inline::something module is pretty much the only thing outside raku that we found
how do you feel about just having MVMROOT change how it's used?
nine I'm not sure how much usage Inline::Perl6 has seen so far. 20:07
timo are you testing it against newer rakudo versions regularly? 20:13
nine nope
I haven't looked at it in years 20:14
timo well, i'd like to just replace MVMROOT with coolroot immediately, and inside rakudo do a commit that bumps and immediately changes the tiny amount of code that uses the old MVMROOT syntax to the new one 20:15
nine Would be fine with me. 20:19
Let's find out whether someone actually uses Inline::Perl6 and upgrades rakudo regularly :D 20:20
timo i've been changing debug log stuff in spesh around to put more stuff into a single zstd compression call and got the compression ratio from around 9 to about 15.3 20:37
when compressing it with just the zstd commandline utility, which can use the entire file for context, it gets a lot better, but of course then i can't efficiently seek in the file 20:41
ab5tract I convinced a colleague to bundle Perl 6 (as it was back then) as an installable package based on the justification that it was needed to package Inline::Perl6
That module never made it into production, but for a good while I was able to yum install perl6 on any box and I considered that a significant win :) 20:42
timo funny enough level 0 gets better compression than level 1 at faster compression *and* decompression speed 20:46
Frames Skips Compressed Uncompressed Ratio Check Filename 20:49
21397 1 104 MiB 1.56 GiB 15.342 None /tmp/speshlog.blorb.3303317.txt.zst
this is how it comes out of spesh right now. the ratios i'm seeing while benchmarking range from 17x to 56x 20:50
gist.github.com/timo/490f0e2394f9b...7531b76e39 20:52
ab5tract timo: sick! 21:00
timo spesh logs are highly, highly repetitive 21:01
21:25 sena_kun joined
Geth MoarVM/zstd_speshlog: a59905fd23 | (Timo Paulssen)++ | 4 files
compress spesh log output with zstd
21:46
MoarVM/zstd_speshlog: 5e2cf9cdc1 | (Timo Paulssen)++ | 8 files
check SPESH_LOG for .zst, store flag, fix truncated bits
MoarVM/zstd_speshlog: 1c145b0474 | (Timo Paulssen)++ | 6 files
DumpStr->MVMDumpStr, keep more stuff in one ds to zstd at once

This gives drastically better compression rate. We can use the spots where one frame ends and another starts to efficiently skip through the compressed file and identify semantically relevant pieces even when loading really big spesh logs (like the core setting compilation generating a ~1.5GiB file)
timo oops, that says "rate" but i mean "ratio"
Geth MoarVM/zstd_speshlog: 487dd75cf9 | (Timo Paulssen)++ | src/spesh/optimize.c
Add a append_null i forgot
21:49
timo neat, my "skip over zstd frames" script takes just 0.38s (including rakudo startup, and module loading) to skip through (and of course record the start/end positions of the frames in) the 105MB big file (that's really representing 1.5 gigabytes of data) 21:57
correction, i might not actually be getting everything? 22:03
actually takes 0.9s 22:07
23:09 sena_kun left