Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
00:02 reportable6 left 00:05 reportable6 joined 01:05 linkable6 left, evalable6 left 01:06 linkable6 joined 02:06 linkable6 left 02:08 linkable6 joined 05:00 bartolin_ left, samcv left, bartolin joined 05:01 samcv joined 05:06 evalable6 joined 06:02 reportable6 left 07:05 reportable6 joined
Geth MoarVM/new-disp-nativecall: 9677c720b0 | (Stefan Seifert)++ | build/Makefile.in
Add missing dependency on labels.h to Makefile.in
MoarVM/new-disp-nativecall: 90efe839a4 | (Stefan Seifert)++ | 5 files
Support passing unboxed nums to native routines
MoarVM/new-disp-nativecall: e3def6e6bc | (Stefan Seifert)++ | 6 files
JIT compile sp_runnativecall_v and sp_runnativecall_i
08:20 harrow left 08:27 harrow joined
lizmat o/ #moarvm 08:32
I'm tracing a weird issue with IRC::Log::Colabti and IRC::Channel::Log
the "coordinates" of an IRC message (aka: hour, minute, ordinal within minute, position within daily log) 08:33
are compacted into a single 64 bit native int to reduce memory footprint
this works fine on a single log 08:34
and it works fine on many logs (inside an IRC::Channel::Log object) as long as the log files are parsed consecutively
but as soon as they are parsed in parallel (using .race), the 64 bit value contains garbage 08:35
well, garbage, I got the strong impression that values from different threads are being or-ed
method TWEAK(int :$hour, int :$minute, int :$ordinal, int :$pos) { 08:37
$!hmop = ($hour +< 56) + ($minute +< 48) + ($ordinal +< 32) + $pos;
is the actual logic being used to pack 08:38
so I don't see how this could be leaking between threads? 08:39
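As a rough illustration of the packing scheme quoted above, here is a Python model of the same bit layout. It is a sketch, not the module's code: the real logic is the Raku TWEAK method, and the names `pack_hmop` and `unpack_hmop`, the field widths implied by the shifts, and the sample values are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1

def pack_hmop(hour, minute, ordinal, pos):
    """Pack four fields using the same shifts as the Raku expression."""
    # Assumed widths, derived from the shift amounts:
    # hour 8 bits, minute 8 bits, ordinal 16 bits, pos 32 bits.
    assert 0 <= hour < (1 << 8)
    assert 0 <= minute < (1 << 8)
    assert 0 <= ordinal < (1 << 16)
    assert 0 <= pos < (1 << 32)
    return ((hour << 56) + (minute << 48) + (ordinal << 32) + pos) & MASK64

def unpack_hmop(hmop):
    """Recover (hour, minute, ordinal, pos) from a packed value."""
    return (
        (hmop >> 56) & 0xFF,
        (hmop >> 48) & 0xFF,
        (hmop >> 32) & 0xFFFF,
        hmop & 0xFFFFFFFF,
    )

hmop = pack_hmop(8, 37, 2, 1234)
print(format(hmop, "016x"))   # 08250002000004d2
print(unpack_hmop(hmop))      # (8, 37, 2, 1234)
```

In this layout hour, minute and ordinal live entirely in the upper 32 bits and pos occupies the lower 32 bits on its own, so as long as every field stays in range the fields cannot overlap.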
lizmat goes back to 2021.09 to see whether it also happened then 08:40
nine How do you know that there's garbage in those ints? 08:41
Btw. since you expect those values to have certain ranges, you could make that more clear by using appropriately sized ints for them. I.e. int8 :$hour, int8 :$minute, int16 :$ordinal, int32 :$pos 08:43
lizmat nine: indeed I could, but I've already verified that the values are in range 08:45
nine It's more to make it clear to the reader
lizmat I understand... wouldn't that involve a runtime penalty? 08:46
nine Yes. But it may not even be measurable. We'd emit some more trunc_i/extend_i ops, but those are among the cheapest there are, at least when JITed 08:47
lizmat well, actually, I'd want them to be uint8 etc. 08:51
nine Oh, true 08:52
lizmat actually: garbage is incorrect, only the lower 32bits seem to be affected 08:55
which is why I never noticed it before, as the upper 32 bits are used to create the target strings, and that's all I had used so far 08:56
ok, the plot thickens: 09:19
if I replace the lexical keeping the position in IRC::Log::Colabti.parse by an Int (rather than an int)
there is no problem when running sequentially 09:20
but in parallel I get:
Died at:
Parameter '$a' of routine 'prefix:<++>' must be a type object of type
'Mu', not an object instance of type 'Int'. Did you forget a 'multi'?
note that $pos is initialized with: 09:21
my Int $pos = 0;
nine Have you checked on 2021.09? 09:22
lizmat tried to build 2021.09, but that failed :-(
NQP claiming it needed 2021.10 even though I told it to use 2021.09 09:23
perl Configure.pl --force-rebuild --gen-moar=2021.09 --gen-nqp=2021.09 --make-install;
aaah... wait... I know what I did wrong :-) 09:24
(with trying to compile 2021.09) 09:25
ok, this issue also exists on 2021.09, so is not new-disp related 09:29
nine What about spesh? 09:30
lizmat what about it? 09:36
nine Have you tried with spesh disabled? 09:37
lizmat MVM_SPESH_DISABLE=1 right ?
nine yes 09:38
lizmat no change
actually.... 09:39
with spesh disabled, and using an Int
it does complete (with wrong numbers)
with spesh enabled, it dies with various types of errors 09:41
all caused by the value becoming a type object 09:42
nine Looks like a spesh bug, quacks like a spesh bug... 09:44
lizmat so, should I file an issue on MoarVM or on rakudo ? 09:45
nine Since it walks like a spesh bug, I'd say MoarVM 09:46
lizmat oki 09:49
nine Oh boy...looks like I'll have to change all the methods on Inline::Perl5::Interpreter to subs to be able to gain from natively typed results and for now even from JITing 09:53
lizmat meh, why doesn't that work on methods then (eventually) ? 09:54
jnthnwrthngtn Because method calls are late-bound so we can't at the callsite emit a dispatch_i or similar 09:55
nine Because they are late bound. The static optimizer cannot know the code object that will eventually get called on a method call. Therefore it cannot check its returns and cannot copy that to the QAST::Op node. And this means that we cannot pick a natively typed dispatch op, but have to emit dispatch_o
jnthnwrthngtn lizmat: Does MVM_JIT_DISABLE=1 make the bug go away?
I'm wondering if it could be a machine code mis-gen (uses a 32-bit reg somewhere it should use a 64-bit one, thus losing the upper bits...) 09:56
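To make that hypothesis concrete, here is a minimal Python sketch of what passing a packed value through a 32-bit register would do. It reuses the bit layout from the TWEAK expression quoted earlier in the log; the helper name `through_32bit_register` and the sample values are hypothetical.

```python
def through_32bit_register(value):
    """Model a 64-bit value being moved through a 32-bit register."""
    return value & 0xFFFFFFFF

# hour=8, minute=37, ordinal=2, pos=1234 packed as in the TWEAK expression.
packed = (8 << 56) + (37 << 48) + (2 << 32) + 1234
truncated = through_32bit_register(packed)

print(truncated >> 32)         # 0: hour, minute and ordinal are gone
print(truncated & 0xFFFFFFFF)  # 1234: pos survives
```

In this model the upper fields are lost and pos survives, whereas the report in the log is that the lower 32 bits are the ones affected.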
lizmat MVM_JIT_DISABLE=1 does not make a difference 09:57
jnthnwrthngtn OK, was worth a try
lizmat running log parsing in parallel does
so it only happens when run in parallel
and it only affects the lower 32 bits
even with a degree => 2 it happens 09:59
Geth MoarVM/cheaper-frames: 2ae17407d2 | (Jonathan Worthington)++ | 5 files
Allocate ->work registers on the call stack

Thus saving us a pair of FSA calls to allocate and free the registers for every frame invocation. While microbenchmarks where we manage to inline everything will just get a bit more done while interpreting, a more realistic program will benefit from this. For example, the Rakudo build process completes in around 96.7% of the time it used to before ... (12 more lines)
MoarVM/cheaper-frames: 68ee0cb355 | (Jonathan Worthington)++ | 5 files
Allocate some frame environments on the callstack

Only do this for frames that live on the callstack rather than on the heap. This is the case when we have a lexical environment but it is never captured or the frame doesn't escape in other ways. When frames do escape, we have to also move the environment out of the callstack and onto the heap. We already do go to some effort to allocate on the heap ... (5 more lines)
MoarVM: jnthn++ created pull request #1578:
Allocate frame work and, when possible, environment registers on the callstack
10:33 linkable6 left 10:34 linkable6 joined
nine I think, I can actually have spesh translate dispatch_o to sp_dispatch_i followed by a box_i where appropriate. 10:39
jnthnwrthngtn Ah, and then let box/unbox elim rip it out? That's probably quite workable, yes 10:43
nine Exactly :)
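The idea of emitting a native dispatch plus a box_i and letting box/unbox elimination clean up can be sketched with a toy peephole pass. This is not MoarVM's spesh code: instructions are modelled as Python tuples `(op, dest, *sources)`, operand shapes are simplified, the code is assumed to be straight-line and in SSA form (each register written once), and the function name is hypothetical.

```python
def eliminate_box_unbox(instructions):
    """Toy peephole pass over (op, dest, *sources) tuples.

    Rewrites an unbox_i of a value produced by box_i into a plain copy of
    the original native register, then drops box_i instructions whose
    result is no longer read by anything.
    """
    boxed_from = {}   # boxed register -> native register it was boxed from
    rewritten = []
    for op, dest, *sources in instructions:
        if op == "box_i":
            boxed_from[dest] = sources[0]
        if op == "unbox_i" and sources[0] in boxed_from:
            rewritten.append(("set", dest, boxed_from[sources[0]]))
        else:
            rewritten.append((op, dest, *sources))

    # Collect every register that is still read after the rewrite.
    used = {src for _, _, *sources in rewritten for src in sources}
    return [ins for ins in rewritten
            if not (ins[0] == "box_i" and ins[1] not in used)]

before = [
    ("sp_dispatch_i", "r1", "callee"),  # native int result lands in r1
    ("box_i", "r2", "r1"),              # boxed only to satisfy the consumer
    ("unbox_i", "r3", "r2"),            # consumer immediately unboxes again
    ("add_i", "r4", "r3", "r3"),
]
for ins in eliminate_box_unbox(before):
    print(ins)
```

With this input the box_i/unbox_i pair collapses into a single register copy, so no boxed object is created. If anything else still reads the boxed register, the box_i is kept.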
jnthnwrthngtn I'm really curious why the difference is so big between machines for that frame cheapening. 10:44
nine I'm aiming at having the JIT of sp_runnativecall do the absolute minimum. It already supports Pointers via unboxing in the dispatch program.
jnthnwrthngtn Ah, cool 10:45
That sounds like a nice way to go
nine So, if you have something like takeapointer(giveapointer()); we'll end up not creating a Pointer object at all
jnthnwrthngtn Well...kinda...so long as we don't end up with pointers magically starting to unbox as a language feature
m: use NativeCall; my $p = Pointer.new(0); my int $i = $p 10:46
camelia ( no output )
jnthnwrthngtn oh :)
m: use NativeCall; my $p = Pointer.new(0); my int $i = $p; say $i
camelia 0
nine We're long past that point :D
jnthnwrthngtn Give to C what is C's...
nine It's possible that we will even handle strings this way 10:47
via CStr
10:47 Altai-man joined
lizmat github.com/MoarVM/MoarVM/issues/1579 11:04
timo is the variable pos declared outside the race? 11:20
jnthnwrthngtn timo: I'm looking at conf progs a bit; for heap snapshots, I don't think it's currently powerful enough to do things like "take a snapshot only on major collections" or "take a snapshot at most once every 10 minutes"? 11:21
timo mh, thats possible, i think i wanted to add variables that can persist between runs for things like when was the last snapshot taken 11:25
only on major will just want a reserved variable name that gets that as like a parameter i guess 11:26
jnthnwrthngtn Yeah, I'm seeing there's getattr usage to expose information like that? 11:30
That could be used to expose "is it a full collection" but also "time since last snapshot" 11:31
Which is probably simpler than having persistent things :)
timo ah, indeed 11:38
worst case theres $*VM or something in rakudos libs/ to get a snapshot from user code iirc
lizmat timo: no, it is a lexical in IRC::Log::Colabti.parse 11:40
and IRC::Log::Colabti.parse is effectively what's being raced 11:42
11:52 evalable6 left, linkable6 left, linkable6 joined 11:53 evalable6 joined
timo this is not about log entries being added in the wrong order because of race? 12:00
lizmat no, the entries are all in the correct order
just the .pos value (aka the lower 32 bits of $!hmop) is garbage
the race is between dates, *not* within a date 12:01
timo how does the pos value look when you output it from tweak? 12:02
12:02 reportable6 left 12:03 reportable6 joined
lizmat he, from what I observed, they look correct 12:03
there's a *lot* of them 12:04
nine lizmat: how _do_ you know that the lower 32 bits of hmop are incorrect? 12:05
lizmat in two ways: by calling .pos on an entry, and in a debugging build, checking the $!hmop value 12:06
nine checking it how?
lizmat visually, by looking at all entries of a day and having it output the $!hmop value in hex 12:07
but let me build in a sanity check after parsing a log 12:08
nine Huh...another dispatch oddity: I can't find any dispatch_v generated for Raku code. All instances I find seem to be from NQP code or hand crafted ops 12:24
lizmat _v as in version ? 12:28
nine void
jnthnwrthngtn nine: Not that surprising, given we have to enforce sink context
nine I could go the same route and fix it in post^Wspesh 12:30
Btw. having spesh generate the box_i works wonderfully :)
Geth MoarVM/new-disp-nativecall: 18 commits pushed by (Stefan Seifert)++
review: github.com/MoarVM/MoarVM/compare/e...bfa4e5330f
12:45 frost left
lizmat stops looking at the pos issue to work on the RWN 12:49
nine 12.786s! 12:56
Now I hope that this is actually semantically correct.
jnthnwrthngtn nine: That is with or without JITting the native call? 12:57
Ah, looking at the diff, seems like "JITting some of them" 12:58
nine jnthnwrthngtn: so far I've implemented JITing of native calls with integer and double arguments and integer and void results.
jnthnwrthngtn ah, so potentially quite a few more things aren't yet JITted?
Geth MoarVM/new-disp-nativecall: a30b74d1b5 | (Stefan Seifert)++ | src/spesh/disp.c
Fixup "Translate dispatch to native functions"
MoarVM/new-disp-nativecall: 9ee6b1bfec | (Stefan Seifert)++ | src/spesh/disp.c
Simplify dispatch_o to sp_runnativecall_v if there's no result

We cannot generate dispatch_v instructions for Raku code because of sinking semantics. But for native functions with void result we can replace dispatch_o with sp_runnativecall_v as we know that they don't return anything.
nine This ^^^ is worth almost half a second
jnthnwrthngtn wow :) 13:02
nine jnthnwrthngtn: with boxing/unboxing in HLL code or via spesh, we can also fully JIT calls involving CPointers. For csv-ip5xs.pl the remaining unJITed calls are due to rw args. 13:03
jnthnwrthngtn Ah, OK. 13:04
I guess for this there's a lot of overhead that is not just the native calling
Will be interesting to see how database drivers fare; I'd hope for a good speedup on those where it's a C call for every value retrieved 13:05
nine Well the native calls with rw args are unfortunately the work horses, i.e. the actual calls to perl functions in this case. 13:10
jnthnwrthngtn Ah. I hadn't realized there was a lot of rw things there. OK, then there's still hope for maybe some more speedup then :) 13:13
timo making a readwrite integer be allocated on the stack via scalar replacement and giving the actual address to the native call 13:14
that seems very cheap
maybe not great to give the address of our actual stack region to some random c function
nine There's a lot of information I have to communicate back from those calls, i.e. return value, number of return values, type of return value and if there was an exception. 13:15
timo ok, so theres many rw arguments
nine I wonder if I can just apply the same trick here: have spesh rewrite the call, so the native function can write back into the natively typed input register and add the assign_i after the call. 13:16
Btw. just for comparison, test-t takes 57.241s on the same input file 13:18
jnthnwrthngtn You mean decont_i before, assign_i after, and pass a pointer to the MoarVM register or similar?
That could work and would keep complexity down (and let spesh deal with optimizing those reads/writes) 13:19
nine yes
timo thats the one the thirteen seconds from earlier are compared against?
nine timo: yes
timo well, test-t vs test-perl5-xs?
nine test-t vs csv-ip5xs
timo er, yes 13:20
how dare perl5 continue to beat us haha
jnthnwrthngtn With new-disp + #1578 it seems we finally are doing frame creation (as in, when we can't inline) faster than perl5 (or at least, that's what winning a recursive Fibonacci benchmark would suggest) 13:22
timo i love hearing that!
jnthnwrthngtn Still losing on that one to Python and Ruby, alas.
nine So, we're faster by default _and_ can inline 13:23
What do they do to be faster still?
timo python does not inline at all tho
jnthnwrthngtn nine: I imagine their frame setup and/or calling convs are simply much more lightweight.
timo thats my intuition as well
they have neither rw args nor unwind specials of any kind, right 13:24
jnthnwrthngtn We're still quite costly there. The work/env allocation was one part of it, but we've more to do yet.
timo python i mean
is what i imagine
jnthnwrthngtn Yeah, I'm currently looking at making special unwind be a callstack record kind
So we get rid of the check there
timo i remember
13:33 ggoebel joined
ggoebel jnthnwrthngtn: is the code for the microbenchmark comparisons for raku vs perl5 vs python vs ruby in a repo somewhere? 13:34
jnthnwrthngtn ggoebel: Not a public one
ggoebel I think vrurg has mentioned the computer language benchmarks game before (benchmarksgame-team.pages.debian.n...ruby.html) but that is not accepting new languages. And their benchmarks are not too similar 13:36
timo and i still havent put anything up for the language dragrace actually 13:49
14:12 ggoebel left
timo oh, do we jit nurcfurc yet? 14:43
14:47 ggoebel joined
nine we do 14:49
[Coke] nurcfurc? 14:53
14:56 jdv left, jdv joined
nine runcfunc is just not funky enough 14:59
Oh darn... replacing args in spesh is not as easy as it looked because the callsite will still claim that an arg is an object arg 15:13
I guess I have to create a modified copy of the callsite, intern that and use the pointer to it in the ins 15:19
MVM_callsite_drop_positional and MVM_callsite_insert_positional look quite handy for that 15:21
And it works! 15:40
Now I just have to JIT those rw args
11.448s! 15:53
jnthnwrthngtn :D 15:55
japhb nine: What's the time for 2021.09 and 2021.10? (In other words, before new-disp, and after new-disp but before your changes) 16:18
17:45 Altai-man left
lizmat and yet another Rakudo Weekly News hits the Net: rakudoweekly.blog/2021/10/25/2021-43-thank-you/ 17:49
dogbert17 lizmat++ 17:58
nine japhb: before new-disp between 14 and 15s. On 2021.10 at around 30s or so 18:00
18:02 reportable6 left
japhb Oh, interesting. So new-disp made it 2x slower, but you've managed to win that all back and then some. Very nice work, nine++! 18:04
dogbert17 I turned the flapping test, i.e. t/spec/S12-methods/lastcall.t, into github.com/MoarVM/MoarVM/issues/1580
japhb (Although I suppose it wasn't new-disp that made it slower, but that you had to rewrite some bits that no longer matched the rest of MoarVM, yes?) 18:05
nine It's even weirder than that. The major slow down comes from the trick with replacing the routine's $!do no longer working. Thing is, that I don't know why it worked before new-disp. I don't even really see a connection. 18:46
Anyway with dispatchers, the whole thing is on much more solid ground. Had a replacement for the trickery up within an hour or so as it's just another dispatcher. And JIT implementation is completely regular so far. 18:49
If you get the impression, that I love this architecture, you're spot on :)
lizmat ok, I think I have been able to golf my problem of earlier today down to about 30 lines without any external modules 19:00
19:02 linkable6 left, evalable6 left 19:04 linkable6 joined 19:05 reportable6 joined
japhb nine: nice! :-) 19:05
lizmat github.com/MoarVM/MoarVM/issues/15...-951223064 # golfed version of corruption issue 19:08
MasterDuke m: say -2**63; say abs -2**63; my int $a = -2**63; say $a; my int $b = abs $a; say $b # bug? or just DIHWIDT? 19:09
camelia -9223372036854775808
lizmat sanity check: we're allowed to have methods defined inside methods, no? 19:11
japhb MasterDuke: We handle native ints and uints incorrectly in some cases. That is one of them. My evaluation perusing the code is that we've been playing fast and loose with numbers near the size boundaries for a while, but in a LOT of places. It will take concerted cleanup work to get rid of all of those. :-( 19:15
MasterDuke true 19:16
japhb MasterDuke: It's the reason for stuff like this: github.com/japhb/CBOR-Simple/blob/...#L206-L214 19:17
MasterDuke but my understanding is that the example above is expected behavior for C (haven't tried/don't know about other languages) and i don't know if this is a case of incorrect behavior for raku
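For reference, here is a Python model of 64-bit two's-complement wraparound, which is one way an `abs` on the most negative native value can fail to produce a positive number. It only models the arithmetic and does not claim to reproduce what Rakudo or any particular C implementation does (the C standard leaves `llabs(LLONG_MIN)` undefined). The helper names `wrap_int64` and `native_abs` are hypothetical.

```python
INT64_MIN = -(1 << 63)
INT64_MAX = (1 << 63) - 1

def wrap_int64(n):
    """Reduce an arbitrary integer to signed 64-bit two's complement."""
    n &= (1 << 64) - 1
    return n - (1 << 64) if n > INT64_MAX else n

def native_abs(n):
    """abs() whose result is forced back into 64 bits."""
    return wrap_int64(abs(wrap_int64(n)))

print(native_abs(-5))                      # 5
print(native_abs(INT64_MIN) == INT64_MIN)  # True: 2**63 does not fit, so it wraps
```

The positive counterpart of -2**63 is 2**63, which is one past the largest signed 64-bit value, so in this model the result wraps back to -2**63.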
i've thought about resurrecting my attempt to make *_u version of ops, but it's another giant project and i should probably finish some of the other ones i have that are somewhat close to being done 19:19
japhb Boy, do I understand that thought.
19:19 ggoebel left
japhb has another one of those "What did I get myself into?" thoughts staring at vt100.net/emu/dec_ansi_parser 19:19
MasterDuke lizmat: fyi, your link for the "Thanks to VMIL2021" tweet just goes to the #rakulang search 19:26
lizmat argh 19:27
Fixed # MasterDuke++ 19:29
MasterDuke i can never remember, is there a way to tell if something is an int vs an Int in raku (or nqp)? 19:36
m: use nqp; say -2**63; my int $a = -2**63; say nqp::isbig_I(nqp::decont($a)); say nqp::isbig_I($a); # this doesn't seem right 19:41
camelia -9223372036854775808
This representation (NativeRef) cannot unbox to other types (for type IntLexRef)
in block <unit> at <tmp> line 1

MasterDuke m: use nqp; say -2**63; my int $a = -2**63; say nqp::isbig_I(nqp::decont($a)); # this doesn't seem right 19:42
camelia -9223372036854775808
MasterDuke m: use nqp; say -2**61; say abs -2**61; my int $a = -2**61; say nqp::isbig_I(nqp::decont($a)) # this doesn't seem right either 19:55
camelia -2305843009213693952
MasterDuke but i guess because it's expecting to take an object and just check if the value is a certain size, it's not really about int vs Int 19:57
19:57 ggoebel joined
MasterDuke m: use nqp; my Int $a = 2**30; say nqp::isint(nqp::decont($a)); say nqp::isint($a); 20:01
camelia 0
MasterDuke m: use nqp; my int $a = 2**30; say nqp::isint(nqp::decont($a)); say nqp::isint($a);
camelia 0
20:03 evalable6 joined
MasterDuke nqp: my int $a := 2**30; say(nqp::isint($a)); say(nqp::isint(~$a)) # oh, guess it's really only supposed to be used in nqp 20:04
camelia 1
nine japhb: I'd love to see our uint handling cleaned up 20:21
11.164s! thanks to 4 more lines getting us JIT support for passing VMArrays. That's actually the first time I'm reusing a bit of the old NativeCall JIT implementation 20:38
With this 100 % of the hot native calls in csv-ip5xs.pl are JIT compiled 20:40
MasterDuke nine: at github.com/MoarVM/MoarVM/commit/91...ea7dbR3922 are *_[sn] going to be added to the `case` later? 20:41
nine Yes they are still todo 20:42
I don't yet know how to deal with strings. Ironically csv-ip5xs doesn't actually pass strings, but encode them on the Raku side to a Buf and pass that (hence the support for VMArray I just implemented)
21:00 linkable6 left 21:01 linkable6 joined 21:02 linkable6 left, linkable6 joined 21:30 ggoebel left