Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
00:02 reportable6 left 00:04 reportable6 joined
timo should probably decide if arg spesh should consider NativeRef valid for param_rp_i and friends 00:08
00:26 squashable6 left 00:38 squashable6 joined 00:46 squashable6 left 00:49 squashable6 joined 01:49 greppable6 left, evalable6 left, quotable6 left, tellable6 left, squashable6 left, bloatable6_ left, releasable6 left, coverable6 left, notable6 left, reportable6 left, linkable6 left, nativecallable6 left, bisectable6_ left, statisfiable6 left, shareable6_ left, unicodable6_ left, committable6_ left, benchable6 left, sourceable6_ left, evalable6 joined, shareable6 joined 01:50 quotable6 joined, nativecallable6 joined 01:51 reportable6 joined, statisfiable6 joined 01:52 committable6 joined 02:49 greppable6 joined, bloatable6 joined, linkable6 joined 02:50 unicodable6 joined, tellable6 joined, bisectable6 joined 02:51 sourceable6 joined, benchable6 joined 02:52 notable6 joined, ggoebel__ left, squashable6 joined 03:04 ggoebel__ joined 03:49 releasable6 joined 03:51 coverable6 joined 04:17 squashable6 left 04:19 squashable6 joined 04:20 squashable6 left 04:22 squashable6 joined 04:55 squashable6 left 05:57 squashable6 joined 06:02 reportable6 left
Nicholas good *, #moarvm 07:41
07:44 moon-child left, moon-child joined 07:58 ggoebel__ left
lizmat PSA: the Rakudo Weekly News will be published on Tuesday this week due to circumstances 08:57
nine good circumstances, #moarvm 09:02
09:03 reportable6 joined
lizmat ah, yes, good circumstances :-) 09:11
09:24 brrt joined
Nicholas good *, brrt 09:25
brrt good * Nicholas 09:52
Nicholas nine: were all the big endian systems that new-disp failed to build on for you 64 bit? 10:19
10:37 sena_kun joined
nine Nicholas: not sure about the armv7l: build.opensuse.org/project/show/ho...rakudo-git 10:44
But then that seems to have a different failure mode: it seems to get stuck in an endless loop
Nicholas that will be 32 little endian 10:45
nine So definitely something different then :)
Good news btw.: I've successfully used a dispatcher to get NativeCall's performance back. Still using those generated function bodies, but it's a step in the right direction. 10:46
Nicholas OK, I asume that s390x is. ppc64 certainly is 10:47
nine It even uncovered a bug that has been there for years, but wasn't visible because previously the first call to a native sub would always use the generic code and only use the generated function for the following calls
s390x is, I checked that when I first reported it here 10:48
Nicholas sparc64 fails in the same was as ppc64. So that didn't reveal any (earlier) alignment failures that ppc64 fakes around
ppc32 fails differently, but I can't build with ASAN or run with valgrind there
I also hacked all the spesh unions on x86_64 to add padding to put struct members in places like they would be on big endian, and nothing breaks 10:49
so, it's not an error of "using the wrong size inconsistently"
also, I dsabled the CGOTO runloop and we don't get any out of range ops
*that* is strange
we are failing on an opcode stream that only has in-range ops
but has an out of range value for a register passed to one op 10:50
and then deprecated ops
nine What does the dump of the failing bytecode look like?
Nicholas I don't know enough to know how to do that
nine Breakpoint on the NYI op, then: call MVM_dump_bytecode(tc) 10:51
Nicholas I assume also I need to build on master (which also failed, but differently) and then do this
I don't need a breakpoint. I just hack the C source to do that!
nine or that :)
Nicholas need master because you or timo fixed something about the dumper, IIRC
nine I added support to sp_resumption a week ago 10:52
Nicholas cool. the commit I'm testing doest' have that 10:53
10:56 brrt left
Nicholas OK, here's a structural question that I'm not confident that I know the answer to 10:58
where does the endian swapping happen for bytecode? In the bytecode valdiator?
IIRC the specialised bytecode is written to RAM and then read back from RAM, but the writer does the big-to-little swap. And I forget where the little-to-big swap is 11:00
nine Isn't that in memcpy_endian in src/core/bytecode.c? 11:01
Nicholas bytecode dump: paste.scsys.co.uk/595948 11:03
and I don't get why I don't see const_n32 in that dump 11:04
nine You're getting "const_n32 NYI"? 11:05
Nicholas yes, that's the error message somewhere near the end of that paste
hacked like this: 11:06
OP(const_n32):
+ MVM_dump_bytecode(tc);
MVM_exception_throw_adhoc(tc, "const_n32 NYI");
nine The "unknown type 0" messages are a bit worrysome. That usually means either broken bytecode or a bug in the bytecode dumper.
Nicholas I'm assuming the former until we rule it out
and memcpy_endian() etc in bytecode.c only seem to deal with the frame reads. There is no read_int64 in that file to swap 64 bit things in the bytecode stream 11:07
oh pants
I'm wrong
no, I might not be. that's extops that are using memcpy, (and not endian-swapping. odd?) 11:09
nine Might explain why it only happens in rakudo, not nqo 11:10
nqp
Nicholas ooh interesting yes
nine But then, why is it only an issue now? 11:11
And why is the commit introducing sp_assertparamcheck involved then?
IIRC you even said that ok is always 1 anyway 11:12
Nicholas I think I need to go and eat lunch before any more thinking is possible. 11:14
11:58 ggoebel joined 12:03 reportable6 left, brrt joined 12:06 reportable6 joined
Nicholas OK, I can't work out whether commit 51d08b5fb4b09bd75008759f60a8c9fcb5433a4b ends up missing some endian swapping 12:11
nine: different strangeness: paste.scsys.co.uk/595949 12:23
the bytecode dump is triggered by the NYI op (I think) but the NYI op doesn't feature in any code backtrace. So I'm confused. But sp_assertparamcheck does feature.
nine: given that this is the "answer", please could you tell me what the question is :-) paste.scsys.co.uk/595950 12:40
disabling case MVM_OP_assertparamcheck: in optimize_bb_switch() appears to be the minimal solution 12:41
so, the bug would seem to be in either the setup for sp_assertparamcheck or the runtime (or a failed endian swap back inbetween the two) 12:42
and I don't know how to spot which of these it is
but the attack surface is small now.
and ./rakudo-m -Ilib t/02-rakudo/03-cmp-ok.t fails as soon as I restore the case MVM_OP_assertparamcheck 12:56
13:03 brrt left 13:17 brrt joined 13:43 ggoebel left, ggoebel joined
nine Nicholas: is ok still always 1? 13:50
Nicholas er, I don't know. but wait a 'mo
paste.scsys.co.uk/595951
the slot argument is not geting endian swapped (enough)
"this fixes it" is true for t/02-rakudo/03-cmp-ok.t for whatever revision I was testing 13:51
there is at least one more bug by master
nine If this fixes it, then it's needed in a few other places as well. 13:52
Nicholas this is my assumption "few other places" 13:53
but I think that this isn't the right fix
nine And then everything would make sense again. Because I used those other places as argument for why it should be correct here. But if it was wrong everywhere...
Nicholas I infer that something else (that swaps args around on writing) isn't doing for this case
or something else that swaps around on reading.
the reading swap-around is done implicitly by ensure_bytes(...) 13:54
so if there isn't ensure_bytes(...) for some categories of arguments, no swap will happen
but I don't know enough about the code paths to spot them
nine validate_literal_operand only seems to consider signed operand types. But then it whould throw an exception for unknown ones if an unsigned one appears 13:58
Nicholas yes, this is where I'm stuck too
nine Do we validate speshed bytecode? And if yes, do we validate operands of speshed instructions? 14:01
Nicholas "not know"
(as my son no longer puts it)
we do not reach that fail(...) because if I put an abort() just before it no abort happens 14:04
need to go AFK for a bit
lizmat also afk for a few hours& 14:11
nine too 14:13
Nicholas jnthnwrthngtn: I think that we need your help/insight on this^^ 14:44
15:24 brrt left 15:31 Kaipi left 15:51 brrt joined 16:04 Kaiepi joined 16:16 brrt left
[Coke] MasterDuke: ETA on unbreaking the build? 16:20
(at least for windows)
MasterDuke in my PR? 16:21
Nicholas nine: next strange thing - once I remove the bswap_32 and get back to breakage. I have a breakpoint on OP(sp_assertparamcheck): 17:03
*both* times MVMint64 ok = GET_REG(cur_op, 0).i64; is *true* (ie we hit the else, so just do cur_op += 6) 17:04
and immediately we hit bad bytecode
er, not immediately
we next arrive at my breakpoint in MVM_dump_bytecode 17:05
OK 17:11
nine: 6 != 8
your insight that the bytecode stream was off was correct. 17:12
17:17 jdv left 17:18 jdv joined
[Coke] Do we not get new tickets announced in here? github.com/rakudo/rakudo/issues/4549 17:18
MasterDuke that's odd, github shows that everything built ok on moarvm's last commit 17:20
[Coke] I will recheck. 17:21
Nicholas (also your question of "is ok true or false?" was correct)
17:26 sena_kun left
nine Nicholas: why 8? 17:36
Nicholas r(int64) is 2, sslot is 2, uint32 is 4 17:37
on a little endian machine, the opcode pointer ends up in the second half of the uin32 literal, which is (usually) 0
so it's executing a no-op 17:38
nine That does make a lot of sense of course. But why is it only broken on some architectures then?
Nicholas then resyncing with correctness
nine OMG
That's devious
Nicholas oh yes ;-)
assuming I have this correct, this is one for the weekly
nine I really didn't consider the possiblity that it's broken everywhere
Nicholas IIRC once upon a time Leo did a hack with parrot where he de-synced the instruction pointer
but I only remember that *after* figuring out that 6 isn't 8
OK, ppc32 on my triplet of commits is at stage optimize 17:39
ppc64 hasn't bombed yet, so might well be OK
forget what it's on
MasterDuke do we need to check all the ops?
Nicholas sparc64 on master/master/master isn't at CORE.c.setting yet
MasterDuke: I don't know. It might be just this one
hence why sparc64 is on master/master/master 17:40
it's *nothing* to do with incorrect endian fixups
MasterDuke the very next one, sp_bindcomplete, doesn't have the same problem? 17:41
Nicholas I will find out soon
I have rather too many checkouts on the go currently 17:42
anyway, we now know a new subtle failure mode
and a bit of wonders if I can conditionally define noop to be something other than 0
and have op 0 be "halt and catch fire (with a backtrace)" 17:43
and how many places would have to explicitly set memory to "not zero any more" to avoid false positives
ppc64 happy with whatever checkout it has 17:44
it's doing ASAN
MasterDuke i started watching halt and catch fire when i was back in the us recently, but annoyingly now i can't continue (without using a vpn) here in the uk 17:45
[Coke] windows build still dying on perl6_ops_moar.dll in rakudo with everything at master 17:48
Nicholas [Coke]: I realise that this is bad, but I have no idea how to help solve it 17:49
well, other than the commits 934ff8258ef8c51e2cb5b2e7372308078cc6457d / d2dae1396cd5d2b8a326394ff169e470ee5dc910 / 1355f036a1968d3eed62f0e16dc7390fe6f86e13 were good together 17:50
17:51 linkable6 left 17:53 linkable6 joined
Nicholas yes, at least one more bug exists 17:55
17:57 brrt joined
Nicholas MasterDuke: I think that you're right. 17:58
it's audit time!
18:01 vrurg_ joined, brrt` joined 18:02 reportable6 left, vrurg left 18:03 brrt left
vrurg_ .ask jnthnwrthngtn Is there any reason to have a unique per-compunit GLOBAL at compile time and only install the primary one for run-time? I think we have an error caused by this design. 18:09
.ask jnthnwrthngtn Is there any reason to have a unique per-compunit GLOBAL at compile time and only install the primary one for run-time? I think we have an error caused by this design. 18:10
tellable6 vrurg_, I'll pass your message to jnthnwrthngtn
18:42 vrurg_ is now known as vrurg 19:00 patrickb joined 19:12 discord-raku-bot left 19:14 discord-raku-bot joined 19:24 brrt` left 19:28 childlikempress joined, moon-child left
Geth MoarVM/new-disp-big-endian-fix: b955a95a7c | (Nicholas Clark)++ | src/core/interp.c
Correct the cur_op increment in sp_assertparamcheck and sp_bindcomplete

These two spesh ops had been leaving cur_op 2 bytes *before* the start of the next op. However, this wasn't noticed (yet) on little endian systems because the last part of each op was a uint32 literal which (for all code compile during the build and spectest) was less than 65536. Hence on
  *little* endian systems the last two bytes were always zero, which the
... (8 more lines)
19:47
MoarVM: nwc10++ created pull request #1557:
Correct the cur_op increment in sp_assertparamcheck and sp_bindcomplete
19:48
Nicholas I believe that the fix is good. There may well be typos etc in the commit message 19:49
lizmat: I think the "what went wrong" in github.com/MoarVM/MoarVM/pull/1557 might be amusing for the weekly
20:05 reportable6 joined 20:37 patrickb left 20:51 childlikempress is now known as moon-child
Geth MoarVM: b955a95a7c | (Nicholas Clark)++ | src/core/interp.c
Correct the cur_op increment in sp_assertparamcheck and sp_bindcomplete

These two spesh ops had been leaving cur_op 2 bytes *before* the start of the next op. However, this wasn't noticed (yet) on little endian systems because the last part of each op was a uint32 literal which (for all code compile during the build and spectest) was less than 65536. Hence on
  *little* endian systems the last two bytes were always zero, which the
... (8 more lines)
21:15
MoarVM: 4859be3546 | niner++ (committed using GitHub Web editor) | src/core/interp.c
Merge pull request #1557 from MoarVM/new-disp-big-endian-fix

Correct the cur_op increment in sp_assertparamcheck and sp_bindcomplete
nine Nicholas++ # excellent find
[Coke] indeed 21:16
vrurg nine: if you have time to have a look at github.com/rakudo/rakudo/pull/4538, or I could just merge it and see later if blin complains. :) 21:21
21:53 evalable6 left, linkable6 left 21:55 linkable6 joined 21:56 evalable6 joined 23:35 evalable6 left, linkable6 left