#moarvm on 21 May 2021 - Raku Programming Language Log

github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018.
00:02 reportable6 left 00:04 reportable6 joined 01:04 lucasb left 01:06 frost-lab joined 01:33 ggoebel left 04:05 shareable6 left, reportable6 left, bloatable6 left, squashable6 left, releasable6 left, benchable6 left, linkable6 left, evalable6 left, committable6 left, nativecallable6 left, coverable6 left, tellable6 left, notable6 left, unicodable6 left, sourceable6 left, quotable6 left, greppable6 left, statisfiable6 left, bisectable6 left, nativecallable6 joined, sourceable6 joined, coverable6 joined 04:06 committable6 joined, notable6 joined, greppable6 joined, releasable6 joined, tellable6 joined, evalable6 joined, unicodable6 joined 04:07 bloatable6 joined, reportable6 joined, quotable6 joined, benchable6 joined, linkable6 joined 04:08 statisfiable6 joined, bisectable6 joined, shareable6 joined, squashable6 joined 05:37 statisfiable6 left, notable6 left, coverable6 left, greppable6 left, nativecallable6 left, bloatable6 left, shareable6 left, benchable6 left, releasable6 left, committable6 left, linkable6 left, unicodable6 left, sourceable6 left, bisectable6 left, quotable6 left, tellable6 left, evalable6 left, reportable6 left, squashable6 left 05:38 shareable6 joined, committable6 joined, evalable6 joined 05:39 unicodable6 joined, statisfiable6 joined, notable6 joined, nativecallable6 joined, releasable6 joined, linkable6 joined, greppable6 joined, quotable6 joined, tellable6 joined 05:40 bisectable6 joined, reportable6 joined, squashable6 joined, sourceable6 joined, bloatable6 joined, benchable6 joined, coverable6 joined 05:49 domidumont joined 06:02 reportable6 left 06:04 reportable6 joined 06:41 squashable6 left 06:44 squashable6 joined
nine	dogbert11: oh, that's interesting	06:45
nwc10	good ,	06:47
nine	That a segfault is connected to GC may (yet again) explain the seeming randomness of segfaults we see on CI
07:38 brrt joined
Nicholas	good *, brrt	07:40
brrt	good * Nicholas	07:54
08:07 sena_kun left 08:10 [Coke]_ joined 08:11 sena_kun joined 08:13 [Coke] left 08:21 hankache joined 08:37 zakharyas joined 08:57 Voldenet_ joined 08:58 Voldenet left
dogbert11	nine: now I 'only' have to catch it in the debugger :)	09:07
09:10 Voldenet_ is now known as Voldenet, Voldenet left, Voldenet joined
sena_kun	hi, folks	09:18
	how is the state of the revert revert revert commit? I remember it exposed some issues we had to address before the release; are they patched already, or should we do a re-re-re-revert as the release is tomorrow?
	also, if there are any new blockers, please share.	09:19
09:26 brrt left 10:15 ggoebel joined 10:23 ggoebel left 10:27 brrt joined 10:33 ggoebel joined
nine	sena_kun: AFAIK the commit is still in there. Have any issues come up?	10:34
sena_kun	nine, not yet, though I have no means to do a Blin run now as usual, so I was wondering if something has shown up on your (plural) side.	10:36
10:40 hankache left, hankache joined
nine	Blin would be mighty helpful...	10:40
sena_kun	:/
10:42 hankache left, hankache joined 10:59 hankache left
dogbert11	now I'm running with optimizations on, an 8k nursery and the gc debug flag set to two. It has now stopped, in gdb, with 'non-AsyncTask fetched from eventloop active work list'	11:00
	gist.github.com/dogbert17/e4a3993b...853d014649	11:02
	nine: is it possible to make something out of this or do we need to catch things earlier?	11:03
nine	dogbert11: the immediate question is: what _did_ it catch?	11:09
	So, good *s work here the same as on freenode? Checked
11:10 zakharyas left 11:35 brrt left
nine	Apparently a VMNull because the array slot work_idx is NULL	11:39
11:40 avar left 11:41 avar joined, avar left, avar joined, avar left 11:42 avar joined, avar left, avar joined, avar left 11:43 avar joined, avar left, avar joined
dogbert11	nine: (gdb) p REPR(task_obj)->ID	11:48
	value has been optimized out
	:(
nine	Yeah, you have to get it from the source: call MVM_repr_at_pos_o(tc, tc->instance->event_loop_active, work_idx)
	Or: p ((MVMArray*)(tc->instance->event_loop_active))->body.slots.o[1]	11:49
dogbert11	(gdb) p ((MVMArray*)(tc->instance->event_loop_active))->body.slots.o[1]
	$1 = (MVMObject *) 0x0
tbrowder	hi, working issue #1469 has led to needing a CFLAGS change for libuv that may conflict with other libs. a casual look at the build situation, and confirmed by MasterDuke17, shows all objects being built with same CFLAGS. seems to me we should compile 3rdparty libs with the same CFLAGS they use.	11:53
	would require an overhaul of build but it would be more robust for future 3rdparty libs	11:54
dogbert11	nine: in case you want to try teasing the error out, here's the 'golf': gist.github.com/dogbert17/8eded7bd...02c1781405
	I have also updated the Panic gist a bit, i.e. with some 'l' commands, your 'p' command and 'info threads'	11:57
nine	oh a golf. That's useful!
11:57 hankache joined
dogbert11	more like a bogey :)	11:57
	I'm running with 8k nursery and GC_DEBUG=1
nine	of course it refuses to break in rr	11:59
12:02 reportable6 left
nine	OTOH use Test can be removed from the golf	12:03
12:04 reportable6 joined 12:25 hankache left
nine	The segfault happens because when run-one is called args[1] is NULL	12:39
	The most curious thing about this is: since args[1] is a register it must not ever be NULL	12:43
dogbert11	so how can that happen?	12:45
	it sounds like you've managed to repro :)
12:47 brrt joined
nine	at SETTING::src/core.c/ThreadPoolScheduler.pm6:297 (/home/nine/rakudo/blib/CORE.c.setting.moarvm:)	12:58
	That's where the call happens	12:59
	And the NULL we get from nqp::shift($queue)
	Added an assert in ConcBlockingQueue's shift and it triggers
dogbert11	cool	13:00
13:14 brrt left, brrt joined 13:19 brrt left
lizmat	so it's shifting from the queue when it shouldn't? or another thread beat it to it ?	13:21
nine	No, the whole point of ConcBlockingQueue is that it's safe to use from different threads. It's just that somehow a NULL ends up in that queue. But in both unshift and push we explicitly guard against that	13:22
lizmat	so the number of elems is > 0 when the shift produces a NULL, so it really sits in the queue, is what you're saying ?	13:39
nine	yes	13:48
lizmat	is it clear if the value got produced by a push or an unshift ?	13:49
	also: you said: "it's safe to use from different threads"	13:50
	are we 200% sure of that ?
	because if the guard in unshift / push is correct, the only other way I see is that another thread snatched it and thus you're looking at element #1 really, and if there is none left, that'd be a NULL ?	13:51
nine	Well it's meant to be thread safe. Of course the implementation may have bugs	13:52
lizmat	well, if it walks like a duck and talks like a duck (aka , push and unshift have guarded against NULL entry)	13:53
jnthn	The bugs there in the past have always been about GC handling around the lock acquisitions
lizmat	it can only be a duck (aka, a race on the queue.shift)	13:54
jnthn	At least, those I can remember have :)
nine	Well this bug seems to require a small nursery to reproduce, so maybe there's yet another GC handling issue there	13:56
	Well the node got into the queue via push and it definitely had a value back then	14:02
dogbert11	(gdb) bt
	#0 MVM_panic (exitCode=0, messageFormat=0x0) at src/core/exceptions.c:853
	#1 0x00007ffff78d85d2 in gc_mark (tc=0x7fffe00d42e0, st=0x5555555b5178, data=0x5555576392e8, worklist=0x7fffdc1cbec0) at src/6model/reprs/MVMCode.c:48	14:03
	#2 0x00007ffff7896c99 in MVM_gc_mark_collectable (tc=0x7fffe00d42e0, worklist=0x7fffdc1cbec0, new_addr=0x5555576392d0) at src/gc/collect.c:439
	#3 0x00007ffff7890a40 in MVM_gc_root_add_gen2s_to_worklist (tc=0x7fffe00d42e0, worklist=0x7fffdc1cbec0) at src/gc/roots.c:349
	#4 0x00007ffff7893870 in MVM_gc_collect (tc=0x7fffe00d42e0, what_to_do=1 '\001', gen=0 '\000') at src/gc/collect.c:155
	#5 0x00007ffff788766f in run_gc (tc=0x7fffe00d42e0, what_to_do=1 '\001') at src/gc/orchestrate.c:443
	#6 0x00007ffff78882e4 in MVM_gc_enter_from_interrupt (tc=0x7fffe00d42e0) at src/gc/orchestrate.c:728
	Adding pointer %p to past fromspace to GC worklist	14:05
	nine: should I do a MVM_dump_backtrace(tc) or something else	14:07
nine	Can you have a look at what that collectable actually is?	14:08
dogbert11	48 MVM_gc_worklist_add(tc, worklist, &body->outer); is it body->outer we want?
14:11 gugod joined
nine	Or even body itself since that's the one containing the outdated pointer. What code object is it?	14:11
dogbert11	(gdb) p *body	14:12
	$3 = {sf = 0x55555741f070, outer = 0x7fffdc22cbb8, code_object = 0x0, name = 0x555556d1c110, state_vars = 0x0, is_static = 1, is_compiler_stub = 0}
nine	name and sf->body.name are of interest	14:13
dogbert11	so how do I get an MVMString to something readable?	14:15
nine	MVM_dump_string(tc, string)
dogbert11	thx
nine	Or if it's not a debug build MVM_string_utf8_maybe_encode_C_string(tc, string)	14:16
dogbert11	I'll try that as well	14:17
	(gdb) p MVM_string_utf8_maybe_encode_C_string(tc, body.name)
	$8 = 0x7fffdc5b49b0 ""
	(gdb) p MVM_string_utf8_maybe_encode_C_string(tc, body->name)
	$9 = 0x7fffdc151dd0 ""
	(gdb)
	I'm probably doing something wrong but it seems to be the empty string	14:23
nine	How on earth? It looks like we're pushing the same MVMConcBlockingQueueNode onto two different queues! A poll on the one queue sets the node's value to NULL (when it becomes the new dummy head node) and a shift on the other queue then finds the broken node	14:27
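[Editor's sketch] The failure mode nine suspects at 14:27 can be illustrated with a small, single-threaded C model of a linked queue that keeps a dummy head node. This is only a sketch under stated assumptions: `Node`, `Queue`, `queue_push_node`, `queue_take` and `shared_node_demo` are hypothetical names invented here, there is no locking, and none of it is MoarVM's actual ConcBlockingQueue code. It shows only why one node linked into two queues would make the second queue hand out NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not MoarVM's real structures. */
typedef struct Node {
    void        *value;
    struct Node *next;
} Node;

typedef struct Queue {
    Node *head; /* dummy node; the first real element is head->next */
    Node *tail;
} Queue;

static Node *node_new(void *value) {
    Node *n = calloc(1, sizeof(Node));
    assert(n != NULL);
    n->value = value;
    return n;
}

static void queue_init(Queue *q) {
    q->head = q->tail = node_new(NULL);
}

/* Link an already allocated node at the tail; NULL values are refused. */
static void queue_push_node(Queue *q, Node *n) {
    assert(n->value != NULL);
    q->tail->next = n;
    q->tail = n;
}

/* Take the first element. The node that held it becomes the new dummy
 * head, so its value slot is cleared. Returns NULL if the queue is empty. */
static void *queue_take(Queue *q) {
    Node *old   = q->head;
    Node *first = old->next;
    void *value;
    if (first == NULL)
        return NULL;
    value        = first->value;
    first->value = NULL; /* first is now the dummy head */
    q->head      = first;
    free(old);
    return value;
}

/* Returns 1 if sharing one node between two queues makes the second
 * take observe NULL even though a non-NULL value was pushed. */
static int shared_node_demo(void) {
    static int payload = 42;
    Queue a, b;
    Node *n = node_new(&payload);
    void *first, *second;

    queue_init(&a);
    queue_init(&b);
    queue_push_node(&a, n);
    queue_push_node(&b, n);  /* the suspected bug: same node, second queue */

    first  = queue_take(&a); /* gets &payload and clears n->value */
    second = queue_take(&b); /* finds n linked in, but its value is gone */

    free(n);                 /* n is now the dummy head of both queues */
    return first == &payload && second == NULL;
}
```

In this model the guard in `queue_push_node` passes for both pushes, which matches the observation that the push and unshift guards never fire even though a NULL later comes out of the queue.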
14:27 domidumont left
dogbert11	oops	14:29
14:29 domidumont joined 14:41 zakharyas joined 14:46 frost-lab left 14:48 lucasb joined
nine	It gets weirder: even after replacing the FSA with plain calloc, not freeing the nodes at all anymore and commenting out the NULL assignment, I still get NULLs in node values	14:49
dogbert11	the plot thickens, will this be a one line fix	14:56
nine	I fear it will be a fix at all only when I manage to reproduce in rr. Because I'm running out of ideas. There's just no code left that would overwrite a queue node's value with NULL	14:59
dogbert11	and rr is not cooperating	15:09
tbrowder	seems embarrassing to use python in our tool chain	15:39
nine	feel free to change that :)
15:42 nevore joined
nine	This just doesn't make sense. It's always the ConcBlockingQueueNode's value that suddenly turns into NULL, while its next pointer stays intact. So it's a very precise change.	16:18
	It's probably not a random memory overwrite as nothing else seems to get hit and when I replace usage of the FSA with malloc that would surely change the behavior as we're talking about different memory areas. But it stays the same	16:19
	But ConcBlockingQueueNodes are only used and modified in src/6model/reprs/ConcBlockingQueue.c and I already removed all setting to NULL	16:20
	So what's left?
16:37 domidumont left 16:46 nevore left
Geth	MoarVM: tbrowder++ created pull request #1497: Define _GNU_SOURCE for GNU builds	16:52
17:08 cog left 17:09 cog joined 17:11 [Coke] joined 17:19 ggoebel left 17:20 [Coke]_ left 18:02 reportable6 left, reportable6 joined 18:16 Altreus left 18:51 MasterDuke joined 18:52 zakharyas left 19:14 linkable6 left, linkable6 joined 19:15 linkable6 left, tellable6 left, evalable6 left, shareable6 left 19:16 tellable6 joined, evalable6 joined 19:17 linkable6 joined 19:18 shareable6 joined, linkable6 left 19:21 linkable6 joined 19:27 shareable6 left, nativecallable6 left, evalable6 left, linkable6 left, greppable6 left, bisectable6 left, unicodable6 left, reportable6 left, squashable6 left, benchable6 left, statisfiable6 left, committable6 left, sourceable6 left, bloatable6 left, releasable6 left, coverable6 left, quotable6 left, tellable6 left, notable6 left 19:30 [Coke] is now known as {Coke}, {Coke} is now known as [Coke] 19:45 zakharyas joined 19:47 nativecallable6 joined 19:48 bisectable6 joined, notable6 joined, sourceable6 joined, releasable6 joined 19:49 squashable6 joined, coverable6 joined, evalable6 joined, tellable6 joined, greppable6 joined, committable6 joined, shareable6 joined, quotable6 joined 19:50 reportable6 joined, benchable6 joined, bloatable6 joined, unicodable6 joined, linkable6 joined, statisfiable6 joined
Geth	MoarVM: tbrowder++ created pull request #1498: Quell compiler warnings on Linux with gcc	19:59
tbrowder	nine: see last PR, two uninitialized values giving warnings about vfork and jumps	20:01
20:13 MasterDuke left, MasterDuke joined 20:19 zakharyas left
MasterDuke	just got a segfault in t/spec/S17-lowlevel/cas.t with only change being an 8k nursery	20:19
	haven't been able to catch it in rr though	20:25
	ran it under rr ~250 times, but never an error of any kind	20:33
dogbert11	MasterDuke: I got it as well	20:38
	0x00007ffff79b9e3b in evaluate_guards (gs=0x555558c0cac8, gs=0x555558c0cac8, callsite=0x555558c0cac8, guard_offset=0x7fffeea5ab66, tc=0x7fffe00d6ea0) at src/spesh/plugin.c:85
	85 outcome = STABLE(test) == gs->guards[pos].u.type;
MasterDuke	interesting	20:39
20:52 [Coke] left 21:53 lucasb left 21:54 kawaii left 21:59 kawaii joined, lucasb joined 22:02 ggoebel joined 23:34 evalable6 left, squashable6 left 23:36 evalable6 joined 23:37 squashable6 joined
