#moarvm on 26 January 2021 - Raku Programming Language Log

github.com/moarvm/moarvm \| IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018.
00:40 Kaiepi joined 00:59 lucasb left 03:50 rba left, rba joined 05:23 [Coke] left 05:26 [Coke] joined, [Coke] left, [Coke] joined 06:32 evalable6 left, linkable6 left, linkable6 joined 06:33 evalable6 joined 08:01 domidumont joined 08:42 MasterDuke joined 08:44 zakharyas joined
nwc10	good *, #moarvm	09:53	Copy link Message link Add to gist Remove
jnthn	morning o/	10:31	Copy link Message link Add to gist Remove
nwc10	\o		Copy link Message link Add to gist Remove
jnthn	Fresh snow! Fresh coffee! Life could be worse.	10:32	Copy link Message link Add to gist Remove
nwc10	We had a very small amount of snow overnight.		Copy link Message link Add to gist Remove
	But our here in the flatlands there was always less snow than up in the hills where we first lived		Copy link Message link Add to gist Remove
jnthn	I had to check it, but it turns out Vienna is some couple of hundred meters higher than Prague	10:38	Copy link Message link Add to gist Remove
moritz	oh, Austria had the high ground. That explains a lot, in hindsight :-)	10:41	Copy link Message link Add to gist Remove
MasterDuke	we actually got about an inch (enough and wet enough to make a snowman) two days ago and it's been just barely cold enough that it's sort of stuck around	10:48	Copy link Message link Add to gist Remove
nine	Almost all the snow here melted over the weekend, but yesterday night it started snowing again and everything's white now :) Though that will last only a couple more days. Will get much warmer	10:59	Copy link Message link Add to gist Remove
nwc10	had a very small amount of snow. "Strange bright thing in the sky" has seen to it		Copy link Message link Add to gist Remove
	I though it likely that you were going to report stuff that makes us all a bit jealous :-)	11:00	Copy link Message link Add to gist Remove
11:07 MasterDuke left 11:09 MasterDuke joined 12:22 zakharyas left 12:52 Voldenet left 12:54 Voldenet joined, Voldenet left, Voldenet joined
[Coke]	We've had a dip in temperatures, but no snow expected until later today.	13:29	Copy link Message link Add to gist Remove
	(below freezing most of the week)		Copy link Message link Add to gist Remove
MasterDuke	nine: i set a watchpoint on the address in frame->work[i].o, but nothing jumped out at me. a bunch of str and vmarray functions in the backtraces, but still nothing in any of the code i touched	13:31	Copy link Message link Add to gist Remove
13:34 zakharyas joined
MasterDuke	last time jnthn spotted some missing MVM_ASSIGN_REFs and some that needed changing. maybe he can see something in the new diff gist.github.com/MasterDuke17/fd236...07aad7722f (this is implementing the removal via deopt/log/stats/plan/worker instead of directly via deopt)	13:35	Copy link Message link Add to gist Remove
jnthn	Maybe when I escape from a somewhat involved rebase...	13:45	Copy link Message link Add to gist Remove
nine	MasterDuke: only one watchpoint?	13:48	Copy link Message link Add to gist Remove
MasterDuke	jnthn: yeah, no worries		Copy link Message link Add to gist Remove
	nine: yep, should there be more?	13:49	Copy link Message link Add to gist Remove
nine	Yes. I meant you need to trace that value (the pointer to the outdated object) back to the place where it gets outdated. That means that you will have to set watchpoints on every place where the pointer gets stored		Copy link Message link Add to gist Remove
MasterDuke	oh, ah...	13:50	Copy link Message link Add to gist Remove
	right		Copy link Message link Add to gist Remove
nine	I.e. first on frame->work[i].o. There you see who writes to the register. Clear that watchpoint and set one on where the source is (may be a different register or a local variable)		Copy link Message link Add to gist Remove
13:53 lucasb joined 14:12 brrt joined
MasterDuke	fyi, not terribly surprised, but no problems building nqp and rakudo if i comment out the call to MVM_spesh_candidate_discard_one in worker.c	14:14	Copy link Message link Add to gist Remove
jnthn	Hmm	14:39	Copy link Message link Add to gist Remove
	Just been rebasing new-disp, and see a failure in the dispatch tests: MoarVM oops: MVM_fixed_size_alloc request for 0 bytes	14:40	Copy link Message link Add to gist Remove
	Anyone know if something changed there?		Copy link Message link Add to gist Remove
nwc10	yes		Copy link Message link Add to gist Remove
	yes, I know, yes we did, but forget exactly what		Copy link Message link Add to gist Remove
	but there was a reason to add a sanity check	14:41	Copy link Message link Add to gist Remove
	commit 60070970c4a74627c60ac11a3d3fd111d8f7cad2		Copy link Message link Add to gist Remove
14:41 linkable6 left
nwc10	oh, because "we corrupt the malloc heap"	14:42	Copy link Message link Add to gist Remove
14:43 linkable6 joined
jnthn	Oh?	14:45	Copy link Message link Add to gist Remove
	malloc also doesn't care for zero size allocations though, I thought..		Copy link Message link Add to gist Remove
nwc10	IIRC it's implementation defined what you get - a pointer (to nothign you can use) or NULL	14:46	Copy link Message link Add to gist Remove
	however, I suspect there was somethign else going on here		Copy link Message link Add to gist Remove
	but I forget why that commit got found		Copy link Message link Add to gist Remove
jnthn	I'm not even sure I'd agree the caller is confused, it just forces more conditionals on the caller side		Copy link Message link Add to gist Remove
	Anyway, will track it down and adapt	14:47	Copy link Message link Add to gist Remove
nwc10	nine might remember		Copy link Message link Add to gist Remove
nine	POSIX says: If the size of the space requested is 0, the behavior is implementation-defined: either a null pointer shall be returned, or the behavior shall be as if the size were some non-zero value, except that the behavior is undefined if the returned pointer is used to access an object.		Copy link Message link Add to gist Remove
nwc10	OK, that's what I remembered and paraphrased		Copy link Message link Add to gist Remove
nine	I guess the FSA could just return NULL and properly ignore requests to free an area of 0 bytes	14:48	Copy link Message link Add to gist Remove
nwc10	I really can't remember. I remember that buggy code could end up calling MVM_fixed_size_alloc with 0	14:50	Copy link Message link Add to gist Remove
	"be happy" that it got a non-NULL pointer	14:51	Copy link Message link Add to gist Remove
	and then did something daft that must have corrupted the heap		Copy link Message link Add to gist Remove
	MVM_fixed_size_alloc_zeroed() called with a size of 0 would have ended up calling memset(..., 0, 0) which likely is undefined behaviour if ... is NULL		Copy link Message link Add to gist Remove
	I really can't remember, other than it felt like the better choice at the time, to deal with the observed problems	14:52	Copy link Message link Add to gist Remove
jnthn	It was easy to adapt to it	14:53	Copy link Message link Add to gist Remove
	So far as I understand the malloc rule, it's implementation defined exactly what it does, but a malloc(0) and later free of the thing you got back should be consistent	14:54	Copy link Message link Add to gist Remove
nwc10	that was my understanding too		Copy link Message link Add to gist Remove
jnthn	OK, NQP tests in new-disp are happy again	15:03	Copy link Message link Add to gist Remove
	Something looks less good in the Rakudo tests on a rebased new-disp...		Copy link Message link Add to gist Remove
	CALL-ME handling NYI in new dispatcher; got NQPRoutine		Copy link Message link Add to gist Remove
	Hmm		Copy link Message link Add to gist Remove
	Probably a Rakudo change rather than a mis-rebase	15:05	Copy link Message link Add to gist Remove
Geth	MoarVM/new-disp: 168 commits pushed by (Jonathan Worthington)++, (Timo Paulssen)++, (Daniel Green)++ review: github.com/MoarVM/MoarVM/compare/0...03ba30e786		Copy link Message link Add to gist Remove
jnthn	Now let's see if I can get this thing done without too much more moving under me :)	15:09	Copy link Message link Add to gist Remove
nwc10	maybe you can do something to amuse ASAN, because right now it doesn't find the rebased new-disp at all exciting. (a good thing)	15:48	Copy link Message link Add to gist Remove
jnthn	Dispatch resumption has enough moving parts that there's every chance!	15:49	Copy link Message link Add to gist Remove
nwc10	fun, excitement, technicolor bactraces		Copy link Message link Add to gist Remove
MasterDuke	nine: just to be explicit, this is what you meant? gist.github.com/MasterDuke17/462f7...9152947165	15:54	Copy link Message link Add to gist Remove
nine	MasterDuke: in principle yes. But it looks like you took a wrong turn. Initially you see 0x55963e330140, that's the address of the broken object. But after the second reverse-cont you get a change from 0x55963e2f4840 to 0x0. That's a different address, so you watched the wrong place	15:58	Copy link Message link Add to gist Remove
	I think you wanted watch -l ((MVMString )((char )GET_REG(cur_op, 2).o + GET_UI16(cur_op, 4)))		Copy link Message link Add to gist Remove
	in line 49		Copy link Message link Add to gist Remove
MasterDuke	ok, thanks	15:59	Copy link Message link Add to gist Remove
nine	As this shows, one needs to be extra careful as its super easy to get a wrong result. I usually double-check every step and also print the addresses of the memory locations and the contents of index variables like the i in frame->work[i].o	16:00	Copy link Message link Add to gist Remove
	That way its also easy to re-instate a watch, like watch -l 0x55963e330140	16:01	Copy link Message link Add to gist Remove
MasterDuke	made that change and the immediate next break is __memmove_avx_unaligned, so i assume i actually want whatever is in frame 1 (process_worklist)?	16:02	Copy link Message link Add to gist Remove
nine	Also printing the result of e.g. GET_UI16(cur_op, 4) makes it easier to correlate with the output of MVM_dump_bytecode(tc)		Copy link Message link Add to gist Remove
	what's the backtrace at that point?		Copy link Message link Add to gist Remove
MasterDuke	gist updated	16:03	Copy link Message link Add to gist Remove
nine	Ah, yes, that's memcpy(new_addr, item, item->size) in the GC, i.e. the garbage collector moved the object that contains the faulty pointer.	16:05	Copy link Message link Add to gist Remove
16:06 MasterDuke left, MasterDuke joined
MasterDuke	so watch item?	16:07	Copy link Message link Add to gist Remove
nine	Again, in principle yes. But the thing you're interested in is not item itself, but an attribute in item. What kind of object is item in the first place?	16:11	Copy link Message link Add to gist Remove
	The first hit is in sp_get_s, i.e. get a string (from an attribute) at a fixed offset		Copy link Message link Add to gist Remove
MasterDuke	isn't it an iter?	16:13	Copy link Message link Add to gist Remove
	MVMIter		Copy link Message link Add to gist Remove
nine	does that contain a string?	16:14	Copy link Message link Add to gist Remove
MasterDuke	not exactly. `struct { MVMStrHashIterator curr; MVMStrHashIterator next; } hash_state;`	16:15	Copy link Message link Add to gist Remove
nine	In that case it may not be the usual GC issue at all and instead is a mismatch between the code and the data		Copy link Message link Add to gist Remove
	Which considering you're working with spesh candidates is not that far fetched. Could it be that you're running the wrong spesh candidate with respect to the frame's registers?	16:16	Copy link Message link Add to gist Remove
MasterDuke	given that this only happens after removing a spesh candidate, seems possible	16:17	Copy link Message link Add to gist Remove
	gist.github.com/MasterDuke17/fd236...h-L98-L164 is what causes this		Copy link Message link Add to gist Remove
nine	Have you checked if it's running a removed spesh candidate?	16:19	Copy link Message link Add to gist Remove
MasterDuke	hm. that would have made sense to do. not sure exactly how to do that	16:22	Copy link Message link Add to gist Remove
nine	I guess the most straight forward way is to overwrite the memory before freeing it. Same as I did for GC_DEBUG 3	16:25	Copy link Message link Add to gist Remove
	You will notice pretty fast if it's trying to treat 0xfefefefefefefefefefefe as bytecode :)	16:26	Copy link Message link Add to gist Remove
MasterDuke	good idea	16:28	Copy link Message link Add to gist Remove
17:02 sxmx left 17:07 sxmx joined
MasterDuke	segfault instead of MVM_panic	17:17	Copy link Message link Add to gist Remove
nine	that's good!		Copy link Message link Add to gist Remove
	Why does it segfault exactly? Just to be sure		Copy link Message link Add to gist Remove
MasterDuke	running it under rr now to find out	17:21	Copy link Message link Add to gist Remove
	i re-ran the entire nqp build, looks like the segfault was in a different, earlier step	17:22	Copy link Message link Add to gist Remove
	this is not fast	17:23	Copy link Message link Add to gist Remove
17:23 sxmx left 18:00 brrt left 18:03 MasterDuke left 18:06 domidumont left
nine	If anyone is in the mood for a bit of a detective story demonstrating tools like rr, I'm quite proud to have just solved a mystery at work :) gist.github.com/niner/7f7cb0a79db9...47d44c2029	18:26	Copy link Message link Add to gist Remove
[Coke]	nine++	18:34	Copy link Message link Add to gist Remove
18:52 zakharyas left 19:15 sxmx joined 20:04 MasterDuke joined
MasterDuke	ah, no ->spesh_cand in this frame	20:21	Copy link Message link Add to gist Remove
	so segv here github.com/MoarVM/MoarVM/blob/mast...opt.c#L247	20:24	Copy link Message link Add to gist Remove
	and what do you know, some 0xefefefefefefefef show up gist.github.com/MasterDuke17/b33ca...9bd5154a57	20:26	Copy link Message link Add to gist Remove
20:41 MasterDuke left 20:42 MasterDuke joined
nine	An important step forward :)	20:42	Copy link Message link Add to gist Remove
20:48 ilogger2 joined
MasterDuke	hm. sticking an `if *->spesh_cand` everywhere won't be fun. there must be something better	20:51	Copy link Message link Add to gist Remove
20:55 ilogger2_ joined
jnthn	MasterDuke: The idea was that the candidate would stay around long enough that everything referencing it would just continue to work fine	21:35	Copy link Message link Add to gist Remove
	Thus why it because possible to GC		Copy link Message link Add to gist Remove
	Nothing should NULL it out on a frame that is still running and using it		Copy link Message link Add to gist Remove
	And the removal from the log guards mean that no new calls will end up using it		Copy link Message link Add to gist Remove
	So it goes away once everything that was in it goes out of dynamic scope	21:36	Copy link Message link Add to gist Remove
MasterDuke	hm		Copy link Message link Add to gist Remove
jnthn	The free_at_safepoint I thought was for the guard tree and array of candidates		Copy link Message link Add to gist Remove
	not for the candidate itself		Copy link Message link Add to gist Remove
	The memory associated with the candidate (slots, specialized bytecode, JITted machine code) gets cleared up in the gc_free of MVMSpeshCandidate	21:37	Copy link Message link Add to gist Remove
MasterDuke	i don't think i ever remove/overwrite the candidate from anywhere		Copy link Message link Add to gist Remove
jnthn	Having a NULL spesh cand means that the thing isn't specialized; if we're somehow in specialized code but spesh_cand isn't set, something is very odd.	21:38	Copy link Message link Add to gist Remove
MasterDuke	i.e., not *->spesh_cand. i just don't memcpy it when copying the array		Copy link Message link Add to gist Remove
jnthn	I thought it was an array of pointers to spesh candidates nowadays, which are the GC-able objects?	21:39	Copy link Message link Add to gist Remove
	f->spesh_cand->body.inlines is consistent with that too		Copy link Message link Add to gist Remove
MasterDuke	gist.github.com/MasterDuke17/fd236...-L134-L143		Copy link Message link Add to gist Remove
jnthn	If the memory on the other end of f->spesh_cand is junk, then that feels like a GC issue	21:40	Copy link Message link Add to gist Remove
	Yeah, that code matches my expectations		Copy link Message link Add to gist Remove
	Or at least, it does to the degree it's a pointer array	21:41	Copy link Message link Add to gist Remove
MasterDuke	i added `memset(cands_and_arg_guards->spesh_candidates, 0xfe, (new_num + 1) * sizeof(MVMSpeshCandidate *));` between the `MVM_spesh_arg_guard_destroy` and `MVM_fixed_size_free_at_safepoint(..., cands_and_arg_guards->spesh_candidates);`		Copy link Message link Add to gist Remove
jnthn	oh		Copy link Message link Add to gist Remove
MasterDuke	and now gist.github.com/MasterDuke17/b33ca...og-L54-L55		Copy link Message link Add to gist Remove
jnthn	oh, I think there's an odering issue here	21:42	Copy link Message link Add to gist Remove
	+ spesh->body.spesh_cands_and_arg_guards = new_cands_and_arg_guards;		Copy link Message link Add to gist Remove
	+		Copy link Message link Add to gist Remove
	+ MVM_spesh_arg_guard_regenerate(tc, &(new_cands_and_arg_guards->spesh_arg_guard),		Copy link Message link Add to gist Remove
	+ new_cands_and_arg_guards->spesh_candidates, new_num);		Copy link Message link Add to gist Remove
	new_cands_and_arg_guards must be fully set up before we assign it		Copy link Message link Add to gist Remove
	Otherwise something can read it before we have regenerated the guard tree, and see the old index		Copy link Message link Add to gist Remove
	Swapping those two statements around will likely help something, and I can imagine it fixing this	21:43	Copy link Message link Add to gist Remove
	*ordering		Copy link Message link Add to gist Remove
	.oO( code smell )		Copy link Message link Add to gist Remove
MasterDuke	i definitely wasn't 100% sure of the correct ordering. i pretty much tried to do the reverse of what happens when adding a candidate, but some of it did/does need to be in the same order	21:46	Copy link Message link Add to gist Remove
jnthn	Yes, the general principle is get everything in order, and then publish it when it's ready	21:47	Copy link Message link Add to gist Remove
	Heh, "in order" in the sense of "organized, completed" :)		Copy link Message link Add to gist Remove
MasterDuke	dropping GC_DEBUG down to 1 for speed got past where that segv was happening, but a different one popped up, so i bumped it back up to 3 and restarted the build. it's gonna take a little while to get back to where it was happening	21:50	Copy link Message link Add to gist Remove
jnthn	OK. afk for a bit for a game		Copy link Message link Add to gist Remove
MasterDuke	looks like the same segv as before	22:27	Copy link Message link Add to gist Remove
22:47 leont joined 23:05 harrow joined
	jnthn back for a bit	23:06	Copy link Message link Add to gist Remove
MasterDuke	huh. now f->static_info is 0	23:16	Copy link Message link Add to gist Remove
	gist.github.com/MasterDuke17/57b58...0701b2b2eb	23:17	Copy link Message link Add to gist Remove
jnthn	Hm, that sounds like a rather different kind of problem :s	23:24	Copy link Message link Add to gist Remove
	At a wild guess, it's created an uninlined frame from broken inline state, but...I'm a bit surprised it got this far	23:25	Copy link Message link Add to gist Remove
	Is the inline table getting marked properly?		Copy link Message link Add to gist Remove
MasterDuke	i changed `if (f->spesh_cand->body.inlines) {` to `if (f->spesh_cand && f->spesh_cand->body.inlines) {`	23:26	Copy link Message link Add to gist Remove
jnthn	If you have rr then I guess you have that to hand		Copy link Message link Add to gist Remove
	If we end up in this place and f->spesh_cand is NULL something was really rotten already		Copy link Message link Add to gist Remove
MasterDuke	what is the inline table?	23:27	Copy link Message link Add to gist Remove
jnthn	f->spesh_cand->body.inlines is an array of inlines; when we uninline the static frame comes from there	23:28	Copy link Message link Add to gist Remove
MasterDuke	f is not in a good state	23:29	Copy link Message link Add to gist Remove
jnthn	Ah, then all bets are off :)		Copy link Message link Add to gist Remove
	Though ooc, you could try running with MVM_SPESH_INLINE_DISABLE=1 just to see if it is related to that at all	23:30	Copy link Message link Add to gist Remove
MasterDuke	gist updated	23:31	Copy link Message link Add to gist Remove
jnthn	wow, zeroes all the way	23:35	Copy link Message link Add to gist Remove
	There's a #define at the top of deopt.c that tells something about the deopt it's doing; could you try turning that on?		Copy link Message link Add to gist Remove
	It'll maybe give us a clue about what it things its doing	23:36	Copy link Message link Add to gist Remove
	*thinks		Copy link Message link Add to gist Remove
MasterDuke	i'll give that a run after this MVM_SPESH_INLINE_DISABLE=1 finishes		Copy link Message link Add to gist Remove
	ugh, i need to stop setting GC_DEBUG to 3, it's just so slow	23:38	Copy link Message link Add to gist Remove
jnthn	3 is a mixed bag in terms of what it finds/hides also	23:41	Copy link Message link Add to gist Remove
	Compare to 1, 2 is a fairly straight "checks more things", but 3 changes how often GC happens	23:42	Copy link Message link Add to gist Remove
MasterDuke	a while ago i was only getting an error to appear under 3, but i think that was fixed and i could go back, i've just kept it there for no particularly good reason	23:43	Copy link Message link Add to gist Remove

Please report any issues / comments / feature requests as an issue on App::Raku::Log.

Thank you!