Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021.
00:00  bartolin left, discord-raku-bot left, colemanx left, discord-raku-bot joined, Colt left
00:02  Colt joined, reportable6 left
00:04  [Coke] left
00:05  bartolin joined, reportable6 joined, [Coke] joined
00:17  squashable6 left
00:18  squashable6 joined
01:00  colemanx joined
ugexe | I'm trying to debug why macOS Monterey usually locks up when running e.g. shell("ls"), but I don't know if this lldb output actually has anything useful -- gist.github.com/ugexe/8c34b26c5edb...cf377ae334 | 01:44
02:42  frost joined
06:03  reportable6 left
nine | Won't work, but why not give it a try anyway: export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES | 08:00
08:03  reportable6 joined
08:14  linkable6 left
08:15  linkable6 joined
Nicholas | good *, #moarvm | 08:50
jnthnwrthngtn | Are threads 4 and 5 making progress, or is it just coincidence that they are both deep in OS things and trying to acquire a lock? | 08:58
jnthnwrthngtn | meeting, bbiab | 08:59
timo | ugexe: do we have info about the process that was spawned by thread 5? | 10:07
10:11  patrickb joined
jnthnwrthngtn | I'd guess thread 5 is the shell to run `ls`; the thing that surprises me more is that uv_cpu_info leads to a fork | 11:19
11:41  Altai-man joined
12:03  reportable6 left
12:07  frost left
12:49  linkable6 left
ugexe | if I repeat the example in the gist it gives the same relevant lldb output each time, so I assume it's not making progress (same if I resume + interrupt) | 13:14
jnthnwrthngtn | OK, so it really is stuck there. | 13:17
lizmat | github.com/rakudo/rakudo/pull/4659 | 13:18
lizmat | Make sure that nqp::cpucores is only called once ever
lizmat | if this fixes it, it should probably be done at the MoarVM level | 13:19
jnthnwrthngtn | If this fixes it, it should probably be reported to libuv and maybe Apple. | 13:23
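(A minimal sketch of the kind of call-once caching the PR above aims at; hypothetical code, not the actual #4659 diff:)

    use nqp;
    my int $cores = 0;                            # 0 means "not queried yet"
    sub cpu-cores(--> Int:D) {
        $cores = nqp::cpucores if $cores == 0;    # ask the OS at most once
        $cores;
    }
    say cpu-cores;    # later calls reuse the cached value, so no further forks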
ugexe | that does actually work around it | 13:30
ugexe | maybe someone with a non-Monterey Mac could run lldb on `shell("ls")` to see what it used to do | 13:32
ugexe | I can do more debugging later if anyone has any suggestions, but I'm off to work for now | 13:34
13:51  linkable6 joined
14:03  reportable6 joined
14:16  patrickb left
lizmat | $ r 'use nqp; nqp::cpucores for ^10000' | 14:26
lizmat | real 0m0.466s
lizmat | $ r 'use nqp; Kernel.cpu-cores for ^10000'
lizmat | real 0m0.173s
lizmat | that's with #4659 applied | 14:27
ugexe | Kernel.cpu-cores is public API though, so if the number of CPU cores visible can change at runtime (I have no idea) then we probably shouldn't cache that value for that method | 14:29
[Coke] | I have a (checks) Big Sur Mac if you want me to check something. | 14:31
timo | you can turn off cores via /sys somewhere | 14:33
jnthnwrthngtn | I think hot-changing the number of CPU cores is a thing; for example, a virtual machine may be able to have more cores added while it's running. | 14:35
ugexe | could someone tell [Coke] the lldb or gdb commands to run a script (containing just `shell("ls")`) so they can get the `bt all` output?
ugexe | I don't think the runner works, and I only otherwise know how to attach to a running process (that script will finish running before he can attach to it)
jnthnwrthngtn | However, I'm not sure it's terribly common, and since our thread pool already fixates its maximum at creation time anyway... ;) | 14:36
timo | if a user is calling that method regularly, they will have a reason, probably?
jnthnwrthngtn | Well, TPS was doing nqp::cpucores regularly, but honestly I don't remember thinking "oh, we should do this for a fresh value"; it was more "make things work" and "don't expect it to be too costly" | 14:39
[Coke] | ugexe: what about 'shell("ls");sleep 30' or somesuch?
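(A hypothetical script along those lines, so the process lives long enough to attach a debugger by PID; the file name and comments are illustrative:)

    # hang.raku: reproduce the hang, then linger for the debugger
    say "pid: $*PID";    # PID to attach to, e.g. with `lldb -p <pid>`
    shell("ls");         # the call that locks up on Monterey
    sleep 30;            # window in which to attach and run `bt all`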
[Coke] | someone forwarded me an article on hardening release builds that we might want to adopt in part. | 14:41
[Coke] | cheatsheetseries.owasp.org/cheatsh...eoi7qt4p2Q | 14:42
[Coke] | cheatsheetseries.owasp.org/cheatsh...Sheet.html
14:43  Guest12 joined
ugexe | TPS caching the result from Kernel.cpu-cores wouldn't surprise me, but Kernel.cpu-cores itself caching might | 14:43
nine | [Coke]: just `gdb rakudo -e '...'` is what I do | 14:44
[Coke] | and once I'm in gdb? | 14:45
[Coke] | (Surprised to find I have a gdb lying around on my mac.) | 14:46
ugexe | `bt all` or `bt full` or some such
[Coke] | both say "no stack" | 14:48
nine | First `run` | 14:49
nine | Then ctrl+c (or whatever on the mac) to interrupt at the point where you want that backtrace
[Coke] | sorry, yes, just figured that out. run says:
[Coke] | Unable to find Mach task port for process-id 15321: (os/kern) failure (0x5).
[Coke] | (please check gdb is codesigned - see taskgated(8))
[Coke] | so I'm guessing the system is *also* surprised there's a gdb on my old mac. | 14:50
[Coke] | give me a bit.
ugexe | do you have lldb?
[Coke] | aye | 14:52
ugexe | that might work better on a mac
[Coke] | stub code executed, process exited with status = 1
[Coke] | (I have a moar-2021.10 release installed) | 14:59
vrurg | Doing `RAKUDO_OPTIMIZER_DEBUG=1 make install` in rakudo ends up with 'Cannot find method 'scope' on object of type QAST::Var+{QAST::SpecialArg}'. I'm still trying to figure out what's going on, but perhaps somebody has good ideas? So far I see that mixing in the SpecialArg role sometimes wipes out the parents array on the HOW. | 15:04
vrurg | Could be a moar issue. | 15:05
jnthnwrthngtn | Maybe, but moar doesn't really know about the content of meta-objects; so far as it's concerned they're just, well, objects. | 15:09
jnthnwrthngtn | How do you conclude it wipes out the parents array? | 15:10
15:13  [Coke] left
15:16  [Coke] joined
[Coke] | .' | 15:24
vrurg | jnthnwrthngtn: was dumping HOW fields. | 15:36
vrurg | Weird thing: when it happens to routine parameters (__lowered_param_*) they're OK right after the mixin, but may start failing a few lines down, within the same lower_signature sub. Since no races could be involved, and initially they all have a >0 number of parents, I conclude that the array gets emptied accidentally. Loss of data by the VM is not excluded yet. | 15:41
vrurg | OK, it's probably a pure NQP issue after all. Tried with the JVM backend and got 'Unhandled exception: Sub+{is-implementation-detail} object coerced to string'. Won't be able to work on this until later today, so if somebody wants to look into it, I don't mind. ;) | 16:07
vrurg | afk for a couple of hours.
ugexe | gist.github.com/ugexe/8c34b26c5edb...cf377ae334 -- updated the gist to also include the lldb output from Big Sur (which does not fork) | 16:08
ugexe | the Big Sur example is on 2021.08 whereas Monterey is on master, fwiw | 16:09
ugexe | hmm, in the Big Sur example it doesn't call uv__get_cpu_speed, among other things | 16:58
17:41  patrickb joined
[Coke] | ugexe++ | 17:46
18:00  Altai-man left
18:02  reportable6 left
18:24  Guest12 left
ugexe | rakudo 2020.08.2 works, 2020.10 does not. these correspond to github.com/libuv/libuv/commit/87f0...e65e1b81c7 | 18:28
18:54  squashable6 left
18:56  squashable6 joined
19:23  patrickb left
19:44  MasterDuke joined
ugexe | __si_module_static_mdns_block_invoke -- I can't find anything on si_module_static_mdns, but it kind of feels strange that it is being queried as part of dlopen | 19:54
ugexe | strange in that it looks like it might be mDNS | 19:55
20:04  reportable6 joined
20:16  [Coke] left
vrurg | BTW, do we have an article/paper where the exact technical meaning of wanted/unwanted is explained? I was googling a while ago, with no apparent success. | 20:17
timo | that's what powers the "useless use of" messages
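(An illustration of that connection, my own example rather than from the log: a node whose value goes unused is marked unwanted, and sunk literals draw a warning:)

    sub f() {
        42;          # value never used ("unwanted"); warns something like
                     # "Useless use of constant integer 42 in sink context"
        return 1;    # value consumed ("wanted"); no warning
    }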
20:22  [Coke] joined
vrurg | timo: that's where my understanding basically ends. :) It looks like there are nuances I'm not aware of. For example, does it affect the lowering of vars? | 20:32
timo | hm, maybe that's only influenced by whether a lexical is used by an inner scope or not | 20:33
20:35  [Coke] left
vrurg | It should. But somehow I probably fail to indicate this... Whatever, that's what I was debugging when I encountered the NQP bug with mixins. So, the bug goes first. | 20:35
timo | how do you mean "indicate"? | 20:36
vrurg | First I thought that the 'wanted' attribute of a node means no variables involved are to be lowered. Second, I see that the lowering code is using some annotations to skip variables. But this part of the optimizer is rather big and I haven't gotten deeper into it yet. | 20:39
20:43  [Coke] joined
timo | I'm a little surprised to hear wanted was involved in variable lowering | 20:49
vrurg | I didn't say it was, I _thought_ it was. :) That's why I'd like to find out more about it. Otherwise I'm either monkey-copying from other code, or trying to guess, sometimes the hard way. | 20:52
timo | I'm headachy today; I'm not sure I can be of too much help | 20:54
20:54  [Coke] left
vrurg | timo: don't bother, and get well soon! I'll figure it out eventually. :) | 20:57
timo | I can recommend asking many questions in here anyway | 21:00
Geth | MoarVM: 8a684b3304 | (Stefan Seifert)++ | 2 files | 21:01
Geth | Fix out of bounds read of PHI facts in spesh
Geth | During spesh optimization, we remove reads of registers with dead writers from PHI nodes. It could happen that the PHI node ended up with no registers to read at all. However the following analysis code assumed that we'd always have at least 1 register to read from, resulting in an array read out of bounds error and a variety of failure modes. ... (7 more lines)
Geth | MoarVM: 7d58542da1 | (Jonathan Worthington)++ (committed using GitHub Web editor) | 2 files
Geth | Merge pull request #1610 from MoarVM/fix_phi_out_of_bounds_read
Geth | Fix out of bounds read of PHI facts in spesh
MasterDuke | is there any difference between `has Int $.a is default(0)` and `has Int $.a = 0` if $!a is never set to Nil? | 21:02
lizmat | not sure anymore; jnthnwrthngtn reworked that part with new-disp, I believe | 21:03
21:26  [Coke] joined
jnthnwrthngtn | MasterDuke: I suspect the first is cheaper | 21:29
jnthnwrthngtn | MasterDuke: Although it's probably relatively fine margins, and likely vanishes once PEA is far enough along | 21:30
jnthnwrthngtn | nine: One merged, one with a minor comment but approved; the other one I need more time on, but posted an initial concern. | 21:31
MasterDuke | performance-wise, I tried a sort of micro-benchmark and it seemed like maybe `= 0` was *very* slightly faster (with MVM_SPESH_BLOCKING=1), but it was small enough I'd need to re-run a lot more times to be sure. anyway, they're close enough that I'm not going to change existing code | 21:32
MasterDuke | but in this case I was actually more interested in semantics | 21:33
jnthnwrthngtn | They should result in the same outcome (assuming a literal value) | 21:34
MasterDuke | thanks, good to know
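(For context: the two declarations differ exactly when Nil is assigned, which the question above rules out; a small sketch of the semantics as I understand them, with made-up class names:)

    class WithDefault { has Int $.a is rw is default(0) }
    class WithInit    { has Int $.a is rw = 0 }

    my $wd = WithDefault.new;
    $wd.a = Nil;
    say $wd.a;    # 0     : assigning Nil re-installs the declared default

    my $wi = WithInit.new;
    $wi.a = Nil;
    say $wi.a;    # (Int) : assigning Nil falls back to the bare type object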
jnthnwrthngtn | Actually, if you never use $a beyond that point PEA may already be eating the difference.
timo | the spesh log could perhaps give a little bit of insight, but I couldn't analyze one right now
jnthnwrthngtn | Also --profile and see if a load of Scalar allocations are getting optimized away :) | 21:35
jnthnwrthngtn | But if they aren't, then a) the allocation dominates, b) the literal 0 means easy pickings for tossing out the type guard, and that leaves an attribute bind, which is a couple of machine instructions | 21:36
jnthnwrthngtn | (And if PEA does work to its full potential and your loop literally is no more than a variable decl with a default or an assignment, then both become an empty loop.) | 21:37
MasterDuke | m: use nqp; class I { has Int $.n is default(0); method from-posix-nanos(I:U: Int:D $nanos --> I:D) { nqp::p6bindattrinvres(nqp::create(I),I,q|$!n|,$nanos) } }; my $a; my $b = nqp::time; $a = I.from-posix-nanos($b) for ^10_000_000; say now - INIT now; say $a | 21:38
camelia | 0.266082961
camelia | I.new(n => 1638394682931903452)
MasterDuke | m: use nqp; class I { has Int $.n = 0; method from-posix-nanos(I:U: Int:D $nanos --> I:D) { nqp::p6bindattrinvres(nqp::create(I),I,q|$!n|,$nanos) } }; my $a; my $b = nqp::time; $a = I.from-posix-nanos($b) for ^10_000_000; say now - INIT now; say $a
camelia | 0.265698586
camelia | I.new(n => 1638394692289436231)
MasterDuke | ^^^ was my benchmark
timo | huh, does that even make a difference when you're using bindattr like that? | 21:40
jnthnwrthngtn | Oh | 21:41
jnthnwrthngtn | Yeah, then it's irrelevant :)
jnthnwrthngtn | Because create doesn't run BUILDALL and friends anyway
jnthnwrthngtn | So there's no way the default closure will apply | 21:42
jnthnwrthngtn | And the Scalar is being discarded
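(A sketch of that point about nqp::create bypassing BUILDALL, as I understand it; the class names are made up:)

    use nqp;
    class WithInit    { has Int $.n = 42 }            # set by the BUILDALL-run initializer
    class WithDefault { has Int $.n is default(42) }  # default lives in the container itself

    say WithInit.new.n;              # 42    : .new runs the initializer
    say nqp::create(WithInit).n;     # (Int) : raw allocation, initializer never runs
    say nqp::create(WithDefault).n;  # 42    : the container default still applies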
jnthnwrthngtn | m: for ^10_000_000 { my $x is default(0) }; say now - INIT now
camelia | 1.430858666
jnthnwrthngtn | m: for ^10_000_000 { my $x = 0 }; say now - INIT now | 21:43
camelia | 0.207846194
jnthnwrthngtn | Huh? :)
jnthnwrthngtn | m: for ^10_000_000 { my $x }; say now - INIT now
camelia | 0.16197103
jnthnwrthngtn | OK, that surprises me a lot
jnthnwrthngtn | m: for ^10_000_000 { }; say now - INIT now
camelia | 0.059409284
jnthnwrthngtn | I wonder if `is default` somehow frustrates lexical -> local lowering and thus PEA? | 21:44
jnthnwrthngtn | Worth a dig. Perhaps a LHF
* jnthnwrthngtn afk for a bit
MasterDuke | 10000009 Scalar allocations | 21:45
MasterDuke | same with the `= 0` version | 21:46
MasterDuke | is default(0): In total, 10004354 call frames were entered and exited by the profiled code. Inlining eliminated the need to create 9 call frames (that's 0%). | 21:47
MasterDuke | = 0: In total, 9476 call frames were entered and exited by the profiled code. Inlining eliminated the need to create 19994888 call frames (that's 99.95%).
21:48  linkable6 left, linkable6 joined
MasterDuke | `sp_runbytecode_v r13(2), liti64(140269206293408), liti16(0), r6(2), r5(2) # [015] could not inline 'set' (21) candidate 0: bytecode is too large to inline` maybe? | 21:49
MasterDuke | well, there were a couple of 'too large' could-not-inlines, and one `inline-preventing instruction: takeclosure` + `could not inline '' (3) candidate 0: target has a :noinline instruction` | 21:52
timo | huh, 99.95% inlined should make a big performance difference | 22:02
MasterDuke | is default(0) also did 687 GCs vs 114 for = 0 | 22:04
timo | could the profiler have made an unfortunate impact somehow? | 22:10
MasterDuke | well, we saw the 7x perf difference here on camelia | 22:11
timo | what were you doing differently when you got a difference so small it wasn't easily measurable? | 22:12
MasterDuke | I wasn't actually assigning; I was using bindattr | 22:13
MasterDuke | also I wasn't creating a new variable in the loop body
timo | ok | 22:14
timo | that makes sense now
MasterDuke | gist.github.com/MasterDuke17/be5bc...9435d0743c perf reports for the two cases | 22:15
timo | I would really have expected default(0) to be faster than = 0, since in theory the thing on the right could be code that needs to be run, but we do already recognize it's a static value, so why aren't the two the exact same?
MasterDuke | seems to me to be something about not being able to inline 'unit' because of the takeclosure | 22:16
MasterDuke | which doesn't happen in the spesh log of the assigning version | 22:17
timo | try isolating the for loop in its own sub? | 22:21
MasterDuke | m: sub foo() { for ^10_000_000 { my $x is default(0) } }; foo; say now - INIT now | 22:22
camelia | 1.725214816
MasterDuke | m: sub foo() { for ^10_000_000 { my $x = 0 } }; foo; say now - INIT now
camelia | 0.20743934
timo | the default(0) is even slower now | 22:27
timo | +/- camelia being rather noisy, I imagine | 22:28
MasterDuke | `sub foo { for ^10_000_000 { my $x is default(0) }; return 3 }; sub bar { say (5_000_000_000..5_000_001_000).grep(*.is-prime).tail; say foo; say (^1_000_000).pick gcd (^1_000_000).pick }; bar; say now - INIT now` shows the same behavior | 22:39
MasterDuke | in the after of 'foo': `inline-preventing instruction: takeclosure` + `could not inline '' (3) candidate 0: target has a :noinline instruction` | 22:40
timo | maybe the output from --target=optimize gives a hint | 22:45
MasterDuke | added to gist | 22:47
MasterDuke | gist updated with both versions | 22:49
MasterDuke | oh. `- QAST::Var(lexical $x :decl(static))` for assign vs `- QAST::Var(lexical $x :decl(contvar)) :lexical_used_implicitly<?>` for is default | 22:52
MasterDuke | gist.github.com/MasterDuke17/be5bc...ze-L71-L96 compared to gist.github.com/MasterDuke17/be5bc...ze-L72-L88 | 22:54
MasterDuke | ha, with the rakudo option `--optimize=0`, assign takes 2.8s, but is default takes the same 1.3s | 23:02
japhb | MasterDuke: Another big difference between those two gists is the presence of an extra block in the second (versus a lot more attribute binding in the first) | 23:14
MasterDuke | adding `$v.implicit-lexical-usage = False;` here github.com/rakudo/rakudo/blob/mast...le.pm6#L78 drops 0.05s off the is default version, but does have a single failing spectest | 23:17
MasterDuke | not ok 182 - Failure.new as a default value on an unconstrained Scalar works
MasterDuke | # Failed test 'Failure.new as a default value on an unconstrained Scalar works'
MasterDuke | # at t/spec/S02-names/is_default.rakudo.moar line 395
MasterDuke | # Error: Object of type Failure in QAST::WVal, but not in SC
23:51  childlikempress joined, moon-child left
23:52  childlikempress is now known as moon-child