#moarvm on 2 March 2021 - Raku Programming Language Log

github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018.
00:05 vrurg joined 00:09 vrurg left 00:10 MasterDuke left 00:46 leont left 01:03 vrurg joined 03:01 klapperl left, klapperl joined 03:04 avar left 03:16 avar joined, avar left, avar joined 04:03 bartolin left 04:17 bartolin joined 06:33 frost-lab joined 07:58 domidumont joined 08:26 patrickb joined 08:31 MasterDuke joined
patrickb	o/	08:35
MasterDuke	\o	08:39
08:55 zakharyas joined 08:57 sortiz left 08:59 sena_kun left 09:02 sena_kun joined 09:07 leont joined 09:12 Geth joined
MasterDuke	i'm looking at a perf report of a tiny script that defines some custom operators (essentially taken from www.khanate.co.uk/blog/2021/02/23/...-part-2/), but takes 9s to run	09:47
lizmat	could you see if it is as slow when the code is precomped?	09:48
MasterDuke	there are 1.35m calls to MVM_coerce_smart_intify, hitting this case github.com/MoarVM/MoarVM/blob/mast...#L417-L418 and i logged the types: `1349339 intifying an object of type P6int (BOOTInt)`
	is that `REPR(obj)->box_funcs.get_int` faster than the above block github.com/MoarVM/MoarVM/blob/mast...#L393-L409 which i would have thought would have been hit?	09:50
	i assume the get_int is just github.com/MoarVM/MoarVM/blob/mast....c#L89-L97 which should be pretty fast, but MVM_coerce_smart_intify was higher in the perf report than i'd expect	09:54
	lizmat: the runtime isn't very much (so i'm sure precompiling would make the entire script much faster), but it's the compiling of it i'm trying to speed up	09:56
lizmat	ack ++MasterDuke
MasterDuke	run time is 41 ms, compile time is 9s	09:59
	6s of which are spent in github.com/Raku/nqp/blob/master/sr...L813-L1007 but i don't know if there are any more micro-optimizations to be done there, i suspect something algorithmic is needed	10:01
10:01 frost-lab left
lizmat	Looks to me line 966 is dead code ?	10:05
nine	indeed	10:06
MasterDuke	yeah, same with 873. but i don't think those are going to significantly impact the time	10:07
lizmat	lemme remove them and run the tests
nine	I'm pretty sure spesh would optimize away 873 and likely 966, too
	OTOH removed code is debugged code	10:08
lizmat	873 appears to be needed for debugging
nine	both are	10:09
MasterDuke	same with 966 (used in 969)
lizmat	aaahh ok
nine	but they may be commented out
MasterDuke	but they should be commented out with the debugging code	10:10
lizmat	doing that now
MasterDuke	fwiw, process_worklist, MVM_VMArray_at_pos, VMArray_gc_mark are the top moarvm functions according to perf	10:11
nine	I'm a bit surprised at MVM_VMArray_at_pos, since that ought to get devirtualized in the JIT	10:12
	Or, no
10:13 frost-lab joined
nine	Well it would get devirtualized, but that only means that we get rid of MVM_repr_at_pos and replace it with MVM_VMArray_at_pos. And the latter is probably a bit too complicated to implement in the JIT directly	10:13
jnthn	The array element type is, however, hung off the REPR data, and so if we know the type we know the element type, so any branching on the array kind goes away	10:14
	I'd suggest we get the re-org of VMArray done first	10:15
MasterDuke	re-org? the FSA work i've yet to finish?
nine	Yeah, looking at it, the code is very repetitive and the only real difference is the multiplier for the index to get the memory address
jnthn	MasterDuke: Yes, and I think moving the length information into the chunk allocated with the FSA, to give us safety	10:16
	After that I guess it JITs into an offset calculation and a bounds check
	Then the deref
MasterDuke	yeah, that's the part i haven't finished yet
	that is, moving the length information into the chunk allocated with the FSA
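Editor's sketch of the access pattern described above (bounds check, offset calculation, then the deref), where the only per-kind difference is the index multiplier. This is an illustration in Python, not MoarVM's code: `ELEM_FORMATS`, `at_pos`, and the flat `storage` buffer are hypothetical names, and the real MVM_VMArray_at_pos handles more cases than this.

```python
import struct

# Hypothetical element kinds mapped to struct format codes (illustrative only).
ELEM_FORMATS = {"i8": "<b", "i32": "<i", "i64": "<q", "n64": "<d"}

def at_pos(storage, elem_kind, length, index):
    """Read one element from a flat buffer: bounds check, offset, deref."""
    fmt = ELEM_FORMATS[elem_kind]
    elem_size = struct.calcsize(fmt)      # the per-kind index multiplier
    if not 0 <= index < length:           # bounds check
        raise IndexError(f"index {index} out of range for length {length}")
    offset = index * elem_size            # offset calculation
    return struct.unpack_from(fmt, storage, offset)[0]  # the "deref"

# Four 32-bit integers stored back to back.
storage = struct.pack("<4i", 10, 20, 30, 40)
print(at_pos(storage, "i32", 4, 2))
```

Once the element kind is known up front, the dictionary lookup disappears and only the check, the multiply, and the load remain, which is the shape jnthn expects the JIT to emit.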
jnthn	OK. The JIT output would change with that, so probably it's worth doing first.	10:17
	I'll try and backlog here a bit later. Still suffering limited keyboard time due to wrist.
nine	oh...still not better?	10:18
MasterDuke	i think the remove candidates PR is pretty close to done, hope to finish up the VMArray FSA stuff next
jnthn	nine: Well, in the absolute sense no, in the relative sense, improvement since I started using some gel and realizing that less keyboard time is useless if I don't also do less smartphone time :)	10:19
MasterDuke	(just distracted from either this morning by the slow custom operator compiling)
jnthn	Custom op compilation is slow more 'cus of the NFA design than anything, I suspect	10:20
MasterDuke	heh, yeah, i just keep hoping to find something lower hanging than the very top of the tree...
nine	jnthn: I can tell from experience that playing piano is also not the smartest idea in that case :)	10:21
MasterDuke	or throwing/catching an (american) football
nine	MasterDuke: too late! Considering the path you're on, you'll keep jumping from tree top to tree top
MasterDuke	ha	10:23
10:35 frost-lab left
nine	Why does CStruct's storage spec claim that it needs only space the size of a pointer for inlining: github.com/MoarVM/MoarVM/blob/mast...uct.c#L793 when its body actually contains 2 pointers? github.com/MoarVM/MoarVM/blob/mast...ruct.h#L21	10:38
	Ah, because it cannot be inlined according to the same storage spec	10:42
10:46 patrickb left
MasterDuke	btw, is there anything that can be done for all those MVM_coerce_smart_intify calls where the type is P6int (BOOTInt)? don't know if it matters, but they're coming from that NFA optimize method and NQPMu's BUILDALL	11:12
12:08 zakharyas left
MasterDuke	oh, github.com/MoarVM/MoarVM/blob/mast...ze.c#L1006 is supposed to take care of it. but then why is there still a smrt_intify in the 'after' of NQPMu's BUILDALL...?	12:33
	smrt_intify r13(3), r8(9)	12:37
	...
	r8(9): usages=1, deopt=9, flags=0
	...
	r13(3): usages=6, deopt=53,51,50,49,48,47,46,45,44,43,42,41,40,39,38,37,36,35,34,32,31,30,29,28,27,25,26,22,21,20,19,18,17,16,15,14,13,12,11,10, flags=0
	i don't know how to read facts
	lizmat resists the urge to mention something with "alternative" in it	12:39
MasterDuke	heh
	fwiw, it's this line github.com/Raku/nqp/blob/master/sr...Mu.nqp#L23	12:42
lizmat	you could try removing the "int" part ?	12:46
	but I guess that will cascade into a lit of calls in the nqp::iseq_i 's ?	12:47
	*lot
MasterDuke	that's what i assume
12:53 patrickb joined
MasterDuke	i think it doesn't get into the body of the optimize_coerce because it fails `if (facts->flags & (MVM_SPESH_FACT_KNOWN_TYPE | MVM_SPESH_FACT_CONCRETE) && facts->type) {`	13:08
	`facts->flags & (MVM_SPESH_FACT_KNOWN_TYPE | MVM_SPESH_FACT_CONCRETE)` is 0 and `facts->type` is 0x0
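Editor's sketch of why that guard fails, modelled in Python. The bit values below are assumed for illustration and are not copied from MoarVM's headers; `can_optimize_coerce` is a hypothetical name standing in for the quoted `if` condition. In C, `&` binds tighter than `&&`, so the condition reads as "(either flag is set) and (a type is recorded)".

```python
# Assumed, illustrative bit values; the real constants live in MoarVM's spesh headers.
KNOWN_TYPE = 1 << 0
CONCRETE = 1 << 3

def can_optimize_coerce(flags, known_type):
    """Mirror of: (flags & (KNOWN_TYPE | CONCRETE)) && type."""
    has_fact = (flags & (KNOWN_TYPE | CONCRETE)) != 0
    return has_fact and known_type is not None

# The situation reported in the log: flags=0 and no recorded type.
print(can_optimize_coerce(0, None))
# What spesh would need to know for the coercion to be rewritten.
print(can_optimize_coerce(KNOWN_TYPE | CONCRETE, "P6int"))
```

With `flags=0` on both registers in the dump, the masked value is zero, so the guard is false before the type pointer is even considered.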
lizmat	so is the location of the code the reason?	13:10
MasterDuke	i assume it's because spesh doesn't know for sure what the type is of the thing it's pulling out with the nqp::atpos. compared to atpos_i which would always be an int	13:12
lizmat	yup, I'd say	13:15
	sadly, that array cannot be turned into a list_i, as it can also contain code objects for BUILD and TWEAK, if I recall correctly
	and in any case, in Rakudo this is all codegenned for each class, so it actually won't run BUILDALL in most cases	13:16
	well, that BUILDALL
MasterDuke	luckily this isn't the most expensive thing going on, but i thought it might be more easily optimized	13:18
lizmat	but you raise a good point	13:20
	I wonder though if porting the "create a custom BUILDALL method for a class" approach to NQP would be worthwhile, or even possible?	13:21
MasterDuke	i'm afraid i can only be a rubber duck here, i know absolutely nothing about the BUILDALL stuff	13:24
lizmat	how big a part is BUILDALL execution of what you're benchmarking ?	13:25
MasterDuke	it's the 6th most expensive function, but in absolute time values almost nothing in comparison	13:26
lizmat	ok, then let's focus on the top 5 :-)	13:27
MasterDuke	6.2s, 4.4s, 2s, 440ms, 240ms, then BUILDALL at 190ms	13:28
lizmat	and the 6.2s one is?	13:30
MasterDuke	those top three are optimize gen/moar/stage2/QRegex.nqp:817, mergesubstates gen/moar/stage2/QRegex.nqp:665, mergesubrule gen/moar/stage2/QRegex.nqp:560
	i've looked at all three a bunch and i think the micro-optimizations are pretty much all found. like jnthn++ said, a re-design is needed	13:31
lizmat	line 384 maybe better written as "elsif $to && @edges[0] == $EDGE_FATE" and lose the inner if ?	13:33
	that would first check $to (which is probably cheaper), and only then index into @edges	13:34
	also: it's checking twice for @edges[0], maybe store that in a temp ?
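Editor's sketch of the reordering lizmat suggests, in Python rather than NQP. `EDGE_FATE = 0` is an assumed value, and `CountingList` and `is_fate_edge` are hypothetical helpers used only to make the short-circuit visible: when the cheap test on `to` is false, the array is never indexed.

```python
EDGE_FATE = 0  # assumed value, for illustration only

class CountingList(list):
    """A list that counts how often it is indexed."""
    def __init__(self, items):
        super().__init__(items)
        self.reads = 0

    def __getitem__(self, index):
        self.reads += 1
        return super().__getitem__(index)

def is_fate_edge(to, edges):
    # Cheap truthiness test first; edges[0] is only read when `to` is truthy.
    return bool(to) and edges[0] == EDGE_FATE

edges = CountingList([EDGE_FATE, 7])
print(is_fate_edge(0, edges), edges.reads)   # index lookup skipped
print(is_fate_edge(3, edges), edges.reads)   # index lookup performed once
```

Storing `edges[0]` in a local before several comparisons is the same idea applied to the repeated lookup: one read instead of two.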
MasterDuke	isn't the second one @sedges (note the initial 's')	13:36
	oh, you mean 831 and 834?
lizmat	830 yeah	13:37
	sorry, what was I typing
13:42 zakharyas joined
MasterDuke	no noticeable change in time with those changes	13:49
lizmat	yeah, it was a long shot: those were in the initial setup, not in the actual optimization	13:52
	afk for a few hours&
MasterDuke	jnthn: i'm not sure about github.com/MoarVM/MoarVM/pull/1426...2c17b3e282 but that was needed to actually trigger the optimization being removed in that example you gave	14:12
nine	I'm pretty sure all the regex code is from the era of "make it work somehow" rather than the later "make it work efficiently" stage	14:34
MasterDuke	i know there's that "passing fates somewhere" optimization i've asked about before, i've been thinking about trying to give that a go once i finish up the current stuff	14:37
	regexes/nfas/etc are not really my area of expertise, but then again, none of the rakudo/moarvm stuff i've done really are either...	14:39
nine	I think regexes/nfas/etc are the last large white area on my map, too :)	14:44
MasterDuke	i really like using regexes (mastering regular expressions was only the second programming reference book i read cover-to-cover besides programming perl), but have never really spent any time with their implementation	14:47
nine	At university (I studied on-the-job with already 15 years of experience) as a beginner example for programming C we had to implement a grep-like tool and just for fun I implemented a very basic backtracking regex engine :) It's really the best way to understand how these things work	15:16
15:22 linkable6 left, evalable6 left, linkable6 joined 15:24 evalable6 joined
MasterDuke	my algorithms course had some stuff about NFAs/DFAs, but i don't remember any particularly practical exercises	15:32
15:35 patrickb left 15:37 patrickb joined 16:09 sortiz joined 16:54 patrickb left 17:05 cog left 17:06 cog joined 18:59 zakharyas left 19:33 domidumont left 20:21 sxmx left 20:38 sxmx joined 20:55 zakharyas joined 21:48 zakharyas left
lizmat	eprint.iacr.org/2021/232 # RIP RSA ?	22:28
moritz	the linked PDF says "work in progress 31.10.2019"	22:33
	if it's as revolutionary as the abstract claims, why hasn't it destroyed RSA yet in the last year?
	I don't understand number theory, so cannot judge the paper on its contents	22:34
leont	Given the name on it, I would take it seriously	22:35
	Any idea I may have had of understanding a little number theory was quickly put to rest by that paper…	22:37
moritz	aye, it's pretty dense :D
MasterDuke	dense is the word i was just going to use
moritz	and according to the Wikipedia, it does look like he's got some good credentials in the field	22:38
leont	Then again, my understanding of number theory is based on having read Applied Cryptography like 15 years ago :-p
	(and that's probably also why I had heard of Schnorr signatures)	22:47
23:10 Kaiepi left 23:11 Kaiepi joined
