#moarvm on 2 August 2014 - Raku Programming Language Log

japhb	2 MB seems like a nice savings to me ...	00:03
	Or am I misreading?
jnthn	No, you're not, though we're talking about on CORE.setting :)	00:05
	We'll save some CPU too though, I imagine
timotimo	aye, the time fetching the strings indirectly and then comparing them ... that's gotta be costly
	mostly the fetching	00:06
TimToady	compiling src/mast/compiler.o
	src/mast/compiler.c: In function ‘form_bytecode_output’:
	src/mast/compiler.c:1255:53: error: ‘MVMThreadContext’ has no member named ‘str_consts’
	that's at HEAD
timotimo	oh, why was that thing called "vm"?
	hold on.	00:07
jnthn	Didn't you, like, at least try to compile your change? :P
timotimo	some of these macros take a "vm" argument, but use "tc" instead
dalek	MoarVM: 9154ac3 | (Timo Paulssen)++ | src/mast/nodes_moar.h: this argument was called "vm" misleadingly ...	00:08
00:50 cognome joined 01:46 FROGGS_ joined 03:18 jimmyz joined
jimmyz	Stage parse : 34.374, before ~36s Stage mast : 11.899, before ~13s	03:19
	since yesterday
xiaomiao	death by a thousand papercuts :)	03:43
04:21 ventica joined 07:00 ventica joined
nwc10	m: say 6.243e+01/6.636e+01; say 6.243e+01/6.597e+01	07:17
camelia	rakudo-moar a1a236: OUTPUT«0.940777576853526␤0.946339245111414␤»
nwc10	jnthn: about 6% less time than mid yesterday's slight slowdown, and 5.5% less than yesterday morning. I think	07:18
	jnthn: but I do need to be careful, as building parrot tanks the machine performance, because it coredumps twice, filling the disk sufficient to make it slow
	setting with -flto 6.257e+01	08:44
	setting without -flto 6.243e+01
	so, it makes things fractionally slower, but within the noise	08:45
	so, not a free win
	at least, not for setting building
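
The percentages quoted above follow directly from the setting build times in the log. A minimal sketch of the arithmetic; the helper name `percent_change` is invented for illustration, and the numbers are the timings in seconds exactly as quoted:

```python
def percent_change(new: float, old: float) -> float:
    """Relative change of `new` versus `old` in percent (negative means faster)."""
    return (new / old - 1.0) * 100.0

# Setting build times in seconds, as quoted in the log.
current = 62.43            # this morning's build, without -flto
mid_yesterday = 66.36      # "mid yesterday's slight slowdown"
yesterday_morning = 65.97
with_flto = 62.57

print(f"vs mid yesterday:     {percent_change(current, mid_yesterday):+.2f}%")
print(f"vs yesterday morning: {percent_change(current, yesterday_morning):+.2f}%")
print(f"-flto vs no -flto:    {percent_change(with_flto, current):+.2f}%")
```

The first two values come out a little under 6% and 5.5% faster, and the -flto build is only a fraction of a percent slower, which is why it is described as within the noise.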
timotimo	thank you for measuring!	08:53
jnthn	afternoon, #moarvm	11:17
FROGGS_	hi all	11:23
dalek	MoarVM: c350fe0 | jnthn++ | src/6model/serialization.c: Toss dead macro.	11:42
timotimo	oh hey jnthn	12:26
	did you see the performance of "for" benchmarks is still less than it was at our last release?
jnthn	Hmm	12:38
	Oh, because they stupidly put parens around their ranges.
timotimo	%)
	again? :D
	time to fix that exact thing a second time
jnthn	And the code-path that stripped away such things is applied after the opt
timotimo	yeah, phase ordering problem yada yada	12:39
jnthn	Heh. "Don't write superstitious parens; it'll make your code slower" isn't such a bad thing :P
timotimo	%)	12:40
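
The phase ordering problem described above can be shown with a toy optimizer. This is only a sketch on a made-up tuple-based AST; the pass names `strip_parens` and `optimize_for_range` and the node shapes are assumptions for illustration, not Rakudo's actual optimizer:

```python
def strip_parens(node):
    """Remove redundant ("paren", x) wrappers anywhere in the tree."""
    if not isinstance(node, tuple):
        return node
    if node[0] == "paren":
        return strip_parens(node[1])
    return tuple(strip_parens(child) for child in node)

def optimize_for_range(node):
    """Rewrite a `for` over a literal range into a cheaper counted loop."""
    if not isinstance(node, tuple):
        return node
    node = tuple(optimize_for_range(child) for child in node)
    if node[0] == "for" and isinstance(node[1], tuple) and node[1][0] == "range":
        _, (_, lo, hi), body = node
        return ("counted_loop", lo, hi, body)
    return node

# Hypothetical AST for: for (1..10) { body }
program = ("for", ("paren", ("range", 1, 10)), "body")

# Optimizing first misses the range because it is still hidden behind the parens.
print(strip_parens(optimize_for_range(program)))
# Stripping first exposes the range, so the loop optimization applies.
print(optimize_for_range(strip_parens(program)))
```

Both passes are individually correct; only the order decides whether the loop gets the fast path.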
	if i want to specialize smart_numify and smart_strify in the case where they cannot directly unbox, i'd have to do a method call on them; would i need to somehow find a correct callsite to give to prepargs?	12:41
	or is there a prepargs-less method for invocation?
jnthn	No, always need that.
timotimo	in that case i'll do the "can unbox, yay" optimizations first and ignore the method call ones for now	12:42
	hum. smart_strify only tries to unbox a str if the object is concrete ... so i have to test for that fact, too	12:49
	* timotimo wonders if brrt can work on moar-jit today	12:52
	wow, the smrt_strify -> unbox opt seems to trigger very often	12:54
	(only tested nqp so far)	12:55
	ah, damn, rakudo compilation seems to stumble over it
	LOL	13:01
	damn you, debug output
	since gen-cat has been dogfooded, it put a whole bunch of "optimized a call! yay!" lines into the resulting source code %)	13:02
jnthn	Yes, fprintf(stderr,...) advised :P	13:04
timotimo	that is very true	13:10
	hm. i don't suppose we have a spesh opcode (or general opcode, really) to just take a pointer to MVMObject, reinterpret it as a pointer to MVMString and put it into a register's .s?	13:16
	like sp_get_s without the first pointer dereferencing and with an offset of 0
jnthn	Um, no, and that sounds dangerous	13:20
	What do you want it for?
timotimo	smart_strify checks the reprid of the object and if it is MVMString, it just casts	13:22
jnthn	That...uh...should never actually occur
timotimo	in that case, i'll just remove that case from smrt_strify itself :)
jnthn	Can you try removing that case?	13:23
	I really hope we don't rely on it.
timotimo	sure
	maybe i should have put a printf in there instead of removing it	13:28
	oh	13:29
	it would throw a "cannot stringify this" exception
	that's fine, then
	doesn't occur anywhere in rakudo's build
	does that seem good enough for me to commit the patch?
jnthn	Hm, well, spectest is nice but maybe do that after your other improvements	13:31
timotimo	will do
13:44 cognome joined, cognominal joined 13:46 cognominal joined
timotimo	29.947 :D	13:47
	sadly, the "advanced" smrt_strify cases don't seem to get triggered by either the build or "make test"
	will spectest now.
	did i understand correctly that we can't currently put new method calls into spesh'd code?	13:50
jnthn	I think perhaps we can, it's just tricky to deal with callsite stuff	13:51
timotimo	if the interface was more evilness-friendly, i could directly try to inline the target method :P
jnthn	But we need to look at that a bit anyway
	Well, no, we should just emit the method call and then let the normal logic look over it to decide if it's inlinable.
timotimo	but then i'd also have to find a proper spesh candidate and all that
	aye
jnthn	Composition always beats hacks.	13:52
timotimo	i wasn't very serious about that :)
	oh, lock.rakudo.moar crashes?
jnthn	Hmm
	Try it again just in case it's a one-off?
timotimo	was about to
	* jnthn needs to do more stress testing on that sort of stuff
timotimo	waiting for it to finish first	13:53
	the failure from combinations.t was already reported in #perl6 by lizmat
	although it seems like i have a crash and they had a "not ok"	13:54
	spectests are fine	13:57
	i (or the test harness) misinterpreted an exit(1) as a crash
	oh, d'oh
	i didn't save the changes in coerce.c ...
jnthn	fail	13:58
dalek	MoarVM: 35687e1 | (Timo Paulssen)++ | src/core/coerce.c: we should never depend on this working.	14:14
	MoarVM: cfafc8d | (Timo Paulssen)++ | src/spesh/optimize.c: some smrt_strify can be optimized into simpler ops like unboxing a string or unboxing num/int and coercing.
timotimo	i guess smrt_numify may be worth even more, as it's probably commonly used instead of elems on arrays and hashes	14:31
jnthn	yeah	14:32
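
The idea behind the smrt_strify commit above, as far as the discussion describes it, is to replace the generic op with cheaper ops when the specializer already knows enough about the operand. A rough sketch under stated assumptions: the `Facts` record, the function name and the tuple instruction format are hypothetical, and the op mnemonics are used only as labels, not as a claim about what src/spesh/optimize.c emits:

```python
from typing import NamedTuple, Optional

class Facts(NamedTuple):
    """What the specializer has proven about a register (hypothetical model)."""
    boxed_kind: Optional[str]  # "str", "int", "num", or None if unknown
    concrete: bool             # True if known to be an instance, not a type object

def specialize_strify(dst: str, src: str, facts: Facts, tmp: str) -> list:
    """Return the instructions to emit for `dst = stringify(src)`."""
    # Without a concreteness guarantee or a known box kind, keep the generic op.
    if not facts.concrete or facts.boxed_kind is None:
        return [("smrt_strify", dst, src)]
    if facts.boxed_kind == "str":
        return [("unbox_s", dst, src)]
    if facts.boxed_kind == "int":
        return [("unbox_i", tmp, src), ("coerce_is", dst, tmp)]
    if facts.boxed_kind == "num":
        return [("unbox_n", tmp, src), ("coerce_ns", dst, tmp)]
    return [("smrt_strify", dst, src)]

print(specialize_strify("r0", "r1", Facts("str", True), "t0"))
print(specialize_strify("r0", "r1", Facts("int", True), "t0"))
print(specialize_strify("r0", "r1", Facts("str", False), "t0"))
```

The third call keeps the generic op because, as noted earlier in the log, unboxing is only attempted when the object is concrete.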
nwc10	timotimo: did you mean to commit fprintf(stderr, "spesh'd a smrt_strify to unbox and coerce a %d\n", register_type);	14:35
	and the other sprintf?	14:36
timotimo	er, no	14:38
	:)
dalek	MoarVM: 255b466 | (Timo Paulssen)++ | src/spesh/optimize.c: didn't mean to keep the debug output around
jnthn	Seems you forgot to release the temp reg at the end?	14:43
timotimo	ah, yes. will fix that in a bit	14:49
	actually, why not right now	14:52
	(i blame the heat)
dalek	MoarVM: 8a9fd7f | jnthn++ | src/mast/compiler.c: Toss unused field.	14:54
	MoarVM: 681ec90 | jnthn++ | src/mast/compiler.c: Extract label handling code into functions. Tidies the code, and will make the upcoming refactor a little easier.
	MoarVM: a56d606 | jnthn++ | src/mast/compiler.c: Switch over to using label identity for matching. Means we can eliminate a couple of hashes, but also that labels will no longer need to have a unique name generated.
	MoarVM: 82ca33d | jnthn++ | src/mast/nodes_moar.h: Remove name from MAST_Label; now unused.	15:08
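
Matching labels by identity rather than by a generated name can be sketched as follows. This is an illustration only: the `Label` class, the `assemble` function and the instruction tuples are invented here and do not mirror the data structures in src/mast/compiler.c:

```python
class Label:
    """A nameless jump target; the object itself is the key."""
    def __init__(self):
        self.offset = None  # filled in once the label's position is known

def assemble(nodes):
    """Resolve Label operands to instruction offsets without any name lookup."""
    # Pass 1: record each label's position directly on the label object.
    position = 0
    for node in nodes:
        if isinstance(node, Label):
            node.offset = position
        else:
            position += 1
    # Pass 2: replace label operands with the recorded offsets.
    output = []
    for node in nodes:
        if isinstance(node, Label):
            continue
        op, *args = node
        resolved = [a.offset if isinstance(a, Label) else a for a in args]
        output.append((op, *resolved))
    return output

loop, done = Label(), Label()
code = [loop, ("dec", "r0"), ("if_goto", "r0", loop),
        ("goto", done), ("nop",), done, ("ret",)]
print(assemble(code))
```

Because every reference holds the label object itself, no unique name has to be generated and no name-to-label hash is needed for the lookup.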
timotimo	hm. my numify → elems + coerce_in opt doesn't seem to be correct :\	15:11
dalek	MoarVM: 4656e18 | jnthn++ | lib/MAST/Nodes.nqp: Remove name from MAST::Label and its constructor. Breaking API change; requires NQP and Rakudo updates.	15:14
timotimo	oh, it could be that the call gets tossed by the unused optimization?
jnthn	timotimo: Is that one you've committed?
timotimo	not yet
jnthn	OK, good...I need to bump
timotimo	this time i'm testing properly before i commit! :P
jnthn	Yes, you probably should be setting usages up on things you add.	15:15
timotimo	on temp registers, too?
jnthn	Yes
timotimo	that may explain it :)
jnthn	Dead code elimination will happily kill instructions involving temp registers too :)
timotimo	that fixed it, yay	15:17
	this optimization runs often
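
Why a missing usage count makes dead code elimination drop a freshly inserted instruction can be seen in a toy model. The instruction format `(op, dst, sources)`, the `usages` dictionary and the function name are assumptions made for this sketch; the real spesh bookkeeping is more involved:

```python
def eliminate_dead_code(instructions, usages):
    """Drop instructions whose result register has a usage count of zero.

    Toy model: every op is treated as side-effect free.
    """
    live = list(instructions)
    changed = True
    while changed:
        changed = False
        for ins in list(live):
            _, dst, sources = ins
            if usages.get(dst, 0) == 0:
                live.remove(ins)
                # The removed instruction no longer reads its sources.
                for src in sources:
                    usages[src] = usages.get(src, 0) - 1
                changed = True
    return live

# Replacement for `smrt_numify r_out, arr`: count the elements, then coerce.
new_code = [("elems", "tmp", ("arr",)), ("coerce_in", "r_out", ("tmp",))]

# Forgetting to count the read of `tmp` makes `elems` look unused.
buggy = eliminate_dead_code(new_code, {"arr": 1, "r_out": 1})
# Bumping the usage count of the temp register keeps both instructions.
fixed = eliminate_dead_code(new_code, {"arr": 1, "r_out": 1, "tmp": 1})

print(buggy)
print(fixed)
```

In the buggy run only the coerce survives, reading a temp register that is never written, which matches the kind of incorrect result reported before the usage bump was added.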
nwc10	jnthn: "works" on "my" machine - 2 spectests currently aren't clear	15:46
	or are flapping
15:51 zakharyas joined
dalek	MoarVM: 84b5348 | (Timo Paulssen)++ | src/spesh/optimize.c: release the temp register at the end.	15:53
	MoarVM: acaa897 | (Timo Paulssen)++ | src/spesh/optimize.c: spesh smrt_numify, bump usage counter of temp reg. this triggers especially often in combination with a MVMArray or MVMHash repr'd object and gives us a (usually optimized) elems call + a coerce_in
timotimo	after a spectest and a shower i feel confident pushing this
	nwc10: yeah, they are for me, too. but when i run them manually, they succeed :(
	a moar-jit with current master merged is at 30.5 seconds stage parse for me	16:03
	that's hardly any worse than master alone.
jnthn	Yeah. Thing is, when I tried the JIT on various hot loop stuff - even with objecty code - I was seeing a 50% or so win.	16:08
timotimo	did you try counting how often we deopt in the core setting compilation?	16:09
jnthn	Yeah. Quite a lot.	16:10
timotimo	or should i try that while you do more awesome optimization stuff? :3
jnthn	And then I managed to reduce it a good bit
	But it's not that costly so far as I can tell
timotimo	how often compared to jumping into jitted code?
jnthn	In fact, deopt from JIT is cheaper than from interpreter in terms of the cost of the deopt itself.
timotimo	ah, that count was before the recent work
jnthn	Didn't count how often we run JITted code
timotimo	fair enough, but if we deopt all the damn time, we'll end up interping all our code instead of running the jitted code :)	16:11
jnthn	Yeah.
timotimo	we could probably generate code in the jit output that counts how many opcodes we executed before we bailed due to deopt	16:12
	or we could postpone that to a bit later	16:13
	did brrt say he'd be AFK
	all weekend?
nwc10	timotimo: sometimes that means that they are badly written. IIRC one failing was to assume the current directory	16:14
jnthn	Think he said he was busy this weekend, yeah
timotimo	ah, ok
	then i don't need to wonder what's up
	turns out, that smart numify/stringify were already implemented in the jit anyway
	but i bet the spesh'd solution ends up cheaper in good cases
	does it make any sense to spesh away a "not" instruction after an instruction where we know how to negate the result by choosing another instruction? like an isnull + not_i could be just isnonnull	16:16
nwc10	that sounds like bad codegen	16:17
timotimo	.o( because the jit doesn't not_i yet )
nwc10	however, I guess that those sorts of sequences can appear as the result of inlining	16:19
timotimo	not only that	16:20
nwc10	so, "how often?" and "how costly?" "how much benefit?"
timotimo	every time the isnull is the result of one operation and the not_i is the result of another ...
	the jit bails out of 36 frames in the core setting because it sees not_i
	oh, many more of those are actually isnull_s	16:21
	more than isnull itself
jnthn	isnull_s and isnull should compile into the same assembly, surely.
	oh, no, wait
	They won't because of the VMNull thing.	16:22
timotimo	at least they are already both implemented :)	16:23
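
The "isnull + not_i could be just isnonnull" idea is a classic peephole rewrite. A minimal sketch, assuming a made-up `(op, dst, src)` instruction format, straight-line code, and adjacent instructions; the function name is hypothetical and this is not the spesh implementation:

```python
from collections import Counter

def fuse_isnull_not(instructions):
    """Fuse `isnull t, x` followed by `not_i d, t` into `isnonnull d, x`.

    The fusion is only safe when nothing else reads the intermediate `t`.
    """
    reads = Counter(src for _, _, src in instructions)
    output = []
    i = 0
    while i < len(instructions):
        op, dst, src = instructions[i]
        following = instructions[i + 1] if i + 1 < len(instructions) else None
        if (op == "isnull" and following is not None
                and following[0] == "not_i" and following[2] == dst
                and reads[dst] == 1):
            output.append(("isnonnull", following[1], src))
            i += 2
        else:
            output.append((op, dst, src))
            i += 1
    return output

code = [
    ("isnull", "t0", "obj"), ("not_i", "r0", "t0"),     # fusable
    ("isnull", "t1", "other"), ("not_i", "r1", "t1"),   # t1 is read again below
    ("set", "r2", "t1"),
]
print(fuse_isnull_not(code))
```

Only the first pair is fused; the second is left alone because its intermediate result has another reader.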
	you don't happen to know of some somewhat low-hanging optimization i could look at next? :)	16:42
	i suppose if i am to implement some feature or ecosystem-related thing instead it'd end up being "gui frontend for the debugger", which will yak-shave-reduce to "improve GTK::Simple"	16:44
jnthn	How's GTK::Simple doing these days?
timotimo	it displays windows, buttons and labels :P
	it's kinda hard to tell what's still in scope for GTK::Simple and what isn't	16:45
	and how to move things into separate modules while still maintaining compatibility between the things
	though since we got NativeCast now, there's no need to have the same class repr the OpaquePointer we're playing with
jnthn	It's not LHF, but I have pondered that CAPHASH may want to cease to exist, and we build Match objects more directly out of $!cstack	16:47
	We'd have to implement building Rakudo's ones too
	So we get less code re-use...but Match object construction is so hot path that building an intermediate data structure every time is kinda costly.	16:48
	Especially given the intermediate data structure is a hash	16:49
	And hash lookups are one of the things we spend most time doing in CORE.setting compilation.
	I'm not sure it's LHF, but it is at least "just" NQP and Perl 6 code to write :)
timotimo	oof	16:52
	commute &	16:58
	i'll have a look later :)
17:21 ventica joined 17:53 cognome joined
timotimo	now i've finished the commute and also some grocerisation	17:54
FROGGS	jnthn: that CAPHASH removal sounds like awesome	17:59
japhb	jnthn: still backlogging, so this may be resolved, but: The "superstitious parens in for loops" in the benchmarks were for three reasons: 1. Because it helps align with perl5, so I can visually see if I've typoed, 2. Because Perl 5 converts will accidentally do this all the time, and 3. Because it really shouldn't matter for performance, so if it does, I call that a bug worth catching. :-)	18:05
nwc10	m: say 6.1774e+01/6.243e+01; say 6.1774e+01/6.597e+01	18:57
camelia	rakudo-moar e036e2: OUTPUT«0.989492231299055␤0.936395331211156␤»
nwc10	so 1% less than last time, and 6.4% less than yesterday morning	18:58
timotimo	what happened since last time
	?
nwc10	I don't know.
	once upon a time it was "this week". Right now, it seems to be "this hour"	18:59
	I guess, really, it's "this morning"
jnthn	Coulda been the labels improvements	19:01
	Also timotimo++'s patches
nwc10	does perl6bench like it?	19:02
timotimo	oh :3
jnthn	perl6bench doesn't measure compilation time really
nwc10	these are mostly compilation time fixies?	19:03
	er, fixes
	they don't help more general code paths?
jnthn	My labels thing was	19:04
	timotimo's are more general.
japhb	jnthn: perl6-bench does measure compile time for each test, it just subtracts it from the run time of the test ... or did you mean, the compile time for the compiler itself?	19:14
timotimo	the latter, i believe
jnthn	No, I meant for the test...OK, I guess what I shoulda said is "doesn't appear in the graphs" - which is the right thing in many senses. :)	19:15
	Though it could be interesting to know about compile time improvements over time :)
japhb	jnthn: Just turn off the compile time ignoring	19:16
	--/ignore-compile and/or --/ignore-setup	19:17
	(Because bench defaults both to on.)
	Mind you, you'll then see the combination of compile and run time, so hmmm.
	Maybe I need a plot mode where it just shows the compile time for each test.	19:18
	(Since the compile time is in the timings file, it's just normally subtracted out at analysis time)	19:19
19:23 cognome joined 19:53 ventica joined 20:35 ventica joined 20:59 ilbot3 joined
jnthn	sigh That took some doing...	22:36
dalek	MoarVM: 3e8e534 | jnthn++ | src/6model/s (3 files): Prepare for lazy deserialization.	22:37
	MoarVM: 9539fcd | jnthn++ | src/6model/ (4 files): Start storing serialization reader in the SCRef. We'll need to keep it around for deserialization. Move cleanup to the SCRef GC.
	MoarVM: 0c30c2b | jnthn++ | src/ (3 files): Make "allocate in gen2" tracking reentrant.
	MoarVM: 7a722dc | jnthn++ | src/ (4 files): Switch deserialization to take place lazily. Now things are only deserialized on "first touch". Unfortunately, we are very touchy, as little is set up to take advantage of this. Even before looking into using it better, however, it takes another 2.5MB off the base memory of Rakudo with CORE.setting loaded.
timotimo	nice :)	22:59
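
Deserializing on "first touch" can be pictured with a small table that keeps the serialized form until an entry is actually requested. This is a sketch only: `LazyObjectTable` is a made-up name and JSON stands in for the real serialization format, which is quite different:

```python
import json

class LazyObjectTable:
    """Holds serialized blobs and deserializes each one only on first access."""

    def __init__(self, blobs):
        self._blobs = blobs
        self._objects = [None] * len(blobs)
        self._done = [False] * len(blobs)
        self.deserialized = 0  # how many entries have been materialized so far

    def __getitem__(self, index):
        if not self._done[index]:
            # First touch: pay the deserialization cost now and cache the result.
            self._objects[index] = json.loads(self._blobs[index])
            self._done[index] = True
            self.deserialized += 1
        return self._objects[index]

table = LazyObjectTable(['{"name": "Int"}', '{"name": "Str"}', '{"name": "Hash"}'])
print(table[1]["name"], table.deserialized)  # only one entry materialized
print(table[1]["name"], table.deserialized)  # second access hits the cache
```

The memory saving depends on how many entries are never touched, which is why the commit message notes that the gain is limited while most things still get touched early.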
dalek	MoarVM: 58fdbb2 | jnthn++ | src/ (4 files): A little STable cleanup. Kill two fields we don't, and won't, use. Also re-order a bit to try and get better cache access patterns.	23:10
