00:02 raiph joined
dalek | MoarVM: 4538f61 | jnthn++ | src/ (3 files): Cache dynlex lookups. As suggested by TimToady++, we stash them within frames, so we get lifetime management for free (including if continuations happen). We poke it a few frames down the stack at various intervals, to try and maximize the benefit. Can likely tune this a bit more yet. | 00:12 |
timotimo | nice :) | 00:14 | |
jnthn | Need to lose another 1.48s before I can say I can build Rakudo in 70s. :) | 00:15 | |
timotimo | how much is that worth? | 00:16 | |
er. | |||
how much did your last commit improve build times? | 00:17 | ||
00:17 avuserow joined
jnthn | Was about another second off the Rakudo build. | 00:19 | |
So, at least a % | |||
timotimo | sweet! | ||
you said you timed it at about 1.8% recently; so maybe you halved the time spent in dynvar lookups? :) | 00:20 | ||
jnthn | Yeah; I'll need to do a C level profile again at some point. | 00:21 | |
Wowza. Attempting to optimize junctions creates 103966 closures when compiling CORE.setting... | 00:22 | ||
timotimo | attempting? | ||
jnthn | Well, we may succeed | ||
timotimo | how did i do that :( | ||
too many non-inlined blocks? | |||
jnthn | non-inlinable | 00:24 | |
3 nested subs | 00:25 | ||
The optimizer (and I'm guilty too) has some very large methods in it. | |||
timotimo | ah, those nested subs could be un-nested and just take more arguments | ||
that would help, right? | |||
jnthn | Which aren't too maintainer friendly, but aren't exactly spesh-friendly or optimizer friendly either | ||
Well, trying an easier refactor that's probably as effective. | |||
timotimo | OK | 00:26 | |
jnthn | Basically, pull the transform into a separate method from the analysis. | 00:27 | |
Yes, that helps a lot. | 00:29 | ||
Though it's not the biggest source of issues, just the most stand-out one | |||
timotimo | how do you measure what part of the process generates how many closures? | 00:33 | |
jnthn | Patch to takeclosure in frame.c that just prints out the outer frame name. | 00:35 | |
timotimo | ah, OK | 00:37 | |
and a | sort | uniq -c | sort -n | |||
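The counting pipeline works on any one-name-per-line log; for example, with a fabricated log standing in for the takeclosure debug output:

```shell
# Fabricated stand-in for the takeclosure patch's output: one outer
# frame name printed per closure taken.
printf 'optimize_call\nvisit_op\noptimize_call\noptimize_call\nvisit_op\nannotate\n' > closures.log

# Aggregate: occurrences per frame name, least frequent first.
sort closures.log | uniq -c | sort -n
```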
00:50 colomon joined
01:48 FROGGS_ joined
01:56 cognominal joined
02:01 FROGGS_ joined
03:00 jimmyz joined
jimmyz | Stage parse : 36.933, 1.4s lower since yesterday :) | 03:00 | |
03:01 tadzik joined, ventica joined, cognominal joined
03:02 avuserow joined
xiaomiao | I wonder what the standard deviation of those benchmarks is ;) | 03:31 | |
37sec +-1 sec, that's about 3% ... that could be "noise" | |||
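xiaomiao's point can be checked directly: a few repeated runs give an estimate of the run-to-run noise, against which a 1.4s delta can be judged. The timings below are made up for illustration, not real measurements from this channel:

```python
import statistics

# Illustrative Stage parse timings in seconds (fabricated numbers).
runs = [36.9, 37.4, 36.5, 37.1, 36.8]
sd = statistics.stdev(runs)
print(f"mean {statistics.mean(runs):.2f}s, stdev {sd:.2f}s")

# A 1.4s day-over-day drop is only convincing if it clearly exceeds
# run-to-run noise; a 2-sigma threshold is a common rough cut.
print("improvement exceeds noise" if 1.4 > 2 * sd else "within noise")
```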
03:52 ilbot3 joined
03:56 ventica joined
05:18 avuserow joined
05:35 bcode joined
05:57 avuserow joined
sergot | o/ | 06:18 | |
07:19 ventica joined
07:24 cognome joined
07:30 cognome_ joined
07:50 ventica joined
08:01 ventica joined
08:13 zakharyas joined
08:18 Ven joined
masak | \o | 08:39 | |
nwc10 | o/ | ||
08:47 FROGGS[mobile] joined
08:53 brrt joined
09:06 brrt joined, brrt left
jnthn | o/ | 09:21 | |
nwc10 | OK, so one key part of testing is "don't fill the disk" | 09:31 | |
masak | heh. | 09:32 | |
09:33 colomon joined
09:52 jose__ joined
nwc10 | m: say 6.7812e+01/6.8598e+01 | 10:31 | |
camelia | rakudo-moar 89c8e4: OUTPUT«0.988541939998251» | ||
nwc10 | jnthn: that's the setting build speedup, once the disk is only 90% used | ||
jnthn | nwc10: Speedup since when, exactly? :) | 10:36 | |
nwc10 | er, last time I measure it. Which was probably yesterday morning. | ||
grammar/fingers gah | 10:37 | ||
I'm going to measure parrot performance again, to see if it gained more | 10:38 | ||
11:05 carlin joined
dalek | MoarVM: b6a9cad | jnthn++ | src/core/frame.c: Fix an uninitialized variable bug. | 12:20 |
12:22 klaas-janstol joined
12:42 oetiker joined
dalek | MoarVM: e92aa36 | jnthn++ | src/6model/ (13 files): De-virtualize most reader functions. No point to call the same thing every time through a function pointer. | 12:53 |
MoarVM: 9a3a96d | jnthn++ | src/6model/serialization.c: Bump minimum serialization format version. This in turn enables us to assume we have varints in the thing we are reading, which we have for quite a while now. |
MoarVM: 6da5b90 | jnthn++ | src/6model/ (9 files): De-virtualize read_var_int. |
MoarVM: f55e682 | jnthn++ | src/6model/ (14 files): De-virtualize serialization write functions. Again, the abstraction was unused and unrequired. | 13:03 |
nwc10 | m: say 6.597e+01/6.7812e+01 | 13:55 | |
camelia | rakudo-moar 4d347f: OUTPUT«0.972836666076801» | ||
nwc10 | er, so that's 2.5% speedup since this morning | ||
FROGGS[mobile] | O.o | 14:06 | |
14:15 zakharyas joined
[Coke] | does the .msi have rakudo-moar in it? | 14:36 | |
so, I have someone who is a cpan module author, who has written XS stuff, has hacked on perl core in the past... and he's too intimidated to use perl 6. | 14:37 | ||
timotimo | to *use* it? | ||
interesting, should be a good "test subject" :) | |||
[Coke] | even to get a copy setup to play with. | 14:38 | |
timotimo | he's on windows, yeah? | ||
froggs had a rakudo star moarvm msi release candidate at one point | |||
[Coke] | well, step one, we need a better story on perl6.org. | ||
timotimo | nobody tested it, so it disappeared again | ||
btyler | perl6 is extremely intimidating at first, because the vast majority of the code you encounter 'casually' is straight from the core rakudo crowd | 14:39 | |
and that code tends to be rather dense, in the interest of maximally demonstrating power in minimal space | 14:40 | ||
[Coke] | I think we could borrow some ideas from the mojolicio.us site. | ||
btyler | most perl 5 code you might encounter randomly is more or less baby perl | ||
[Coke] | btyler: he's not even at code. too many options before that. | 14:41 | |
btyler | ah, sorry, projected from my own experience too much :) | ||
timotimo | that does happen, yeah :( | 14:44 | |
i thought about giving perl6.org an "express lane" | |||
[Coke] creates a playground to test with... | 14:47 | ||
timotimo | what is this "playground"? :) | ||
[Coke] | a fork. | ||
timotimo | ah, of course | 14:48 | |
[Coke] trips over the prereqs. whoops. | 14:56 | ||
hoelzro | [Coke]++ | 15:00 | |
[Coke] | ... I thought I was in #perl6 this whole time. | 15:05 | |
timotimo | ah | ||
[Coke] | whoops | ||
15:18 ventica joined
dalek | Heuristic branch merge: pushed 16 commits to MoarVM/moar-jit by jnthn | 15:29 | |
jnthn | brrt: Updated moar-jit to master, are confirming it works. :) | ||
timotimo | "are confirming"? | 15:35 | |
jnthn | *after | ||
timotimo | ah, excellent! | ||
and even jit-moar-ops is in there | |||
things are looking mighty fine :) | 15:36 | ||
jnthn | Except that things are slower with the JIT enabled... | 15:37 | |
timotimo | yeah | 15:38 | |
probably just spending too much time aborting frames, still? | |||
jnthn | Not sure yet | 15:39 | |
Seeing if I can discover anything. | |||
timotimo | have you counted how often the jit-invocation opcode got hit? | ||
dalek | MoarVM/moar-jit: 3f22397 | jnthn++ | Configure.pl: Make dynasm rule work on nmake. | 15:46 |
MoarVM/moar-jit: bafbc3b | jnthn++ | src/jit/emit_win32_x64.c: Win32 JIT output was behind. |
jnthn | Oddly, my profiler claims that we spend 6% of the time in JITted code, but the time spent in the interpreter only goes down by 1% | 16:00 | |
timotimo | oh, huh? | 16:02 | |
but the jitted code ought to be at least a bit faster, right? | 16:03 | ||
hm, except | |||
if gcc strongly optimizes the interpreter loop, maybe it handles moving stuff from register to register directly instead of going through our locals storage? | |||
i don't quite see how that would be doable without "unrolling" the interpreter loop, though | 16:04 | ||
16:04 ventica joined
lizmat | btyler / [Coke] : TheDamian gave a nice example of how he ported a perl 5 utility of his to perl 6 | 16:07 | |
at OSCON, wonder where that code lives nowadays | |||
japhb | lizmat: Is there a video of that? | 16:25 | |
timotimo | i want to know, too | ||
lizmat | yes, check out OSCON videos :-) | ||
japhb | 2014? | ||
timotimo | jnthn: does the jit dump the generated bytecode to files, perhaps? | 16:29 | |
Got negative offset for dynamic label 6 - i wonder where that comes from? | 16:30 | ||
jnthn | Not by default, afaict | 16:31 | |
timotimo | even with a jit log i get 32.407 for stage parse | 16:32 | |
that's not too bad, is it? | |||
jnthn | If you set MVM_JIT_DISABLE=1 here, it comes out slower than with JIT, though. | 16:33 | |
uh, faster than with JIT | |||
timotimo | hold on. | 16:34 | |
only about 0.3 seconds | 16:35 | ||
hm. maybe 0.5 | 16:36 | ||
16:39 cognome joined
timotimo | 788 frames compiled | 16:39 | |
japhb | I'm not sure we can expect the JIT to be loads faster than spesh until we move from "the easy way that works" to "optimizing all the cycles". A JIT is an expensive thing, and you have to win it back with seriously tuned output. | ||
timotimo | 882 bails | ||
japhb | Especially while the execution flow has to bounce in and out of JIT land | ||
timotimo | sp_findmeth is still the king | ||
with 271 | 16:40 | ||
(probably because of much improved bytecode? maybe we have less frames all-in-all now?) | |||
japhb | Getting it working with just neutral performance v. spesh is already a good thing, because it would mean the generated code is enough faster to make up for the cost of generating it. | ||
timotimo | yes | 16:41 | |
japhb | oh, timotimo: did you look at the flame chart info I sent you in #perl6 earlier? | 16:42 | |
timotimo | yes, pretty! | ||
japhb | Man, I want that for my Perl 6 code .... | ||
timotimo | well, with the "perf" line from that one blog post you can already get that for the c-level stuff | 16:43 | |
16:45 cognominal joined, cognome joined
jnthn | Thing is that it's hard to explain it as "JIT takes time", when my profiler is telling me 0.1% of the time is spent doing that. | 16:46 | |
timotimo | hm. how does that measure time spent in c functions called from the jit? | 16:47 | |
oh, that number is for "jitting frames" | |||
jnthn | yES | 16:48 | |
*yes | |||
I'm just wondering if it's because CORE.setting's deopt count is epic. | |||
timotimo | how come we have "loadlib" ops in "name", "type", "box_target", "positional_delegate" and "associative_delegate"? | ||
jnthn | And falling back out of the JIT when deopting is more expensive than a switch-code-in--interpreter deopt. | 16:49 | |
timotimo | and has_accessor? | ||
jnthn | timotimo: um...not sure I follow? | ||
timotimo | in the jit bail log i see a bunch of failures with the loadlib opcode | ||
i ... don't think i understand what it does | |||
ah, that op would expect to hit the cache a bunch of times | 16:50 | ||
i hope the lock contention isn't too bad on that when we get to multithreaded apps. but i don't even know under what circumstances loadlib opcodes are generated | 16:51 | ||
jnthn | loadlib is hot? | ||
timotimo | don't think it is | ||
jnthn | That'd be...odd | ||
timotimo | just 9 bails | ||
ah, loadlib is probably just used to get a handle to a library and then findsym would be used to get at whatever symbols it'd expose | 16:52 | ||
that sounds like something that could spesh well. | |||
jnthn | What are you seeing loadlib in? | 16:54 | |
timotimo | jit bail log | ||
jnthn | For? | ||
timotimo | the core setting | 16:55 | |
don't let me distract you, it's probably nothing | |||
oh, that could be the methods of the Perl6::Compiler | 16:58 | ||
jnthn | timotimo: Did you do some work on reducing guards at some point? | 17:03 | |
origin/split_get_use_facts <- was that pending review? | 17:04 | ||
17:08 FROGGS joined
FROGGS | o/ | 17:08 | |
jnthn | o/ FROGGS | ||
TimToady | \o | 17:14 | |
carlin | ∿ | 17:15 | |
17:32 colomon joined
dalek | MoarVM: 9d377a3 | (Timo Paulssen)++ | src/ (3 files): split get_facts and use_facts from get_and_use_facts. | 17:45 |
MoarVM: be8cfdf | (Timo Paulssen)++ | src/spesh/optimize.h: fix teh build |
MoarVM: b57061e | jnthn++ | src/spesh/osr.c: Ensure OSR-triggered optimize is used next invoke. |
MoarVM: 8df127a | jnthn++ | src/ (3 files): Merge remote-tracking branch 'origin/split_get_use_facts' |
MoarVM: 49f19ca | jnthn++ | src/spesh/log.h: Tweak spesh log run count. Bump minimum bytecode version to 2. |
jnthn | timotimo: merged the branch, thanks :) | 17:46 | |
timotimo | oh, that | 18:53 | |
nice :) | |||
nwc10 | Result: PASS | 19:01 | |
jnthn | Nice. Time to break more stuff :P | 19:14 | |
nwc10 | other people could just write more tests | 19:15 | |
timotimo | jnthn: about the loadlib thing i said earlier: there's a bunch of frames that look exactly like this: gist.github.com/timo/9e49a3806f02857a484f | ||
jnthn | What on earth... | 19:16 | |
[Coke] | do we have a pic of some kind somewhere to show the flow of a program through rakudo when it's on Moar? (esp. with the new spesh/jit stuff?) | 19:17 | |
timotimo | my thoughts exactly. | ||
jnthn | No. If you're lucky I might draw one for my YAPC::EU talk though :) | 19:18 | |
[Coke] | jnthn: perfect, that'd be fine! | 19:20 | |
DAMMIT, it's in Sofia!? | |||
I have free beer waiting for me in Sofia! | 19:21 | ||
... I cannot remember the name of the guy who owes me the beer. *sadface*. it's been too long. | |||
timotimo | jnthn: what's keeping us from closing the loop on the "put argument names into callsites" optimization? | 19:22 | |
jnthn | timotimo: No much; it's just fiddly and annoying to do and will have a fairly low ROI | 19:23 | |
timotimo | OK then | 19:24 | |
timotimo pushes it further to the back :P | 19:25 | ||
jnthn: would you be interested to sketch out ideas for how to turn spesh into a profiling thingie in the future? | 19:27 | ||
19:29 ventica joined
carlin | [Coke]: ahh, so that's why rakudo 2014.07 is codenamed Sofia | 19:29 | |
19:32 FROGGS joined
nwc10 | m: say 6.636e+01/6.597e+01 | 19:35 | |
camelia | rakudo-moar 085ab9: OUTPUT«1.00591177808095» | ||
nwc10 | slight negative speedup since lunchtime. | 19:36 | |
jnthn | Hmm | ||
Wonder what's to thank for that... | |||
nwc10 | but, given I've had repeatable speed diferences depending on the order that object files are linked | ||
there is some level of insanity in performance metrics | |||
dalek | MoarVM: 0043778 | jnthn++ | src/ (3 files): Split out part of frame deserialization. The split out part will be able to happen lazily, the first time we need it. (At present that won't be much of a win as we touch many of the frames at startup to install static lexical information; the plan is to move this information into the bytecode file also). |
timotimo | nwc10: maybe we should start putting -flto into our gcc commandlines? | 20:02 | |
jnthn | timotimo: How much difference does it make? | 20:03 | |
nwc10 | I have no good idea about that | ||
timotimo | haven't measured yet | ||
jnthn: that commit above combined with the plan you mention in it ... would that make a difference for memory usage? | 20:05 | ||
like, not using 99% of the frames in core setting would free up a bit of memory? | |||
jnthn | timotimo: That's the hope, yes | 20:07 | |
timotimo: And maybe a bit of a startup saving too | |||
timotimo | i'd like that a whole lot | ||
dalek | MoarVM: 0098c0c | jnthn++ | src/ (5 files): Preparations for lazy frame deserialization. | 20:53 |
MoarVM: cdda218 | jnthn++ | src/core/bytecode.c: Switch on lazy frame deserialization. Or at least, the parts we can easily get away with putting off until later. While it needs further work to take further advantage, NQP shows a 2.2% and Rakudo shows a 1.4% memory reduction for the empty loop program. |
timotimo | 1.4% would be about 2 megabytes? | 20:54 | |
jnthn | Yeah, just short of | 20:55 | |
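A quick cross-check of the arithmetic above: if a 1.4% reduction comes to just short of 2 MB, the implied baseline for the empty loop program is roughly:

```python
# Back-of-envelope check of the figures in the conversation above.
saving_mb = 2.0     # "just short of" 2 MB
fraction = 0.014    # Rakudo's 1.4% reduction
print(f"{saving_mb / fraction:.0f} MB")  # ~143 MB baseline
```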
21:07 zakharyas joined
21:23 btyler joined
dalek | MoarVM: c65b2a6 | jnthn++ | docs/bytecode.markdown: Spec static lexical values table in bytecode. | 21:34 |
MoarVM: 9ba5d15 | jnthn++ | src/mast/compiler.c: No longer need to support Parrot cross-compiler. It's almost certainly broken beyond repair to cross-compile from Parrot to Moar anyway, so no need to keep these last bits around. | 22:03 |
MoarVM: ac33547 | jnthn++ | lib/MAST/Nodes.nqp: Update MAST::Frame to hold static lex values. |
MoarVM: c0984eb | jnthn++ | src/ (4 files): Write static lex values; read but don't apply them | 23:25 |
MoarVM: e64c5eb | jnthn++ | src/core/bytecode.c: Read in static lexicals. |
MoarVM: f25affb | jnthn++ | src/mast/nodes_moar.h: MAST nodes can be identified by exact type. | 23:27 |
timotimo | oh, that ought to help a lot | 23:30 | |
we do istype on mast nodes all the time | 23:31 | ||
oh, that's only for inside the mastcompiler | |||
but it should still help | |||
jnthn | It's a small improvement...the cache-only istype is quite cheap anyway | ||
timotimo | #define EMPTY_STRING(vm) (MVM_string_ascii_decode_nt(tc, tc->instance->VMString, "")) | 23:32 | |
we have a per-tc (or per vm?) empty string nowadays | |||
jnthn | per vm | ||
where on earth do we use that macro.. | |||
oh, once per compilation | |||
no big saving | |||
timotimo | ./src/mast/compiler.c: hll_str_idx = get_string_heap_index(vm, ws, EMPTY_STRING(vm)); | ||
jnthn | but yeah, feel free to tweak it | ||
timotimo | oke | 23:33 | |
dalek | MoarVM: ff15814 | (Timo Paulssen)++ | src/mast/nodes_moar.h: we can use the vm's empty string constant here. | 23:37 |
timotimo | should i perhaps teach the ascii encoding about strlen(0) strings re-routing them to the global empty string constant if it exists? | 23:39 | |
jnthn | I think they are widely interned... | 23:41 | |
And utf8 would be a better one to teach it | |||
timotimo | mhm | 23:42 | |
jnthn | Fun fact: somewhere in Grammar.pm is a frame with 612 labels | 23:43 | |
timotimo | oh, cute | ||
is that after inlining? | |||
jnthn | No! | ||
timotimo | oh wow! | ||
jnthn | Well, aside from NQP's block flattening of course. | 23:44 | |
timotimo | seems pretty jumpy | ||
jnthn | Yeah | ||
Well, I'm pondering some MAST::Label changes. | |||
Today, we always make a string name for a MAST::Label, passing it to its constructor | |||
timotimo | could be integers, too, right? | 23:45 | |
jnthn | However, we never - afaik - in the compiler make two MAST::Labels with the same identifier | ||
Well, they could be integers, yes. | |||
The alternative is that they just work by object identity | |||
Which I believe would work with the current codebase. | |||
Saving 8 bytes per MAST::Label | |||
timotimo | hey, with jit enabled and latest master i get 30.5 seconds stage parse on my laptop :3 | ||
oh, even better | 23:46 | ||
jnthn | But I was then thinking "hm, I have no hash key" | ||
And wondering what happens if I make a linear scan of the labels. | |||
It'll be a C array so not *too* bad. | |||
timotimo | even if you have a frame with 612 labels? | ||
jnthn | A hash may be O(1) but the constant overhead isn't automatically cheap. | 23:47 | |
Well, that's an extreme/rare case. | |||
timotimo | that's right | ||
jnthn | Most frames are tiny. | ||
We might lose out on the odd extreme one. | |||
timotimo | so at least the linear search is going to be limited to each frame individually | ||
jnthn | Right. | ||
timotimo | that does sound sensible; do you have a histogram of frame sizes or something? | ||
jnthn | No | ||
I just looked for maximum ones | |||
But I'm quite used to reading spesh logs :) | |||
And labels <=> basic blocks are clsoe | 23:48 | ||
*close | |||
timotimo | ah, yes | ||
i've not seen any with 4 digit BBs in core setting :) | |||
jnthn | I wonder how many labels we create in compilation... | ||
m: say 21160 * 8 | 23:52 | ||
camelia | rakudo-moar fb0521: OUTPUT«169280» | ||
jnthn | That's how much we'd save on MAST::Label directly | ||
But we save all the strings too | |||
m: say 21160 * (6 * 8 #`(string size) + 10 #`(conservative label length estimate) * 4 #`(per grapheme)) | 23:54 | ||
camelia | rakudo-moar fb0521: OUTPUT«1862080» | ||
jnthn | Not so much I guess. | 23:55 | |
Though there's at least 1 intermediate string too, which is the numification of the number stuck onto it. | |||
Well, may give it a go tomorrow to see how it helps | 23:56 |