#moarvm on 11 October 2018 - Raku Programming Language Log

github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm Set by AlexDaniel on 12 June 2018.
brrt	\o	12:58
jnthn	o/ brrt	12:59
brrt	ohai jnthn
	I find that I'm not sure how pass-by-reference works in nativecall	13:00
	i would have expected that we'd pass a pointer to the MVMRegister in args	13:01
	but that doesn't appear to be how it works
jnthn	I don't know, alas	13:24
	nine++ probably does
Geth	MoarVM/vectorization: 8 commits pushed by (Timo Paulssen)++ - factor out profile call node creation - store time of "first ever entry" of call nodes - expose first entry time in profiler data - expose a thread's start time - call node's first entry time should be relative - introducing vectorapply for native arrays - lego-jit vectorapply - no need for unsigned int args; jit don't like them yet	14:08
timotimo	lizmat: ^- here's the op i was talking about	14:09
lizmat	timotimo: does it come with documentation in ops.markdown ?	14:10
timotimo	not yet
	my num @a = 1e0..500_000e0; my num @b = 500_000e0...1e0; my num $c = 5e0; my num @out; my $time = now; for ^500_000 { @out[$_] = @a[$_] + @b[$_] * $c; }; say now - $time; say @out[99]	14:11
evalable6	0.34655233 2499605
timotimo	use nqp; my num @a = 1e0..500_000e0; my num @b = 500_000e0...1e0; my num @c = 5e0; my num @out; my $time = now; nqp::vectorapply(@b, @c, @b, 95, 1, 64); nqp::vectorapply(@a, @b, @out, 93, 0, 64); say now - $time; say @out[99]
	those are roughly equivalent	14:12
	because 95 is mul_n and 93 is add_n
	one of them is a cross operator, the one with a 1 in between, the other is a zip operator, the one with a 0 in between
lizmat	that looks pretty cool	14:13
timotimo	what i'd like you to have a look at is:
	make @out = @a Z+ @b X* $c turn into vectorapply calls
	they currently only work for 64bit wide arrays of int and num, and if it's a cross operator the smaller one has to be a native array, too, of the right kind and size, with only one element	14:14
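For readers following along, the zip and cross modes described above can be modelled in a few lines. This is only a sketch in plain Python, not the op from the vectorization branch: `vector_apply` is a hypothetical name, the op codes 93 and 95 are the ones quoted in the chat, the sixth argument of the real call (64, presumably the element width) is left out, and rejecting unequal lengths in zip mode is an assumption.

```python
import operator

# Op codes as quoted in the chat: 93 is add_n, 95 is mul_n.
OPS = {93: operator.add, 95: operator.mul}

def vector_apply(a, b, out, opcode, cross):
    """Hypothetical pure-Python model of the behaviour described in the chat.

    cross=0: zip mode,   out[i] = a[i] op b[i]
    cross=1: cross mode, b holds exactly one element, out[i] = a[i] op b[0]
    """
    op = OPS[opcode]
    if cross:
        if len(b) != 1:
            raise ValueError("cross mode expects a one-element array")
        result = [op(x, b[0]) for x in a]
    else:
        if len(a) != len(b):  # assumption: the chat does not say what happens here
            raise ValueError("zip mode expects arrays of equal length")
        result = [op(x, y) for x, y in zip(a, b)]
    out[:] = result  # written in place, so `out` may alias an input as in the chat
    return out

n = 8
a = [float(i) for i in range(1, n + 1)]   # 1e0 .. 8e0
b = [float(i) for i in range(n, 0, -1)]   # 8e0 ... 1e0
c = [5.0]
out = []

# The plain element-wise loop, computed before b is overwritten.
expected = [a[i] + b[i] * c[0] for i in range(n)]

vector_apply(b, c, b, 95, 1)     # b = b X* c
vector_apply(a, b, out, 93, 0)   # out = a Z+ b
```

The two calls mirror the pair of `nqp::vectorapply` calls in the message above: first a cross multiply that overwrites `b`, then a zip add into `out`.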
lizmat	intriguing! :-) looks very cool
timotimo	\o/	14:15
lizmat	fwiw, I was first going to take a stab at documenting the new MAIN interface and write tests for it
timotimo	sure!
	no hurry :)
lizmat	and then I was planning to have a look at R#2360, attempting to fix nqp::p6store
synopsebot	R#2360 [open]: github.com/rakudo/rakudo/issues/2360 my %*FOO is Set = <a b c> dies
timotimo	the vectorapply version of that code can run 300 times and still finish a tiny bit faster than the for ^500_000 version	14:17
lizmat	and before all of that, first some sun / wind / cycling&
	timotimo: so you're saying that's potentially 300x as fast ?
timotimo	maybe i'll figure out soon-ish why it's even faster to have $c replaced with a 500_000 element @c array and using @c[$_] as well	14:18
	yeah, and potentially about 1.5kx faster than using Z+ and X*	14:19
	mhhh, my num @a = 1e0..500_000e0; takes about no time at all, but my num @a = 500_000e0...1e0; takes about 10 seconds; we recently optimized special cases of ... for for loops, surely we can put that into the push_all for the ... iterator, too :)	14:24
brrt	timotimo++ pretty cool work	17:08
lizmat	timotimo: afaik, ... is still a gather / take combo	17:12
nine	brrt: but....that should be exactly how it works?	17:32
	brrt: that's also why I added a getarg op for reading the value back from the args buffer
brrt	oh, really	17:37
	..... so, I don't have to add a 'copy-back-to-frame' for rw arguments	17:38
	that's good news
	that simplifies things tremendously
	nine++	17:39
nine	My initial implementation just read the value from the local with lots of assumptions about which local that might be. But that was a tiny bit too fragile ;)	17:44
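The general pattern under discussion (hand the callee a pointer to an argument slot, let it write through the pointer, then read the value back from that same slot) can be illustrated with Python's `ctypes`. This is a loose analogy only and says nothing about MoarVM's actual data structures: `RwCallee`, `double_in_place` and `arg_slot` are made-up names, and a `ctypes` callback stands in for a real native function.

```python
import ctypes

# Hypothetical stand-in for a native function that takes an rw (by-reference) integer.
RwCallee = ctypes.CFUNCTYPE(None, ctypes.POINTER(ctypes.c_int64))

def _double_in_place(ptr):
    # The callee only sees a pointer; it writes its result through it.
    ptr[0] = ptr[0] * 2

double_in_place = RwCallee(_double_in_place)

# One slot of an argument buffer, holding the caller's current value.
arg_slot = ctypes.c_int64(21)

# Pass the slot's address rather than its value.
double_in_place(ctypes.byref(arg_slot))

# After the call, read the updated value back from the same slot.
result = arg_slot.value
print(result)
```

Because the caller reads from the slot it passed, it does not need to guess which local the value originally came from, which is the fragility nine mentions.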
timotimo	lizmat: OK!	17:46
brrt	yeah, i can imagine :-)	17:47
timotimo	so i'm using nine's example profile data again, and the "paths" data for one function that appears in 522 call sites was a proud ~12 megabytes, which my program took about one and a half minutes to put together into a json blob
	with a whole lot of memory usage	17:48
	i.e. when i tried it earlier, it tried to dump core because it reached the maximum my ram had to offer
timotimo	that's not quite acceptable %)	17:48
timotimo	also, it'll be interesting to build the flame graph data when there's theoretically hundreds of megabytes of data in there	17:49
timotimo	brrt: you think the vectorization branch is an acceptable way forward? it's surely not optimal, but it's certainly faster than what our zip/cross ops currently can do	17:51
brrt	I have totally not reviewed it	17:52
timotimo	it's probably more efficient to try to do all operations on each little bunch of data?
	rather than going through all data with one operation, then through all data with another
	and it's surely wasteful to require intermediate arrays to be made	17:53
brrt	hmmmm	17:54
timotimo	though if every operation only goes from two arrays to one, i'd assume most of the time you can have at most one temporary array?
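The two strategies being weighed here can be put side by side. The sketch below is illustrative only: `two_pass` and `fused` are hypothetical names, the chunk size is arbitrary, and plain Python lists will not show the memory-traffic difference that matters in native code. It only shows that both orderings compute the same result and that the fused one needs no full-size temporary.

```python
def two_pass(a, b, c):
    """One operation over all data at a time; needs a full-size temporary."""
    tmp = [y * c for y in b]                 # pass 1: b X* c
    return [x + t for x, t in zip(a, tmp)]   # pass 2: a Z+ tmp

def fused(a, b, c, chunk=4):
    """All operations applied to each small chunk before moving on."""
    out = []
    for start in range(0, len(a), chunk):
        a_part = a[start:start + chunk]
        b_part = b[start:start + chunk]
        out.extend(x + y * c for x, y in zip(a_part, b_part))
    return out

a = [float(i) for i in range(1, 11)]    # 1.0 .. 10.0
b = [float(i) for i in range(10, 0, -1)]  # 10.0 .. 1.0
c = 5.0

result_two_pass = two_pass(a, b, c)
result_fused = fused(a, b, c)
```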
brrt	in honesty you may have exceeded my expertise :-)
timotimo	haha
	i have no expertise either, that's why i just let the C compiler do 100% of the work
brrt	scarily, I'm getting good at writing adhoc jit templates	17:57
	not the most portable of skills..
dogbert11	brrt: do you have any theories as to why some spectest files fails when run with MVM_JIT_EXPR_DISABLE=1 ?
brrt	dogbert11: nope, can you point me to the right ones?	17:58
dogbert11	brrt; try running - MVM_JIT_EXPR_DISABLE=1 ./perl6 t/spec/S05-mass/properties-block.t
brrt	huh, that's funny	17:59
dogbert11	I thought so too. quite strange
brrt	goes away with MVM_JIT_DISABLE=1	18:00
	okay, I can probably figure that out
	I'll put it somewhere on my todo list
dogbert11	++brrt
brrt	.oO( we need an inverse jit bisect )	18:01
	I need to fixup jit bisect anyway ...
	anyway, I'll have to do all that later, afk for now :-)	18:02
timotimo	oh, the cro process is still at like 3.9 gigs RSS	18:05
japhb	yikes	18:27
timotimo	oh lord, this can't be right	18:29
	the json was being created with :pretty	18:30
	that's pretty bad for a deeeeeeeply nested structure
	routine-paths in 2.7811155	18:31
	routine-paths json in 2.95350603: 263873 characters
	^- with :!pretty
	routine-paths in 2.910559
	routine-paths json in 120.8043488: 13517443 characters
	^- with :pretty
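Why pretty-printing hurts so much on deep nesting can be seen with any JSON serializer: every nesting level adds indentation to each line below it. The sketch below uses Python's standard `json` module rather than the Raku serializer from the chat, and `nested_tree` builds a made-up chain of nodes, so the numbers it prints are not comparable to the ones above. It only shows the direction of the effect.

```python
import json

def nested_tree(depth):
    """Build a single chain of nodes `depth` levels deep (made-up shape)."""
    node = {"id": depth, "children": []}
    for level in range(depth - 1, 0, -1):
        node = {"id": level, "children": [node]}
    return node

tree = nested_tree(100)

# Same data, two encodings: no whitespace at all versus indented output.
compact = json.dumps(tree, separators=(",", ":"))
pretty = json.dumps(tree, indent=2)

print(len(compact), len(pretty))
```

Both strings decode to the same structure; the extra characters in the indented form are whitespace only, and their share grows with depth.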
japhb	timotimo: When you're doing really serious vector/matrix/tensor operations, beyond a certain point runtime will be utterly dominated by memory hierarchy effects. Chunking large arrays so that all operations on a given set of data fit in fast caches makes a huge difference (consider e.g. multiplying a pair of 8k x 8k matrices).
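The chunking idea japhb describes is usually called blocking or tiling. A minimal sketch of the loop structure, assuming square-ish list-of-lists matrices and an arbitrary block size, is below. `matmul_naive` and `matmul_blocked` are illustrative names, and interpreted Python will not show the cache benefit; the point is only how the iteration space is cut into tiles while producing the same result.

```python
def matmul_naive(A, B):
    """Textbook triple loop: streams whole rows and columns per output cell."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_blocked(A, B, block=2):
    """Same arithmetic, but done tile by tile so each tile's data is reused."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, p, block):
            for k0 in range(0, m, block):
                # Work on one block-sized tile of A, B and C at a time.
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, p)):
                        acc = C[i][j]
                        for k in range(k0, min(k0 + block, m)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C

A = [[float(i * 4 + j) for j in range(4)] for i in range(4)]
B = [[float((i + j) % 3) for j in range(4)] for i in range(4)]

C_naive = matmul_naive(A, B)
C_blocked = matmul_blocked(A, B, block=2)
C_blocked_uneven = matmul_blocked(A, B, block=3)  # block size need not divide n
```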
timotimo	japhb: sadly, that means much more work :)	18:32
japhb	timotimo: Actually ... maybe not. It may be that if you want to do that sort of thing, we instead automate using one of the fast linear algebra libraries.
timotimo	true	18:33
japhb	Don't get me wrong, I think your current research is very useful. I was just answering your question earlier about vectorization of large volumes of data.
timotimo	alternatively, maybe the liboil compiler would actually be nice to put into moar	18:34
	yeah, i think i got you right :)
	TBF with the stuff i've implemented so far, i don't think matrix multiplication is particularly possible to implement	18:35
japhb	timotimo: Have you looked at PDL from the Perl 5 world?	18:36
timotimo	i have not
japhb	It's interesting just from the point of view of the things it makes easy, and the magic it does behind the scenes to make that fast-ish.	18:37
timotimo	i've looked a little into numpy	18:38
japhb	But it was not trying to do true CPU vectorization, rather just able to pump large multidim arrays into optimized C routines
timotimo	scipy has a thing that lets you write C++ code using some c++ library that does multidim arrays that you can slice every which way
japhb	It could not, for example, hold a candle to the real C/C++ fast linear algebra stuff. Still, it beat the blazes off doing things element-wise.
timotimo	last time i looked it was barely documented, barely hackable if you want very specific behaviour of the compiler, and apparently hadn't been touched in a couple of years	18:39
	got a tree with 162338 nodes	18:42
	routine-paths in 272.2697089
	oh jeez here we go
	in comparison, the stuff i pasted above had "got a tree with 5755 nodes"	18:43
	routine-paths json in 90.608918: 7433700 characters	18:44
japhb	m: say 162338 / 5755, 272.2697089 / 2.7811155
camelia	28.20816797.899461169
japhb	m: say 162338 / 5755, ' ', 272.2697089 / 2.7811155
camelia	28.208167 97.899461169
timotimo	now chrome is chugging along on the json and the react component tree
japhb	Hmmm, some nonlinear effects there, but at least not O(n**2)
timotimo	aye, you must imagine the call graph and we've got a set of leaf nodes	18:45
japhb	Are you sorting the keys? Looks like there might be an N log N effect
	(Just staring at the ratios)	18:46
timotimo	and the code goes via the parent ids towards the known roots
japhb	Ah, yeah, that would do it
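The "staring at the ratios" step can be spelled out. The sketch below takes the two node counts and the two routine-paths timings quoted in the chat and compares the observed slowdown against what linear, n log n and quadratic scaling would predict. With only two data points this cannot identify a cost model; it just brackets the observation.

```python
import math

# Figures quoted in the chat: tree size and routine-paths time for each run.
small_nodes, small_secs = 5755, 2.7811155
large_nodes, large_secs = 162338, 272.2697089

node_ratio = large_nodes / small_nodes
time_ratio = large_secs / small_secs

# Slowdown each simple model would predict for this growth in node count.
linear = node_ratio
n_log_n = node_ratio * math.log(large_nodes) / math.log(small_nodes)
quadratic = node_ratio ** 2

# Exponent p such that time grows like nodes**p between the two runs.
exponent = math.log(time_ratio) / math.log(node_ratio)

print(f"nodes x{node_ratio:.1f}, time x{time_ratio:.1f}")
print(f"linear x{linear:.1f}, n log n x{n_log_n:.1f}, quadratic x{quadratic:.1f}")
print(f"implied exponent {exponent:.2f}")
```

The observed slowdown lands between the n log n and quadratic predictions, which fits japhb's reading of "nonlinear, but not O(n**2)" and timotimo's explanation that each leaf is walked up to its root.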
timotimo	i should be able to construct an sql query that picks every "current node"'s parent rather than going node-by-node	18:46
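One way to read that idea: instead of one lookup per node, climb from all leaves at once and issue one query per tree level. The sketch below does this with Python's `sqlite3` on an in-memory database. The `calls` table, its `id` and `parent_id` columns and the tiny example tree are all assumptions made for illustration and are not the profiler's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (id INTEGER PRIMARY KEY, parent_id INTEGER)")
# Tiny call tree: 1 is the root; 4, 5 and 6 are the leaf call sites.
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3)],
)

def paths_to_root(conn, leaves):
    """Climb from all leaves at once: one query per tree level."""
    paths = {leaf: [leaf] for leaf in leaves}
    current = {leaf: leaf for leaf in leaves}  # leaf -> node reached so far
    while current:
        ids = sorted(set(current.values()))
        marks = ",".join("?" * len(ids))
        rows = conn.execute(
            f"SELECT id, parent_id FROM calls WHERE id IN ({marks})", ids
        ).fetchall()
        parent_of = dict(rows)
        next_level = {}
        for leaf, node in current.items():
            parent = parent_of.get(node)
            if parent is not None:  # a NULL parent means the root was reached
                paths[leaf].append(parent)
                next_level[leaf] = parent
        current = next_level
    return paths

paths = paths_to_root(conn, [4, 5, 6])
print(paths)
```

The number of queries then depends on the depth of the tree rather than on the number of nodes. For very wide levels the `IN (...)` list would need to be split, since SQLite limits the number of bound parameters per statement.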
diakopter	heh portable	19:58
timotimo	i'm not sure where to stop adding "vectorized" stuff. like, i think coercing an array of int to an array of num and vice versa seems very useful to have	23:29
	but coercing int or num to str ... useful for sure, but not appropriate for the vectorapply op, i don't think	23:30
