#moarvm on 19 May 2018 - Raku Programming Language Log

github.com/moarvm/moarvm \| IRC logs at irclog.perlgeek.de/moarvm/today Set by moderator on 18 May 2018.
00:51 shareable6 joined
01:56 ilbot3 joined
03:22 greppable6 joined, reportable6 joined, notable6 joined, quotable6 joined, committable6 joined, coverable6 joined, evalable6 joined, bloatable6 joined
03:23 bisectable6 joined, releasable6 joined, nativecallable6 joined, unicodable6 joined, benchable6 joined, statisfiable6 joined, squashable6 joined, undersightable6 joined, shareable6 joined
05:11 robertle joined
05:52 robertle joined
06:12 shareable6 joined
08:53 domidumont joined
09:00 domidumont joined
09:28 FROGGS joined
09:47 shareable6 joined
11:35 shareable6 joined
12:07 zakharyas joined
12:57 shareable6 joined
13:45 Geth joined
14:45 ilmari[m] joined
15:36 committable6 joined, reportable6 joined, coverable6 joined, bloatable6 joined, releasable6 joined, unicodable6 joined, statisfiable6 joined, squashable6 joined
16:49 Kaiepi joined
17:36 shareable6 joined
18:10 shareable6 joined
18:51 Ven`` joined
19:51 shareable6 joined
21:00 Summertime joined
21:19 Summertime left
samcv	MasterDuke: mill arch is pretty interesting	21:21
timotimo	jnthn: do you think adding a bitmap for "which slots in the sc have been deserialized so far?" to make the loop over the whole array to find a given object faster?	21:44
	s/"?"$/ is a good idea?/
jnthn	Hm, is that expensive?	21:48
	I figured it'd only happen on the first lazy deserialization of something, and that we tend to then deserialize entire subtrees of things
timotimo	well, you know how we sometimes have a cached sc idx inside an object?	21:50
	in install_core_dists we hit that 20% of the time
	we do find_obj_idx or what it's called 88k times inside that script	21:51
	i'm printing out any number of >= 8 consecutive nulls and i get numbers ranging up to ~450 sometimes	21:52
	if we can sometimes skip this many items outright, we'd surely have much less cache evictions	21:53
jnthn	Hmm, but why are we hitting the linear search if the object already exists? :S	21:54
	I thought if something was deserialized then we'd always have index cached	21:55
timotimo	i didn't see through that code
jnthn	I found the odd place where we didn't before
	But I guess we must be missing another one
timotimo	i had no idea what the requirement was for an object to have its scidx cached :)
	but "an object should always have its scidx cached" is a reasonable explanation and i could go digging	21:56
jnthn	afaik there's not a reason that it can't, I'm guessing we must just somewhere "forget"
timotimo	rr shall rescue me	21:59
	oh, does anything speak against caching the index right then and there while we're doing the linear search?	22:00
jnthn	Hmm...if it can be done without thread headaches	22:07
timotimo	racing to install the same value from two threads could be fine	22:09
	hm, but if another thread is reading half the value, that could be not so great
	OK, it looks like the change makes no difference	22:10
	timing wise, that is	22:11
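
[Editor's sketch] The bitmap idea timotimo raises at 21:44 can be illustrated roughly as follows. This is not MoarVM code: the ObjTable struct and the table_set and table_find functions are hypothetical names invented for this note, and the real serialization context layout may differ. The point is only that a whole word of 64 empty slots can be skipped with one comparison instead of 64 pointer loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a serialization context's object table. */
typedef struct {
    void    **slots;    /* NULL means "not deserialized yet" */
    uint64_t *present;  /* one bit per slot, set once the slot is filled */
    size_t    count;    /* number of slots */
} ObjTable;

/* Store an object in a slot and mark the slot in the bitmap. */
static void table_set(ObjTable *t, size_t idx, void *obj) {
    assert(idx < t->count);
    t->slots[idx] = obj;
    t->present[idx / 64] |= (uint64_t)1 << (idx % 64);
}

/* Linear search for obj; returns its index or -1.
 * Bitmap words that are entirely zero are skipped without
 * touching the corresponding 64 slots. */
static ptrdiff_t table_find(const ObjTable *t, const void *obj) {
    size_t words = (t->count + 63) / 64;
    for (size_t w = 0; w < words; w++) {
        uint64_t bits = t->present[w];
        if (bits == 0)
            continue;
        size_t base = w * 64;
        for (size_t b = 0; b < 64 && base + b < t->count; b++) {
            if (((bits >> b) & 1) && t->slots[base + b] == obj)
                return (ptrdiff_t)(base + b);
        }
    }
    return -1;
}
```

The sketch is single-threaded; the thread-safety concerns jnthn mentions at 22:07 are not addressed here.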
samcv	while taking a nap today i thought of an optimization for string eq. we can compare the cached hash codes (if they exist) and quickly reject non matching strings	22:18
timotimo	oh, we don't do that yet?	22:22
	sounds like a good idea in any case
jnthn	Just make sure to exclude the 0 no-hash-yet sentinel :-)	22:25
Geth	MoarVM: 4152021ff8 \| (Samantha McVey)++ \| tools/update-changelog.p6 [Tools] Add update-changelog.p6 tool	22:31
	MoarVM: d634d24cf3 \| (Samantha McVey)++ \| src/strings/ops.c Instantly return 0 with string eq if cached hash code doesn't match If the cached hash codes exist, we are able to quickly return 0 without having to manually compare the two strings. For some work loads I could see this having a fair impact.
samcv	spectest pass. and commited
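
[Editor's sketch] The fast-reject idea discussed above, including jnthn's caveat about the 0 sentinel, might look roughly like this. It is not the code from commit d634d24cf3: the Str struct and str_eq function are hypothetical and use plain byte strings rather than MVMString. The assumption, taken from the chat, is that a cached hash of 0 means "not computed yet".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simplified string with an optional cached hash. */
typedef struct {
    const char *data;
    size_t      len;
    uint64_t    cached_hash;  /* 0 is the "no hash yet" sentinel */
} Str;

/* Returns 1 if the strings are equal, 0 otherwise. */
static int str_eq(const Str *a, const Str *b) {
    assert(a != NULL && b != NULL);
    if (a->len != b->len)
        return 0;
    /* Only trust the hashes when BOTH have actually been computed. */
    if (a->cached_hash != 0 && b->cached_hash != 0
            && a->cached_hash != b->cached_hash)
        return 0;
    /* Equal hashes (or a missing hash) prove nothing: compare the data. */
    return memcmp(a->data, b->data, a->len) == 0;
}
```

Matching hash codes never short-circuit to "equal", since distinct strings can collide; the shortcut only ever rejects.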
timotimo	it could be beneficial to just calculate some hash codes "for fun" now	22:32
	to increase the occurence of hash codes not being 0
samcv	"for fun" lol	22:34
timotimo	like, when we do GC and some threads are done with their work already, but some other thread is still GCing away ... maybe grab some random strings and calculate some hash codes?
	do we do anything smart with hashes calculated from strands btw?
samcv	hah
	like what?
	also no we don't	22:35
timotimo	if we have two long strings in a strand, can we re-use the first part's hash code (if the whole string is a part of it)	22:36
	hm, the hash code potentially includes the length of the string, eh?
	that would make it useless for that purpose i suppose
	we probably don't want to have a hash function that you can just combine stuff together with, maybe	22:37
samcv	timotimo: yeah that basically makes it easily attackable	22:40
timotimo	right
	if we can cheat, so can the attacker
samcv	not sure if we want to rekey a hash if we have too many hash conflicts	22:41
	i mean we probably should ideally
timotimo	i don't yet know how rekeying can work if we want to keep cached hash codes
samcv	and then you can also worry about timing attacks
	timotimo: well you just ignore them	22:42
timotimo	is the attack we expect that the attacker gets the full hash code right?
	or just the part we use for buckets?
samcv	just the bucket part i suppose
timotimo	in that case we can perhaps just change what part of the hash code we use to decide on the bucket?	22:43
samcv	hm	22:44
	i guess we could like reverse it?
timotimo	start at the end instead of the beginning?
samcv	hm
	i mean we'll have a 64 bit value so that has a lot of surface area. we could rotate it maybe	22:47
timotimo	rotate sounds good
samcv	surface area as in, we don't really need a full 64 bits for bucket determination
	but it makes it slower to bruteforce and should be trivial to do a rotation on it and be able to rekey in case there is an attack	22:48
22:48 ZofBot joined
samcv	and allow us to not have to recompute all the strings hashes again	22:48
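
[Editor's sketch] The rotation idea above can be illustrated as follows. The names rotl64 and bucket_for are hypothetical and this is not MoarVM's hash table code; it assumes a power-of-two bucket count. The cached 64-bit hash code stays untouched, and only the rotation amount decides which bits select the bucket, so "rekeying" means picking a new rotation rather than rehashing every string.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit value left by r bits (r is reduced modulo 64). */
static uint64_t rotl64(uint64_t v, unsigned r) {
    r &= 63;
    return r ? (v << r) | (v >> (64 - r)) : v;
}

/* Pick a bucket from a cached hash code.
 * rotation:    the table's current rotation amount
 * bucket_bits: log2 of the bucket count, in the range 1..63 */
static uint64_t bucket_for(uint64_t cached_hash, unsigned rotation,
                           unsigned bucket_bits) {
    assert(bucket_bits > 0 && bucket_bits < 64);
    uint64_t mask = ((uint64_t)1 << bucket_bits) - 1;
    return rotl64(cached_hash, rotation) & mask;
}
```

With 256 buckets, a hash of 0xFF00000000000000 lands in bucket 0 at rotation 0 and in bucket 0xFF at rotation 8, without the hash itself being recomputed. Whether a rotation alone gives enough protection against a determined attacker is not settled in the discussion.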
timotimo	hm	22:59
	we're currently telling people the order of items in a hash will change between runs of the same program
	are they expecting the order will not change on its own thereafter?
	naah, that'd already happen when hashes increase in size
samcv	well it won't change unless they add things to it
	and they already reorder on bucket resizing
timotimo	which it already did before anyway
	right
samcv	though they technically shouldn't rely on that either	23:00
timotimo	there isn't as much thinking before my talking today as there sometimes is
samcv	also could be interesting if each hash had its own rotation
timotimo	every MVMHash gets its own Quaternion	23:01
samcv	quaternion?
	what
timotimo	in 3d programming, they're used to make things rotate in a way you'd expect
	wow, the wikipedia has a bunch of illustration and none of them seem enlightening	23:02
samcv	on 3d programming?	23:03
timotimo	damn, a gamasutra article entitled "rotating objects using quaternions". it starts "Last year may go down in history as The Year of the Hardware Acceleration". it is from 1998
	btw, i don't know much about 3d graphics or 3d programming or whatever, i've just picked this snippet up somewhere	23:05
samcv	also not sure if we need to hide the order of objects in a hash table or not
MasterDuke	sounds about right, i think i got my first 3d video card around 1996 or 1997
samcv	i.e. by randomizing which buckets we iterate through first
timotimo	my first 3d-ish card was a matrox mystique, but no clue if the original or the 220 version	23:06
	the latter was released 1997 apparently
MasterDuke	mine was a canopus pure 3d, a 3dfx voodoo 1 (but with 6m ram, 2m more texture memory than the reference version)
	i could finally run jedi knight at 640x480!	23:07
TimToady	my first graphics processor was the blitter on an Amiga 1000 :)	23:08
MasterDuke	hm, i'm not sure i've ever used an amiga. certainly heard/read much about them though	23:09
timotimo	i never had any amiga, or even commodore or atari or what have you. the first computer i remember using was either a 386 or a 486, possibly the latter	23:10
jnthn	Today my wife was trying to install some smartphone app that wanted over 400MB of space, which seemed huge given what it was supposed to do. I pointed out this was 4 times more space than the entire disk space of my family's first home computer (a 486) that I programmed on. The BBC micro that was the first machine I programmed on didn't even have a hard disk. :-)	23:12
TimToady	neither did the Amiga 1000
	unless you count floppies...	23:13
jnthn	I figure floppies are by definition not hard. :P
TimToady	depends on how floppy your definition is, I suppose...
timotimo	in order to try to appreciate machines of the pre-timotimo-era i'm watching The 8bit Guy (formerly The iBook Guy, and additionally 8bit keys) on You the Tube	23:14
jnthn	Hm, actually, I'd always thought "hard disk" was just the opposite of "floppy disk", and never considered if that was the real reason for the naming :)
TimToady	it's a bit of a retrynym, I suspect
	*retro
timotimo	hm, is a solid state disk just the opposite of a fluid community cube?	23:15
	what would you call the opposite of a disk
MasterDuke	no, of a companion cube
TimToady	join the Flat Disk Society today!
timotimo	actually, disc and disk aren't the same thing
TimToady	discs have grooves :)	23:16
timotimo	amusingly, in german it's called Festplatte, which you could wrong-translate as a thing you put lots of food on to serve at some kind of fest/party/event
	wrangslate?
samcv	jnthn: when i was with lizmat i spitballed my ideas on implementing MVMString that has a feature to not normalize	23:17
	and it seems pretty doable with mostly minor modifications to our functions
jnthn	samcv: Hmm, too bad I wasn't there. :-) It had occurred to me that MVMString might want to be the thing behind Uni though
TimToady	though Uni is just differently normalized...	23:18
samcv	i actually did a proof of concept sorta thing
	i added a nqp op that converted a normal mvmstring into a non-normalized type
jnthn	Though my idea was to have multiple types based on the MVMString REPR so that we can use type specialization to strip out the switching over "what kind of string is this"
samcv	and added a setting to one of the mvmstring struct
jnthn	I think I'd rather shuffle that setting type-wards for the reason just mentioned :)
	The thing that worries me is the binary operations	23:19
samcv	binary operations?
timotimo	so MVMString gets a REPR_data?
jnthn	timotimo: Yes
samcv	also that would mean having to write new functions for every single current function?
jnthn	samcv: As in, those that have multiple strings as the input
samcv	ah
	well it works as long as both are of the same string type
	which i demonstrated in my Proof of concept i wrote	23:20
	string eq etc, i just had the second string convert its type to the first string's type
timotimo	you can always concatenate into a "dirty" type, i.e. mixed normalization modes
jnthn	It doesn't mean having to write new functions if they do the same thing
samcv	timotimo: well no. i didn't allow that
jnthn	It's possible that at MoarVM level it just blows up if there's a type mismatch
timotimo	then you have to decide if some kinds of normalizations are infectious compared to others
samcv	it would convert the second item in the concatenation to non-normalized type or normalized depending on the first one	23:21
jnthn	And we handle that up at Rakudo level
	otoh maybe that's inefficient
	Since MVMString is also immutable
samcv	which?
TimToady	privileging the first argument is a bit non-p6-y
samcv	well lizmat though that NFG is infectious
jnthn	Indeed
samcv	so maybe that's what i implemented actually	23:22
jnthn	Infections NFG could work
timotimo	that's (?) why we give all Int ops a type to box stuff into
jnthn	There's all kinds of tricky though
samcv	in any case, i have much more confidence of this being doable
timotimo	also it makes me just a little uncomfortable that the slice reprop just takes self's type
jnthn	Like, if we do $str.split($uni), are the results Str or Uni?
	If we do $uni.substr(1, 2) are those units graphemes or codepoints?	23:23
timotimo	and also, when do we consider a part to match if the split needle is explicitly Uni rather than Str
jnthn	And what does it return?
samcv	jnthn: it's the $str's type
jnthn	Do we have a .subuni(1, 2) for the other thing?
timotimo	not only "match inside a grapheme", but also how to handle different normalization forms of the same thing
samcv	no
timotimo	etc etc
samcv	jnthn: we have substr but it uses uni semantics
	well the data type is not normalized. so it just does substring identically	23:24
jnthn	samcv: That doesn't go so well with the "operations have consistent semantics" design rule, though
TimToady	I think if people want to split graphemes they'd better explicitly force Uni first
samcv	nothing changes. it just makes a strand or a new string from point a to b
	jnthn: well at least on moarvm it's that simple
jnthn	We probably need to figure out how we want it to look at the Perl 6 level before deciding the MoarVM level.	23:25
samcv	i was thinking more of how moarvm is concerned though than how it'd actually be implemented in rakudo
	yeah
	TimToady looks around for a language designer...
jnthn	I'm a bit uncomfortable with the units specified to .substr(...) meaning something different on Uni, I guess.
	In that we generally try to make it so that when you perform an operation, you don't have to know the exact type it's operating on to know the semantics	23:26
TimToady	that's why we used to have opaque string offsets in the design :)
jnthn	Thus why we have == vs eq
samcv	i'm gonna go grab some food. brb
jnthn	Enjoy :)
	TimToady: Hm, that was when Str was envisioned as a multi-layer construct rather than the NFG thing with Uni a separate thing, though?	23:27
	Or are those two ideas seperable?
TimToady	ayup
	I suppose it wouldn't hurt to have a .subuni, and be a little consistent with the subbuf vs substr distinction	23:28
	though it's not like we don't overload other method names on different types	23:29
jnthn	My feeling is that a type distinction between Str and Uni is probably right; the Perl 5 I've been writing recently has made me miss the Buf/Str distinction, and the Str/Uni case feels pretty distinct to
	TimToady: We do, though I like to think we mostly do it when the semantics will not be a surprise. :)
	Not knowing what the units I feed in will be interpreted as feels a bit...akward.	23:30
	*awkward
	I guess .index is simlarly problematic
TimToady	it does make it a bit harder to write generic code that doesn't care whether you feed it Uni or Str, but we could already say that about Buf	23:31
	maybe we also need a .submumble method :)
jnthn	:)
	I struggle to think of many cases where I don't care what level of abstraction I'm working at.	23:33
TimToady	what happens when we subbub a Blob currently?
	subbuf, er
	answer, it returns a subblob	23:34
jnthn	Yeah, I was thinking about making that able to use a "view" also	23:36
	So that it doesn't have to copy
samcv	back	23:38
TimToady	well, views work better on immutables than mutables, though editors often know how to maintain pointers into mutable buffers...	23:43
	so depends on whether we want to rewrite vim in Perl 6... :P
jnthn	Indeed, Blob is immutable. Wasn't planning it for Buf, or at least not without some explicit way of asking for it	23:44
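
[Editor's sketch] The copy-free "view" jnthn describes for subbuf on an immutable Blob could be pictured like this. BlobView and blob_subview are hypothetical names, not an existing MoarVM or Rakudo API. The sketch assumes the underlying bytes never change and outlive the view, which is why the idea fits Blob but not a mutable Buf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view onto immutable bytes: it borrows, never owns. */
typedef struct {
    const uint8_t *bytes;
    size_t         len;
} BlobView;

/* Return a view of up to len bytes starting at from.
 * No bytes are copied; out-of-range requests are clamped. */
static BlobView blob_subview(BlobView src, size_t from, size_t len) {
    assert(src.bytes != NULL || src.len == 0);
    if (from > src.len)
        from = src.len;
    if (len > src.len - from)
        len = src.len - from;
    BlobView v = { src.bytes + from, len };
    return v;
}
```

A real implementation would also have to keep the original storage alive for as long as any view refers to it, which this sketch leaves out.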
timotimo	taking multiple subbuf-rw into the same buf and assigning length-changing things makes bufs very weird :)	23:47
	not surprising, though
samcv	timotimo: so i think what i'll do is have each hash have its own rotation of the hash keys	23:51
	and we could also change the rotation on table expansion as well
