|
github.com/moarvm/moarvm | IRC logs at irclog.perlgeek.de/moarvm/today Set by moderator on 18 May 2018. |
|||
|
00:51
shareable6 joined
01:56
ilbot3 joined
|
|||
| moderator | github.com/moarvm/moarvm | IRC logs at irclog.perlgeek.de/moarvm/today | ||
|
03:22
greppable6 joined,
reportable6 joined,
notable6 joined,
quotable6 joined,
committable6 joined,
coverable6 joined,
evalable6 joined,
bloatable6 joined
03:23
bisectable6 joined,
releasable6 joined,
nativecallable6 joined,
unicodable6 joined,
benchable6 joined,
statisfiable6 joined,
squashable6 joined,
undersightable6 joined,
shareable6 joined
05:11
robertle joined
05:52
robertle joined
06:12
shareable6 joined
08:53
domidumont joined
09:00
domidumont joined
09:28
FROGGS joined
09:47
shareable6 joined
11:35
shareable6 joined
12:07
zakharyas joined
12:57
shareable6 joined
13:45
Geth joined
14:45
ilmari[m] joined
15:36
committable6 joined,
reportable6 joined,
coverable6 joined,
bloatable6 joined,
releasable6 joined,
unicodable6 joined,
statisfiable6 joined,
squashable6 joined
16:49
Kaiepi joined
17:36
shareable6 joined
18:10
shareable6 joined
18:51
Ven`` joined
19:51
shareable6 joined
21:00
Summertime joined
21:19
Summertime left
|
|||
| samcv | MasterDuke: mill arch is pretty interesting | 21:21 | |
| timotimo | jnthn: do you think adding a bitmap for "which slots in the sc have been deserialized so far?" to make the loop over the whole array to find a given object faster? | 21:44 | |
| s/"?"$/ is a good idea?/ | |||
| jnthn | Hm, is that expensive? | 21:48 | |
| I figured it'd only happen on the first lazy deserialization of something, and that we tend to then deserialize entire subtrees of things | |||
| timotimo | well, you know how we sometimes have a cached sc idx inside an object? | 21:50 | |
| in install_core_dists we hit that 20% of the time | |||
| we do find_obj_idx or what it's called 88k times inside that script | 21:51 | ||
| i'm printing out any number of >= 8 consecutive nulls and i get numbers ranging up to ~450 sometimes | 21:52 | ||
| if we can sometimes skip this many items outright, we'd surely have far fewer cache evictions | 21:53 | |
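The bitmap idea timotimo describes above can be sketched roughly as follows. This is only an illustration under stated assumptions, not MoarVM's actual code: `DemoSC`, `demo_store` and `demo_find_obj_idx` are hypothetical names, and the real serialization context layout may differ. One bit per slot records whether the slot has been deserialized, so the search can skip 64 empty slots at a time by testing a single word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a serialization context: a slot array in which
 * NULL means "not deserialized yet", plus one presence bit per slot. */
typedef struct {
    void     **slots;
    uint64_t  *present;    /* bit i is set when slots[i] is non-NULL */
    size_t     num_slots;
} DemoSC;

/* Store an object in a slot and record it in the bitmap. */
static void demo_store(DemoSC *sc, size_t idx, void *obj) {
    assert(idx < sc->num_slots);
    sc->slots[idx] = obj;
    sc->present[idx / 64] |= (uint64_t)1 << (idx % 64);
}

/* Linear search that only touches populated slots.
 * Returns the slot index of obj, or -1 if it is not present. */
static long demo_find_obj_idx(const DemoSC *sc, const void *obj) {
    size_t num_words = (sc->num_slots + 63) / 64;
    for (size_t w = 0; w < num_words; w++) {
        uint64_t bits = sc->present[w];   /* zero word: skip 64 slots */
        while (bits) {
            unsigned bit = 0;
            while (!((bits >> bit) & 1))  /* locate the lowest set bit */
                bit++;
            size_t idx = w * 64 + bit;
            if (sc->slots[idx] == obj)
                return (long)idx;
            bits &= bits - 1;             /* clear the lowest set bit */
        }
    }
    return -1;
}
```

With runs of up to ~450 consecutive nulls, as reported above, most of the scan would reduce to a handful of word comparisons. Whether that helps in practice depends on why the linear search is being hit at all, which is what the following discussion turns to.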
| jnthn | Hmm, but why are we hitting the linear search if the object already exists? :S | 21:54 | |
| I thought if something was deserialized then we'd *always* have index cached | 21:55 | ||
| timotimo | i didn't see through that code | ||
| jnthn | I found the odd place where we didn't before | ||
| But I guess we must be missing another one | |||
| timotimo | i had no idea what the requirement was for an object to have its scidx cached :) | ||
| but "an object should always have its scidx cached" is a reasonable explanation and i could go digging | 21:56 | ||
| jnthn | afaik there's not a reason that it can't, I'm guessing we must just somewhere "forget" | ||
| timotimo | rr shall rescue me | 21:59 | |
| oh, does anything speak against caching the index right then and there while we're doing the linear search? | 22:00 | ||
| jnthn | Hmm...if it can be done without thread headaches | 22:07 | |
| timotimo | racing to install the same value from two threads could be fine | 22:09 | |
| hm, but if another thread is reading half the value, that could be not so great | |||
| OK, it looks like the change makes no difference | 22:10 | ||
| timing wise, that is | 22:11 | ||
| samcv | while taking a nap today i thought of an optimization for string eq. we can compare the cached hash codes (if they exist) and quickly reject non matching strings | 22:18 | |
| timotimo | oh, we don't do that yet? | 22:22 | |
| sounds like a good idea in any case | |||
| jnthn | Just make sure to exclude the 0 no-hash-yet sentinel :-) | 22:25 | |
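A minimal sketch of the early-reject idea samcv describes, including the sentinel check jnthn asks for. The `DemoString` struct and `demo_string_eq` function are hypothetical stand-ins, not MoarVM's `MVMString` or its real equality routine; the only thing taken from the discussion is that a cached hash code of 0 means "not computed yet".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical string with an optionally cached hash code.
 * A cached_hash of 0 is the "no hash yet" sentinel. */
typedef struct {
    const char *chars;
    size_t      len;
    uint64_t    cached_hash;
} DemoString;

/* Returns 1 if the strings are equal, 0 otherwise. */
static int demo_string_eq(const DemoString *a, const DemoString *b) {
    assert(a != NULL && b != NULL);
    if (a == b)
        return 1;
    if (a->len != b->len)
        return 0;
    /* Fast reject: only valid when BOTH hashes have been computed.
     * If either is the 0 sentinel, it tells us nothing. */
    if (a->cached_hash && b->cached_hash && a->cached_hash != b->cached_hash)
        return 0;
    /* Equal hashes do not prove equality, so compare the contents. */
    return memcmp(a->chars, b->chars, a->len) == 0;
}
```

The shortcut only ever rejects. Matching hash codes still fall through to the full comparison, so a collision cannot produce a false positive.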
| Geth | MoarVM: 4152021ff8 | (Samantha McVey)++ | tools/update-changelog.p6 [Tools] Add update-changelog.p6 tool |
22:31 | |
| MoarVM: d634d24cf3 | (Samantha McVey)++ | src/strings/ops.c Instantly return 0 with string eq if cached hash code doesn't match If the cached hash codes exist, we are able to quickly return 0 without having to manually compare the two strings. For some work loads I could see this having a fair impact. |
|||
| samcv | spectest pass. and committed | ||
| timotimo | it could be beneficial to just calculate some hash codes "for fun" now | 22:32 | |
| to increase the occurrence of hash codes not being 0 | |||
| samcv | "for fun" lol | 22:34 | |
| timotimo | like, when we do GC and some threads are done with their work already, but some other thread is still GCing away ... maybe grab some random strings and calculate some hash codes? | ||
| do we do anything smart with hashes calculated from strands btw? | |||
| samcv | hah | ||
| like what? | |||
| also no we don't | 22:35 | ||
| timotimo | if we have two long strings in a strand, can we re-use the first part's hash code (if the whole string is a part of it) | 22:36 | |
| hm, the hash code potentially includes the length of the string, eh? | |||
| that would make it useless for that purpose i suppose | |||
| we probably don't want to have a hash function that you can just combine stuff together with, maybe | 22:37 | ||
| samcv | timotimo: yeah that basically makes it easily attackable | 22:40 | |
| timotimo | right | ||
| if we can cheat, so can the attacker | |||
| samcv | not sure if we want to rekey a hash if we have too many hash conflicts | 22:41 | |
| i mean we probably should ideally | |||
| timotimo | i don't yet know how rekeying can work if we want to keep cached hash codes | ||
| samcv | and then you can also worry about timing attacks | ||
| timotimo: well you just ignore them | 22:42 | ||
| timotimo | is the attack we expect that the attacker gets the full hash code right? | ||
| or just the part we use for buckets? | |||
| samcv | just the bucket part i suppose | ||
| timotimo | in that case we can perhaps just change what part of the hash code we use to decide on the bucket? | 22:43 | |
| samcv | hm | 22:44 | |
| i guess we could like reverse it? | |||
| timotimo | start at the end instead of the beginning? | ||
| samcv | hm | ||
| i mean we'll have a 64 bit value so that has a lot of surface area. we could rotate it maybe | 22:47 | ||
| timotimo | rotate sounds good | ||
| samcv | surface area as in, we don't really need a full 64 bits for bucket determination | ||
| but it makes it slower to bruteforce and should be trivial to do a rotation on it and be able to rekey in case there is an attack | 22:48 | ||
|
22:48
ZofBot joined
|
|||
| samcv | and allow us to not have to recompute all the strings hashes again | 22:48 | |
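The rotation idea discussed above might look something like the sketch below. It assumes a 64-bit cached hash code and a power-of-two bucket count; `demo_rotl64`, `demo_bucket` and the per-table `rotation` parameter are hypothetical names for illustration. The sketch only shows the mechanics of picking different bits for the bucket index without recomputing any string's hash, and makes no claim about how much protection this gives against a real attacker.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit value left by r bits (r is reduced modulo 64). */
static uint64_t demo_rotl64(uint64_t v, unsigned r) {
    r &= 63;
    return r ? (v << r) | (v >> (64 - r)) : v;
}

/* Pick a bucket from a cached hash code using a per-table rotation.
 * Changing `rotation` "rekeys" the table: the cached hash codes stay
 * valid, but different bits now decide the bucket. */
static uint64_t demo_bucket(uint64_t cached_hash, unsigned rotation,
                            unsigned log2_buckets) {
    assert(log2_buckets > 0 && log2_buckets < 64);
    uint64_t mask = ((uint64_t)1 << log2_buckets) - 1;
    return demo_rotl64(cached_hash, rotation) & mask;
}
```

For example, with the hash `0xFF00000000000001` and 256 buckets, a rotation of 0 selects bucket `0x01`, while a rotation of 8 selects bucket `0xFF`, since the top byte has been rotated into the low bits.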
| timotimo | hm | 22:59 | |
| we're currently telling people the order of items in a hash will change between runs of the same program | |||
| are they expecting the order will not change on its own thereafter? | |||
| naah, that'd already happen when hashes increase in size | |||
| samcv | well it won't change unless they add things to it | ||
| and they already reorder on bucket resizing | |||
| timotimo | which it already did before anyway | ||
| right | |||
| samcv | though they technically shouldn't rely on that either | 23:00 | |
| timotimo | there isn't as much thinking before my talking today as there sometimes is | ||
| samcv | also could be interesting if each hash had its own rotation | ||
| timotimo | every MVMHash gets its own Quaternion | 23:01 | |
| samcv | quaternion? | ||
| what | |||
| timotimo | in 3d programming, they're used to make things rotate in a way you'd expect | ||
| wow, the wikipedia has a bunch of illustrations and none of them seem enlightening | 23:02 | |
| samcv | on 3d programming? | 23:03 | |
| timotimo | damn, a gamasutra article entitled "rotating objects using quaternions". it starts "Last year may go down in history as The Year of the Hardware Acceleration". it is from 1998 | ||
| btw, i don't know much about 3d graphics or 3d programming or whatever, i've just picked this snippet up somewhere | 23:05 | ||
| samcv | also not sure if we need to hide the order of objects in a hash table or not | ||
| MasterDuke | sounds about right, i think i got my first 3d video card around 1996 or 1997 | ||
| samcv | i.e. by randomizing which buckets we iterate through first | ||
| timotimo | my first 3d-ish card was a matrox mystique, but no clue if the original or the 220 version | 23:06 | |
| the latter was released 1997 apparently | |||
| MasterDuke | mine was a canopus pure 3d, a 3dfx voodoo 1 (but with 6m ram, 2m more texture memory than the reference version) | ||
| i could finally run jedi knight at 640x480! | 23:07 | ||
| TimToady | my first graphics processor was the blitter on an Amiga 1000 :) | 23:08 | |
| MasterDuke | hm, i'm not sure i've ever used an amiga. certainly heard/read much about them though | 23:09 | |
| timotimo | i never had any amiga, or even commodore or atari or what have you. the first computer i remember using was either a 386 or a 486, possibly the latter | 23:10 | |
| jnthn | Today my wife was trying to install some smartphone app that wanted over 400MB of space, which seemed huge given what it was supposed to do. I pointed out this was 4 times more space than the entire disk space of my family's first home computer (a 486) that I programmed on. The BBC micro that was the first machine I programmed on didn't even have a hard disk. :-) | 23:12 | |
| TimToady | neither did the Amiga 1000 | ||
| unless you count floppies... | 23:13 | ||
| jnthn | I figure floppies are by definition not hard. :P | ||
| TimToady | depends on how floppy your definition is, I suppose... | ||
| timotimo | in order to try to appreciate machines of the pre-timotimo-era i'm watching The 8bit Guy (formerly The iBook Guy, and additionally 8bit keys) on You the Tube | 23:14 | |
| jnthn | Hm, actually, I'd always thought "hard disk" was just the opposite of "floppy disk", and never considered if that was the real reason for the naming :) | ||
| TimToady | it's a bit of a retrynym, I suspect | ||
| *retro | |||
| timotimo | hm, is a solid state disk just the opposite of a fluid community cube? | 23:15 | |
| what would you call the opposite of a disk | |||
| MasterDuke | no, of a companion cube | ||
| TimToady | join the Flat Disk Society today! | ||
| timotimo | actually, disc and disk aren't the same thing | ||
| TimToady | discs have grooves :) | 23:16 | |
| timotimo | amusingly, in german it's called Festplatte, which you could wrong-translate as a thing you put lots of food on to serve at some kind of fest/party/event | ||
| wrangslate? | |||
| samcv | jnthn: when i was with lizmat i spitballed my ideas on implementing MVMString that has a feature to not normalize | 23:17 | |
| and it seems pretty doable with mostly minor modifications to our functions | |||
| jnthn | samcv: Hmm, too bad I wasn't there. :-) It had occurred to me that MVMString might want to be the thing behind Uni though | ||
| TimToady | though Uni is just differently normalized... | 23:18 | |
| samcv | i actually did a proof of concept sorta thing | ||
| i added a nqp op that converted a normal mvmstring into a non-normalized type | |||
| jnthn | Though my idea was to have multiple types based on the MVMString REPR so that we can use type specialization to strip out the switching over "what kind of string is this" | ||
| samcv | and added a setting to one of the mvmstring struct | ||
| jnthn | I think I'd rather shuffle that setting type-wards for the reason just mentioned :) | ||
| The thing that worries me is the binary operations | 23:19 | ||
| samcv | binary operations? | ||
| timotimo | so MVMString gets a REPR_data? | ||
| jnthn | timotimo: Yes | ||
| samcv | also that would mean having to write new functions for every single current function? | ||
| jnthn | samcv: As in, those that have multiple strings as the input | ||
| samcv | ah | ||
| well it works as long as both are of the same string type | |||
| which i demonstrated in my Proof of concept i wrote | 23:20 | ||
| string eq etc, i just had the second string convert its type to the first string's type | |||
| timotimo | you can always concatenate into a "dirty" type, i.e. mixed normalization modes | ||
| jnthn | It doesn't mean having to write new functions if they do the same thing | ||
| samcv | timotimo: well no. i didn't allow that | ||
| jnthn | It's possible that at MoarVM level it just blows up if there's a type mismatch | ||
| timotimo | then you have to decide if some kinds of normalizations are infectious compared to others | ||
| samcv | it would convert the second item in the concatenation to non-normalized type or normalized depending on the first one | 23:21 | |
| jnthn | And we handle that up at Rakudo level | ||
| otoh maybe that's inefficient | |||
| Since MVMString is also immutable | |||
| samcv | which? | ||
| TimToady | privileging the first argument is a bit non-p6-y | ||
| samcv | well lizmat thought that NFG is infectious | ||
| jnthn | Indeed | ||
| samcv | so maybe that's what i implemented actually | 23:22 | |
| jnthn | Infectious NFG could work | ||
| timotimo | that's (?) why we give all Int ops a type to box stuff into | ||
| jnthn | There's all kinds of tricky though | ||
| samcv | in any case, i have much more confidence of this being doable | ||
| timotimo | also it makes me just a little uncomfortable that the slice reprop just takes self's type | ||
| jnthn | Like, if we do $str.split($uni), are the results Str or Uni? | ||
| If we do $uni.substr(1, 2) are those units graphemes or codepoints? | 23:23 | ||
| timotimo | and also, when do we consider a part to match if the split needle is explicitly Uni rather than Str | ||
| jnthn | And what does it return? | ||
| samcv | jnthn: it's the $str's type | ||
| jnthn | Do we have a .subuni(1, 2) for the other thing? | ||
| timotimo | not only "match inside a grapheme", but also how to handle different normalization forms of the same thing | ||
| samcv | no | ||
| timotimo | etc etc | ||
| samcv | jnthn: we have substr but it uses uni semantics | ||
| well the data type is not normalized. so it just does substring identically | 23:24 | ||
| jnthn | samcv: That doesn't go so well with the "operations have consistent semantics" design rule, though | ||
| TimToady | I think if people want to split graphemes they'd better explicitly force Uni first | ||
| samcv | nothing changes. it just makes a strand or a new string from point a to b | ||
| jnthn: well at least on moarvm it's that simple | |||
| jnthn | We probably need to figure out how we want it to look at the Perl 6 level before deciding the MoarVM level. | 23:25 | |
| samcv | i was thinking more of how moarvm is concerned though than how it'd actually be implemented in rakudo | ||
| yeah | |||
| TimToady looks around for a language designer... | |||
| jnthn | I'm a bit uncomfortable with the units specified to .substr(...) meaning something different on Uni, I guess. | ||
| In that we generally try to make it so that when you perform an operation, you don't have to know the exact type it's operating on to know the semantics | 23:26 | ||
| TimToady | that's why we used to have opaque string offsets in the design :) | ||
| jnthn | Thus why we have == vs eq | ||
| samcv | i'm gonna go grab some food. brb | ||
| jnthn | Enjoy :) | ||
| TimToady: Hm, that was when Str was envisioned as a multi-layer construct rather than the NFG thing with Uni a separate thing, though? | 23:27 | ||
| Or are those two ideas separable? | |||
| TimToady | ayup | ||
| I suppose it wouldn't hurt to have a .subuni, and be a little consistent with the subbuf vs substr distinction | 23:28 | ||
| though it's not like we don't overload other method names on different types | 23:29 | ||
| jnthn | My feeling is that a type distinction between Str and Uni is probably right; the Perl 5 I've been writing recently has made me miss the Buf/Str distinction, and the Str/Uni case feels pretty distinct too | ||
| TimToady: We do, though I like to think we mostly do it when the semantics will not be a surprise. :) | |||
| Not knowing what the units I feed in will be interpreted as feels a bit...akward. | 23:30 | ||
| *awkward | |||
| I guess .index is similarly problematic | |||
| TimToady | it does make it a bit harder to write generic code that doesn't care whether you feed it Uni or Str, but we could already say that about Buf | 23:31 | |
| maybe we also need a .submumble method :) | |||
| jnthn | :) | ||
| I struggle to think of many cases where I don't care what level of abstraction I'm working at. | 23:33 | ||
| TimToady | what happens when we subbub a Blob currently? | ||
| subbuf, er | |||
| answer, it returns a subblob | 23:34 | ||
| jnthn | Yeah, I was thinking about making that able to use a "view" also | 23:36 | |
| So that it doesn't have to copy | |||
| samcv | back | 23:38 | |
| TimToady | well, views work better on immutables than mutables, though editors often know how to maintain pointers into mutable buffers... | 23:43 | |
| so depends on whether we want to rewrite vim in Perl 6... :P | |||
| jnthn | Indeed, Blob is immutable. Wasn't planning it for Buf, or at least not without some explicit way of asking for it | 23:44 | |
| timotimo | taking multiple subbuf-rw into the same buf and assigning length-changing things makes bufs very weird :) | 23:47 | |
| not surprising, though | |||
| samcv | timotimo: so i think what i'll do is have each hash have its own rotation of the hash keys | 23:51 | |
| and we could also change the rotation on table expansion as well | |||