| Zoffix | \o | 00:43 | |
| japhb | samcv: Are you still looking for a job? If so, have you applied at Google yet? (I'm a manager there.) | 00:56 | |
|
01:51
ilbot3 joined
02:24
Ven`` joined
|
|||
| samcv | i think i remember you saying that before. i will add it to my list, thanks :) | 03:08 | |
| i should probably commit something today. i think i'll do adding the unicode names for the hangul syllables | 03:16 | ||
| u: hangul | 03:21 | ||
| unicodable6 | samcv, U+1100 HANGUL CHOSEONG KIYEOK [Lo] (?) | ||
| samcv, U+1101 HANGUL CHOSEONG SSANGKIYEOK [Lo] (?) | |||
| samcv, 563 characters in total: gist.github.com/86a72f089958516afe...a6e453dcbf | |||
| samcv | hmm did i already do it? | 03:22 | |
| m: 0xD4DB.uniname.say | 03:24 | ||
| camelia | <Hangul Syllable>? | ||
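For reference, the names samcv wants to add can be derived arithmetically instead of being stored per code point. The following is a minimal sketch in C of the Hangul syllable name derivation from the Unicode Standard, not the code in ucd2c.pl or MoarVM; the function name `hangul_syllable_name` and the table names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Jamo short names as listed in the Unicode Standard (Jamo.txt). */
static const char *const JAMO_L[19] = {
    "G", "GG", "N", "D", "DD", "R", "M", "B", "BB", "S",
    "SS", "", "J", "JJ", "C", "K", "T", "P", "H"
};
static const char *const JAMO_V[21] = {
    "A", "AE", "YA", "YAE", "EO", "E", "YEO", "YE", "O", "WA", "WAE",
    "OE", "YO", "U", "WEO", "WE", "WI", "YU", "EU", "YI", "I"
};
static const char *const JAMO_T[28] = {
    "", "G", "GG", "GS", "N", "NJ", "NH", "D", "L", "LG", "LM", "LB", "LS",
    "LT", "LP", "LH", "M", "B", "BS", "S", "SS", "NG", "J", "C", "K", "T",
    "P", "H"
};

/* Hypothetical helper: writes the derived name of a precomposed Hangul
 * syllable into buf. Returns 1 on success, 0 if cp is not in the syllable
 * block or buf is too small. */
static int hangul_syllable_name(uint32_t cp, char *buf, size_t len) {
    const uint32_t s_base  = 0xAC00;
    const uint32_t t_count = 28;
    const uint32_t n_count = 21 * t_count;   /* 588 syllables per leading jamo */
    const uint32_t s_count = 19 * n_count;   /* 11172 syllables in total */
    if (cp < s_base || cp >= s_base + s_count)
        return 0;
    uint32_t s = cp - s_base;
    int n = snprintf(buf, len, "HANGUL SYLLABLE %s%s%s",
                     JAMO_L[s / n_count],
                     JAMO_V[(s % n_count) / t_count],
                     JAMO_T[s % t_count]);
    return n > 0 && (size_t)n < len;
}
```

With this derivation, U+D4DB (the code point queried above) decomposes into leading index 17, vowel index 16 and trailing index 15.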
| samcv | i was going to finish getting ucd2c.pl working properly i remember now. so i don't deal with the hassle of it regenerating wrong after adding the names | 03:48 | |
| m: use nqp; my $p = nqp::unipropcode('General_Category'); my $pv = nqp::unipvalcode($p, 'L'); nqp::hasuniprop('a', 0, $p, $pv).say | 05:36 | ||
| camelia | 0? | ||
| samcv | m: use nqp; my $p = nqp::unipropcode('General_Category'); my $pv = nqp::unipvalcode($p, 'N'); nqp::hasuniprop('3', 0, $p, $pv).say | ||
| camelia | 0? | ||
|
06:03
brrt joined
|
|||
| brrt | good * #moarvm | 06:33 | |
| samcv | good * | 06:34 | |
| brrt | samcv, can you help me figure something out | 06:39 | |
| samcv | i hope :-) | ||
| brrt | i'm having a design... challenge, or tradeoff, or whatever in the optimizer, that i kind of know how to resolve, but would prefer another mind to look at :-) | ||
| samcv | ok :) | 06:40 | |
| brrt | context: I really, really, really, like bounded-sized-sets | ||
| or more properly | |||
| i like knowing in advance how much of a thing to allocate | |||
| so that i only ever have to allocate that single thing | |||
| samcv | yeah | ||
| brrt | so i use that pretty heavily in even-moar-jit | 06:41 | |
| now, in the optimizer, i need to be able to swap out one node for another | |||
| and also, i want to be able to compute number of referents for a node, for instance to replace a heavily-referenced LOAD with a COPY of the same, so that the tiler thinks it's opaque and will actually load it into a register rather than issue two separate LOADs | 06:42 | ||
| so what i'd like is to associate each node with its set of parents, or referents, or inbound edges, or whatever you want to call them | 06:43 | ||
| i.e. its users | |||
| now there is a very simple identity going on. it's impossible for there to be more referents than nodes | |||
| it would be possible to have almost as many references as there are nodes, for instance if i have a single DO list that refers to the same node over and over and over again | 06:44 | ||
| so, that makes me happy, because now i can allocate the reference-mapping things as a single array, arrange them in a linked list per node, and use pointer bumping to allocate | 06:45 | ||
| however. | |||
| when i'm optimizing, i'm modifying the tree at the same time | |||
| so when e.g. i swap out node A for node B, all references to A must be replaced by B | 06:46 | ||
| great, i can use my reference list, and find out all nodes that refer to A, and have them refer to B | |||
| this never adds any new references, so my initial constraint still holds | 06:47 | ||
| but B can be the root of a tree, as can A | |||
| all references to the children of A have now become useless (A has been spliced out). all children of B have no references allocated to them | |||
| the second part is a problem because i would like to iterate over B (having not visited it before) and see if there is anything there i can optimize | 06:49 | ||
| and if i want to assign references to the nodes in B, i can't because i'll easily end up with more references than i'd have allocated | 06:52 | ||
| so, i can now: try to garbage collect all references from A | |||
| which will break my pointer-bumping scheme, but okay | |||
| will have to use a free list scheme instead | 06:53 | ||
| or B, allocate all references from a region (MVM_spesh_alloc), and have them cleaned up later | |||
| i mean, i know fairly well that B is the solution here | |||
| i also have C, split optimization in multiple passes | |||
| nine | brrt: how performance sensitive is that code anyway? Spesh already runs in a separate thread and thus the JIT does, too. | 06:54 | |
| brrt | it's partly about performance, but it's also partly about manageability | 06:55 | |
| brb | 06:56 | ||
|
07:12
brrt joined
|
|||
| brrt | on the other hand, two things come to mind | 07:14 | |
| the sooner we finish compilation, the sooner that code now actually runs | |||
| so fast compilation does have an effect | |||
| more importantly, the size of the data that the expr compiler works on is potentially huge | 07:15 | ||
| we have already very large frames | |||
| ultimately, we want to compile the entire frame to an expression | |||
| so cheap and cheerful is i think the way to go | |||
| (what if all you need is a hammer, and good aim?) | 07:16 | ||
| nine | brrt: MVM_spesh_alloc is pretty much just pointer bumping and you don't have to care about freeing the memory. Sounds like the perfect compromise between performance and manageability | ||
| brrt | it's pretty good, yes, and to be fair, the only good reason i have not to use it in the expr tree, is that it doesn't handle dynamically growing arrays very well | 07:17 | |
| and a dynamically growing array is such a super useful data structure | |||
| hash table? dynamic array | |||
| free list? dynamic array | |||
| samcv | brrt, how would splitting opt into multiple passes be | ||
| brrt | heap, dynamic array | 07:18 | |
| union-find, you get the point | |||
| not well defined in my mind, but you'd have one pass to get all references, another pass to do the replacements | |||
| topological graph, well, in my case, static array | 07:19 | ||
| that splitting would in fact not solve the problem of my references | 07:22 | ||
| so, yeah, spesh allocation is the way to got | |||
| *go | |||
| just wanted to confirm that, is all :-) | |||
| nine | brrt: thanks for asking btw. Made me investigate MVM_spesh_alloc :) | 07:23 | |
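The bounded reference map brrt describes (one array allocated up front, a linked list of users per node, pointer-bump allocation) could look roughly like the sketch below. This is an illustration in C under stated assumptions, not the expression JIT's actual data structure: `RefMap`, `RefLink` and all function names are hypothetical, nodes are identified by plain integer indices, and plain `malloc` stands in for whatever allocator is used.

```c
#include <assert.h>
#include <stdlib.h>

/* One "user" edge: node `user` refers to the node owning this list. */
typedef struct {
    int user;
    int next;   /* index of the next link in the same list, or -1 */
} RefLink;

typedef struct {
    RefLink *links;  /* single allocation, sized in advance */
    int     *head;   /* head[n]: first link of node n's user list, or -1 */
    int      used;   /* bump pointer into links */
    int      cap;    /* fixed capacity */
} RefMap;

static int refmap_init(RefMap *m, int num_nodes, int max_refs) {
    m->links = malloc(sizeof(RefLink) * (size_t)max_refs);
    m->head  = malloc(sizeof(int) * (size_t)num_nodes);
    m->used  = 0;
    m->cap   = max_refs;
    if (!m->links || !m->head) {
        free(m->links);
        free(m->head);
        return 0;
    }
    for (int i = 0; i < num_nodes; i++)
        m->head[i] = -1;
    return 1;
}

static void refmap_free(RefMap *m) {
    free(m->links);
    free(m->head);
}

/* Record that `user` refers to `node`. Fails once the bound is reached. */
static int refmap_add(RefMap *m, int node, int user) {
    if (m->used == m->cap)
        return 0;
    m->links[m->used].user = user;
    m->links[m->used].next = m->head[node];
    m->head[node] = m->used++;
    return 1;
}

static int refmap_count(const RefMap *m, int node) {
    int n = 0;
    for (int i = m->head[node]; i >= 0; i = m->links[i].next)
        n++;
    return n;
}

/* Swap node a for node b: every user of a becomes a user of b.
 * Links are re-threaded, never allocated, so the bound still holds. */
static void refmap_replace(RefMap *m, int a, int b) {
    int first = m->head[a];
    if (first < 0)
        return;
    int tail = first;
    while (m->links[tail].next >= 0)
        tail = m->links[tail].next;
    m->links[tail].next = m->head[b];
    m->head[b] = first;
    m->head[a] = -1;
}
```

The sketch only covers the part that stays within the bound. Registering users for the children of a freshly spliced-in subtree is where `refmap_add` would start failing, which is the case brrt resolves by allocating from a region instead.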
|
07:24
geekosaur joined
|
|||
| samcv | sounds like a plan brrt | 07:26 | |
| brrt | alright | ||
| cool | |||
|
08:29
robertle_ joined
08:40
zakharyas joined
12:01
zakharyas joined
12:11
evalable6 joined
12:22
nwc10 joined
|
|||
| timotimo | my readSizedInt64 is full of fastinvoke + hllize + decont_i :\ | 12:33 | |
| that might make it a bit slow | |||
| the hllize comes from calling .shift on the Buf[uint8], which is done as an invoke_o kind of thing | 12:36 | ||
| looks like in Blob it already just calls nqp::shift_i on itself | 12:37 | ||
| but it can return a Failure | |||
| that's probably what frustrates the whole effort | |||
| with nqp::shift_i the generated code is much nicer | 12:44 | ||
| it's a single BB with only shift_i, const_i64_16, blshift_i, and add_i | |||
|
12:48
MasterDuke joined
|
|||
| timotimo | oh yeah! | 12:49 | |
| i've reached 8 seconds | |||
| MasterDuke | timotimo: for the heap analyzer? | ||
| timotimo | 317% cpu usage, not bad | ||
| yep! | 12:50 | ||
| MasterDuke | nice. haven't gotten to try it out yet, but looking forward to the speedup | ||
| timotimo | AFKBBL | 12:51 | |
| MasterDuke | btw, anybody have any comments/suggests re my INTERPOLATE questions from yesterday? | ||
| timotimo | but maybe i'll push something before i go | 12:52 | |
|
12:54
coverable6 joined,
committable6 joined,
bloatable6 joined,
unicodable6 joined,
nativecallable6 joined,
releasable6 joined,
greppable6 joined,
quotable6 joined,
bisectable6 joined,
benchable6 joined,
evalable6 joined,
statisfiable6 joined
|
|||
| timotimo | i just pushed to my branch | 12:58 | |
| have fun! | |||
| MasterDuke | cool, checking it out now | 12:59 | |
| timotimo | yay | 13:00 | |
| don't forget it has to be the file moar now spits out into /tmp | 13:01 | ||
| otherwise you'll have no speed benefits | |||
| also, it currently leaks file descriptors | |||
| MasterDuke | right. i don't need to pass any new options to --profile (other than =heap)? | 13:02 | |
| timotimo | correct | ||
| i just hacked the output in | |||
| MasterDuke | k | ||
| timotimo | i also haven't tested it with anything that has more than one snapshot :D | ||
| MasterDuke | hm, got a segv. rebuilding rakudo to make sure it isn't that | 13:04 | |
| (when doing the profile) | |||
| timotimo: gist.github.com/MasterDuke17/c9b2a...0f097d4e40 | 13:08 | ||
| valgrind and gdb output | 13:10 | ||
| timotimo | huh | 13:28 | |
| MasterDuke | it gets through one call of `MVM_spesh_sim_stack_gc_mark`, but the second seems to have an invalid/missing/something `sims->frames` | 13:35 | |
| timotimo | i don't think i meant to commit that | 13:36 | |
| MasterDuke | i.e., it segvs on the first iteration of the loop | ||
| timotimo | gimme a sec | ||
| MasterDuke | commit what? | ||
| timotimo | oh, did i forget to commit the independent "make heap snapshot no longer segv" stuff | ||
| MasterDuke | ha. btw, i'm on your heapsnapshot_binary_format branch | 13:37 | |
| timotimo | yeah | ||
| MasterDuke | and i rebased master onto it | 13:38 | |
| timotimo | ah! | 13:39 | |
| i do have changes! | |||
| Geth | MoarVM/heapsnapshot_binary_format: 3b2fefcaf4 | (Timo Paulssen)++ | 7 files WIP gc_describe functions for new spesh datastructures | 13:40 | |
| timotimo | the important part is that it no longer tries to gc_mark if it's actually in heapsnapshot mode (which means there is no "worklist") | ||
| MasterDuke | ah, no segv | 13:44 | |
| timotimo | i had already "git add"ed it and i always just "git diff" to see what's what | 13:45 | |
| but git gui thankfully showed me | |||
| MasterDuke | and now there's a heapsnapshot_new_format in /tmp | ||
| timotimo | yup! | ||
| MasterDuke | does seem a bit faster. longer than 8s though | 13:47 | |
| timotimo | compare it to the regular format on an unpatched heapanalyzer | ||
| also, how big are the files and such? :) | 13:48 | ||
| MasterDuke | 48m for regular, 100m for binary | 13:49 | |
| 2 snapshots | |||
| didn't time it, but `summary` in the regular one was a bunch slower | 13:50 | ||
|
13:53
MasterDuke_ joined
|
|||
| MasterDuke_ | hm, binary version does seem to switch snapshots much faster | 13:54 | |
| oh, but maybe `top objects by size` gives different results | 13:57 | ||
| timotimo | well, switching snapshots does no work | 13:59 | |
| i'd time something like echo "snapshot 0\nsummary" on both implementations | |||
| "top objects by size" will additionally go through the whole snapshot once again | 14:00 | ||
| MasterDuke_ | timotimo: gist.github.com/MasterDuke17/91bce...9fde812822 | ||
| looks like the numbers are the same, but different names | |||
| timotimo | huh | ||
| that's not right | |||
| oh, perhaps the empty string in the string heap tripped me up | 14:01 | ||
| can you check into the file to see if these strings are neighbours except off by one? | |||
| MasterDuke_ | in which file? | ||
| timotimo | the old version is easier to look at | 14:02 | |
| MasterDuke_ | "Spesh slot entry","<SC>","P6opaque","Scalar","Perl6::Metamodel::ContainerDescriptor","NQPMu","Method cache","Type cache entry","Boolification method","WHO","WHAT","HOW" | 14:03 | |
| selection from old file around P6opaque | 14:04 | ||
| "RoleToClassApplier","149","&has_method","&has_attribute","NQPArray","NQPMatch","int","str","gen/moar/stage2/QRegex.nqp","$?CLASS","$?PACKAGE","$NO_CAPS", | |||
| selection from old file around NQPArray | |||
| useful? | 14:05 | ||
| timotimo | oh | ||
| maybe snapshots 0 and 1 are swapped | |||
| no that makes no sense either | 14:06 | ||
| MasterDuke_ | oh, i just noticed p6opaque shows up a couple times in the new version's output | ||
| 3 times | |||
| timotimo | it shouldn't do that :D | ||
| well, it's easy to be much faster, but also wrong | 14:07 | ||
| for every bit of wrong you can potentially be 10x as fast :D | |||
| MasterDuke_ | yup | ||
| timotimo | oh, i gotta get ready to head out | 14:08 | |
| seems like i'll have to fix this later | |||
| MasterDuke_ | k | 14:12 | |
| timotimo | unless you can figure out what mistake i made :D | 14:15 | |
| like you could print out the @!strings for both versions and compare | |||
| MasterDuke_ | you think the problem is in the app? not the moarvm branch? | 14:29 | |
| timotimo | hm, right, could also be the functions that write out things. not sure how, though | 14:37 | |
| MasterDuke_ | i added `dd @!strings[^20]` to the BUILDs | 14:40 | |
| the output for just running the app is the same | |||
|
15:07
brrt joined
|
|||
| brrt | rebase from hell, but we'll get there... | 15:16 | |
| Zoffix | .oO( ? I'm on a reeeeeebaaaasee to hell... ? ) | 15:17 | |
|
15:36
AlexDaniel joined
15:39
benchable6 joined
|
|||
| japhb | .oO( The rebase to hell is paved with well-intentioned patches. ) | 15:52 | |
|
16:29
dogbert2 joined
17:03
zakharyas joined
17:17
leont joined
|
|||
| nine | That's an interesting comment: github.com/MoarVM/MoarVM/blob/mast...erp.c#L108 | 18:21 | |
| The lists don't actually match. I wonder how much that costs? | |||
| Now finally the whole thing makes sense! "Points to the current opcode." is just plain wrong. cur_op does _not_ point at the current opcode, but at the place following that, which is the number of the first register. | 18:25 | ||
| The NEXT_OP macro tells that story | 18:26 | ||
| so *tc->interp_cur_op should actually point at GETREG(cur_op, 0) which is exactly what I need, because it's the register I should return values into | 18:27 | ||
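What nine describes can be shown with a toy interpreter: fetching the opcode also advances the cursor, so inside a handler `cur_op` already points at the first operand (here a 16-bit register index). This is a minimal sketch in C, not MoarVM's interp.c; the three opcodes, `next_op`, `REG` and `run` are hypothetical stand-ins for the real macros.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { OP_CONST = 0, OP_ADD = 1, OP_HALT = 2 };

static uint16_t read_u16(const uint8_t *p) {
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Fetch the opcode AND step past it: after this call, *cur_op points at
 * the first operand, not at the opcode that is being executed. */
static uint16_t next_op(const uint8_t **cur_op) {
    uint16_t op = read_u16(*cur_op);
    *cur_op += 2;
    return op;
}

/* Register named by the 16-bit index stored `offset` bytes after cur_op. */
#define REG(regs, cur_op, offset) ((regs)[read_u16((cur_op) + (offset))])

static int64_t run(const uint8_t *bytecode, int64_t *regs) {
    const uint8_t *cur_op = bytecode;
    for (;;) {
        switch (next_op(&cur_op)) {
        case OP_CONST:  /* const r_dst, imm16 */
            REG(regs, cur_op, 0) = read_u16(cur_op + 2);
            cur_op += 4;
            break;
        case OP_ADD:    /* add r_dst, r_a, r_b */
            REG(regs, cur_op, 0) = REG(regs, cur_op, 2) + REG(regs, cur_op, 4);
            cur_op += 6;
            break;
        case OP_HALT:   /* halt r_result */
            return REG(regs, cur_op, 0);
        default:        /* unknown opcode */
            return -1;
        }
    }
}
```

In every handler, `REG(regs, cur_op, 0)` is the destination register, which matches the observation that the first register after the opcode is the natural place to return a value into.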
| dogbert2 | github.com/libtom/libtommath/blob/...hanges.txt | 19:49 | |
| Zoffix | "-- Fixed mp_rand() to fill the correct number of bits" we had a bug due to that | 19:50 | |
| dogbert2 | and I believe we had another bug relating to 'Fixed mp_invmod()' | 19:52 | |
|
19:52
brrt joined
|
|||
| dogbert2 | I think timotimo patched that though | 19:52 | |
| brrt | good * | 19:53 | |
| i'm thinking of calling of the whole rebase plan | |||
| its super frustrating | |||
| it doesn't actually give a 'clean' set of patches | |||
| since many of the intermediate steps are just broken | |||
| it risks diverging from the current, working, code | 19:54 | ||
| *off | 19:55 | ||
| dogbert2 | Zoffix: RT #129829 | 19:56 | |
| synopsebot6 | Link: rt.perl.org/rt3/Public/Bug/Display...?id=129829 | ||
| brrt | and it distracts from actual useful changes | ||
| .ask jnthn i notice a bunch of jgb_sc_wb calls to things where the operand might be an integer rather than an object | 20:12 | ||
| yoleaux | brrt: I'll pass your message to jnthn. | ||
| nine | .ask brrt why does the JIT clear RV in the epilogue? It could be used for returning a value to the caller otherwise | 20:32 | |
| yoleaux | nine: I'll pass your message to brrt. | ||
|
22:09
geekosaur joined
|
|||
| timotimo | ended up not doing anything much today | 22:19 | |
| lizmat knows the feeling | 22:22 | ||
| lizmat goes to bed | |||
| samcv | another day, gotta work on ucd2c.pl again. ugh. i will prevail! | 23:26 | |
| Zoffix | \o/ | 23:36 | |
| samcv | Zoffix, what is your opinion on the Unified ideographs. none of them have names. it's my opinion like the control codes we should call them <CJK Unified Ideograph-7FFE> or such | 23:37 | |
| because atm there's no way to distinguish them | |||
| m: "?".uninames.say | 23:38 | ||
| camelia | (<CJK Ideograph>)? | ||
| samcv | err just CJK IDeograph i guess plus the number | ||
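The naming scheme samcv proposes (a fixed label plus the code point in hex) is cheap to generate on demand. The sketch below, in C, is only an illustration of that idea, not MoarVM's implementation: `cjk_ideograph_name` is a hypothetical helper, the label spelling follows samcv's first suggestion, and only the main CJK Unified Ideographs block (U+4E00..U+9FFF) is checked, leaving out the extension blocks.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes e.g. "<CJK Unified Ideograph-7FFE>" into buf.
 * Returns 1 on success, 0 if cp is outside U+4E00..U+9FFF or buf is too
 * small. Extension blocks are deliberately ignored in this sketch. */
static int cjk_ideograph_name(uint32_t cp, char *buf, size_t len) {
    if (cp < 0x4E00 || cp > 0x9FFF)
        return 0;
    int n = snprintf(buf, len, "<CJK Unified Ideograph-%04X>", (unsigned)cp);
    return n > 0 && (size_t)n < len;
}
```

Because the code point is embedded in the result, two ideographs never share a name, which addresses the "no way to distinguish them" complaint.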
|
23:43
geekosaur joined
|
|||
| Zoffix | samcv: yeah, sounds like a plan | 23:45 | |
|
23:48
geekosaur joined
|
|||
| samcv | and fixing this i made a function to catch the bad output that gets in there and i'm putting it every damn place with a croak | 23:53 | |
| eventually i'll figure out where it's being added in... | |||