Zoffix | \o | 00:43 | |
japhb | samcv: Are you still looking for a job? If so, have you applied at Google yet? (I'm a manager there.) | 00:56 | |
01:51
ilbot3 joined
02:24
Ven`` joined
|
|||
samcv | i think i remember you saying that before. i will add it to my list, thanks :) | 03:08 | |
i should probably commit something today. i think i'll do adding the unicode names for the hangul syllables | 03:16 | ||
u: hangul | 03:21 | ||
unicodable6 | samcv, U+1100 HANGUL CHOSEONG KIYEOK [Lo] (ᄀ) | ||
samcv, U+1101 HANGUL CHOSEONG SSANGKIYEOK [Lo] (ᄁ) | |||
samcv, 563 characters in total: gist.github.com/86a72f089958516afe...a6e453dcbf | |||
samcv | hmm did i already do it? | 03:22 | |
m: 0xD4DB.uniname.say | 03:24 | ||
camelia | <Hangul Syllable>? | ||
samcv | i was going to finish getting ucd2c.pl working properly i remember now. so i don't deal with the hassle of it regenerating wrong after adding the names | 03:48 | |
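For context on the `<Hangul Syllable>` placeholder above: the Unicode Standard derives Hangul syllable names arithmetically from the code point instead of listing them one by one. The Python sketch below shows that derivation. The function name `hangul_syllable_name` is made up for illustration, and nothing here is taken from ucd2c.pl or MoarVM.

```python
import unicodedata

# Constants from the Unicode Standard's Hangul syllable decomposition.
S_BASE = 0xAC00
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT  # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT  # 11172 syllables in total

# Jamo short names for leading consonants, vowels, and trailing consonants.
LEADS = ["G", "GG", "N", "D", "DD", "R", "M", "B", "BB", "S",
         "SS", "", "J", "JJ", "C", "K", "T", "P", "H"]
VOWELS = ["A", "AE", "YA", "YAE", "EO", "E", "YEO", "YE", "O", "WA", "WAE",
          "OE", "YO", "U", "WEO", "WE", "WI", "YU", "EU", "YI", "I"]
TRAILS = ["", "G", "GG", "GS", "N", "NJ", "NH", "D", "L", "LG", "LM", "LB",
          "LS", "LT", "LP", "LH", "M", "B", "BS", "S", "SS", "NG", "J", "C",
          "K", "T", "P", "H"]


def hangul_syllable_name(cp):
    """Build the name of a precomposed Hangul syllable from its code point."""
    index = cp - S_BASE
    if not 0 <= index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable: U+%04X" % cp)
    lead = index // N_COUNT
    vowel = (index % N_COUNT) // T_COUNT
    trail = index % T_COUNT
    return "HANGUL SYLLABLE " + LEADS[lead] + VOWELS[vowel] + TRAILS[trail]


if __name__ == "__main__":
    cp = 0xD4DB  # the code point queried in the log
    # Print the derived name next to the one Python's own database reports.
    print(hangul_syllable_name(cp))
    print(unicodedata.name(chr(cp)))
```

Because the name is a pure function of the code point, a generator could compute it on demand instead of storing 11172 strings, though whether that suits ucd2c.pl is not something the log settles.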
m: use nqp; my $p = nqp::unipropcode('General_Category'); my $pv = nqp::unipvalcode($p, 'L'); nqp::hasuniprop('a', 0, $p, $pv).say | 05:36 | ||
camelia | 0? | ||
samcv | m: use nqp; my $p = nqp::unipropcode('General_Category'); my $pv = nqp::unipvalcode($p, 'N'); nqp::hasuniprop('3', 0, $p, $pv).say | ||
camelia | 0? | ||
06:03
brrt joined
|
|||
brrt | good * #moarvm | 06:33 | |
samcv | good * | 06:34 | |
brrt | samcv, can you help me figure something out | 06:39 | |
samcv | i hope :-) | ||
brrt | i'm having a design... challenge, or tradeoff, or whatever in the optimizer, that i kind of know how to resolve, but would prefer another mind to look at :-) | ||
samcv | ok :) | 06:40 | |
brrt | context: I really, really, really, like bounded-sized-sets | ||
or more properly | |||
i like knowing in advance how much of a thing to allocate | |||
so that i only ever have to allocate that single thing | |||
samcv | yeah | ||
brrt | so i use that pretty heavily in even-moar-jit | 06:41 | |
now, in the optimizer, i need to be able to swap out one node for another | |||
and also, i want to be able to compute number of referents for a node, for instance to replace a heavily-referenced LOAD with a COPY of the same, so that the tiler thinks it's opaque and will actually load it into a register rather than issue two separate LOADs | 06:42 | ||
so what i'd like is to associate each node with its set of parents, or referents, or inbound edges, or whatever you want to call them | 06:43 | ||
i.e. its users | |||
now there is a very simple identity going on. it's impossible for there to be more referents than nodes | |||
it would be possible to have almost as many references as there are nodes, for instance if i have a single DO list that refers to the same node over and over and over again | 06:44 | ||
so, that makes me happy, because now i can allocate the reference-mapping things as a single array, arrange them in a linked list per node, and use pointer bumping to allocate | 06:45 | ||
however. | |||
when i'm optimizing, i'm modifying the tree at the same time | |||
so when e.g. i swap out node A for node B, all references to A must be replaced by B | 06:46 | ||
great, i can use my reference list, and find out all nodes that refer to A, and have them refer to B | |||
this never adds any new references, so my initial constraint still holds | 06:47 | ||
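The scheme described so far can be sketched in Python as follows: every reference entry lives in one array sized up front, entries are chained into a list per node, and allocation is a bumped index. Replacing one node with another only relinks existing entries, so the pool never grows. The class name `UserLists`, the index-based node representation, and sizing the pool by edge count are assumptions for illustration, not the expression JIT's actual data structures.

```python
class UserLists:
    """Per-node lists of users, stored in one fixed-size pool."""

    def __init__(self, operands):
        # operands[n] is the list of node indices that node n refers to.
        self.operands = operands
        capacity = sum(len(ops) for ops in operands)  # one entry per edge
        self.user = [0] * capacity          # node that holds the reference
        self.next = [-1] * capacity         # next entry in the list, -1 ends it
        self.head = [-1] * len(operands)    # first entry for each node
        self.used = 0                       # the bumped "pointer"
        for user, ops in enumerate(operands):
            for target in ops:
                self._add(target, user)

    def _add(self, target, user):
        entry = self.used
        self.used += 1  # bump allocation: entries are never freed one by one
        self.user[entry] = user
        self.next[entry] = self.head[target]
        self.head[target] = entry

    def users(self, node):
        """Return the users of `node`, one item per reference."""
        out = []
        entry = self.head[node]
        while entry != -1:
            out.append(self.user[entry])
            entry = self.next[entry]
        return out

    def replace(self, old, new):
        """Make every user of `old` refer to `new`, reusing old's entries."""
        if old == new:
            return
        entry = self.head[old]
        while entry != -1:
            following = self.next[entry]
            ops = self.operands[self.user[entry]]
            ops[ops.index(old)] = new  # rewrite one operand slot per entry
            self.next[entry] = self.head[new]  # move the entry to new's list
            self.head[new] = entry
            entry = following
        self.head[old] = -1
```

After `replace`, the `used` counter is unchanged, which is the "never adds any new references" property from the discussion.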
but B can be the root of a tree, as can A | |||
all references to the children of A have now become useless (A has been spliced out). all children of B have no references allocated to them | |||
the second part is a problem because i would like to iterate over B (having not visited it before) and see if there is anything there i can optimize | 06:49 | ||
and if i want to assign references to the nodes in B, i can't because i'll easily end up with more references than i'd have allocated | 06:52 | ||
so, i can now: try to garbage collect all references from A | |||
which will break my pointer-bumping scheme, but okay | |||
will have to use a free list scheme instead | 06:53 | ||
or B, allocate all references from a region (MVM_spesh_alloc), and have them cleaned up later | |||
i mean, i know fairly well that B is the solution here | |||
i also have C, split optimization in multiple passes | |||
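The first option listed above (reclaiming dead entries and handing them out again through a free list) could look roughly like this Python sketch. `EntryPool` and its methods are hypothetical names; the point is only that released slots are chained together and reused before the bump index advances.

```python
class EntryPool:
    """Fixed-capacity pool of entries with a free list for reuse."""

    def __init__(self, capacity):
        self.next = [-1] * capacity  # doubles as the free-list link
        self.used = 0                # bump index for never-used slots
        self.free = -1               # head of the free list, -1 when empty

    def alloc(self):
        if self.free != -1:
            # Prefer a slot that was released earlier.
            entry = self.free
            self.free = self.next[entry]
        elif self.used < len(self.next):
            # Otherwise fall back to plain bump allocation.
            entry = self.used
            self.used += 1
        else:
            raise MemoryError("entry pool exhausted")
        self.next[entry] = -1
        return entry

    def release(self, entry):
        # Push the slot onto the free list so a later alloc can reuse it.
        self.next[entry] = self.free
        self.free = entry
```

The cost relative to pure pointer bumping is the extra branch in `alloc` and the need to find and release every dead entry, which is the bookkeeping the discussion is trying to avoid.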
nine | brrt: how performance sensitive is that code anyway? Spesh already runs in a separate thread and thus the JIT does, too. | 06:54 | |
brrt | it's partly about performance, but it's also partly about manageability | 06:55 | |
brb | 06:56 | ||
07:12
brrt joined
|
|||
brrt | on the other hand, two things come to mind | 07:14 | |
the sooner we finish compilation, the sooner that code now actually runs | |||
so fast compilation does have an effect | |||
more importantly, the size of the data that the expr compiler works on is potentially huge | 07:15 | ||
we have already very large frames | |||
ultimately, we want to compile the entire frame to an expression | |||
so cheap and cheerful is i think the way to go | |||
(what if all you need is a hammer, and good aim?) | 07:16 | ||
nine | brrt: MVM_spesh_alloc is pretty much just pointer bumping and you don't have to care about freeing the memory. Sounds like the perfect compromise between performance and manageability | ||
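A region allocator of the kind described here can be sketched in a few lines of Python. This is a generic illustration of "bump a pointer, free everything at once", with made-up names (`Region`, `alloc`, `reset`) and an arbitrary block size; it is not MVM_spesh_alloc's actual implementation.

```python
class Region:
    """Bump allocator over fixed-size blocks; memory is released all at once."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = [bytearray(block_size)]
        self.offset = 0  # next free byte in the newest block

    def alloc(self, size):
        if size > self.block_size:
            raise ValueError("request larger than a block")
        if self.offset + size > self.block_size:
            # Current block is full: start a new one, leaving the tail unused.
            self.blocks.append(bytearray(self.block_size))
            self.offset = 0
        start = self.offset
        self.offset += size  # the whole allocation cost is this bump
        return memoryview(self.blocks[-1])[start:start + size]

    def reset(self):
        # No per-allocation free: drop every block in one step.
        self.blocks = [bytearray(self.block_size)]
        self.offset = 0
```

Callers never release individual allocations; they call `reset` once the whole pass is finished.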
brrt | it's pretty good, yes, and to be fair, the only good reason i have not to use it in the expr tree, is that it doesn't handle dynamically growing arrays very well | 07:17 | |
and a dynamically growing array is such a super useful data structure | |||
hash table? dynamic array | |||
free list? dynamic array | |||
samcv | brrt, how would splitting opt into multiple passes be | ||
brrt | heap, dynamic array | 07:18 | |
union-find, you get the point | |||
not well defined in my mind, but you'd have one pass to get all references, another pass to do the replacements | |||
topological graph, well, in my case, static array | 07:19 | ||
that splitting would in fact not solve the problem of my references | 07:22 | ||
so, yeah, spesh allocation is the way to got | |||
*go | |||
just wanted to confirm that, is all :-) | |||
nine | brrt: thanks for asking btw. Made me investigate MVM_spesh_alloc :) | 07:23 | |
07:24
geekosaur joined
|
|||
samcv | sounds like a plan brrt | 07:26 | |
brrt | alright | ||
cool | |||
08:29
robertle_ joined
08:40
zakharyas joined
12:01
zakharyas joined
12:11
evalable6 joined
12:22
nwc10 joined
|
|||
timotimo | my readSizedInt64 is full of fastinvoke + hllize + decont_i :\ | 12:33 | |
that might make it a bit slow | |||
the hllize comes from calling .shift on the Buf[uint8], which is done as an invoke_o kind of thing | 12:36 | ||
looks like in Blob it already just calls nqp::shift_i on itself | 12:37 | ||
but it can return a Failure | |||
that's probably what frustrates the whole effort | |||
with nqp::shift_i the generated code is much nicer | 12:44 | ||
it's a single BB with only shift_i, const_i64_16, blshift_i, and add_i | |||
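The op sequence listed above (take an element, shift left by 16, add) is the usual way to assemble a wide integer from smaller chunks. A minimal Python sketch, assuming least-significant-chunk-first order and 16-bit chunks by default; the log does not describe the real snapshot layout, and `read_sized_int` is a hypothetical name.

```python
def read_sized_int(chunks, bits_per_chunk=16):
    """Combine chunks, least significant first, using shift and add."""
    value = 0
    shift = 0
    for chunk in chunks:
        value += chunk << shift  # place this chunk above the earlier ones
        shift += bits_per_chunk
    return value


if __name__ == "__main__":
    # Two 16-bit chunks forming one 32-bit value.
    print(hex(read_sized_int([0x3344, 0x1122])))
```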
12:48
MasterDuke joined
|
|||
timotimo | oh yeah! | 12:49 | |
i've reached 8 seconds | |||
MasterDuke | timotimo: for the heap analyzer? | ||
timotimo | 317% cpu usage, not bad | ||
yep! | 12:50 | ||
MasterDuke | nice. haven't gotten to try it out yet, but looking forward to the speedup | ||
timotimo | AFKBBL | 12:51 | |
MasterDuke | btw, anybody have any comments/suggestions re my INTERPOLATE questions from yesterday? | ||
timotimo | but maybe i'll push something before i go | 12:52 | |
12:54
coverable6 joined,
committable6 joined,
bloatable6 joined,
unicodable6 joined,
nativecallable6 joined,
releasable6 joined,
greppable6 joined,
quotable6 joined,
bisectable6 joined,
benchable6 joined,
evalable6 joined,
statisfiable6 joined
|
|||
timotimo | i just pushed to my branch | 12:58 | |
have fun! | |||
MasterDuke | cool, checking it out now | 12:59 | |
timotimo | yay | 13:00 | |
don't forget it has to be the file moar now spits out into /tmp | 13:01 | ||
otherwise you'll have no speed benefits | |||
also, it currently leaks file descriptors | |||
MasterDuke | right. i don't need to pass any new options to --profile (other than =heap)? | 13:02 | |
timotimo | correct | ||
i just hacked the output in | |||
MasterDuke | k | ||
timotimo | i also haven't tested it with anything that has more than one snapshot :D | ||
MasterDuke | hm, got a segv. rebuilding rakudo to make sure it isn't that | 13:04 | |
(when doing the profile) | |||
timotimo: gist.github.com/MasterDuke17/c9b2a...0f097d4e40 | 13:08 | ||
valgrind and gdb output | 13:10 | ||
timotimo | huh | 13:28 | |
MasterDuke | it gets through one call of `MVM_spesh_sim_stack_gc_mark`, but the second seems to have an invalid/missing/something `sims->frames` | 13:35 | |
timotimo | i don't think i meant to commit that | 13:36 | |
MasterDuke | i.e., it segvs on the first iteration of the loop | ||
timotimo | gimme a sec | ||
MasterDuke | commit what? | ||
timotimo | oh, did i forget to commit the independent "make heap snapshot no longer segv" stuff | ||
MasterDuke | ha. btw, i'm on your heapsnapshot_binary_format branch | 13:37 | |
timotimo | yeah | ||
MasterDuke | and i rebased master onto it | 13:38 | |
timotimo | ah! | 13:39 | |
i do have changes! | |||
Geth | MoarVM/heapsnapshot_binary_format: 3b2fefcaf4 | (Timo Paulssen)++ | 7 files WIP gc_describe functions for new spesh datastructures | 13:40 | |
timotimo | the important part is that it no longer tries to gc_mark if it's actually in heapsnapshot mode (which means there is no "worklist") | ||
MasterDuke | ah, no segv | 13:44 | |
timotimo | i had already "git add"ed it and i always just "git diff" to see what's what | 13:45 | |
but git gui thankfully showed me | |||
MasterDuke | and now there's a heapsnapshot_new_format in /tmp | ||
timotimo | yup! | ||
MasterDuke | does seem a bit faster. longer than 8s though | 13:47 | |
timotimo | compare it to the regular format on an unpatched heapanalyzer | ||
also, how big are the files and such? :) | 13:48 | ||
MasterDuke | 48m for regular, 100m for binary | 13:49 | |
2 snapshots | |||
didn't time it, but `summary` in the regular one was a bunch slower | 13:50 | ||
13:53
MasterDuke_ joined
|
|||
MasterDuke_ | hm, binary version does seem to switch snapshots much faster | 13:54 | |
oh, but maybe `top objects by size` gives different results | 13:57 | ||
timotimo | well, switching snapshots does no work | 13:59 | |
i'd time something like echo "snapshot 0\nsummary" on both implementations | |||
"top objects by size" will additionally go through the whole snapshot once again | 14:00 | ||
MasterDuke_ | timotimo: gist.github.com/MasterDuke17/91bce...9fde812822 | ||
looks like the numbers are the same, but different names | |||
timotimo | huh | ||
that's not right | |||
oh, perhaps the empty string in the string heap tripped me up | 14:01 | ||
can you check into the file to see if these strings are neighbours except off by one? | |||
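The hypothesis being floated here is that a dropped empty string shifts every later string-heap index by one, which would produce exactly the symptom reported: identical numbers attached to the wrong names. A small Python sketch of that failure mode follows; the strings, indices, and sizes are invented, and the log does not confirm that this is the actual bug.

```python
def resolve(names, records):
    """Map (string_index, size) records to (name, size) pairs."""
    return [(names[index], size) for index, size in records]


# Hypothetical string heap whose first entry is the empty string.
strings = ["", "P6opaque", "Scalar", "NQPArray"]
# Hypothetical records: (index of the type name, total size in bytes).
records = [(1, 480), (2, 96)]

correct = resolve(strings, records)

# A reader that drops the empty string leaves every later name one slot early.
without_empty = [s for s in strings if s]
shifted = resolve(without_empty, records)

if __name__ == "__main__":
    print(correct)
    print(shifted)
```

Comparing neighbouring strings in the heap, as suggested in the chat, is the direct way to tell this kind of shift apart from a genuinely different result.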
MasterDuke_ | in which file? | ||
timotimo | the old version is easier to look at | 14:02 | |
MasterDuke_ | "Spesh slot entry","<SC>","P6opaque","Scalar","Perl6::Metamodel::ContainerDescriptor","NQPMu","Method cache","Type cache entry","Boolification method","WHO","WHAT","HOW" | 14:03 | |
selection from old file around P6opaque | 14:04 | ||
"RoleToClassApplier","149","&has_method","&has_attribute","NQPArray","NQPMatch","int","str","gen/moar/stage2/QRegex.nqp","$?CLASS","$?PACKAGE","$NO_CAPS", | |||
selection from old file around NQPArray | |||
useful? | 14:05 | ||
timotimo | oh | ||
maybe snapshots 0 and 1 are swapped | |||
no that makes no sense either | 14:06 | ||
MasterDuke_ | oh, i just noticed p6opaque shows up a couple times in the new version's output | ||
3 times | |||
timotimo | it shouldn't do that :D | ||
well, it's easy to be much faster, but also wrong | 14:07 | ||
for every bit of wrong you can potentially be 10x as fast :D | |||
MasterDuke_ | yup | ||
timotimo | oh, i gotta get ready to head out | 14:08 | |
seems like i'll have to fix this later | |||
MasterDuke_ | k | 14:12 | |
timotimo | unless you can figure out what mistake i made :D | 14:15 | |
like you could print out the @!strings for both versions and compare | |||
MasterDuke_ | you think the problem is in the app? not the moarvm branch? | 14:29 | |
timotimo | hm, right, could also be the functions that write out things. not sure how, though | 14:37 | |
MasterDuke_ | i added `dd @!strings[^20]` to the BUILDs | 14:40 | |
the output for just running the app is the same | |||
15:07
brrt joined
|
|||
brrt | rebase from hell, but we'll get there... | 15:16 | |
Zoffix | .oO( ? I'm on a reeeeeebaaaasee to hell... ? ) | 15:17 | |
15:36
AlexDaniel joined
15:39
benchable6 joined
|
|||
japhb | .oO( The rebase to hell is paved with well-intentioned patches. ) | 15:52 | |
16:29
dogbert2 joined
17:03
zakharyas joined
17:17
leont joined
|
|||
nine | That's an interesting comment: github.com/MoarVM/MoarVM/blob/mast...erp.c#L108 | 18:21 | |
The lists don't actually match. I wonder how much that costs? | |||
Now finally the whole thing makes sense! "Points to the current opcode." is just plain wrong. cur_op does _not_ point at the current opcode, but at the place following that, which is the number of the first register. | 18:25 | ||
The NEXT_OP macro tells that story | 18:26 | ||
so *tc->interp_cur_op should actually point at GETREG(cur_op, 0) which is exactly what I need, because it's the register I should return values into | 18:27 | ||
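The observation above can be illustrated with a toy decoder in Python: once the opcode has been fetched, the cursor already sits on the first operand, so reading at offset 0 yields the first register index. The 16-bit little-endian widths and the instruction used are assumptions for the sketch; `next_op` and `get_reg` only borrow the names of the macros mentioned and are not MoarVM's definitions.

```python
import struct


def next_op(bytecode, cur_op):
    """Read a 16-bit opcode and return it with cur_op advanced past it."""
    (opcode,) = struct.unpack_from("<H", bytecode, cur_op)
    return opcode, cur_op + 2


def get_reg(bytecode, cur_op, offset):
    """Read the 16-bit register index stored `offset` bytes after cur_op."""
    (reg,) = struct.unpack_from("<H", bytecode, cur_op + offset)
    return reg


# Hypothetical instruction: opcode 7 followed by register operands 3 and 5.
bytecode = struct.pack("<HHH", 7, 3, 5)
opcode, cur_op = next_op(bytecode, 0)

if __name__ == "__main__":
    # cur_op no longer points at the opcode; offset 0 is the first register.
    print(opcode, cur_op, get_reg(bytecode, cur_op, 0))
```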
dogbert2 | github.com/libtom/libtommath/blob/...hanges.txt | 19:49 | |
Zoffix | "-- Fixed mp_rand() to fill the correct number of bits" we had a bug due to that | 19:50 | |
dogbert2 | and I believe we had another bug relating to 'Fixed mp_invmod()' | 19:52 | |
19:52
brrt joined
|
|||
dogbert2 | I think timotimo patched that though | 19:52 | |
brrt | good * | 19:53 | |
i'm thinking of calling of the whole rebase plan | |||
it's super frustrating | |||
it doesn't actually give a 'clean' set of patches | |||
since many of the intermediate steps are just broken | |||
it risks diverging from the current, working, code | 19:54 | ||
*off | 19:55 | ||
dogbert2 | Zoffix: RT #129829 | 19:56 | |
synopsebot6 | Link: rt.perl.org/rt3/Public/Bug/Display...?id=129829 | ||
brrt | and it distracts from actual useful changes | ||
.ask jnthn i notice a bunch of jgb_sc_wb calls to things where the operand might be an integer rather than an object | 20:12 | ||
yoleaux | brrt: I'll pass your message to jnthn. | ||
nine | .ask brrt why does the JIT clear RV in the epilogue? It could be used for returning a value to the caller otherwise | 20:32 | |
yoleaux | nine: I'll pass your message to brrt. | ||
22:09
geekosaur joined
|
|||
timotimo | ended up not doing anything much today | 22:19 | |
lizmat knows the feeling | 22:22 | ||
lizmat goes to bed | |||
samcv | another day, gotta work on ucd2c.pl again. ugh. i will prevail! | 23:26 | |
Zoffix | \o/ | 23:36 | |
samcv | Zoffix, what is your opinion on the Unified ideographs. none of them have names. it's my opinion like the control codes we should call them <CJK Unified Ideograph-7FFE> or such | 23:37 | |
because atm there's no way to distinguish them | |||
m: "?".uninames.say | 23:38 | ||
camelia | (<CJK Ideograph>)? | ||
samcv | err just CJK Ideograph i guess plus the number | ||
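The naming proposed here matches the derived names the Unicode Standard gives unified ideographs: a fixed prefix followed by the code point in hexadecimal. A minimal Python sketch, cross-checked against Python's own `unicodedata`; the function name `ideograph_name` is made up, and the range table is deliberately partial (it covers only the original assignments of two blocks), so it is an illustration and not a complete list.

```python
import unicodedata

# Partial table: original assignments of Extension A and the main block only.
IDEOGRAPH_RANGES = [(0x3400, 0x4DB5), (0x4E00, 0x9FA5)]


def ideograph_name(cp):
    """Return the derived name for a unified ideograph, or None otherwise."""
    for first, last in IDEOGRAPH_RANGES:
        if first <= cp <= last:
            return "CJK UNIFIED IDEOGRAPH-%04X" % cp
    return None


if __name__ == "__main__":
    cp = 0x7FFE  # the code point used as an example in the log
    print(ideograph_name(cp))
    print(unicodedata.name(chr(cp)))
```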
23:43
geekosaur joined
|
|||
Zoffix | samcv: yeah, sounds like a plan | 23:45 | |
23:48
geekosaur joined
|
|||
samcv | and fixing this i made a function to catch the bad output that gets in there and i'm putting it every damn place with a croak | 23:53 | |
eventually i'll figure out where it's being added in... |