02:48 ilbot3 joined 06:58 domidumont joined
samcv timotimo, more progress. Running names will print this out gist.github.com/13247bbed6c2cca656...5958d906c3 07:35
and we programmatically generate the control code's names 07:36
the best part is, UCD-gen.p6 generates the code for which points to generate that are <control-whatever> names 07:37
here is the full source gist.github.com/1989fe2c72f7aafa4c...b79c968926 (100 codepoints)
timotimo very cool 07:48
i expect this mechanism will also be used for variant selectors and for the cjk thingamabobs? 07:49
samcv yeah at least for now. not sure how we want to handle unicode names being indexed
so this should be pretty adaptable
and I can re-use the same code for things like ranges of cp's that have the same bitfield index 07:50
timotimo nice
but what do you mean "being indexed"?
you mean a request "give me the unicode name for codepoint i"?
samcv being able to go from a cp to a name without searching through all of them
yep
timotimo i suggest we build a table that'll give us indices into the uninames array (i.e. the packed 16bit ints) 07:51
samcv every some number of codepoints?
timotimo for example
if we go that route, we'll have to make sure we can address sub-int units, i.e. the base40 codemes
samcv just give it the index that has the 0 value and so the name begins
i don't think we need that. and storing pointers to all of them would be a bit wasteful 07:52
we could always skip a codeme though
timotimo always skip one?
samcv if we started out seeing the end of the previous string
err skip. uhm
each base 40 sig digit place 07:53
was that a codeme, or was the single 40-digit integer a codeme
timotimo well, i suppose we can guarantee that no string has a name of length two, and the strings that are encoded as just a 0 we can already handle through get_uninames
codemes is what a 16bit int has three of
samcv kk
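A minimal sketch of the packing timotimo describes, where one 16-bit integer carries three base-40 codemes. The function names `pack` and `unpack` are hypothetical, and the assumption that digit 0 serves as the name terminator is taken from the discussion above, not from the MoarVM source.

```python
BASE = 40  # assumed alphabet size; digit 0 is taken to be the name terminator

def pack(codemes):
    """Pack exactly three base-40 digits into one integer."""
    a, b, c = codemes
    assert all(0 <= d < BASE for d in (a, b, c))
    return (a * BASE + b) * BASE + c

def unpack(word):
    """Split a packed integer back into its three base-40 digits."""
    return [word // (BASE * BASE), (word // BASE) % BASE, word % BASE]

# The largest packed value is 40**3 - 1 == 63999, which fits in 16 bits.
assert pack([39, 39, 39]) == 63999
```

A word such as `pack([0, 7, 0])` holds two terminators around a single letter, which is the ambiguous case raised a few lines further down: pointing at the word alone does not say which of the two zeros is meant.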
i don't see why having a name that is only 2 letters long is an issue
timotimo oh, only one letter long would be a problem 07:54
samcv 0XX|0 < we have to read two 40-base ints
timotimo because then you'd have 0, letter, 0 in the int
samcv why can't we have that?
timotimo if we index based on 16-bit units, and we say "we give you the 16bit int that has the zero that the previous name ends with", you can't tell if the first zero was meant or the second one
samcv it reads between the first 0 and the next, and whatever is between those two (codeme wise) is the name
timotimo ok, so how do you address the next word? 07:55
samcv hm. ok
i see your point
timotimo as long as the array of codemes doesn't exceed what we can address with 16bit, it should be very doable to have a fast access table that doesn't waste too much memory 07:57
and if we reach the point where we no longer can address it with 16bit per entry, we can go from absolute indices to encoding the distance between what we index and turn the binary search into a linear search 07:58
samcv so there's like 0xE01ED names
m: 0xE01ED.base(2).chars 07:59
camelia ( no output )
samcv m: 0xE01ED.base(2).chars.say
camelia rakudo-moar 582260: OUTPUT«20␤»
samcv that doesn't fit into 16-bit
so I guess we could do it every other
timotimo you have it the wrong way around i think 08:00
my idea was this:
we have an array that'll tell us: codepoint name for U+100 starts at codeme 45, codepoint name for U+200 starts at codeme 99, codepoint name for U+300 starts at codeme 123
so what we have to store is the codeme's address, which is (if we don't have a trick) the length of our 16-bit array times 3 08:01
samcv yeah exactly
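The fast-access table can be sketched roughly as follows. This is an illustration under assumptions, not the MoarVM implementation: names are modelled as lists of nonzero digits in a flat list, each followed by a 0 terminator, and entries are numbered consecutively instead of by real codepoint. The names `build`, `lookup` and `stride` are hypothetical.

```python
def build(names, stride):
    """Flatten names into one codeme list and record where every stride-th name starts."""
    codemes, starts = [], []
    for i, name in enumerate(names):
        if i % stride == 0:
            starts.append(len(codemes))
        codemes.extend(name)
        codemes.append(0)  # terminator after every name, including empty ones
    return codemes, starts

def lookup(codemes, starts, stride, i):
    """Return the codemes of name i: jump to the nearest indexed name, then scan forward."""
    pos = starts[i // stride]
    for _ in range(i % stride):
        pos = codemes.index(0, pos) + 1  # skip one whole name
    end = codemes.index(0, pos)
    return codemes[pos:end]
```

With a stride of 100, as in the U+100 / U+200 / U+300 example, a lookup scans past at most 99 names after the jump, and the table holds only one offset per 100 entries.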
timotimo if we have the "we just point at a 16bit integer and wherever the 0 is that's where the name will start" we can cut that to a third
samcv so we just store the index of every other unicode name
timotimo i don't get it 08:02
samcv then we will fit into a 16-bit int
timotimo we already only store the index of every 100th unicode name
samcv index 0, cp 0/1, index 3 cp 2/3 etc
timotimo (that number is up for debate and tuning of course)
samcv so we know where every odd numbered cp starts
timotimo oh?
samcv if it's even it's the second name
timotimo that sounds smart 08:03
samcv and then it will be fine even if we have 2 character unicode 1 names
timotimo but if we already use 20 bits and we only address every other, we go to 19 bits, not to 10 bits :(
samcv true :( 08:05
timotimo we'd have to ... *counts on fingers* ;)
every 16th
samcv yeah 08:06
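The finger-counting can be checked with a few lines of arithmetic. This sketch only counts how many bits are needed to number the index entries when every `stride`-th one is kept; the helper name `index_bits` is hypothetical, and the width actually needed for stored codeme offsets depends on the size of the packed array, which is not given here.

```python
def index_bits(count, stride):
    """Bits needed to number every stride-th entry out of `count` entries."""
    entries = -(-count // stride)  # ceiling division
    return max(1, (entries - 1).bit_length())

COUNT = 0xE01ED  # the figure samcv quotes above

every_one = index_bits(COUNT, 1)     # all entries
every_other = index_bits(COUNT, 2)   # samcv's "every other"
every_16th = index_bits(COUNT, 16)   # timotimo's "every 16th"
```

Each doubling of the stride saves one bit, so going from 20 bits to 16 takes a stride of 2**4 = 16.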
08:10 brrt joined
brrt ohai #moarvm 08:10
perhaps unsurprisingly, timeout(1) in perl is about 10 LoC 08:11
perhaps more surprisingly, our expr JIT failure was really difficult to pin down to a miscompile or a mis-register-allocate
the broken frame was in fact '!reserve_fates', so that was kind of suspicious in my book 08:12
given that i'd seen a bunch of commits by TimToady that had.. 'something' to do with that
timotimo huh, interesting 08:13
brrt now i've merged origin/master into even-moar-jit, and my infinite loop has disappeared
timotimo i think reserve_fates is absolutely new?
brrt this is something i know nothing of
timotimo it has to do with having fate indices globally unique 08:14
brrt i'm perfectly happy to pass the compiled frames for further inspection
timotimo where globally is, of course, problematic when we precompile stuff that we later load in
brrt but i can't for the life of me tell what the error is
hmmm
Geth MoarVM/even-moar-jit: 23 commits pushed by bdw++
review: github.com/MoarVM/MoarVM/compare/3...840891ced4
brrt in fact, the expression JIT does a bunch of interesting things, but not that 08:15
timotimo "that"?
brrt not anything that's obviously broken
timotimo ah
brrt heh, i'll make a gist, see what you think
timotimo so i'll be looking at mvm bytecode on one side and x86_64 assembly on the other? 08:17
brrt gist.github.com/bdw/10b46860853822...7580f002a6
no, two different compiles of the same frame
08:19 domidumont joined
brrt aha 08:19
08:19 zakharyas joined
brrt i think i've spotted it 08:19
timotimo i should be cleaning and tidying :o
time's running out
brrt no matter 08:20
timotimo what's the difference? 08:22
brrt well, i'm adding to r10, while i should be adding to r9 08:23
somehow the register has changed from r9 to r10
weird!
08:24 domidumont joined
timotimo sounds weird, yeah 08:25
brrt oh, i think i know why it's happened
the output register is r10, but the first input register is r9 08:26
timotimo ah, and it derped which is out and which is in?
brrt yeah, this is another variant of three-op/two-op shenanigans
timotimo yeah, that is difficult
brrt oh well 08:28
there's probably a generic way to resolve this…. but i need to find it first 08:29
timotimo so is that fixed now or just circumvented? 08:30
tbqh i'm surprised we could compile such a big frame with the exprjit 08:36
i thought its selection of tiles was so very limited still 08:37
there's even some gotos in there
wait. that doesn't necessarily mean all of that is exprjit compiled 08:41
it's quite crazy to me that that code is mostly mov
and it's not all useless movs, too
brrt well, most of it can be erased 08:50
there are a bunch of getlex to the same point 08:51
these can safely be optimized away
well, for some value of 'safely'
timotimo heh.
brrt argh, never change the conditions while debugging 09:27
(git reflog is a real lifesaver) 09:30
timotimo oh yes 09:31
Geth MoarVM/even-moar-jit: 6402e4e2b7 | (Bart Wiegmans)++ | src/jit/x64/tiles.dasc
Fix output register for indirect indexed tiles

Another case of three-operand/two-operand conversion pain. I feel like the solution should be generalized, but it's not entirely obvious how.
09:44
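For readers unfamiliar with the three-operand/two-operand problem: the expression tree says `out = in1 + in2`, but x86 `add` overwrites its first operand, so the tile has to decide which register to accumulate into. The sketch below is a generic illustration of that lowering, not the fix made in the commit above; `lower_add` is a hypothetical helper and the output is plain instruction text.

```python
def lower_add(out, in1, in2):
    """Lower `out = in1 + in2` to two-operand, x86-style instructions."""
    if out == in1:
        # Output already holds the first input: accumulate in place.
        return [f"add {out}, {in2}"]
    if out == in2:
        # Addition commutes, so accumulate into the register already in place.
        return [f"add {out}, {in1}"]
    # Distinct registers: copy the first input to the output, then add.
    return [f"mov {out}, {in1}", f"add {out}, {in2}"]
```

The `out == in2` branch is only valid because addition commutes. For a non-commutative operation such as subtraction, that case needs a scratch register or a different instruction sequence, which is one reason a fully general solution is not obvious.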
brrt yay, solved 09:47
each bug fixed brings us closer :-)
timotimo neato 10:02
brrt now, i'm kind of wondering what to attack next 10:35
best choice is probably C function call support 10:36
timotimo ah, yes
brrt because that's the blocking thing, mostly
timotimo that'll give us a whole bunch of ops in one fell swoop
(since we still have that script that parses graph.c)
brrt indeed
timotimo i don't really have an overview of what exactly works now 10:52
the exprjit can now fully compile if/then/else?
brrt yeah 10:55
samcv nqp::index goes to MVM_string_index right? 10:56
timotimo it should
you can check in interp.c
samcv hmm for some reason i wasn't getting any 2+ needle length calls to that, in cases where both strings were made up of 32 bit graphemes 10:57
ah yeah that is what my issue was
timotimo what exactly? 10:59
samcv just some ascii text 11:00
timotimo oh btw, you can use the xx operator (or is it x?) to build a long string (as a rope) and then turn the rope into a flat array of graphemes by using nqp::indexingoptimized
oh, it optimized the text into 8bit graphemes?
samcv yeah
i suppose
ya
timotimo we used to have indexingoptimized as "flattenropes", which would modify the string in-place. that was a very fragile thing that led to many explosions :D 11:01
nine .tell brrt I'd attack whatever is keeping others from contributing to the exprjit
yoleaux2 nine: I'll pass your message to brrt.
11:08 brrt joined
brrt nine, yeah, that's mostly function call support :-) 11:09
yoleaux2 11:01Z <nine> brrt: I'd attack whatever is keeping others from contributing to the exprjit
brrt so what works now: basic register allocation, spill/loads, marking of register preferences (but not conflict resolution) 11:10
the basic challenge of function calls is that we have to a) ensure that values in volatile registers are properly spilled on a function call; b) ensure that values are in their proper places for ABI requirements 11:12
i have a strategy in place to tackle that
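The two requirements can be made concrete with a small planning sketch. It assumes the System V x86-64 calling convention (the register sets differ on Windows x64) and says nothing about the strategy brrt has in mind; `plan_call`, `live` and `args` are hypothetical names.

```python
# System V x86-64: registers a callee is allowed to clobber (caller-saved).
VOLATILE = {"rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11"}
# System V x86-64: integer argument registers, in order.
ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def plan_call(live, args):
    """live maps values that survive the call to their registers; args lists call arguments."""
    # a) every live value sitting in a volatile register must be spilled first
    spills = sorted(v for v, reg in live.items() if reg in VOLATILE)
    # b) each argument gets its ABI-mandated place; overflow goes to the stack
    placement = {}
    for i, value in enumerate(args):
        if i < len(ARG_REGS):
            placement[value] = ARG_REGS[i]
        else:
            placement[value] = f"stack[{i - len(ARG_REGS)}]"
    return spills, placement
```

A real allocator also has to order the moves so that placing one argument does not overwrite another that has not been moved yet; this sketch leaves that out.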
nine :) 11:22
12:10 domidumont joined 12:27 domidumont joined 13:09 ggoebel joined 13:12 domidumont joined 13:22 zakharyas joined 15:10 brrt joined
brrt so, one of the minor issues of complexity 15:11
we can have more arglist arguments than we can have registers 15:12
so; if we tried to load all these arguments in registers (as we would if they were 'normal' references), then we'll have infinite problems 15:13
as a result, we shouldn't really do that
instead…. we can load it directly from the spilled live range (if any) 15:15
15:20 zakharyas joined
brrt hmm, here's an idea 15:33
nope, scrap the idea
(the idea was: magic-case the handling of ARGLIST in value references. not a good idea)
but ARGLIST things in general need to be special cased.... 15:49
okay, we'll figure it out
16:29 TimToady joined 17:27 FROGGS joined 17:36 zakharyas joined 18:57 ggoebel joined 18:59 pyrimidine joined 19:33 Geth joined 20:05 pyrimidine joined 20:30 pyrimidine joined 22:02 pyrimidine joined
samcv here is a gist of what I get backtracing moar when running prove on the nqp regex test 01 gist.github.com/875b9bbfe6689b0bff...a1bb2f74a3 22:45
jnthn That's just invoking a frame 22:57
It suggests whatever the hang is, it's in high-level code in a loop, not in C code in a loop
MVM_dump_backtrace(tc) may help
Note that since you attached, stdout will be pointing to the original place it was going 22:59
So, maybe test harness
Can always open a file in the debugger and dup2 or so
If -v won't show the output
timotimo hopefully running the test file directly with nqp-m will also give you the problem 23:03
jnthn Aye, that makes things easier if so :) 23:04
23:09 pyrimidine joined 23:37 pyrimidine joined