00:19 diakopter left 01:31 colomon joined 02:14 JimmyZ joined 04:09 jnap joined
dalek MoarVM: 26e596f | larry++ | src/strings/ops.c:
only fetch char once for built-in cclasses

Many of the cases fell through and refetched the character at an offset into the string. (Worst case, \w would do 7 fetches and 6 table lookups on any CJK character!) Now we just fetch once into the cp var, and call the property lookup function directly. (Having cp handy also allows us to short-circuit the ASCII lookups and avoid the Unicode tables entirely in the common case.)
05:07
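The idea in that commit message (fetch the codepoint once, answer ASCII directly, and only then consult the Unicode tables) can be sketched roughly as follows. This is not the MoarVM code: `is_word_char` and `lookup_is_letter` are hypothetical names, and the table lookup is a stub that only recognizes one CJK block.

```c
#include <stdint.h>

/* Hypothetical stand-in for a Unicode property table lookup. A real VM would
 * consult generated tables; this stub only knows the CJK Unified Ideographs
 * block U+4E00..U+9FFF. */
static int lookup_is_letter(int32_t cp) {
    return cp >= 0x4E00 && cp <= 0x9FFF;
}

/* Word-character test: fetch the codepoint once into cp, short-circuit
 * ASCII, and fall back to a single property lookup otherwise. */
static int is_word_char(const int32_t *str, int64_t offset) {
    int32_t cp = str[offset];              /* the only fetch */
    if (cp < 128) {                        /* common case: no table access */
        return (cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z')
            || (cp >= '0' && cp <= '9') || cp == '_';
    }
    return lookup_is_letter(cp);           /* one lookup for non-ASCII */
}
```

Keeping `cp` in a local means every branch works on the same value, so no case can fall through and re-read the string.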
05:09 jnap joined
TimToady that last change actually makes the spectest run slightly faster, seems 05:15
dalek MoarVM: a4a76ed | larry++ | src/strings/ops.c:
check Lo before Ll, Lu, Lt, and Lm

  (We already handle ASCII separately, and there are a lot of CJK chars.)
05:45
TimToady we really oughta just be able to check L, but that doesn't work for some reason 05:46
JimmyZ TimToady++ 05:48
TimToady figured you might like that one :) 05:49
JimmyZ yeah
TimToady there are a lot of CJK characters, and a lot of Chinese characters like you :) 05:50
JimmyZ In php, I use github.com/moriyoshi/libmbfl 05:52
:P
s/I/we/ 05:53
dalek MoarVM: bc6cd39 | larry++ | src/strings/unicode_ops.c:
add routine to get non-boolean uniprops

  (No idea how to hook this in yet, but it'd be nice to be able to look
up the numeric value of a character, for instance.)
06:08
TimToady need a way to look up names too 06:09
prolly shoulda put that into a branch, oh well 06:22
dalek MoarVM: 8c10875 | larry++ | src/strings/unicode_ops.c:
service function to fetch character name

  (Nobody calls it yet, so no severe need for a branch. What, me lazy?)
Also, delete dup of previous function I should've put in a branch. :)
06:57
TimToady I guess there wasn't a dup, so never mind that line... 06:59
I don't suppose there's a howto on adding opcodes, or more importantly, a hownotto... 07:22
I guess the most important rule is "Don't make jnthn cry." :) 07:29
moritz the most important rule is "don't cause an op renumbering", 'cause that requires a rebootstrap 07:50
TimToady what about renaming an existing op without renumbering? hasuniprop should really be named getuniprop, because the binary properties automatically return 0 or 1, but generalizing it to getuniprop can return things like numeric value as well 07:53
actually, I take that back--obviously something in the spectests is assuming that will return only 0 or 1 :) 07:55
I'd need a new opcode for the char name anyway 07:58
07:59 FROGGS joined
FROGGS TimToady: renaming should be okay, because the compiled stage0 in nqp does not know about the op names 08:35
and in theory you can renumber all ops that were added after the last time somebody made a fresh stage0 08:36
and you can renumber ops that are not used by stage0 obviously
but I think there is no need to reorder anything
jnthn oplist has a rules list at the top :) 09:12
nwc10 presumably if the spectest is relying on something, it's always possible to move the goalposts, er, change the spec
jnthn But adding an op is really just editing that file, running perl6 tools/update_ops.p6 or so, and then adding it in the right place in interp.c. 09:15
(the first just updates the auto-gen'd, too-boring-to-do-by-hand, op metadata)
bbl & 09:19
timotimo good catch for the unicode improvements! 09:51
10:13 odc joined
JimmyZ Good evening 12:21
12:24 colomon joined
FROGGS hi JimmyZ 12:28
JimmyZ Hello FROGGS 12:41
masak Hello, JimmyZ 12:44
JimmyZ Good afternoon, 麦高 13:04
masak :) 13:06
JimmyZ: Actually, my wife just calls me "卡尔" (Carl).
It's easy. 13:07
13:39 TimToady joined 13:57 camelia joined 14:08 colomon joined
JimmyZ Good evening, Carl 14:34
oh there is masak-masak game 14:36
masak yep. 15:24
unless that's all blog spammers.
TimToady okay, so now I've gotta figure out how to call my spiffy new opcodes from nqp and/or perl6-m 17:41
any hints?
doubtless this has been explained here on this channel to other people before, but I confess I tuned that part out :) 17:42
TimToady goes to search the logs
timotimo so you added it to the interp.c? 17:44
TimToady yes, and it compiles
timotimo you'll have to find MASTOperations.nqp or something similar
nqp/src/vm/moar/QAST/QASTOperationsMAST.nqp 17:45
TimToady I've been looking at that file and wondering how much I can cargo cult
timotimo the mapping of "nqp::function" to "moarvm bytecode" is there
i've cargo culted most of it so far :P
but by now i'm able to explain things
TimToady do I have to register the nqp names someplace else?
timotimo i don't think so 17:46
TimToady okay, then I can probably figure it out :)
thanks
> say nqp::getuniname(0x1f4a9) 18:01
PILE OF POO
\o/
> say nqp::getuniname(0x4e05)
Segmentation fault
/o\ 18:02
is probably null, needs to return name at beginning of span, '<CJK UNIFIED IDEOGRAPH>' in this case
a pity that C doesn't seem to like an initializer that shares constant strings among different entries 18:04
actually, the real spans are fine, it's just the ones that are faked with NULL pointers 18:08
how to get those to share the constant strings... 18:09
it's not like initializers in C are executable code... 18:10
18:12 tgt joined
TimToady or is the C compiler smart enough to consolidate idential constant strings? 18:13
*tical 18:14
no, it is not--at least mine isn't 18:17
grr, src/strings/unicode.c:14520:14: error: initializer element is not constant 18:21
src/strings/unicode.c:14520:14: error: (near initialization for ‘codepoint_names[13401]’)
despite having declared it as const char *cjk = "<CJK IDEOGRAPH>"; 18:22
const char const *cjk doesn't help either 18:23
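For context on that error: in standard C, the value of a pointer variable is not a constant expression, even when the variable is declared `const`, so it cannot appear in a file-scope initializer. The address of an object with static storage duration is an address constant, though, so a named `char` array can be shared between entries. A minimal sketch, with hypothetical names (`cjk_name`, `shared_names`) rather than the real `codepoint_names` table; some compilers may accept the failing form as an extension.

```c
#include <stddef.h>

/* Fails at file scope on a strictly conforming compiler:
 *
 *     const char *cjk = "<CJK IDEOGRAPH>";
 *     static const char *table[] = { cjk };
 *     // error: initializer element is not constant
 *
 * because the value stored in the variable cjk is not a constant expression.
 */

/* An array with static storage duration decays to an address constant,
 * so it is allowed in a static initializer and can be reused freely. */
static const char cjk_name[] = "<CJK IDEOGRAPH>";

/* Hypothetical table: two entries share one copy of the string. */
static const char *const shared_names[] = {
    "LATIN SMALL LETTER A",
    cjk_name,
    cjk_name,
    NULL
};
```

Both `cjk_name` entries hold the same pointer, so the string is stored only once whether or not the compiler merges identical literals.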
timotimo could have negative numbers instead of NULL :P
TimToady I'm sure C will just love that
timotimo surely will! :)
TimToady if it wasn't going to get stuffed into a readonly segment (potentially), I could doctor the array after the fact 18:24
timotimo and you can't figure out the absolute addresses properly without doing the whole linking yourself or something like that :| 18:25
TimToady I need a try in C that will catch a SIGSEGV
timotimo and you can't do the checking yourself? 18:26
TimToady if I want to cheat and change the NULL to the earlier valid pointer, it might fail 18:27
I suppose I could take the const off the table and see what happens :)
timotimo everything will be 100x slower
TimToady yeah
well, for now I'll just assume it's an uncommon operation to look for the name of a spanned character, and just look for the previous entry somehow 18:31
and I'll throw the name into the table occasionally to prevent backscanning hundreds of entries 18:45
18:45 FROGGS joined
FROGGS wow, what do my tired eyes read in the backlog? 18:48
TimToady++
19:19 jnap joined
dalek MoarVM: 54d4d4b | larry++ | / (10 files):
introduce getuniname and getuniprop opcodes

Also tweak support routines for robustness and efficiency. In support of fast(ish) name lookup in spanned areas with NULL names, we introduce the name into the table every 25 entries maximum to avoid long backscans, without using up much more memory.
21:26
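The backscan described in that commit message could look roughly like the sketch below. It assumes a table in which a NULL entry means "same name as the nearest earlier non-NULL entry"; `demo_names` and `name_at` are hypothetical, and the span name is repeated every few entries here (rather than every 25) to keep the example small.

```c
#include <stddef.h>

/* Hypothetical miniature name table. NULL means "inherit the name of the
 * nearest earlier non-NULL entry". */
static const char *const demo_names[] = {
    "LATIN SMALL LETTER A",
    "<CJK UNIFIED IDEOGRAPH>",   /* start of a span */
    NULL, NULL, NULL,
    "<CJK UNIFIED IDEOGRAPH>",   /* repeated so backscans stay short */
    NULL, NULL
};

/* Walk backwards from index until a non-NULL name is found. The result can
 * still be NULL if the table itself begins with a NULL entry. */
static const char *name_at(const char *const *table, size_t index) {
    while (index > 0 && table[index] == NULL)
        index--;
    return table[index];
}
```

Because the name is reinserted at a bounded interval, the loop runs at most that many steps, at the cost of a few extra pointers in the table.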
[Coke] are these opcodes going to be exposed to nqp? 21:28
TimToady shortly 21:30
got stuck headless, recompiling
[Coke] gentle reminder to add them to docs/*
masak not "headless"; you always have a HEAD in Git. but it can be "detached", like Nearly Headless Nick. ;) 21:41
FROGGS Sir Nickolas if you please :o) 21:42
tadzik :D 21:43
masak :P
23:24 colomon joined 23:27 dalek joined 23:56 tgt joined