github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
Geth MoarVM/kosher-unicode-names-only: a34e9e4bae | (Nicholas Clark)++ | src/strings/unicode_ops.c
Don't add placeholders such as "<control>" to the Unicode names lookup hash.

Including these makes no sense - the hash is for a 1->1 mapping of name to codepoint, but for each of these names we have multiple values.
The way our hashes are currently implemented (with "bind"), these duplicate keys are not ignored - we add hundreds of duplicate entries to the same ... (5 more lines)
14:00
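The idea in the commit message above can be sketched as follows. This is only an illustration under stated assumptions, not MoarVM's code: the `name_entry` struct and the `is_placeholder`, `build_lookup` and `lookup_codepoint` functions are hypothetical, and a plain array with a linear scan stands in for the real hash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one row of a codepoint -> name table. */
typedef struct {
    int         codepoint;
    const char *name;      /* may be NULL when no name is stored */
} name_entry;

/* Placeholder names such as "<control>" are shared by many codepoints,
 * so they cannot serve as keys in a 1->1 name -> codepoint mapping. */
static int is_placeholder(const char *name) {
    return name == NULL || name[0] == '<';
}

/* Copy only entries with real names into out (capacity >= n).
 * Returns the number of entries kept. */
static size_t build_lookup(const name_entry *in, size_t n, name_entry *out) {
    assert(in != NULL && out != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (!is_placeholder(in[i].name))
            out[kept++] = in[i];
    }
    return kept;
}

/* Linear lookup by exact name; returns -1 when the name is absent. */
static int lookup_codepoint(const name_entry *table, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].codepoint;
    }
    return -1;
}
```

With the placeholders filtered out before insertion, a lookup for "<control>" simply misses instead of hitting one of many duplicate entries.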
Geth MoarVM: nwc10++ created pull request #1321:
Don't add placeholders such as "<control>" to the Unicode names lookup hash.
14:05
timotimo that makes sense 14:12
but don't we call them <control-001> and such? 14:13
m: say uniname("\x03")
camelia <control-0003>
nwc10 there's (different) C code to implement that 14:17
the forwards map code point->name 14:18
timotimo ah
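The forward direction (code point to name) can synthesize a unique name per control character, which is why `uniname` prints `<control-0003>` even though the stored placeholder is shared. A rough sketch of that idea, with hypothetical function names and an assumed control range (C0, DEL and C1); it is not the MoarVM implementation:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: U+0000..U+001F and U+007F..U+009F count as controls here. */
static int is_control(unsigned cp) {
    return cp <= 0x1F || (cp >= 0x7F && cp <= 0x9F);
}

/* Write "<control-XXXX>" into buf for a control code point.
 * Returns the number of characters written, or 0 if cp is not a
 * control or the buffer is too small. */
static int synth_control_name(unsigned cp, char *buf, size_t len) {
    assert(buf != NULL);
    if (!is_control(cp))
        return 0;
    int n = snprintf(buf, len, "<control-%04X>", cp);
    return (n > 0 && (size_t)n < len) ? n : 0;
}

/* Reverse direction: parse "<control-XXXX>" back to a code point.
 * Returns -1 if the text does not have that shape. */
static long parse_control_name(const char *name) {
    unsigned cp;
    char close;
    if (strncmp(name, "<control-", 9) != 0)
        return -1;
    if (sscanf(name + 9, "%4X%c", &cp, &close) == 2 && close == '>')
        return (long)cp;
    return -1;
}
```

Because the synthesized name embeds the code point, it can be parsed back without any hash entry at all.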
nwc10 is worried that there is a stupid bug in that pull request 14:46
I have. 14:57
The bug is real. The fix is in the wrong place. Meaning also that it is untested. 14:58
nwc10 jnthn: there's a stupid bug in that pull request 16:23
*And*
even fixing it in the place in the C code that I should have 16:24
is two places in the C code
is
...
what is it doing? What are all these span things, and FATE_*
and I think
jnthn Hm, I suspect trying to store the Unicode database compactly, if you mean the code I think you do :)
nwc10 actually that code is the remains of something else, and if I have it right, the entire "extents" thing doesn't matter any more - it's a linear scan of codepoint_names 16:25
jnthn: yes, it's storing the *properties* efficiently
but the names, they either "no longer need to be stored with this" or possibly "never needed to be stored with this"
so I think that the C code in generate_codepoints_by_name can be a lot simpler. 16:26
jnthn I seem to recall *something* about names changed along the way.
nwc10 :-)
jnthn This wouldn't surprise me.
nwc10 back in 2013?
nwc10 and then 16:26
I think my "revised fix" is wrong - it manages to get the code that ignores '<' into the correct place
in the C
but the correct place to ignore '<' is in ucd2c.pl 16:27
and never even put it into the C array
and I also really don't get why U-0081 is NULL in that array 16:28
Geth MoarVM/kosher-unicode-names-only: 09dfa207e1 | (Nicholas Clark)++ | 2 files
Don't add placeholders such as "<control>" to the Unicode names lookup hash.

Including these makes no sense - the hash is for a 1->1 mapping of name to codepoint, but for each of these names we have multiple values.
The way our hashes are currently implemented (with "bind"), these duplicate keys are not ignored - we add hundreds of duplicate entries to the same ... (5 more lines)
16:33
timotimo m: say uniparse(uniname("\x02")).raku 18:00
camelia "\x[2]"
Geth MoarVM: 67c8413f5b | (Nicholas Clark)++ | tools/ucd2c.pl
ucd2c.pl shells out to rakudo, without checking that it ran correctly.

The error message gets lost in its garrulous output.
18:37
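The fix above is in Perl (tools/ucd2c.pl); the same principle, checking that a shelled-out command actually succeeded, looks roughly like this in C on a POSIX system. `run_checked` is a hypothetical helper written for this illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Run cmd through the shell and report how it ended.
 * Returns the command's exit status (0 means success), or -1 if the
 * shell could not be started or the command was killed by a signal. */
static int run_checked(const char *cmd) {
    assert(cmd != NULL);
    int status = system(cmd);
    if (status == -1)
        return -1;                 /* could not create the child process */
    if (!WIFEXITED(status))
        return -1;                 /* terminated abnormally */
    return WEXITSTATUS(status);    /* the command's own exit code */
}
```

A caller would stop and report when the result is non-zero, rather than carrying on and letting the failure get buried in later output.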
nwc10 well, that was part of my problem.
OK, with that, Unicode UCD 12.1, Emoji 12 (ie 12.0, not 12.1) I can *almost* rebuild files byte-for-byte perfectly 18:39
but src/strings/unicode_db.c differs for uni_seq_722 to uni_seq_733
was Emoji_Combining_Sequence
now Emoji_Keycap_Sequence 18:40
and this makes little sense, as those things seem to be the same all the way back to emoji-5.0
jnthn: yes, it's that code related to $usually 18:56
timotimo dispatch chains haven't been in any of jn's talks, right? 19:09
dogbert17 there's some strange code in src/strings/unicode_db.c 19:22
what's happening here for example: 19:23
    if (block) {
        return block ? Block_enums[num+1] : Block_enums[0];
    }
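What makes the quoted snippet strange is that the ternary re-tests `block` inside an `if (block)`, so its `:` arm can never run. A small sketch of that equivalence follows; the table contents are made up, and the fallback after the `if` is an assumption, since the log does not show what follows it in the generated file.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up stand-in for the generated table; index 0 is the "no block" entry. */
static const char *Block_enums[] = { "No_Block", "Basic_Latin", "Latin_1_Supplement" };

/* Shape of the code as quoted in the log. */
static const char *as_quoted(int block, int num) {
    assert(num >= 0 && num + 1 < 3);
    if (block) {
        /* block is non-zero here, so the ':' arm is unreachable */
        return block ? Block_enums[num + 1] : Block_enums[0];
    }
    return Block_enums[0];   /* assumed fallback, not shown in the log */
}

/* Same behaviour under that assumption, with the redundant test removed. */
static const char *simplified(int block, int num) {
    assert(num >= 0 && num + 1 < 3);
    return block ? Block_enums[num + 1] : Block_enums[0];
}
```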
Geth MoarVM: cfe6ed8f56 | (Nicholas Clark)++ | 3 files
Consistent whitespace for the initialiser for codepoint_bitfield_indexes

Previously the code would vary the amount of whitespace in the comment depending upon whether the code point happened to have a name. This means that unrelated changes would cause this initialiser to dance, creating collateral damage in the git diffs.
... (8 more lines)
20:52
timotimo jnthn: what do i have to get right to make the guards from the dispatch program deopt correctly? do i copy the "deopt one" annotation? i think that may be what i've had before; a synthetic deopt point only has to be made if there wasn't already a deopt point available for some reason? 21:39
timotimo has there been much exploration of how dispatch chains are like trace jitting, and what that means for systems including them? 22:21
jnthn Well, if we mean tracing the meta level, then yes, both imply something in the whole multi-level language or bind time analysis space 22:50
It's just that we don't really try to trace them, we just make explicit the operations that should end up in the dispatch program
Or seen another way, only the set of dispatch transform/guard related syscalls are traced 22:51
It's certainly no accident that I called that phase "record". :)
timotimo ah, yeah, meta-level tracing, not machine code instructions tracing; is that the correct difference? 22:52
jnthn Well, sorta...I mean, the meta-level tracing, so far as I understand how the terminology is used, is about knowing which things are the interpreter and which things are the program. 22:53
e.g. the program counter is not interesting to trace the increments of 22:54
If you don't have a meta level and are only tracing the executing bytecode then you don't care about this.
For dispatch programs, I figure all the language level and MOP knowledge is meta 22:55
timotimo OK 23:01
just need to get the deopt of the generated guards correct 23:02
jnthn About the deopt question, though: I think you'll probably need synthetic deopt points; I think spesh plugins had the same situation, so I'd see how it was solved there. 23:03
timotimo i stole the functions it used i think :D
insert_arg_type_guard 23:04
timotimo MVM_spesh_plugin_rewrite_resolve is probably the right place to steal from 23:05
timotimo that code can steal from the prepargs instruction that sits in front of the spesh plugin invoke 23:06
steal the deopt annotations, that is 23:07
jnthn I think dispatch probably needs to be both a pre and post deopt point 23:10
timotimo ah i can do that, it only needs an extra flag in the oplist? 23:11
jnthn Think so 23:14
I think it needs to be all pre/post/all
pre for guards ahead of it
post for return type guard 23:15
all for global deopt
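The three kinds of deopt mark listed above can be pictured as independent flags on an op's description. The sketch below is only a model of that idea: the enum constants, the `op_info` struct and `needs_mark` are hypothetical names and do not reproduce MoarVM's oplist format or its real flag names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, one per kind of deopt point. */
enum {
    DEOPT_PRE  = 1 << 0,  /* deopt to before the op: guards ahead of it */
    DEOPT_POST = 1 << 1,  /* deopt to after the op: return type guard   */
    DEOPT_ALL  = 1 << 2   /* global deopt of all frames on the stack    */
};

/* Hypothetical per-op description record. */
typedef struct {
    const char *name;
    unsigned    deopt_marks;
} op_info;

/* Per the discussion, dispatch would carry all three marks. */
static const op_info dispatch_op = {
    "dispatch", DEOPT_PRE | DEOPT_POST | DEOPT_ALL
};

/* Does this op need the given kind of deopt mark? */
static int needs_mark(const op_info *op, unsigned mark) {
    assert(op != NULL);
    return (op->deopt_marks & mark) != 0;
}
```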
timotimo not sure what "return type guard" means in this case 23:16
jnthn As with invoke: we log and can guard on the return type of a call, and dispatch is (often) a way of making a call 23:17
timotimo ah, ok, so there'll also want to be a guard for that that will come from the spesh log, not from the dispatch program or record 23:18
jnthn Yes 23:20
iirc that's set as part of the return instruction
Well, logged as part of...
timotimo via "is_caller_logging" right?
less via, more "involving"
jnthn yes 23:27