01:48 ilbot3 joined 04:09 nebuchadnezzar joined 04:27 zakharyas joined 05:49 domidumont joined 05:52 domidumont joined 07:13 brrt joined
jnthn D'oh. So, I wake up, shower pondering what on earth might be wrong with ucd2c.pl, grab breakfast, come to computer to read the news before the day's work...and discover it's a national holiday here today. :P 08:51
Anyway, figure I'll work the day and take some other day off next week when family visit. Otherwise I'll just spend the day sitting around wondering why a darn Unicode database import script is busted :P 08:53
nwc10 but given the good news that I belatedly read here news.perlfoundation.org/2016/09/jon...ent-g.html 08:54
wasn't today a non-work-work day anyway?
anyway, your "move the holiday" sounds like an excellent plan 08:55
jnthn somehow suspects that in terms of technical difficulty, the stuff he does for his TPF grant is harder work than this other work :)
More fun too, though :)
dalek MoarVM/unicode9: ed685ed | jnthn++ | tools/ucd2c.pl:
Tweak ucd2c.pl to produce deterministic output.

Before, the output produced depended on hash ordering, which meant two runs on the exact same Unicode database could produce wildly different output. This doesn't help when trying to debug.
09:18
MoarVM/unicode9: a94e443 | jnthn++ | tools/ucd2c.pl:
Tweak ucd2c.pl to produce deterministic output.

Before, the output produced depended on hash ordering, which meant two runs on the exact same Unicode database could produce wildly different output. This doesn't help when trying to debug.
09:22
MoarVM/unicode9: fb29f18 | jnthn++ | src/strings/unicode_ (2 files):
Update to the Unicode 9 character database.
lizmat jnthn++ but /me wonders why that isn't a perl 6 script :-)
jnthn lizmat: Because it wasn't feasible to make it one at the time it was written
nwc10 but 'patches soon welcome' (once the current problem is fixed) ?
lizmat jnthn: feels like LHF for someone else :-) 09:23
jnthn heh
If 1600+ lines of Unicode database parsing, bit-field computation, generating C code, and so forth is appealing to anyone, then yes :P 09:24
lizmat well, at least now it's deterministic :-) 09:26
so debugging it should be a piece of cake :-)
jnthn Right, so the port should spit out the same thing :P
Which will help, yes.
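A minimal Python sketch of the idea behind the determinism fix discussed above, assuming the generator walks a hash of names to codes. Iterating in sorted key order makes the emitted text independent of the order the hash happens to hold its keys in. The function name `emit_table` and the three entries are hypothetical and are not taken from ucd2c.pl.

```python
def emit_table(name_to_code):
    """Emit one C-style initializer line per entry, in sorted key order."""
    lines = []
    for name in sorted(name_to_code):  # sorted() pins the iteration order
        lines.append('    {"%s", %d},' % (name, name_to_code[name]))
    return "\n".join(lines)


# Two dicts with the same content but different insertion order stand in
# for two runs whose hash ordering differs (illustrative data only).
first = emit_table({"Space": 3, "Alphabetic": 1, "Dash": 2})
second = emit_table({"Dash": 2, "Space": 3, "Alphabetic": 1})

print(first)
print(first == second)
```

With the order pinned, two runs over the same input can be compared with a plain diff, which is what makes a port or a bug hunt tractable.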
stmuk_ that *BSD crash appears to happen on more bleeding edge linux distros (tumbleweed & arc according to reports) although I still fail to reproduce on linux :( 09:27
jnthn Oh no, please say this isn't going to be that libc lock elision hardware bug? :/ 09:28
Well, maybe software bug
I forget the details
stmuk_ use of zsh seems common to reports (although many people use it anyway) and I played around with ulimits and zsh but nothing 09:29
or perhaps fs (ufs/btrfs?) although I tried btrfs and it seemed to work fine 09:32
jnthn bugs.debian.org/cgi-bin/bugreport....bug=800574 is the issue I was thinking of, fwiw 09:35
stmuk_ eww 09:36
jnthn Yes, if it's that then very :( indeed 09:41
But I've no idea if this is the libc used on BSDs? 09:42
stmuk_ the BSDs use their own libc not glibc (it's also related to the android one) 09:44
jnthn Hmm 09:56
So, emit_unicode_property_keypairs loads both property and property value aliases
The problem we have is that some property aliases and property value aliases overlap 09:57
lizmat so, is that a unicode problem? or a perl scripting issue ? 09:59
jnthn Well, we seem to emit two tables 10:00
emit_unicode_property_keypairs does the one in question, and emit_unicode_property_value_keypairs does another
10:00 zakharyas joined
jnthn It's unclear why we're paying attention to property *value* aliases in the former 10:00
Though not doing so fails us an incredible number of tests 10:03
m: say ' ' ~~ /<:space>/ 10:26
camelia rakudo-moar c01fc3: OUTPUT«｢ ｣␤»
jnthn m: say uniprop(' ', 'Space')
camelia rakudo-moar c01fc3: OUTPUT«SP␤»
jnthn m: say uniprop(' ', 'LineBreak')
camelia rakudo-moar c01fc3: OUTPUT«SP␤»
jnthn m: say uniprop(' ', 'space') 10:28
camelia rakudo-moar c01fc3: OUTPUT«SP␤»
jnthn m: say uniprop(' ', 'WSpace') 10:29
camelia rakudo-moar c01fc3: OUTPUT«1␤»
jnthn Gah, it gets worse 10:40
There are other collisions too
Heh, and at the top of PropertyValueAliases.txt it explains all of this 10:43
m: say uniprop("x", "AL") 10:48
camelia rakudo-moar c01fc3: OUTPUT«L␤»
jnthn m: say uniprop("x", "Arabic_Letter") 10:49
camelia rakudo-moar c01fc3: OUTPUT«L␤»
jnthn m: say uniprop("x", "Line_Break")
camelia rakudo-moar c01fc3: OUTPUT«BK␤»
jnthn m: say uniprop("x", "Bidi_Class") 10:50
camelia rakudo-moar c01fc3: OUTPUT«L␤»
jnthn m: say uniprop("&", "Bidi_Class") 10:51
camelia rakudo-moar c01fc3: OUTPUT«ON␤»
jnthn m: say uniprop("&", "Line_Break")
camelia rakudo-moar c01fc3: OUTPUT«BK␤»
jnthn m: say uniprop($_, "Line_Break") for ^64 10:53
camelia rakudo-moar c01fc3: OUTPUT«CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤BK␤LF␤BK␤BK␤CR␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤CM␤SP␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤BK␤B…»
timotimo bok bok bok bok bok goes the hen 10:57
QU
Ambiguous Quotation
10:58
Quotation marks
act like they are both opening and closing
mhhh, both opening and closing
that's fun :)
jnthn m: for ^0xFFFF { if uniprop($_, 'Gc') eq 'Pe' { .say; last } } 10:59
camelia ( no output )
jnthn m: for ^0xFFFF { if uniprop($_) eq 'Pe' { .say; last } }
camelia rakudo-moar c01fc3: OUTPUT«41␤»
jnthn m: for ^0xFFFF { if uniprop($_, 'gc') eq 'Pe' { .say; last } }
camelia rakudo-moar c01fc3: OUTPUT«41␤»
jnthn m: for ^0xFFFF { if uniprop($_, 'jg') eq 'Pe' { .say; last } }
camelia ( no output )
jnthn m: for ^0xFFFF { if uniprop($_, 'Joining_Group') eq 'Pe' { .say; last } } 11:00
camelia ( no output )
jnthn m: say uniprop('x', 'Joining_Group') 11:01
camelia rakudo-moar c01fc3: OUTPUT«No_Joining_Group␤»
jnthn m: say ^0xFFFF .map({ uniprop($_, 'Joining_Group') }.uniq
camelia rakudo-moar c01fc3: OUTPUT«===SORRY!=== Error while compiling <tmp>␤Unable to parse expression in argument list; couldn't find final ')' ␤at <tmp>:1␤------> ap({ uniprop($_, 'Joining_Group') }.uniq⏏<EOL>␤»
jnthn m: say ^0xFFFF .map({ uniprop($_, 'Joining_Group') }).uniq
camelia rakudo-moar c01fc3: OUTPUT«No such method 'uniq' for invocant of type 'Seq'␤ in block <unit> at <tmp> line 1␤␤»
jnthn m: say ^0xFFFF .map({ uniprop($_, 'Joining_Group') }).unique
camelia rakudo-moar c01fc3: OUTPUT«(No_Joining_Group YEH ALEF WAW BEH TEH MARBUTA HAH DAL REH SEEN SAD TAH AIN GAF FARSI YEH FEH QAF KAF LAM MEEM NOON HEH SWASH KAF NYA KNOTTED HEH HEH GOAL TEH MARBUTA GOAL YEH WITH TAIL YEH BARREE ALAPH BETH GAMAL DALATH RISH HE SYRIAC WAW ZAIN HETH TETH Y…»
jnthn m: for ^0xFFFF { if uniprop($_, 'Joining_Group') eq 'PE' { .say; last } } 11:02
camelia rakudo-moar c01fc3: OUTPUT«1830␤»
nwc10 do any of these bugs summon Cthulhu by accident?
jnthn m: say chr(41) ~~ /<:Pe>/
camelia rakudo-moar c01fc3: OUTPUT«｢)｣␤»
jnthn m: say chr(41) ~~ /<:pe>/ 11:03
camelia rakudo-moar c01fc3: OUTPUT«｢)｣␤»
jnthn m: say chr(41) ~~ /<:PE>/
camelia rakudo-moar c01fc3: OUTPUT«Nil␤»
jnthn m: say chr(1830) ~~ /<:PE>/
camelia rakudo-moar c01fc3: OUTPUT«Nil␤»
jnthn m: say chr(1830) ~~ /<:Pe>/
camelia rakudo-moar c01fc3: OUTPUT«Nil␤»
jnthn The problem seems to boil down to "what should the <:Foo> form mean", because it appears to be ambiguous at the moment 11:06
At present, we take the property, and map it back to a Unicode property name 11:07
Ah, here's the bytecode: 11:08
00074 const_s loc_31_str, 'Foo'
00075 unipropcode loc_30_int, loc_31_str
00076 unipvalcode loc_29_int, loc_30_int, loc_31_str
00077 hasuniprop loc_28_int, loc_6_str, loc_7_int, loc_30_int, loc_29_int
So, we then use the property we mapped it to, to grab a Unicode property value code 11:09
We then ask if the string we're matching (6_str) and the given location (7_int) has that property/value 11:10
In the long form like :PropName<Value> there's no ambiguity 11:13
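The three-step lookup jnthn describes for the short `<:Foo>` form can be sketched in Python as follows. This is only a model under stated assumptions: the functions are stand-ins named after the ops in the bytecode dump, and the tables, the property code 1, and the value codes 22 and 5 are hypothetical, not MoarVM's actual data or implementation.

```python
GENERAL_CATEGORY = 1  # hypothetical property code

# Hypothetical tables. Note that the value alias "Pe" also appears in the
# name-to-property table, which is what lets the short form work at all.
NAME_TO_PROPERTY = {"General_Category": GENERAL_CATEGORY, "Pe": GENERAL_CATEGORY}
VALUE_CODES = {(GENERAL_CATEGORY, "Pe"): 22}
CHAR_VALUES = {(")", GENERAL_CATEGORY): 22, ("x", GENERAL_CATEGORY): 5}


def unipropcode(name):
    """Map a name to a property code (0 if unknown)."""
    return NAME_TO_PROPERTY.get(name, 0)


def unipvalcode(prop, name):
    """Map the same name to a value code within that property (0 if unknown)."""
    return VALUE_CODES.get((prop, name), 0)


def hasuniprop(s, pos, prop, val):
    """Does the character at s[pos] have value val for property prop?"""
    return CHAR_VALUES.get((s[pos], prop)) == val


def matches_short_form(s, pos, name):
    prop = unipropcode(name)        # step 1: name -> property
    val = unipvalcode(prop, name)   # step 2: same name -> value of that property
    return hasuniprop(s, pos, prop, val)  # step 3: test the character


print(matches_short_form(")", 0, "Pe"))
print(matches_short_form("x", 0, "Pe"))
```

Because step 1 maps a single name to a single property, any name that could belong to more than one property has to be resolved somewhere before this point, which is where the ambiguity enters.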
timotimo did they add more aliases so that we now have ambiguities?
and thus our test suite has become bogus?
jnthn It doesn't seem so 11:14
timotimo strange that it'd explode now, then :\ 11:15
hey, we can constant-fold unipropcode and probably also unipvalcode!
jnthn Indeed
Well, in spesh at least
timotimo that'll help all those scripts that spend 50% of their cpu time inside unicode_db! 11:16
yes, otherwise we'd bind precompiled scripts to the unicode version they were compiled with
jnthn Thing is, if I reverse the order of the loop in generate_property_codes_by_names_aliases then I get more things busted 11:17
Meaning we were relying on, so far as I can tell, a completely arbitrary mechanism for resolving the conflicts
timotimo ugh, that's a really bad "decision" :) 11:18
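A small Python sketch of why loop order matters when property aliases and property value aliases share one lookup table. The two meanings of "space" are the ones visible in the outputs earlier in this log (a White_Space alias and the Line_Break value SP); the function names and the entry format are hypothetical and do not come from ucd2c.pl.

```python
def build_lookup(entries):
    """Build name -> meaning; a later entry silently overwrites an earlier one."""
    table = {}
    for name, meaning in entries:
        table[name] = meaning
    return table


def find_collisions(entries):
    """Report every name that maps to more than one distinct meaning."""
    seen = {}
    for name, meaning in entries:
        seen.setdefault(name, set()).add(meaning)
    return {n: sorted(m) for n, m in seen.items() if len(m) > 1}


# Illustrative entries, already lower-cased as a loose match would see them.
entries = [
    ("space", "property White_Space"),
    ("space", "value Line_Break=SP"),
    ("wspace", "property White_Space"),
]

forward = build_lookup(entries)
backward = build_lookup(reversed(entries))

print(forward["space"])
print(backward["space"])
print(find_collisions(entries))
```

Reporting the collisions explicitly, rather than letting the last writer win, would at least turn the arbitrary choice into a visible one.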
jnthn Wow, I just put the Unicode 8 DB back in with my consistent output fix and it also fails that space test 11:24
timotimo o_O 11:25
timotimo reconsiders bragging about moarvm's unicode support
jnthn Well, it boils down to a lang design issue too, I think 11:26
The what...even if I back out my changes to sort things and feed it the Unicode 8 DB it still fails the one test
timotimo not having the decision codified yet what a uniprop in a regex really means, yes? 11:27
jnthn Something like 11:28
jnthn tries running ucd2c from master on the Unicode 8 DB
huh, that passes 11:29
Aha, but applying my "produce deterministic output" patch busts it 11:34
lunch & 11:44
nwc10 jnthn++
jnthn Back 12:18
OK, so
On master, with the Unicode 8 database, I did 3 runs
of ucd2c.pl, make -j install, run the spectest
first 2 times passed, third failed 12:19
So we had the problem all along
It's just that hash ordering sometimes hides it 12:20
For all the cases we have spectests for
Do it 3 times with the Unicode 9 DB, and I got pass, fail, pass 12:23
nwc10 but will it be fixed before ilmari returns from lunch? :-) 12:24
ilmari nwc10: I'd have to go for lunch first... 12:25
jnthn :P
Now trying this with 24742b1024
pass, fail, fail 12:27
So it can pass with the Unicode 9 DB + NFG tweaks
nwc10 ilmari: I think that you're giving him an unfair advantage 12:28
jnthn .oO( How many more re-gens until it passes again... :) ) 12:29
bah, I shoulda stashed the passing one :P
jnthn would kinda like to get a clarification from TimToady on what we'd like <:Foo> to do 12:30
S05 doesn't really make things much clearer
Hurrah, I have a passing one. 12:32
So in the meantime, we can have our Unicode 9 DB bump, more by luck than judgement
The situation doesn't get any worse
But urgh. 12:33
The other bad news is that NFG will need a more extensive bunch of changes to handle...yes, *emoji*. 12:34
nwc10 does Unicode actually have suitable joiners yet to express jumping the shark? 12:35
jnthn Since the grapheme boundary algorithm seems to have become something that needs more than just looking at 2 chars and being able to decide if there is a grapheme boundary between them
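To illustrate what "more than just looking at 2 chars" can mean, here is a Python sketch of one UAX #29 rule with that shape: regional indicator symbols are grouped in pairs, so whether a boundary falls between two of them depends on how many precede them. jnthn's concern here is the emoji rules; this rule is used only because it is the simplest one to show. The function names are hypothetical and the sketch ignores every other segmentation rule.

```python
def is_ri(ch):
    """True for REGIONAL INDICATOR SYMBOL LETTER A..Z (U+1F1E6..U+1F1FF)."""
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def segment_flags(text):
    """Split text into clusters, pairing up regional indicators.

    Only the pairing rule is modelled; every other character becomes
    its own cluster, which is NOT what a full implementation would do.
    """
    clusters = []
    ri_run = 0  # regional indicators seen in the current unbroken run
    for ch in text:
        if is_ri(ch) and ri_run % 2 == 1:
            clusters[-1] += ch      # second half of a pair: no boundary
            ri_run += 1
        else:
            clusters.append(ch)     # boundary before this character
            ri_run = 1 if is_ri(ch) else 0
    return clusters


three = "\U0001F1E9\U0001F1EA\U0001F1EB" + "a"
four = "\U0001F1E9\U0001F1EA\U0001F1EB\U0001F1F7"
print([len(c) for c in segment_flags(three)])
print([len(c) for c in segment_flags(four)])
```

The decision between the second and third symbol cannot be made from those two characters alone, since the same adjacent pair joins or splits depending on the parity of the run before it.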
Wouldn't surprise me :P 12:36
m: say "\c[SHARK]"
camelia rakudo-moar c01fc3: OUTPUT«===SORRY!=== Error while compiling <tmp>␤Unrecognized character name SHARK␤at <tmp>:1␤------> say "\c[SHARK⏏]"␤»
jnthn jnthn@lviv:~/dev/rakudo$ ./perl6-m -e 'say "\c[SHARK]"'
🦈
Unicode 9 does, however, add a shark :P
Seems there's no chars with JUMP in their name 12:39
dalek MoarVM: a19dc90 | jnthn++ | / (2 files):
Explicitly handle ZWJ in grapheme break.

This is in preparation for updating to the Unicode 9 UCD, which takes the Grapheme_Extend property away from Zero Width Joiner.
12:41
MoarVM: 9b39aa5 | jnthn++ | src/strings/unicode_ (2 files):
MoarVM: Update to the Unicode 9 database.
MoarVM:
MoarVM: Note that while this gives us support for the various new chars in
MoarVM: Unicode 9, we'll need updates to NFG to fully handle the emoji rules
MoarVM: in grapheme break handling.
MoarVM/deterministic-ucd2c: 224a261 | jnthn++ | tools/ucd2c.pl:
Tweak ucd2c.pl to produce deterministic output.

Before, the output produced depended on hash ordering, which meant two runs on the exact same Unicode database could produce wildly different output. This doesn't help when trying to debug.
12:58
jnthn So, I'll leave that cleanup in a branch for now, until the lang design issue gets a decision/resolution.
ilmari jnthn: but there doesn't seem to be a water skier emoji 12:59
the closest is surfer, I guess
SPEEDBOAT+ZWJ+SURFER+ZWJ+SHARK 13:00
jnthn Nice :) 13:01
ilmari is disappointed there's no WHOLE PIZZA, only SLICE OF PIZZA 13:03
13:03 domidumont joined
ilmari and on that note, lunchtime 13:03
nwc10 \o/
ilmari m: say "\c[SLICE OF PIZZA]" xx 8 13:04
camelia rakudo-moar c01fc3: OUTPUT«(🍕 🍕 🍕 🍕 🍕 🍕 🍕 🍕)␤»
ilmari although the slices I intend to eat are rectangular
nwc10 ZWJ is the anti-pizza-wheel? 13:05
ilmari pizza glue
nwc10 jnthn: `git describe` in MoarVM gives 2016.09-3-g9b39aa5 13:07
your NQP bump wants 4
nwc10 is surprised that Travis hasn't been on a drive-by yet 13:08
jnthn It did in the other channel
(And fixed) 13:17
I'll leave the NFG changes for another day.
brrt jnthn++ 13:19
jnthn rt.perl.org/Ticket/Display.html?id=125978 is "interesting" 14:59
gist.github.com/jnthn/7d2cefda9d88...a62e6c7b4f is from valgrind
I'm not quite sure how this state can ever arise 15:00
Of course, it doesn't seem to show up anything like so easily with a debug build 15:34
Ah, got a related but different valgrind error out of it now 15:42
These are sorta suggestive that a thread is running while another is GCing, which would be bizarre. 15:56
But it's hard to see how:
if (ctx->arg_flags) {
    /* Free the generated flags. */
    MVM_free(ctx->arg_flags);
    ctx->arg_flags = NULL;
}
Could lead to a double free otherwise :S
The traces clearly say the free in question ran 15:57
And the NULLing must happen right after it
16:04 Ven_ joined 16:10 Ven_ joined
jnthn suspects he won't fully track this down today... 16:15
Time for some rest now, anyway.
17:16 FROGGS joined 17:34 domidumont joined 18:01 Ven_ joined 18:21 Ven_ joined 18:41 Ven_ joined 19:01 Ven_ joined 19:08 wrl joined
wrl hey, i'm evaluating languages for embedding inside of a larger C app (ala blender). how's moar's embedding API these days? 19:09
timotimo there isn't a stable api for embedding 19:12
wrl gotcha. is that planned?
timotimo not sure 19:13
it's quite possible to just run a moar instance and communicate with the script running inside it via normal IPC things like sockets or pipes 19:14
wrl unfortunately the larger app relies on being able to share wrapped C data structure around
timotimo oh, i meant run the instance in your own program 19:15
then you can use NativeCall and friends to get at shared data structures
wrl ah right i see
that seems like it could get very obtuse, unfortunately 19:16
timotimo wouldn't know until i tried it, tbh 19:19
19:21 Ven_ joined 19:32 patrickz joined 19:41 Ven_ joined 20:00 Ven_ joined 20:21 Ven_ joined 20:40 Ven_ joined 20:56 Ven_ joined