00:25 tokuhirom joined 01:09 tokuhirom joined 01:20 TimToady joined 02:37 vendethiel joined 02:47 ilbot3 joined 04:08 TimToady joined 04:13 KDr2 joined 05:24 ingy joined 05:36 FROGGS joined 06:41 FROGGS_ joined 06:47 FROGGS__ joined 06:57 nwc10 joined 07:44 FROGGS joined 08:38 kjs_ joined 09:24 zakharyas joined 10:12 kjs_ joined
jnthn .tell Hotkeys I didn't actually change how \r behaves yet, just did the groundwork. Also, unless you're building a MoarVM at HEAD, rather than the version Rakudo will pick by default, you'll not have any of my changes yet anyway. 10:12
oh, no bot
Hotkeys: I didn't actually change how \r behaves yet, just did the groundwork. Also, unless you're building a MoarVM at HEAD, rather than the version Rakudo will pick by default, you'll not have any of my changes yet anyway.
psch: \r\n will become a synthetic, *not* identical to \n, just to be clear. 10:13
10:48 brrt joined
brrt \o 10:49
guess who had to fix... yet another compilation bug this morning? 10:50
jnthn brrt? :) 10:52
brrt jnthn++ :-P
apparantly, accessing the lower bytes of rsp-rdi (registers 4-7) dynamically in x64 means you *have* to add a REX byte 10:53
jnthn This REX byte seems to cause plenty of fun... 10:54
brrt because otherwise the address is understood as the second byte of the lower 4 (rax-rbx) registers
even if you don't actually address any of the regular extended registers
'fun'
why is this? GOD ONLY KNOWS 10:55
i'm also seeing that my approach to register-allocation-over-conditional-and-call-boundaries was oversimplisitc 10:58
as it happens, *online algorithms suck*
whereby 'suck' has a technical definition meaning 'make things much more complicated and difficult to analyse'
anyway... i was hoping i could get the compiler to run with just bugfixing the current setup. but the complexity of it feels like it's spiralling 11:03
so i'm wondering if i should move the register allocator to an offline phase, as well as having the tiles in linear memory order
one of the things that bothered me slightly is that if i linearize the tiles to an array, then i can't splice in spills and loads easily. 11:05
which means one of three things: a): i linearize to something else than an array, like a linked list, which is kind of not-so-fun with regards to allocation (or i should use the spesh allocator, but I don't like that much); 11:06
b): i add a second array which is walked next to the first array that holds spills and stores (i kind of like that solution, but it is complex) 11:07
c): i do spills and stores inline
but i'm not sure how c) interoperates with the idea of making the register allocation step an offline step 11:08
jnthn Hmmm
brrt (having register alloc as a offline step is basically compiler best practice, or so i've heard)\
jnthn For (a) what's wrong with using the spesh allocator?
brrt consistency. i use the spesh allocator for nothing, and then bam!, it reappears 11:09
although......
hmmm
i could actually use the spesh allocator to hold the info nodes
the info node array is quite redundant as all the tree 'pointers' and constants also get a info node 11:10
timotimo well, the spesh allocator is good for things that are unevenly sized and that you're going to throw away completely at the end
brrt aye
timotimo so it seems like a good fit
brrt it is a good fit
brrt wonders if i want random access to the tiles
ok, that is a good idea, i think 11:14
timotimo i've heard a thousand times that linked lists never outperform arrays even if you do inserts in the middle ... or something like that
because ... CACHES!
jnthn yeah but the spesh allocator sticks the nodes in order anyway :P
timotimo L1 cache can give you multiple gigs per second throughput! doesn't matter that it's still small and has to grab data from RAM at a much, much slower rate all the time 11:15
jnthn That's why I picked that kind of design
Thus a bunch of MVMSpeshIns will be continguous, unless you hit a point something was spliced in.
So it's relatively cache friendly
brrt i think knuth's old quote applies really well here 11:16
timotimo: just do multiple gigs of computation on 8k of memory or so :-P 11:17
brrt wonders if that is actually done anywhere
timotimo i'm still wondering if my idea for a Big Data product that handles "Big Data data sets as large as 100x your L1 cache effectively" will be seen as revolutionary and awesome 11:18
brrt lol 11:21
'big' data like ... a gigabyte :-o 11:22
fwiw, did anybody see the 'go is a poorly designed language' article anywhere?
it's funny because all the things the author considers as 'poor design' are quite logical imho
timotimo oh, does it say "its types are spelled totally differently from C"? 11:23
brrt no, actually, it doesn't 11:24
it says 'i want my negative array indexing to work like python and it doesn't :-('
while negative array indices are a huge cost in a potential fast path
jnthn heard Go had attracted a bunch of Python folks, but isn't sure how accurate that is 11:25
brrt perhaps not when they are constant (you can constant-fold it away), but definitely variable indices
yeah, well, if i were to go and blog about how python doesn't support sigils, i'd look ridiculous, no?
timotimo wouldn't that be fun? 11:26
brrt yes. yes it would
'splicing stuff out doesn't look easy' - that's because it *isn't* easy 11:27
and cheap
'declaring a variable in a new scope using shorthand notation shadows my outer scope variable' - i'm not even sure what anybody should expect 11:28
i'm going to write an article about how go doesn't have a whateverstar and how that makes it a sucky language 11:29
timotimo A/B-test it against an article about python not having a whateverstar 11:31
brrt that... 11:32
is an excellent idea
dalek arVM: 385e498 | jnthn++ | src/strings/normalize.c:
Make NFG algorithm use Unicode Grapheme Clusters.

As described in Annex #29. We do all of it except the CRLF case, as enabling that even breaks our ability to parse Perl 6 code (will need to figure out why). Aside from the CRLF case, though, we now pass all the Unicode grapheme boundary tests (that is, we get the .chars that are expected).
12:13
arVM: 82f93f7 | jnthn++ | src/strings/normalize.h:
Toss #define we ended up not needing.
12:14
arVM: 3519077 | jnthn++ | src/strings/normalize.h:
Slightly simplify a conditional.
jnthn Time to see what the spectest fallout of the NFG algorithm change will be... :) 12:15
nwc10 spectests weren't clean before 12:19
lizmat yeah, were 4 files failing for me yesterday eve
nwc10 they are in a state of sin a bit too often for my liking
jnthn With MOAR_REVISION or with master? 12:20
Seems 7 test files are vicitms of the change 12:23
Well, have tests that are vicitms 12:24
lunch & 12:27
nwc10 jnthn: "nom", I believe is the culprit 12:38
more spectests fail, but eyeballing the summary, doesn't look like anything surprising 13:01
jnthn back 13:06
brrt \o jnthn 13:09
jnthn Turns out 1 failing file was 'cus I needed to patch Str.perl (now done) 13:19
Next 2 were tests that don't make sense under NFG semantics.
And that we didn't spot last time around 'cus the NFG algo was insufficient.
m: say uniprop("\x1B3D", 'General_Category') 13:30
camelia rakudo-moar da8881: OUTPUTĀ«Mcā¤Ā»
jnthn m: say "\x1B3D".NFD 13:31
camelia rakudo-moar da8881: OUTPUTĀ«NFD:0x<1b3c 1b35>ā¤Ā»
jnthn m: say "\x1B3D".chars
camelia rakudo-moar da8881: OUTPUTĀ«1ā¤Ā»
jnthn m: say "\x1B3D".ord
camelia rakudo-moar da8881: OUTPUTĀ«6973ā¤Ā»
jnthn wtf
m: say 0x1B3D 13:32
camelia rakudo-moar da8881: OUTPUTĀ«6973ā¤Ā»
jnthn Locally
> say "\x1B3D".ord
6972
o.O
m: say Uni.new(0x1B3D).Str.ord.base(16) 13:34
camelia rakudo-moar da8881: OUTPUTĀ«1B3Dā¤Ā»
jnthn > say Uni.new(0x1B3D).NFC 13:35
NFC:0x<1b3d>
Not NFC that got busted
> say Uni.new(0x1B3D).Str.ord.base(16)
1B3C
I don't even... 13:36
m: say Uni.new(0x1B3D).NFD.NFC 13:45
camelia rakudo-moar da8881: OUTPUTĀ«NFC:0x<1b3c 1b35>ā¤Ā»
jnthn ah 13:46
brrt what, how
jnthn m: say uniname(0x1B3D)
camelia rakudo-moar da8881: OUTPUTĀ«BALINESE VOWEL SIGN LA LENGA TEDUNGā¤Ā»
jnthn m: say uniname(0x1B3C)
camelia rakudo-moar da8881: OUTPUTĀ«BALINESE VOWEL SIGN LA LENGAā¤Ā»
jnthn m: say uniname(0x1B35) 13:47
camelia rakudo-moar da8881: OUTPUTĀ«BALINESE VOWEL SIGN TEDUNGā¤Ā»
jnthn m: say uniprop(0x1B3C, 'General_Category')
camelia rakudo-moar da8881: OUTPUTĀ«Mnā¤Ā»
jnthn m: say uniprop(0x1B3D, 'General_Category')
camelia rakudo-moar da8881: OUTPUTĀ«Mcā¤Ā»
jnthn m: say uniprop(0x1B35, 'General_Category')
camelia rakudo-moar da8881: OUTPUTĀ«Mcā¤Ā»
jnthn Yowser. That's a bit of an interesting problem. 13:48
nwc10 It's not at all obvious to me why (or what's wrong) 13:49
I have not read this yet: morepypy.blogspot.co.at/2015/10/pyp...us+Blog%29
jnthn nwc10: Well, it came from a test regression
brrt should probably subscribe to their blog since it is interesting 13:50
jnthn m: say uniprop(0x1B3D, 'NFC_QC')
camelia rakudo-moar da8881: OUTPUTĀ«Yā¤Ā»
nwc10 I'm just looking at what's on planetpython.org/
brrt although i recently lost all my subscriptions when i forgot to copy a opml file
jnthn So, that character passes the NFC quick-check
Which implies "we're already in NFC" 13:51
brrt uhuh
jnthn But NFC should afaiu be stable
Such that if you compute NFD and then again compute NFC, you get the same thing back
That's not happenign here
*happening
nwc10 we're into "#11907 Looking for a compiler bug is the strategy of LAST resort. LAST resort." ? 13:52
jnthn Not yet
I need to go look at our NFD -> NFC 13:53
But NFC is *defined* in terms of NFD
13:57 rarara_ joined
brrt wonders what the current state of the art is in rubyland, since ruby may be even more comparable to perl6 in terms of indirections 13:58
jnthn Well, I can't find a way we're inconsistent with the actual Unicode data files 14:07
So yeah, it's very odd. I have a case where a string passes the NFC QuickCheck, but actually computing NFC on that string doesn't give identity 14:16
nwc10 use more 'coffee'; ? 14:17
jnthn And it's a one-char string so I don't think I could be getting the use of NFC_QC wrong.
14:37 tokuhiro_ joined
jnthn OK, seems we have something wrong in our canonical composition 14:40
Yeah, nailed it I think 14:56
brrt that was fast 14:57
jnthn The number of characters added by patch to time spent ratio is pretty awful :P 14:59
brrt aw, there goes your enterprise points 15:00
jnthn :P 15:01
brrt has seen a lot of articles lately about how COBOL was making a comeback 15:03
maybe we should have an Inline::COBOL
or just port moar to COBOL for enterprise points
jnthn If your goal is -Osalary, COBOL may well be one of the best languages to learn :) 15:05
dalek arVM: 5ff3001 | jnthn++ | src/strings/normalize.c:
Fix a canonical composition bug.

We didn't admit various starter/starter composition cases. This bug actually managed to survive despite us passing the complete Unicode normalization test suite, because we never hit this code path before thanks to the NFC_QC property. Now, thanks to NFG_QC, we can hit it in some more cases (this does perhaps point to a future optimization). Fixes a spectest regression.
15:09
brrt i don't even... 15:11
jnthn :) 15:13
brrt understand how or why that makes a difference :-) 15:14
jnthn brrt: Sometimes, two characters that are *not* combining chars are composed into a single char when canonicalizing. 15:15
brrt: And the bug I just fixed meant we didn't let that happen. 15:16
Now I'm down to one affected test file 15:17
Bizzarely, S32-io/IO-Socket-INET.t
m: say uniname(0xbeef) 15:21
camelia rakudo-moar 3cc195: OUTPUTĀ«<Hangul Syllable>ā¤Ā»
jnthn haha
m: say uniname(0xbabe)
camelia rakudo-moar 3cc195: OUTPUTĀ«<Hangul Syllable>ā¤Ā»
jnthn Pro tip: when writing tests and wanting some random "Unicode character", don't just spell cute words :) 15:22
Otherwise you might (or in this case, will!) end up with something that will, under NFG, end up combining with the previous grapheme.
brrt no, i stil don't get it 15:37
but i'll accept that for now 15:38
jnthn brrt: I only get it in so far as "I read the Unicode spec and know what the terms mean"
I don't know anything about the Balianese language and the specific thing that's going on with these chars. 15:39
brrt the NFG business seems similar in a way to the x86 instruction encoding business
you think you fixed it, but no, something funny happens 15:40
only in specific magical cases
jnthn Well, this wsan't even an NFG bug, just an NFC one :)
brrt fair enough :-)
jnthn It's not *that* bad, tbh. It's just that humans are darn creative about their writing systems.
Hangul is probably the biggest offender in terms of amount of code we have to write just for it. 15:41
nwc10 t/spec/S15-nfg/cgj.rakudo.moar 15:50
TODO passed: 5-8
jnthn Yup :)
Oh man. The \r\n => 1 grapheme thing may be a bit fraught 16:12
nwc10 why so?
jnthn Well, NQP can't even parse a simple test file with a \r\n in it any more 16:14
And then the error handling code it uses to try and report that fails too 16:15
And the REPL hangs 16:17
nwc10 step away from the keyboard, and make a curry? 16:18
the "error reporting" thing sounds like a bug that needs fixing whatever else happens next.
jnthn aye
Ah 16:21
One possible problem is that concatenation of a \r and \n doesn't produce the grapheme
dalek arVM: f1a216d | jnthn++ | src/strings/nfg.c:
Update concat code in prep for \r\n as grapheme.
16:28
jnthn Aha 16:35
say(?("\r\n" ~~ /\v/)) 16:36
[Coke] oho?
jnthn That doesn't match
So, that's almost certainly the source of the \r\n -> 1 grapheme bustage
I suspect fixing this is going to need an NQP bootstrap updage 16:37
*update
But worse, it probably also brings up the issues with NFG and the NFA engine
TimToady m: say "\x037e".ord.base(16) 16:45
camelia rakudo-moar 3cc195: OUTPUTĀ«3Bā¤Ā»
jnthn m: say uniname(0x037E) 16:46
camelia rakudo-moar 3cc195: OUTPUTĀ«GREEK QUESTION MARKā¤Ā»
TimToady is there an explanation for why we lose track of GREEK QUESTION MARK?
jnthn Almost certainly :)
m: say uniprop(0x037E, 'Decomp_Spec')
camelia rakudo-moar 3cc195: OUTPUTĀ«003Bā¤Ā»
jnthn ^^
'cus Unicode says we should as part of normalization
See singleton equivalence in unicode.org/reports/tr15/ for more info 16:47
TimToady k
jnthn (There's a handful of 'em)
[Coke] TimToady: I thought we already said that on #perl6. my bad. 16:49
jnthn: it makes us impervious to the "mess with your friend's code" meme that was going around recently.
jnthn [Coke]: Nah, there's still plenty of other ways to do that :) 16:51
TimToady jnthn: btw
m: say uniprop(0x1B3D) # don't need to type General_Category every time
camelia rakudo-moar 3cc195: OUTPUTĀ«Mcā¤Ā»
TimToady that's the default
jnthn Darn, wish I'd know that earlier :P
After today I probably don't need to look up gen cats again for another few months :P
TimToady: I assume you want us to end up with \r\n as a synthetic so we totally follow the Unicode grapheme cluster rules? :) 16:53
TimToady unless there's some showstopper reason we can't
jnthn Not that I see so far, it's just a bit of hunting down the badass umptions in the code... :) 16:54
TimToady at least it makes it easier to match \n in regex :)
jnthn Yeah
TimToady and certainly \v should match it too
jnthn For sure; just patched that locally 16:55
Though it's a hack :/
And will break LTM of \V until I more generally fix the NFG/NFA interaction
The NFA design we inherited is rather into doing nqp::ord
And so loses synthetics
Not a huge engineering problem to fix, I don't think. Just another thing to do. 16:56
TimToady though even if we have an nqp::gord or so, we're still in trouble if we go with per-string or per-domain tables 16:57
jnthn Indeed. I don't want to go that way.
Better to actually pass one-char strings.
At the "API"
(We can still keep it all integers on the inside at actual matching time) 16:58
TimToady let's keep doing it right, and then think about optimization
jnthn Well, we're not doing it right yet...but yeah.
TimToady *righter
jnthn Otherwise, I think the NFG algo tweaks have turned out OK. 16:59
TimToady any feel on input performance degradation?
17:00 tokuhiro_ joined
TimToady presumably shouldn't be much if most of the file is quick-reject ASCII 17:00
jnthn Yeah, it's a little slowdown for ASCII 'cus we have to care about \r now 17:01
TimToady well, except insofar as \r\n is ... yeah
jnthn And we're more careful over controls
When we have to do the full analysis it's more costly
But I computed us an NFG quickcheck property
So we should only be doing the hard work when we really need to 17:02
m: say uniprop('x', 'NFG_QC')
camelia rakudo-moar 3cc195: OUTPUTĀ«0ā¤Ā»
jnthn Ah, build is behind
But yeah, we leak that property to userspace as if it was a normal Unicode property, when it's in fact one we've made up
Dunno if that bothers you. 17:03
TimToady not much
jnthn k
Ended up with all the control chars being NFG terminators, btw. 17:04
TimToady though if the UC adopts the "NFG" term we could eventually get a name collision, but so far they seem to prefer Normalization_Form_Grapheme and such
jnthn Which was something you suggested before.
Well, they do call their quickcheck properties NFC_QC for example 17:05
TimToady well, hopefully then they won't adopt it and flip the sense :)
jnthn If they did, they'd make it inconsistent with how the other quickcheck properties work
And they don't seem that crazy. :)
Or at least, no more crazy than you have to be to try and bring some order to the world's writing systems... 17:06
TimToady in the backlog: "never hit this codepath" 17:08
do we have any plans for a code coverage tool?
so we can tell if there are glaring blind spots in roast? 17:09
jnthn Well, a user-level thing "not yet"
Do we have the tech to hack something up to tell us where our roast blind spots are? That's easier. 17:10
The cross-thread write logging and the profiler use bytecode instrumentation, and it's easy enough to write extra ones of those
TimToady well, thinking setting-level mostly there
jnthn My Big Plan is to turn the instrumentation stuff into a kind of meta-interpreter framework so you can write stuff like profilers and coverage tools and debuggers in NQP or Perl 6 code 17:11
But that's not going to happen this side of 6.c 17:12
Anyway, I can do a couple-of-hours hack solution to get an approximate answer to "what is roast not covering" 17:15
And maybe it'll be inspiration for somebody to go and make a good one :)
OK, I've got NQP patches that get all but one of the NQP tests passing 17:17
(With \r\n as a grapheme) 17:18
It'll need a rebootstrap, alas 17:19
Hm, and it doesn't make it all the way through the Perl 6 build either. 17:20
TimToady we have 5 comments in nqp that mention things to change after a rebootstrap :) 17:22
which I'm sure were put there at least one reboot ago...
jnthn :) 17:23
Time for me to go cook us something tasty :)
jnthn is happy to have this bit of the NFG work nearly done 17:26
Guess I need to worry about the threads/IO things next week
Well, various I/O things... 17:27
Anyway, away for a bit
18:06 kjs_ joined 18:17 tokuhiro_ joined 18:18 zakharyas joined 18:28 leont joined 18:40 FROGGS joined 18:42 vendethiel joined 19:24 kjs_ joined 19:54 tokuhiro_ joined 19:55 rarara_ joined 20:10 kjs_ joined 20:37 kjs_ joined 21:19 kjs_ joined 21:33 zakharyas joined 22:20 tokuhiro_ joined