timotimo good luck! 00:10
i've had a bit too much success with my last attempt at sleep :D
samcv sorry jnthn and lizmat 01:21
03:05 avarab joined 05:26 geekosaur joined 06:27 domidumont joined 06:33 domidumont joined 06:54 domidumont joined 07:12 domidumont joined 08:07 brrt joined
brrt good hi, #moarvm 08:07
samcv good hi!
nwc10 good *, #moarvm 08:08
brrt no worries, comments happen and stuff
even jit compilers ultimately happen if you wait long enough 08:09
nwc10 but in that case are they Just In Time? Or rather late? :-) 08:12
don't worry. I'm here all week, but tomorrow is a public holiday (at least Between Keyboard and Chair)
samcv it's always on time, because it's right in the name
just in time
you cannot dispute this
nwc10 aha
therefore clearly good UGT everyone 08:13
samcv no it's good 'hi'
nwc10 er, it's also "good morning"
I thought
I'm confused
samcv no that's what brrt said
<brrt> good hi, #moarvm
<samcv> good hi!
nwc10 yes, I realised that much 08:14
but clearly I failed to read it the first time
is it bedtime yet?
nine .tell jnthn Mandatory reviews are not only good for QA but also help distributing knowledge around the team and grooming new developers. They might help MoarVM becoming less dependent on you.
yoleaux2 nine: I'll pass your message to jnthn.
samcv it's a little after midnight here
brrt guesses samcv's timezone to be asiatic 08:15
not australian, that'd be even later
samcv no i live in california 08:16
brrt then it … ought to be considerably past midnight?
oh
no, i'm wrong
samcv but New Zealand is 4 hours off from me, but on a different day
brrt my timezone mental calculations are off
arnsholt US eastern seaboard is CET-6, western is CET-9 =) 08:17
samcv -9... what 08:18
it's -8
arnsholt CET, not GMT/UTC
brrt also known as 'the one true timezone'
:-P
arnsholt Obviously =)
(I *think* that's brrt's timezone as well, which is why I phrased it like that =) 08:19
nwc10 UGT±*
brrt suggests introducing stardates
by the way.
samcv how are stardates measured
brrt spilling is finicky
with clocks 08:20
:-P
samcv bad definition of time
08:21 zakharyas joined
brrt wonders if it's worthwhile to create a cons for a spill consisting of (live-range, spill-number) 08:23
nm, i can just stash that in the live range object, and stash the spilled live ranges in an array 08:27
not necessary to do that, but good for debugging
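A rough sketch of the bookkeeping brrt settles on above, written in Python rather than MoarVM's C. The spill number is stashed on the live range itself, and spilled ranges are also collected in an array, which is mainly useful for debugging. `LiveRange`, `Allocator`, `spill` and all field names are hypothetical and not taken from the MoarVM source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LiveRange:
    """Hypothetical live range: a value alive from `start` to `end`."""
    name: str
    start: int
    end: int
    spill_number: Optional[int] = None  # None means "not spilled"


class Allocator:
    """Toy allocator that only does the spill bookkeeping."""

    def __init__(self) -> None:
        self.spilled: List[LiveRange] = []  # kept around for debugging
        self._next_spill = 0

    def spill(self, live_range: LiveRange) -> int:
        # Stash the spill number in the live range object itself, so no
        # separate (live-range, spill-number) pair has to be allocated.
        live_range.spill_number = self._next_spill
        self._next_spill += 1
        self.spilled.append(live_range)
        return live_range.spill_number


alloc = Allocator()
a = LiveRange("a", 0, 10)
b = LiveRange("b", 2, 4)
alloc.spill(a)
alloc.spill(b)
```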
ohai renamed masak 08:32
with regard to macros, i have written some truly fearful stuff now
arnsholt C preprocessor macros, or less insane macros? =) 08:40
brrt C preprocessor macros of course 08:49
my greatest source of pride is the __COMMA__ macro
arnsholt I assume it expands to something other than a comma? =) 09:08
nwc10 hasn't even got halfway through news.ycombinator.com/item?id=13319904 yet 09:09
brrt it expands to a comma if defined as a comma, can also expand to something else 09:37
jnthn yawns 09:38
yoleaux2 08:14Z <nine> jnthn: Mandatory reviews are not only good for QA but also help distributing knowledge around the team and grooming new developers. They might help MoarVM becoming less dependent on you.
brrt i defined it as a
'|' to create a bitmap from enums at compile time
nwc10: the other side is that clearly the python community are doing things right, if they can convince so many people to put so much effort into their language; and they're doing something wrong, because none of that ever goes anywhere / becomes mainstream 09:39
how many alternative python implementations have we seen
nwc10 "viele" 09:40
how many have stuck around?
not "lots"
brrt pypy… barely
nwc10 I question why "barely" because I thought that it was doing OK, but I am an outsider looking in 09:41
brrt the project is doing fine, but uptake is still very low
nwc10 I do find grumpy interesting, because a comment of someone (I think I know who, but I don't want to attribute wrongly or leakily) was that Google stopped Unladen Swallow partly because their (other) investment in PyPy had delivered a lot of speed
(The other reason I suspect was because concurrency wasn't going to work out in Unladen Swallow) 09:42
anyway, yes, pypy still hanging in there
dropbox not yet using Pyston in production
(Clearly their goal is C API compatibility)
arnsholt nwc10: There's a link in that HN discussion to a YouTube video with Armin Ronacher talking about how CPython internals leak into Python user-space
Very interesting
nwc10 whilst IronPython and Jython appear to have stuck around long enough for firms to use them 09:43
and now cause problems because they're on historical versions with no escape path
arnsholt: thanks, not spotted that
will find it and watch it
(later, when not at work)
brrt not spotted either, will watch as well
jnthn ran into Jython in the wild at $big-company, for whatever that data point is worth 09:44
nwc10 well, it means I can ask you
1) so their code is stuck on 2.5 compatibility?
2) what are they using now/next? 09:45
oh, sorry, I forgot. jython 2.7.0 did ship. About 18 months ago
brrt the other, other hand 09:47
nwc10 a.k.a "The Gripping Hand" - context en.wikipedia.org/wiki/The_Mote_in_God's_Eye 09:48
jnthn nwc10: This was a couple of years ago, so I don't know what happened now/next. 09:50
Probably "nothing" though.
iirc they were using jython to get at Java libs 09:52
So no particularly easy escape
*sigh* So I got another night of LTA sleep. 09:53
jnthn will attempt not to be too grumpy today... 09:54
nwc10 :-(
I had those the night before last and the night before that, but last night was OK
does this affect your coffee bootstrapping?
jnthn No, coffee was produced fine today 09:56
However
I normally drink a 5
At Christmas somebody sent me a collection on different coffees. And a few days ago I ran out of the one I normally drink so figured I'd crack on with the collection. 09:57
I didn't expect a 3 to seem so much weaker. I've no idea what I'll make of the 1s in the pack. :)
*a collection of 09:58
brrt can agree with that
anyway: the wider issue is
nwc10 line them up like shots of spirits and just neck the entire row? :-)
brrt lots of people have the problem of 'python doesn't perform'
which has a bunch of aspects; one of which is the global interpreter lock 09:59
another aspect - which i think is even more relevant - is memory usage and startup times
unfortunately, pypy helps with neither 10:00
the thing is, it generalizes to ruby and perl as well
lots of memory usage, lots of disk IO on startup, lots of magic and intermediate layers
jnthn I expect different people have problems with different ones of those 10:01
nwc10 sorry to seemingly sound a bit glib, but "me too" 10:02
and I am arm waving
and I'm probably biased
brrt in my experience it's really, really common
(PHP also has this problem, obviously)
jnthn (So the set of people for whom the GIL is the blocker may be a different set from those for whom startup/memory is the blocker)
nwc10 but the thing I notice about language syntax or similar enhancement requests is "I'd like X" and folks thinking "hence everyone would like X" 10:03
brrt i think both inhibit 'scale' in a way 10:04
the effect of having a GIL is that if you want to do *anything* asynchronously, you need to resort to 'a solution' or 'a framework' or IPC or whatever 10:05
you can't just start a new thread in your uWSGI handler, for instance, and think that things will be fine, because they won't
the fact that you use so much memory limits the amount of process handlers you can have 10:06
so now, we need autoscaling, and so we need a cloud, and so we need containers, and a distributed container management, and…
i'm oversimplifying, of course
arnsholt nwc10: On gem from that talk: "Until pickle dies, we will never have versioned submodules" =) 10:07
s/On/One
nwc10 does he explain why in the talk?
arnsholt Yeah
nwc10 good. then I won't ask here :-)
jnthn brrt: If you're scaling sufficiently you'll hit the need to go multi-node whatever the thing is written in, I suspect. 10:08
brrt hmmm 10:09
i guess my point is
google's move to go, at their scale, is the wisest thing they could've done, and patching up python to make it do what go can do has proven unreasonable in the past 10:10
nwc10 yes, that's a good summary (and I agree)
jnthn For sure, if you can run half the number of nodes then you save a lot at that scale.
nwc10 and also have spotted some comments somewhere in the discussions that I have read, that they don't use the dynamic features (that are expensive to implement) that they aren't putting in 10:11
brrt they do allow for introspection, though 10:21
nwc10 reaches " Although Grumpy is compiled, it is just as dynamic as Python, in that method dispatch involves dictionary lookups, etc." 11:02
which is interesting. It suggests that their use case is going to be "whatever we can do ahead of time"
samcv also jnthn it looks like all we need from the canonical combining class is to check whether it's 0 or whether one is greater than the other, and also whether they are equal. yey 11:06
can get the int value for the enum since they are all in order always
jnthn Yeah, the non-zero values are primarily used in canonical ordering 11:12
Perhaps even exclusively, thinking about it.
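A minimal sketch of canonical ordering, to illustrate why only three things are needed from the canonical combining class: whether it is zero, whether one is greater than another, and whether two are equal. It uses Python's `unicodedata.combining` as a stand-in for MoarVM's own CCC lookup; `canonical_order` is a hypothetical helper, not MoarVM code.

```python
import unicodedata


def canonical_order(codepoints):
    """Reorder each run of non-starters (CCC != 0) by ascending CCC, stably."""
    out = list(codepoints)
    i = 0
    while i < len(out):
        if unicodedata.combining(chr(out[i])) == 0:
            i += 1  # a starter (CCC 0) never moves
            continue
        # Find the end of this run of non-zero-CCC codepoints.
        j = i
        while j < len(out) and unicodedata.combining(chr(out[j])) != 0:
            j += 1
        # sorted() is stable, so marks with equal CCC keep their order.
        out[i:j] = sorted(out[i:j], key=lambda cp: unicodedata.combining(chr(cp)))
        i = j
    return out


# U+0061 "a", U+0301 acute (CCC 230), U+0323 dot below (CCC 220)
reordered = canonical_order([0x61, 0x301, 0x323])
```

The numeric CCC values are only ever compared, never interpreted, which is why any order-preserving integer encoding of the enum would do.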
samcv yeah
jnthn Or at least, if I used them elsewhere I forget
A little bit of history by the way
samcv well. also used in unicode_normalizer_process_codepoint_full
which can probably be taken out (maybe)
jnthn But only for non-zeroness? 11:13
samcv hmm?
jnthn Originally we didn't use TR29's definition of grapheme-break at all
And just combined things with a non-zero CCC
(In a very early cut of the code)
samcv i believe the GCB property should be different for ones we care about
but as i said. may not be true
but i believe it is 11:14
jnthn So it's very possible that the uses of CCC outside of the canonical ordering can be replaced by something else.
samcv so hopefully we can check the GCB property to decide what to do, and if it's not 0, er or whatever the default value is
yeah
jnthn The main thing is that we still need to make sure we are getting to NFC form *and* doing NFG
samcv yep 11:15
jnthn NFG is kinda an NFC++ in that sense.
And NFC is formed from NFD
The code at present does a quite literal 3-step process on that.
samcv m: say 0x304.uniprop('GCB')
camelia rakudo-moar 8568dd: OUTPUT«Extend␤»
jnthn (Guarded by the quick check to avoid it in many cases)
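A small sketch of the pipeline jnthn describes, assuming the three steps are decompose (NFD), compose (NFC), then form NFG. Python's standard library has no NFG, so only the first two steps and the quick-check guard are shown; `to_nfc` is a hypothetical helper, and `unicodedata.is_normalized` needs Python 3.8 or later.

```python
import unicodedata


def to_nfc(text):
    """Return `text` in NFC, skipping the work when the quick check passes."""
    # Quick check: is_normalized can often answer without doing the
    # full normalization, which is the common case for real text.
    if unicodedata.is_normalized("NFC", text):
        return text
    # Canonical decomposition plus canonical ordering gives NFD.
    decomposed = unicodedata.normalize("NFD", text)
    # Canonical composition of the ordered sequence gives NFC.
    return unicodedata.normalize("NFC", decomposed)


composed = to_nfc("e\u0301")     # "e" followed by COMBINING ACUTE ACCENT
unchanged = to_nfc("caf\u00e9")  # already in NFC, so the guard returns early
```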
samcv m: say 0x304.uniprop('Canonical_Combining_Class')
camelia rakudo-moar 8568dd: OUTPUT«230␤»
jnthn There are things that are extends with a CCC of 0 11:16
samcv do we care about those though?
jnthn Well
We do for forming NFG correctly, yes
samcv i think we care more about extends in that case
than ccc being 0
jnthn That's why the NFG algo switched away from caring about CCC 11:17
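To make the point about Extend codepoints with a CCC of 0 concrete, here is a sketch under stated assumptions. Python's standard library does not expose Grapheme_Cluster_Break, so `GCB_EXCERPT` is a tiny hand-copied excerpt for illustration only, and `gcb` falling back to "Other" is a simplification, not how a real table behaves.

```python
import unicodedata

# Tiny hand-copied excerpt; a real implementation would load the full
# GraphemeBreakProperty.txt table from the Unicode Character Database.
GCB_EXCERPT = {
    0x0304: "Extend",  # COMBINING MACRON
    0x200C: "Extend",  # ZERO WIDTH NON-JOINER
    0xFE0F: "Extend",  # VARIATION SELECTOR-16
}


def gcb(cp):
    # Simplification: anything outside the excerpt is reported as "Other".
    return GCB_EXCERPT.get(cp, "Other")


def ccc(cp):
    return unicodedata.combining(chr(cp))


# Extend codepoints that a plain "CCC != 0" test would fail to attach
# to the preceding base character.
missed = [cp for cp in GCB_EXCERPT if gcb(cp) == "Extend" and ccc(cp) == 0]
```

Here `missed` contains U+200C and U+FE0F, while U+0304 is caught by either test, matching the `uniprop` output shown earlier in the log.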
samcv ah
yeah. exactly my point
jnthn All I'm saying really is: be careful that in simplifying it, we don't somehow break forming NFC along the way.
samcv yeah. exactly
we may even be able to remove NFG_QC and only do GCB 11:23
but we will see as i start trying to simplify some things. that is much further down the list 11:24
jnthn That...doesn't sound likely
samcv since it's not too expensive
jnthn Singleton decompositions are the first counter-example that come to mind.
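A short illustration of jnthn's counter-example, using Python's `unicodedata` rather than MoarVM's tables. U+212B ANGSTROM SIGN is a singleton: it is not a combining mark and has a CCC of 0, yet it is never allowed in NFC, so a grapheme-break property alone cannot stand in for the quick check.

```python
import unicodedata

ANGSTROM_SIGN = "\u212b"  # singleton: canonically decomposes to one codepoint
A_WITH_RING = "\u00c5"    # LATIN CAPITAL LETTER A WITH RING ABOVE

# The decomposition mapping is a single codepoint, hence "singleton".
mapping = unicodedata.decomposition(ANGSTROM_SIGN)

# Not a combining mark at all...
combining_class = unicodedata.combining(ANGSTROM_SIGN)

# ...but it still fails the NFC check (is_normalized needs Python 3.8+)
# and is replaced during normalization.
passes_nfc = unicodedata.is_normalized("NFC", ANGSTROM_SIGN)
normalized = unicodedata.normalize("NFC", ANGSTROM_SIGN)
```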
samcv well for some parts
i mean we still need it. but for breaking up the graphemes i think we don't. for other things yes 11:25
i.e. can call it further down the process codepoint full function
jnthn Ah, OK.
Yes, I don't doubt there's better/faster ways to achieve the smae thing at all ;) 11:26
*same
I'd rather like there to be. :-)
samcv we use nfg_qc on things that decompose and such right? 11:27
jnthn Yes
nfg_qc is nfc_qc
And then we zero more things
So it fully covers all the things NFC does
You added some I missed (or maybe new in Unicode 9) recently to that set of "more things", I think. :) 11:28
samcv yep
jnthn, does anything that has Extend or some weird GCB property have NFG_QC=No? 11:30
i don't think they do
well can't. it just breaks things in how we do things currently.
jnthn If they do then it'd be a bug, surely. 11:31
samcv yeah
so maybe just rely on nfg_qc initially. and if it passes, ok fine, if it doesn't then look up the gcb and check decomposition and stuff 11:32
and we can even look up the ccc just once and pass it to canonical_composition and canonical sort. so will not have to look it up again 11:33
jnthn *nod* 11:34
Though canonical sort waits until it has the things it needs to sort
But we can probably keep a parallel buffer of CCCs
samcv oh true
oh actually looks like ccc is _not_ in order right now. but i made it in order. so that is nice 11:39
jnthn :)
samcv we need to borrow a really fast utf-8 decoding function though 11:40
jnthn When I implemented this stuff I was in many places quite literally translating Unicode spec into code that looked quite a lot like the spec. So if you're uncovering easy performance wins I'll hardly be surprised. :-)
I tought the one we had got was meant to be decent (in that we did actually borrow it, not write it ourselves :-)) 11:41
*thought
But I'm +100 to replacing it with something faster (provided licensing stuff is OK)
nwc10 One insight from Perl 5 land (that others had and I thought "why didn't I think of it?")
samcv idk if the qt one could work
it's simd optimized for a bunch of processors and arm as well
nwc10 is that you can support all the non-standard stuff
samcv since qt runs on phones
and they have fast latin-1 and other format conversion as well 11:42
nwc10 just make sure that the fast one flags if it finds "bad" input
and redo it with slow code that handles the exceptional stuff you want to do
because the common case is (or should be) well formed input
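The pattern nwc10 describes can be sketched as follows. This is an illustration only: in CPython both paths are C code, so "fast" and "slow" here just mark the strict and the lenient pass, and `surrogateescape` stands in for whatever non-standard handling a real decoder wants. `decode_utf8` is a hypothetical helper.

```python
def decode_utf8(data: bytes) -> str:
    """Decode with the strict path first; redo leniently if it flags bad input."""
    try:
        # Fast path: strict decoding, raises on the first malformed sequence.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Slow path: redo the whole buffer, mapping each undecodable byte
        # to a lone surrogate (U+DC80..U+DCFF) so that nothing is lost.
        return data.decode("utf-8", errors="surrogateescape")


good = decode_utf8("na\u00efve".encode("utf-8"))  # well-formed: fast path only
bad = decode_utf8(b"ok\xffok")                    # 0xFF is never valid UTF-8
```

Note that this version needs the whole buffer to still be available when the strict pass fails.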
samcv OK much better! now the nf* tests don't fail anymore now that the ccc is in order :) 11:43
well when replacing with int based function
jnthn Nice :) 11:44
samcv++
nwc10: Yeah, that's reasonable. One slight challenge is in streaming decoding where you might not have the data to hand any more to look back at
nwc10 are there any cases that require more than 1 previous character to be held? 11:46
jnthn When decoding UTF-8? All the 2-3 byte sequences :)
nwc10 specific one I can think of is one valid high surrogate comes in, and it's not followed by a low surrogate
yes. BUT
samcv define character
jnthn Oh, you're thinking of those
samcv codepoint?
nwc10 you can't let the start of the UTF-8 sequence go past
jnthn That too
nwc10 because you don't know that it's valid until you find enough continuation characters 11:47
jnthn Oh, I thought you were talking about tracking stuff to report line/column where the error in the input was.
nwc10 also, surrogates in UTF-8 aren't valid, IIRC, if you're being strict
no, just at the code point level
jnthn Ah, OK. Then my comment makes less sense :)
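Both points from this exchange can be seen with Python's standard codecs, used here purely as an illustration and not as a model of MoarVM's decoder. An incremental decoder holds back the start of a multi-byte sequence until enough continuation bytes arrive, and a strict UTF-8 decoder rejects encoded surrogates.

```python
import codecs

decoder = codecs.getincrementaldecoder("utf-8")()  # strict by default

# U+20AC EURO SIGN is E2 82 AC in UTF-8; feed it split across two chunks.
first = decoder.decode(b"\xe2\x82")  # incomplete: bytes are buffered
second = decoder.decode(b"\xac")     # the sequence completes here

# ED A0 80 would be U+D800 if surrogates were allowed in UTF-8.
try:
    b"\xed\xa0\x80".decode("utf-8")
    surrogate_accepted = True
except UnicodeDecodeError:
    surrogate_accepted = False
```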
arnsholt samcv: For answering questions like "are there characters with property A and property B at the same time" it might make sense to shove the whole shebang into an sqlite database, or similar? 12:11
In fact, looks like someone's already put together tools to do that: github.com/Codepoints/unicodeinfo
samcv why sqlite database... that would be loads slower 12:16
or you mean for my purposes?
NFG_QC is an internal property not unicode spec tho 12:19
i can just survey all the codepoints using perl6 and check things
i'm going to bed. night night everybody 12:28
jnthn 'night, samcv++ 12:29
samcv arnsholt, there's also this unicode.org/cldr/utility/properties.html
which is *most* accurate
mostly
also haven't figured out how to add and subtract properties yet but i think there is a way
arnsholt Yeah, I meant for your purposes, not for MoarVM 12:30
Because using SQL to get answers to questions is awesome (IMO, anyways =)
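A minimal sketch of arnsholt's suggestion, assuming an in-memory SQLite table filled from Python's `unicodedata` rather than from the unicodeinfo tooling linked above. The table and column names are made up for the example, and the properties available are limited to what the standard library exposes (general category and CCC, not GCB or NFG_QC).

```python
import sqlite3
import sys
import unicodedata

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE codepoints (cp INTEGER PRIMARY KEY, category TEXT, ccc INTEGER)"
)
# One row per codepoint; this takes a moment for all 0x110000 of them.
db.executemany(
    "INSERT INTO codepoints VALUES (?, ?, ?)",
    (
        (cp, unicodedata.category(chr(cp)), unicodedata.combining(chr(cp)))
        for cp in range(sys.maxunicode + 1)
    ),
)

# "Which codepoints are nonspacing marks AND have a CCC of 0?"
rows = db.execute(
    "SELECT cp FROM codepoints WHERE category = 'Mn' AND ccc = 0 ORDER BY cp"
).fetchall()
mn_with_zero_ccc = [cp for (cp,) in rows]
```

Questions that combine several properties then become a matter of adding `AND` or `OR` clauses to the query.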
15:26 FROGGS joined 16:33 brrt joined 17:01 ggoebel joined
timotimo i got another day of sleep %) 18:09
dogbert17 o/ timotimo, how are the hiccups today? 18:11
timotimo well, i haven't been up for long, but so far they haven't appeared 18:24
yesterday they were gone for most of my day-equivalent-timespan, but towards the time i wanted to go to sleep they came back
dogbert17 so, how much have you slept recently? 18:28
timotimo well, from yesterday to today and from the day before yesterday to yesterday it was ~10 hours each maybe? before that it was scattered attempts of few hours 18:31
dogbert17 10 hours, sounds like a record :-) 18:42
timotimo i'm not saying it's good :P
20:00 ggoebel joined
samcv .ask jnthn so I am trying to get certain emoji working, and so i made this change here github.com/samcv/MoarVM/commit/a29...bc21b3R579 23:45
yoleaux2 samcv: I'll pass your message to jnthn.
samcv .ask jnthn and oh oops i made a mistake. sorta nvm 23:47
yoleaux2 samcv: I'll pass your message to jnthn.
samcv nice. ok i fixed it 👍 23:48