00:45 jnap joined 01:09 raiph joined 01:54 jnap joined 02:04 colomon joined 02:23 jnap joined 02:47 jnap joined 02:51 benabik joined 03:03 jnap joined 05:30 jnap joined 05:47 jnap joined
cxreg is moar's GC a m&s model? 06:39
i had a long-standing question with parrot's GC that despite all the times it was rewritten, was never addressed. which is, making m&s gc cow-friendly 06:40
ruby added a bitmap look-aside for gc in 2.0, very similar to what i'd hoped to see in parrot 06:42
patshaughnessy.net/2012/3/23/why-yo...n-ruby-2-0
06:48 jnap joined
TimToady jnthn.net/papers/2013-bs-secret-life-of-gc.pdf 07:04
07:49 jnap joined 08:49 jnap joined
FROGGS about the unicode issues... 09:26
the problem is that we currently can't handle unions, like L, which matches when one of Lu|Ll|Lt|Lm|Lo is true
so I was thinking of making a bitmask out of Lu|Ll|Lt|Lm|Lo, but unfortunately these are enums (1..5) which makes that impossible
and we have only three bytes available for storing that information
my goal for today is to pack these unions (L = Ll|Lm|Lo|Lt|Lu, LC = Ll|Lt|Lu) as a list of ranges in these three bytes
diakopter FROGGS: u still there? 09:29
FROGGS diakopter: yes
diakopter why do you say we have only 3 bytes for that 09:31
FROGGS because it is a MVMint32, and the first byte is for the property code 09:32
diakopter eh
diakopter is confused 09:33
where do you see these 3/4 bytes 09:34
FROGGS here, for example: entry->codepoint = unicode_property_value_keypairs[index].value & 0xFFFFFF;
MVMint32 property_code = unicode_property_value_keypairs[index].value >> 24;
diakopter that's generated; it computes how many bits each property needs; most are binary so they need just one 09:35
FROGGS Ll is something like (14 << 24) + 1
and Zs is something like (14 << 24) + 36 09:36
14 == general category
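The packing FROGGS describes can be sketched as follows. This is an illustrative Python sketch, not MoarVM's generated code: the helper names `pack_keypair` and `unpack_keypair` are hypothetical, 14 is the general category property code mentioned in the chat, and the value codes 1 (Ll) and 36 (Zs) are the "something like" figures quoted there. Without parentheses, `14 << 24 + 1` would parse as `14 << 25` in both C and Python, so the shift is parenthesized.

```python
GENERAL_CATEGORY = 14  # property code quoted in the chat


def pack_keypair(property_code, value_code):
    """Put the property code in the top byte and the value code in the low 24 bits."""
    if not 0 <= property_code <= 0xFF:
        raise ValueError("property code must fit in one byte")
    if not 0 <= value_code <= 0xFFFFFF:
        raise ValueError("value code must fit in three bytes")
    return (property_code << 24) | value_code


def unpack_keypair(packed):
    """Mirror of the two C expressions quoted above: '>> 24' and '& 0xFFFFFF'."""
    return packed >> 24, packed & 0xFFFFFF


# Assumed example value codes ("something like" in the chat).
ll = pack_keypair(GENERAL_CATEGORY, 1)
zs = pack_keypair(GENERAL_CATEGORY, 36)
```

Because the low 24 bits hold a single value code, one entry can name exactly one value of a property, which is why a union such as L has no slot of its own here.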
diakopter my understanding of the ucd2c.pl is so far removed from the generated code that doesn't help me; I can only speak in generalities, I guess
FROGGS diakopter: it is not about how many bits something needs, it is the property code and property value code 09:37
diakopter what is the problem we're trying to solve? just that the "L" property doesn't work?
FROGGS diakopter: that is in unicode_ops.c
diakopter (or other aliases?)
FROGGS correct, any union like L does not work
diakopter it's just that in the unicode text file tables, it doesn't care whether it's a union; the value is "normalized" into the text file anyway 09:38
FROGGS hmmm, maybe I could make L its own property code, and emit a lookup to Lu | Ll | ...
diakopter so it should just be a regular old boolean property
FROGGS gc ; L ; Letter # Ll | Lm | Lo | Lt | Lu 09:39
gc ; LC ; Cased_Letter # Ll | Lt | Lu
gc ; Ll ; Lowercase_Letter
gc ; Lm ; Modifier_Letter
diakopter right; it's precomputed for you in the .txt
at least, it used to be
FROGGS if you ask a codepoint like "a" for its property value code for general category, you get a 1, which is Ll
so you can't check for L in the same category 09:40
diakopter hang on
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
FROGGS hmmm 09:41
but but
diakopter inthe
FROGGS hmmm
yeah
diakopter hang on; looking at the ucd2c.pl for a min
oh, did you revert your aliases thing yet 09:43
FROGGS no
I may know why it does not work
diakopter can you revert it 09:44
FROGGS if we are using UNIDATA/extracted/DerivedGeneralCategory.txt, there are mappings for Ll,Lu,etc to codepoint, but not for L
diakopter oh
FROGGS no, does not seem to be used... 09:45
diakopter hm, how new is that file
(was it in 6.0 or 6.1?)
FROGGS no idea
diakopter if so, oops :)
hm yeah, seems it was there 09:46
so that's not the problem
(they didn't move the data to there)
I'm still looking at the .pl 09:47
FROGGS but sadly I must go now, otherwise we won't have lunch :o( 09:48
I am going to check then why we don't create an enum for L if that is the case
diakopter i'll keep looking at it
FROGGS thank you
:o)
09:49 FROGGS[mobile] joined
diakopter FROGGS[mobile]: which commits were you going to revert 09:50
09:50 jnap joined
FROGGS parts of github.com/MoarVM/MoarVM/commit/00...a9f#diff-2 09:51
I'd remove at least line 775 to 783 there
diakopter what were the other lines doing
FROGGS was about adding ASCII_Hex_Digit, ASCIIHexDigit and asciihexdigit 09:56
10:03 [Coke] joined, masak joined
dalek MoarVM: 815c123 | jnthn++ | src/core/args.c:
Align arity-related error generation with Rakudo.
10:23
10:25 FROGGS[mobile] joined
moritz on Thursday we will have the first monthly rakudo release that supports MoarVM 10:28
for that purpose, we should also cut a MoarVM release
jnthn moritz: Yes, we should.
moritz so, how? :-) 10:29
jnthn 1) Make a tarball. 2) Put it somewhere. :)
moritz maybe steal the release process from NQP, which also has submodules?
jnthn That's a good idea
moritz jnthn: tag the version in the MoarVM repo
write it into a file
ideally have MoarVM read that file, and know its own version 10:30
jnthn C:\consulting\MoarVM>moar --version 10:31
This is MoarVM version 2013.10-375-g3a57b01
It does at least have some idea...
moritz ok, so that needs to consider a VERSION file if not under git, or something
jnthn But i guess that comes from git
So yeah...need to change that. 10:32
moritz I'll try to look into it this weekend, but we have visitors, so tuits might be low-ish
jnthn Sure. If you don't get to it, I will :)
moritz ooh, Configure.pl already reads from a VERSION file if it exists 10:33
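The fallback moritz and jnthn are discussing (use a VERSION file when the tree is a release tarball, otherwise ask git) could look roughly like this. It is a sketch under stated assumptions, not what Configure.pl does: `resolve_version` is a hypothetical helper, the use of `git describe --tags` is assumed from the shape of the version string shown above, and "2013.10" is only an example value.

```python
import subprocess
import tempfile
from pathlib import Path


def resolve_version(source_dir):
    """Prefer a VERSION file (release tarball); otherwise fall back to git."""
    version_file = Path(source_dir) / "VERSION"
    if version_file.is_file():
        return version_file.read_text().strip()
    try:
        # Assumption: the git-derived version comes from 'git describe'.
        result = subprocess.run(
            ["git", "describe", "--tags"],
            cwd=source_dir, capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


# Simulate an unpacked release tarball that ships a VERSION file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "VERSION").write_text("2013.10\n")
    demo_version = resolve_version(tmp)
```

Checking the file first means a tarball build never needs git installed, while a checkout without a VERSION file still gets a precise commit-based version.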
jnthn oh, cool
moritz FROGGS++
FROGGS[mobile]++
dalek MoarVM: 75e60a1 | jnthn++ | src/core/args.c:
Align missing named param error with Rakudo's.
10:34
10:51 jnap joined 10:56 odc joined
FROGGS diakopter: could you make some sense out of it? otherwise I would continue now 10:57
dalek MoarVM: ccce6a6 | moritz++ | / (2 files):
[Build] "release" Makefile target
10:59
moritz this is a "simplest thing that could possibly work". I'm pretty sure pmichaud++ had some issues with git-archive, but I don't know what they were, so I'm going to use it until I run into the same issue 11:00
jnthn was gonna go with git archive also 11:01
FROGGS ahh, we are indeed reading from DerivedGeneralCategory where L is not listed! 11:26
I can fix that
timotimo i had a probably silly idea when trying to sleep: 11:38
since we handle strings as collections of begin+length blobs anyway, would it make sense to have an optimization that looks for common substrings and compresses every string?
it seemed very costly to me when i thought about it, but there may be crazy-cool algorithms involving prefix-tables or something like that 11:39
jnthn is working on the binder elimination opt today 11:51
11:52 jnap joined
FROGGS jnthn: will this get rid of the "slow path" ? 11:52
jnthn Yeah 11:53
FROGGS ohh, that sounds interesting then :D
timotimo \o/ 11:55
jnthn I'm probably gonna (though maybe not in time for release) get this stuff in place on JVM too. 12:02
FROGGS ohh, I didn't even know that this applies to jvm also 12:12
jnthn Well, we notice it a bit less there 'cus the JVM is good at making hot slow paths less slow :)
timotimo but this would also hit non-hot slow paths, yes? 12:20
jnthn Yeah 12:22
timotimo i like that
jnthn And should mean the JVM can do a way better job on hot paths overall
timotimo i can imagine 12:23
jnthn Time for a little shopping/lunch, methinks...
12:52 jnap joined 13:53 jnap joined
FROGGS 0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041 13:58
I wonder how both Ll and L property is stored for "a" in the six bits that general category has... 13:59
when I do "die Dumper($point) if $point->{code} == 65;" in the while loop of sub emit_bitfield, I can see that every codepoint can only have one property value within a category 14:19
so "a" can't be Ll and L at the same time atm
14:44 benabik joined
benabik L = Ll + Lu + L? 14:44
FROGGS ? 14:45
benabik L is Letter, Ll is lowercase, Lu is uppercase (IIRC). So perhaps it's implicitly in L because it's in Ll? 14:46
FROGGS well, that is not wrong, the problem is that the implementation makes it impossible to check for L atm
14:54 jnap joined
moritz there are also non-cased letters 14:57
L = Lu + Ll + Lt + Lm + Lo
(t = titlecase, m = modifier, o = other) 14:58
benabik L = ^L.$ ? 14:59
FROGGS that is not how it works :o) 15:00
diakopter: that "L" in "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041" is not gc:Letter but bc:Left_To_Right 15:08
diakopter: so I think we need to make L, LC, etc first class properties with one bit width 15:24
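One way to read the proposal above: since each codepoint stores a single general category value, the unions become separate one-bit properties derived from that value. A minimal sketch, assuming the union definitions quoted earlier from PropertyValueAliases.txt; `UNIONS` and `derived_flags` are hypothetical names, not ucd2c.pl code.

```python
# Union definitions as quoted from PropertyValueAliases.txt in the chat.
UNIONS = {
    "L": {"Ll", "Lm", "Lo", "Lt", "Lu"},
    "LC": {"Ll", "Lt", "Lu"},
}


def derived_flags(general_category):
    """Compute one boolean per union from a codepoint's single gc value."""
    return {name: general_category in members for name, members in UNIONS.items()}


flags_for_a = derived_flags("Ll")      # "a" is Ll
flags_for_modifier = derived_flags("Lm")
flags_for_space = derived_flags("Zs")
```

Each flag costs one bit per codepoint in the bitfield, and the lookup for L then works like any other boolean property.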
15:51 ingy joined 15:52 dagurval joined 15:55 jnap joined 15:58 ingy joined, dagurval joined
diakopter FROGGS: ah yes, ok 16:03
FROGGS: if you're not doing it, I can do it now 16:04
FROGGS: you working on it? 16:12
FROGGS diakopter: not atm 16:19
diakopter: I would be happy if you gave it a whirl
diakopter ok I will
now 16:20
FROGGS: where are the other union-ish categories defined 16:22
er, properties
FROGGS UNIDATA/PropertyValueAliases.txt:566:gc ; L ; Letter # Ll | Lm | Lo | Lt | Lu 16:23
you need to look for | in the last part 16:24
diakopter did you see I already had the section to parse that file
FROGGS I added: 16:26
my @unionof = split /\s*\|\s*/, $parts[-1];
if ($#unionof) {
to:
github.com/MoarVM/MoarVM/blob/mast...2c.pl#L872
ahh, and the first within that "if ($#unionof) {" is "pop @parts;"
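For readers following along without the Perl source, the union detection can be sketched like this. It is an approximation, not the ucd2c.pl code: `parse_alias_line` is a hypothetical helper, and unlike the Perl snippet (which splits the last field and then pops it) it separates the trailing comment explicitly before looking for `|`.

```python
import re


def parse_alias_line(line):
    """Split a PropertyValueAliases.txt line into fields and union members."""
    body, _, comment = line.partition("#")
    parts = [field.strip() for field in body.split(";")]
    members = [m for m in re.split(r"\s*\|\s*", comment.strip()) if m]
    # Only treat the comment as a union when it lists more than one member,
    # which is what the 'if ($#unionof)' test expresses in the Perl version.
    union = members if len(members) > 1 else []
    return parts, union


letter = parse_alias_line("gc ; L ; Letter # Ll | Lm | Lo | Lt | Lu")
lower = parse_alias_line("gc ; Ll ; Lowercase_Letter")
```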
diakopter ok, but what is the current line 808 doing 16:27
FROGGS it resolves extra aliases
you can get rid of it for now
because it probably mixes names of aliases and unions 16:28
diakopter ok 16:29
16:55 jnap joined 17:56 jnap joined
jnthn m: say 850 / 1662 18:28
camelia rakudo-moar 6aa2f1: OUTPUT«0.511432␤»
jnthn Got a few regressions to hunt after dinner, but so far got spectest time on Moar down to half of what it was.
Well, just over half 18:29
diakopter wat 18:44
that sounds like it means setting parse time is far more than 50% reduced in time
FROGGS wow! 18:49
dalek MoarVM: 8782fd3 | diakopter++ | / (3 files):
back out these changes temporarily or not
18:51
diakopter FROGGS: did I remove the right things?
I'm not certain
jnthn diakopter: No, setting parse time is about the same. :) 18:52
diakopter o_O
jnthn diakopter: It's the code it generates that's better.
diakopter o
FROGGS: I'm not sure 18:56
18:57 jnap joined, lue joined
diakopter FROGGS: am I right that a char can have multiple general categories? 19:00
FROGGS diakopter: yes, looks like you removed the right things 19:11
diakopter: yes
a char can be at least in two categories of the general category 19:12
same goes for all unions
I mean, this applies only for unions
diakopter ok 19:13
FROGGS a codepoint can be in up to two categories of a "property"
jnthn: btw, I am running into this: 19:25
Object conflict detected during deserialization.
(Probable attempt to load two modules that cannot be loaded together).
I just commented it so I can compile v5... and it seems to work :o)
perl6-m t/spec/base/num.v5 19:33
Cannot find method 'encoding': no method cache and no .^find_method
looks like I need to tweak v5's module loader...
damn, I did not want to use fudge for v5's source 19:34
jnthn FROGGS: Oh, there's some NYI there I think... 19:37
[Coke] moar is up to 28068 passes. 19:40
FROGGS jnthn: ahh, about repossession or some such? 19:41
jnthn yeah
FROGGS [Coke]: and it will get a lot more passing tests or it will fail quite a lot for the next run... be prepared :o) 19:42
[Coke] oh, did you fix the unicode!? woot. 19:44
FROGGS not me, no 19:45
diakopter++ is working on it 19:46
19:58 jnap joined
FROGGS damn, I think I sent a patch for Perl5 wrong-ish :o( 20:05
jnthn FROGGS: You wrote it in Perl 6 instead? :) 20:07
FROGGS hehe, no 20:08
but I think that I have attached either another unrelated patch, or only the unrelated patch :/
jnthn Email, how does it work... 20:10
diakopter FROGGS: hard problem is hard
er, easy problem is hard
FROGGS diakopter: yeah
diakopter well, I got this far: 20:12
processing PropertyValueAliases.txt...found union: C is 'Cc | Cf | Cn | Co | Cs'
found union: L is 'Ll | Lm | Lo | Lt | Lu'
found union: LC is 'Ll | Lt | Lu'
found union: M is 'Mc | Me | Mn'
found union: P is 'Pc | Pd | Pe | Pf | Pi | Po | Ps'
found union: S is 'Sc | Sk | Sm | So'
found union: Z is 'Zl | Zp | Zs'
but the problem is it needs to do this much earlier
anyway, moving it 20:14
dalek MoarVM: c74390c | jnthn++ | / (10 files):
Add assertparamcheck, to help with arg type checks

This will be used to add a new feature to the Moar code-gen in NQP, which in turn will be used for binder fast-pathing in Rakudo.
20:15
MoarVM: 2bfad44 | jnthn++ | src/core/args.c:
Decont while auto-unboxing.

Works for the common case.
FROGGS diakopter: well, you could scan for that piece earlier... 20:16
diakopter yeah
FROGGS like before calling UniCode(
diakopter adding into UnicodeData()
timotimo jnthn: this is your speedup without regressions? :) 20:24
jnthn timotimo: Well, I might have lost 4-5 tests.
tests, not test files
But think we can live with that. 20:25
Rakudo patch coming in a bit.
timotimo that sounds pretty darn good! 20:26
jnthn Doing some further analysis/tweaks 20:27
diakopter FROGGS: I'm still not sure whether to add these as top-level categories or not 20:58
20:58 jnap joined
FROGGS is there another way? 21:00
I think that is the cheapest we can do
if we want to avoid adding another hash
diakopter FROGGS: oops I meant top-level properties 21:01
jnthn diakopter: What's the argument against doing so?
diakopter hm 21:02
I guess there isn't one
except unicode treats them as property values
instead of properties
meh 21:03
they're looked up the same way anyway
it's not like someone does <gc=Letter>
FROGGS or <:gc<Letter>> 21:07
I'd do it in another way too if I knew one 21:08
21:20 masak joined, eternaleye joined, harrow joined
diakopter FROGGS: wait, how are the general category things currently looked up 21:36
FROGGS diakopter: it is looking for, say, "Letter" in unicode_property_keypairs 21:40
MoarVM/src/strings/unicode.c:32957: {"compex",44},{"Digit",22},{"digit",22},{"CWU",38},{"cwu",38},{"Soft_Dotted",75},{"SoftDotted",75},{"softdotted",75},
diakopter no I mean the actual general categories, like Letter_Cased
FROGGS so Digit is property code 22
Letter_Cased should be in there too, but might not be atm 21:41
it should point to MVM_UNICODE_PROPERTY_GENERAL_CATEGORY, which is 14 for right now 21:42
dalek MoarVM: ac8af16 | tadzik++ | src/core/args.c:
Remove leftover debugging code
diakopter so someone would need to do <:gc<Letter_Cased>> ?
tadzik sweet karma
FROGGS diakopter: no
diakopter tadzik--
FROGGS Letter_Cased would point to 14, which is gc
jnthn tadzik++ # removing jnthn's debugging code :P
FROGGS and then it goes on and searches for the property value code using that property code 14 21:43
so Letter_Cased is an alias for gc when it comes to property codes 21:44
and this is a problem for L, because there are several L-s, and we prefer the one from gc
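The two-step lookup FROGGS describes (name to property code, then name to value code within that property) might be modeled as below. This is a sketch with assumed data: the tables and `resolve` are hypothetical, the codes 14 and 22 are the ones mentioned in the chat, and the value codes 1 and 36 are the approximate figures given earlier.

```python
# Step 1 table: property names and value aliases both map to a property code.
PROPERTY_CODES = {"gc": 14, "Digit": 22, "digit": 22, "Ll": 14, "Zs": 14}

# Step 2 table: (property code, value name) -> value code.
VALUE_CODES = {(14, "Ll"): 1, (14, "Zs"): 36}


def resolve(name):
    """Return (property code, value code); value code is None for booleans."""
    property_code = PROPERTY_CODES.get(name)
    if property_code is None:
        return None  # unknown name, e.g. a union that was never registered
    return property_code, VALUE_CODES.get((property_code, name))


lowercase = resolve("Ll")
digit = resolve("Digit")
letter = resolve("L")
```

In this model "L" fails at step 1 until it is registered, and registering it as an alias of gc would still leave step 2 without a single value code to return, which matches the problem under discussion.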
diakopter: that is why I did this: gist.github.com/FROGGS/7575e1d5b15abc22ba89 (note that the patch is invalid, because modified manually) 21:47
diakopter bleh
I'm not sure what I meant the %done thing to do :) 21:48
FROGGS install aliases only once I suppose
because you only get the first anyway 21:49
diakopter yeah but it never checked it
FROGGS yeah, I've seen that :o)
but still, diakopter++ for this unimagic
diakopter we'll see
jnthn Mmm...smoked porter 21:50
FROGGS and I will have a full scottish breakfast in about ten hours! yay! 21:53
that does not help about the missing beer, but well, what an I do?
can*
21:59 jnap joined
diakopter src/vm/moar/ops/container.c(154) : warning C4133: 'function' : incompatible types - from 'Rakudo_Scalar *' to 'MVMObject *' 22:04
jnthn FROGGS: Take more care to keep your beer fridge stocked, what's what :P 22:06
*that's
diakopter: Thanks, got it 22:07
FROGGS not another project... SDL, v5, kids, S11 and now beer? ó.ò 22:08
diakopter harness reports: Missing test file: t\spec\S04-operators\brainos.t
jnthn Yeah, that's not moar specific
haha...I have TEST_JOBS=6 and now Rakudo "make test" takes 3s... 22:09
diakopter someone remind me how to nmake spectest in parallel
jnthn set TEST_JOBS=n
(choose n)
diakopter thanks
jnthn uh, and less obviously: I tend to find virtual cores - 2 works out well.
FROGGS diakopter: I tested these four files regularly btw: gist.github.com/FROGGS/aaafdacb95ece1aba4d8 22:10
diakopter jnthn: can you nopaste your current spectest output so I can compare to mine
FROGGS diakopter: you have to beat the second block there :o)
diakopter ok 22:11
jnthn diakopter: Just for S05?
FROGGS jnthn: are you sure you are really invoking prove? O.o
3s is like quick
jnthn FROGGS: Yes, I see all the tests go by
diakopter jnthn: well, all, if you could
jnthn Comes out as 4s sometimes... 22:12
FROGGS ohh yes, it feels very fast...
jnthn Doing another spectest run with latest changes here 22:13
diakopter: Will paste you it in a mo
diakopter FROGGS: well, somehow I broke the block lookups 22:22
FROGGS hmmm, should not be too hard to fix, this was in a separate commit 22:23
jnthn diakopter: gist.github.com/jnthn/95d3d4c69d585bbcece5 22:28
diakopter thanks 22:30
timotimo 3s? that's absurdly quick 22:44
FROGGS m-spectest takes 8:44.44 using TEST_JOBS=4 22:50
dunno how much it was before though
jnthn Probably quite a lot more :)
timotimo how long does parrot take for that on your machine?
FROGGS yeah... 8:44 minutes is very nice
timotimo: need to rebuild it to test
jnthn Well, it's ok... :)
FROGGS will post later
timotimo thanks :) 22:51
i'm rebuilding everything right now, too
23:00 jnap joined
jnthn Rakudo on Moar seems to have the best startup time of the Rakudos, by this point. 23:15
m: say "I starts in {100*0.31/0.46)% of the time Rakudo on Parrot does" 23:16
camelia rakudo-moar 61860f: OUTPUT«===SORRY!=== Error while compiling /tmp/ihZuhgiFgu␤Unable to parse expression in block; couldn't find final '}' ␤at /tmp/ihZuhgiFgu:1␤------> say "I starts in {100*0.31/0.46⏏)% of the time Rakudo on Parrot does"…»
jnthn m: say "I starts in {100*0.31/0.46}% of the time Rakudo on Parrot does"
camelia rakudo-moar 61860f: OUTPUT«I starts in 67.391304% of the time Rakudo on Parrot does␤»
FROGGS a segfault I have to golf: gist.github.com/FROGGS/647a0fa6bcb06d42717e 23:17
interp.c:2060 is existskey 23:18
jnthn obj = 0x0
You're trying to do a lookup in a null hash.
FROGGS hmmm, might be related to this "Probable attempt to load two modules that cannot be loaded together" 23:20
yeah, require-ing my test.pl from a v5 block which loads a v6 script which has a v5 block in it crashes 23:31
and these v5 blocks get a Perl5::Terms module implicitly loaded 23:32
dalek MoarVM: 26a57a8 | diakopter++ | / (3 files):
try to add aliases and gc value aliases # who knows whether it will work
23:53
MoarVM: b8f2a2d | diakopter++ | tools/ucd2c.pl:
try again
MoarVM: 1a93bf0 | diakopter++ | / (3 files):
pass a couple more tests... need more though.
jnthn Hmm, seems that the MAST to bytecode thingy doesn't properly detect duplicate callsites 23:54
mebbe just NYI 23:55
Probably worth doing since CORE.setting gets 7045 callsites :) 23:57
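Detecting duplicate callsites amounts to interning them: identical argument-flag sequences should share one table entry. A minimal sketch of that idea, not the MAST compiler's code; `CallsiteTable` and the flag names "obj" and "int" are hypothetical.

```python
class CallsiteTable:
    """Interns callsites so identical flag sequences share one index."""

    def __init__(self):
        self._index_by_flags = {}
        self.callsites = []

    def intern(self, arg_flags):
        key = tuple(arg_flags)  # hashable form of the flag sequence
        index = self._index_by_flags.get(key)
        if index is None:
            index = len(self.callsites)
            self._index_by_flags[key] = index
            self.callsites.append(key)
        return index


table = CallsiteTable()
first = table.intern(["obj", "obj"])
second = table.intern(["obj", "int"])
repeat = table.intern(["obj", "obj"])  # duplicate, reuses the first entry
```

If named argument names were folded into the callsite as jnthn suggests, they would simply become part of the key.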
FROGGS uhh, something to pull 23:58
diakopter jnthn: yeah it broke at some point
jnthn diakopter: Well, I'm thinking of working named arg's names into callsites too
diakopter :) 23:59
jnthn diakopter: And maybe a "no need to re-type-check" flag
I realized that the first, but maybe also the second, will help when it comes to doing the specializer