|
01:40
pyrimidi_ joined
02:48
ilbot3 joined
02:52
FROGGS_ joined
03:23
TimToady joined
04:31
geekosaur joined
05:10
pyrimidine joined
08:00
domidumont joined
08:05
domidumont joined
|
|||
| dalek | MoarVM: 8b31d97 | samcv++ | / (3 files): Fix RT #122471 and #122470 return <control-0000> for \0 and other controls RT: rt.perl.org/Ticket/Display.html?id=122471 rt.perl.org/Ticket/Display.html?id=122470 We now pass several tests we were not passing before in uniname.t |
11:47 | |
| synopsebot6 | Link: rt.perl.org/rt3//Public/Bug/Displa...?id=122471 | ||
| Link: rt.perl.org/rt3//Public/Bug/Displa...?id=122470 | |||
| MoarVM: 3dc5647 | jnthn++ | / (3 files): Merge pull request #469 from samcv/uniname_no1 Fix RT #122471 and #122470 return <control-0000> for \0 and other controls |
|||
| synopsebot6 | Link: rt.perl.org/rt3//Public/Bug/Displa...?id=122471 | ||
| Link: rt.perl.org/rt3//Public/Bug/Displa...?id=122470 | |||
| samcv | thanks jnthn | 11:48 | |
| jnthn | Thank you! :-) I'll try and find a moment to look at #468 soon also. | 11:49 | |
| samcv | oh one thing though | ||
| so there is this bug that has been in it for like | |||
| at least 1 year. where space is not space | |||
| m: say ' ' ~~ /<:space>/ | |||
| camelia | rakudo-moar 340bc9: OUTPUT«「 」» | ||
| samcv | this ONLY works because it matches the space property of Line_Break | 11:50 | |
| and it aliases SP to space and Space | |||
| if i change the alias in the unicode property _VALUE_ file, it totally breaks that | |||
| so i've gotten it fixed so White_Space == space. but | |||
| then it still is broken doing ' ' ~~ /<:space>/ | 11:51 | ||
| and i cannot figure out why :( | |||
| but i believe that other PR passes all spectests, and still has that bug. | |||
| which also means the spectests pass :P | |||
| i also encountered the opposite case, where ' ' ~~ /<:White_Space>/ would be 'wrong' (aka it would seem to work for 0x20) but then ' ' ~~ /<:space>/ would be broken. though uniprop-bool would return the correct value for both | 11:52 | ||
| jnthn | I think I uncovered something along these lines a while back | 11:53 | |
| samcv | ALSO if I hack the script to change the White_Space property to the 'space' property (so that's its primary name), then change MVM_CC_WHITE_SPACE(something like that) to MVM_CC_SPACE and recompile | ||
| it all works flawlessly | |||
| and all tests pass | |||
| jnthn | Lemme try and find it... | ||
| samcv | though it is a workaround | ||
| jnthn | Yeah, iirc we're being a bit too lenient on what properties we accept without a name qualifier | 11:54 | |
| samcv | that's the only _consistent_ way i was able to fix the bug. | ||
| but it is quite bad that we think 'space' is somehow a property VALUE alias to SP | |||
| m: ' '.uniprop('Line_Break').say | 11:55 | ||
| camelia | rakudo-moar 340bc9: OUTPUT«SP» | ||
| samcv | which is the reason <:space> works. 'space' is one of the whitespace canonical names. while the SP property, it is aliased to Space with capital | ||
| jnthn | aha, this: github.com/perl6/specs/issues/118 | 11:56 | |
| I think the badness may boil down to the thing <:space> compiles into | |||
| Which iirc is a lookup asking "what property name has a value of this name" or some such, which is ambiguous. | 11:57 | ||
| samcv | though what is weird | ||
| jnthn | I ran into this when I realized that *some* regenerations of the Unicode DB failed spectests | ||
| samcv | ' '.uniprop-bool('space') works | ||
| but <:space> doesn't work. actually | |||
| <:space> IS OPPOSITE | |||
| jnthn | And then the next re-run on the same input data worked | 11:58 | |
| samcv | and will match non whitespace and not match whitespace | ||
| jnthn | Because hash randomization screwed up the order | ||
| samcv | yeah | ||
| jnthn | I think I discovered this while trying to get it to be resistant to hash randomization and always write out things in the same order | ||
| samcv | i notice that too | ||
| jnthn | That patch may be in a branch somewhere | ||
| samcv | what patch? | ||
| to fix? | |||
| jnthn | No | ||
| To get consistent order | |||
| samcv | ah | 11:59 | |
| jnthn | Which...uh...provided consistent breakage. | ||
| samcv | just add sort in front of all the 'keys'? | ||
| :) | |||
| yes | |||
| jnthn | I think it needed a couple of places | ||
| samcv | well i put sort everywhere keys was | ||
| and then it was always broken | |||
| and it was great :) | |||
| jnthn | github.com/MoarVM/MoarVM/commit/22...f17661a58c | ||
| samcv | but the space thing was the most pervasive problem… | 12:00 | |
| but ONLY in regex | |||
| which i think is a regex bug | |||
| related to that thing you linked | |||
| <:Ll> is not a unicode property, it's a value. | |||
| and other things etc | |||
| jnthn | *nod* | ||
| samcv | general categories are distinct though | ||
| jnthn | There's a useful answer on that issue I linked also, from nova patch | ||
| samcv | but who wants to match the SP property of Line_Break without specifying it | ||
| jnthn | "No regex engine allows for arbitrary property values for all properties without the associated names, due to the obvious conflicts." | 12:01 | |
| samcv | oh Also, supporting Script instead of Script_Extension would be a mistake since the latter is generally what people expect and should be encouraged over Script. I p | ||
| +1 for that | |||
| jnthn | This isn't true. Ours apparently does. :P :P | ||
| But I agree it really shouldn't. :) | |||
| samcv | matches extended script? | ||
| do we even parse that unicode file? | 12:02 | ||
| jnthn | No, I meant the "allows for arbitrary property values" thing I quoted. | ||
| samcv | ah | ||
| so i guess it looks up the SP property first | |||
| SP => Space... though.... | |||
| jnthn | Yeah, and the reason it gets that first is, iirc, because we get lucky with the ordering. :S | 12:03 | |
| samcv | if i edit the unicode file and change it from SP;Space to SP; fakeeeee | ||
| then it still breaks | |||
| ah | |||
| yes | |||
| so i don't know what it's looking up for that | |||
| also how to fix it so that it uh. isn't broken | |||
| i have been working hard trying to get things to work | 12:04 | ||
| and i sort of figured out where it breaks | |||
| but it's like "everything is ok here" #####MAGIC HERE##### | |||
| then on the other side it just is totally either screwed up totally or like ok-ishhhhhh | |||
| let me find the line | |||
| jnthn | I wonder if we can fix it by changing how <:Foo> (that is, just a value) is compiled | 12:05 | |
| So that it just considered general category, and script extension, and nothing else. | 12:06 | ||
| *considers | |||
| samcv | uh emit_unicode_property_value_keypairs | 12:07 | |
| what about boolean properties | |||
| like space? | |||
| jnthn | github.com/perl6/nqp/blob/master/s....nqp#L1360 # this is where the compilation happens, fwiw | ||
| samcv | property NAMES don't interfere with script or uhm | 12:08 | |
| YES | |||
| i looked at that | |||
| ahh | |||
| yep | |||
| was hard to understand :P | |||
| jnthn | op('unipropcode', $pcode, $pname), | 12:09 | |
| op('unipvalcode', $pvcode, $pcode, $pname), | |||
| samcv | but i think we should match binary properties | ||
| jnthn | That I think is where it's a bit dubious | ||
| samcv | script names | ||
| uh | |||
| and general category | |||
| and probably script extensions too | |||
| jnthn | Are binary properties reliably unambiguous with general category and script extenion? | ||
| *extension | 12:10 | ||
| samcv | yes | ||
| unless you count Sc and sc | |||
| that is the only exception | |||
| but you shouldn't changecase for names that are only 2 letters anyway | |||
| jnthn wonders if we'll end up with more exceptions in the future :) | |||
| samcv | uh | 12:11 | |
| that's not really an exception | |||
| you're only supposed to allow lowercase and no underscore for names that have an underscore | |||
| pretty sure | |||
| will have to see which TR said that | |||
| so like WSpace, space, White_Space are official names. so you can do also | 12:12 | ||
| whitespace, white_space | |||
| or WhiteSpace | |||
| jnthn | Ah, I see. | 12:13 | |
| samcv | actually i think the better rule is the stuff in the 2nd column of the unicode property aliases file | ||
| anything in the 2nd column, the long name, you can do that with | 12:14 | ||
| for sure | |||
| but the 1st column you can't | |||
| let me try and find it | |||
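The spellings listed above (White_Space, WhiteSpace, white_space, whitespace) all collapse to one key if names are compared with case and underscores ignored. The following is a minimal Python sketch of that folding, not the code MoarVM's generator script uses; `loose_key` is a hypothetical helper name, and it handles only case, whitespace, and `_`.

```python
def loose_key(name: str) -> str:
    """Fold a property name or value for loose comparison.

    Only case, whitespace and underscores are ignored here; this is an
    illustrative simplification, not a full implementation of the
    Unicode loose-matching rule.
    """
    kept = (ch for ch in name if ch != "_" and not ch.isspace())
    return "".join(kept).lower()


# The variants mentioned in the discussion fold to a single key.
spellings = ["White_Space", "WhiteSpace", "white_space", "whitespace"]
keys = {loose_key(s) for s in spellings}

# The one collision noted in the chat: Sc (a General_Category value)
# and sc (the short alias of the Script property) fold to the same key.
collide = loose_key("Sc") == loose_key("sc")

print(keys, collide)
```

Because folding erases the Sc/sc distinction, a lookup built this way would need to keep two-letter names case-sensitive, or keep property names and property values in separate tables.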
|
12:15
pyrimidine joined
|
|||
| samcv | Loose matching should be applied to all property names and property values, with | 12:15 | |
| # the exception of String Property values. | |||
| With loose matching of property names and | |||
| # values, the case distinctions, whitespace, and '_' are ignored. For Numeric Property | |||
| # values, numeric equivalencies are applied: thus "01.00" is equivalent to "1" | |||
| jnthn | Last I looked at the script, I think we cheated in case a bit also. | 12:16 | |
| samcv | yeah it lowercases them | 12:17 | |
| takes out _ etc | |||
| jnthn | Yup | ||
| Just covers the common ways you might write it | 12:18 | ||
| But not truly case insensitive | |||
| samcv | but the property value and property name aliases shouldn't be in the same structure maybe? idk | ||
| jnthn | That's probably the least of our troubles at the moment, however. :) | ||
| samcv | seemed weird | ||
| jnthn | Yeah | ||
| samcv | yeah :) | ||
| jnthn | I guess... | ||
| samcv | it creates a hash with the data there right? | 12:19 | |
| so there would be collisions or? | |||
| jnthn | At the point we compile the regex we can actually case-analyze <:Foo> for if it's a general category name, a script extension name, or a boolean property name | ||
| samcv | is it just a normal kind of hash, with keys and values? | ||
| jnthn | I think it's actually not a hash but instead does some kind of binary search | 12:20 | |
| But I may be misremembering | |||
| samcv | also i'm not sure how it looks up the property names | 12:21 | |
| for a given name | |||
| jnthn | udc2c.pl and the Unicode database stuff is one of the handful of bits of MoarVM that I didn't either write in the first place or significantly rewrite somewhere along the lines. :-) | ||
| samcv | and what the numbers in unicode_property_value_keypairs mean | ||
| heh | |||
| jnthn | And its workings have mystified me a few times too :P | ||
| Also I didn't touch it for a few months. I think that it boils down to something like: each char in the database has a bitfield, which stores bit-packed representations of property values | 12:22 | ||
| So looking up a property name I believe resolves to an index into a table that specifies the relevant bits to extract | 12:23 | ||
| samcv | dammit rt.perl.org isn't posting my emails | 12:24 | |
| jnthn | And then the integer those bits make up provides a way to do a lookup in a property values table | ||
| samcv | yeah that's what i sort of thought | ||
| though. does the order of the pairs in it matter | |||
| is what i want to know | |||
| (in the C file) | |||
| i mean i only got everything working fine when the numbers were the same | 12:25 | ||
| like when i manually changed it in the script (when it saw 'White_Space' it changed it to 'space') | |||
| then all the 'space' and "White_Space" whatever pairs were the same numbers for both the property and the property value datastructures in unicode_db.c | 12:26 | ||
| and i've noticed all the ones that i had to workaround in rakudo didn't match either. so | |||
| but i still don't know if the order matters at all, or maybe it shouldn't matter if all of them don't collide | |||
| because i've seen the same key, have a different value in the same structure. so higher up would be {"space", 21} then lower down {"space", 120} | 12:27 | ||
| jnthn | Aha, reading MVM_unicode_get_property_str in unicode_db.c is somewhat informative | 12:28 | |
| samcv | jnthn, github.com/MoarVM/MoarVM/blob/mast...ops.c#L139 | ||
| jnthn | (And int) | ||
| samcv | yeah it is | ||
| but i want to know how it gets the values aside from that. | |||
| jnthn | Oh yes, that looks familiar | 12:29 | |
| samcv | tell me what it does! | ||
| well | |||
| jnthn | heh | ||
| It makes a hash | |||
| samcv | what happens if there are multiple same keys | ||
| jnthn | By going through a table | ||
| samcv | with different values in the table | ||
| jnthn | Latest entry wins | ||
| samcv | ok so last one | ||
| jnthn | Yup | ||
| But note it goes through the table in reverse order too | |||
| samcv | so keep regenerating until all the roast tests passes? | ||
| :P | |||
| jnthn | (For no particular reason that I can tell) | ||
| samcv | ah k | 12:30 | |
| jnthn | Well yes, that's why regenerating passes things :P | ||
| But it's still because we're too liberal with regards to processing <:Foo> style things, so far as I understand. | |||
| samcv | so yeah. if we fix that. i maybe have fixed the problem | ||
| we can see i guess | |||
| let me recompile that 'fixed' i think version | 12:31 | ||
| jnthn | Yeah, my feeling is if we can fix that form to only consider general category values, script values, and boolean property names we're good. | ||
| Unless spectests rely on the previous more liberal interpretation /o\ | |||
| samcv | then they should feel bad! | 12:32 | |
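The order dependence described above (a hash filled from a keypair table, walked in reverse, where the latest entry written wins) can be sketched in a few lines of Python. This is an illustration under assumptions, not MoarVM's C code: `build_lookup` is a hypothetical name, the codes 21 and 120 for "space" come from the discussion, and the `("Ll", 7)` entry is invented filler.

```python
def build_lookup(keypairs):
    """Fill a dict from (name, code) pairs, walking the table in reverse.

    A later write replaces an earlier one, so for a duplicated name the
    entry that sits HIGHEST in the table is the one that survives.
    """
    lookup = {}
    for name, code in reversed(keypairs):
        lookup[name] = code
    return lookup


# Two generations of the "same" table that differ only in row order,
# as happens when the generator iterates over an unordered hash.
table_a = [("space", 21), ("Ll", 7), ("space", 120)]
table_b = [("space", 120), ("Ll", 7), ("space", 21)]

print(build_lookup(table_a)["space"])  # entry highest in table_a wins
print(build_lookup(table_b)["space"])  # a different winner for table_b

# Sorting the rows before emitting them makes the result reproducible,
# though not necessarily correct: the duplicate key is still there.
stable_a = build_lookup(sorted(table_a))
stable_b = build_lookup(sorted(table_b))
```

Sorting removes the run-to-run variation, which matches the "consistent breakage" observation; only removing the duplicate names, or keeping names and values in separate tables, removes the ambiguity itself.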
|
12:32
Ven joined
|
|||
| samcv | i don't *think* they do | 12:32 | |
| but we will only know once we change it I think | |||
| jnthn | Indeed. | ||
| samcv | oh yeah k | ||
| jnthn | Anyway, I'm +1 to changing that. I wonder if it's best to try and do it in the regex compilation | ||
| samcv | yeah it is fixed | ||
| even ' ' ~~ /<:space>/ works | 12:33 | ||
| jnthn | With what fix? :) | ||
| samcv | magic | ||
| let me push it to my fork | 12:34 | ||
| github.com/samcv/MoarVM/commit/960...539ffa53a1 | 12:35 | ||
| oh wait maybe not that one idk | |||
| there are two commits | |||
| it at least works in the most recent one | |||
| github.com/samcv/MoarVM/commits/working | |||
| the one just called 'a' it at least works atm. let me run the two tests which caused problems | 12:36 | ||
| through all the fiddling there were two test files that would stop passing if things went wrong | |||
| i should change that from "terrible workaround" to "amazing workaround because it works" | |||
| though it's not like. super great. but working is working | 12:37 | ||
| jnthn | :-) | ||
| True | |||
| samcv | oh yes they pass | 12:38 | |
| let me try the one before commit called 'a' | |||
| just committed quickly cause it worked… haha | |||
| i *think* the one i said was fully working maybe wasn't working so i made another commit? or both work | 12:39 | ||
| idk | |||
| either both work or the newest one works | |||
| after working on it all day and most of the time the breakage being caused by nothing but seemingly chance it gets harder to tell. but i know the change i made with the MVM_UNICODE_PROPERTY_SPACE is the only thing that didn't re-break by running it again | 12:40 | ||
| and changing other things | |||
| which is a good place to start if you can finally get it reproducible | |||
| jnthn | Indeed | 12:41 | |
| samcv | ok no it's only the most recent one 'a' that passes | ||
| where it looks like i changed it back… | 12:42 | ||
| well. it works. | 12:43 | ||
| let me see what else i changed in that | |||
| well i ended up with a different number of keypairs | 12:44 | ||
| that is 6 things smaller | |||
| oh here jnthn github.com/samcv/MoarVM/commit/2f5...cae8fL1005 | |||
| i remember now | |||
| and then once i did that, some of the things errored so i added github.com/samcv/MoarVM/commit/2f5...cae8fL1138 | 12:45 | ||
| changed this die into an if condition | |||
| jnthn | o.O | 12:51 | |
| Ugh. What a headache. :S | |||
| samcv | yes | ||
| jnthn | Oh heck, thinking about the <:Foo> code-gen again... | 12:55 | |
| op('unipropcode', $pcode, $pname), | |||
| It seems to be feeding a property value in there | 12:57 | ||
| And relying on us having polluted the property names table with property values | |||
| That is, $pname is potentially something like Ll | |||
| samcv | yeah | ||
| it is really. really not good | 12:58 | ||
| values, properties? who cares! mash em together | |||
| jnthn | Indeed :( | ||
| I wonder if we *only* rely on it in that one place | |||
| samcv | which line | ||
| well. | |||
| yeah which line :D | |||
| jnthn | The one in NQP code-gen that I referenced | ||
| jnthn finds it again :) | 12:59 | ||
| samcv | oh the one you linked? in nqp? | ||
| ah | |||
| yeah i remember that one | |||
| i thought you meant in moar | |||
| JimmyZ | jnthn: my PR 459 needs your review :) | 13:00 | |
| jnthn | github.com/perl6/nqp/blob/master/s....nqp#L1404 | 13:01 | |
| Note how it uses $pname in both lookups | |||
| samcv | also jnthn what does this merge_ins do | ||
| and what does it do with uh. these things in that list | 13:02 | ||
| like i can see what it calls but not really what it does with it | |||
| jnthn | op means "emit a MoarVM op" | ||
| samcv | oh no | ||
| oh ok | |||
| jnthn | 'unipropcode' is the instruction name | ||
| samcv | ok that makes more sense now | ||
| jnthn | $pcode, $pname are registers | 13:03 | |
| merge_ins just means "stick this array of instructions into this other array of instructions" | |||
| Like "append" in Perl 6 | |||
| Earlier in the method we do things like | 13:04 | ||
| my $pcode := $!regalloc.fresh_i(); | |||
| Which allocates a register | |||
| If you want to see the output, then write a /<:Ll>/ or so, and run nqp --target=mbc --output=x.moarvm | 13:05 | ||
| And then moar --dump x.moarvm | |||
| You can do it with perl6 instead of nqp too, but the nqp code has less clutter around the stuff you'll want to see, and we re-use the same code-gen path for both. | |||
| JimmyZ: Will try and get to that soonish :) | 13:07 | ||
| Time to make lunch; bbl :) | |||
| samcv | ok i just checked it _does_ fail something but it's only failing <:Greek> | ||
| all the other scripts work fine | |||
| idk there's something about space and greek | |||
| well greek is a block AND a uh | |||
| script | 13:08 | ||
| m: my $a = "\c[GREEK LETTER SMALL CAPITAL GAMMA]"; $a.uniprop('Script').say; $a.uniprop('Block').say | 13:09 | ||
| camelia | rakudo-moar 4724bd: OUTPUT«Greek␤Phonetic Extensions␤» | ||
| samcv | yeah it fails that regex test, but it gets those properties fine with uniprop | 13:10 | |
| so the problem isn't in nqp | |||
| butttt <:Script<Greek>> works fine | 13:12 | ||
| so i think nqp and moar problem here | |||
| thanks for that tip jnthn about using nqp instead of perl6. i was dumping the perl 6 and there were so many things | 13:16 | ||
| i gotta go to bed now. talk to you soon jnthn :) | 13:36 | ||
| nine | /win 14 | 13:42 | |
| JimmyZ | jnthn: thanks ;) | 13:52 | |
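The proposal from earlier in the discussion, to let a bare `<:Foo>` consider only boolean property names, general category values, and script values, can be sketched as a filter over every possible reading of a name. This Python sketch is an assumption-laden illustration, not NQP's or MoarVM's implementation: the alias tables are a tiny hand-picked excerpt, and `all_readings` and `restricted_readings` are hypothetical names. Script stands in here for whichever of Script or Script_Extensions ends up being chosen.

```python
# Tiny illustrative excerpt of alias data: (property, value name).
VALUE_ALIASES = [
    ("Line_Break", "Space"),
    ("Script", "Greek"),
    ("Block", "Greek"),
    ("General_Category", "Ll"),
]

# Boolean properties with the names they may be written as.
BINARY_PROPERTIES = {"White_Space": {"White_Space", "WSpace", "space"}}

# Properties whose values may be used without naming the property.
UNQUALIFIED_OK = {"General_Category", "Script"}


def fold(name):
    """Ignore case and underscores when comparing names."""
    return name.replace("_", "").lower()


def all_readings(bare):
    """Every way the bare name could be read if anything is allowed."""
    key = fold(bare)
    readings = [prop for prop, names in BINARY_PROPERTIES.items()
                if key in {fold(n) for n in names}]
    readings += [f"{prop}={value}" for prop, value in VALUE_ALIASES
                 if fold(value) == key]
    return readings


def restricted_readings(bare):
    """Keep boolean property names plus values of the allowed properties."""
    return [r for r in all_readings(bare)
            if "=" not in r or r.split("=")[0] in UNQUALIFIED_OK]


for name in ("space", "Greek", "Ll"):
    print(name, all_readings(name), restricted_readings(name))
```

With the restriction in place, "space" has a single reading (the White_Space boolean property) and "Greek" resolves to the script rather than the block; anything else would need the qualified form, as in `<:Script<Greek>>`.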
|
14:21
pyrimidine joined
15:00
brrt joined
15:03
stmuk joined
|
|||
| dalek | MoarVM/even-moar-jit: a4632df | brrt++ | / (3 files): Make linear_scan allocator public Also fix a number of minor things, and add some support for register specs. I'm not yet sure how to deal with conflicting register requirements. |
15:09 | |
| MoarVM/even-moar-jit: 842d5a7 | brrt++ | src/jit/ (2 files): Fix some of the more obvious bugs in linear_scan |
|||
| brrt | nasty bugses | 15:15 | |
|
16:12
japhb_ joined
18:14
dogbert17 joined
18:35
domidumont joined
19:27
FROGGS joined
19:55
pyrimidine joined
23:02
nebuchadnezzar joined
23:55
vendethiel joined
|
|||