»ö« Welcome to Perl 6! | perl6.org/ | evalbot usage: 'perl6: say 3;' or rakudo:, niecza:, std:, or /msg p6eval perl6: ... | irclog: irc.perl6.org/ | UTF-8 is our friend!
Set by sorear on 4 February 2011.
sorear good * #perl6 02:07
phenny sorear: 17 May 07:32Z <moritz> tell sorear that t/spec/S05-grammar/action-stubs.t passes 13 tests and fails a few, might be worth looking at
sorear hello lateau 02:38
lateau goodmorning 02:39
sorear poll for native Japanese users: What do you feel the value of "ぎょ".chars should be? +[ "ア" .. "ン" ] ? 02:51
lestrrat I don't really get what str.chars is supposed to do :/ 02:54
length of the string?
PerlJam depends on context as always :)
PerlJam but, in item context it should be the number of chars in the string 02:55
lestrrat "ぎょ".chars == 2 ?
PerlJam rakudo: "ぎょ".chars == 2 02:56
p6eval rakudo 5f1bf6: ( no output )
PerlJam oops
rakudo: say "ぎょ".chars == 2
p6eval rakudo 5f1bf6: OUTPUT«Bool::True␤»
lestrrat seems right.
takesako i agree
lestrrat so now I have to wonder what sorear's reference to '+[ "ア" .. "ン" ]' is supposed to mean 02:57
takesako rakudo: say "き゚ょ".chars 03:00
p6eval rakudo 5f1bf6: OUTPUT«3␤»
tokuhirom rakudo: say "よ。".chars 03:01
p6eval rakudo 5f1bf6: OUTPUT«2␤»
sorear lestrrat: .chars is supposed to do the most useful thing in terms of the selected language 03:02
Yappoko__ rakudo: say "み。".chars 03:02
sorear lestrrat: .graphs has a very specific technical definition (it counts Unicode code points, except for zero-width combining characters)
p6eval rakudo 5f1bf6: OUTPUT«2␤»
sorear every kana or kanji counts for exactly 1 in .graphs 03:03
so far, every language I've checked out has had .chars matching up with .graphs
takesako Unicode BOM must be 0 ? 03:04
lestrrat rakudo: say "ぎょ".graphs 03:05
p6eval rakudo 5f1bf6: OUTPUT«Method 'graphs' not found for invocant of class 'Str'␤ in main program body at line 22:/tmp/AaZuOpXKss␤»
lestrrat grr
sorear I think BOM shouldn't even show up in Unicode strings
but if it does, it should be 0
takesako oh, now i see what you mean 03:07
sorear How often do you look at "ぎょ" and see a single, indivisible thing? 03:08
lestrrat without knowing the details of unicode and such, my initial reaction is that it should always be 2. 03:09
sorear rakudo: say "ア" .. "ン" # lestrrat: should this be changed to return 46 results?
p6eval rakudo 5f1bf6:
..OUTPUT«(timeout)アアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアアア􏿽xE3
sorear lestrrat: the point of .chars et al is that we want users, especially non-English users, to not have to think about Unicode at all 03:10
rakudo: say map *.chr, "ア".ord .. "ン".ord
p6eval rakudo 5f1bf6: OUTPUT«アィイゥウェエォオカガキギクグケゲコゴサザシジスズセゼソゾタダチヂッツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモャヤュユョヨラリルレロヮワヰヱヲン␤»
takesako www.fileformat.info/info/unicode/ch.../index.htm 03:11
tokuhirom where is "ァ"...
takesako Unicode Character 'COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK' (U+3099)
sorear takesako: one before the beginning of that list
the .. operator on strings is another place where Perl 6 is supposed to be sensitive to natural language 03:12
it knows that some alphabets have a natural order
rakudo: say [ 'a' .. 'z' ].elems
p6eval rakudo 5f1bf6: OUTPUT«26␤»
sorear rakudo: say 'α'..'ω' 03:13
p6eval rakudo 5f1bf6: OUTPUT«αβγδεζηθικλμνξοπρστυφχψω␤»
sorear rakudo: say map *.chr, 'α'.ord .. 'ω'.ord
p6eval rakudo 5f1bf6: OUTPUT«αβγδεζηθικλμνξοπρςστυφχψω␤»
takesako oh
sorear note that the .. form is smart and skips over 'ς' (a special form of σ used at the end of words) 03:14
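For readers following along outside Perl 6, here is a minimal Python sketch of the "map *.chr over .ord .. .ord" form used above. It only walks raw code points, so it shows why ς and the small kana show up in that form; it does not reproduce the language-aware `..` behaviour. The helper name `codepoint_range` is made up for this illustration.

```python
def codepoint_range(first: str, last: str) -> list[str]:
    """Return every character whose code point lies between first and last, inclusive.

    This is the naive code-point walk, not a language-aware alphabet range.
    """
    return [chr(c) for c in range(ord(first), ord(last) + 1)]


greek = codepoint_range("α", "ω")
# The raw walk includes final sigma (ς), which sits between ρ and σ in Unicode.
greek_without_final_sigma = [ch for ch in greek if ch != "ς"]

katakana = codepoint_range("ア", "ン")
# Small ァ is the code point just before ア, so it falls outside this range.
small_a_is_just_before = ord("ァ") == ord("ア") - 1

if __name__ == "__main__":
    print("".join(greek))
    print("".join(greek_without_final_sigma))
    print("".join(katakana))
```

A range that is meant to follow a natural alphabet needs extra knowledge (for example, a list of letters to skip) on top of this walk.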
takesako Windows .NET String.Length property returns the number of Char objects in this instance, not the number of Unicode characters. 03:18
sorear yes, it's horrible 03:19
that should be .bytes('utf16') / 2
.codes should count Unicode characters 03:20
.graphs counts non-combining characters
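The three counts being contrasted here can be sketched in Python. This is an illustration under stated assumptions, not the Perl 6 spec: the function names are made up, and `approx_graphs` follows only the rough "skip combining characters" description given above, using `unicodedata.combining`. It is not full grapheme cluster segmentation.

```python
import unicodedata


def utf16_units(s: str) -> int:
    """Number of UTF-16 code units, which is what .NET's String.Length reports."""
    return len(s.encode("utf-16-le")) // 2


def codes(s: str) -> int:
    """Number of Unicode code points (Python's len on str already counts these)."""
    return len(s)


def approx_graphs(s: str) -> int:
    """Rough grapheme count: code points with a zero canonical combining class.

    Assumption: this approximates 'non-combining characters' only; marks with
    combining class 0 and other cluster rules are not handled.
    """
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


if __name__ == "__main__":
    samples = {
        "gyo": "\u304e\u3087",                     # ぎょ, two precomposed kana
        "ki+handakuten+yo": "\u304d\u309a\u3087",  # き + combining U+309A + ょ
        "e+acute": "e\u0301",                      # e + combining acute accent
        "emoji": "\U0001F600",                     # one code point outside the BMP
    }
    for name, text in samples.items():
        print(name, utf16_units(text), codes(text), approx_graphs(text))
```

The three-code-point kana sample matches the 3 that Rakudo printed for .chars earlier, consistent with the remark that Rakudo's .chars currently behaves like .codes.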
rakudo: say "とうきょう".comb.perl # Would anyone expect this to produce 4 elements? 2? 03:21
p6eval rakudo 5f1bf6: OUTPUT«("と", "う", "き", "ょ", "う")␤»
takesako rakudo: say "㍑".graphs 03:23
p6eval rakudo 5f1bf6: OUTPUT«Method 'graphs' not found for invocant of class 'Str'␤ in main program body at line 22:/tmp/mB21cexQkp␤»
sorear that is an interesting character! 03:24
but it will come up 1 eventually
what Rakudo calls .chars today is what the spec calls .codes
takesako www.drk7.jp/MT/drk/images/090615/img01.jpg 03:27
lateau rakudo: say "㌶".ord 03:28
p6eval rakudo 5f1bf6: OUTPUT«13110␤»
sorear takesako: what do you use that symbol for? 03:30
is it some kind of purist replacement for 七L ?
takesako unicode.org/reports/tr15/ 03:38
TimToady sorear: .chars always means .graphs by default, and cannot have language-specific meanings in the absence of a specific declaration either setting the language or saying where the language is specified 03:47
'course, in rakudo .chars always means .codes currently... 03:48
.graphs tries to be all things to all languages, to the extent that is possible, but it pays no attention to the current language 03:49
unicode seems to call ㌶ "square hectares", but that's a silly name for something that's already an area measurement 03:52
PerlJam maybe square hectares are really cubic? :)
TimToady maybe rectangular hectares are a different symbol :) 03:53
square hectares would have to be 4 dimensional, meseemeth
sorear what do square Roentgens measure? 03:55
takesako rakudo: say "㍉㌔㌢㍍㌘㌧㌃㌶㍑㍗㌍㌦㌣㌫㍊㌻㎜㎝㎞㎎㎏㏄㎡㍻№㏍℡㈱㈲㈹㍾㍽㍼".comb(/./)>>.ord.join(",") 03:56
p6eval rakudo 5f1bf6: OUTPUT«13129,13076,13090,13133,13080,13095,13059,13110,13137,13143,13069,13094,13091,13099,13130,13115,13212,13213,13214,13198,13199,13252,13217,13179,8470,13261,8481,12849,12850,12857,13182,13181,13180␤»
PerlJam wonders what a round hectare looks like 03:58
TimToady I think the "square" part of it merely means it fits into a character cell... 04:07
TimToady except my character cells never come out quite square, alas... 04:09
perigrin TimToady: shoddy workmanship, find a better contractor and they'll make sure the cells come out square. 04:30
sorear just spent 15 minutes trying to find 04:48
whether :x($y) allows passing by the name $y 04:49
dalek ecza: d2e6f22 | sorear++ | src/ (2 files):
mergeback; allow passing positional parameters by name 04:53
cotto Who all in here wants Parrot's Select dynpmc to become a thing? pmichaud mentioned that some Rakudo folks wanted to see it, but I haven't had a chance to ask him who. 05:20
sorear cotto: a select op would also be an option 05:24
cotto sorear, it would. We already have a dynpmc, but I'm not a fan of the interface.
sorear cotto: rakudo can't generally use dynpmcs as is (limitations in HLL mapping?); there will be a wrapper regardless of what you offer
cotto sorear, sure. I would like to know what would require the least wrapping. 05:25
i.e. what'd be a nice natural interface
sorear do we want the OS dependencies to exist at the Parrot level or the Rakudo level?
do we want bare bindings to select, poll, epoll, kqueue, etc, or a single unified AIO system? 05:26
cotto That sounds like a Rakudo decision.
but it's also important to figure out what Parrot wants to provide.
sorear right; I'm sort of asking everyone 05:27
sorear it might make sense to offer both low-level bindings and a MultiIO.nqp library 05:28
that way HLLs that need to offer select can do so 05:29
OTOH, I've been told that Parrot has no interest in providing direct POSIX bindings
cotto I don't remember such a discussion, but it sounds plausible. 05:30
maybe I do.
The nice thing about Select as a dynpmc is that we have code for it now. 05:32
moritz good morning 05:47
sorear hi moritz 05:53
sorear cognominal: When the Perl 6 text operators are set to French mode, are there any groups of letters that substr() should treat as a unit and not split up? 06:18
Su-Shee .oO(I've just parsed "french maid mode".. need more coffee...) 06:45
good morning everyone.
sorear hi Su-Shee. 06:47
it seems TimToady was mostly thinking of German ss and Spanish ch/ll when he created chars-mode; I am increasingly wondering if there are *any* languages that are supported by Unicode, but have use for chars-mode 06:48
do we have any expert users of Devanagari and its geographic kin here?
how about Semitic scripts?
tadzik good morning Su-Shee 06:50
and everyone :)
moritz sorear: maybe try p6l, it has a wider circle of readers than #perl6 07:17
cognominal sorear, I am not sure I understand the question. I don't know much about Unicode. I suppose that if one can somehow get e' for é as a normal form, this is the kind of atomic group of letters you are talking about. 08:04
But then, how to deal with the string 'e' ? 08:05
does my answer make any sense to you?
maybe if you point me to some documentation, I will be able to make up my mind or do further research. 08:06
Su-Shee, you are reading too many of the papers about DSK :) 08:07
rakudo: say "é".chars 08:09
p6eval rakudo 5f1bf6: OUTPUT«1␤»
cognominal rakudo : say "Dès Noël où un zéphyr haï me vêt de glaçons würmiens, je dîne d’exquis rôtis de bœuf au kir à l’aÿ d’âge mûr & cætera !".chars 08:10
rakudo: say "Dès Noël où un zéphyr haï me vêt de glaçons würmiens, je dîne d’exquis rôtis de bœuf au kir à l’aÿ d’âge mûr & cætera !".chars 08:11
p6eval rakudo 5f1bf6: OUTPUT«119␤»
cognominal hmm, 119 sounds like too many.
moritz rakudo: say 'bœuf'.chars 08:12
p6eval rakudo 5f1bf6: OUTPUT«4␤»
cognominal The sentence is a pangram that contains all the accented characters used in French. 08:13
well, that's the definition of a pangram.
To be fair, the definition of a pangram is silent about the presence of ligatures and accented characters. 08:15
Also, there are ligatures that are optional, as in words with ff 08:16
moritz cognominal: I understand sorear's question differently: are there any ligatures which are written as two characters, but treated as one in French? 08:17
cognominal Many people write boeuf or caetera because they don't know how to type a ligature on a keyboard. But I do see these ligatures in the keyboard visualizer when holding the ⌥ key on my mac. Anyway, they are so rare that most people ignore them. 08:28
moritz but there are also cases where 'eu' doesn't stand for the ligature, right? 08:29
cognominal indeed. Like in "il a eu peur" 08:32
moritz phenny: fr "eu"? 08:33
phenny moritz: "had" (fr to en, translate.google.com)
moritz had his last french lessons a decade ago
considering that I had basically no practice, it's amazing how much written French text I can still read 08:34
cognominal But the ligature in bœuf is in oe.
moritz oh
(listening to French is no good, those people just always talk way too fast)
cognominal I will check antidote, maybe the ligature is optional now. 08:35
I feel antidote is the best French dictionary; it is made in Québec and available on mac, pc and iDevices 08:37
moritz 'oe' as non-ligature also exists
moritz coexister 08:37
cognominal yes, and antidote says indeed "when the sequence o e represents more than one sound, there is no ligature" 08:40
for the ligature æ, antidote says it is up to the user to do it or not. 08:41
personally, I had never before seen the form "curriculum vitæ" mentioned by antidote. 08:43
moritz "curriculum vitæ" looks like somebody being overly smug :-)
cognominal indeed. That's from Latin and I don't think there ligature in Latin. 08:44
*there were ligatures
cognominal I am not an expert in typography, but Roman letters were the equivalent of modern upper case, I think. Now, it is possible to do the two French ligatures in upper case, but I think it is done more rarely. 08:50
cognominal antidote is apparently silent on that subject even if it talks on a related one, accents for upper case letters 08:53
arnsholt cognominal: IIRC monumental inscriptions in Latin will use the AE ligature (think I saw that in Rome) 11:17
cognominal arnsholt: interesting
arnsholt sorear: I can read and write Devanagari 11:18
Know a bit about Arabic script, but not much more than you can learn from Wikipedia
cognominal indeed, en.wikipedia.org/wiki/%C3%86 talks about the AE ligature in classical Latin 11:19
ggoebel www.google-melange.com/gsoc/proposa...cian1900/1 13:02
python3on parrot...
> use 6model. It's full of unknowns and I'm not convinced it can map to Python's object system cleanly and efficiently,
moritz ggoebel: you forgot to quote the rest of the sentence 13:03
"either use 6model, or..." 13:04
moritz ggoebel: which I understand as "evaluate the two alternatives, and use whatever fits better" 13:04
pmichaud good morning, #perl6 13:35
moritz o/ 13:36
takadonet pmichaud: morning
moritz tadzik: FYI I'll be on a conference next week (Sunday - Thursday), and likely have poor connectivity. If you need help with gsoc-y decisions, please defer to masak 13:37
tadzik: you can also /msg me, I'll read it when I get internet access 13:38
pmichaud decides to rename "rakbench" to "rpmark" 13:50
PerlJam rpm ark? rp mark? r pm ark? :) 13:51
pmichaud "Rakudo/Parrot benchmarks"
and yes, there are other possibilities for it as well :)
moritz use rpbench instead
pmichaud yeah, I thought of that too. 13:52
moritz "bench" is easier to map to "benchmark" than "mark"
pmichaud well, "mark" is what we produce. :-)
isBEKaml r pm ark, sounds a lot gruff. :-)
har har har
pmichaud also, I want to preserve the possibility that it'll be used for more than just benchmarking Rakudo :) 13:53
"rp" gives a few more options
ggoebel moritz: yes... quoted slightly out of context. I hope 6model fits best. Purpose of mentioning it was to try to connect jnth to lucian if that connection doesn't already exist
pmichaud I'll decide between rpmark and rpbench in a bit. votes welcome :)
bbiab
isBEKaml speaking about benchmarks, I vaguely remember util/colomon working on their own benchmarks. Is that still on? 13:56
colomon: ^^
tadzik moritz: acknowledged, thanks 14:00
pmichaud isBEKaml: yes, those are in perl6/bench-scripts 14:12
I'm copying many of those scripts (as appropriate) into rpmark
pmichaud still on fence between "rpmark" and "rpbench" (or even something else entirely) 14:13
JimmyZ_ +1 to rpbench 14:14
pmichaud maybe I should call them "onions"... because so far when I open them up, they make me cry. 1/2 :-)
JimmyZ_ I don't know what mark means
isBEKaml pmichaud: oh, thanks. 14:15
pmichaud: +1 to rpbench 14:17
JimmyZ_: mark as in bench_mark_
:)
JimmyZ_ isBEKaml: yes, but it does not give me that impression 14:19
isBEKaml JimmyZ_: yes, I guess that's part of the reason we both voted for rpbench. :) 14:21
JimmyZ_ yeah
PerlJam pmichaud: I vote rpbench btw. Makes me think of "bench testing" and such 14:23
pmichaud wfm
speed.pypy.org uses geometric mean (of a suite of tests) to compare pypy against other python implementations. is geometric mean a good calculation to use for a "composite index" like this? 14:32
i.e., if I wanted to come up with an overall comparison of one build versus another... would geometric mean of the benchmark times of each work? 14:33
PerlJam I looked at speed.pypy.org and really I have no idea what their overall speed score really *means* 14:35
colomon my first naive thought would be arithmetic mean of the percentage difference? I'm not sure why geometric mean might be preferred.
PerlJam It didn't look like it was tied in any way to the importance of the feature being tested
colomon +1
pmichaud I'm pretty sure arithmetic mean of percentage won't be quite right
PerlJam: my guess on speed.pypy.org is this: consider the first two benchmarks -- they have speeds that are .65 and .07 of cpython 14:36
so to find "how much better is pypy than cpython", report the geometric mean of the two
although since geometric means are never greater than arithmetic means... there might be a bias there 14:38
PerlJam pmichaud: but better at *what*? It presupposes that the benchmarks adequately measure the speed of "important" things and doesn't unduly consider the speed of "unimportant" things 14:39
pmichaud PerlJam: that's outside the scope of what I'm trying to do
colomon "The geometric mean is more appropriate than the arithmetic mean for describing proportional growth..." sounds like an argument for percentages and geometric mean.
pmichaud the number just needs to say "on this set of benchmarks, build A did X factor better than build B" 14:40
I'm not trying to make larger claims about the overall improvement
pmichaud I'll go with geometric mean of the percentages for now 14:42
thanks
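The composite index discussed above can be sketched as follows. This is a minimal illustration, assuming each benchmark yields one positive time per build; the function names and the sample timings are made up for the example and are not taken from rpbench or speed.pypy.org.

```python
import math


def geometric_mean(ratios: list[float]) -> float:
    """Geometric mean of positive ratios, computed via logs to avoid overflow."""
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def composite_index(times_a: dict[str, float], times_b: dict[str, float]) -> float:
    """Geometric mean of per-benchmark time ratios (build A / build B).

    Only benchmarks present in both builds are compared. A result below 1.0
    means build A took less time overall on this set of benchmarks.
    """
    shared = sorted(set(times_a) & set(times_b))
    return geometric_mean([times_a[name] / times_b[name] for name in shared])


if __name__ == "__main__":
    # The two relative speeds quoted from speed.pypy.org in the discussion.
    quoted = [0.65, 0.07]
    print("geometric mean: ", geometric_mean(quoted))
    print("arithmetic mean:", sum(quoted) / len(quoted))

    # Hypothetical timings in seconds for two builds.
    build_a = {"loop": 2.0, "strings": 3.0}
    build_b = {"loop": 4.0, "strings": 1.5}
    print("composite A/B:  ", composite_index(build_a, build_b))
```

One property that favours the geometric mean for ratios: a benchmark that gets twice as fast and another that gets twice as slow cancel out to 1.0, whereas the arithmetic mean of 0.5 and 2.0 is 1.25.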
jdv79 will these benchmarks be posted somewhere? 16:23
sorear moritz: to get a useful response on p6l I would need to be much clearer, because it's not as easy to go back and forth... 16:36
moritz sorear: what you want is two languages, one treating the same sequence of characters as one character, and the other treating them as distinct characters. Correct? 16:58
sorear moritz: basically, yes 17:15
sorear moritz: really what I'm after is cases where a grapheme-level .comb, .length, or .substr goes wrong 17:16
The Unicode standard itself special-cases Korean, so Korean is definitely not such a language
sorear observation: Rakudo 2011.05 still doesn't have a name 17:21
Su-Shee "George-Henry Miller-Whipplesteen"?
sorear Su-Shee: what county is George-Henry Miller-Whipplesteen.pm in? 17:22
Su-Shee *hehe* Absurdistan ;) 17:23
tadzik shame there's no masak around, we may be able to go for Zebras.pm then ;)
sorear wonders what ey needs to do to get niecza supported by rpmark 17:24
moritz sorear: I guess test scripts that run both with rakudo and niecza 17:31
arnsholt sorear: Ping? 17:51
arnsholt phenny: tell sorear I can read Devanagari (of the Sanskrit variety, slightly different rules than the Hindi one) and am Wikipedia-familiar with Arabic script. What do you need to know? 17:57
phenny arnsholt: I'll pass that on when sorear is around.
flussence ooh, someone's making a bash script parsing library for GSoC. The fun I could have with that... 20:32
sbp link?
flussence dev.gentoo.org/~qiaomuf/libbash.html 20:33
sbp thanks!
moritz what's wrong with the parser in bash? 20:40
bash has the -n option for syntax checking, so it can surely parse without executing 20:41
flussence I think the main reason for this as a library is for embedding in gentoo's package management stuff, because it saves a few hundred forks that way 20:43
(all the packages are bash scripts, more or less)
moritz and it only needs to parse (and not execute) them= 20:46
s/=/?/
flussence IIRC, the package format spec says external commands are only allowed in a few specific places, all the metadata should be plain bash syntax 20:47
(not that people always follow the spec...) 20:48
moritz that's close to the borderline between genius and insanity, or something
flussence nah, it's a few light years from that line :) 20:49
moritz on which side? :-)
flussence well, people have forked the distro more than once over the state of this stuff, so... :) 20:50
tylercurtis o/, #perl6 21:06
moritz \o 21:08
Util \o 21:10
pmichaud Latest benchmark results (with updated toolset): gist.github.com/979705 22:12
PerlJam pmichaud: smaller numbers are better? 22:35
flussence those are times, so yes 22:36
PerlJam just checking
whiteknight is it possible to print some kind of status information to the console during the core.pm build? 23:12
like, the name of each function as it is compiled
or anything that would show forward progress?
pmichaud we'd have to add something to nqp 23:13
er, nqp-rx
whiteknight does Rakudo use the version of parrot-nqp that's bundled with Parrot?
pmichaud yes
whiteknight ok
pmichaud benchmark results for plum: gist.github.com/979705 23:14
whiteknight I may try to add something to the perl6 actions file, to print out something when we compile a function
if I can find a suitable place to add such a beast