svn switch --relocate svn.openfoundry.org/pugs svn.pugscode.org/pugs/ | run.pugscode.org | spec.pugscode.org | paste: sial.org/pbot/perl6 | pugs.blogs.com | dev.pugscode.org/
Set by putter on 11 February 2007.
00:07 IllvilJa joined
[M]erk What does the darker green mean vs. lighter green in the smokes summaries? 00:42
00:42 shay joined 00:46 weinig is now known as weinig|away 00:52 sunnavy joined 01:03 TimToady_ joined
svnbot6 r15266 | lwall++ | infix and postfix now more predictive 01:13
01:28 TimToady_ is now known as TimToady 01:29 gaal joined
TimToady [M]erk: darker green is actually a Todo 01:29
ought to be purple or some such 01:30
01:30 jisom_ joined 01:39 bonesss joined 01:44 weinig|away is now known as weinig 01:49 cmarcelo joined
audreyt Grrrr: congratulations on DBIx::Perlish. It's simply brilliant. :-) 01:55
cmarcelo Trac is on feather? if so, where are its config files? 01:57
(moose)
audreyt cmarcelo: /data/svn/trac/ 01:58
wolverian it's also very scary. :)
cmarcelo feels that pugs has too many sites =|
i'll try to connect them to each other at least... 01:59
audreyt *nod*
you're now TRAC_ADMIN
feel free to hand out admin bits as needed (from the Admin/permissions page)
cmarcelo audreyt: (feeling better?)
k 02:00
audreyt no, not really, can't focus for more than 15min at a time
highly annoying
cmarcelo and the "saving throw"-thing ?
audreyt it looks like I passed
a small chance I can leave hospital tomorrow 02:01
otherwise I'll stay for another week or so
but either way this will pass
and it looks like no complication will follow
cmarcelo @tell putter Do you have a blog bit? What do you think about posting on project status, etc?
lambdabot Consider it noted.
audreyt I guess I should be grateful :)
but it's still highly annoying :) 02:02
cmarcelo audreyt: great news :) [except for annoyances now/next week]
02:04 justatheory joined
audreyt yeah :) *faints some more* 02:04
araujo hospital? 02:05
cmarcelo the family of sites is: pugscode.org, feather, dev.pugscode.org, rakudo-wiki, perl.org/p6... (shout if I missed something) 02:06
blog.p.o and irc.p.o too 02:07
audreyt spec.pugscode.org 02:09
irc.pugscode.org
run.pugscode.org
invite.pugscode.org 02:10
smoke.pugscode.org
that's it I guess
cmarcelo rakudo is "Perl 6" oriented and dev wiki is "Pugs" oriented? 02:12
audreyt or rather, rakudo is user-facing
while dev I hope is dev-facing
i.e. it more closely integrates svn/tickets/irc
while rakudo is more about presenting userland info 02:13
cmarcelo perl.org/perl6 is parrot-oriented?
audreyt it is history-oriented :) 02:14
cmarcelo k
02:19 dmq joined
cmarcelo Pugs development takes place on the perl6-compiler mailing list. 02:21
:o)
(quotation from pugscode.org)
TimToady we just fixed that on the wiki 02:22
cmarcelo TimToady: by fixed you mean? (which wiki? rakudo?) 02:24
TimToady dev.pugscode.org/wiki/AboutPugs 02:27
lambdabot Title: AboutPugs - Pugs - Trac
cmarcelo tks. I'll borrow the fix then.. 02:28
TimToady mostly just delete that paragraph...
cmarcelo hmm, but Pugs::Doc::Hack points to old (but beautiful) CPAN version... 02:29
I think it's better to make "How to get involved" point to dev.pugscode.org 02:31
[M]erk Which is the pugs mailing list then? It doesn't look like perl6-compiler is all that active. And perl6-internals seems to be mostly parrot, right? 02:55
cmarcelo [M]erk: pugs discussions are mainly at this IRC channel. perl6-compiler is seldom used... 02:57
svnbot6 r15267 | cmarcelo++ | * feather index: add links to other sister sites, remove kwiki links. 03:00
r15267 | cmarcelo++ | * pugscode.org: add more links, cleanup a bit.
r15267 | cmarcelo++ | * Pugs/Doc/Hack.pod: fix some links.
[M]erk PUGS - the project that email is too asynchronous for! PUGS - the project that is living in the now! PUGS - the project that is...
cmarcelo [M]erk: but we have a Trac system now => dev.pugscode.org ;) 03:01
(dev.pugscode.org/changeset/15267 => peer-review welcome) 03:02
lambdabot Title: Changeset 15267 - Pugs - Trac 03:03
[M]erk gasps.
You know that's a...
[M]erk whispers, "Python project" :p
cmarcelo s/Trac/development wiki/ 03:04
=P
[M]erk Can someone give me svn commit privileges? 03:06
SamB [M]erk: shh
they'll hear you!
and give you a commit bit!
lumi They? 03:07
[M]erk: Got an email address? You can /msg me 03:09
svnbot6 r15268 | cmarcelo++ | * pugscode.org: fix a linebreak. 03:12
lumi [M]erk: Commit bit sent, welcome to Pugs! 03:14
[M]erk Thanks.
lumi [M]erk: You can test commit by adding your name to AUTHORS
It's a tradition, or an old charter, or something :) 03:15
SamB there ought to be a PURPORTED-AUTHORS file for that ;-P 03:23
04:45 ilogger2 joined 05:08 shay joined 06:02 autark_ joined 06:09 devogon joined 06:13 BooK joined 06:24 amnesiac joined 06:51 sunnavy joined, jamhed joined
svnbot6 r15269 | lwall++ | Constraints on which operators metaoperators can metaoperate on. 07:22
07:26 iblechbot joined 07:28 marmic joined 07:45 kanru joined 07:54 VanilleBert joined
Grrrr audreyt: thanks :-) 07:58
08:19 keigo joined 08:49 kanru joined 08:56 baest joined 08:57 baest_ joined 08:58 nekokak joined 09:13 VanilleBert left 09:21 UWC joined 09:25 andara joined 09:36 lumi_ joined 09:40 iblechbot joined
dduncan So if Grrrr is indeed Anton Berezin ... I like the idea of what I saw in DBIx::Perlish ... and am hoping that you may be able to do something similar for my non-SQL database, currently called QDRDBMS, after it is released. 09:57
For simplicity, my own DBMS takes certain Perl objects, holding an AST, as input, but it might be nice to have an even more Perlish interface as an optional extension, such as like you did with DBIx::Perlish. 09:58
Talk more later ... 09:59
09:59 pfarmer joined 10:06 elmex joined
Grrrr dduncan: I don't see a problem; parsing optree is relatively easy, thanks to B.pm; the difficult part is to come up with a sensible subset of Perl5 syntax to support 10:31
dduncan fortunately, the language of my DBMS is a lot easier to map Perl to than SQL is, given partly that its design is more Perlish 10:38
or should I say, more like a normal programming language
10:42 lichtkind joined 10:47 ruoso joined 11:06 zgh_ joined
Aankhen`` Wow. Just built Pugs again after a year (I think!), and `nmake fast` really is *fast*. 11:47
I suppose it could also be the completely different system. :-P 11:48
12:05 koye joined
ruoso @seen fglock 12:06
lambdabot I saw fglock leaving #perl6 5d 17h 54m 48s ago, and .
ruoso hmm... that's unusual...
12:07 chris2 joined 12:08 ofer1 joined 12:18 gaal joined 12:23 TimToady joined 12:25 rfordinal joined 12:34 lichtkind_ joined
baest_ ruoso: not sure, but I think he has a vacation 13:06
13:14 buetow joined 13:21 Limbic_Region joined 13:32 baest_ is now known as baest 13:42 andara joined 13:59 rfordinal_ joined 14:00 bonesss joined 14:14 pmurias joined, thepler joined
pmurias hi 14:14
moritz hi ;) 14:15
pmurias in the mmd algorithm, is the type narrowness dependent on the parameter under consideration 14:16
?
moritz: is saying "hi" before asking a question silly? 14:17
moritz pmurias: no, it's not ;) 14:18
pmurias is thinking how to efficiently implement mmd 14:24
14:24 bonesss is now known as bones`eat 14:29 diakopter joined 14:36 rfordinal_ is now known as rfordinal
pmurias do other languages with MMD allow you to specify the importance of parameters with semi-colons? (not necessarily with the same syntax) 14:40
?
14:42 rindolf joined 14:54 amnesiac joined 14:58 vel joined
Coke_ I don't think parrot works that way, fwiw. 15:00
15:12 iblechbot joined
[particle] i know some languages dispatch based on first one or two params only, but none that use a qualifier to denote dispatch semantics 15:20
however, my knowledge of languages with polymetric polymorphism (aka mmd) semantics is incomplete
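As an illustration of the "dispatch on the first parameter only" style [particle] mentions, here is a minimal sketch using Python's standard `functools.singledispatch`. The function name `describe` and its parameters are made up for the example; this is not how Pugs or Parrot implement MMD.

```python
from functools import singledispatch


@singledispatch
def describe(first, second):
    # Fallback: used when no implementation is registered for type(first).
    return "generic"


@describe.register
def _(first: int, second):
    # Chosen whenever the FIRST argument is an int; `second` is ignored
    # for dispatch purposes.
    return "int first"


@describe.register
def _(first: str, second):
    # Chosen whenever the FIRST argument is a str.
    return "str first"


results = [describe(1, "x"), describe("a", 2), describe(1.5, 2)]
```

Only the type of the first argument takes part in selecting the implementation, which is the restriction that full multiple dispatch lifts.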
pmurias thanks 15:22
15:23 sunnavy joined 15:24 bones`eat is now known as bonesss 15:30 GeJ joined 15:37 ofer1 joined 15:45 andara joined 17:11 rfordinal_ joined 17:17 cjeris joined 17:23 buetow joined 17:46 andara left 17:48 dmq joined
svnbot6 r15270 | lwall++ | Constraint checks on parameter zones 17:48
r15270 | lwall++ | Constraint checks on reducable infixes
r15270 | lwall++ | Random cleanup
dmq just thought id mention that abigail has successfully converted the BNF for email addresses to a perl5.10 recursive regex. 17:49
pasteling "dmq" at 84.58.61.90 pasted "rfc compliant regex parser as a perl 5.10 recursive regex" (69 lines, 2.6K) at sial.org/pbot/23004 17:50
broquaint Brilliant. 17:51
That's some nice work you've done on the regex engine there, dmq. 17:52
dmq he said on #p5p he will look into writing a BNF to regex converter.
glad you like it broquaint. 17:53
btw, long time no chat. hope you are well
broquaint I'm well, yourself? 17:54
dmq not too bad.
broquaint All is well :) 17:56
17:58 xinming joined
[particle] great example, dmq 18:01
i'd like to see a p6regex -> p5regex converter, and attribute handlers as the syntax marker
dmq yeah, it is cool. Abigail has been doing some testing of the new features. 18:02
[particle] he's certainly qualified :)
dmq heh
did i say testing?
torturing.
[particle] abigail++
dmq anyway, hopefully ive put most of what is required to write such a converter in place. 18:03
assuming you ignore code stuff in p6.
[particle] fglock will be most happy, i imagine 18:04
dmq heh.
until he finds the stuff that p5 does that p6 doesnt ;-)
(?|...) comes to mind. 18:05
although maybe p6 does already have something like that. (it makes capture buffers in different alternations share the same indexes) 18:06
[particle] i don't see that construct in perlre. ew to 5.10?
*new
dmq yes. a couple of weeks old i guess. 18:07
[particle] ah
dmq (?|..(foo)..|..(bar)..) both capture into the same buffer.
H. Merijn Brand's idea. 18:08
[particle] $<buffer>:=[(foo)|(bar)] i imagine
18:08 araujo joined
dmq heh. figured p6 would cover it somehow. 18:08
18:10 justatheory joined 18:12 gilimanjaro joined 18:17 apostols joined 18:25 gilimanjaro joined
Coke_ f 18:30
(oops)
TimToady dmq: (?|...) is the standard behavior for alternations under P6 18:36
S05: The index of a given subpattern can always be statically determined, but 18:37
is not necessarily unique nor always monotonic. The numbering of subpatterns
restarts in each lexical scope (either a regex, a subpattern, or the
branch of an alternation).
wolverian argh, the lack of covariance in java is making me insane. can someone please just destroy this horrid crap. 18:40
thanks, now I feel better.
Patterner Wait for Java 8 18:44
18:51 DebolazX joined
wolverian I particularly like how it does support return type covariance, but not parameter covariance, making the other practically useless (for my purposes, anyway) 18:53
the return type covariance needs to be explicit, too. sigh. 18:54
18:54 apostols left 18:59 drupek12167 joined 19:03 UWC joined
dmq timtoady: oh goodie, then fglock WILL be happy about (?|...) 19:05
what happens when there is a different number in one of the alternations? 19:06
(?|..(foo)...(foo)...|..(bar)..)(baz)
what number will the baz buffer have?
i made it be $3
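The Perl 5.10 numbering dmq describes can be sketched as a small calculation: groups after a `(?|...)` are numbered as if the construct contained only its branch with the most captures. The helper name `group_after_branch_reset` is hypothetical and only models the numbering rule; it does not parse regexes.

```python
def group_after_branch_reset(captures_per_branch, groups_before=0):
    """Return the number of the first capture group following a (?|...).

    captures_per_branch: how many capture groups each alternative contains.
    groups_before: capture groups that appear before the construct.
    """
    # Every branch restarts at the same index, so the construct as a whole
    # consumes as many numbers as its widest branch.
    widest = max(captures_per_branch, default=0)
    return groups_before + widest + 1


# dmq's pattern (?|..(foo)...(foo)...|..(bar)..)(baz):
# the first branch has two captures, the second has one.
baz_group = group_after_branch_reset([2, 1])
```

For the pattern above this gives 3, matching dmq's "$3" for the `(baz)` buffer in Perl 5.10.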
Coke_ java 8 or Java 1.8? =-)
Coke_ suggests being all hip and having Perl 6 actually be perl 5.24 19:07
dmq btw, sorry for the lag timtoady.
19:15 cddar joined, Caelum joined 19:16 wilx` joined 19:37 bernhard joined 19:41 larsen_ joined 19:48 wilx` is now known as wilx 19:54 rindolf joined
rindolf Hi all. 19:54
moritz re rindolf ;) 19:56
rindolf Hi moritz 19:57
moritz: what's up?
moritz rindolf: I'm fine, just had a bunch of pancakes as supper ;)) 19:59
moritz feels fat ;)
rindolf: but regarding perl6/pugs: not much :(
rindolf moritz: do you feel fat or do you feel full? 20:00
moritz rindolf: rather full than fat ;) 20:01
rindolf moritz: OK. 20:02
20:02 UWC joined
moritz and I'm trying to do some web apps with catalyst... 20:04
it's very confusing, too many different files
specbot6 r13587 | larry++ | Split statement_modifier category in two. 20:07
r13587 | larry++ | List comprehensions can now be done with statement modifiers.
r13587 | larry++ | Multiple dispatch now explained in terms of topological sort.
r13587 | larry++ | Multiple dispatch with single semicolons clarified, maybe. However, multis
r13587 | larry++ | with single semicolon are likely just a reserved syntax in 6.0.0.
TimToady dmq: no, it's actually $1 in P6, short for $/[1]
the others would be $/[0][0]
$0[0] for short
maybe $/[0;0] works too 20:08
20:09 stevan_ joined
dmq ah right 20:10
rindolf TimToady: are you larry in the previous commit?
TimToady: or are you lwall?
dmq is that an xor or an or or? 20:11
TimToady I'm larry on perl.org and lwall on pugscode.org.
nobody's ever accused me of being consistently consistent
dmq er, actually, "ah right" was the wrong thing to say. better would have been "oh really. umm ok, i probably havent read something i should have" 20:13
:-)
TimToady I hear S05 comes highly recommended. :) 20:18
masak mm, @evens = ($_ * 2 if .odd for 0..100); 20:22
nifty
also quite readable
TimToady and just falls out of existing syntax, basically 20:25
course if you want to have multiple lists, you have to get fancier
rindolf masak: I get a syntax error in r15257 20:26
masak: this expression seems Pythonic. 20:27
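For comparison with rindolf's remark, a rough Python counterpart of masak's statement-modifier expression would be the list comprehension below. It assumes `.odd` means "the number is odd" and that `0..100` includes both endpoints.

```python
# Double every odd number from 0 to 100 inclusive; doubling an odd number
# always yields an even one, hence the name.
evens = [n * 2 for n in range(0, 101) if n % 2 == 1]
```

The filter and the loop appear in the opposite order from the Perl 6 form, where the `if` modifier comes before the `for` modifier.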
dmq ok, ill read it up.
i thought i already did. but obviously theres a lot of material.
btw, do you have an suggestions for how to do char class set operations in perl5? 20:28
rindolf TimToady: did you invent TAP (that "^ok" "^not ok" syntax)?
dmq ive thought of introducing an "extended char class notation" as (?[....]....)
like (?[a-z]-aeiou)
but its kinda fugly. 20:29
Coke_ "The basis for the TAP format was created by Larry Wall in the original test script for Perl 1" - from search.cpan.org/~petdance/TAP-1.00/...pm#AUTHORS 20:30
lambdabot Title: TAP - The Test Anything Protocal - search.cpan.org
rindolf Coke_: thanks.
20:30 rashakil_ joined
TimToady dmq: if you want to do it more p6-like, you'd say something like (?+[a-z]-[aeiou]) 20:35
which also allows for Unicode properties as names 20:36
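The set operation being discussed, `[a-z]` minus `[aeiou]`, can be emulated in a language without class arithmetic by computing the difference up front and emitting an ordinary character class. This is only a sketch in Python; `class_difference` is a hypothetical helper, and it assumes the resulting set is non-empty.

```python
import re
import string


def class_difference(base, removed):
    """Build a plain character class matching chars in `base` but not `removed`."""
    chars = sorted(set(base) - set(removed))
    # re.escape keeps characters such as ']' or '-' literal inside the class.
    return "[" + "".join(re.escape(c) for c in chars) + "]"


consonant_class = class_difference(string.ascii_lowercase, "aeiou")
consonant = re.compile(consonant_class)
```

Here `consonant.fullmatch("b")` matches while `consonant.fullmatch("a")` does not, which is the behaviour the proposed `(?[a-z]-aeiou)` or `[a-z]-[aeiou]` notation is after.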
mugwump git.catalyst.net.nz/gitweb2?p=perl....3;f=t/TEST # first Harness :)
lambdabot Title: git.catalyst.net.nz Git - perl.git/blob - t/TEST, tinyurl.com/25bk2x
rindolf TimToady: wanna see some Lisp code I wrote in vim?
20:36 [particle] joined 20:37 nipra joined
TimToady I am not so in love with either Lisp or vim that I would be crushed to miss it. :) 20:38
what does it do?
dmq (?+...) sounds interesting.
not sure if its been grabbed for something tho. 20:39
rindolf TimToady: calculates the Graham function.
TimToady: it's a port of my code to the advanced Perl Quiz-of-the-Week No. 8, IIRC.
TimToady: don't you use vi?
dmq dang, (?+...) is taken. 20:40
relative recursion.
TimToady should have read S05 first... :)
dmq that sucks.
i did.
theres a LOT of stuff in there.
rindolf opensvn.csie.org/shlomif/programs/l...-function/ just in case.
lambdabot Title: Revision 975: /programs/lisp/trunk/graham-function, tinyurl.com/2sl8uf 20:41
dmq and in the regex engine itself. keeping both in mind at the same time is kinda hard. :-)
TimToady yes, but you should realize that <...> is the exact analog of (?...)
dmq yes, i like that.
shay hello folks
rindolf Hi shay 20:42
TimToady howdy
shay hi shlomif
larry
dmq hrm, maybe we should change relative recursion so its (?&+1) and (?&-1)
rindolf shay: what's up?
[particle] too late to change (?+...) to (?@...) or something else?
shay rindolf, one week left :)
rindolf TimToady: I also have a Perl 6 version.
shay: to what?
shay to get released
chafshash
dmq its +/- for a specific reason.
rindolf shay: from the Army?
shay yeah
rindolf shay: is it the end of your service?
shay yes 20:43
rindolf shay: or just a vacation?
shay: nice.
shay: congratulations.
shay end of the never-ending service
rindolf, thanks
I'll finally have time to work on the sparc port
rindolf shay: SPARC port to what?
shay: SPARC port of what? 20:44
shay perl6
pugs/parrot
rindolf shay: you mean it doesn't run on SPARC atm?
shay: doesn't ghc run on SPARC? 20:45
shay rindolf, it *should* run, but I want to officially maintain it
rindolf shay: ah OK.
shay rindolf, test every release, make some test near-daily
rindolf shay: do you have a SPARC at home?
shay rindolf, yeah, a Sun Ultra10 20:46
moritz is jealous ;)
TimToady dmq: I would suggest that character class sets are probably more important and want a shorter Huffman coding
shay rindolf, yba gave it to me :)
rindolf shay: Yonathan Ben-Avraham?
shay hello moritz :)
rindolf, yes
rindolf shay: OK.
shay: nice.
shay he sent a mail to linux-il asking if someone wanted it, I replied 20:47
then he asked me why should he give it to *me*
[particle] shay: we'll be happy to add you as a parrot porter for sparc
shay I told him that I want to make some code portability test and learn the architecture in general
next mail was: "when are you able to pick it up?"
dmq timtoady: i think i agree.
no, i agree.
shay [particle], give me a week 20:48
[particle], I need to get home, I'm in the army now
[particle] yes, i see that
shay [particle], is someone working on that port atm?
[particle] no
shay great
I'm doing my work on NetBSD
[particle] be great to have a smoker setup
shay will have 20:49
[particle] shay++
join us on #parrot (irc.perl.org) when you're ready
dmq its just a pain to do.
20:49 czth__ joined
dmq as im sure you recall from regcomp.c :-) 20:49
rindolf shay: what do you do at the IDF?
dmq . o O ( If he told you he would have to kill you ) 20:50
shay rindolf, fields intelligence combatant
rindolf shay: I see.
dmq hah!
shay dmq, kind of :)
20:57 pbuetow joined
lichtkind_ forgive me for repeating but what is the current main topic of change? 20:59
TimToady do you mean, what are we working on the hardest right now? 21:01
I'm mostly working on svn.pugscode.org/pugs/src/perl6/Per...0.0-STD.pm 21:02
other folks are of course working on other things 21:04
or are you referring more to design and spec changes? 21:05
or to the topic of the channel? 21:06
rindolf TimToady: aren't you missing a =cut there?
TimToady =cut is gone in Perl 6
rindolf TimToady: oh.
TimToady (though I suspect pugs still recognizes it) 21:07
rindolf TimToady: what text editor are you using?
TimToady but =begin/=end are always supposed to nest right and return you to whatever context was on the outside.
I use vim, but I think some of the syntax that has developed over the years is extremely crufty. 21:08
rindolf TimToady: what syntax?
TimToady regex, for one
but then, I think that about Perl 5 too... :) 21:09
it would be fun to rewrite vim with Perl 6 as its fundamental syntax.
rindolf p6re 21:10
TimToady currently everything in vim is very ad hoc
moritz sadly, yes
if I imagine the combined power of perl and vim... 21:11
(and I don't mean the linked perl interpreter in vim, that's not _so_ powerful)
we could have world dominance ;)
TimToady syntax hilighting with real grammars, for instance 21:12
lichtkind_ TimToady sorry girlfriend asked something, yes i mean spec changes 21:13
i think thats enough for the first
TimToady most of the spec changes are from trying to write the grammar 21:14
but the mmd algorithm has also been on my mind for six months or so
lichtkind_ TimToady as you may know im writing editor in perl :)
but perl6 is at very bottom on todo :) 21:15
but its definitely a dream
tene Would be nice to have an editor with Perl6 integrated like lisp is in emacs
lichtkind_ that was one of the reasons i started the project 21:16
but it was 2002, never heard of perl6
21:16 lichtkind_ is now known as lichtkind 21:17 dduncan joined
lichtkind of course is perl6 much cooler 21:17
but i use it as my primary editor
21:17 dduncan left
lichtkind and always try be rock stable 21:18
moritz lichtkind: does it have vi-like modes and key bindings?
lichtkind not yet
i personally think vi modes suck badly but for the advantages they bring i wanted to introduce something similar 21:19
but currently we have main topic CPANification
21:19 arcady joined 21:20 dduncan joined
lichtkind to have short command for editing is very cool 21:20
but i like it more distinct visually than different modes
i guess if you're interested we can discuss that in another channel 21:21
moritz yeah, it's not the best idea to start an editor flame war ;)
lichtkind nop because i studied editors a lot and can see good things in all but in the end i plan to make a better one than all together :) 21:22
similar to larry's standpoint on languages 21:23
:)
moritz lichtkind: when you're finished I promise I'll try it ;)
lichtkind you know you're never finished :) 21:24
dmq everybody wants to make a better editor.
:-)
moritz not ;)
lichtkind: s/finished/published Version 3.0/ ;)
lichtkind dmq no my involvement was an accident :) 21:25
moritz thats bad because currently i have 0.3.3.17 and 1.0 contains most i can think of today :) 21:26
moritz lichtkind: that's no problem, you'll get more ideas ;)
lichtkind: no, honestly, I'll try it before, "when I have time[tm]" ;) 21:27
lichtkind yes but greatest problem is that it cant be one man show forever
is this another word for never :)
Juerd lichtkind: Hi 21:28
moritz lichtkind: ok, tell me an URL ;)
lichtkind proton-ce.sf.net
but i highly recommend a nightly build
from web52.xeon225.server4you.de/
moritz I highly recommend to offer debian packages ;) 21:29
lichtkind hello Juerd glad to see you
Juerd lichtkind: Is it acceptable for you to leave halfway day 3?
Because otherwise I won't be home in time
21:29 Aankhen`` joined
lichtkind moritz like i said cpanification is on the way, there is a script to make debian packages of it 21:30
moritz i suspect you have linux
therefore a little bit work :)
Juerd ok i would like to stay a bit so i see you as my backup plan :), but i say in time when i found something else 21:31
Juerd lichtkind: Okay 21:33
You can always drive along on the way there, of course.
That'll be the 20th
21:33 Psyche^ joined
lichtkind Juerd thanks thats great but from ffm to munich i have already a seat in car of a friend, after munich he goes to parents in austria thats why i need another way back :) 21:36
dmq lichtkind you are in ffm? 21:37
am i going to be driving to munich with you i wonder? 21:38
:-)
lichtkind dmq no god beware, but a girl friend of mine :)
ah and when you drive back? 21:39
dmq ah ffm aint soooo bad.
im not driving back.
train.
Juerd lichtkind: I see
lichtkind dmq and which time?
dmq back?
Juerd dmq: Destination?
dmq not sure.
ah, GPW?
oder, DPW
Juerd dmq: And the other way? 21:40
dmq oh, back to ffm
i live here
Juerd DPW is confusing: could be Dutch or Deutsche ;)
dmq there
Juerd What's ffm? :P
moritz Juerd: Frankfurt/Main
lichtkind frankfurt
Juerd I see
dmq but the weird thing is im driving to munich with someone who afterwards is going to austria to see his parents. 21:41
Juerd dmq: I'm almost driving through Frankfurt. If you need a ride...
The thing is that I'm leaving during day 3. Probably missing the last 3 or 4 talks.
lichtkind dmq i come from east and from a village, thats why its hard for me to live there more than a week
dmq ill be going with corion from pm. so i wouldnt worry about me, its very kind to offer tho.
lichtkind Juerd the second last is interesting to me 21:42
Juerd lichtkind: I'm quite interested too, but need to make it back home in time
lichtkind of course
Juerd It's 10 hours from Munich to Dordrecht
And I need to be awake the day after 21:43
lichtkind :)
Juerd So I'm planning on arriving approx 1:00, saturday, then sleeping 7 hours, and waking up at 8 am.
My office will be rewired that day.
Getting 3 * 35 A 21:44
Which is a nice upgrade from 2 * 16 A, but requires some additional changes.
(To use it well.)
lichtkind dmq when you go back at saturday we can join a weekend ticket
dmq no im going back friday afaik.
unfortunately. sorry mate.
Juerd wonders if there's a GPW irc channel 21:45
dmq actually, i take it back. i have no idea how im getting home.
we can discuss it there im sure. 21:46
all i know is i love my bahcard50!
bahncard-50
Juerd dmq: Well, you could drive along with me :) 21:47
I'd like a passenger to keep me awake :P
(And maybe share fuel costs, but that's secondary, if not tertiary)
dmq sounds interesting, but as im speaking id kinda feel bad bailing early. 21:48
although ill think about it.
Juerd Just don't leave before your own talk ;)
dmq I can definitely sympathise with the desire to have company in the car tho.
21:49 Psyche^ is now known as Patterner
Juerd I've done Berlin -> Dordrecht and Chemnitz -> Dordrecht alone before, and while it's doable, I didn't like that I was by myself. 21:49
lichtkind dmq i thought you go train? 21:50
dmq i go by car, return by train. 21:54
i have a feeling the same car you are going in.
:-)
if you are going with strat.
Juerd Grin 21:55
lichtkind Ƥh yes
Juerd Ƥ :) 21:56
21:56 iblechbot joined
svnbot6 r15271 | lwall++ | Factored out rule names from all #= comments; preprocessor now expected to 22:02
r15271 | lwall++ | recognize /^[rule|token|regex] <ident>/ as implicit start of {*} identity.
r15271 | lwall++ | The #= comments now only add to that base identity.
dmq it would be cool for referencing if synopsis 5 had anchors on the bullet points. 22:14
:-) 22:15
tene dmq: are you asking for a commit bit?
Juerd dmq: Patch^WCommits welcome ;)
tene That kind of talk around here will get you a commit bit if you're not careful.
dmq is it html or is it generated from pod? 22:16
Juerd tene: I expect dmq to already have one. Or two. Or three. :)
dmq: The latter
dmq: But that shouldn't stop you, should it? :)
dmq heh
Juerd feather.perl6.nl/syn/ # these we control ;)
lambdabot Title: Official Perl 6 Documentation
dmq I guess its a prioritization issue. Theres things very few people beside me can do, and theres other things that lots of folks can do. Which should i choose? 22:17
22:17 larsen_ joined
Juerd dmq: The most -Ofun things. 22:17
dmq heh 22:18
of the options most of them arent fun.
:-)
im currently trying to make unicode character classes in perl 5.10 use a sane data structure. 22:19
i dont even use unicode damnit.
:-)
Juerd Oh, but you do!
You're sending perfect utf8 sequences to irc ;)
dmq heh.
Juerd utf7 too, most of the time 22:20
Why do you not use unicode?
dmq because i dont need to.
i was blissfully unaware of unicode until i started hacking the regex engine. 22:21
what a shock. :-)
[particle] doesn't use unicode--because it's <<<unamerican>> :) 22:22
shamu uit
moritz [particle]: actually that would be a good reason to use it ;)
shamu uit
22:23 shamu joined
dmq its a kind of pet rant for me. IMO a big chunk of unicodes horribleness comes from trying to please everyone. Im not sure its the right approach. 22:23
moritz dmq: what would you say is the right approach? 22:24
shamu well, bear in mind that UTF-8 is unicode, and the first 7 bits match one-for-one with ASCII -- that's *Am'rken* Standard code for Information Interchange, podner
Juerd dmq: But do you never use non-ascii?
moritz dmq: utf-32 with just one char for each possible sign?
dmq im not sure what the right solution is. 22:25
SamB I think a better approach would be to try and please someone
dmq and in some ways yes, utf32 has a bunch of advantages that are relevant to me.
but i think it comes down to what samb said somewhat. I mean unicode does everything, including dead languages. 22:26
shamu fyi, I've been trying to figure out what 'Unicode' actually is for a long time
moritz would like to see a generic utf-2**n (for any integer n ;) *duck*
Juerd dmq: Does it matter that it includes dead languages?
shamu any recommendations? The only intelligible one I ever found was www.joelonsoftware.com/articles/Unicode.html
lambdabot Title: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know A ...
SamB I wish it tried to do something right
Juerd shamu: That's a good one. 22:27
dmq my view is human communication evolves just like the rest of our cultural institutions and we shouldnt waste computer cycles because somebody might want to use ancient greek.
Juerd shamu: See also wikipedia. And if you want to use it with Perl, perlunitut.
SamB rather than trying to do everything, except leaving it all half-done
lichtkind svn.pugscode.org/pugs/src/perl6/Per...0.0-STD.pm is this perl6 metadata es part of the interpreter
Juerd dmq: Ancient Greek needs to be digitized too, if we are to preserve things.
dmq unfortunately im also a native english speaker, so people tend to think im being an english bigot when i say stuff like that. but i mean seriously.
sure digitize it. just dont make every program everywhere pay the price. 22:28
Juerd dmq: And it's great that we can put it all in one encoding.
shamu well, computers are now multicore at 2Ghz plus -- what's the point, if not to have some of those cycles to waste on reading and publishing something the ancient greeks figured out a long time ago, so we can learn from their mistakes and hopefully culturally evolve
Juerd dmq: If only for web pages, it's useful.
SamB also, mathematicians needed those letters anyway
dmq sure there are places where unicode is useful.
like for a web browser.
but for an os?
shamu also, isn't every ascii program automatically utf-8? 22:29
dmq or a programming language?
masak other recommendations: www.tbray.org/ongoing/When/200x/200...06/Unicode
Juerd dmq: Well, programming languages deal with the web, too.
lambdabot Title: ongoing · On the Goodness of Unicode
masak www.tbray.org/ongoing/When/200x/2003/04/26/UTF
Juerd So if it's useful for the web, it's automatically useful for programming languages too.
lambdabot Title: ongoing · Characters vs. Bytes
masak www.tbray.org/ongoing/When/200x/200...e-and-Ruby
lambdabot Title: ongoing · Unicode and Ruby, tinyurl.com/yoz8sp
dmq shamu have you looked at what is required to read a utf8 stream even if it is only ascii?
shamu well, if it is only ascii, can't you just say 'getchar(); if high bit set, abort reading stream'? 22:30
I mean 7-bit ascii
dmq ascii is 7 bit
Juerd Very few texts that I encounter are pure ascii. 22:31
dmq utf8 is exactly equivalent to ascii for code points 0-127
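The two claims in this exchange, that UTF-8 and ASCII agree on code points 0-127 and that a reader can detect non-ASCII input by testing the high bit, can be checked with a short Python sketch. The helper name `is_seven_bit` is made up for the example.

```python
def is_seven_bit(data: bytes) -> bool:
    """True when no byte has its high bit set, i.e. the data is pure ASCII."""
    return all(byte < 0x80 for byte in data)


text = "plain ascii"
# For text limited to code points 0-127 the two encodings give identical bytes.
same_bytes = text.encode("ascii") == text.encode("utf-8")

# A single non-ASCII character becomes a multi-byte sequence whose bytes
# all have the high bit set.
e_acute = "\u00e9".encode("utf-8")
```

This covers only the byte-level equivalence; it says nothing about the cost of handling variable-width sequences once non-ASCII data does appear.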
Juerd Most of the time it's utf8, second place goes to latin1.
dmq juerd sure: but dont you think that a fixed wisth 16 bit encoding would be sufficient for all your needs?
Juerd But I deal with iso-8859-15 and -3, and windows-125x, and koi8-r, too.
dmq: I personally don't care if Perl *internally* uses 16 bit or not. 22:32
dmq i can see the utility in unicode, but using it as a standard internal encoding doesnt make sense to me.
SamB you would prefer what?
Juerd dmq: But when outputting things, I like to use utf8, because then I can still use it on an old fashioned terminal.
dmq note the misspelled "fixed width"
SamB no standard internal encoding?
all programs having to deal with things in unspecified encodings? 22:33
Juerd SamB: My personal preference for perl's internal string encoding is bitwise negated utf8.
dmq yes, use a kludge to work around a kludge.
SamB and files people send you often not working?
dmq im not saying there are easy solutions or that unicode isnt the least worst.
but i dream of a better world :-)
SamB oh, by all means unicode is horrible 22:34
dmq SamB: I bet it is extremely rare to find a document that contains all 100k letters in it.
Juerd SamB: There must be some internal encoding, but as long as Perl knows which one it is, and how to convert it to the other encodings, all are fine with me.
SamB but is currently the least-horrible available thing of its kind
Juerd really wants his ~utf8, but lacks C fu :(
(and tuits) 22:35
dmq anyway, im probably biased as with what ive been coding tends to require efficient random access to characters. Which most unicode encodings dont allow.
SamB ah.
try UTF32.
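The random-access point can be made concrete with a small Python sketch: in UTF-32 the n-th character always starts at byte offset 4*n, while UTF-8 characters take one to four bytes each, so finding the n-th one means scanning. The helper `nth_char_utf32` and the sample string are assumptions for illustration only.

```python
def nth_char_utf32(data: bytes, index: int) -> str:
    """Fetch one character from little-endian UTF-32 by direct offset."""
    start = index * 4  # fixed width: no scanning required
    return data[start:start + 4].decode("utf-32-le")


# 'a', e-acute, euro sign, an emoji outside the BMP, 'z'
text = "a\u00e9\u20ac\U0001F600z"
utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")

# Byte width of each character in UTF-8: it varies per character.
utf8_widths = [len(ch.encode("utf-8")) for ch in text]

fourth = nth_char_utf32(utf32, 3)
```

The trade-off is size: the same five characters take 11 bytes in UTF-8 and 20 bytes in UTF-32.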
[particle] juerd: why ~? 22:36
Juerd [particle]: So the mistake of forgetting to encode your output is clearly visible.
dmq yes, ill just recode perl5 to use utf32 internally. :-)
[particle] win32 encodes files in ucs2 iirc
Juerd [particle]: As would the mistake of not decoding.
[particle] juerd: should be really easy on parrot 22:37
allbery_b hm, isn't ucs2 deprecated?
SamB Juerd: that is what typesystems are for
Juerd [particle]: If Perl 6 has strings the way I think they will be, that won't be necessary. :)
dmq before xp it did, xp and later it does utf-16
en.wikipedia.org/wiki/UTF-16
Juerd SamB: Yes, but Perl 5 doesn't have them, and does need to support BOTH unicode and text.
allbery_b that makes more sense
Juerd s/text/binary/
22:37 gnuvince_ joined
SamB isn't utf-16 only a little different from ucs2? 22:37
in practice?
dmq yes
[particle] yes
dmq no surrogate pairs 22:38
SamB about as different as utf-8 and ascii?
Juerd No, much less different.
SamB except for the small detail of there not being terribly many characters in Unicode that don't fit in two bytes? 22:39
dmq no, ascii and utf8 are very different. 22:40
oh well, ok, yes, ascii->utf8 fixed width/variable width, ucs2->utf-16 fixed width/variable width. 22:41
whatever. 22:42
22:42 ProperNoun joined
SamB also, both utf-8 and utf-16 use what were unassigned codepoints for their multi-word character codings, yes? 22:43
dmq if i get the question yes for utf8, pass on utf-16. 22:44
SamB I'm pretty sure that the codepoints corresponding to the words in surrogate pairs were previously unassigned 22:46
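A small Python sketch of the UCS-2 versus UTF-16 difference discussed above. Code points up to U+FFFF take one 16-bit unit in both; anything above that needs a surrogate pair in UTF-16, built from the range U+D800..U+DFFF that is reserved for that purpose. The function names are illustrative only.

```python
def utf16_units(ch: str) -> list:
    """Return the 16-bit code units Python emits for one character."""
    raw = ch.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]

def surrogate_pair(cp: int) -> tuple:
    """Compute the high and low surrogate for a code point above U+FFFF."""
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

bmp = utf16_units("\u20ac")         # EURO SIGN, fits in one unit
astral = utf16_units("\U0001D11E")  # MUSICAL SYMBOL G CLEF, needs two units

print([hex(u) for u in bmp], [hex(u) for u in astral])
```

So for text that stays within the first 65536 code points the two encodings produce the same bytes, which is why the difference rarely shows up in practice.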
allbery_b actually I'd claim utf8 is not very different from ASCII, because ASCII only defines code points 0-127
diotalevi ASCII? No, I thought that defined all of 0 -> 255. 22:48
dduncan no, ASCII is 7-bit
Juerd ASCII is 7 bit. It cannot have anything > 127
Not without compression, at least :)
dduncan and UTF-8 is identical to ASCII for codepoints 0..127, afaik, which is part of its appeal 22:49
Juerd diotalevi: There are ascii-compatible encodings like iso-8859-1, cp437, etcetera, that have 255 characters.
dmq we are talking about the merits of an encoding tho.
diotalevi dduncan: er, minus some of the control character parts of ASCII. I thought that differed slightly.
dmq so the relevance of codepoint equivalency is kinda a separate issue
Juerd diotalevi: No, 0..127 are fully equal in ascii and utf8 22:50
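The claim that 0..127 are identical in ASCII and UTF-8 (DEL at 127 included) can be checked directly. A short Python sketch, using only the standard codecs:

```python
# Every 7-bit value 0..127 as raw bytes.
ascii_bytes = bytes(range(128))

# Decoding as ASCII or as UTF-8 yields the same text ...
same_text = ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")

# ... and encoding that text as UTF-8 reproduces the original bytes.
round_trip = ascii_bytes.decode("ascii").encode("utf-8") == ascii_bytes

# DEL (127) is a single byte 0x7F in both.
del_byte = chr(127).encode("utf-8")

# The first code point past ASCII already needs two bytes in UTF-8.
beyond = chr(128).encode("utf-8")

print(same_text, round_trip, del_byte, len(beyond))
```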
dduncan if you're going to use unicode, which I recommend to be the default, I would say that UTF-8 is the best default bet
its other advantages include being byte order independent
[particle] ascii is a codeset and an encoding, so it's hard to speak about clearly
s/codeset/charset/
dduncan and relatively compact
dmq and reading ascii is not identical to reading utf8 unless you know in advance that you are really dealing with ascii. 22:51
dduncan also, UTF-8 is encoded such that you can start in a text stream at any byte and you can easily tell where the character boundaries are
Juerd dmq: Can't you read in ascii, and upgrade to utf8 when you encounter the first high bit? 22:52
dmq so if you dont know, or you are dealing with characters outside of ascii you have to do the clumsy read and scan of utf8.
dduncan which helps reliability
dmq juerd: im kinda inclined to think that encoding should be a problem the coder deals with. only they have the information to make the right decision.
diotalevi Say, 127 isn't defined in Unicode but is DEL in ASCII. Are you sure that one is the same? 22:53
dmq utf8 is /defined/ to be ascii for lowbit bytes (in a wellformed utf8 string)
[particle] transcoding is definitely a user issue. but major encodings should be supported in core ops/libraries
diotalevi er, wait. I was reading the wrong line.
22:53 sunnavy joined
dmq particle: yes i agree pretty much. 22:54
dduncan: that is true. 22:55
about finding the boundaries from a given point. but finding boundaries doesnt replace the fact you cant do random access. 22:56
dduncan if you know a stream is UTF-8, then you can do random access
SamB utf-8 probably sucks as an in-memory representation
dmq dduncan: how do you reckon. 22:57
SamB but not so bad for an on-disk encoding for programs, usually...
22:57 Psyche^ joined
dmq you need to do a linear scan. 22:57
22:57 elmex joined
dduncan the bit patterns of utf-8 characters are such that you can recognize just from looking at no more than 6 consecutive bytes where the character boundary is 22:57
dmq heh. i wonder, maybe some of those old algorithms for tape would be useful.
dduncan: that means you scan. 22:58
dduncan but you don't scan from the start of the string, which is my point
a handful of bytes is nothing
dmq there is no way without scanning to say "jump to the 10th boundary from here"
allbery_b you can spot *a* character boundary but ==dmq
dduncan its when you have to start at the beginning of the string to know how to interpret the characters you get to correctly, which is the problem 22:59
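Both sides of this exchange can be shown in a few lines of Python. Continuation bytes in UTF-8 always look like `10xxxxxx`, so backing up from any byte offset finds a character boundary within a few bytes (dduncan's point), but stepping forward N characters still means visiting every character in between (dmq's point). The function names are illustrative.

```python
def is_continuation(byte: int) -> bool:
    """True for UTF-8 continuation bytes, which have the form 10xxxxxx."""
    return (byte & 0xC0) == 0x80

def char_start(data: bytes, pos: int) -> int:
    """Back up from a valid byte index to the start of its character."""
    while pos > 0 and is_continuation(data[pos]):
        pos -= 1
    return pos

def advance_chars(data: bytes, pos: int, n: int) -> int:
    """Linear scan: move n characters forward from a boundary at pos."""
    while n > 0 and pos < len(data):
        pos += 1
        while pos < len(data) and is_continuation(data[pos]):
            pos += 1
        n -= 1
    return pos

text = "naïve café €uro"
data = text.encode("utf-8")

inside_euro = char_start(data, 14)           # byte 14 is in the middle of the euro sign
after_twelve = advance_chars(data, 0, 12)    # had to walk over 12 characters to get here
print(inside_euro, data[after_twelve:].decode("utf-8"))
```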
allbery_b on the flip side, 32-bit chars are always fast to index but slow to do anything else with (see Haskell [Char]) 23:00
dduncan while "go to 10th character" is needed for some apps, many apps don't require you to do that, such as things with data interchange or network operations
Juerd dduncan: "If you know a stream ..., random access ..." That's the problem: utf8 only makes sense as a stream. You need to scan it. Therefore, there's no sane way to do random access, unless you keep an offset map.
allbery_b (well, they also have indexing issues because it's [Char] instead of Array ... Char)
dduncan for the apps that need to do this, you can transcode it to UCS32 for internal use
Juerd (For *huge* data, it makes sense to keep an offset map of every 128th byte, or so) 23:01
dduncan er, UCS4
(ucs uses bytes, utf uses bits)
dmq huh and huh?
dduncan afaik
lichtkind night folks, i believe in you !
dduncan er, the numbers in the names of UCS count in bytes, in UTF, bits 23:02
that's what I meant to say
dmq i thought the ucs names were the old ones
dduncan they are
utf is more modern, and what I prefer 23:03
Juerd hungry.
dmq right
dduncan ditto
dmq i dont get the every 128 bytes comment exactly. i probably havent thought about it longer.
long enough
23:04 sunnavy joined
dduncan I don't know the significance of 128 bytes either 23:04
23:04 Psyche^ is now known as Patterner
[particle] i think he means something to mark the start of a grapheme 23:05
dmq ah i see.
[particle] so you can seek to that position and tell safely
dmq right. that makes sense.
but then theres the overhead of doing that. 23:06
sigh. it all sucks.
:-)
dduncan so there's a marker for each 128 bytes that says what character number is there?
I think that makes sense 23:07
Juerd dmq: If you, during reading, scan everything and cache the character offset for every 128th byte (rounded up or down to full character boundaries), you can more efficiently locate character N, because you can start scanning at the closest checkpoint.
dmq: As said, this is only beneficial for *huge* data.
allbery_b yeh, so instead of counting from the start you can pick it up in the middle. tradeoff between overhead of keeping a count and having to step
Juerd Like, entire books :)
dmq right right 23:08
Juerd (And even then, you should think twice before going through the trouble of implementing all this.)
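A rough Python sketch of the offset map Juerd describes: while scanning once, remember the character index at roughly every 128th byte (moved to a character boundary), then start later lookups from the nearest checkpoint instead of from byte 0. The stride, the names and the data layout are assumptions for illustration.

```python
from bisect import bisect_right

STRIDE = 128  # assumed checkpoint spacing in bytes

def build_checkpoints(data: bytes, stride: int = STRIDE):
    """Scan once; return (char_indexes, byte_offsets) of the checkpoints."""
    char_idx, byte_off = [0], [0]
    chars = 0
    next_mark = stride
    for pos, byte in enumerate(data):
        if (byte & 0xC0) != 0x80:          # a character starts here
            if pos >= next_mark:           # first boundary at or past the mark
                char_idx.append(chars)
                byte_off.append(pos)
                next_mark = pos + stride
            chars += 1
    return char_idx, byte_off

def byte_offset_of(data: bytes, checkpoints, n: int) -> int:
    """Byte offset of character n, scanning only from the closest checkpoint."""
    char_idx, byte_off = checkpoints
    i = bisect_right(char_idx, n) - 1      # last checkpoint at or before n
    pos, remaining = byte_off[i], n - char_idx[i]
    while remaining > 0 and pos < len(data):
        pos += 1
        while pos < len(data) and (data[pos] & 0xC0) == 0x80:
            pos += 1
        remaining -= 1
    return pos

text = "héllo wörld € " * 200
data = text.encode("utf-8")
cps = build_checkpoints(data)
offset = byte_offset_of(data, cps, 1234)
print(len(cps[0]), offset)
```

The table costs two integers per stride, which is the overhead being weighed against string size in the discussion.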
dmq no no
im still on character classes in unicode.
no worries.
[particle] well, if it's static content... just create a lookup table
Juerd (After all, the (Christian) Bible, fits in 1.44 MB! :P)
allbery_b I'd actually say it's worthwhile if the string is >4k or so
Juerd allbery_b: 4 kB already?!
dmq i cant quite get invert(invert($class)) to work.
Juerd Nah, I don't think it will.
[particle]: The mapping I referred to *is* a lookup table :) 23:09
dmq i dont suppose anybody has the unicode book that covers inversion lists handy?
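For reference, a minimal Python sketch of an inversion list as the structure is commonly described: a sorted list of code points at which membership flips, starting from "not in the set". This is not dmq's code; names and representation are assumptions. With this layout, inverting only adds or removes a leading 0, so inverting twice gives back the original list.

```python
from bisect import bisect_right

def contains(inv: list, cp: int) -> bool:
    """cp is in the set if an odd number of toggle points are <= cp."""
    return bisect_right(inv, cp) % 2 == 1

def invert(inv: list) -> list:
    """Complement the set by toggling membership at code point 0."""
    return inv[1:] if inv and inv[0] == 0 else [0] + inv

digits = [0x30, 0x3A]          # the range '0'..'9', i.e. [0x30, 0x3A)
not_digits = invert(digits)    # [0, 0x30, 0x3A], open-ended at the top

print(contains(digits, ord("5")), contains(not_digits, ord("5")),
      invert(not_digits) == digits)
```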
allbery_b of course, most strings are << that, so it's still not much of a win in practice
23:09 sunnavy joined
[particle] juerd: sorry, i meant *store the lookup table 23:09
Juerd allbery_b: «? ;) 23:10
[particle] is distracted by food
Juerd allbery_b: I think that with a 4 kB string, the overhead of keeping a mapping table is still too large to benefit from it.
allbery_b *you* try doing unicode through a vnc client on OSX sometime :>
TimToady dmq: don't get fixated on random access to strings. it's only going to get less important with time. And not even UTF-32 is a fixed width encoding of graphemes, which is what the user really wants to think in terms of anyway. 23:22
regexes don't really need random access, for instance. nearly all the offsets are very small and relative to your current position. 23:24
the quest for a fixed unit of storage to represent characters is misguided in my opinion except as an optimization that is below the abstraction level of the programmer.
very few people complained that substr slowed down when we went with utf-8 in perl 5 23:25
Gothmog_ That's not necessarily a good argument. 23:27
TimToady It's not necessarily a bad argument either. :) 23:28
allbery_b it's a "good enough" argument. which, given that you can't have perfection, is not a bad thing
TimToady the point is that substr and friends aren't all that useful once we start getting away from the punchcard metaphor of text.
nearly all the pattern matching done in Perl is done with regex, 23:29
Gothmog_ I think of UTF-8 vs. some fixed width encoding as a speed vs. memory trade-off.
TimToady and regex naturally finds boundaries without caring about large offsets
Gothmog_: fine, but that should be below where the typical user is thinking.
which is at the grapheme level, which corresponds to what the user thinks of as a "character". 23:30
Gothmog_ Hm right, but it might be important if some kind of string lookup is O(1) or O(n).
Like that's why we use hashes and not array of pairs.
s/array/$&s/
TimToady that's one of the things that a VM is pretty good at optimizing on the fly 23:31
but I am adamant on the subject that a string position in Perl 6 is *not* *not* *not* an integer.
Gothmog_ Hm. 23:32
TimToady it's one of my hot buttons, in fact
Gothmog_ What is it that a VM can optimize on the fly, and what do you think a string position should be, if not an int? 23:33
moritz so what is it? a pointer?
dmq not an integer?
TimToady absolutely not
dmq wider?
TimToady integers don't know their units
diotalevi . o O ( A marker? ) 23:34
TimToady yes, basically a marker
dmq ah ok. a vector.
Gothmog_ So, you want to distinguish n bytes / n graphemes / n whatever?
dmq so it wont count code points?
TimToady if you force it to count in a particular unit, you must make sure it knows the correct units
dmq: by default, no 23:35
dmq interesting.
Gothmog_ And what happens if you don't enforce a particular unit?
TimToady the default in Perl 6 is graphemes, and has been from day one
Gothmog_ That seems to be sane.
moritz so is 1 grapheme = 1 code point in this context?
TimToady the default Unicode level is to count by graphemes
dmq i suppose its true.
TimToady a grapheme may be several code points 23:36
dmq you dont necessarily need to store a full map.
TimToady a base character plus its combining characters, basically
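A Python sketch of "a base character plus its combining characters". The clustering rule here is deliberately simplified (attach any character with a nonzero canonical combining class to the previous cluster) and is not the full Unicode grapheme segmentation algorithm. It is enough to show why even a 32-bit-per-code-point encoding is not fixed width in graphemes.

```python
import unicodedata

def simple_graphemes(text: str) -> list:
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch      # combining mark: attach to previous base
        else:
            clusters.append(ch)     # new base character
    return clusters

# e + acute accent, a + diaeresis + dot below, then a plain x
s = "e\u0301a\u0308\u0323x"

code_points = len(s)
graphemes = simple_graphemes(s)
utf32_bytes = len(s.encode("utf-32-be"))

print(code_points, len(graphemes), utf32_bytes)
```

Three user-visible characters occupy six code points here, and the clusters are 2, 3 and 1 code points wide.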
that is also why there is no .length method in Perl 6 23:37
Gothmog_ But if you access the nth grapheme, n is an int, or not?
TimToady it will grudgingly translate n to a string position, and then try to maintain the abstraction from then on.
dmq ive been thinking of how to store a trail of positions reached via accepting states from a DFA so that it cant be used intermingled with the backtracking engine.
and you're right, that is all localized small offsets. 23:38
thanks. thats a useful observation. 23:39
TimToady indeed, I'm an acquaintance of the person who hacked the utf-8 matcher into regexec.c :)
23:40 Psyche^ joined
TimToady so basically Perl 6 has string positions as opaque markers or pointers 23:40
internally it can be a string plus a byte or codepoint offset, but that's hidden from the user's view. 23:41
dmq right but regexec.c can keep that kind of data on stack. 23:42
TimToady and you can force string positions and lengths back to numbers as long as you specify the units
yes, that's internal
dmq if you mean what i think you mean.
so you can cheat when you end up with easy units like single width chars right? 23:43
TimToady yes, but only when you know it for sure.
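One way to picture a string position that is "not an integer", sketched in Python. The class `StrPos`, its fields and the unit names are hypothetical and are not Perl 6 API; the point is only that the position stays opaque until the caller names a unit, and the same position then yields different numbers per unit.

```python
import unicodedata
from dataclasses import dataclass

@dataclass(frozen=True)
class StrPos:
    """Hypothetical opaque marker for a position inside one string."""
    _text: str
    _cp_offset: int   # hidden representation: a code point offset

    def as_count(self, unit: str) -> int:
        """Convert to a number only when the caller names the unit."""
        prefix = self._text[:self._cp_offset]
        if unit == "codes":
            return len(prefix)
        if unit == "bytes":
            return len(prefix.encode("utf-8"))
        if unit == "graphs":
            # simplified: count characters that are not combining marks
            return sum(1 for ch in prefix if not unicodedata.combining(ch))
        raise ValueError("unknown unit: " + unit)

def find(text: str, needle: str):
    """Return a StrPos for the first match, or None."""
    idx = text.find(needle)
    return None if idx < 0 else StrPos(text, idx)

s = "cafe\u0301 au lait"          # 'e' followed by a combining acute accent
pos = find(s, "au")
print(pos.as_count("graphs"), pos.as_count("codes"), pos.as_count("bytes"))
```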
dmq but with a dfa, everything is a codepoint/char. so you end up hypothetically building a scary stack. 23:44
TimToady Perl 5's type system is a bit dicey on the subject of knowing such a thing.
dfa's are always scary. :)
dmq so i was thinking that if its offsets (therefore localized), and run length encoded, then you could do it for a dfa without worrying about the stack blowing up. 23:45
the units would be codepoints i guess.
23:45 Psyche^_ joined
dmq im really interested in the idea of making as much of a regex happen using a dfa. 23:46
23:48 CardinalNumber joined
TimToady everything in S05 about "longest token" is aimed at the same goal. 23:49
dmq yeah.
TimToady but it tends to make more sense for a parser than for a one-shot regex
which are often more efficient with a Boyer-Moore algorithm
dmq and i noticed that other semantics are chosen to make longest token not be super-expensive when each branch cant be handled via a dfa. 23:50
TimToady because the dfa is required to look at every character, and BM isn't
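To illustrate why a Boyer-Moore style search need not look at every character, here is a Python sketch of the simpler Boyer-Moore-Horspool variant (bad-character shifts only, not full Boyer-Moore), instrumented to count how many text characters it actually compares. It is not the code in regexec.c.

```python
def horspool_find(text: str, pattern: str):
    """Return (match index or -1, number of text characters compared)."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0, 0
    # How far the window may jump, keyed by the text character under its
    # last position. Characters absent from the pattern allow a full jump.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    looked = 0
    pos = 0
    while pos + m <= n:
        j = m - 1
        while j >= 0:                  # compare right to left
            looked += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            return pos, looked
        pos += shift.get(text[pos + m - 1], m)
    return -1, looked

text = "x" * 1000 + "needle"
index, looked = horspool_find(text, "needle")
print(index, looked, len(text))
```

On this input most windows are rejected after a single comparison and skipped six characters at a time, so far fewer characters are examined than the text contains. A DFA, by contrast, consumes the input one character at a time.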
dmq right.
the dreaded offsets code.
i almost got lookbehind properly optimisable, but then my head exploded.
TimToady well, dfa is in the abstract side-effect free, and a parser wants to be full of side effects. 23:51
so you have to manage the transition from patterns to actions somehow
dmq yes
TimToady much like your typical awk statement
P6 requires reversibility on lookbehind patterns 23:52
(though that implies an encoding that can be scanned backwards too)
dmq heh 23:53
23:53 Psyche^_ is now known as Patterner
dmq it still would be nice to extract fixed substrings from them 23:54
if possible.
so that things like /(?<=foo)/ can be as efficient as /foo/ 23:55
i almost had it working.
TimToady it should match oof but run the counter the other way
dmq or it could just BM for 'foo' 23:56
:-)
TimToady no, that's the wrong approach entirely
dmq and use the spot after it.
or what about /(?<=foo)bar/ it should just look for 'foobar' and use the middle. 23:57
TimToady that's a possible optimization, but in the abstract it's not difficult to look for a position with oof going left and bar going right. 23:58
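The optimisation dmq proposes, shown with Python's `re` module rather than Perl's engine: for a pattern made only of fixed strings, searching for the literal "foobar" and reporting the position in the middle gives the same answer as the lookbehind. As noted in the discussion, this only holds while there are no quantifiers or alternations involved.

```python
import re

text = "xxbar foobar"

# The lookbehind form: match 'bar' only where 'foo' comes right before it.
m = re.search(r"(?<=foo)bar", text)
via_regex = m.start() if m else -1

# The literal form: find 'foobar', then report the spot after 'foo'.
lit = text.find("foobar")
via_literal = lit + len("foo") if lit != -1 else -1

print(via_regex, via_literal, m.group())
```

Both approaches skip the first "bar", which has no "foo" before it, and the matched text excludes "foo" in either case.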
dmq i realize it doesnt scale when you add quantifiers, im talking about an optimisation only.
TimToady and the user nearly always has a good reason for having written it that way in the first place.
so you're almost never going to be able to do that optimization anyway
dmq i think mainly to not have bar in $& or $1
actually that type of thing is quite common in split. 23:59