svn switch --relocate svn.openfoundry.org/pugs svn.pugscode.org/pugs/ | run.pugscode.org | spec.pugscode.org | paste: sial.org/pbot/perl6 | pugs.blogs.com | dev.pugscode.org/ Set by putter on 11 February 2007. |
|||
00:07
IllvilJa joined
|
|||
[M]erk | What does the darker green mean vs. lighter green in the smokes summaries? | 00:42 | |
00:42
shay joined
00:46
weinig is now known as weinig|away
00:52
sunnavy joined
01:03
TimToady_ joined
|
|||
svnbot6 | r15266 | lwall++ | infix and postfix now more predictive | 01:13 | |
01:28
TimToady_ is now known as TimToady
01:29
gaal joined
|
|||
TimToady | [M]erk: darker green is actually a Todo | 01:29 | |
ought to be purple or some such | 01:30 | ||
01:30
jisom_ joined
01:39
bonesss joined
01:44
weinig|away is now known as weinig
01:49
cmarcelo joined
|
|||
audreyt | Grrrr: congratulations on DBIx::Perlish. It's simply brilliant. :-) | 01:55 | |
cmarcelo | Trac is on feather? if so, where are its config files? | 01:57 | |
(moose) | |||
audreyt | cmarcelo: /data/svn/trac/ | 01:58 | |
wolverian | it's also very scary. :) | ||
cmarcelo feels that pugs has too many sites =| | |||
i'll try to connect them to each other at least... | 01:59 | ||
audreyt | *nod* | ||
you're now TRAC_ADMIN | |||
feel free to hand out admin bits as needed (from the Admin/permissions page) | |||
cmarcelo | audreyt: (feeling better?) | ||
k | 02:00 | ||
audreyt | no, not really, can't focus for more than 15min at a time | ||
highly annoying | |||
cmarcelo | and the "saving throw"-thing ? | ||
audreyt | it looks like I passed | ||
a small chance I can leave hospital tomorrow | 02:01 | ||
otherwise I'll stay for another week or so | |||
but either way this will pass | |||
and it looks like no complication will follow | |||
cmarcelo | @tell putter Do you have a blog bit? What do you think about posting on project status, etc? | ||
lambdabot | Consider it noted. | ||
audreyt | I guess I should be grateful :) | ||
but it's still highly annoying :) | 02:02 | ||
cmarcelo | audreyt: great news :) [except for annoyances now/next week] | ||
02:04
justatheory joined
|
|||
audreyt | yeah :) *faints some more* | 02:04 | |
araujo | hospital? | 02:05 | |
cmarcelo | the family of sites is: pugscode.org, feather, dev.pugscode.org, rakudo-wiki, perl.org/p6... (shout if I missed something) | 02:06 | |
blog.p.o and irc.p.o too | 02:07 | ||
audreyt | spec.pugscode.org | 02:09 | |
irc.pugscode.org | |||
run.pugscode.org | |||
invite.pugscode.org | 02:10 | ||
smoke.pugscode.org | |||
that's it I guess | |||
cmarcelo | rakudo is "Perl 6" oriented and dev wiki is "Pugs" oriented? | 02:12 | |
audreyt | or rather, rakudo is user-facing | ||
while dev I hope is dev-facing | |||
i.e. it more closely integrates svn/tickets/irc | |||
while rakudo is more about presenting userland info | 02:13 | ||
cmarcelo | perl.org/perl6 is parrot-oriented? | ||
audreyt | it is history-oriented :) | 02:14 | |
cmarcelo | k | ||
02:19
dmq joined
|
|||
cmarcelo | Pugs development takes place on the perl6-compiler mailing list. | 02:21 | |
:o) | |||
(quotation from pugscode.org) | |||
TimToady | we just fixed that on the wiki | 02:22 | |
cmarcelo | TimToady: by fixed you mean? (which wiki? rakudo?) | 02:24 | |
TimToady | dev.pugscode.org/wiki/AboutPugs | 02:27 | |
lambdabot | Title: AboutPugs - Pugs - Trac | ||
cmarcelo | tks. I'll borrow the fix then.. | 02:28 | |
TimToady | mostly just delete that paragraph... | ||
cmarcelo | hmm, but Pugs::Doc::Hack points to old (but beautiful) CPAN version... | 02:29 | |
I think it's better to make "How to get involved" point to dev.pugscode.org | 02:31 | ||
[M]erk | Which is the pugs mailing list then? It doesn't look like perl6-compiler is all that active. And perl6-internals seems to be mostly parrot, right? | 02:55 | |
cmarcelo | [M]erk: pugs discussions are mainly at this IRC channel. perl6-compiler is seldom used... | 02:57 | |
svnbot6 | r15267 | cmarcelo++ | * feather index: add links to other sister sites, remove kwiki links. | 03:00 | |
r15267 | cmarcelo++ | * pugscode.org: add more links, cleanup a bit. | |||
r15267 | cmarcelo++ | * Pugs/Doc/Hack.pod: fix some links. | |||
[M]erk | PUGS - the project that email is too asynchronous for! PUGS - the project that is living in the now! PUGS - the project that is... | ||
cmarcelo | [M]erk: but we have a Trac system now => dev.pugscode.org ;) | 03:01 | |
(dev.pugscode.org/changeset/15267 => peer-review welcome) | 03:02 | ||
lambdabot | Title: Changeset 15267 - Pugs - Trac | 03:03 | |
[M]erk gasps. | |||
You know that's a... | |||
[M]erk whispers, "Python project" :p | |||
cmarcelo | s/Trac/development wiki/ | 03:04 | |
=P | |||
[M]erk | Can someone give me svn commit privileges? | 03:06 | |
SamB | [M]erk: shh | ||
they'll hear you! | |||
and give you a commit bit! | |||
lumi | They? | 03:07 | |
[M]erk: Got an email address? You can /msg me | 03:09 | ||
svnbot6 | r15268 | cmarcelo++ | * pugscode.org: fix a linebreak. | 03:12 | |
lumi | [M]erk: Commit bit sent, welcome to Pugs! | 03:14 | |
[M]erk | Thanks. | ||
lumi | [M]erk: You can test commit by adding your name to AUTHORS | ||
It's a tradition, or an old charter, or something :) | 03:15 | ||
SamB | there ought to be a PURPORTED-AUTHORS file for that ;-P | 03:23 | |
04:45
ilogger2 joined
05:08
shay joined
06:02
autark_ joined
06:09
devogon joined
06:13
BooK joined
06:24
amnesiac joined
06:51
sunnavy joined,
jamhed joined
|
|||
svnbot6 | r15269 | lwall++ | Constraints on which operators metaoperators can metaoperate on. | 07:22 | |
07:26
iblechbot joined
07:28
marmic joined
07:45
kanru joined
07:54
VanilleBert joined
|
|||
Grrrr | audreyt: thanks :-) | 07:58 | |
08:19
keigo joined
08:49
kanru joined
08:56
baest joined
08:57
baest_ joined
08:58
nekokak joined
09:13
VanilleBert left
09:21
UWC joined
09:25
andara joined
09:36
lumi_ joined
09:40
iblechbot joined
|
|||
dduncan | So if Grrrr is indeed Anton Berezin ... I like the idea of what I saw in DBIx::Perlish ... and am hoping that you may be able to do something similar for my non-SQL database, currently called QDRDBMS, after it is released. | 09:57 | |
For simplicity, my own DBMS takes certain Perl objects, holding an AST, as input, but it might be nice to have an even more Perlish interface as an optional extension, such as like you did with DBIx::Perlish. | 09:58 | ||
Talk more later ... | 09:59 | ||
09:59
pfarmer joined
10:06
elmex joined
|
|||
Grrrr | dduncan: I don't see a problem; parsing optree is relatively easy, thanks to B.pm; the difficult part is to come up with a sensible subset of Perl5 syntax to support | 10:31 | |
dduncan | fortunately, the language of my DBMS is a lot easier to map Perl to than SQL is, given partly that its design is more Perlish | 10:38 | |
or should I say, more like a normal programming language | |||
10:42
lichtkind joined
10:47
ruoso joined
11:06
zgh_ joined
|
|||
Aankhen`` | Wow. Just built Pugs again after a year (I think!), and `nmake fast` really is *fast*. | 11:47 | |
I suppose it could also be the completely different system. :-P | 11:48 | ||
12:05
koye joined
|
|||
ruoso | @seen fglock | 12:06 | |
lambdabot | I saw fglock leaving #perl6 5d 17h 54m 48s ago, and . | ||
ruoso | hmm... that's unusual... | ||
12:07
chris2 joined
12:08
ofer1 joined
12:18
gaal joined
12:23
TimToady joined
12:25
rfordinal joined
12:34
lichtkind_ joined
|
|||
baest_ | ruoso: not sure, but I think he has a vacation | 13:06 | |
13:14
buetow joined
13:21
Limbic_Region joined
13:32
baest_ is now known as baest
13:42
andara joined
13:59
rfordinal_ joined
14:00
bonesss joined
14:14
pmurias joined,
thepler joined
|
|||
pmurias | hi | 14:14 | |
moritz | hi ;) | 14:15 | |
pmurias | in the mmd algorithm, is the type narrowness dependent on the parameter under consideration | 14:16 | |
? | |||
moritz: is saying "hi" before asking a question silly? | 14:17 | ||
moritz | pmurias: no, it's not ;) | 14:18 | |
pmurias is thinking how to efficiently implement mmd | 14:24 | ||
14:24
bonesss is now known as bones`eat
14:29
diakopter joined
14:36
rfordinal_ is now known as rfordinal
|
|||
pmurias | do other languages with MMD allow you to specify the importance of parameters with semi-colons? (not necessarily with the same syntax) | 14:40 | |
? | |||
14:42
rindolf joined
14:54
amnesiac joined
14:58
vel joined
|
|||
Coke_ | I don't think parrot works that way, fwiw. | 15:00 | |
15:12
iblechbot joined
|
|||
[particle] | i know some languages dispatch based on first one or two params only, but none that use a qualifier to denote dispatch semantics | 15:20 | |
however, my knowledge of languages with polymetric polymorphism (aka mmd) semantics is incomplete | |||
pmurias | thanks | 15:22 | |
15:23
sunnavy joined
15:24
bones`eat is now known as bonesss
15:30
GeJ joined
15:37
ofer1 joined
15:45
andara joined
17:11
rfordinal_ joined
17:17
cjeris joined
17:23
buetow joined
17:46
andara left
17:48
dmq joined
|
|||
svnbot6 | r15270 | lwall++ | Constraint checks on parameter zones | 17:48 | |
r15270 | lwall++ | Constraint checks on reducable infixes | |||
r15270 | lwall++ | Random cleanup | |||
dmq | just thought id mention that abigail has successfully converted the BNF for email addresses to a perl5.10 recursive regex. | 17:49 | |
pasteling | "dmq" at 84.58.61.90 pasted "rfc compliant regex parser as a perl 5.10 recursive regex" (69 lines, 2.6K) at sial.org/pbot/23004 | 17:50 | |
broquaint | Brilliant. | 17:51 | |
That's some nice work you've done on the regex engine there, dmq. | 17:52 | ||
dmq | he said on #p5p he will look into writing a BNF to regex converter. | ||
glad you like it broquaint. | 17:53 | ||
btw, long time no chat. hope you are well | |||
broquaint | I'm well, yourself? | 17:54 | |
dmq | not too bad. | ||
broquaint | All is well :) | 17:56 | |
17:58
xinming joined
|
|||
[particle] | great example, dmq | 18:01 | |
i'd like to see a p6regex -> p5regex converter, and attribute handlers as the syntax marker | |||
dmq | yeah, it is cool. Abigail has been doing some testing of the new features. | 18:02 | |
[particle] | he's certainly qualified :) | ||
dmq | heh | ||
did i say testing? | |||
torturing. | |||
[particle] | abigail++ | ||
dmq | anyway, hopefully ive put most of what is required to write such a converter in place. | 18:03 | |
assuming you ignore code stuff in p6. | |||
[particle] | fglock will be most happy, i imagine | 18:04 | |
dmq | heh. | ||
until he finds the stuff that p5 does that p6 doesnt ;-) | |||
(?|...) comes to mind. | 18:05 | ||
although maybe p6 does already have something like that. (it makes capture buffers in different alternations share the same indexes) | 18:06 | ||
[particle] | i don't see that construct in perlre. ew to 5.10? | ||
*new | |||
dmq | yes. a couple of weeks old i guess. | 18:07 | |
[particle] | ah | ||
dmq | (?|..(foo)..|..(bar)..) both capture into the same buffer. | ||
H. Merijn Brand's idea. | 18:08 | ||
[particle] | $<buffer>:=[(foo)|(bar)] i imagine | ||
18:08
araujo joined
|
|||
dmq | heh. figured p6 would cover it somehow. | 18:08 | |
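For readers without a 5.10 build at hand, the problem that (?|...) addresses can be seen in any engine that lacks it. A minimal sketch, using Python's standard re module purely as a stand-in (the helper name `captured` is illustrative and not from the discussion): with ordinary alternation each parenthesised group gets its own index, so the caller has to work out which one was filled.

```python
import re

# Ordinary alternation: each parenthesised group gets its own index,
# so the branch that matched decides which group is populated.
pattern = re.compile(r"(foo)|(bar)")


def captured(text):
    """Return the text captured by whichever branch matched, or None."""
    m = pattern.search(text)
    if m is None:
        return None
    # Coalesce by hand what a shared capture buffer would give for free.
    return m.group(1) if m.group(1) is not None else m.group(2)


foo_groups = pattern.search("xx foo").groups()
bar_groups = pattern.search("xx bar").groups()
```

With a shared buffer, as dmq describes for (?|..(foo)..|..(bar)..), the hand-written coalescing step would not be needed.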
18:10
justatheory joined
18:12
gilimanjaro joined
18:17
apostols joined
18:25
gilimanjaro joined
|
|||
Coke_ | f | 18:30 | |
(oops) | |||
TimToady | dmq: (?|...) is the standard behavior for alternations under P6 | 18:36 | |
S05: The index of a given subpattern can always be statically determined, but | 18:37 | ||
is not necessarily unique nor always monotonic. The numbering of subpatterns | |||
restarts in each lexical scope (either a regex, a subpattern, or the | |||
branch of an alternation). | |||
wolverian | argh, the lack of covariance in java is making me insane. can someone please just destroy this horrid crap. | 18:40 | |
thanks, now I feel better. | |||
Patterner | Wait for Java 8 | 18:44 | |
18:51
DebolazX joined
|
|||
wolverian | I particularly like how it does support return type covariance, but not parameter covariance, making the other practically useless (for my purposes, anyway) | 18:53 | |
the return type covariance needs to be explicit, too. sigh. | 18:54 | ||
18:54
apostols left
18:59
drupek12167 joined
19:03
UWC joined
|
|||
dmq | timtoady: oh goodie, then fglock WILL be happy about (?|...) | 19:05 | |
what happens when there is a different number in one of the alternations? | 19:06 | ||
(?|..(foo)...(foo)...|..(bar)..)(baz) | |||
what number will the baz buffer have? | |||
i made it be $3 | |||
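The numbering rule dmq describes for his Perl 5 implementation can be written down as a small sketch. The function name and its parameters are hypothetical; the only assumption taken from the conversation is that the group following the construct is numbered as if the construct held as many captures as its largest branch.

```python
def next_group_after_branch_reset(captures_per_branch, groups_before=0):
    """Index of the first capture group that follows a (?|...) construct.

    captures_per_branch: number of capture groups in each alternative.
    groups_before: capture groups opened before the construct.
    """
    widest = max(captures_per_branch, default=0)
    return groups_before + widest + 1


# (?|..(foo)...(foo)...|..(bar)..)(baz): branches hold 2 and 1 captures.
baz_index = next_group_after_branch_reset([2, 1])
```

For the pattern above this yields 3, matching dmq's "$3".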
Coke_ | java 8 or Java 1.8? =-) | ||
Coke_ suggests being all hip and having Perl 6 actually be perl 5.24 | 19:07 | ||
dmq | btw, sorry for the lag timtoady. | ||
19:15
cddar joined,
Caelum joined
19:16
wilx` joined
19:37
bernhard joined
19:41
larsen_ joined
19:48
wilx` is now known as wilx
19:54
rindolf joined
|
|||
rindolf | Hi all. | 19:54 | |
moritz | re rindolf ;) | 19:56 | |
rindolf | Hi moritz | 19:57 | |
moritz: what's up? | |||
moritz | rindolf: I'm fine, just had a bunch of pancakes as supper ;)) | 19:59 | |
moritz feels fat ;) | |||
rindolf: but regarding perl6/pugs: not much :( | |||
rindolf | moritz: do you feel fat or do you feel full? | 20:00 | |
moritz | rindolf: rather full than fat ;) | 20:01 | |
rindolf | moritz: OK. | 20:02 | |
20:02
UWC joined
|
|||
moritz | and I'm trying to do some web apps with catalyst... | 20:04 | |
it's very confusing, too many different files | |||
specbot6 | r13587 | larry++ | Split statement_modifier category in two. | 20:07 | |
r13587 | larry++ | List comprehensions can now be done with statement modifiers. | |||
r13587 | larry++ | Multiple dispatch now explained in terms of topological sort. | |||
r13587 | larry++ | Multiple dispatch with single semicolons clarified, maybe. However, multis | |||
r13587 | larry++ | with single semicolon are likely just a reserved syntax in 6.0.0. | |||
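The commit message only states that multiple dispatch is now explained in terms of a topological sort; it does not give the ordering itself. As a rough illustration of the idea, and not the spec's actual rules, here is a sketch that assumes one candidate is "narrower" than another when every parameter type is a subclass of the corresponding one and at least one differs. The classes `Animal` and `Dog` and both function names are made up for the example.

```python
def is_narrower(a, b):
    """True if signature a is at least as specific as b everywhere
    and strictly more specific in at least one position (assumed rule)."""
    return (
        len(a) == len(b)
        and all(issubclass(x, y) for x, y in zip(a, b))
        and any(x is not y for x, y in zip(a, b))
    )


def dispatch_levels(candidates):
    """Topologically sort signatures into levels; level 0 holds the
    candidates that no other candidate is narrower than."""
    remaining = list(candidates)
    levels = []
    while remaining:
        level = [
            c for c in remaining
            if not any(is_narrower(o, c) for o in remaining if o is not c)
        ]
        levels.append(level)
        remaining = [c for c in remaining if c not in level]
    return levels


class Animal:
    pass


class Dog(Animal):
    pass


levels = dispatch_levels(
    [(Animal, Animal), (Dog, Animal), (Animal, Dog), (Dog, Dog)]
)
```

Candidates that land in the same level, such as (Dog, Animal) and (Animal, Dog) here, are the ones a tie-breaking rule would have to resolve.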
TimToady | dmq: no, it's actually $1 in P6, short for $/[1] | ||
the others would be $/[0][0] | |||
$0[0] for short | |||
maybe $/[0;0] works too | 20:08 | ||
20:09
stevan_ joined
|
|||
dmq | ah right | 20:10 | |
rindolf | TimToady: are you larry in the previous commit? | ||
TimToady: or are you lwall? | |||
dmq | is that an xor or an or or? | 20:11 | |
TimToady | I'm larry on perl.org and lwall on pugscode.org. | ||
nobody's ever accused me of being consistently consistent | |||
dmq | er, actually, "ah right" was the wrong thing to say. better would have been "oh really. umm ok, i probably havent read something i should have" | 20:13 | |
:-) | |||
TimToady | I hear S05 comes highly recommended. :) | 20:18 | |
masak | mm, @evens = ($_ * 2 if .odd for 0..100); | 20:22 | |
nifty | |||
also quite readable | |||
TimToady | and just falls out of existing syntax, basically | 20:25 | |
course if you want to have multiple lists, you have to get fancier | |||
rindolf | masak: I get a syntax error in r15257 | 20:26 | |
masak: this expression seems Pythonic. | 20:27 | ||
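rindolf's comparison can be made concrete. A rough Python counterpart of masak's expression, assuming that `.odd` tests the current element for oddness and that `0..100` includes both endpoints:

```python
# Keep the odd numbers from 0 to 100 inclusive and double each one,
# mirroring ($_ * 2 if .odd for 0..100).
evens = [n * 2 for n in range(0, 101) if n % 2 == 1]
```

Note that the filter comes last in Python but sits in the middle of the Perl 6 form, where it reads as an ordinary statement modifier.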
dmq | ok, ill read it up. | ||
i thought i already did. but obviously theres a lot of material. | |||
btw, do you have an suggestions for how to do char class set operations in perl5? | 20:28 | ||
rindolf | TimToady: did you invent TAP (that "^ok" "^not ok" syntax)? | ||
dmq | ive thought of introducing an "extended char class notation" as (?[....]....) | ||
like (?[a-z]-aeiou) | |||
but its kinda fugly. | 20:29 | ||
Coke_ | "The basis for the TAP format was created by Larry Wall in the original test script for Perl 1" - from search.cpan.org/~petdance/TAP-1.00/...pm#AUTHORS | 20:30 | |
lambdabot | Title: TAP - The Test Anything Protocal - search.cpan.org | ||
rindolf | Coke_: thanks. | ||
20:30
rashakil_ joined
|
|||
TimToady | dmq: if you want to do it more p6-like, you'd say something like (?+[a-z]-[aeiou]) | 20:35 | |
which also allows for Unicode properties as names | 20:36 | ||
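Until an engine offers set operations on character classes directly, subtraction can be emulated with a negative lookahead. A minimal sketch in Python's standard re module, used here only as a stand-in for an engine without the proposed syntax; it expresses "a-z minus aeiou":

```python
import re

# "[a-z] minus [aeiou]": refuse a vowel first, then accept any lowercase letter.
consonant = re.compile(r"(?![aeiou])[a-z]")

found = consonant.findall("perl six")
```

The lookahead costs an extra check per position, which is one reason a dedicated notation such as the ones discussed here is attractive.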
mugwump | git.catalyst.net.nz/gitweb2?p=perl....3;f=t/TEST # first Harness :) | ||
lambdabot | Title: git.catalyst.net.nz Git - perl.git/blob - t/TEST, tinyurl.com/25bk2x | ||
rindolf | TimToady: wanna see some Lisp code I wrote in vim? | ||
20:36
[particle] joined
20:37
nipra joined
|
|||
TimToady | I am not so in love with either Lisp or vim that I would be crushed to miss it. :) | 20:38 | |
what does it do? | |||
dmq | (?+...) sounds interesting. | ||
not sure if its been grabbed for something tho. | 20:39 | ||
rindolf | TimToady: calculates the Graham function. | ||
TimToady: it's a port of my code to the advanced Perl Quiz-of-the-Week No. 8, IIRC. | |||
TimToady: don't you use vi? | |||
dmq | dang, (?+...) is taken. | 20:40 | |
relative recursion. | |||
TimToady | should have read S05 first... :) | ||
dmq | that sucks. | ||
i did. | |||
theres a LOT of stuff in there. | |||
rindolf | opensvn.csie.org/shlomif/programs/l...-function/ just in case. | ||
lambdabot | Title: Revision 975: /programs/lisp/trunk/graham-function, tinyurl.com/2sl8uf | 20:41 | |
dmq | and in the regex engine itself. keeping both in mind at the same time is kinda hard. :-) | ||
TimToady | yes, but you should realize that <...> is the exact analog of (?...) | ||
dmq | yes, i like that. | ||
shay | hello folks | ||
rindolf | Hi shay | 20:42 | |
TimToady | howdy | ||
shay | hi shlomif | ||
larry | |||
dmq | hrm, maybe we should change relative recursion so its (?&+1) and (?&-1) | ||
rindolf | shay: what's up? | ||
[particle] | too late to change (?+...) to (?@...) or something else? | ||
shay | rindolf, one week left :) | ||
rindolf | TimToady: I also have a Perl 6 version. | ||
shay: to what? | |||
shay | to get released | ||
chafshash | |||
dmq | its +/- for a specific reason. | ||
rindolf | shay: from the Army? | ||
shay | yeah | ||
rindolf | shay: is it the end of your service? | ||
shay | yes | 20:43 | |
rindolf | shay: or just a vacation? | ||
shay: nice. | |||
shay: congratulations. | |||
shay | end of the never-ending service | ||
rindolf, thanks | |||
I'll finally have time to work on the sparc port | |||
rindolf | shay: SPARC port to what? | ||
shay: SPARC port of what? | 20:44 | ||
shay | perl6 | ||
pugs/parrot | |||
rindolf | shay: you mean it doesn't run on SPARC atm? | ||
shay: doesn't ghc run on SPARC? | 20:45 | ||
shay | rindolf, it *should* run, but I want to officially maintain it | ||
rindolf | shay: ah OK. | ||
shay | rindolf, test every release, make some test near-daily | ||
rindolf | shay: do you have a SPARC at home? | ||
shay | rindolf, yeah, a Sun Ultra10 | 20:46 | |
moritz is jealous ;) | |||
TimToady | dmq: I would suggest that character class sets are probably more important and want a shorter Huffman coding | ||
shay | rindolf, yba gave it to me :) | ||
rindolf | shay: Yonathan Ben-Avraham? | ||
shay | hello moritz :) | ||
rindolf, yes | |||
rindolf | shay: OK. | ||
shay: nice. | |||
shay | he sent a mail to linux-il asking if someone wanted it, I replied | 20:47 | |
then he asked me why should he give it to *me* | |||
[particle] | shay: we'll be happy to add you as a parrot porter for sparc | ||
shay | I told him that I want to make some code portability tests and learn the architecture in general | ||
next mail was: "when are you able to pick it up?" | |||
dmq | timtoady: i think i agree. | ||
no, i agree. | |||
shay | [particle], give me a week | 20:48 | |
[particle], I need to get home, I'm in the army now | |||
[particle] | yes, i see that | ||
shay | [particle], is someone working on that port atm? | ||
[particle] | no | ||
shay | great | ||
I'm doing my work on NetBSD | |||
[particle] | be great to have a smoker setup | ||
shay | will have | 20:49 | |
[particle] | shay++ | ||
join us on #parrot (irc.perl.org) when you're ready | |||
dmq | its just a pain to do. | ||
20:49
czth__ joined
|
|||
dmq | as im sure you recall from regcomp.c :-) | 20:49 | |
rindolf | shay: what do you do at the IDF? | ||
dmq | . o O ( If he told you he would have to kill you ) | 20:50 | |
shay | rindolf, fields intelligence combatant | ||
rindolf | shay: I see. | ||
dmq | hah! | ||
shay | dmq, kind of :) | ||
20:57
pbuetow joined
|
|||
lichtkind_ | forgive me for repeating but what is the current main topic of change? | 20:59 | |
TimToady | do you mean, what are we working on the hardest right now? | 21:01 | |
I'm mostly working on svn.pugscode.org/pugs/src/perl6/Per...0.0-STD.pm | 21:02 | ||
other folks are of course working on other things | 21:04 | ||
or are you referring more to design and spec changes? | 21:05 | ||
or to the topic of the channel? | 21:06 | ||
rindolf | TimToady: aren't you missing a =cut there? | ||
TimToady | =cut is gone in Perl 6 | ||
rindolf | TimToady: oh. | ||
TimToady | (though I suspect pugs still recognizes it) | 21:07 | |
rindolf | TimToady: what text editor are you using? | ||
TimToady | but =begin/=end are always supposed to nest right and return you to whatever context was on the outside. | ||
I use vim, but I think some of the syntax that has developed over the years is extremely crufty. | 21:08 | ||
rindolf | TimToady: what syntax? | ||
TimToady | regex, for one | ||
but then, I think that about Perl 5 too... :) | 21:09 | ||
it would be fun to rewrite vim with Perl 6 as its fundamental syntax. | |||
rindolf | p6re | 21:10 | |
TimToady | currently everything in vim is very ad hoc | ||
moritz | sadly, yes | ||
if I imagine the combined power of perl and vim... | 21:11 | ||
(and I don't mean the linked perl interpreter in vim, that's not _so_ powerful) | |||
we could have world dominance ;) | |||
TimToady | syntax hilighting with real grammars, for instance | 21:12 | |
lichtkind_ | TimToady sorry girlfriend asked something, yes i mean spec changes | 21:13 | |
i think thats enough for the first | |||
TimToady | most of the spec changes are from trying to write the grammar | 21:14 | |
but the mmd algorithm has also been on my mind for six months or so | |||
lichtkind_ | TimToady as you may know im writing an editor in perl :) | ||
but perl6 is at the very bottom of the todo :) | 21:15 | ||
but its definitely a dream | |||
tene | Would be nice to have an editor with Perl6 integrated like lisp is in emacs | ||
lichtkind_ | that was one of the reasons i started the project | 21:16 | |
but it was 2002, never heard of perl6 | |||
21:16
lichtkind_ is now known as lichtkind
21:17
dduncan joined
|
|||
lichtkind | of course perl6 is much cooler | 21:17 | |
but i use it as my primary editor | |||
21:17
dduncan left
|
|||
lichtkind | and always try be rock stable | 21:18 | |
moritz | lichtkind: does it have vi-like modes and key bindings? | ||
lichtkind | not yet | ||
i personally think vi modes suck badly but for the advantages they bring i wanted to introduce something similar | 21:19 | ||
but currently we have main topic CPANification | |||
21:19
arcady joined
21:20
dduncan joined
|
|||
lichtkind | to have short commands for editing is very cool | 21:20 | |
but i like it more distinct visually than different modes | |||
i guess if you're interested we can discuss that in another channel | 21:21 | ||
moritz | yeah, it's not the best idea to start an editor flame war ;) | ||
lichtkind | nop because i studied editors a lot and can see good things in all but in the end i plan to make a better one than all together :) | 21:22 | |
similar to larry's standpoint on languages | 21:23 | ||
:) | |||
moritz | lichtkind: when you're finished I promise I'll try it ;) | ||
lichtkind | you know you're never finished :) | 21:24 | |
dmq | everybody wants to make a better editor. | ||
:-) | |||
moritz not ;) | |||
lichtkind: s/finished/published Version 3.0/ ;) | |||
lichtkind | dmq no my involvement was an accident :) | 21:25 | |
moritz thats bad because currently i have 0.3.3.17 and 1.0 contains most i can think of today :) | 21:26 | ||
moritz | lichtkind: that's no problem, you'll get more ideas ;) | ||
lichtkind: no, honestly, I'll try it before, "when I have time[tm]" ;) | 21:27 | ||
lichtkind | yes but the greatest problem is that it cant be a one man show forever | ||
is this another word for never :) | |||
Juerd | lichtkind: Hi | 21:28 | |
moritz | lichtkind: ok, tell me an URL ;) | ||
lichtkind | proton-ce.sf.net | ||
but i highly recommend a nightly build | |||
from web52.xeon225.server4you.de/ | |||
moritz | I highly recommend to offer debian packages ;) | 21:29 | |
lichtkind | hello Juerd glad to see you | ||
Juerd | lichtkind: Is it acceptable for you to leave halfway day 3? | ||
Because otherwise I won't be home in time | |||
21:29
Aankhen`` joined
|
|||
lichtkind | moritz like i said cpanification is on the way, there is a script to make debian packages of it | 21:30 | |
moritz i suspect you have linux | |||
therefore a little bit work :) | |||
Juerd ok i would like to stay a bit so i see you as my backup plan :), but i say in time when i found something else | 21:31 | ||
Juerd | lichtkind: Okay | 21:33 | |
You can always drive along on the way there, of course. | |||
That'll be the 20th | |||
21:33
Psyche^ joined
|
|||
lichtkind | Juerd thanks thats great but from ffm to munich i already have a seat in the car of a friend, after munich he goes to his parents in austria, that's why i need another way back :) | 21:36 | |
dmq | lichtkind you are in ffm? | 21:37 | |
am i going to be driving to munich with you i wonder? | 21:38 | ||
:-) | |||
lichtkind | dmq no god beware, but a girl friend of mine :) | ||
ah and when you drive back? | 21:39 | ||
dmq | ah ffm aint soooo bad. | ||
im not driving back. | |||
train. | |||
Juerd | lichtkind: I see | ||
lichtkind | dmq and which time? | ||
dmq | back? | ||
Juerd | dmq: Destination? | ||
dmq | not sure. | ||
ah, GPW? | |||
oder, DPW | |||
Juerd | dmq: And the other way? | 21:40 | |
dmq | oh, back to ffm | ||
i live here | |||
Juerd | DPW is confusing: could be Dutch or Deutsche ;) | ||
dmq | there | ||
Juerd | What's ffm? :P | ||
moritz | Juerd: Frankfurt/Main | ||
lichtkind | frankfurt | ||
Juerd | I see | ||
dmq | but the weird thing is im driving to munich with someone who afterwards is going to austria to see his parents. | 21:41 | |
Juerd | dmq: I'm almost driving through Frankfurt. If you need a ride... | ||
The thing is that I'm leaving during day 3. Probably missing the last 3 or 4 talks. | |||
lichtkind | dmq i come from the east and from a village, that's why its hard for me to live there more than a week | ||
dmq | ill be going with corion from pm. so i wouldnt worry about me, its very kind to offer tho. | ||
lichtkind | Juerd the second last is interesting to me | 21:42 | |
Juerd | lichtkind: I'm quite interested too, but need to make it back home in time | ||
lichtkind | of course | ||
Juerd | It's 10 hours from Munich to Dordrecht | ||
And I need to be awake the day after | 21:43 | ||
lichtkind | :) | ||
Juerd | So I'm planning on arriving approx 1:00, saturday, then sleeping 7 hours, and waking up at 8 am. | ||
My office will be rewired that day. | |||
Getting 3 * 35 A | 21:44 | ||
Which is a nice upgrade from 2 * 16 A, but requires some additional changes. | |||
(To use it well.) | |||
lichtkind | dmq when you go back at saturday we can join a weekend ticket | ||
dmq | no im going back friday afaik. | ||
unfortunately. sorry mate. | |||
Juerd wonders if there's a GPW irc channel | 21:45 | ||
dmq | actually, i take it back. i have no idea how im getting home. | ||
we can discuss it there im sure. | 21:46 | ||
all i know is i love my bahcard50! | |||
bahncard-50 | |||
Juerd | dmq: Well, you could drive along with me :) | 21:47 | |
I'd like a passenger to keep me awake :P | |||
(And maybe share fuel costs, but that's secondary, if not tertiary) | |||
dmq | sounds interesting, but as im speaking id kinda feel bad bailing early. | 21:48 | |
although ill think about it. | |||
Juerd | Just don't leave before your own talk ;) | ||
dmq | I can definitely sympathise with the desire to have company in the car tho. | ||
21:49
Psyche^ is now known as Patterner
|
|||
Juerd | I've done Berlin -> Dordrecht and Chemnitz -> Dordrecht alone before, and while it's doable, I didn't like that I was by myself. | 21:49 | |
lichtkind | dmq i thought you go train? | 21:50 | |
dmq | i go by car, return by train. | 21:54 | |
i have a feeling the same car you are going in. | |||
:-) | |||
if you are going with strat. | |||
Juerd | Grin | 21:55 | |
lichtkind | äh yes | ||
Juerd | ä :) | 21:56 | |
21:56
iblechbot joined
|
|||
svnbot6 | r15271 | lwall++ | Factored out rule names from all #= comments; preprocessor now expected to | 22:02 | |
r15271 | lwall++ | recognize /^[rule|token|regex] <ident>/ as implicit start of {*} identity. | |||
r15271 | lwall++ | The #= comments now only add to that base identity. | |||
dmq | it would be cool for referencing if synopsis 5 had anchors on the bullet points. | 22:14 | |
:-) | 22:15 | ||
tene | dmq: are you asking for a commit bit? | ||
Juerd | dmq: Patch^WCommits welcome ;) | ||
tene | That kind of talk around here will get you a commit bit if you're not careful. | ||
dmq | is it html or is it generated from pod? | 22:16 | |
Juerd | tene: I expect dmq to already have one. Or two. Or three. :) | ||
dmq: The latter | |||
dmq: But that shouldn't stop you, should it? :) | |||
dmq | heh | ||
Juerd | feather.perl6.nl/syn/ # these we control ;) | ||
lambdabot | Title: Official Perl 6 Documentation | ||
dmq | I guess its a prioritization issue. Theres things very few people beside me can do, and theres other things that lots of folks can do. Which should i choose? | 22:17 | |
22:17
larsen_ joined
|
|||
Juerd | dmq: The most -Ofun things. | 22:17 | |
dmq | heh | 22:18 | |
of the options most of them arent fun. | |||
:-) | |||
im currently trying to make unicode character classes in perl 5.10 use a sane data structure. | 22:19 | ||
i dont even use unicode damnit. | |||
:-) | |||
Juerd | Oh, but you do! | ||
You're sending perfect utf8 sequences to irc ;) | |||
dmq | heh. | ||
Juerd | utf7 too, most of the time | 22:20 | |
Why do you not use unicode? | |||
dmq | because i dont need to. | ||
i was blissfully unaware of unicode until i started hacking the regex engine. | 22:21 | ||
what a shock. :-) | |||
[particle] doesn't use unicode--because it's <<<unamerican>> :) | 22:22 | ||
shamu | uit | ||
moritz | [particle]: actually that would be a good reason to use it ;) | ||
shamu | uit | ||
22:23
shamu joined
|
|||
dmq | its a kind of pet rant for me. IMO a big chunk of unicodes horribleness comes from trying to please everyone. Im not sure its the right approach. | 22:23 | |
moritz | dmq: what would you say is the right approach? | 22:24 | |
shamu | well, bear in mind that UTF-8 is unicode, and the first 7 bits match one-for-one with ASCII -- that's *Am'rken* Standard code for Information Interchange, podner | ||
Juerd | dmq: But do you never use non-ascii? | ||
moritz | dmq: utf-32 with just one char for each possible sign? | ||
dmq | im not sure what the right solution is. | 22:25 | |
SamB | I think a better approach would be to try and please someone | ||
dmq | and in some ways yes, utf32 has a bunch of advantages that are relevant to me. | ||
but i think it comes down to what samb said somewhat. I mean unicode does everything, including dead languages. | 22:26 | ||
shamu | fyi, I've been trying to figure out what 'Unicode' actually is for a long time | ||
moritz would like to see a generic utf-2**n (for any integer n ;) *duck* | |||
Juerd | dmq: Does it matter that it includes dead languages? | ||
shamu | any recommendations? The only intelligible one I ever found was www.joelonsoftware.com/articles/Unicode.html | ||
lambdabot | Title: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know A ... | ||
SamB | I wish it tried to do something right | ||
Juerd | shamu: That's a good one. | 22:27 | |
dmq | my view is human communication evolves just like the rest of our cultural institutions and we shouldnt waste computer cycles because somebody might want to use ancient greek. | ||
Juerd | shamu: See also wikipedia. And if you want to use it with Perl, perlunitut. | ||
SamB | rather than trying to do everything, except leaving it all half-done | ||
lichtkind | svn.pugscode.org/pugs/src/perl6/Per...0.0-STD.pm is this perl6 metadata es part of the interpreter | ||
Juerd | dmq: Ancient Greek needs to be digitized too, if we are to preserve things. | ||
dmq | unfortunately im also a native english speaker, so people tend to think im being an english bigot when i say stuff like that. but i mean seriously. | ||
sure digitize it. just dont make every program everywhere pay the price. | 22:28 | ||
Juerd | dmq: And it's great that we can put it all in one encoding. | ||
shamu | well, computers are now multicore at 2Ghz plus -- what's the point, if not to have some of those cycles to waste on reading and publishing something the ancient greeks figured out a long time ago, so we can learn from their mistakes and hopefully culturally evolve | ||
Juerd | dmq: If only for web pages, it's useful. | ||
SamB | also, mathematicians needed those letters anyway | ||
dmq | sure there are places where unicode is useful. | ||
like for a web browser. | |||
but for an os? | |||
shamu | also, isn't every ascii program automatically utf-8? | 22:29 | |
dmq | or a programming language? | ||
masak | other recommendations: www.tbray.org/ongoing/When/200x/200...06/Unicode | ||
Juerd | dmq: Well, programming languages deal with the web, too. | ||
lambdabot | Title: ongoing · On the Goodness of Unicode | ||
masak | www.tbray.org/ongoing/When/200x/2003/04/26/UTF | ||
Juerd | So if it's useful for the web, it's automatically useful for programming languages too. | ||
lambdabot | Title: ongoing · Characters vs. Bytes | ||
masak | www.tbray.org/ongoing/When/200x/200...e-and-Ruby | ||
lambdabot | Title: ongoing · Unicode and Ruby, tinyurl.com/yoz8sp | ||
dmq | shamu: have you looked at what is required to read a utf8 stream even if it is only ascii? | ||
shamu | well, if it is only ascii, can't you just say 'getchar(); if high bit set, abort reading stream'? | 22:30 | |
I mean 7-bit ascii | |||
dmq | ascii is 7 bit | ||
Juerd | Very few texts that I encounter are pure ascii. | 22:31 | |
dmq | utf8 is exactly equivalent to ascii for code points 0-127 | ||
Juerd | Most of the time it's utf8, second place goes to latin1. | ||
dmq | juerd sure: but dont you think that a fixed wisth 16 bit encoding would be sufficient for all your needs? | ||
Juerd | But I deal with iso-8859-15 and -3, and windows-125x, and koi8-r, too. | ||
dmq: I personally don't care if Perl *internally* uses 16 bit or not. | 22:32 | ||
dmq | i can see the utility in unicode, but using it as a standard internal encoding doesnt make sense to me. | ||
SamB | you would prefer what? | ||
Juerd | dmq: But when outputting things, I like to use utf8, because then I can still use it on an old fashioned terminal. | ||
dmq | note the misspelled "fixed width" | ||
SamB | no standard internal encoding? | ||
all programs having to deal with things in unspecified encodings? | 22:33 | ||
Juerd | SamB: My personal preference for perl's internal string encoding is bitwise negated utf8. | ||
dmq | yes, use a kludge to work around a kludge. | ||
SamB | and files people send you often not working? | ||
dmq | im not saying there are easy solutions or that unicode isnt the least worst. | ||
but i dream of a better world :-) | |||
SamB | oh, by all means unicode is horrible | 22:34 | |
dmq | SamB: I bet it is extremely rare to find a document that contains all 100k letters in it. | ||
Juerd | SamB: There must be some internal encoding, but as long as Perl knows which one it is, and how to convert it to the other encodings, all are fine with me. | ||
SamB | but is currently the least-horrible available thing of its kind | ||
Juerd really wants his ~utf8, but lacks C fu :( | |||
(and tuits) | 22:35 | ||
dmq | anyway, im probably biased, as what ive been coding tends to require efficient random access to characters, which most unicode encodings dont allow. | ||
SamB | ah. | ||
try UTF32. | |||
[particle] | juerd: why ~? | 22:36 | |
Juerd | [particle]: So the mistake of forgetting to encode your output is clearly visible. | ||
dmq | yes, ill just recode perl5 to use utf32 internally. :-) | ||
[particle] | win32 encodes files in ucs2 iirc | ||
Juerd | [particle]: As would the mistake of not decoding. | ||
[particle] | juerd: should be really easy on parrot | 22:37 | |
allbery_b | hm, isn't ucs2 deprecated? | ||
SamB | Juerd: that is what typesystems are for | ||
Juerd | [particle]: If Perl 6 has strings the way I think they will be, that won't be necessary. :) | ||
dmq | before xp it did, xp and later it does utf-16 | ||
en.wikipedia.org/wiki/UTF-16 | |||
Juerd | SamB: Yes, but Perl 5 doesn't have them, and does need to support BOTH unicode and text. | ||
allbery_b | that makes more sense | ||
Juerd | s/text/binary/ | ||
22:37
gnuvince_ joined
|
|||
SamB | isn't utf-16 only a little different from ucs2? | 22:37 | |
in practice? | |||
dmq | yes | ||
[particle] | yes | ||
dmq | no surrogate pairs | 22:38 | |
SamB | about as different as utf-8 and ascii? | ||
Juerd | No, much less different. | ||
SamB | except for the small detail of there not being terribly many characters in Unicode that don't fit in two bytes? | 22:39 | |
dmq | no, ascii and utf8 are very different. | 22:40 | |
oh well, ok, yes, ascii->utf8 fixed width/variable width, ucs2->utf-16 fixed width/variable width. | 22:41 | |
whatever. | 22:42 | ||
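The UCS-2 versus UTF-16 difference discussed above can be seen in a minimal Python sketch. The helper name `utf16_units` and the two sample characters are illustrative choices, not anything from the channel; the point is only that a character outside the Basic Multilingual Plane needs two 16-bit units (a surrogate pair), which fixed-width UCS-2 cannot express.

```python
def utf16_units(s):
    """Return the 16-bit code units of s in UTF-16 (big-endian, no BOM)."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

bmp_char = "\u20ac"          # EURO SIGN, inside the BMP: one unit, same as UCS-2
astral_char = "\U0001d11e"   # MUSICAL SYMBOL G CLEF, outside the BMP

bmp_units = utf16_units(bmp_char)
astral_units = utf16_units(astral_char)

# Both units of the pair fall in the reserved surrogate range D800..DFFF.
pair_is_surrogates = all(0xD800 <= u <= 0xDFFF for u in astral_units)
```

Here `bmp_units` holds one unit and `astral_units` holds two, so indexing by 16-bit unit is no longer the same as indexing by character once such characters appear.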
22:42
ProperNoun joined
|
|||
SamB | also, both utf-8 and utf-16 use what were unassigned codepoints for their multi-word character codings, yes? | 22:43 | |
dmq | if i get the question yes for utf8, pass on utf-16. | 22:44 | |
SamB | I'm pretty sure that the codepoints corresponding to the words in surrogate pairs were previously unassigned | 22:46 | |
allbery_b | actually I'd claim utf8 is not very different from ASCII, because ASCII only defines code points 0-127 | ||
diotalevi | ASCII? No, I thought that defined all of 0 -> 255. | 22:48 | |
dduncan | no, ASCII is 7-bit | ||
Juerd | ASCII is 7 bit. It cannot have anything > 127 | ||
Not without compression, at least :) | |||
dduncan | and UTF-8 is identical to ASCII for codepoints 0..127, afaik, which is part of its appeal | 22:49 | |
Juerd | diotalevi: There are ascii-compatible encodings like iso-8859-1, cp437, etcetera, that have 255 characters. | ||
dmq | we are talking about the merits of an encoding tho. | ||
diotalevi | dduncan: er, minus some of the control character parts of ASCII. I thought that differed slightly. | ||
dmq | so the relevance of codepoint equivalency is kinda a separate issue | ||
Juerd | diotalevi: No, 0..127 are fully equal in ascii and utf8 | 22:50 | |
dduncan | if you're going to use unicode, which I recommend to be the default, I would say that UTF-8 is the best default bet | ||
its other advantages include being byte order independent | |||
[particle] | ascii is a codeset and an encoding, so it's hard to speak about clearly | ||
s/codeset/charset/ | |||
dduncan | and relatively compact | ||
dmq | and reading ascii is not identical to reading utf8 unless you know in advance that you are really dealing with ascii. | 22:51 | |
dduncan | also, UTF-8 is encoded such that you can start in a text stream at any byte and you can easily tell where the character boundaries are | ||
Juerd | dmq: Can't you read in ascii, and upgrade to utf8 when you encounter the first high bit? | 22:52 | |
dmq | so if you dont know, or you are dealing with characters outside of ascii you have to do the clumsy read and scan of utf8. | ||
dduncan | which helps reliability | ||
dmq | juerd: im kinda inclined to think that encoding should be a problem the coder deals with. only they have the information to make the right decision. | ||
diotalevi | Say, 127 isn't defined in Unicode but is DEL in ASCII. Are you sure that one is the same? | 22:53 | |
dmq | utf8 is /defined/ to be ascii for lowbit bytes (in a wellformed utf8 string) | ||
[particle] | transcoding is definitely a user issue. but major encodings should be supported in core ops/libraries | ||
diotalevi | er, wait. I was reading the wrong line. | ||
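The claim that code points 0..127 are byte-for-byte identical in ASCII and UTF-8, DEL included, can be checked directly. This is a small Python sketch using only the standard codecs; the variable names are illustrative.

```python
# Every code point in 0..127 encodes to the same single byte in both encodings.
for cp in range(128):
    ch = chr(cp)
    assert ch.encode("ascii") == ch.encode("utf-8") == bytes([cp])

# DEL (U+007F) is a defined control character in Unicode and is one byte in UTF-8.
del_bytes = "\x7f".encode("utf-8")

# Anything above 127 becomes a multi-byte sequence whose bytes all have the
# high bit set, so it can never be mistaken for an ASCII byte.
e_acute = "\u00e9".encode("utf-8")
high_bits_set = all(b & 0x80 for b in e_acute)
```

This is also why a reader that only expects ASCII can bail out on the first byte with the high bit set, as suggested earlier in the conversation.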
22:53
sunnavy joined
|
|||
dmq | particle: yes i agree pretty much. | 22:54 | |
dduncan: that is true. | 22:55 | ||
about finding the boundaries from a given point. but finding boundaries doesnt replace the fact you cant do random access. | 22:56 | ||
dduncan | if you know a stream is UTF-8, then you can do random access | ||
SamB | utf-8 probably sucks as an in-memory representation | ||
dmq | dduncan: how do you reckon. | 22:57 | |
SamB | but not so bad for an on-disk encoding for programs, usually... | ||
22:57
Psyche^ joined
|
|||
dmq | you need to do a linear scan. | 22:57 | |
22:57
elmex joined
|
|||
dduncan | the bit patterns of utf-8 characters are such that you can recognize just from looking at no more than 6 consecutive bytes where the character boundary is | 22:57 | |
dmq | heh. i wonder, maybe some of those old algorithms for tape would be useful. | ||
dduncan: that means you scan. | 22:58 | ||
dduncan | but you don't scan from the start of the string, which is my point | ||
a handful of bytes is nothing | |||
dmq | there is no way without scanning to say "jump to the 10th boundary from here" | ||
allbery_b | you can spot *a* character boundary but ==dmq | ||
dduncan | its when you have to start at the beginning of the string to know how to interpret the characters you get to correctly, which is the problem | 22:59 | |
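Both sides of this exchange can be sketched in a few lines of Python. The function name `char_start` and the sample string are assumptions made for illustration, and the code assumes well-formed UTF-8. Continuation bytes always have the bit pattern 10xxxxxx, so from an arbitrary byte you only need to step back a few bytes to find the start of the current character (at most 3 under the current 4-byte definition of UTF-8). What this does not give you is the character's index in the string, which is dmq's point.

```python
def char_start(buf: bytes, pos: int) -> int:
    """Return the index of the first byte of the character containing buf[pos]."""
    # Continuation bytes look like 10xxxxxx; ASCII and lead bytes never do.
    while pos > 0 and (buf[pos] & 0xC0) == 0x80:
        pos -= 1
    return pos

text = "na\u00efve \u20acuro"      # contains a 2-byte and a 3-byte character
buf = text.encode("utf-8")

start_inside_i_diaeresis = char_start(buf, 3)   # second byte of the 2-byte char
start_inside_euro = char_start(buf, 9)          # last byte of the 3-byte char
start_on_ascii = char_start(buf, 4)             # an ASCII byte is its own start
```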
allbery_b | on the flip side, 32-bit chars are always fast to index but slow to do anything else with (see Haskell [Char]) | 23:00 | |
dduncan | while "go to 10th character" is needed for some apps, many apps don't require you to do that, such as things with data interchange or network operations | ||
Juerd | dduncan: "If you know a stream ..., random access ..." That's the problem: utf8 only makes sense as a stream. You need to scan it. Therefore, there's no sane way to do random access, unless you keep an offset map. | ||
allbery_b | (well, they also have indexing issues because it's [Char] instead of Array ... Char) | ||
dduncan | for the apps that need to do this, you can transcode it to UCS32 for internal use | ||
Juerd | (For *huge* data, it makes sense to keep an offset map of every 128th byte, or so) | 23:01 | |
dduncan | er, UCS4 | ||
(ucs uses bytes, utf uses bits) | |||
dmq | huh and huh? | ||
dduncan | afaik | ||
lichtkind | night folks, i believe in you ! | ||
dduncan | er, the numbers in the names of UCS count in bytes, in UTF, bits | 23:02 | |
that's what I meant to say | |||
dmq | i thought the ucs names were the old ones | ||
dduncan | they are | ||
utf is more modern, and what I prefer | 23:03 | ||
Juerd hungry. | |||
dmq | right | ||
dduncan ditto | |||
dmq | i dont get the every 128 bytes comment exactly. i probably havent thought about it longer. | ||
long enough | |||
23:04
sunnavy joined
|
|||
dduncan | I don't know the significance of 128 bytes either | 23:04 | |
23:04
Psyche^ is now known as Patterner
|
|||
[particle] | i think he means something to mark the start of a grapheme | 23:05 | |
dmq | ah i see. | ||
[particle] | so you can seek to that position and tell safely | ||
dmq | right. that makes sense. | ||
but then theres the overhead of doing that. | 23:06 | ||
sigh. it all sucks. | |||
:-) | |||
dduncan | so there's a marker for each 128 bytes that says what character number is there? | ||
I think that makes sense | 23:07 | ||
Juerd | dmq: If you, during reading, scan everything and cache the character offset for every 128th byte (rounded up or down to full character boundaries), you can more efficiently locate character N, because you can start scanning at the closest checkpoint. | ||
dmq: As said, this is only beneficial for *huge* data. | |||
allbery_b | yeh, so instead of counting from the start you can pick it up in the middle. tradeoff between overhead of keeping a count and having to step | ||
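The checkpoint idea Juerd describes can be sketched as follows. This is a hedged illustration in Python, not code from Perl or any library: the names `build_checkpoints` and `byte_offset_of_char`, the tuple layout, and the sample data are all assumptions, and the input is assumed to be well-formed UTF-8. One pass records the character index at roughly every 128th byte (rounded up to a character boundary); a lookup then starts scanning from the nearest earlier checkpoint instead of from the start of the string.

```python
import bisect

def build_checkpoints(buf: bytes, step: int = 128):
    """Return sorted (char_index, byte_offset) pairs, about one per `step` bytes."""
    checkpoints = [(0, 0)]
    char_index = 0
    next_mark = step
    for byte_offset, b in enumerate(buf):
        if (b & 0xC0) != 0x80:               # this byte starts a character
            if byte_offset >= next_mark:
                checkpoints.append((char_index, byte_offset))
                next_mark = byte_offset + step
            char_index += 1
    return checkpoints

def byte_offset_of_char(buf: bytes, checkpoints, n: int) -> int:
    """Byte offset at which character number n (0-based) starts."""
    # Pick the last checkpoint whose character index is <= n.
    i = bisect.bisect_right(checkpoints, (n, len(buf))) - 1
    char_index, offset = checkpoints[i]
    # Scan forward from the checkpoint, one character at a time.
    while char_index < n:
        if offset >= len(buf):
            raise IndexError("character index out of range")
        offset += 1
        while offset < len(buf) and (buf[offset] & 0xC0) == 0x80:
            offset += 1
        char_index += 1
    return offset

sample = "a\u00e9\u20ac" * 200               # 1-, 2- and 3-byte characters mixed
sample_buf = sample.encode("utf-8")
sample_checkpoints = build_checkpoints(sample_buf)
offset_of_char_500 = byte_offset_of_char(sample_buf, sample_checkpoints, 500)
```

The table costs memory and an extra pass, which is the overhead the channel weighs against string size; for short strings a plain scan from the start is likely cheaper.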
Juerd | Like, entire books :) | ||
dmq | right right | 23:08 | |
Juerd | (And even then, you should think twice before going through the trouble of implementing all this.) | ||
dmq | no no | ||
im still on character classes in unicode. | |||
no worries. | |||
[particle] | well, if it's static content... just create a lookup table | ||
Juerd | (After all, the (Christian) Bible, fits in 1.44 MB! :P) | ||
allbery_b | I'd actually say it's worthwhile if the string is >4k or so | ||
Juerd | allbery_b: 4 kB already?! | ||
dmq | i cant quite get invert(invert($class)) to work. | ||
Juerd | Nah, I don't think it will. | ||
[particle]: The mapping I referred to *is* a lookup table :) | 23:09 | ||
dmq | i dont suppose anybody has the unicode book that covers inversion lists handy? | ||
allbery_b | of course, most strings are << that, so it's still not much of a win in practice | ||
23:09
sunnavy joined
|
|||
[particle] | juerd: sorry, i meant *store the lookup table | 23:09 | |
Juerd | allbery_b: «? ;) | 23:10 | |
[particle] is distracted by food | |||
Juerd | allbery_b: I think that with a 4 kB string, the overhead of keeping a mapping table is still too large to benefit from it. | ||
allbery_b | *you* try doing unicode through a vnc client on OSX sometime :> | ||
TimToady | dmq: don't get fixated on random access to strings. it's only going to get less important with time. And not even UTF-32 is a fixed width encoding of graphemes, which is what the user really wants to think in terms of anyway. | 23:22 | |
regexes don't really need random access, for instance. nearly all the offsets are very small and relative to your current position. | 23:24 | ||
the quest for a fixed unit of storage to represent characters is misguided in my opinion except as an optimization that is below the abstraction level of the programmer. | |||
very few people complained that substr slowed down when we went with utf-8 in perl 5 | 23:25 | ||
Gothmog_ | That's not necessarily a good argument. | 23:27 | |
TimToady | It's not necessarily a bad argument either. :) | 23:28 | |
allbery_b | it's a "good enough" argument. which, given that you can't have perfection, is not a bad thing | ||
TimToady | the point is that substr and friends aren't all that useful once we start getting away from the punchcard metaphor of text. | ||
nearly all the pattern matching done in Perl is done with regex, | 23:29 | ||
Gothmog_ | I think of UTF-8 vs. some fixed width encoding as a speed vs. memory trade-off. | ||
TimToady | and regex naturally finds boundaries without caring about large offsets | ||
Gothmog_: fine, but that should be below where the typical user is thinking. | |||
which is at the grapheme level, which corresponds to what the user thinks of as a "character". | 23:30 | ||
Gothmog_ | Hm right, but it might be important if some kind of string lookup is O(1) or O(n). | ||
Like that's why we use hashes and not array of pairs. | |||
s/array/$&s/ | |||
TimToady | that's one of the things that a VM is pretty good at optimizing on the fly | 23:31 | |
but I am adamant on the subject that a string position in Perl 6 is *not* *not* *not* an integer. | |||
Gothmog_ | Hm. | 23:32 | |
TimToady | it's one of my hot buttons, in fact | ||
Gothmog_ | What is it that a VM can optimize on the fly, and what do you think should a string position be, if not an int? | 23:33 | |
moritz | so what is it? a pointer? | ||
dmq | not an integer? | ||
TimToady | absolutely not | ||
dmq | wider? | ||
TimToady | integers don't know their units | ||
diotalevi | . o O ( A marker? ) | 23:34 | |
TimToady | yes, basically a marker | ||
dmq | ah ok. a vector. | ||
Gothmog_ | So, you want to distinguish n bytes / n graphemes / n whatever? | ||
dmq | so it wont count code points? | ||
TimToady | if you force it to count in a particular unit, you must make sure it knows the correct units | ||
dmq: by default, no | 23:35 | ||
dmq | interesting. | ||
Gothmog_ | And what happens if you don't enforce a particular unit? | ||
TimToady | the default in Perl 6 is graphemes, and has been from day one | ||
Gothmog_ | That seems to be sane. | ||
moritz | so is 1 grapheme = 1 code point in this context? | ||
TimToady | the default Unicode level is to count by graphemes | ||
dmq | i suppose its true. | ||
TimToady | a grapheme may be several code points | 23:36 | |
dmq | you dont necessarily need to store a full map. | ||
TimToady | a base character plus its combining characters, basically | ||
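The "base character plus its combining characters" description can be illustrated with a rough Python sketch. The function `rough_graphemes` is a hypothetical helper and a deliberate simplification: it only attaches characters with a non-zero combining class to the preceding character, and does not implement the full Unicode grapheme cluster rules (Hangul syllables, ZWJ sequences and so on are ignored).

```python
import unicodedata

def rough_graphemes(s: str):
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for ch in s:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch       # combining mark: attach to previous cluster
        else:
            clusters.append(ch)      # base character: start a new cluster
    return clusters

# "e" + COMBINING ACUTE, then "a" + COMBINING DIAERESIS + COMBINING DOT BELOW
s = "e\u0301a\u0308\u0323"

n_bytes = len(s.encode("utf-8"))
n_codepoints = len(s)
n_graphemes = len(rough_graphemes(s))
```

The same string has three different "lengths" depending on the unit, which is the reason given in the channel for a count needing to know its units.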
that is also why there is no .length method in Perl 6 | 23:37 | ||
Gothmog_ | But if you access the nth grapheme, n is an int, or not? | ||
TimToady | it will grudgingly translate n to a string position, and then try to maintain the abstraction from then on. | ||
dmq | ive been thinking of how to store a trail of positions reached via accepting states from a DFA so that it cant be used intermingled with the backtracking engine. | ||
and you're right, that is all localized small offsets. | 23:38 | |
thanks. thats a useful observation. | 23:39 | ||
TimToady | indeed, I'm an acquaintance of the person who hacked the utf-8 matcher into regexec.c :) | ||
23:40
Psyche^ joined
|
|||
TimToady | so basically Perl 6 has string positions as opaque markers or pointers | 23:40 | |
internally it can be a string plus a byte or codepoint offset, but that's hidden from the user's view. | 23:41 | ||
dmq | right but regexec.c can keep that kind of data on stack. | 23:42 | |
TimToady | and you can force string positions and lengths back to numbers as long as you specify the units | ||
yes, that's internal | |||
dmq | if you mean what i think you mean. | ||
so you can cheat when you end up with easy units like single width chars right? | 23:43 | ||
TimToady | yes, but only when you know it for sure. | ||
dmq | but with a dfa, everything is a codepoint/char. so you end up hypothetically building a scary stack. | 23:44 | |
TimToady | Perl 5's type system is a bit dicey on the subject of knowing such a thing. | ||
dfa's are always scary. :) | ||
dmq | so i was thinking that if its offsets (therefore localized), and run length encoded, then you could do it for a dfa without worrying about the stack blowing up. | ||
the units would be codepoints i guess. | |||
23:45
Psyche^_ joined
|
|||
dmq | im really interested in the idea of making as much of a regex happen using a dfa. | 23:46 | |
23:48
CardinalNumber joined
|
|||
TimToady | everything in S05 about "longest token" is aimed at the same goal. | 23:49 | |
dmq | yeah. | ||
TimToady | but it tends to make more sense for a parser than for a one-shot regex | ||
which are often more efficient with a Boyer-Moore algorithm | |||
dmq | and i noticed that other semantics are chosen to make longest token not be super-expensive when each branch cant be handled via a dfa. | 23:50 | |
TimToady | because the dfa is required to look at every character, and BM isn't | ||
dmq | right. | ||
the dreaded offsets code. | |||
i almost got lookbehind properly optimisable, but then my head exploded. | ||
TimToady | well, dfa is in the abstract side-effect free, and a parser wants to be full of side effects. | 23:51 | |
so you have to manage the transition from patterns to actions somehow | |||
dmq | yes | ||
TimToady | much like your typical awk statement | ||
P6 requires reversibility on lookbehind patterns | 23:52 | ||
(though that implies an encoding that can be scanned backwards too) | |||
dmq | heh | 23:53 | |
23:53
Psyche^_ is now known as Patterner
|
|||
dmq | it still would be nice to extract fixed substrings from them | 23:54 | |
if possible. | |||
so that things like /(?<=foo)/ can be as efficient as /foo/ | 23:55 | ||
i almost had it working. | |||
TimToady | it should match oof but run the counter the other way | ||
dmq | or it could just BM for 'foo' | 23:56 | |
:-) | |||
TimToady | no, that's the wrong approach entirely | ||
dmq | and use the spot after it. | ||
or what about /(?<=foo)bar/ it should just look for 'foobar' and use the middle. | 23:57 | ||
TimToady | that's a possible optimization, but in the abstract it's not difficult to look for a position with oof going left and bar going right. | 23:58 | |
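The two views of /(?<=foo)bar/ in this exchange can be compared in a short Python sketch (Python's `re` uses the same lookbehind syntax for fixed-width patterns). The sample text and the helper `matches_at` are illustrative assumptions, and the slicing in `matches_at` is written for clarity rather than speed; this is not how regexec.c or any Perl 6 engine is implemented.

```python
import re

text = "xx foobar yy barfoo"

# The regex as written: only "bar" ends up in the match, "foo" is context.
m = re.search(r"(?<=foo)bar", text)
lookbehind_span = m.span()
matched_text = m.group(0)

# dmq's optimisation: search for the literal "foobar" and use the middle.
i = text.find("foobar")
literal_span = (i + len("foo"), i + len("foobar"))

# TimToady's abstract description: at a position, "oof" matches going left
# and "bar" matches going right.
def matches_at(s: str, pos: int) -> bool:
    left_reversed = s[:pos][::-1]
    return left_reversed.startswith("oof") and s.startswith("bar", pos)

positions = [p for p in range(len(text) + 1) if matches_at(text, p)]
```

For this fixed-string case all three approaches agree on the position; as noted in the channel, the literal-search shortcut stops applying once quantifiers are involved.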
dmq | i realize it doesnt scale when you add quantifiers, im talking about an optimisation only. | ||
TimToady | and the user nearly always has a good reason for having written it that way in the first place. | ||
so you're almost never going to be able to do that optimization anyway | |||
dmq | i think mainly to not have bar in $& or $1 | ||
actually that type of thing is quite common in split. | 23:59 |