Parrot 1.2.0 released | parrot.org/ | 303 RTs left | Weekly Priority: Profiling
Set by moderator on 28 May 2009.
Infinoid Ok, I've increased my sample size, and... Eh. It's a wash, no regressions but nothing to write home about either. 00:02
00:03 bacek joined
jonathan *nod* 00:05
OK, I need sleep...night
bacek good night jonathan. 00:28
Austin_Hastings Is there some known behavior (possibly involving exception handlers) that would cause part of a stack trace to go missing when an error occurs?
bacek good localtime() everyone else 00:29
Austin_Hastings Good morning, Bacek.
Whoops. Never mind. PEBCAK. 00:31
bacek Infinoid: I can't spot what's wrong with t/pmc/annotations.t... 00:36
00:51 skids joined
Infinoid ok, thanks for checking anyway 01:03
I'll probably not have time to dig into it until next week, but I'll see what I can do
since it's intermittent, I guess it must be a race condition somewhere
01:15 ZeroForce joined 01:20 eternaleye joined 01:33 Zak joined 02:06 Theory joined 02:13 eternaleye joined 02:35 particle joined, whoppix joined 02:37 zak_ joined 02:46 snarkyboojum left 02:47 snarkyboojum joined, snarkyboojum left 02:48 janus joined 02:57 Ademan joined 03:01 Whiteknight joined 03:38 Theory joined 03:39 kid51 joined
kid51 Seen on nytimes.com: "Commencement 2009: HASH(0x5cfc60)" 03:40
03:43 Zak joined
dalek parrot: r39282 | pmichaud++ | trunk (2 files):
[core]: Fix TT #24, equivalence of hash keys.
03:59
parrot: r39283 | pmichaud++ | trunk/t/op/stringu.t:
[core]: Remove incorrect unicode test from t/op/stringu.t .
04:02
04:17 snarkyboojum joined 04:46 cotto joined 05:10 snarkyboojum left 05:23 skids joined 05:31 cotto joined 05:32 skids joined 05:51 david joined
dalek parrot: r39284 | pmichaud++ | trunk/t/op/stringu.t:
[core]: Add another (failing, todo) test for TT #24 --
06:04
06:12 snarkyboojum joined 06:38 david joined 06:59 cotto joined 07:05 viklund joined 07:13 patspam joined
Tene Anyone awake with much knowledge of Parrot's Makefile system? 07:31
Out of curiosity, anyone awake at all?
07:32 unitxt joined 07:35 unitxt left
Tene pmichaud: please review r39285. 07:54
dalek parrot: r39285 | tene++ | trunk (4 files):
Add a 'parrot' compiler object that can be used from HLLs to load
07:57
Tene gist.github.com/120809
08:15 iblechbot joined 08:56 patspam joined 09:41 bacek joined
bacek oh hai 09:44
09:57 viklund joined 10:17 Ademan joined
Infinoid good morning 10:35
purl Here I am, brain the size of a planet, and all they say is 'Good Morning'
bacek stopping himself to write mail to parrot-dev with all 3 usual sentence about design of STRING... 10:50
Infinoid oh? that sounds like an amusing email 10:52
bacek who decided to have zillions of internal string representations inside single interpreter??? 10:53
why we can't have simple single string representation and avoid all this bloody checks during runtime about string "compatibility" 10:54
Infinoid even if we flatten the encoding internally, we still need to keep track of different charsets 10:55
bacek (And TT#24 just not closeable. Because every single Hash on earth built on simple idea - if hashvalue is same, then values can be same. But if they different, then values CAN'T BE SAME)
Why we need different charsets?
Infinoid Because if we can't handle different charsets, neither can the HLLs. And being able to turn on and off unicode is a requirement for some languages 10:58
(perl5, for instance)
bacek chromatic> HELLO FROM THE 21ST CENTURY PLEASE JOIN US THE WATER IS FINE 10:59
"turn off unicode" for output - fine. But it doesn't imply not to use unicode internally
Ooo... 11:00
chromatic> But we have backwards compatibility with Perl 4 to consider, so we can't change too much.
bacek fell in love with ParrotQuotes
Infinoid :)
actually, I turn off unicode for performance reasons... but it's really utf8 I care about turning off, not unicode itself 11:01
the whole bytes vs. chars thing is *completely* useless when you're trying to write high performance network code
mikehh utf8 should be an external representation NOT internal 11:02
bacek utf8 is good. as input/output encoding.
Not as intern.. As mikehh said 11:03
(high performance) And this is another question - who was so high on cocaine to use STRING* as sequence of bytes? Who??? 11:04
Infinoid I dunno, STRING predates my involvement with parrot 11:05
But some things *are* sequences of bytes, and it's nice to be able to access those from parrot too
bacek Yes! As sequence of bytes! 11:06
mikehh but a sequence of bytes should NOT be a STRING
bacek No one should want to "uppercase this sequence of bytes please"
Infinoid why not?
bacek different semantic. totally 11:07
mikehh unless you consider bitstring bytestring and then charstring
Infinoid the fact that some things are difficult does not mean they are unnecessary
bacek vote for new core type BYTE. Among INTVAL, NUMVAL, PMC and STRING. 11:09
Infinoid No thanks. We already have enough interfaces that require one data type and exclude others that could potentially be useful
mikehh characters are NOT bytes, ASCII is ab encoding which maps to bytes - Unicode does not
s/ab/an/ 11:10
bacek Oh. We can have one type 'void*'...
Infinoid characters *are* bytes, when you're using binary or ascii charsets and fixed_8 encoding
I think PObj is a bit more useful than void*
bacek characters happens to map to bytes for "single byte encodings"
Infinoid Here's an example. If we couldn't make ascii/bytesequence STRINGs, NCI wouldn't be able to look up symbols in the linker table, which means NCI would be more or less useless 11:11
And if we had to transcode those to/from UCS4, I think performance would be noticeably worse than it is now
mikehh no - bytes can represent characters in ASCII or EBCDIC they are not bytes
bacek It will be faster
Infinoid mikehh: bytes are not bytes? 11:13
11:13 masak joined 11:15 elmex joined
Infinoid The code referenced by pmichaud in TT #24 looks a bit sketchy to me... what's to prevent hash collisions? 11:15
mikehh in general bytes are 8 bits - they can represent many things, 0-255, -128-127 ASCII characters, EBCDIC characters etc
11:15 Whiteknight joined
Infinoid they can represent anything, or nothing. My code tends not to care, as long as length() reports the number of bytes properly 11:15
and efficiently 11:16
mikehh as long as you know what is being represented
Infinoid if I'm making a webserver, I want all my strings to just be bytes. Some config can give me a "content-encoding" string to give to the client, but I just want to shovel bytes 11:17
if I have to deal with strings as utf8, my server performs worse for *no* good reason
11:17 snarkyboojum joined
mikehh furthermore the majority of the world uses other languages 11:17
don't use utf8 as an internal representation 11:18
Infinoid therefore, if parrot did not have the ability to handle non-unicode strings, I've lost the ability to write efficient servers in parrot HLLs.
it's not about utf8; it's about unicode in general
I do want to handle sequences of bytes directly 11:19
that's why I like STRING's ability to do so.
mikehh if you have to deal only with English ok what about Asian languages
Infinoid asian languages still use strings made of bytes, even if each character may consist of more than one of those bytes 11:20
mikehh or Hebrew, Cyrillic etc
Infinoid My code does not *care* about your characters. It just wants to shovel the bytes
Let the user interface deal with charset nonsense
mikehh thats utf8
Infinoid what's utf8? 11:21
purl i think utf8 is the One True Encoding or RFC 2044 or statico's test at langworth.com/pub/unicode.png (screenshot) and langworth.com/pub/unicode.html (source & browserable) or at www-950.ibm.com/software/globalizat...&s=ALL or teh sux0r ( sam.zoy.org/writings/utf8/ ) or use the fine 8
Infinoid You may think I'm crazy for writing web servers and network stacks in perl, but I'd say you're crazy for suggesting things that will prevent me from doing so efficiently
mikehh I don't think you are crazy :) 11:22
bacek Infinoid: I wrote a lot of web services. And network apps. And clean separation between std::vector<uint8_t> and std::string was always helpful
Infinoid Let's say you have a text file on disk. I open the file with a fixed_8 encoding, I read it into a STRING, and I send it to you on a network socket. Your *client* can interpret it however it wants, but for my purposes, I just want to get the bytes into and out of my server as quickly as possible 11:23
mikehh no one involved in this project is crazy are they?
Infinoid utf8 really pisses me off, in that length() does not report a value I can use for the Content-Length header
So I turn it off. It's just useless overhead on this layer
bacek you read "sequence of bytes" and send "sequence of bytes". 11:24
and "length" and "chars" are different!
Infinoid So I need to be able to handle strings in stupid, fast, sequence-of-bytes mode
bacek (Even Parrot's STRING has 2 different methods)
Infinoid If parrot transcoded those to UCS4 internally and back, that would still be stupid, but less fast 11:25
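The gap between character length and byte length discussed above can be sketched in Python, where `str` counts code points and `bytes` counts octets. This is an illustration only, not Parrot's STRING API; the sample text and the `content_length` helper are made up for the example.

```python
def content_length(body: str, encoding: str = "utf-8") -> int:
    """Return how many bytes the body occupies once encoded (hypothetical helper)."""
    return len(body.encode(encoding))


text = "na\u00efve caf\u00e9"                    # 10 code points, two of them non-ASCII
char_count = len(text)                           # counts code points
byte_count = content_length(text)                # counts octets after UTF-8 encoding
latin1_count = content_length(text, "latin-1")   # fixed 8-bit: one byte per character
```

With a fixed 8-bit encoding the two counts agree, which is the case Infinoid wants to keep cheap; with UTF-8 a Content-Length header has to use the byte count.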
11:25 Austin_Hastings joined
bacek You need to be able to handle sequence of bytes in stupid, fast, sequence-of-bytes mode 11:25
Whiteknight so Infinoid, you need the byte length, not the codepoint length?
bacek You shouldn't convert it to STRING
That's why I proposed BYTE as core type. 11:26
Infinoid most of the I/O functions only work with STRINGs. You'd have to double the size of our API for that
bacek We have 240+ VTABLEs already, who cares? :/
Infinoid I don't really object to a BYTE, though I think it's a lot of extra work and we already have problems with APIs using the wrong datatype. 11:27
What I object to is: [03:54] <@bacek> why we can't have simple single string representation and avoid all this bloody checks during runtime about string "compatibility"
I *like* having encoding plugins for STRINGs, I don't want to transcode everything on input/output 11:28
bacek So, TT#24 isn't closeable. Because to same strings will have different hashvalue 11:29
s/to/two/
Or we have to translate it on every "hash_put", "hash_exists" 11:30
Infinoid the hashval thingy seems like an optimization, it still falls back on full string compare
bacek Infinoid: no!
Infinoid But I still worry about hash collisions, there
bacek hashval is how hashes works. 11:31
And STRING_compare is for collision detection
s->hashval is just premature optimisation.
Infinoid agreed 11:32
bacek You can comment out using it in key_hash_STRING. Result will be same
Infinoid I know how hashes work. There's nothing wrong with having multiple entries in the same bucket, it just means your hash function didn't detect a difference in the keys (or maybe your hash is getting full) 11:33
bacek So, pmichaud's patch breaking contract with Hash.
Infinoid But relying on hashval comparisons seems to violate the documentation of the function: "Compares the two strings, returning 0 if they are identical."
oh, wait, it's only a negative check 11:34
bacek Indeed.
Collision detection only.
Infinoid how does pmichaud's patch break the contract?
bacek For Hashes: if(hash(k1) != hash(k2)) means v1 != v2 11:35
Now, we have two strings A1 and A2, and hash(A1) != hash(A2), but A1==A2
and hash just blow up 11:36
Infinoid pmichaud's patch fixes that
now it only compares the hashvals if the charsets are the same... otherwise it continues and does the full string comparison 11:37
bacek and on "hash_resize" it crashes
again.... STRING_compare is for collision detection only. 11:38
Infinoid uh oh
bacek fixed anyway... 11:49
dalek: ping? 11:50
dalek parrot: r39286 | bacek++ | trunk (3 files):
[core] Don't rely on CHARSET for compute STRING hashvalue. Closes TT#24

from unicode/compute_hash to be used for all string.
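The hash contract argued over above (keys that compare equal must hash equal) can be sketched in Python. The classes below are hypothetical stand-ins, not Parrot code: `BrokenKey` hashes the stored bytes, so the same text in two encodings lands in different buckets, while `FixedKey` hashes the code points regardless of encoding, which is roughly the direction the r39286 commit message describes.

```python
class EncodedKey:
    """Hypothetical string key that remembers its storage encoding."""

    def __init__(self, text, encoding):
        self.text = text
        self.encoding = encoding
        self.raw = text.encode(encoding)

    def __eq__(self, other):
        # Equality looks at the characters, not at the stored bytes.
        return isinstance(other, EncodedKey) and self.text == other.text


class BrokenKey(EncodedKey):
    def __hash__(self):
        # Depends on the encoding: equal keys may get different hashes.
        return hash(self.raw)


class FixedKey(EncodedKey):
    def __hash__(self):
        # Depends only on the code points: equal keys always hash equal.
        return hash(self.text)


broken_a = BrokenKey("caf\u00e9", "latin-1")
broken_b = BrokenKey("caf\u00e9", "utf-16-le")
broken_table = {broken_a: "found"}

fixed_a = FixedKey("caf\u00e9", "latin-1")
fixed_b = FixedKey("caf\u00e9", "utf-16-le")
fixed_table = {fixed_a: "found"}
```

Looking up `broken_b` in `broken_table` misses even though `broken_a == broken_b`, because the full comparison is only consulted after the hashes match; `fixed_table` finds the entry.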
Infinoid bacek++ 11:51
bacek hates STRING even more than before... 11:54
housekeeping time. 11:56
bacek departs to kitchen 11:57
dalek TT #24 closed by bacek++: non-equivalence of equal Hash key strings 12:11
12:49 ZuLuuuuuu joined
dalek TT #699 closed by coke++: plusthree site down again 13:13
TT #637 reopened by coke++: smolder: server error on submission 13:17
13:17 cognominal joined 13:18 cotto joined
dalek parrot: r39287 | bacek++ | trunk/src/string/api.c:
[core] Don't use atod in Parrot_str_to_num on non-single-byte encoded strings. Closes TT#724.
13:22
TT #724 closed by bacek++: [bug] Parrot fails numeric conversion of ucs2 strings 13:24
pmichaud bacek: ping 13:28
bacek pmichaud: pong
pmichaud I don't think that patch will work in the general case.
bacek which one?
pmichaud I don't think you can blindly convert everything to ASCII.
the one for #724 13:29
bacek It's in str_to_num. So string can contain only small subset of characters
pmichaud converting to a num shouldn't throw an exception, though. 13:30
bacek (I've got handcrafted FSM for parsing. Implemented just for fun :)
pmichaud nor do we want to be creating more GC-able objects every time we do a str-to-num conversion. 13:31
nopaste "bacek" at 122.110.49.211 pasted "handcrafted float parsing" (161 lines) at nopaste.snit.ch/16742
bacek I can commit patch from no-paste 13:32
pmichaud bacek: I'd like there to be a bit more analysis and discussion, first.
bacek ok. There is no C-functions to convert string to float for widestrings.
pmichaud I know that. 13:33
It just bugs me a bit that we get "this patch solves this problem" without actually considering all of the use cases.
bacek 2 solutions: convert to char* or parse it manually. I've got both.
(And actually prefer second) 13:34
pmichaud yes, but I don't like having the duplicate code.
bacek "duplicate"? 13:35
purl hmmm... "duplicate" is the working policy here.
bacek purl: good girl :)
purl thanks bacek :)
pmichaud There's already some FSM for parsing/calculating str-to-num in Parrot. I'd prefer that we not have two of them.
bacek Ah. Didn't spot this before. Let me check
pmichaud thus my comment "I'd like there to be a little more discussion/analysis" before we commit a patch and consider a problem "solves". 13:36
*"solved"
(src/pmc/string.pmc, btw)
bacek string.pmc calls Parrot_str_to_num... 13:37
Ah. Parrot_str_to_int have similar FSM. 13:38
And have same problem with widestrings... 13:39
pmichaud agreed, it has the same problem.
bacek Looks like proper solution is revert previous commit, apply from nopaste, fix str_to_int, and then try to refactor common parts of str_to_num and str_to_int 13:40
pmichaud that might be better. I'll revert the previous commit if you don't, as it's a serious regression.
bacek I'll do it in. 13:41
pmichaud you might wait for the test I'm about to add.
(the one that breaks under r39287)
bacek reverted already... 13:43
pmichaud was my test from #724 added to the suite anywhere...?
bacek But any new tests will be helpful.
dalek parrot: r39288 | bacek++ | trunk/src/string/api.c:
Revert r39287.
13:44
bacek pmichaud: adding... 13:45
purl adding is easy
pmichaud General policy is to not commit changes (such as r39287) without also committing a regression test for whatever has been changed. 13:46
bacek ok 13:48
13:48 skids joined
bacek making another note 13:48
Why str_to_int accepts null strings, but str_to_nim doesn't? 13:52
str_to_num 13:53
pmichaud I don't know. 13:57
bacek And str_to_int throws exception on overflow... 13:58
pmichaud I don't mind if it throws an exception on overflow.
dalek TT #724 reopened by pmichaud++: [bug] Parrot fails numeric conversion of ucs2 strings
pmichaud But it shouldn't throw an exception simply because the string happens to have a non-ascii string in it. 13:59
er, non-ascii character
see my updated test case in #724
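The behaviour pmichaud asks for (parse the leading number, never throw on a non-ASCII character) can be sketched as a small scanner over code points. This is a hypothetical Python illustration, not the nopasted FSM and not Parrot's `Parrot_str_to_num`; exponents are left out to keep it short, and returning 0.0 when no digits are found is an assumption of the sketch.

```python
def str_to_num(text):
    """Parse a leading decimal number from a string of code points.

    Hypothetical sketch: stops at the first character that cannot extend
    the number, and returns 0.0 when there are no leading digits.
    """
    i, n = 0, len(text)
    # Skip leading whitespace.
    while i < n and text[i] in " \t\n":
        i += 1
    # Optional sign.
    sign = 1.0
    if i < n and text[i] in "+-":
        if text[i] == "-":
            sign = -1.0
        i += 1
    mantissa = 0    # all digits seen so far, as an integer
    scale = 0       # number of digits after the decimal point
    # Integer part: only ASCII digits count, anything else ends the scan.
    while i < n and "0" <= text[i] <= "9":
        mantissa = mantissa * 10 + (ord(text[i]) - ord("0"))
        i += 1
    # Optional fractional part.
    if i < n and text[i] == ".":
        i += 1
        while i < n and "0" <= text[i] <= "9":
            mantissa = mantissa * 10 + (ord(text[i]) - ord("0"))
            scale += 1
            i += 1
    return sign * mantissa / 10 ** scale
```

Because the scanner works on code points rather than bytes, the width of the underlying encoding does not matter, and a trailing non-ASCII character simply ends the number.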
bacek checking 14:02
purl checking is just different
bacek expected result is "140"? 14:03
14:05 mikehh joined
Infinoid I think so 14:08
or, I guess, 140\n140\n 14:09
14:19 jsut joined
bacek I've got 2 spec failures... 14:27
14:29 tetragon joined
pmichaud expected output would be 140\n140\n, yes. 14:31
bacek ouch... Float.cmp is wrong. 14:35
14:36 iblechbot joined
bacek Is there is C analogue of std::limits<double>::epsilon? 14:39
ETOOMANYIS... 14:40
Infinoid like DBL_EPSILON? 14:47
bacek O! 14:50
Infinoid float.h has FLT_EPSILON, DBL_EPSILON and LDBL_EPSILON, but I'm having trouble finding out (via google) whether those are C89 14:51
ooh. According to www.schweikhardt.net/identifiers.html, they are C89 14:52
nopaste "bacek" at 122.110.49.211 pasted "Proposed patch for FLOAT_IS_ZERO" (18 lines) at nopaste.snit.ch/16743 14:56
bacek Any objections?
pmichaud We need this because....? 14:57
fwiw, there were long discussions about FLOAT_IS_ZERO in parrot history
the current implementation was not lightly chosen. 14:58
bacek because of losing precision.
Infinoid If you do need to use the epsilon constant, parrot/config.h has definitions for PARROT_FLOATVAL_MIN and PARROT_FLOATVAL_MAX. I think PARROT_FLOATVAL_EPSILON should be the same kind of thing 14:59
pmichaud I highly recommend reviewing the mailing list discussions on FLOAT_IS_ZERO before committing.
bacek $N0 = 1; while($N0) { $N0 /= 10 }; will never stop...
pmichaud and mathematically speaking, it never should.
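The two zero tests being compared can be tried out in Python, whose floats are IEEE 754 doubles on common platforms; Parrot's FLOATVAL may be configured differently, so treat this as a sketch. `sys.float_info.epsilon` is Python's counterpart of C's `DBL_EPSILON`; the two helper functions are made up for the example.

```python
import sys

EPSILON = sys.float_info.epsilon   # gap between 1.0 and the next larger double


def divisions_until_zero(start=1.0, divisor=10.0):
    """Count divisions until the value compares exactly equal to 0.0."""
    value, steps = start, 0
    while value != 0.0:
        value /= divisor
        steps += 1
    return steps


def divisions_until_below(threshold, start=1.0, divisor=10.0):
    """Count divisions until the magnitude drops below the threshold."""
    value, steps = start, 0
    while abs(value) >= threshold:
        value /= divisor
        steps += 1
    return steps


exact_steps = divisions_until_zero()
epsilon_steps = divisions_until_below(EPSILON)
```

In real-number arithmetic the quotient never reaches zero, but with doubles it eventually underflows to 0.0, so the exact test stops after a few hundred steps. The epsilon test stops far earlier, at values that are still perfectly representable, which is the trade-off behind the `((f) == 0.0)` recommendation.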
Infinoid some discussion of FLOAT_IS_ZERO (and how it was breaking when gcc optimization was turned on) here: rt.perl.org/rt3//Public/Bug/Display...l?id=47397 15:06
(The previous version of FLOAT_IS_ZERO casted the float to INTVAL or HUGEINTVAL before doing the comparison.) 15:08
bacek And it was definitely wrong... 15:10
New version not so definitely, but also wrong 15:14
pmichaud Given that chromatic, adougherty, and nclark all ended up advocating the ((f) == 0.0) approach, I'd be reluctant to go against their recommendation. 15:16
i.e., those are all people with lots of experience on the issue.
Infinoid If it's still broken, it would help that discussion to have a test case displaying the brokenness 15:18
pmichaud I totally agree there -- I'd like to see a test that demonstrates the brokenness.
Infinoid ok, traveling time 15:44
back wednesday, have fun all!
nopaste "bacek" at 122.110.49.211 pasted "FLOAT_IS_ZERO broken." (26 lines) at nopaste.snit.ch/16744 15:48
bacek pmichaud: there is a test 15:49
pmichaud bacek: I disagree with that test.
$N0 != e is an invalid comparison. 15:50
bacek why $N0 == 0.0 valid then?
pmichaud in some sense, zero is special.
indeed, that's why we have a macro for it. 15:51
15:56 Andy joined
nopaste "bacek" at 122.110.49.211 pasted "FLOAT_IS_ZERO is broken?" (26 lines) at nopaste.snit.ch/16745 15:58
bacek it's.. strange...
Tene pmichaud: ping 16:24
pmichaud pong 16:26
Tene pmichaud: can you review r39285? 16:28
gist.github.com/120809 has a use of it
pmichaud + name = request['name'] 16:30
I don't like that approach. We have named parameters, we should use them. 16:31
Tene Okay.
pmichaud background...
purl background is transparent btw
Tene I'll change that shortly. 16:32
pmichaud much of the stuff written in the namespace pdd comes from before we had the ability to do method calls on PMC objects
i.e., we couldn't do ns.'method'(....)
so, the standard way to get around that was to create a Hash, and have an opcode that looked something like
method ns, hash
where the hash held the "named parameters"
you see that today in things like new $P0, init_hash 16:33
er
new $P0, ['Class'], init_hash
however
- we now have the ability to make method calls on PMCs
- HLLCompiler isn't a PMC class
so I vote that we use the calling conventions here rather than what was-once-true-but-is-no-longer-true 16:34
16:35 ZuLuuuuuu joined
pmichaud (and this is why I've said we need to update the namespace pdd, to redesign it for what-is-available-in-parrot-today as opposed to what-we-were-limited-to-when-it-was-written) 16:35
Tene I think my original motivation was trying to play nicely with languages that didn't support named parameters, but I suspect I made that decision late at night, as it doesn't actually seem relevant anymore.
pmichaud that's a reasonable motivation
however, I doube that anyone will be calling this directly from the HLL -- more likely it'll be from some internal-PIR-or-PCT thingy 16:36
Tene Yes, agreed.
pmichaud *doubt
Tene I'll update it now. 16:37
pmichaud we should use underscores instead of hyphens
Tene Why?
pmichaud most languages don't recognize hyphens in identifiers
(including NQP, for now)
Tene If it's going to be called internal to PIR or PCT anyway, as we just established... 16:38
pmichaud not always, though.
Tene okay.
pmichaud at any rate, the *rest* of parrot uses underscores, so we should keep that.
I too totally dislike underscores in names and prefer hyphens.... but in this case I'd hate to add a third convention on top of the others. 16:39
16:39 Andy joined
pmichaud I'm not sure that fetch-library should catch a load_bytecode failure 16:40
I think we should let the caller catch it.
Tene sure, okay.
pmichaud other than those, this is a good start. 16:41
I rewrote 'import' for Rakudo
you might want to steal some ideas from it 16:42
also, I'd like name to accept ::-delimited names
(same as the rest of PCT)
in fact, I don't think we need a named parameter for that 16:43
parrot.'load_library'('Foo::Bar') # loads Foo/Bar
Tene pmichaud: the debate around names for this comes from different languages having different naming conventions... a list of strings seems to be a good compromise that should work for an identifier from any language. 16:44
pmichaud I agree that list of strings should work.
but I'd like 'Foo::Bar' to work also.
(same as rest of PCT)
Tene The "There's several equivalent ways to call this" has kind of bugged me a bit about some parts of PCT. 16:45
16:45 Theory joined
pmichaud It's the same principal whereby we can use a class, key, string, or namespace to identify a class. 16:45
*principle
Tene That's exactly the part that's bothered me. :) 16:46
pmichaud fair enough.
Tene I defer to you, though.
pmichaud I just know that writing Parrot::Compiler.load_library('Foo::Bar') in NQP is much easier than figuring out how to split the string.
Tene So 'name' should be the first positional argument, and fetch_library methods should accept any other named parameters, any of which it's free to use or ignore? 16:47
pmichaud Yes 16:48
that sounds great.
Tene Okay.
pmichaud and for consistency I think it should still be called load_library
Tene Okay, I'll defer on that too. :) 16:49
pmichaud I'm not as strongly attached to that, though, so I'm open for discussion :-) 16:50
unfortunately it'll have to be after I fetch lunch for the family here
Tene I also wanted to be clear about the difference between loading a library into my current program and retrieving a library from a foreign language.
pmichaud well, we're always telling a specific compiler to load a library 16:51
Tene Just looking at the name, I'd expect 'load_library' to be a method that fetched the appropriate library and loaded it into my local namespace or lexical pad or whatever.
pmichaud but we aren't saying anything about importing
I'd think that the operation you describe would be 'import_library'
or 'load_library' followed by 'import'
Tene 'fetch_library' doesn't do any futzing with the namespace, or anything, it just returns a representation of the library.
And it's up to whoever called it to actually do something useful with it. 16:52
and 'fetch' sounds sufficiently idempotent.
pmichaud by way of analogy -- load_library in parrot doesn't do any importing
sorry.
load_bytecode in parrot doesn't do any importing
loadlib in parrot doesn't do any importing
etc.
they just mean "load"
as in "bring into Parrot" 16:53
Tene they often result in things being usefully available, though. :)
pmichaud sure, but only because the caller then knows where to go look for them.
Tene sure, okay.
pmichaud when I do load_bytecode "PGE.pbc", it doesn't change my current namespace at all
it just loads a bunch of things into the ['parrot';'PGE'] namespace.
Tene can i check for "It's a string" by comparing the result of typeof_s_p with 'String'?
pmichaud generally PCT checks for an array 16:54
Tene I don't know the best way to check an argument for stringiness.
pmichaud and if not an array, treat it as a string.
$I0 = does name, 'array'
Tene Ah.
pmichaud i.e., if it's an array, we don't futz with it.
if it's something else, we kinda figure out what it is.
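The "array first, otherwise treat it as a string" convention can be sketched in Python. The helper names are hypothetical and this is not PCT's code: a list is passed through untouched, anything else is stringified and split on `::`.

```python
def normalize_name(name, separator="::"):
    """Return the library name as a list of parts (hypothetical helper)."""
    if isinstance(name, (list, tuple)):
        # Array form: respect exactly what the caller sent.
        return list(name)
    # Anything else is treated as a string and split on the separator.
    return str(name).split(separator)


def library_path(name):
    """Turn a name such as 'Foo::Bar' into a relative path such as 'Foo/Bar'."""
    return "/".join(normalize_name(name))
```

A caller whose language uses a different separator can always pass the list form, so a part containing `::` survives intact.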
Tene You want 'export-symbols' to use a _ too, I'd imagine. 16:55
pmichaud that means that arrays continue to be the "primary, absolutely respect what the caller sent us" form of argument
yes, export-symbols should use a _ 16:56
Tene Man, all this consistency... :P
pmichaud and perhaps just call it "export"
Tene I considered that, but worried about stepping on toes.
pmichaud it's an object
it's a compiler object :-)
Tene btw, thanks for being so opinionated. It helps maintain consistently high quality. :) 16:57
Oh, one more thing...
purl somebody said one more thing was the fact that I need to write some sort of a template engine
Tene HLLCompiler defaults to having a bunch of stuff inappropriate for this, right?
pmichaud I don't quite understand 16:58
Tene I thought, at least...
I'm inheriting from PCT::HLLCompiler. 16:59
pmichaud oh
yes, you're correct.
but the inappropriate stuff is likely to change as part of my hllcompiler refactor
Tene Should I be removing things, or overriding anything with "Don't call this"?
Oh, okay.
pmichaud I think you can leave it for now. 17:00
my expectation at this point is that the "parrot" compiler will know how to turn PIR into bytecode, for one.
Tene Which it doesn't, and I'm not sure how to add that appropriately. :)
pmichaud right, that's a pretty significant refactor for HLLCompiler 17:01
basically the 'parrot' compiler will provide a HLLCompiler interface to the existing PIR comiler
*compiler
(currently the PIR compiler doesn't understand any methods -- you just invoke it on source)
thus parrot.'compile'('...PIR source...') will do something similar to what perl6.'compile'('...Perl6 source...') would do 17:02
including honoring any options
and parrot.'eval'('...PIR source...') would compile + eval, etc.
I gotta run, bbl
Tene seeya, thanks 17:03
dalek parrot: r39289 | tene++ | trunk/runtime/parrot/languages/parrot/parrot.pir:
Track changes in HLL library loading API.
17:11
rakudo: 50ec44e | pmichaud++ | (3 files):
Use iso-8859-1 (fixed-width) instead of utf8 for parsing when we can.
17:13
rakudo: 0b9c9a3 | tene++ | (2 files):
Track changes in HLL library loading API.
cardinal: e3d7404 | tene++ | (2 files):
Track changes in HLL library loading API.
17:16
parrot: r39290 | tene++ | trunk/runtime/parrot/languages/parrot/parrot.pir:
[parrot.pir]: Let other langauges catch our load_bytecode exceptions
17:24
Tene pmichaud: any reason I should also write a "load a foreign library" method for the 'parrot' compiler? Think anyone will be using a HLL from PIR? 17:49
PIR on Rails. :) 17:50
17:51 particle1 joined
Coke has the string-hash-key discussion been resolved? 17:51
Tene Coke: eh? 17:53
pmichaud Tene: The standard mechanism for loading a foreign library should be to ask the foreign library's compiler to do it. 17:57
Even from PIR.
Coke: I don't know if string-hash-key discussion is resolved, but bacek++'s latest patch solved the problem I was seeing.
(at least to the level that Rakudo no longer fails when using iso-8859-1 for source)
18:00 Whiteknight joined
dalek parrot: r39291 | fperrad++ | trunk/runtime/parrot/languages/parrot/parrot.pir:
fix various coding standards
18:12
mikehh I am getting t/pmc/undef.t passing the tests but failing with a segmentation fault 18:29
if I run with -v or -t it doesn't
ie ./parrot t/pmc/undef.t segfaults but ./parrot -t t/pmc/undef.t does not also -v 18:31
BTW my last build got sent to smolder but when I tried to connect - it didn't 18:38
it sent it about 4 hours ago and I haven't been able to connect to the site since then 18:42
and furthermore ./parrot t/pmc/packfileannotations.t FAILS test 15 BUT ./parrot -t t/pmc/packfileannotations.t PASSES 18:52
and ./parrot -v t/pmc/packfileannotations.t PASSES but ./parrot t/pmc/packfileannotations.t FAILS Test 15 if I run again 18:57
I am running Kubuntu 9.04 Amd64 at r39288 18:59
19:32 Maghnus joined
Tene pmichaud: ping 19:44
Coke returns. 19:47
pmichaud Tene: pong 20:01
Tene pmichaud: I was wondering if there could be a scope for PAST::Vars that would translate to a find_name opcode. I found a workaround, though.
20:02 viklund joined
pmichaud I've been thinking about adding such a scope, yes. 20:03
But there's always simply PAST::Op( :pirop('find_name'), ... )
Tene Yeah.
pmichaud and perhaps we need to add find_name to piropsig 20:04
Tene it's there
I'm unsure whether a pirop would be more or less awkward than my workaround.
pmichaud it doesn't do much good to have find_name in a PAST::Val, because there's no way to bind or lvalue it
er, PAST::Var
PAST::Op.new( :pirop('find_name'), '$var' ) # seems pretty short to me 20:05
unless of course you need a viviself
that might be a reason to do it as a PAST::Var
20:07 Eevee joined
Whiteknight pmichaud: I was looking at your wiki the other day 20:18
I'm working with a research group who are setting up a wiki application and we looked at PMWiki as one of the potential platforms for it
pmichaud Whiteknight: I highly recommend it :-) :-) :-) 20:19
Whiteknight I was able to say "I know that guy!"
pmichaud although it needs a bit more love from me these days -- I've been spending too much time with p6 :-)
Whiteknight I'm actually surprised you didn't write it in Perl 20:20
pmichaud there are a couple of reasons for that
(1) it's *far* easier to install PHP applications than Perl ones for the typical webhosting environment
Whiteknight I don't know PHP as well, and we're doing some custom development so I was hoping it would be Perl so I could do it more easily
pmichaud (2) PHP has a slightly less-steep learning curve for people who might want to customize it
ease-of-installation was the clincher, really. 20:21
Whiteknight I imagine so
pmichaud The average person can download, untar, and have it running inside of 5 minutes
(often less)
Whiteknight the group decided on MediaWiki finally, but we definitely looked closely at PMWiki too 20:22
pmichaud sure, MediaWiki is a good choice, especially if very large scalability is important
jonathan hopes we can make Perl 6 have a competitive level of ease-of-installation.
pmichaud jonathan: yes, me too. mod_parrot helps a bunch there, I hope.
Whiteknight yeah, we had to go with the platform that the group was more familiar with 20:23
Tene mod_parrot has some of the same issues mod_perl does. 20:24
Whiteknight has to go catch a plane now. Late! 20:25
later*
pmichaud well, I suspect mod_php has to deal with those issues as well. With mod_parrot we have the advantage of not having to be mod_perl :-) 20:26
20:27 cotto joined
skids Newbie question -- can a PMC move from one class to another gracefully, keeping the same memory address (in-place promote)? Would it have to be preallocated/presized with that in mind? (The datastructure headers haven't quite gelled in my brain yet.) 20:28
Tene That is, you can't drop a .pl in with a bunch of .html and have it run. At least, that's not the default configuration.
We could certainly add that for mod_rakudo, if desired.
jonathan skids: Yes, that can work. 20:29
pmichaud skids: In rakudo, we had to write a rebless operator to make that work
skids: there's also the copy opcode, which can do that a little more destructively
jonathan Those were the two I was going to mention. :-)
skids Cool, I'll try to grep around for those. 20:30
jonathan rebless_subclass goes as one of the most evil things I ever wrote.
jonathan is terrified of having to ever change it again.
otoh, it's not been the source of any problems that I know of, so it appears to be stable. :-) 20:31
skids (Basically asking because I'm weighing the pros/cons of conditionals for compact versions data structures, e.g. hashes with < 4 keys, against having a "nanohash" or something class that upgrades.) 20:32
pmichaud there's also the "morph" opcode 20:33
but for what you're talking about, it sounds like it's better to build that switching-behavior into the existing hash class than to create a new one
similar to the way we'd like Integer to start acting like BigInt without changing its type
skids Just it's messy that way. 20:34
That is, if we ever end up with something complexity-wise comparable to say a Judy trie, where there are scores of different node types, all of which can be at the head. Not that we will, but... 20:37
For simple hashes it's not that big a deal, just ugly pointer casting. 20:38
On the flip side, big monolithic switch statements don't take up symbol table space. 20:40
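A minimal Python sketch of the alternative pmichaud suggests: one class that switches its internal representation, instead of a separate "nanohash" class that upgrades. The class name, the threshold constant and the `is_compact` helper are assumptions made for illustration; nothing here reflects Parrot's Hash PMC.

```python
class CompactMap:
    """Map that stores a few pairs in a list, then upgrades itself to a dict."""

    SMALL_LIMIT = 4  # hypothetical threshold, from "hashes with < 4 keys"

    def __init__(self):
        self._pairs = []    # compact form: list of (key, value) tuples
        self._table = None  # full form: a dict, once upgraded

    def is_compact(self):
        return self._table is None

    def __setitem__(self, key, value):
        if self._table is not None:
            self._table[key] = value
            return
        for i, (k, _) in enumerate(self._pairs):
            if k == key:
                self._pairs[i] = (key, value)
                return
        self._pairs.append((key, value))
        # Upgrade in place once the compact form holds SMALL_LIMIT keys.
        if len(self._pairs) >= self.SMALL_LIMIT:
            self._table = dict(self._pairs)
            self._pairs = []

    def __getitem__(self, key):
        if self._table is not None:
            return self._table[key]
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)

    def __len__(self):
        if self._table is not None:
            return len(self._table)
        return len(self._pairs)
```

The conditional on `_table` is the cost skids mentions: each operation pays one branch, but the object never changes type.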
20:42 eternaleye joined
Tene gist.github.com/121026 -- language loading support for my scheme compiler 20:44
Tene happy 20:46
Tene leaves for a while.
jonathan lol awesome! Tene++ 20:52
Tene Coke: does tcl have libraries? 21:10
pmichaud jonathan: ping (although others will be interested in this as well) 21:28
purl I can't find (although in the DNS.
jonathan purl: You're looking in the wrong place. 21:30
purl jonathan: what?
jonathan pmichaud: pong
pmichaud I'm working on improving the speed of postfix:<++> 21:31
I could use someone to bounce ideas off off
*off of
or at least to get reactions
jonathan hopes he is sufficiently bouncy
pmichaud let me establish a base case
I'm doing a basic loop, counting from 1 to 20000 21:32
code and timings about to be nopasted
(waiting for compile to finish)
nopaste "pmichaud" at 72.181.176.220 pasted "incrementing integers, base case (current master)" (10 lines) at nopaste.snit.ch/16752 21:33
pmichaud 16752 is my base case 21:34
i.e., we currently take 13 seconds. Lousy.
I've refactored postfix:<++> (and .succ and .pred)
with the refactor, not doing any special optimization for Ints, it now takes 3.222 seconds 21:35
Zak So I have a crazy idea for a language that seems like a good fit for Parrot. Is this a good place to bring it up, or is this more for discussing Parrot internals?
pmichaud so, my question now (and what I'm playing with) is deciding if/how to optimize the Int case
jonathan pmichaud: Any chance I can glance at the diff? 21:36
pmichaud sure
just a sec
jonathan Zak: This is the all-purpose Parrot channel. :-)
pmichaud easier is if I just show the new postfix:<++> code 21:37
(easier to read than the diff)
jonathan That's fine too. :-)
Zak Then the next obvious question is, does anybody want to listen to me ramble?
nopaste "pmichaud" at 72.181.176.220 pasted "new postfix:<++> code" (12 lines) at nopaste.snit.ch/16753
pmichaud very straightforward 21:38
to establish a lower bound of what we might expect, I wrote an optimized form of postfix:<++>(Int)
it's semantically wrong in a few areas, but here it is
21:39 bacek joined
nopaste "pmichaud" at 72.181.176.220 pasted "postfix:<++> optimized for Int" (6 lines) at nopaste.snit.ch/16754 21:39
pmichaud with the optimization from 16754 in place, the benchmark runs in 2.338 seconds
so at present we can't expect to do much better than that
jonathan One thing I'd be curious to know
What happens if you make it a Perl6MultiSub? 21:40
The MMD cache can do pretty well with a lot of invocations with the same type...
And Parrot's MultiSub lacks that.
pmichaud hmmm. 21:41
I could try that.
jonathan I'm curious where the overhead is in the dispatch, mostly.
erm, I mean
*if* the overhead is much in the dispatch or not.
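A small Python sketch of why a dispatch cache helps when the same argument types recur. It is an assumption-laden illustration, not the algorithm used by Perl6MultiSub or Parrot's MultiSub: the `MultiSub` class, its scoring rule and the `slow_lookups` counter are all hypothetical.

```python
def _distance(arg_type, param_type):
    # Position of param_type in the argument's MRO; 0 means an exact match.
    # Assumes plain classes (no virtual ABC subclasses).
    return arg_type.__mro__.index(param_type)


class MultiSub:
    """Multiple dispatch with a cache keyed on the tuple of argument types."""

    def __init__(self):
        self._candidates = []  # list of (parameter types, function)
        self._cache = {}       # argument type tuple -> chosen function
        self.slow_lookups = 0  # counts full candidate searches

    def add(self, types, func):
        self._candidates.append((tuple(types), func))
        self._cache.clear()    # a new candidate may change earlier answers

    def _resolve(self, arg_types):
        self.slow_lookups += 1
        best, best_score = None, None
        for types, func in self._candidates:
            if len(types) != len(arg_types):
                continue
            if not all(issubclass(a, t) for a, t in zip(arg_types, types)):
                continue
            score = sum(_distance(a, t) for a, t in zip(arg_types, types))
            if best_score is None or score < best_score:
                best, best_score = func, score
        if best is None:
            raise TypeError("no applicable candidate")
        return best

    def __call__(self, *args):
        key = tuple(type(a) for a in args)
        func = self._cache.get(key)
        if func is None:
            func = self._resolve(key)
            self._cache[key] = func
        return func(*args)
```

In a loop that increments an integer thousands of times, only the first call pays for the candidate search; later calls are a single dictionary lookup.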
pmichaud wouldn't it be roughly equivalent to the 10000 sub calls benchmark you're doing now? 21:42
jonathan The benchmark I'm doing doesn't try Parrot's MMD.
pmichaud right
but we know how long 10000 sub calls take, yes?
jonathan Yes
pmichaud and that is?
jonathan Run perl tools/benchmark.pl - the figure has changed over time 21:43
pmichaud (and yes, we could hand-optimize the Perl6MultiSub case a fair bit)
jonathan And the thing is
I was doing for (1..10000) and I suspect that ranges were using infix:++
pmichaud they do.
yes.
heh
chicken-and-egg
jonathan So you've probably rather changed the performance profile of the benchmarks now. :-) 21:44
pmichaud correct
well, let me try it
jonathan Going back to your patch
pmichaud (and yes, I'm going to re-factor the range code)
jonathan The main issue I can see is that if somebody subclasses Int and overrides .succ then ++ is not going to be sensitive to that change.
pmichaud the 10,000 MultiSub invocations takes 5.311 seconds on my box. But that's not representative, because I don't have your new dispatcher. 21:45
jonathan The new dispatcher doesn't affect MultiSub only method dispatch.
pmichaud ah 21:46
if I make it an arity-1 multisub, it's 4.311 seconds
anyway
jonathan heh, !SIGNATURE_BIND bites again. 21:47
pmichaud exactly. And we could avoid that in an optimized Perl6MultiSub version
anyway, the version I have is intended strictly as a lower-bound
do you basically agree we can't easily beat that for Int increment?
jonathan At the PIR level, I don't think we can do it in any fewer instructions. 21:48
pmichaud okay.
One thing we *might* be able to do in the compiler is detect when a variable is type-constrained to Int, and issue inline PIR
that just does an inc 21:49
jonathan Yes.
When we have an optimizer. :-)
pmichaud although yes, that still suffers from the subclass-of-Int issue
although S03 does say that the compiler is allowed to optimize the Int case
jonathan You're allowed to add methods but not change representation, afaik. 21:50
pmichaud (The optimizer is allowed to assume
that the ordinary increment and decrement operations on integers will
not be overridden.)
jonathan Ah, OK.
That's a nice thing.
I can see us being able to optimize that.
pmichaud anyway
jonathan That probably means you can legitimately not call .succ 21:51
pmichaud I can adjust my "fastest increment" code so that it checks the operand for (1) readonly, (2) type constraint, (3) overflow and fall back to the default case if any of those are present
jonathan Essentially hand-inlining it.
pmichaud when I do that, I lose some performance, but not much 21:52
jonathan That sounds reasonable yes.
pmichaud (re-running to get updated time)
nopaste "pmichaud" at 72.181.176.220 pasted "postfix:<++> optimized for Int, with typechecks and readonly and overflow" (14 lines) at nopaste.snit.ch/16755 21:53
dalek rrot: r39292 | bacek++ | branches/tt24_unicode_numifications:
Branch for fixing unicode strings numification issues
21:54
pmichaud with that version (16755), the benchmark runs in 2.444 seconds
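The guarded fast path pmichaud describes (check for readonly, type constraint and overflow, otherwise fall back) can be sketched in Python as follows. The pasted PIR is not shown in the log, so this is an assumption about its shape only; `Container`, `INT_MAX`, `fast_path_ok` and `general_increment` are hypothetical names.

```python
INT_MAX = 2**31 - 1  # hypothetical native integer limit for the sketch


class Container:
    """Hypothetical scalar container with optional readonly flag and constraint."""

    def __init__(self, value, readonly=False, constraint=None):
        self.value = value
        self.readonly = readonly
        self.constraint = constraint


def general_increment(c):
    """Slow path: does every check, stands in for the .succ based version."""
    if c.readonly:
        raise TypeError("cannot modify readonly value")
    new = c.value + 1
    if c.constraint is not None and not isinstance(new, c.constraint):
        raise TypeError("type check failed")
    old = c.value
    c.value = new
    return old


def fast_path_ok(c):
    """True only for a plain, writable, unconstrained integer below the limit."""
    v = c.value
    return (type(v) is int and not c.readonly
            and c.constraint is None and v < INT_MAX)


def postfix_increment(c):
    if fast_path_ok(c):
        old = c.value
        c.value = old + 1  # the inlined increment
        return old
    return general_increment(c)
```

In this sketch any container with a type constraint takes the slow path, matching jonathan's observation that `my Int $x; $x++` gains nothing from this version.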
jonathan That does mean my Int $x; $x++ for 1..10000; isn't going to be so performant... 21:55
pmichaud Correct, that leads up to my next question
adding the type constraint actually makes it slower
think it would be worth checking specifically for the "Int" type constraint ?
jonathan Maybe we just wave our hands in the air and say "wait until somebody writes a type-based optimizer" ;-) 21:56
pmichaud and at what point do we end up doing so many checks that we might as well default to the general case?
jonathan You could also do that yet
pmichaud OTOH, the default case is also slower when a type constraint is present
it takes 5.043 seconds instead of 3.222 secs 21:57
jonathan isa is a pig.
pmichaud so if we added the explicit check for the Int type constraint, we'd get a Win there.
Oh, we wouldn't have to do isa for that
jonathan oh, also 'cus we're doing ACCEPTS
pmichaud we just see if 'type' issame Int
jonathan Yes, for sure we can here.
I meant generally that's why it's slower. 21:58
pmichaud yes.
let's see what happens if I do that
first, the cost of an Int type constraint...
(recompiling) 21:59
jonathan need faster compiler ;-)
pmichaud no, I made a typo
need faster programmer :-)
faster compiler wouldn't hurt either, though. 22:00
the other question I have: is it worth optimizing postfix:<--> also?
that's much less common
jonathan True.
I wouldn't for now.
pmichaud same
jonathan You've provided the pattern.
pmichaud I'll optimize prefix:<++> though.
jonathan It's not much work if somebody cares to duplicate the idea. 22:01
pmichaud okay, with the type constraint it becomes 5.139 seconds on my latest test
now let's check specifically for Int
jonathan In general on the type stuff, my efforts so far have really been focused more on "make it work" than "make it fast"
pmichaud agreed. 22:02
jonathan Now we're closing in on S12 and S14 though...
...I start on S09!
...eww.
I am interested in starting getting a basic first-cut type checker/optimizer into Rakudo at some point. 22:03
nopaste "pmichaud" at 72.181.176.220 pasted "postfix:<++> optimized for Int, with typechecks and specific check for Int" (18 lines) at nopaste.snit.ch/16756
jonathan It'd be an option (-O style) for sure.
Not convinced myself we're at the point where it's worth starting on that yet though.
pmichaud it would be good if it could be lexical.
bacek good morning
purl Here I am, brain the size of a planet, and all they say is 'Good Morning'
jonathan That'd be possible. 22:04
pmichaud ooooh, Win
dukeleto howdy
purl niihau, dukeleto.
pmichaud by explicitly checking for Int constraint, 2.475 seconds
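The difference between an identity test on the constraint and a general isa-style check, as a Python sketch. `MyInt` and both helper names are hypothetical; Python's `int` stands in for Int.

```python
class MyInt(int):
    """Hypothetical user subclass, standing in for a subclass of Int."""


def constraint_allows_fast_path(constraint):
    # Identity test: cheap, and true only for the exact Int type object
    # (or for no constraint at all).
    return constraint is None or constraint is int


def constraint_accepts(constraint, value):
    # General check, analogous to isa/ACCEPTS: walks the class hierarchy.
    return constraint is None or isinstance(value, constraint)
```

A variable constrained to a subclass still fails the identity test and so falls back to the general path, where the full check runs.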
jonathan Nice
bacek bacek-- # silly typo in branch name
pmichaud I think I'm keeping that.
dalek rrot: r39293 | bacek++ | branches/tt24_unicode_numifications/t/op/string.t:
[t] Add tests for TT#724
rrot: r39294 | bacek++ | branches/tt24_unicode_numifications (3 files):
[core] Use handcrafted FSM for parsing float values.
pmichaud my Int $x; $x++ is likely to be a common pattern
dukeleto has anybody heard from Kevin Tew about his GSoC project ?
jonathan pmichaud: Yes
pmichaud okay, thanks. 22:05
We go with that.
jonathan Yay
pmichaud++
That'll be a decent performance improvement to report back.
pmichaud I also moved string increment/decrement out of C and into PIR :-)
jonathan Heh. Did it get faster?
pmichaud heh, I don't know.
But it'll work with unicode strings now, and we can start to do unicode ranges.
jonathan Yay
pmichaud I wasn't doing it for a speed perspective -- mainly because refactoring ++ required some substantial refactors to succ/pred, and that affected Perl6Str 22:06
jonathan Ah, OK.
pmichaud because Perl6Str implemented string increment as an increment VTABLE, instead of .succ/.pred
jonathan Ah. 22:07
pmichaud which mean things were kinda backwards
*meant
jonathan yes
One of those probably-right-under-some-version-of-the-spec things.
pmichaud okay, now to see if my changes survive spectest
dalek rrot: r39295 | bacek++ | branches/tt24_unicode_numifications/t/op/string.t:
[t] Add more test for unicode string numification. pmichaud++
rrot: r39296 | bacek++ | branches/tt24_unicode_numifications/src/string/api.c:
[core] Refactor Parrot_str_to_int to be closer to Parrot_str_to_num.
rrot: r39297 | bacek++ | branches/tt24_unicode_numifications/t/op (2 files):
[t] Move unicode string tests to t/op/stringu.t
Tene Zak: I'm rather interested. 22:13
Zak Yay, a taker! 22:16
pmichaud jonathan: okay, this is weird.... 22:17
I just re-refactored postfix:++ in terms of prefix:++
Zak Well, the basic premise is that while we have a bunch of single-dispatch languages where everything is an object, and in the more dynamic ones can have methods added to it, we don't have a multiple dispatch language where every function is generic.
nopaste "pmichaud" at 72.181.176.220 pasted "Refactored postfix:++ and prefix:++" (17 lines) at nopaste.snit.ch/16757 22:18
pmichaud this version runs in 2.583 seconds. *Without* any special casing of Int
(previously it was 3.222)
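The relationship between the two operators, sketched in Python. The pasted PIR is not reproduced in the log, so this only shows postfix written in terms of prefix; `Scalar` and both function names are hypothetical, and Python has no tailcall.

```python
class Scalar:
    """Hypothetical mutable container for a value."""

    def __init__(self, value):
        self.value = value


def prefix_increment(s):
    """++$x: update the container, return the new value."""
    s.value = s.value + 1  # stands in for calling .succ
    return s.value


def postfix_increment(s):
    """$x++: remember the old value, delegate to prefix, return the old value."""
    old = s.value
    prefix_increment(s)
    return old
```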
Zak When I've mentioned this, most people felt it would be too slow, but I suspect it would not, especially on a VM like Parrot that seems to prefer to optimize by being more dynamic rather than more static. 22:19
Tene zak: interesting
jonathan pmichaud: That's...odd. 22:20
pmichaud I agree.
maybe because of the tailcall?
and the lack of find_name lookups?
I dunno.
jonathan Maybe.
find_name checks lexpads and then hits the namespace.. 22:21
pmichaud yes.
jonathan Oh, but there's no lexpads here.
So that should be cheap.
pmichaud correct.
jonathan Hmm. Odd.
It does give the correct answer, right? ;-)
Zak A function defined without dispatch information would be applicable to any object (though it could throw a runtime exception if it did invalid operations internally)
pmichaud I'll check.
yes, correct answer. 22:22
purl correct answer is "don't do that"
mikehh Zak: do you have a proof-of-concept in mind
Zak In mind? Yes. In code? No.
pmichaud still worth putting in the special case to avoid the Int typecheck, though. 22:23
Zak Also, method combination like Common Lisp.
Tene explain? 22:25
Zak The core focus is extensibility. Anyone can extend or override code at any time without modifying the original.
In Common Lisp, you can define methods of generic functions that are run before, after or around the existing methods.
So, let's say you have a generic function that handles an incoming HTTP request. You could define an around method that first checks to see if the requestor's IP address is in a list of banned addresses. If it is not, it calls the next method. If it is, it returns a 403 error instead. 22:26
Another example would be to put logging information before and after a function with some hard-to-find runtime bug. 22:28
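Zak's two examples can be sketched in Python as follows. This is a simplified stand-in for Common Lisp method combination, not an implementation of it: there is no dispatch on argument types, and `GenericFunction`, the handler names and the IP addresses are all hypothetical.

```python
class GenericFunction:
    """A primary function plus before, after and around hooks."""

    def __init__(self, primary):
        self.primary = primary
        self.before = []  # run before the primary, in order added
        self.after = []   # run after the primary, in reverse order added
        self.around = []  # each receives call_next as its first argument

    def _core(self, *args, **kwargs):
        for hook in self.before:
            hook(*args, **kwargs)
        result = self.primary(*args, **kwargs)
        for hook in reversed(self.after):
            hook(*args, **kwargs)
        return result

    def __call__(self, *args, **kwargs):
        call = self._core
        for wrapper in self.around:  # the last one added ends up outermost
            call = self._wrap(wrapper, call)
        return call(*args, **kwargs)

    @staticmethod
    def _wrap(wrapper, call_next):
        return lambda *a, **kw: wrapper(call_next, *a, **kw)


BANNED = {"10.0.0.66"}  # hypothetical banned address list
log = []


def handle_request(ip, path):
    return (200, f"served {path}")


def reject_banned(call_next, ip, path):
    # Around method: either answer directly or pass control onward.
    if ip in BANNED:
        return (403, "forbidden")
    return call_next(ip, path)


handle = GenericFunction(handle_request)
handle.around.append(reject_banned)
handle.before.append(lambda ip, path: log.append(f"start {path}"))
handle.after.append(lambda ip, path: log.append(f"end {path}"))
```

The original `handle_request` is never edited; the ban check and the logging are attached from outside, which is the extensibility point being made.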
22:31 rg joined
Zak Anyway, that's the fundamental idea. Lots of other stuff goes in to making a good language, and I have plenty of ideas along those lines, but none of them are unique. 22:34
bacek Zak: looks like Perl6 from point of view 22:38
22:38 tetragon joined
Zak Not actually knowing a lot about Perl6, I can't comment. 22:39
bacek Zak: try channel #perl6 at freenode.org 22:40
jonathan Well, many languages look like some subset of Perl 6... ;-) 22:41
bacek usually small :)
small subset 22:42
Zak The language I have in my head looks like a hybrid of Common Lisp and Clojure, with some influence from Scheme and Haskell, but where all functions are generic. 22:43
dalek rrot: r39298 | bacek++ | branches/tt24_unicode_numifications/src/string/api.c:
[core] Use integers in Parrot_str_to_num to preserve precision if possible
Zak And with that, I must go for a while. If anyone thinks I'm crazy enough to have good ideas, feel free to email me at zak.wilson@gmail.com 22:44
Thanks for listening.
dalek rrot: r39299 | bacek++ | branches/tt24_unicode_numifications/t/op/number.t:
[t] Remove check for long double in sqrt_n_n results.
22:47
23:16 Coke joined
Coke tene; (libraries) do you mean that in some particular way? Tcl has a standard library, yes. it's setup so that when called from tcl, most of it is auto_load'd when you try to use it. 23:26
23:28 Theory joined
Coke I'm not sure if I'd expect that to translate nicely to another language, but they're welcome to load the libraries explicitly. 23:31
bacek msg pmichaud Branch (misnamed...) tt24_string_numification is ready for merge. Can you review str_to_num and str_to_int please? 23:32
purl Message for pmichaud stored.
dalek rrot: r39300 | bacek++ | branches/tt24_unicode_numifications (2 files):
[core][t] Handle corner cases of Parrot_str_to_num correctly
23:33
Coke (so if I call parray, that dispatches to unknown, which tries to auto_load parray, which finds parray.tcl, which loads it, then checks to see if parray is defined now, which it is, and then ends with [uplevel 1 [parray $args]].)
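The chain Coke describes (call, unknown, auto_load, retry) can be sketched in Python like this. It is an analogy only, not Tcl's or partcl's implementation; the `Interp` class, its `library` mapping and the loader function are hypothetical.

```python
class Interp:
    """Tiny command table with an 'unknown' hook that loads commands on demand."""

    def __init__(self, library):
        self.commands = {}      # commands defined so far
        self.library = library  # name -> loader; stands in for files like parray.tcl
        self.loaded = []        # names that were auto-loaded, for inspection

    def call(self, name, *args):
        cmd = self.commands.get(name)
        if cmd is None:
            return self.unknown(name, *args)
        return cmd(*args)

    def unknown(self, name, *args):
        loader = self.library.get(name)
        if loader is None:
            raise NameError(f'invalid command name "{name}"')
        loader(self)  # loading the file is expected to define the command
        self.loaded.append(name)
        if name not in self.commands:
            raise NameError(f'loading did not define "{name}"')
        return self.commands[name](*args)  # retry the original call


def load_parray(interp):
    # Hypothetical loader: defines a command returning the sorted array items.
    interp.commands["parray"] = lambda arr: sorted(arr.items())
```

The load happens once; later calls find the command directly and never reach `unknown`.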
bacek: should I try this out on partcl, too?
bacek Coke: yes, it will be nice 23:38
Coke bacek: already on it.
sfsg.
er, so far, so good.
bacek: as long as you're working parrot magic, I'd appreciate it if you could look at the bugs linked to from here: code.google.com/p/partcl/wiki/ParrotIssues =-) 23:39
bacek: no test failures with partcl in that branch. 23:40
Coke yays at actually being able to verify that before it hit trunk! 23:41
(that's a big step for partcl this year. =-)
bacek :)
-t1 should work btw 23:42
Coke ORLY?
purl YA RLY.
Coke in trunk? (already switching back to trunk)
bacek in trunk too
I've fixed couple of issues last week
Coke hey, parrot tcl.pbc runs a LOT of pir before getting to the % prompt. 23:44
let me try a smaller test. =-) 23:45
bacek :)
gotta go 23:46
purl EXCUSE ME, I HAVE TO GO WASH MY COMPUTER
bacek see you soon
Coke ~~
23:51 snarkyboojum joined
Tene Coke: in the sense that use Foo:lang<tcl>; would make sense. 23:57