|
Parrot 1.2.0 released | parrot.org/ | Weekly Priority: Profiling | Parrot VM Workshop, Pittsburgh, June 20-21 Set by moderator on 7 June 2009. |
|||
| skids | Yes, my one useful contribution so far was fixing the off-by-one error that was breaking INITIAL_BUCKETS=4 :-) | 00:00 | |
| bacek | :) | ||
| Infinoid | Whiteknight: Are you running it on linux/x86 (32-bit)? | 00:02 | |
| Whiteknight | nope, x86-64 | 00:03 | |
| Infinoid | Yeah, that doesn't work. | ||
| Infinoid checks out trunk on x86-32 | |||
| Whiteknight | yeah, i figured that would be the problem | 00:06 | |
| nopaste | "skids" at 71.192.212.78 pasted "for hash profiling statistics" (103 lines) at nopaste.snit.ch/16826 | 00:08 | |
| skids | hand edited diff of an out of date file on an out of date file. Maybe it'll even apply :-) | ||
| Not sure I kept the STRING stuff. | 00:09 | ||
| My personal take on the path out of the madness: implement a sparse array, tie orderedhash to that to get it out of hash's hair, then reimplement hash. | 00:14 | ||
| Also, since all of hash, array, and str need efficient small-size implementations, adding infrastructure to aid with stuff that stores in the PMC and then later adds a data alloc when it grows would be wise. | 00:16 | ||
| (and what I pasted yesterday, nopaste.snit.ch/16809) | 00:18 | ||
| Infinoid | skids: Oh, did I break something when I set INITIAL_BUCKETS=4? Or was your fix already in at that point? | 00:24 | |
| bacek | We don't standardize on | ||
| Unicode internally, converting all strings to Unicode strings, because for the | |||
| majority of use cases it's still far more efficient to deal with whatever | |||
| input data the user sends us. | |||
| skids | Already in. | ||
| bacek | *sigh* | ||
| Infinoid | skids++ | 00:25 | |
|
00:26
patspam joined
|
|||
| skids | bacek: also since the majority of internally used strings are ascii. | 00:26 | |
| Infinoid | Whiteknight: Well, I re-ran mk_native_pbc on x86-32, but it doesn't appear to have changed any files. And t/pmc/packfile*.t are all passing here on a fresh trunk checkout | 00:27 | |
| bacek | and the majority of external strings are unicode. | ||
| Infinoid flips back to x86-64 to test there | |||
| skids | There could be a case made that only ascii strings should qualify for using a short in-PMC mini-string (thus no need to store encoding in the mini-string version) | 00:28 | |
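A minimal sketch of the in-PMC mini-string idea skids describes: short ASCII-only text is stored inline, so no encoding field and no separate allocation are needed, and anything else falls back to a heap buffer. The `mini_str` type, the 15-byte capacity, and all function names are hypothetical and chosen for illustration; this is not Parrot's STRING or PMC layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout for illustration only. */
#define MINI_CAP 15  /* inline capacity in bytes, chosen arbitrarily */

typedef struct {
    unsigned char is_mini;  /* 1: bytes live inline, 0: bytes live on the heap */
    size_t        len;
    union {
        char  inline_buf[MINI_CAP + 1];
        char *heap_buf;
    } u;
} mini_str;

/* Return 1 if every byte is 7-bit ASCII. */
static int is_ascii(const char *s, size_t len) {
    for (size_t i = 0; i < len; i++)
        if ((unsigned char)s[i] > 0x7F)
            return 0;
    return 1;
}

/* Store short ASCII text inline; everything else gets its own allocation.
 * Returns 0 on success, -1 if the allocation fails. */
int mini_str_init(mini_str *m, const char *s) {
    size_t len = strlen(s);
    assert(m != NULL);
    m->len = len;
    if (len <= MINI_CAP && is_ascii(s, len)) {
        m->is_mini = 1;
        memcpy(m->u.inline_buf, s, len + 1);
        return 0;
    }
    m->is_mini = 0;
    m->u.heap_buf = malloc(len + 1);
    if (m->u.heap_buf == NULL)
        return -1;
    memcpy(m->u.heap_buf, s, len + 1);
    return 0;
}

/* Read access that hides which representation is in use. */
const char *mini_str_cstr(const mini_str *m) {
    return m->is_mini ? m->u.inline_buf : m->u.heap_buf;
}

/* Only the heap representation owns memory that must be released. */
void mini_str_free(mini_str *m) {
    if (!m->is_mini)
        free(m->u.heap_buf);
}
```

With this layout, `mini_str_init(&m, "hello")` stays inline, while a long string or one containing non-ASCII bytes takes the heap path and would need its encoding tracked elsewhere.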
| Infinoid | skids: Is there anything we can do to make "Key" less insane, at the same time? :) | 00:29 | |
| skids still does not fully understand "Key". | 00:30 | ||
| Infinoid does not think it's possible to fully understand "Key". That's part of the problem. | |||
| bacek | no one understands "Key" | ||
| Can we just kill it? | |||
| skids | Not if you want hash iterators. | ||
| Infinoid | Killing it would probably break the HLLs. | 00:31 | |
| skids | (unless you replace them) | ||
| Infinoid | But if you're reimplementing hashes, reimplementing keys on top of that would make sense | ||
| skids | s/make sense/be required/ | 00:32 | |
| I seem to recall it relies heavily on the type of hash used. e.g. it must be a chained bucket store. | 00:33 | ||
| And if I got the gist, the special key stuff is to keep track of where you are in the store. | |||
| If hash is to be reimplemented, though, a sane behavior of hash iterators when the hash is modified during the iteration should be defined. | 00:34 | ||
| Infinoid | yeah | ||
| skids | And it probably should be what languages need, if any language specifies that (probably the iterator gives you what was in the hash when it was instantiated) | ||
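One way to read the "iterator gives you what was in the hash when it was instantiated" behavior is a snapshot iterator: it copies the key list when it is created, so later insertions do not affect it. The sketch below assumes a tiny chained-bucket hash with borrowed string keys and no deletion; every type and function name is hypothetical, and none of this is Parrot's Hash, Key, or Iterator code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 4

typedef struct node {
    const char  *key;    /* borrowed pointer; the caller keeps it alive */
    int          value;
    struct node *next;
} node;

typedef struct { node *buckets[NBUCKETS]; size_t count; } hash;
typedef struct { const char **keys; size_t n, pos; } hash_iter;

static unsigned bucket_of(const char *key) {
    unsigned h = 0;
    while (*key)
        h = h * 31u + (unsigned char)*key++;
    return h % NBUCKETS;
}

/* Insert or update a key. Returns 0 on success, -1 on allocation failure. */
int hash_put(hash *h, const char *key, int value) {
    unsigned b = bucket_of(key);
    for (node *n = h->buckets[b]; n; n = n->next)
        if (strcmp(n->key, key) == 0) { n->value = value; return 0; }
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->key = key;
    n->value = value;
    n->next = h->buckets[b];
    h->buckets[b] = n;
    h->count++;
    return 0;
}

/* Snapshot: copy the key pointers that exist right now. Because this sketch
 * has no delete operation, the copied pointers stay valid; a real design
 * would have to copy or pin keys that can be removed later. */
int hash_iter_init(hash_iter *it, const hash *h) {
    it->n = 0;
    it->pos = 0;
    it->keys = malloc((h->count ? h->count : 1) * sizeof *it->keys);
    if (it->keys == NULL)
        return -1;
    for (size_t b = 0; b < NBUCKETS; b++)
        for (node *n = h->buckets[b]; n; n = n->next)
            it->keys[it->n++] = n->key;
    return 0;
}

/* Next key from the snapshot, or NULL when it is exhausted. */
const char *hash_iter_next(hash_iter *it) {
    return it->pos < it->n ? it->keys[it->pos++] : NULL;
}

void hash_iter_free(hash_iter *it) { free(it->keys); }

void hash_free(hash *h) {
    for (size_t b = 0; b < NBUCKETS; b++) {
        node *n = h->buckets[b];
        while (n) { node *next = n->next; free(n); n = next; }
        h->buckets[b] = NULL;
    }
    h->count = 0;
}
```

The trade-off is memory: each iterator costs one pointer per key, in exchange for iteration that cannot be disturbed by a rehash or an insert in the middle of a loop.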
| bacek | Infinoid: my fault. I forgot to update mk_native_pbc to recreate t/native_pbc/annotations.pbc :/ | 00:35 | |
| (Because I don't understand shell well enough) | |||
| Infinoid | I can fix it if you like, I just haven't gotten to it yet | 00:36 | |
| dalek | rrot: r39445 | bacek++ | branches/io_rewiring/t/native_pbc/annotations.pbc: Recreate annotations.pbc | 00:37 | |
| bacek | I just committed a new annotations.pbc to the io_rewiring branch. | ||
| If you can fix mk_native_pbc it would be helpful | |||
| Infinoid | You just ran "./parrot -o t/native_pbc/annotations.pbc t/native_pbc/testdata/annotations.pir", right? | 00:39 | |
| bacek | yes | 00:40 | |
| Infinoid | Then the upcoming nopaste will probably work | ||
| nopaste | "infinoid" at 24.182.55.77 pasted "Simple patch to rebuild annotations.pbc." (13 lines) at nopaste.snit.ch/16827 | ||
| bacek | Looks good to me | ||
| Infinoid | It'd probably be nicer if the pir sources were folded into the script with a heredoc, like the previous stuff... but I'm not entirely sure parrot would know to interpret it as pir or pasm | 00:41 | |
| I'll try it and see what breaks | |||
|
00:42
snarkyboojum joined
|
|||
| dalek | rrot: r39446 | bacek++ | trunk/docs/pdds/pdd28_strings.pod: [docs] Fix typo - UTF-16 isn't fixed-width encoding. Use UCS-2 instead. | 00:50 | |
| Whiteknight | Infinoid: I didn't merge into trunk, I merged trunk into the branch | 01:12 | |
| the branch is where the packfile tests are failing | 01:13 | ||
| but, I'm heading to bed now. I'll talk to you later | |||
| dalek | rrot: r39447 | Infinoid++ | trunk (2 files): [tools] Regenerate annotations.pbc as part of the mk_native_pbc run. | ||
| Infinoid | Oh, ok, I can refix the branch | ||
| bacek: I guess we can get rid of annotations.pir now | |||
| bacek | Infinoid++ | 01:16 | |
| Heh. We can reimplement mk_native_pbc in pir :) | 01:17 | ||
| bacek hides | |||
| Infinoid | Since it only works once parrot is already built, that actually makes sense | ||
| bacek | It was a joke, actually. Chicken-and-egg dilemma | ||
| Infinoid | (though it does rebuild parrot a couple of times, heh) | ||
| bacek | we can limit it to generate only "current parrot native" version | 01:18 | |
| Infinoid | I think that's what the --noconf switch is for | 01:19 | |
| bacek | yes. | ||
| Infinoid | But its detection is broken; it generates *something* when you run it on x86-64 but the output doesn't seem to be useful | ||
| Whiteknight | Infinoid++ | ||
| and goodnight! | |||
| bacek | What is current policy for adding new core pmcs? | 01:20 | |
| And DynPMCs? | |||
| purl | DynPMCs are treated the same way as pmcs, as far as dumps go | ||
| Infinoid | policy in what sense? Deciding which to add? | 01:21 | |
| bacek | deciding | ||
| I really want FakeSub | |||
| Infinoid | I don't know what the official policy is, but I think dynpmcs are often used for linking with third party libraries that we don't want to link against unless the hll/pir requests it | ||
| e.g. pcre, gdbm, decnum | 01:22 | ||
| nopaste | "bacek" at 114.73.13.169 pasted "FakeSub to add" (106 lines) at nopaste.snit.ch/16828 | ||
| Infinoid | I don't know what decides whether something can be in core or not. Probably how often we plan to use it | ||
| bacek | This FakeSub is used to implement PIR compiling in PIR :) | 01:23 | |
| Infinoid | That sounds useful. But to be honest, I think a feature like that is more likely to be accepted into core if it doesn't have a name containing "Fake" :) | 01:24 | |
| bacek | But it's fake! :) | 01:26 | |
| Infinoid | But it isn't immediately clear *why* you'd want a fake sub. Maybe PirSub? or SubCompiler? | 01:27 | |
| Anyway, it was just a suggestion, not important. | 01:28 | ||
| bacek | SerializeableSub is closer to the semantics. "Fake" is an implementation detail :) | 01:30 | |
| Infinoid | Is it something we can just extend Sub to do? | 01:32 | |
| serialization sounds like something that could be generally useful | |||
| bacek | Infinoid: no. Otherwise we could easily break a real executable Sub. | 01:33 | |
| e.g. "set_start_offset" for a real sub is very dangerous. | |||
| Infinoid | Ah. So it's only for PIR subs? | 01:34 | |
| (Do NCI functions use the Sub interface too?) | |||
| bacek | irclog.perlgeek.de/parrot/2009-06-06#i_1215578 | 01:36 | |
| dalek | rrot: r39448 | Infinoid++ | branches/io_rewiring/t/native_pbc (5 files): [t] Regenerate native_pbc files. | ||
| Infinoid | Thanks | ||
| bacek | Yes, this is for PIR only. Just to implement a self-hosted PIR compiler in PCT | ||
| Infinoid | Sounds delightfully hackish | 01:39 | |
| bacek | heh :) | ||
| Infinoid | My first reaction is, of course, "fix IMCC". But you know how far that argument usually gets you | ||
| bacek | Dark and evil land of some macroassembly language where many heroes have died | 01:40 | |
| Infinoid | I think when we finally get rid of IMCC, quite a lot of code hacking around it can simply vanish | ||
| bacek | I checked pirc code. It's very nice, structured, documented, etc. | 01:41 | |
| One problem - it doesn't work :( | |||
| Infinoid | Yeah, but that's just an implementation detail | 01:42 | |
| bacek rotfl | |||
|
02:31
Theory joined
02:35
janus joined
02:37
Andy joined
03:12
bacek joined
03:30
Andy joined
|
|||
| pmichaud | message bacek objection to pmc_i_ops branch merge noted in parrot-dev mailing list | 03:43 | |
| purl | Message for bacek stored. | ||
| Infinoid | pmichaud: Paralleling Perl 5's I/O speed would be nice. But the branch didn't really change I/O speed at all; it just removed a bunch of PCCINVOKE nonsense on the way to calling the I/O functions | 03:47 | |
| Seems like speeding up I/O is all about keeping the rest of parrot out of the way | |||
| pmichaud | Infinoid: Sure. The main reason I cite Perl 5 is because it seems like the P5 vm and Parrot vm have about the same amount of work to do in each case | 03:51 | |
| so the Parrot VM really ought to be able to be on the same order as P5 | |||
| Infinoid | that seems reasonable (though I know nothing about perl5's internals) | ||
| pmichaud | I know very little about them. | 03:52 | |
| Infinoid | Then again, we don't match p5's speed in other areas either | ||
| (yet) | |||
| pmichaud | actually, at one time we did. | ||
| Infinoid | I'll bet we were less Correct back then when we were Fast... | 03:53 | |
| pmichaud | well, we are talking about very simple operations here. | ||
| for this particular benchmark. | 03:54 | ||
| I'd be very curious to know where Parrot is spending all of its time for this benchmark | |||
| Infinoid loves profiling | |||
| I'll see if I can find out | |||
| pmichaud | I mean, it's not as if there are a lot of explicit PCC calls here | 03:55 | |
| the benchmark just uses the built-in opcodes, which ought to be fairly fast | |||
| Infinoid | They weren't fast at all. Even the basic "read" and "write" were methods called through PCCINVOKE | 03:56 | |
| pmichaud | right, but is that still true in the branch? | ||
| Infinoid | Now we call through VTABLE_* if we recognize the base_type, and fall back on PCCINVOKE | ||
| That's what the branch sped up | 03:57 | ||
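The dispatch pattern Infinoid describes, a direct call when the handle's base type is recognized and a generic by-name invocation otherwise, might look roughly like the sketch below. Everything here is hypothetical (`handle`, `io_write`, `invoke_by_name`, the method table); it does not use Parrot's VTABLE_* macros or its PCCINVOKE machinery, and the name lookup only stands in for the much heavier argument marshalling that a real method call involves.

```c
#include <assert.h>
#include <string.h>

typedef struct handle handle;
typedef long (*write_fn)(handle *h, const char *buf, size_t len);

enum base_type { TYPE_FILEHANDLE, TYPE_OTHER };

typedef struct { const char *name; write_fn fn; } method_entry;

struct handle {
    enum base_type      type;
    write_fn            vtable_write;   /* direct slot, used on the fast path */
    const method_entry *methods;        /* name-keyed methods, the slow path */
    size_t              n_methods;
    size_t              bytes_written;
    unsigned            slow_calls;     /* counts trips through the slow path */
};

/* Stand-in for a real write: only counts the bytes it was given. */
long counting_write(handle *h, const char *buf, size_t len) {
    (void)buf;
    h->bytes_written += len;
    return (long)len;
}

/* Slow path: find the method by name, then call it. */
static long invoke_by_name(handle *h, const char *name,
                           const char *buf, size_t len) {
    h->slow_calls++;
    for (size_t i = 0; i < h->n_methods; i++)
        if (strcmp(h->methods[i].name, name) == 0)
            return h->methods[i].fn(h, buf, len);
    return -1;  /* no such method */
}

/* Fast path for recognized types, generic fallback for everything else. */
long io_write(handle *h, const char *buf, size_t len) {
    assert(h != NULL);
    if (h->type == TYPE_FILEHANDLE && h->vtable_write != NULL)
        return h->vtable_write(h, buf, len);
    return invoke_by_name(h, "write", buf, len);
}
```

Subclasses or unknown handle types still work through the fallback, which is why the fast path can be added without changing behavior for them.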
| pmichaud | right, but it still doesn't explain the 2x difference with p5 | ||
| I mean, there really shouldn't be any GC pressure, afaict | |||
| Infinoid | I think comparing p5's runloop to parrot's is sort of apples vs. oranges | 03:58 | |
| pmichaud | that's possible. | ||
| Infinoid | But we shall see. I'll post profiling results when I have some | ||
| pmichaud | and I'm not running an optimized parrot. | ||
| I could try that. | |||
| Infinoid | I've started being more careful about that, recently. Seems building with --optimize turns on some important gcc warnings, too | 04:03 | |
| Plenty of overhead, running this in callgrind is gonna take a while. | 04:09 | ||
| I suppose that implemented in C, this could be optimized down to about 10-15 assembly instructions. I doubt Parrot's ever going to do it that well | 04:11 | ||
|
04:12
payload left
04:17
payload joined
04:27
Zak joined
|
|||
| pmichaud | well, I'm not worried about matching C's performance. :-) | 04:37 | |
| running with --optimize on my system gets 10.816s | 04:38 | ||
| Infinoid | I'm still waiting for callgrind :) Looks like it'll take around half an hour in total | 04:39 | |
| You were saying something about inplace math ops using clone. I suppose that would cause a lot more GC pressure for this benchmark | 04:40 | ||
| Therefore it should be pretty easy to see the effect | 04:41 | ||
| pmichaud | no, because this benchmark (the i/o one) isn't using any PMC ops | ||
| Infinoid | It's decrementing the integer | ||
| pmichaud | sure, but that's not a PMC | 04:42 | |
| Infinoid | oh. | ||
| if I have time tomorrow morning, I'll convert to an Int PMC and try to compare trunk vs branch performance | 04:44 | ||
| pmichaud | does the io_rewiring branch also include the pmc_i_ops work?!? | ||
| Infinoid | No. I meant comparing trunk to pmc_i_ops | 04:45 | |
| pmichaud | oh, that. | ||
| It may not make a significant performance difference for just that case. | |||
| I'm more concerned about the case where someone has subclassed Integer and changed the meaning of 'clone' | |||
| (or Float, or Complex, or ...) | |||
| Infinoid | well, 2.5 million extra created and destroyed PMCs should count for something | ||
| pmichaud | except that it won't be "extra" | ||
| in both the branch and trunk, 2.5 million new PMCs get created | 04:46 | ||
| the main question is whether that happens using pmc_new or VTABLE_clone | |||
| Infinoid | The difference is in the amount of additional init overhead from subclasses, right? | 04:48 | |
| Or is it a behavioral change? | |||
| valgrind --tool=callgrind --dump-instr=yes --trace-jump=yes ../test/parrot 2135.25s user 13.67s system 90% cpu 39:21.63 total | |||
| pmichaud | it's a behavioral change. | ||
| that was supposed to be the main point of my message -- did I not make that clear enough? | 04:49 | ||
| pmichaud re-reads. | |||
| maybe I should've said "substantial" instead of "real" in my last sentence. :-| | 04:50 | ||
| Infinoid | No, I'm just sleep-deprived and unable to read emails, apparently | 04:51 | |
| quack.glines.org/upload/infinoid/k...rot-IO.png | 04:52 | ||
| No really big single hog, but we are spending most of our time in things called by Parrot_io_putps | |||
| pmichaud | Parrot_print_p_i seems biggish | 04:53 | |
| Infinoid | It's spending all its time in the callees | 04:55 | |
| around 100 million samples within the function itself, compared to 17 billion within Parrot_sprintf_c and 26 billion within Parrot_io_putps | 04:56 | ||
| Overall, we appear to be spending about 70% in IO and string-processing functions, and 25% in PMC creation and destruction | 04:57 | ||
| But there's probably some overlap there | |||
|
04:57
cottoo joined
|
|||
| pmichaud | I'm a little surprised there's so much pmc creation/destruction, then | 04:57 | |
| that must be from the PCC_INVOKE? | |||
| but what is there to be PCC_INVOKED? | 04:58 | ||
| am I reading this correctly that there are 30 million calls to pmc_new? | 04:59 | ||
| Infinoid | yes | 05:00 | |
| pmichaud | W... T.. F... ?!?!?!?! | ||
| Infinoid | the biggest thing called by PCCINVOKE (which isn't part of PCC itself) looks like FixedIntegerArray.set_integer_native(), strangely | ||
| cottoo | I think that's used in the calling conventions. | 05:01 | |
| pmichaud | cottoo: sure, but I'm not making use of any calling conventions in this test. | ||
| obviously *something* is, but my test program itself isn't. | 05:02 | ||
| nopaste | "pmichaud" at 72.181.176.220 pasted "benchmark program (for cotto)" (20 lines) at nopaste.snit.ch/16829 | ||
| Infinoid | ok. Parrot_PCCINVOKE was called 5 million times by Parrot_io_putps, so apparently we can still optimize some more I/O | ||
| pmichaud | I think I need a callgrind / kcachegrind tutorial. :-| | 05:03 | |
| so I can do some of this. | |||
| Infinoid | oh, hang on. I think I ran it on trunk, not io_rewiring | ||
| ok. First, get valgrind | 05:04 | ||
| Then, set up your shell with an alias: cgp='time valgrind --tool=callgrind --dump-instr=yes --trace-jump=yes ./parrot' | |||
| Then, invoke it like: cgp x.pir | |||
| It will generate a dumpfile with the pid in the filename, like: callgrind.out.10329 | |||
| Then run: kcachegrind callgrind.out.10329 | |||
| That's it. | 05:05 | ||
| pmichaud | Infinoid++ # thanks, just what I needed | ||
| running now, on branch | |||
| needs about 35 minutes, yes? | 05:06 | ||
| Infinoid | For trunk, yes. Branch should be more like 10 minutes | ||
|
05:06
chromatic joined
|
|||
| pmichaud | (because it's 4 times faster... got it :-) | 05:06 | |
| chromatic | function kcg { | ||
| kcachegrind "callgrind.out.$1" 2> /dev/null & | 05:07 | ||
| } | |||
| That way you can write `kcg 10329` | |||
| Infinoid | You can monitor the progress with "wc -l test-pir.txt". The finished file has around 78 thousand lines for this benchmark | ||
| chromatic: Thanks for introducing me to this tool. I'm making stupid mistakes right now, but I love the amount of data it gives me | 05:08 | ||
| chromatic | You're welcome. | 05:09 | |
| I need to write a little tutorial; still plan to do so. | |||
| pmichaud | other way to monitor progress is filesize -- final filesize is 9003865 bytes :-) | ||
| chromatic | Maybe I can convince someone to take notes on what I say at YAPC if someone wants to listen to me walk through an optimization. | ||
| pmichaud | I'll take notes. | ||
| I might even bring a camera. :-) | |||
| (video) | |||
| chromatic | It should be obvious why I think callgrind-style output is useful for Parrot's PIR profiler. | 05:10 | |
|
05:10
Zak joined
|
|||
| Infinoid | I'm bummed. I'm moving to the east coast next week, and I still don't think I'm gonna be able to make YAPC | 05:10 | |
| pmichaud | 50% done | 05:11 | |
| Infinoid | Ok. In branch, Parrot_PCCINVOKE got called a total of 8 times. :) | 05:12 | |
| pmichaud | that's much more like it. | 05:13 | |
| how about pmc_new ? | |||
| Infinoid | 1219 times | 05:14 | |
| chromatic | That's more like it. | ||
| pmichaud | agreed. | ||
| chromatic | Any time we can remove PCCINVOKE from a hot path, we win. | 05:15 | |
| Infinoid | We're spending all our time under Parrot_print_p_i, processing printf format strings and creating STRINGs. A dedicated itoa() type of function would speed up this particular benchmark, but I don't know what it would do for parrot as a whole. | 05:17 | |
| chromatic | How many STRINGs do we create? | 05:18 | |
| Infinoid | 7.5 million | ||
| chromatic | From Parrot_str_append or Parrot_str_concat? | 05:19 | |
| Infinoid | from Parrot_sprintf_c, actually | ||
| chromatic | Neither append nor concat? That surprises me, but that's fine. | ||
| Infinoid | Not directly. 5 million calls to string_make from Parrot_sprintf_format, 2.5 million from Parrot_vsprintf_c | 05:20 | |
| I'll walk up the stack a bit to see if append or concat is involved | |||
| chromatic | Look in the right pane at the call graph. | ||
| It should be obvious. | |||
| Infinoid | Yeah, neither of those are in the All Callers pane. | 05:21 | |
| chromatic | Fair enough. | ||
| Infinoid | I almost want to use a static STRING in Parrot_print_p_i with a buffer that we know is big enough, and just make it call sprintf() directly | 05:22 | |
| chromatic | I mention them because they have some weird co-recursion between them that I've never quite been able to unravel. It seems like fixing that would speed up parts of PGE. | ||
| Infinoid | But that's just hacking around the fact that our I/O interface can't handle cstrings | ||
| chromatic | Can you create a string with the right buffer size at the start? | ||
| It doesn't have to be static (and look out for concurrency), but it could avoid reallocating and resizing. | 05:23 | ||
| pmichaud | might also have to worry about COW stuff | ||
| Infinoid | How would that skip reallocation? We allocate a new STRING every time Parrot_print_p_i is called, it's the first thing that op does | ||
| s/reallocation/allocation/ | 05:24 | ||
| chromatic | This all almost makes me want to use ropes instead of strings. | ||
| I might be misunderstanding what you're suggesting. | |||
| pmichaud | Infinoid: what would it take to get the I/O interface to be able to handle cstrings, ooc? | ||
| or is that explicitly what was taken out in the I/O refactor? ;-) | |||
| (the earlier I/O refactor) | 05:25 | ||
| somehow the idea that int/num have to go through STRING in order to be output to a file handle is.... weird. | 05:26 | ||
| Infinoid | It does make per-filehandle encoding coercion easier, I guess, if we're doing that yet | ||
| pmichaud | I agree they have to be stringified. I'm just not sure they have to be made into STRINGs. | ||
| otoh, I'm not sure it's that big a win. | |||
| It might be over-optimizing to this specific case. | |||
| in Rakudo's case, we'd rarely be outputting integer registers -- we'd almost always be doing PMCs | 05:27 | ||
| Infinoid | chromatic: What I was suggesting was a hack. (A static STRING structure that just contains a buffer, and we just overwrite the buffer once per invocation, rather than allocating a new STRING. And break reentrancy entirely in the bargain.) | 05:28 | |
| Although... is it possible to allocate STRINGs directly on the stack? | |||
| pmichaud | right, but that hack might not work if others expect to use the STRING structure across multiple calls | ||
| Infinoid | This single-use pattern would have a lot less GC churn if we could do that | 05:29 | |
| chromatic | Stack allocation should be fine, except that marking stack-allocated STRINGs will be problematic. | ||
| Infinoid | True. We'd need a way to detect that. And raise a big huge flag if something in the heap holds a reference to something on the stack | 05:30 | |
| pmichaud | what I would almost prefer would be a way for PMCs to directly write themselves to a file buffer | ||
| chromatic | That does seem saner. | ||
| Infinoid | pmichaud: What would that look like? | ||
| pmichaud | Infinoid: well, think about it in C (more) | ||
| if I do printf("%d", 123); -- it doesn't normally write the integer to a buffer and then copy the buff | 05:31 | ||
| it actually does the equivalent of several putc calls to write each digit directly to the filebuffer | |||
| (at least, that's how it once did it) | 05:32 | ||
| Infinoid | right, it just uses putc for the output-a-character callback function | ||
| pmichaud | right | ||
| but that's a lot different from doing "convert-to-string-and-then-output-string" | |||
| especially in our case, where "convert-to-string" is in fact a bit expensive | |||
| so, what would be nicer is something (probably a vtable function) that allows a PMC to write itself to a filehandle | 05:33 | ||
| Infinoid | So in this case, print would be a method on the integer? | ||
| pmichaud | then print ofh, $P0 is really a vtable function on $P0, not on ofh | 05:34 | |
| Infinoid | cool. Then that vtable can call a vtable "putc" type of function on ofh | ||
| pmichaud | and yes, the default behavior can still be "stringify the PMC and send it to the filehandle", but for specific and common cases like Integer and Float it can be optimized a lot better. | 05:36 | |
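As a rough illustration of pmichaud's point, an integer can be pushed to an output sink one character at a time through a putc-style callback, with no heap-allocated string in between. The names `putc_fn`, `write_long`, `mem_sink`, and `mem_putc` are hypothetical; in Parrot the equivalent would presumably be a vtable function on the PMC calling a character-output function on the filehandle, which is not shown here. The sketch still uses a small stack scratch array, because decimal digits come out least-significant first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Output callback: returns a negative value on failure. */
typedef int (*putc_fn)(int ch, void *ctx);

/* Emit the decimal form of v through put(). Returns the number of
 * characters written, or -1 if the sink reports an error. */
int write_long(long v, putc_fn put, void *ctx) {
    char digits[32];   /* scratch for reversing digit order; never heap-allocated */
    size_t n = 0;
    int written = 0;
    /* Negate in unsigned arithmetic so the most negative long is handled. */
    unsigned long u = v < 0 ? 0UL - (unsigned long)v : (unsigned long)v;

    assert(put != NULL);
    if (v < 0) {
        if (put('-', ctx) < 0)
            return -1;
        written++;
    }
    do {
        digits[n++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    while (n > 0) {
        if (put(digits[--n], ctx) < 0)
            return -1;
        written++;
    }
    return written;
}

/* In-memory sink, used here only to make the behavior observable. */
typedef struct { char data[64]; size_t len; } mem_sink;

int mem_putc(int ch, void *ctx) {
    mem_sink *s = ctx;
    if (s->len + 1 >= sizeof s->data)
        return -1;     /* sink is full */
    s->data[s->len++] = (char)ch;
    s->data[s->len] = '\0';
    return ch;
}
```

A file-backed sink would wrap `fputc` in a small adapter with the same `putc_fn` signature; formatted variants such as hex or octal could keep going through the existing sprintf path, as suggested in the discussion.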
|
05:36
flh joined
|
|||
| Infinoid | I like it. Though I think it'll get bloated when someone requests hex and octal and binary versions of print | 05:37 | |
| pmichaud | I don't know that we'd end up with those. Or, more to the point, those would go through sprintf (as they do now) | ||
| by "sprintf" I mean "sprintf opcode" | |||
| I just know that "output integer to a file" is something that can happen a lot. | 05:39 | ||
| and "output float to a file" also happens a lot | |||
| "output string to a file" happens more often, but that doesn't (shouldn't) require extra calls to string_make | |||
| hmmm, that's an interesting comparison -- just a sec | 05:40 | ||
| okay, I'm shocked. | 05:41 | ||
| nopaste | "pmichaud" at 72.181.176.220 pasted "benchmark, without outputting any integers" (29 lines) at nopaste.snit.ch/16830 | 05:42 | |
| pmichaud | looks like we can't put too much of the blame on the integer->string conversion. | ||
| chromatic | Which opcode does print ofh, '000' become? print_p_sc? | 05:43 | |
| pmichaud | yes. | 05:44 | |
| I'm running callgrind on that one now | |||
| Infinoid | That part of the profile shouldn't have changed much | 05:45 | |
| it'll just output a few more bytes per write | |||
| pmichaud | well, it did run significantly faster in callgrind | ||
| Infinoid | That tweak seems to have sped things up about 8x | 05:46 | |
| pmichaud | why didn't it speed it up on my system, though? | ||
| Infinoid | What, you mean the "real" time? | ||
| pmichaud | yes | ||
| Infinoid | I dunno. Is it syncing to disk after every write? | 05:47 | |
| pmichaud | hard to say | ||
| Infinoid | It is! | ||
| pmichaud | I didn't request it to do so, no. | ||
| Infinoid | write(3, "000,000,000,000,000,000,000,000,"..., 8192) = 8192 | ||
| fsync(3) = 0 | |||
| pmichaud | well, apparently the sync is responsible for the bulk of the real-time cost (in the branch) | 05:51 | |
| Infinoid | Yes. I'm trying to track down what's calling it | ||
| pmichaud | trunk's cost is dominated by the PCCINVOKE | ||
|
05:52
iblechbot joined
|
|||
| chromatic | I thought we had buffered output. | 05:52 | |
| We've certainly had bugs in buffering output. | |||
| Infinoid | We do | 05:53 | |
| It buffers it, and then flushes it explicitly | |||
| nopaste | "pmichaud" at 72.181.176.220 pasted "same non-integer output benchmark for p5" (22 lines) at nopaste.snit.ch/16831 | ||
| Infinoid | I'm not so sure Parrot_io_write_buffer should *ever* be calling Parrot_io_flush, but it does | ||
| pmichaud | note that if the sync problem is resolved, then I suspect the parrot version will be several times faster than p5 | ||
| (for the case of not converting integers to strings) | 05:55 | ||
| chromatic | There's a FIXME in Parrot_io_write_buffer; that may be a sign. | ||
| pmichaud | fwiw, all of this isn't developed as an arbitrary benchmark (more) | 05:59 | |
| I wrote this benchmark because it's actually very similar to what pbc_to_exe is doing, which is what caused me to notice things being slow in the first place | 06:00 | ||
| Infinoid | Parrot_io_flush_filehandle calls both Parrot_io_flush_buffer (to do the write) and PIO_FLUSH (to call fsync). That doesn't seem easily separable | ||
| pmichaud | (i.e., it writes 2.5 million comma-separated integers to a file) | ||
| Infinoid | But after commenting out PIO_FLUSH, it runs the benchmark fast | 06:01 | |
| infinoid@chirp io_rewiring % time ./parrot x.pir | |||
| ./parrot x.pir 0.38s user 0.06s system 61% cpu 0.721 total | |||
| pmichaud | yeah | ||
| that's more like it | |||
| Infinoid | Thing is, I thought calling fsync() was the only thing Parrot_io_flush() did | 06:03 | |
| So that's a little confusing. | |||
| pmichaud | well, through Parrot my system seems to get a maximum I/O output of about 1MB/sec. Not very good. | 06:04 | |
| Infinoid | Commenting out src/io/filehandle.c:742 should improve that. But this I/O API is seriously misleading | 06:05 | |
| pmichaud | well, I'll leave it to you to figure out how to fix things from here. :-) | 06:06 | |
| Infinoid | Time for bed, goodnight :) | ||
| pmichaud | same here :) | ||
| thanks for the quick callgrind/kcachegrind tutorial -- I can make use of those. | 06:07 | ||
|
06:24
uniejo joined
06:33
Zak joined
06:37
flh joined,
Zak joined
06:45
bacek joined
06:54
viklund_ joined
07:21
snarkyboojum joined
|
|||
| chromatic | Commenting out PIO_FLUSH there *doubles* the speed of make coretest. | 07:21 | |
| That's wallclock speed, but still. | |||
|
07:28
Andy joined
07:38
clunker3 joined
|
|||
| Tene tries again to debug GC segfaults, fails badly. | 07:39 | ||
| I don't feel right posting this blog entry about inter-language programs on Parrot when some of the examples require running Parrot with GC disabled, though. | 07:40 | ||
| :( | |||
| chromatic | Agreed. | 07:43 | |
| Tene | Adding an ASSERT to that macro should only fail when the bad value is read from the context, not when it's written, afaict. | 07:44 | |
| So that didn't seem like it would help very much. | |||
| I was trying to convince gdb to watch the relevant memory location for me, but I failed at gdb. | 07:45 | ||
| chromatic | It helps narrow down where it gets written. | ||
| Tene | Will try again sometime tomorrow. | ||
|
07:50
muixirt joined
08:31
ttbot joined
|
|||
| ttbot | bacek: Parrot trunk/ r39321 MSWin32-x86-multi-thread make error tt.ro.vutbr.cz/file/cmdout/24948.txt | 08:31 | |
| bacek: Parrot trunk/ r39321 i386-linux-thread-multi make error tt.ro.vutbr.cz/file/cmdout/25045.txt | |||
| bacek: Parrot trunk/ r39321 i386-freebsd-64int make error tt.ro.vutbr.cz/file/cmdout/25126.txt | |||
| mj41 | Hi. I just started ttbot (TapTinder bot) to report build failures - tt.ro.vutbr.cz/buildstatus/pr-Parrot/rp-trunk . | 08:34 | |
| cotto | mj41++ | 08:38 | |
| chromatic | Huh. There were more optimizations in CodeString. | 08:42 | |
| Looks like a 3.5% performance improvement in parsing long files. | |||
| muixirt | chromatic, you might want to revisit use.perl.org/~chromatic/journal/35333 | 08:53 | |
| and report the progress since then | 08:54 | ||
|
08:56
mikehh_ joined
|
|||
| chromatic | I'm not sure I want to, at least until after a couple of branches land in trunk. | 08:56 | |
| muixirt smiles innocently, holds back the diabolic grin ;-) | 08:57 | ||
|
09:00
payload joined
|
|||
| chromatic | Hm, 2.61% improvement, but I'll take that. | 09:01 | |
|
09:01
masak joined
|
|||
| bacek | OH HAI | 09:04 | |
| bacek spent few hours on Keys, Iterators and Hashes... | |||
| I have one question - who designed it? So I can visit him with a baseball bat in hand... | 09:05 | ||
| chromatic | I'm not sure anyone designed it. | 09:07 | |
| bacek | oh shi... | 09:08 | |
| dalek | rrot: r39449 | chromatic++ | trunk/src/pmc/codestring.pmc: [PMC] Avoided unnecessary STRING copying in emit() and lineno() methods by safe. Avoided mostly unnecessary calls to string_ord() to coalesce \\r\\n sequence into a single logical newline. This combination speeds up Rakudo's actions.pm processing by 2.61%. | 09:10 | |
| bacek | Can we put in DEPRECATED.pod something like "Whole Iterators, Containers and Keys will be refactored after 1.4"? | ||
| chromatic | I *think* we have to deprecate them by 1.4 to fix them after 2.0. | 09:12 | |
| Or fix them *in* 2.0. Coke knows much better. | |||
| bacek | ouch... Fixing after 2.0 means about a year until deployment... | ||
| chromatic | Could be worse. Could be Perl 5. | 09:13 | |
| bacek | no way... | 09:14 | |
| purl | WAY! | ||
| chromatic | Make up a plan for what to do, and we'll see what we can do. | ||
| bacek | ok. I'll try to make some .plan without using my favourite sentences about technical design. | 09:15 | |
| chromatic | "You are all idiots. I've known monkeys who can write code better than you. You are the universe's version of technical debt. I've sneezed better programs than you could ever write." | 09:16 | |
| bacek | Something like this :) | 09:17 | |
| chromatic | "There is a picture of your program next to the word CRACK in the dictionary." | ||
|
09:17
gaz joined
|
|||
| bacek | Small mistake. It's after "CRAP" :) | 09:17 | |
| chromatic | "If dmr read your program, he would invent time travel to prevent himself from inventing C so that you could never perpetuate such horror upon the world." | 09:18 | |
| bacek | dmr? | 09:19 | |
| purl | well, dmr is Dennis Ritchie, author of Unix and C; he is our Grey Eminence, and King. or at cm.bell-labs.com/cm/cs/who/dmr/ | ||
| chromatic | "The only thing worse than your code is the compiler which allowed it." | ||
| bacek | wow, we did you get it? I need this source of wisdom for my day-to-day technical discussions! | 09:21 | |
| s/we/where/ | |||
|
09:22
donaldh joined
|
|||
| chromatic | I made them all up. My other job is professional writer.... | 09:23 | |
| bacek requesting the book "Famous quotes by chromatic" | |||
| I actually quite enjoyed reading MPB and your other posts. | 09:24 | ||
| chromatic | Thanks. I try to throw in some over the top rhetoric so people know I'm not red-faced and ranting. | 09:25 | |
| bacek | btw, I work with a guy who has almost the same sarcastic manner of speech (code comments, bug reports, etc). Many times I suspect that you and he are the same person :) | 09:28 | |
| (And he is one of the greatest developers I've met) | 09:29 | ||
| chromatic | Don't tell, but there's a franchise. | ||
| bacek | :) | 09:31 | |
| What are the semantics behind "provides foo"? How can I check it and use it? | 09:35 | ||
| And how does it differ from "does foo"? | 09:36 | ||
| chromatic | I don't recall that it does, but it's too late for me to remember clearly anyway. | 09:39 | |
| bacek | found it. | 09:41 | |
| VTABLE_does checks provides_str, which is generated from "provides foo" | 09:42 | ||
|
09:46
mikehh joined
09:55
snarkyboojum joined
10:26
payload joined
11:13
payload joined,
clinton joined
11:21
donaldh joined
11:43
clunker3 joined
11:54
skids joined
12:25
Andy joined
12:40
payload joined
12:45
whoppix joined
12:48
barney joined
12:58
gryphon joined
13:02
UltraDM joined
13:18
Whiteknight joined
|
|||
| Infinoid | Whiteknight: You'll love this next commit. | 13:21 | |
| dalek | TT #746 created by coke++: pdd19 fails t/codingstd/pdd_format.t | ||
| Whiteknight | really? | 13:22 | |
| szbalint readies the gift cookies | |||
| dalek | rrot: r39450 | Infinoid++ | branches/io_rewiring/src/io/buffer.c: We really don't need to call fsync() after every write(). write buffered but unwritten data. The problem is, that would also call PIO_FLUSH(), forcing the OS to push the data all the way to disk. While OS-level flushes are useful, we don't want to do them all the time, as doing so severely hurts performance. Ensuring disk consistency wasn't part of the contract of these functions anyway, they were just doing it to make the buffer pointers and lengths consistent so they could do their jobs. If we call Parrot_io_flush_buffer() directly instead of Parrot_io_flush(), we write out the buffered data, but don't call the OS flush, so life is good. Removing the OS flush sped up "coretest" by a factor of 2, for chromatic. Apparently my Thinkpad's disk is slower than his, so the difference is even more striking for me: before: make coretest 84.97s user 61.91s system 32% cpu 7:35.52 total after : make coretest 75.22s user 43.93s system 67% cpu 2:57.62 total |
13:23 | |
| Whiteknight | Damnit Jim, I'm a coder not a speed-reader | 13:24 | |
| Infinoid | Sorry. I had written it up as a ticket, before realizing how simple the solution was and deciding to just commit it directly. But apparently I still felt the need to rant a little | 13:25 | |
| Whiteknight | so that's a factor-of-2 speedup on top of the 4x speedup that some benchmarks are showing? | ||
| Infinoid | That's a wallclock time speedup, on top of your cpu speedup | 13:26 | |
| szbalint | Infinoid++ # you deserve more karma for this :) | ||
| Whiteknight | So basically it took less than three minutes, as opposed to over seven and a half? | 13:27 | |
| Infinoid | yes | ||
|
13:27
whoppix joined
|
|||
| Whiteknight | Infinoid++ # Holy shit | 13:27 | |
| Infinoid | It eliminates a lot of dead time waiting for the OS to catch up with the data we've written | ||
| Whiteknight | wait, what command did you run to get those results? | 13:28 | |
| Infinoid | I had originally just commented out the call to PIO_FLUSH. This commit does it in a safer way | ||
| Whiteknight | what was your benchmark? | ||
| Infinoid | "time make coretest" | ||
| Whiteknight | So you cut coretest time to 38% of what it used to be? | 13:29 | |
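The 38% figure can be checked against the wallclock totals quoted in the r39450 commit message (7:35.52 before, 2:57.62 after). A small sketch of that arithmetic follows; the helper name `to_seconds` is made up for illustration and the only inputs are the two totals from the log.

```python
def to_seconds(wallclock):
    """Convert a 'M:SS.ss' wallclock string (as printed by `time`) to seconds."""
    minutes, seconds = wallclock.split(":")
    return int(minutes) * 60 + float(seconds)


# Totals copied from the commit message quoted in the log.
before = to_seconds("7:35.52")  # make coretest with fsync after every write
after = to_seconds("2:57.62")   # make coretest with the OS-level flush removed

ratio = after / before      # fraction of the original wallclock time
speedup = before / after    # the same comparison as a speedup factor

print(f"{ratio:.1%} of the original wallclock time")
print(f"{speedup:.2f}x wallclock speedup")
```

The ratio comes out at roughly 39%, so the estimate in the conversation is within a point of the quoted numbers. Only the wallclock totals are compared here; user and system CPU time changed far less, as the commit message shows.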
|
13:29
ruoso joined
|
|||
| Infinoid | So it seems | 13:29 | |
| I'm going to redo pmichaud's benchmark to get the numbers, but there was an insane speedup there too | |||
| Something on the order of 9 seconds -> 0.7 seconds | 13:30 | ||
| (but I haven't done it with this safer version of the patch yet) | |||
| szbalint | fsync is expensive, no wonder | ||
| Infinoid | yeah, and completely unnecessary | ||
| It's nice to have, but not by default. | |||
| szbalint | fsync only makes sense after accumulating a largish chunk of data anyways | 13:32 | |
| Whiteknight | so what's the total speedup of the branch now versus trunk? | ||
| Infinoid | fsync only makes sense for databases and mail servers ensuring atomicity | ||
| szbalint | yeah | 13:33 | |
| Infinoid | here's stats for pmichaud's benchmark: | ||
| trunk : ./parrot x.pir 12.37s user 1.67s system 56% cpu 24.747 total | |||
| branch: ./parrot x.pir 0.42s user 0.06s system 64% cpu 0.741 total | |||
| Infinoid comes up with some numbers for coretest (this time ensuring his laptop won't change cpu speeds) | 13:34 | ||
| Uck, with all the disk I/O I can't fairly do both checkouts in parallel | 13:36 | ||
| Have you ever noticed how often "make test" sits there at almost 0% CPU? This is why it did that. (I've been wondering about that for a while) | |||
| Whiteknight | irclogs | 13:38 | |
| irclogs? | |||
| purl | i guess irclogs is irclog.perlgeek.de/parrot/today or see also: infrared clogs | ||
| Infinoid | pmichaud++ for making me look closer at this | ||
| trunk : make coretest 63.07s user 36.61s system 26% cpu 6:15.27 total | 13:44 | ||
| branch: make coretest 61.79s user 31.34s system 63% cpu 2:26.49 total | 13:45 | ||
| (the timings I put in that commit message were a little high, because my laptop was doing cpufreq nonsense) | |||
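The cost being discussed above can be sketched outside Parrot. The following is a minimal illustration in Python, not Parrot's C code: `write_chunks` and `compare` are hypothetical names, and the only real calls are `os.open`, `os.write`, `os.fsync` and `os.close`. It writes the same data twice, once forcing the OS to push every write to disk and once leaving that to the OS, and reports the elapsed time of each.

```python
import os
import tempfile
import time


def write_chunks(path, chunks, sync_every_write):
    """Write each chunk to `path`; optionally fsync after every chunk.

    Returns the total number of bytes written.
    """
    total = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for chunk in chunks:
            data = chunk
            while data:                      # handle short writes
                written = os.write(fd, data)
                data = data[written:]
                total += written
            if sync_every_write:
                os.fsync(fd)                 # force the data all the way to disk
    finally:
        os.close(fd)
    return total


def compare(count=2000):
    """Time both strategies on `count` small writes; returns seconds per strategy."""
    chunks = [b"%d\n" % i for i in range(count)]
    timings = {}
    with tempfile.TemporaryDirectory() as tmp:
        for label, sync in (("fsync per write", True), ("no fsync", False)):
            start = time.perf_counter()
            write_chunks(os.path.join(tmp, "out.txt"), chunks, sync)
            timings[label] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    for label, seconds in compare().items():
        print(f"{label}: {seconds:.3f}s")
```

How large the gap is depends on the disk and filesystem, so no expected timings are given. Both variants produce identical file contents; the difference is only in when the data is guaranteed to be on disk, which is why the per-write flush is worth keeping as an option but not as the default.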
|
13:52
estrabd joined
|
|||
| Whiteknight | almost unbelievable | 13:54 | |
| Infinoid | I didn't see any difference from rakudo test (but I guess that's compilation-heavy, not io-heavy) | 13:55 | |
| Whiteknight | list? | 13:56 | |
| purl | rumour has it list is groups.google.com/group/parrot-dev or take that, moose-heads | ||
| pmichaud | good morning, #parrot | ||
| dalek | rtcl: r440 | coke++ | trunk/t/cmd_inline.t: Hopefully someone can suggest a way to rewrite this PASM invocation |
14:01 | |
|
14:03
uniejo joined
|
|||
| Infinoid | morning pm | 14:03 | |
| I checked in a version of that fsync cleanup which I think is safe, but if you have a moment, more eyeballs on it can't hurt | 14:04 | ||
| Whiteknight | purl Infinoid? | 14:08 | |
| purl | Infinoid is Mark Glines <mailto:mark@glines.org> or likes shiny things | ||
| Whiteknight | purl Infinoid is also the master of the universe | ||
| purl | okay, Whiteknight. | ||
| Whiteknight | Infinoid++ | ||
| Infinoid | hah, thanks | 14:09 | |
| I tried to hock it once, but they said they won't accept hot merchandise. | 14:10 | ||
|
14:10
Andy joined
|
|||
| Whiteknight | so much work to do still on the IO system! | 14:17 | |
|
14:18
PacoLinux joined
|
|||
| Infinoid | I wish I had time to actually do some in-depth tweaking... for instance, I'm still not sure src/io/filehandle.c should exist at all (I'd like to push that code into filehandle.pmc) | 14:19 | |
| Unfortunately, this week will be even crazier for me than the last one | |||
| Anyway, again, when you do the branch merge, please just remove the Pipe and PipeHandle PMCs, I haven't had the chance to test them at all, so I assume they don't work | 14:21 | ||
| Whiteknight | Okay, I can pull them out. I'm hoping to merge this branch tonight | 14:25 | |
| Andy | mmmm, branch merging | ||
| Whiteknight | we can start another branch to work on improving sockets and pipes | ||
| and filehandle stuff definitely needs some lovin | |||
| Infinoid | Awesome, thanks. | 14:26 | |
| Whiteknight | So long as I can get started on AIO in July, I don't care what else we have to do before that | ||
| Infinoid has no idea how these fixups will affect AIO | |||
| Whiteknight | I think it's all going to have a very positive effect | 14:28 | |
| I'm generally envisioning a system, at least to start, that has a completely separate backend for the asynchronous stuff | 14:29 | ||
| we can unify things and pretty them up over time | |||
| dalek | kudo: 5c065e0 | pmichaud++ | docs/spectest-progress.csv: spectest-progress.csv update: 399 files, 11428 passing, 0 failing |
14:31 | |
|
14:34
particle joined
|
|||
| dalek | rrot: r39451 | barney++ | trunk/t/library/pcre.t: [t] a saner name for a variable |
14:36 | |
|
14:45
particle1 joined
|
|||
| dalek | TT #747 created by barney++: Constants in library.h are not available as PASM constants | 14:48 | |
|
14:50
payload joined
|
|||
| dalek | rtcl: r441 | coke++ | trunk/ (9 files): - make tools/spectcl a generated file to track parrot location |
14:50 | |
| rrot: r39452 | barney++ | trunk/t/library/pcre.t: Add trac ticket number to a TODO comment |
14:52 | ||
| rtcl: r442 | coke++ | trunk/t/cmd_expr.t: Add a todo test that causes some failures in the expr spec tests. |
15:05 | ||
|
15:18
Theory joined
15:21
donaldh joined
|
|||
| dalek | rtcl: r443 | coke++ | trunk/t/cmd_expr.t: Make it clear that this isn't just a round edge case, but a floor. |
15:34 | |
|
15:35
HG` joined
16:01
Su-Shee joined
|
|||
| Su-Shee | hi. | 16:01 | |
| Infinoid | Hi, Su-Shee | 16:26 | |
|
16:29
viklund_ joined,
payload joined,
Psyche^ joined
16:35
flh joined
16:37
barney joined
|
|||
| particle1 | what made you decide on the newark job? | 16:39 | |
| particle | ...stupid irc client... | ||
| dalek | tracwiki: v16 | cotto++ | ParrotQuotes | 16:47 | |
| tracwiki: bacek++ and chromatic++ commiserate over code quality | |||
| tracwiki: trac.parrot.org/parrot/wiki/Parrot...ction=diff | |||
|
16:56
chromatic joined
|
|||
| dalek | rrot: r39453 | barney++ | trunk/t/pmc/array.t: TT #671 Rewrite of t/pmc/array.t to PIR Courtesy of bobw. |
16:56 | |
|
16:58
antiphase joined
|
|||
| dalek | TT #671 closed by barney++: [PATCH] Rewrite of t/pmc/array.t to PIR | 17:02 | |
| antiphase | Is there much PIR anywhere that I can look at for examples? I'm not getting very far with the official docs | 17:09 | |
| Tene | antiphase: t/ has a lot of small examples | 17:10 | |
| in the parrot tree | |||
| antiphase | Ah, thanks | ||
| cotto | antiphase, you can also pass --target=pir to most Parrot-based compilers to see what PIR they generate. | 17:11 | |
| although that code isn't designed to be human-readable | |||
| antiphase | I'm more trying to get into writing it by hand | 17:12 | |
| NotFound | antiphase: there are examples in the directory examples/ (surprise!) | 17:14 | |
| cotto was just going to mention that | |||
| antiphase swears at his terminal colour scheme for having barely visible directory entries | 17:15 | ||
| NotFound | unalias ls | 17:16 | |
|
17:17
payload joined
17:30
darbelo joined
|
|||
| dalek | kudo: dbebac0 | pmichaud++ | src/parser/grammar.pg: Update "module Foo;" to allow statements before it. |
17:43 | |
|
18:02
AndyA joined
18:16
nnunley joined
18:32
clinton joined
18:41
bacek joined
18:44
sekimura joined,
clinton left
|
|||
| Tene | I prefer alias ls='ls -F' | 19:04 | |
|
19:06
bacek joined
19:20
donaldh joined
|
|||
| dalek | rtcl: r444 | coke++ | wiki/SpecTest (2 files): Update 'make spectest' information - update to reflect current status on feather. - mark tests we should skip with @SKIP |
19:32 | |
| mj41 | /msg purl karma mj41 | 19:35 | |
| dalek | rtcl: r445 | coke++ | trunk/tools (2 files): Don't maintain the skip list in two locations. |
19:36 | |
|
19:52
Whiteknight joined
|
|||
| Whiteknight | Infinoid: ping | 19:52 | |
| Infinoid | ohai | 19:58 | |
| Whiteknight | I'm not online for long here, but have you heard any other complaints about the io_rewiring branch? | ||
| because I'm going to merge it tonight unless any issues have popped up | |||
| Infinoid | No, sounds like partcl and rakudo work, haven't heard anything else about it at all | 19:59 | |
| Whiteknight | okay, hearing nothing is good enough for me | ||
| I want it in so we can hash out any issues tomorrow at #ps | |||
| Infinoid | And impress everyone with our performance, hopefully :) | 20:00 | |
| Whiteknight | hopefully | ||
| The 1.3 release should be quite an impressive one | |||
| at least in terms of performance | 20:01 | ||
| dalek | rtcl: r446 | coke++ | wiki/ParrotIssues.wiki: add another memory related TT. |
||
|
20:02
Su-Shee left
|
|||
| Whiteknight | okay, talk to you later | 20:14 | |
|
20:38
particle1 joined
21:03
Whiteknight joined
|
|||
| Whiteknight | Infinoid: ping again | 21:06 | |
| purl | I can't find again in the DNS. | ||
| Whiteknight | wow, coretest IS significantly more speedy on my system too | 21:13 | |
| chromatic | Try it with parallel testing. It's even better. | ||
| Whiteknight | how do you do that? | ||
| make -j2 coretest I assume? | 21:14 | ||
| chromatic | mj coretest TEST_JOBS=5 | 21:15 | |
| The important part is TEST_JOBS=n | |||
| Set that environment variable to tell TAP::Harness how many parallel tests to run. | 21:16 | ||
| -j2 only affects how many parallel jobs make will run. | |||
| Whiteknight | I don't have mj on my system. How to get it? | ||
| chromatic | alias mj='make -j9' | ||
| Whiteknight | ah | ||
| holy crap this is amazing | 21:17 | ||
| chromatic++ | |||
| you may have just optimized me | |||
| chromatic | I can run coretest in 45 seconds with a few of the PIO optimizations. That's a huge improvement. | 21:18 | |
| Whiteknight | how long was it taking you before? | 21:19 | |
| cotto | oh wow | ||
| Whiteknight | I think I just ran it in 64 seconds, it used to take me about 10 minutes | ||
| dalek | kudo: acd4cfb | jnthn++ | src/ (2 files): Fixes and improvements to .^parents, to get it passing all of the current tests in S12-introspection/parents.t. Should now be much more stable than before. |
||
| kudo: a18b1d3 | jnthn++ | t/spectest.data: Add S12-introspection/parents.t to spectest.data. |
|||
| Whiteknight | maybe closer to 7 minutes, but same ballpark | ||
| chromatic | I could run it somewhere between 90 - 120 seconds before. | 21:21 | |
| dalek | rrot: r39454 | whiteknight++ | branches/io_rewiring (60 files): [io_rewiring] Merge from trunk r39444:39453 |
21:22 | |
| Whiteknight | We're squeezing that IO system pretty tightly, I don't think there are too many more huge gains to be had from it | ||
| a few small points here or there, but I think we got the bulk of it | 21:23 | ||
| chromatic | I'll try to fix a couple of bugs in the calling conventions rewiring branch soon; I have some ideas. | ||
| Whiteknight | Last time I talked to allison she said she had a lot of uncommitted changes in that branch, so I haven't been touching it | 21:24 | |
| but I do hope that one lands soonish | |||
| chromatic | She committed those changes but asked that no one commit to it unless they can demonstrate that the patch solves real problems. | ||
| cotto | I think she committed them with the caveat that some might need to be rolled back | ||
| Infinoid | There's a "FIXME: This is badly optimized, will fixup later" comment in Parrot_io_write_buffer(), I think we could see some reasonable gains from working on that | ||
| chromatic | Yeah, that looked like it flushes too aggressively. | 21:25 | |
| Whiteknight | Infinoid: Yeah, there are some places we can squeeze some more performance, but nothing that's going to be "holy shit" faster | ||
| Infinoid | chromatic: Did you see my commit from this morning? | ||
| chromatic: trac.parrot.org/parrot/changeset/39450/ | |||
|
21:25
japhb joined
|
|||
| Infinoid | heh, yeah. That fsync() stuff was pretty awful. | 21:26 | |
| The more junk we can get out of the way, the faster IO will happen, that's the name of the game right now | 21:27 | ||
| Whiteknight | true, I plan to start another branch tomorrow to work on more of that | ||
| pmichaud | fwiw, the faster I/O isn't likely to affect rakudo execution speed much. It will help with building speed, however. | 21:29 | |
| and I'll want to re-benchmark pbc_to_exe | |||
| Whiteknight | pmichaud: any speedups are good speedups | ||
| and then we move to the next subsystem and optimize like maniacs | |||
| Infinoid | Whiteknight: oh, uh, pong | 21:31 | |
| Whiteknight | Infinoid: can you delete the Pipe and PipeHandle PMCs from the branch? | 21:32 | |
| If I do it I'm going to have to rebump PBC_COMPAT and make the packfiles again, and borkage will happen | 21:33 | ||
| Infinoid | oki | ||
| Whiteknight | thanks (sorry to keep bothering you today!) | ||
| Infinoid | Ok. We still want the PBC_COMPAT bump because of Handle, but I think I can just leave that as-is | 21:37 | |
| (bumping it twice within the same dev branch doesn't seem necessary to me) | 21:38 | ||
| Whiteknight | okay, that's fine | ||
| Whiteknight disappears | |||
|
21:42
ruoso joined
|
|||
| dalek | kudo: 063f3d5 | jnthn++ | src/parser/ (2 files): Returns traits should parse a fulltypename, not a typename. Fixes a bug spotted by pyrimidine++. |
21:48 | |
| pmichaud | also, I should note that I didn't actually test rakudo on the io_rewiring branch -- I just tried out my benchmark. | 21:49 | |
| jonathan | pmichaud: It'll be a win for Rakudo programs doing I/O, which is a nice thing. | 21:50 | |
| pmichaud | jonathan: I'm not sure it'll be a big win (more) | 21:51 | |
| iiuc, the speedup is when doing I/O for native types | |||
| when not doing native types, we still end up making PCCINVOKE calls, which dominate the overall cost | |||
| jonathan | I thought the biggest win was removing so many calls to fsync? | ||
| pmichaud | I can give an idea... just a sec | 21:52 | |
|
21:53
Limbic_Region joined
|
|||
| pmichaud | okay, it does seem to give a speedup even for PMCs | 21:54 | |
| checking more. | |||
| a nice speedup, even | |||
| particle1 | fsync alone should result in <50% previous timings | 21:55 | |
| er, ~50% | |||
| pmichaud | not if the previous timings were due to the cost of converting integers to strings, or making method calls on the FileHandle object | ||
|
21:55
particle left
|
|||
| pmichaud | Let's put it this way. | 21:56 | |
|
21:56
bacek joined
|
|||
| pmichaud | In trunk, my benchmark of writing 2.5 million integers takes 50+ sec | 21:56 | |
| particle1 | yes, for i/o related activities, of course | ||
| pmichaud | before making the fsync fix, the branch had that down to 11 seconds | ||
| so that tells me that 39 seconds of time was spent in conversion, not fsyncs (since the number of fsyncs should remain the same in both cases) | 21:57 | ||
|
21:57
eternaleye joined
|
|||
| pmichaud | fixing the fsync issue brings it down to about 4 seconds | 21:57 | |
| chromatic | How about when running with -G on the branch now? | 21:58 | |
| pmichaud | well, the benchmark doesn't make pmcs, so I suspect -G doesn't make a difference. | ||
| running some tests now. | 21:59 | ||
| particle1 | it does if it calls functions | 22:00 | |
| nopaste | "pmichaud" at 72.181.176.220 pasted "Pm's 2.5 million integer benchmark" (20 lines) at nopaste.snit.ch/16833 | ||
| pmichaud | it doesn't call functions. | ||
| in trunk, that benchmark requires 48.5 seconds (wall) | |||
| in io_rewiring, it requires 8.5 seconds (wall) | 22:01 | ||
| but most of that speed improvement is due to avoiding the internal PCCINVOKEs | |||
| surprisingly! | 22:02 | ||
| converting the benchmark to output a PMC Integer instead of a register integer causes it to run in 3.5 seconds | |||
| I have no clue why. | |||
| particle | faster with pmc's. | ||
| interesting. | |||
| jonathan | How is it with a PMC in trunk? | 22:03 | |
| Before the io_rewiring? | |||
| nopaste | "pmichaud" at 72.181.176.220 pasted "running with PMC instead of integer register" (30 lines) at nopaste.snit.ch/16834 | ||
| chromatic | That sounds like a job for callgrind. | ||
| pmichaud | That tells me that Parrot_io_print_p_i is very suboptimal somehow. | 22:04 | |
| if an Integer PMC can be stringified faster than an int register.... that's.... weird. | |||
| trying in trunk | |||
| chromatic | STRING * const s = Parrot_sprintf_c(interp, INTVAL_FMT, $2); | ||
| Parrot_io_putps(interp, $1, s); | |||
| Oh look, varargs. Hooray. | 22:05 | ||
| pmichaud | aha! | ||
| integer.pmc uses | |||
| return Parrot_str_from_int(INTERP, SELF.get_integer()); | |||
| obviously Parrot_str_from_int is faster. | |||
| chromatic | Lots. | 22:06 | |
| jonathan | Any reason not to use that in the op? | ||
| pmichaud | in trunk: 37.2 seconds using the PMC (was 48.5 seconds using the int register) | ||
| I'll try converting the op and see what we get | |||
| chromatic | I'm testing it too. | ||
| Infinoid | pmichaud: It's still taking 8 seconds in branch for you? fsync knocked it down to 0.7 here | ||
| (yes, the majority of the improvement was still PCCINVOKE) | 22:07 | ||
| pmichaud | Infinoid: was that the "print an integer" test or the "print a '000,' string" one? | ||
| when printing only strings I got down to 0.7, yes. | |||
| but printing integers is still slow (always was) | |||
| Infinoid | Actually, I didn't try integers today | 22:08 | |
| pmichaud | right, I know that printing strings is fast. No conversion needed. | ||
| Infinoid | ok, I remember now, converting integers required allocating lots of STRINGs | ||
| chromatic | So does printing PMCs. | ||
| All tests pass with that op change. | 22:10 | ||
| pmichaud | changing the op on my system brings the time down to 3.3 seconds (was 8.5) | ||
| (I only made the conversion in the one op -- looks like much of io.ops needs updating) | 22:11 | ||
|
22:11
cognominal left
|
|||
| pmichaud | notably, making those changes means that Parrot's IO is faster than p5's | 22:11 | |
| (for this benchmark) | |||
| chromatic | Are you using an optimized Parrot? | 22:12 | |
| particle | whee! | ||
| pmichaud | no. | ||
| chromatic | Assume it's some 15% faster then. | ||
| pmichaud | the equivalent p5 on my system requires 4.6 seconds | ||
| chromatic | If your Perl 5 is a threaded build, Parrot's still faster than unthreaded Perl 5 (if you build Parrot with optimizations). | 22:13 | |
| pmichaud | anyway, I think we probably want to do a search for INTVAL_FMT and see if there are many places where we can change it to use Parrot_str_from_int instead | 22:14 | |
| chromatic | Let's merge in the branch first and then gild it! | ||
| Not geld, mind you. We're ungelding it. | |||
| pmichaud | looks like not too many instances -- just the few in io.ops | ||
| two, really. | 22:15 | ||
| so, back to the bigger picture -- the biggest slowdown in trunk remains the use of PCCINVOKE for I/O, which the branch eliminates | 22:17 | ||
| fixing fsync gets us a bit faster than that, but the fsync improvement isn't as big as the PCCINVOKE one | 22:18 | ||
| particle | that makes sense, as there are generally more function calls than writes | 22:21 | |
|
22:21
bacek joined
|
|||
| bacek | good morning #parrot | 22:22 | |
|
22:23
Theory joined
22:27
eternaleye joined,
kid51 joined
|
|||
| GeJ | Good morning everyone | 22:28 | |
|
22:31
cognominal joined
22:32
rg joined
22:33
sekimura_ joined
|
|||
| particle updates from 36154 to head | 22:33 | ||
| dalek | rrot: r39455 | Infinoid++ | branches/io_rewiring (2 files): Add a couple of files to branch that are in trunk (and MANIFEST) but apparently weren't svn added. |
22:38 | |
| kid51 | seen allison? | ||
| purl | allison was last seen on #parrot 5 days, 21 hours, 1 minutes and 21 seconds ago, saying: and, yes, base_type comparisons will work fine for this (and be very fast) [Jun 3 01:33:45 2009] | ||
| rrot: r39456 | Infinoid++ | branches/io_rewiring/src/pmc (2 files): Remove Pipe and PipeHandle PMCs, they aren't ready for prime time yet. |
|||
| rrot: r39457 | Infinoid++ | branches/io_rewiring/t/native_pbc (5 files): Regenerate native_pbc files. |
|||
| rrot: r39458 | Infinoid++ | branches/io_rewiring/PBC_COMPAT: PBC_COMPAT version 4.8 just gets Handle, not Pipe or PipeHandle |
|||
| Infinoid | Whiteknight: It's all yours. | ||
|
22:55
Austin_Hastings left
|
|||
| darbelo | cotto: ping | 22:58 | |
| cotto | darbelo, pong | 23:03 | |
|
23:07
tetragon joined
|
|||
| darbelo | I was trying to gauge the DecNum completion by comparing it with the other numeric PMCs and I think visit, freeze and thaw are our only unimplemented VTABLEs. | 23:09 | |
| cotto | That | ||
| That's quite possible. | |||
| darbelo | So I figured I might as well implement them. And then I realize I don't really know what the visit VTABLE is supposed to be doing. | 23:10 | |
| cotto | it's a little confusing. I remember having some trouble with it too. | 23:11 | |
| lemme dig up my notes | |||
| darbelo | So far, the src/dynpmcs dir wasn't very helpful. Only one pmc there implements it and all it does is call SUPER(). | 23:12 | |
|
23:13
Coke joined
|
|||
| darbelo | Looking at src/pmc now. | 23:13 | |
|
23:13
Coke joined
|
|||
| Coke | hio | 23:14 | |
| chromatic | Did you try Partcl with the fsync optimization on the IO branch? | 23:15 | |
| Infinoid | By the way, are there any HLLs out there other than Rakudo which can build from a separate (non-installed) Parrot build directory? | 23:16 | |
| I've tried partcl and lua, and they both require an installed Parrot. (Which I don't want to do 5 times a day whenever I want to test something.) | |||
| chromatic | Pheme can, but it doesn't do much. | 23:17 | |
| Pynie probably can. | |||
| Tene | Infinoid: steme can | 23:19 | |
| cardinal | |||
| purl | hmmm... cardinal is mail.freesoftware.fsf.org/pipermail...dinal-dev/ or the Ruby-on-Parrot project. or xrl.us/uyz3 | ||
| Infinoid | Cool, thanks. That'll help widen my test pool | ||
| Tene | Infinoid: I just install to ~/parrot | ||
| fwiw | |||
| cotto | darbelo, VTABLE_visit is called before both freeze and thaw and should do one of two things, depending on visit_info->what: | ||
| nopaste | "kid51" at 68.237.12.237 pasted "io_rewiring branch: failures in 2 test files" (31 lines) at nopaste.snit.ch/16835 | 23:20 | |
|
23:20
donaldh joined
|
|||
| Coke | Infinoid: installing parrot is not that much of a lift. | 23:21 | |
| chromatic | Pipe is gone, so the test should disappear too. | ||
| Infinoid | Coke: It is when I want to compare trunk vs branch, side by side | ||
| chromatic: Just committed that. | |||
| kid51 | Nothing at all in this file in io_rewiring branch: t/op/arithmetics_pmc.t | ||
| dalek | rrot: r39459 | Infinoid++ | branches/io_rewiring/t/pmc/pipe.t: Pipe is gone, remove pipe.t. |
||
| cotto | actually, you don't need to worry about visit for a non-aggregate PMC. | 23:22 | |
| Coke | chromatic: I just did a plain run of the spec suite to get a baseline. | 23:23 | |
| chromatic: it's been 4.5 months since I did that; will run against the io branch shortly. | |||
| kid51 | What about arithmetics_pmc.t ? | ||
| Infinoid | kid51: That and an NQP test were improperly merged in from trunk, I'll fix that | ||
| kid51 | Infinoid: Thanks. I wanted to time that branch's make coretest but couldn't due to failures. | ||
| darbelo | cotto: "don't need to worry" == "Leave unimplemented" ? | 23:24 | |
| cotto | yup | ||
| darbelo | One down, two to go. ;) | ||
| cotto | Your code should be similar to the freeze/thaw code in src/pmc/integer.pmc | ||
| dalek | rrot: r39460 | Infinoid++ | branches/io_rewiring (2 files): Copy in 29-self.t and arithmetics_pmc.t from trunk; the files I added in r39455 were apparently empty. |
||
| Infinoid | kid51: If you can update to r39460, it will hopefully only update tests and not require a clean/realclean to re-test | ||
| cotto | To freeze, just push stuff onto visit_info->image_io. | 23:25 | |
| Infinoid | kid51: I haven't compared without --optimize, but the difference should be striking either way | 23:26 | |
| darbelo | What should I do with the DecNumContext pointer? Freezing that seems like asking for a segfault to me. | ||
| cotto | Hmm. | ||
| chromatic | What's it store? | 23:27 | |
| darbelo | A pointer to another PMC. | ||
| cotto | Storing it is pretty simple (just a bunch of ints iirc), but it's a singleton. | ||
| jonathan | I think freezing pointers to other PMCs is what visit is about. | 23:31 | |
| But last time I played with a visit routine I just made lots of segfaults... :-/ | |||
| kid51 doesn't --optimize | 23:35 | ||
| cotto | That means that thawing a DecNumber PMC will either result in context inconsistency or the side-effect of changing the current context. | ||
| davidfetter | bon appetit, kid51_at_dinner | 23:37 | |
| cotto | I guess the context and number PMCs will need to be frozen/thawed separately and that the docs will need to warn about possible inconsistency if the programmer isn't careful. | ||
| particle | feh. on windows, nmake smoke ; nmake smoke rebuilds parrot | 23:41 | |
| darbelo | cotto: I could change from storing a PMC pointer in an ATTR to always pmc_new()ing the context. | ||
| particle bikes 11 miles & | |||
|
23:41
patspam joined
|
|||
| cotto | darbelo, I'd look into the efficiency of that. It might make more sense to use pmc_new to populate the ATTR when thawing the context. | 23:42 | |
| darbelo | Efficiency is why I put the ATTR there :) | 23:43 | |
| pmc_new on thaw sounds good, but I'm not sure how thaw()ing a singleton works. | 23:44 | ||
| If someone creates a context and then thaws a frozen one, does it get overwritten? | |||
| cotto | That's tricky. The answer is either "yes" or "no, and there may be a context mismatch". | 23:45 | |
| jonathan | What sort of things does the context hold? | 23:47 | |
| And what would the upshot of a mismatch be? | |||
| darbelo | A pointer to a structure. Thing is the semantics of the PMC rely on the singleton-ness of the context. | 23:50 | |
| cotto | speleotrove.com/decimal/dncont.html - number of digits, exponent min/max, error info, rounding mode | ||
| I think the best option may be to implement thawfinish as a place to do a sanity check on the newly-thawed PMC. | |||
| (thawfinish is called after thaw to do any cleanup work) | 23:51 | ||
| actually, you could do that during thaw too | |||
| s/too/instead/ | |||
| pmichaud | jonathan: do you expect to be around tomorrow? | 23:52 | |
| darbelo | I guess I could just pmc_new() a location for us and overwrite the structure, but I'm not sure a context can survive a gc run without a DecNum to mark it as live. | ||
| cotto | Without looking at the code, I'd assume that singletons wouldn't get GC'd until the interp exits. | 23:53 | |
| cotto goes to check | |||
| Whiteknight, ping | 23:57 | ||
| jonathan | pmichaud: Yes | 23:58 | |
| pmichaud: Got Slovak class early afternoon, may have to run an errand to the bank in the morning. | 23:59 | ||
| pmichaud | I'm planning to work on rakudo building from installed parrot tonight -- I'll probably need some testing on windows platforms | ||
| jonathan | pmichaud: But other than those, should be. | ||
| OK | |||
| Do you plan to keep it possible to build Rakudo against a Parrot build tree too, or installable only? | |||
| pmichaud | I hope to be able to use both | ||