00:24
avuserow joined
01:26
FROGGS_ joined
01:48
ilbot3 joined
03:12
klaas-janstol joined
|
|||
timotimo | speshing iscont ends up with sad spectests :\ | 06:06 | |
dalek | MoarVM: 5dccd58 | (Timo Paulssen)++ | src/jit/ (2 files): add an impl of callercode to the jit |
06:13 | |
timotimo | spectests with this are clean :) | ||
(except for the usual) | 06:14 | ||
my implementation of objprimspec seems to lead to a segfault; gdb's disassemble command doesn't seem helpful | 06:44 | ||
gist.github.com/timo/a9c5a2476b63a9afe842 | 06:49 | ||
i don't see anything wrong with it, which doesn't mean much :) | 06:50 | ||
07:25
brrt joined
|
|||
brrt | timotimo: ARG1 might be TMP1, but it might not | 07:25 | |
timotimo | oh! | ||
brrt | depending on ABI | 07:26 | |
timotimo | i didn't know that they are double-booked | ||
brrt | TMP5 == FUNCTION | ||
TMP6 == always free :-) | |||
timotimo | OK, let me try re-naming | ||
brrt | well, i should comment it | ||
timotimo | how many can i freely use? | ||
brrt | TMP5 and TMP6 are safe from the ARG1,2,3,4 set | 07:31 | |
ARG5 and ARG6 are /not defined/ on windows, so you shouldn't use them in hand-written code | |||
RV is also safe, so if you can't deal with it in any other way, you could use that | |||
timotimo | actually, i'm going to commute for about 1.5h or so | 07:46 | |
if you want, you can feel free to adapt the code to use different registers and commit&push that if it works | 07:47 | ||
but if not, feel free to work on whatever :) | |||
brrt | oh, yeah, i'll change it and test it | 07:51 | |
uhm, darn | 08:02 | ||
get_storage_spec returns an object (11 bytes wide as far as i can tell) not a pointer | 08:03 | ||
can i change the signature for that? | 08:12 | ||
i'd like to pass a pointer directly | 08:20 | ||
also, it crashes :-( | 08:24 | ||
brrt wonders why we suddenly have so many issues | 08:25 | ||
what limit are we reaching? | |||
dalek | MoarVM/moar-jit: b1b7dc9 | (Bart Wiegmans)++ | src/jit/ (2 files): Add an implementation of objprimspec. This add the space for the struct as a third argument. I'd like to change the get_storage_spec signature to make this formal so it'll work on all platforms. rakudo build fails with this patch. |
08:28 | |
08:53
brrt joined
|
|||
nwc10 | brrt: moar-jit causes ASAN barfage in Rakudo setting | 09:11 | |
hangon, PEBKAC. Barfage is real, but I'm not sure what version I'm building | 09:13 | ||
bother, no, it's master, but with JIT | 09:14 | ||
This is MoarVM version 2014.07-407-g5dccd58 built with JIT support | |||
OK, I can't repeat it. | 09:17 | ||
that's not normal | |||
brrt | ugh | 09:18 | |
it's not surprising either | |||
nwc10 | paste.scsys.co.uk/416285 | ||
retrying with a complete clean build | |||
09:19
cognome joined
|
|||
brrt | uhm, findmethod is allocating? | 09:23 | |
nwc10 | can reproduce with a clean build | ||
I don't know enough to answer that | 09:24 | ||
brrt | yeah, it does | 09:29 | |
dalek | MoarVM/moar-jit: 2a4ce16 | (Bart Wiegmans)++ | src/jit/ (2 files): Read only word of storagespec. Doesn't work at all |
09:35 | |
brrt | seems like that frame misses a refcount | 09:38 | |
we've got issues, man | 09:49 | ||
nwc10 | MoarVM master^ is unhappy | ||
brrt | yeah | 09:50 | |
i'm unhappy, i might add | 09:55 | ||
FROGGS | jnthn: here is what valgrind thinks about my problem: gist.github.com/FROGGS/446932521a4283eaca71 | 09:58 | |
errrm | 09:59 | ||
jnthn: this makes me think we add cu to gen2 twice: github.com/MoarVM/MoarVM/blob/mast...unit.c#L11 | 10:00 | ||
yeah, I am pretty sure that patch is wrong | 10:03 | ||
question is: did it really help to solve an issue? | |||
brrt | yeah, it did, CU was moved in during allocations | 10:04 | |
and the JIT continued to look at the wrong CU | 10:05 | ||
FROGGS | hmmm | ||
10:07
klaas-janstol joined
|
|||
nwc10 | 2014.07-405-g669a171 looks to be good | 10:14 | |
brrt | which one is that? | 10:31 | |
such mystery, much unhappy | 10:37 | ||
also.. i can't repeat it now, maybe i can repeat it some other time (yay) | 10:40 | ||
can assign be invokish? i may have asked this before | 10:43 | ||
since decont can be | |||
and the cur_op is incremented before calling store | 10:45 | ||
yeah, assign can be invokish | |||
ok | |||
we probably need to change that | |||
i think it's fair to say that made a lot of difference | 10:55 | ||
since well, not dealing invokish well can lead to freaky states | 10:56 | ||
btw, can make spectest be parallelized? | |||
dalek | MoarVM/moar-jit: 85b3b14 | (Bart Wiegmans)++ | src/ (3 files): Made assign invokish. Not dealing with invokishness makes for some freaky states. Enabled iscont, disabled objprimspec. Spectests pass for me |
11:01 | |
lizmat | brrt: I have TEST_JOBS=8 in my ENV to parallelize spectest | ||
I *think* you should be able to do the same with the hardware you have :-) | 11:02 | ||
brrt | yeah, i think so too | 11:04 | |
although i usually limit it to 4 | |||
but, spectests are clean for me | 11:05 | ||
anyway, lunch & | |||
sergot | hi o/ | 11:20 | |
11:24
klaas-janstol joined
11:38
brrt joined
|
|||
brrt | \o sergot | 11:38 | |
basically, from a jit perspective, returning structs between 8 and 16 bytes wide is a major pain | 12:16 | ||
with some emphasis on /major/ | |||
jnthn | brrt: I'm guessing storage_spec is the pain here? | ||
brrt | yes | ||
i'm busy changing that into void (ThreadContext*, MVMStable*, MVMStorageSpec *s) | 12:17 | ||
or rather, changing that, and i'm thinking 'what if we return the pointer as well as having it passed, that'd be handy in functions that take the result directly' | 12:18 | ||
note that there is virtually no difference between the earlier signature and the win64 behavior | |||
dalek | MoarVM: eaf46f5 | jnthn++ | src/ (2 files): Avoid bogus mark_special_return_data calls. Hopefully fixes use-after-free found by nwc10++ with ASAN. |
12:20 | |
jnthn | brrt: I wonder if we should just return a pointer always. (more) | 12:21 | |
For some REPRs the data is basically constant, so we can just have it as static data and return its address. | |||
For the REPRs more complex than that, we can hang it off their REPR_data struct and build it just once | |||
That'd save building it each time also. | 12:22 | ||
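Editorial sketch of the three calling shapes discussed above, for readers following the get_storage_spec refactor. Everything here is hypothetical: `DemoStorageSpec` and the `demo_*` functions are illustration names, and the field layout is not MoarVM's real one. The struct is sized at 12 bytes so that it falls into the 8 to 16 byte range brrt mentions; as far as I understand it, the System V x86-64 ABI can return such a struct in a register pair, while Win64 uses a caller-provided hidden pointer, which is why hand-written JIT code has to care.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a storage spec (12 bytes on typical ABIs).
 * This is NOT MoarVM's actual struct layout. */
typedef struct {
    uint16_t inlineable;
    uint16_t bits;
    uint16_t align;
    uint16_t boxed_primitive;
    uint32_t can_box;
} DemoStorageSpec;

/* Variant 1: return by value. How the bytes travel back to the caller
 * (registers or a hidden pointer) depends on the platform ABI. */
DemoStorageSpec demo_spec_by_value(void) {
    DemoStorageSpec s = { 1, 64, 8, 2, 0 };
    return s;
}

/* Variant 2: the caller passes the storage explicitly. Returning the same
 * pointer lets the call be used directly inside an expression. */
DemoStorageSpec *demo_spec_into(DemoStorageSpec *out) {
    assert(out != NULL);
    out->inlineable = 1;
    out->bits = 64;
    out->align = 8;
    out->boxed_primitive = 2;
    out->can_box = 0;
    return out;
}

/* Variant 3: the data is constant, so it is built once and only its
 * address is handed out. Nothing is copied per call. */
const DemoStorageSpec *demo_spec_static(void) {
    static const DemoStorageSpec spec = { 1, 64, 8, 2, 0 };
    return &spec;
}
```

Variants 2 and 3 both reduce to "one pointer in, one pointer out" at the machine level, which is the property that makes them easy to call from generated code on every platform.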
brrt | jnthn: note that the use-after-free can potentially also be caused by assign being invokey and this being not handled | 12:23 | |
hmm | 12:24 | ||
jnthn | brrt: It showed up in code I added yesterday | ||
brrt: I assume we're considering findmethod invokey already? | |||
brrt | ehm... yeah, but i should check | ||
ehm, no | 12:25 | ||
not invokish | |||
probably lost in merge? | |||
wtf | |||
jnthn | findmeth w(obj) r(obj) str :pure :invokish | ||
findmeth_s w(obj) r(obj) r(str) :pure :invokish | |||
That's in master | |||
brrt | hmm let me see, i may be in the wrong branch | ||
jnthn | And master also has: | ||
assign r(obj) r(obj) | |||
assignunchecked r(obj) r(obj) | |||
Both of those are invokish | |||
brrt | sp_findmeth in master? | 12:26 | |
yeah i fixed that in moar-jit | |||
i should probably merge that back | |||
jnthn | sp_findmeth .s w(obj) r(obj) str sslot :pure | ||
That wants marking invokish too | |||
brrt | uh, that's invokish indeed | ||
jnthn | Though that one is interesting | ||
brrt | that wants a refactors | ||
jnthn | In so far as if we hit the monomorph cache we can jump straight over the invokish guard too. | ||
brrt | refactor | ||
12:26
zakharyas joined
|
|||
brrt | hmm yeah | 12:26 | |
jnthn | So we can save the check. | ||
brrt | wait | 12:27 | |
sp_findmeth is implemented in such a way as to deal with the invokish itself | |||
as in MVM_6model_find_method_spesh returns 1 if invokish, 0 if not | |||
so yeah | |||
it does what you suggest | |||
jnthn | OK, cool :) | ||
brrt | i should comment it in the oplist, though | ||
jnthn | So we needn't mark it invokish? :) | ||
brrt | no | 12:28 | |
jnthn | ok | ||
brrt | as in, it won't do any harm, but it won't do any good either | ||
jnthn | Heh, maybe the mark is as good as the comment ;) | ||
I guess if_o and unless_o already have their invoke potential handled by the desugaring? | 12:29 | ||
assertparamcheck r(int64) :noinline | 12:32 | ||
I think I saw that mentioned recently. It's invokish. | |||
brrt | yeah | 12:33 | |
oh is it? | |||
jnthn | yeah. That's what it does | ||
If the bind fails, it invokes something to produce and throw a decent error. | |||
We often eliminate it during spesh. | 12:34 | ||
brrt | yeah, i see | ||
if it pre-increments cur_op, good chance it's invokish :-) | |||
jnthn | Some of the continuation ones will be invokish (control and invoke) | ||
But I don't think we've tried to JIT those yet :) | |||
They'd just be calls anyway. | |||
Think that's all I see wrt invokish marks. | 12:35 | ||
brrt | ok, merged moar-jit nqp tests pass, at least | 12:39 | |
jnthn | :) | ||
brrt | parallel tests make a lot of difference | 12:42 | |
FROGGS | jnthn: please backlog here when you have time | ||
jnthn | FROGGS: Did already; that's where I found the double-free thing I fixed a moment ago... | 12:44 | |
FROGGS: Sadly, the ASAN output for the bug you're seeing is a bit less informative about the problem there :( | |||
As for "adds it twice": that's not really what's going on here. MVM_gc_root_gen2_add isn't about allocating in gen2, it's about saying "this gen2 object may point to nursery objects" | 12:46 | ||
FROGGS | hmmm | ||
dalek | MoarVM: 7dad12b | (Bart Wiegmans)++ | src/jit/ (2 files): Commit branching iscont as well as extra labels. Both these changes break spectest / nqp. So something is off. Commit on moar-jit for the time being. | |||
MoarVM: 92e3f76 | jnthn++ | src/core/frame.c: Clear special return data more eagerly. This prevents access to it if the special return handler frees it, then goes on to do something that allocates. nwc10++ for identifying the problem. |
12:49 | |
12:50
dalek joined
|
|||
brrt | adding a few extra labels, btw, still breaks about everything | 12:50 | |
but why? no idea | |||
jnthn | FROGGS: My guess was that somewhere we'd be not properly protecting storing an SC reference in the CU, e.g. with MVM_ASSIGN_REF. But having looked through all the places that happens, I can't find anywhere it's an issue. :S | 12:53 | |
brrt | on the good news side, we now have both iscont and assign | ||
jnthn | (There's actually only a few places.) | ||
FROGGS | :/ | ||
13:12
ggoebel11111110 joined
|
|||
brrt is starting on the get_storage_spec thingy | 13:27 | ||
timotimo | o/ | 13:28 | |
brrt: i assume you're aware of what the "firm pencils down" and "1. final evaluation deadline" mean? | 13:37 | ||
(because i forgot the exact stuff again, but i know where to look) | 13:40 | ||
jnthn | timotimo: I think "firm pencils down" means "any work beyond this point can't be taken into account for GSoC evaluation", not "please stop contributing" :) | ||
timotimo | *that* part i recall vividly :P | 13:41 | |
jnthn | The key thing is that on 19th, 20th, or 21st, brrt and I ensure we file evaluations. | 13:43 | |
And on that a little beyond that, some code samples are submitted. | 13:44 | ||
s/on that// | |||
timotimo | good, i see you already have this thing planned out perfectly and will likely not require further semi-educated nagging | ||
brrt | eh, yeah, i'm aware of what it mean | 13:45 | |
timotimo | while digging a few finger's widths into the jit code, i thought at first that making the register allocation stuff will be super hard, then realized it may actually end up being somewhat easy | 13:46 | |
jnthn | I think brrt++ and I will be in the same place when the eval deadline arrives, anyway. :) | 13:47 | |
brrt | means | ||
timotimo | i'm sure i'll actually stumble upon the thing that makes it seem not easy if i do any more work inside the emit.dasc file | 13:48 | |
brrt | basically, we need to add stuff to dynasm to make it work, and i haven't gotten to that yet | 13:49 | |
and by 'stuff', i mean 'select all registers on x64 dynamically' | |||
timotimo | yup, my evaluation of the situation presumed that feature already existing | ||
brrt | ah | 13:50 | |
well, yeah, i agree | |||
when that happens - sketching ahead into the future right now - we'll probably make some very simple nodes like conditionals and stuff | |||
timotimo | to make register allocation work properly across jump borders? | 13:51 | |
brrt | and try to move calls that are now implemented in asm back into the graph | ||
no, ehm, to be honest, that would require some work with the ssa form rather than the original registers i think | |||
timotimo | OK | 13:52 | |
brrt | but i haven't really thought that out | ||
yet | |||
timotimo | so we have proper working iscont and assign (and assignunchecked i guess too?) on moarvm's master branch? | 13:53 | |
brrt | yeah | 14:05 | |
it seems that way | |||
and i'm working on objprimspec | |||
but that means refactoring get_storage_spec | 14:06 | ||
which has /many/ implementations | |||
so that's taking a while | |||
(bbiab, errands &) | |||
timotimo | oh well :\ | ||
15:21
nebuchadnezzar joined
15:27
brrt joined
|
|||
nwc10 | jnthn: sadly not paste.scsys.co.uk/416320 | 15:33 | |
paste.scsys.co.uk/416321 | 15:35 | ||
15:54
nebuchadnezzar joined
16:21
zakharyas1 joined
16:27
brrt joined,
brrt left
18:06
brrt joined
|
|||
timotimo | i'd like to build something like "jvisualvm" for moarvm, some program to give access to all kinds of stats, performance numbers, spesh/jit dumps, ... | 18:32 | |
is there a known design pattern/thingie that'll allow me to put data gathering stuff into moarvm without impeding non-instrumented execution? | 18:33 | ||
diakopter | my thought (and parrot's technique) was to render specialized interpreter "runcores" | 18:34 | |
timotimo | right, moarvm already has this trace thingie that'd put an ifdefd piece of code into the beginning of the interpreter loop | 18:35 | |
i could build something based on ptrace or something similar, but that'd be linux-only | 18:37 | ||
diakopter | right, but you could munge the file to have several versions of the same interpreter loop in the same binary | ||
timotimo | and using a gdb from python is ridiculously slow (for example the code that scans the gen2 and nursery for types and such) | ||
i could; it'd be more or less copy-pasting the whole interpreter loop, though :\ | |||
diakopter | yeah, but it's fine if you don't mind a few hundred more KB in the binary for such frankenbuilds | 18:38 | |
timotimo | IIUC the "local state" in the interpreter loop isn't too complex, so you could even imagine longjmping from a normal interpreter loop into an instrumented one back and forth | ||
diakopter | obviously it wouldn't be the default to build more than one variant | ||
timotimo | of course | ||
do you know something that'd make the copy-pasting stuff automated and somewhat robust-ish? | |||
diakopter | methinks a perl script could do it | 18:39 | |
robust? what's that? | |||
timotimo | :)) | ||
AFK | |||
diakopter | if you *really* wanted to, you could put the loop variants in the same routine so you could jump between them as you crossed tracing/non-tracing boundaries | 18:43 | |
you'd want special instructions/ops to switch runcores :) | |||
(so you didn't have to run a check with each instruction) | |||
timotimo | aye, thought of that as well | 18:45 | |
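Editorial sketch of the "several versions of the same interpreter loop in the same binary" idea, assuming a toy three-op bytecode. None of this is MoarVM code: the ops, the `DEFINE_RUNCORE` macro and the `demo_*` names are invented for illustration, and a plain `switch` stands in for computed-goto dispatch. The loop body is written once and stamped out twice, so the plain variant carries no instrumentation cost at all.

```c
#include <assert.h>
#include <stddef.h>

/* Toy bytecode, hypothetical and unrelated to MoarVM's real op set. */
enum { DEMO_OP_INC, DEMO_OP_DBL, DEMO_OP_HALT };

/* Per-op execution counters, only touched by the traced runcore. */
unsigned long demo_op_counts[256];

/* Hooks spliced in at the top of each dispatch step. */
#define DEMO_NO_HOOK(op)    ((void)0)
#define DEMO_COUNT_HOOK(op) (demo_op_counts[(op)]++)

/* One definition of the loop body; each expansion yields a separate
 * function with a different hook compiled in. */
#define DEFINE_RUNCORE(fn_name, HOOK)                 \
    long fn_name(const unsigned char *code) {         \
        long acc = 0;                                 \
        assert(code != NULL);                         \
        for (;;) {                                    \
            unsigned char op = *code++;               \
            HOOK(op);                                 \
            switch (op) {                             \
            case DEMO_OP_INC:  acc += 1; break;       \
            case DEMO_OP_DBL:  acc *= 2; break;       \
            case DEMO_OP_HALT: return acc;            \
            default:           return -1;             \
            }                                         \
        }                                             \
    }

DEFINE_RUNCORE(demo_run_plain,  DEMO_NO_HOOK)
DEFINE_RUNCORE(demo_run_traced, DEMO_COUNT_HOOK)
```

Switching between the two at run time would still need a decision about where to change cores (for example at frame entry, or via a dedicated op as suggested above); this sketch only shows how to avoid copy-pasting the loop.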
jnthn | timotimo: I've a few ideas on this, though am a bit tied up worrying about fixing bugs and doing my YAPC::EU talks and also preparing to be away from home a while these next days...could happily look at it together in a few days... | 18:47 | |
timotimo | but then i won't have the "lazy sunday evening" effect working for me any more ;)) | 18:48 | |
i think you once said spesh would be a fantastic opportunity to do instrumentation of regular bytecode | |||
jnthn | Yes | ||
I still think that'd be a decent approach. | 18:49 | ||
timotimo | in principle we could also change spesh to just spesh *every single* frame and instead of optimizing stuff it'd put instrumentation in | ||
hmm. but maybe for instrumentation it'd be enough to just instrument frames that end up being speshd in the regular way | 18:50 | ||
as in: "hot" frames | |||
i wonder if i can use hardware breakpoints to trap calls to the relevant pieces of the gc | 18:52 | ||
hmm, with CGOTO there's not really a single spot where i could plop a function call that'd check for MVM-bytecode-level breakpoints | 18:55 | ||
diakopter | in the macro itself? | 18:56 | |
OP? | 18:57 | ||
timotimo | oh | 18:58 | |
that's actually a better idea than at the goto NEXT spot. | |||
18:59
zakharyas joined
|
|||
diakopter | that's not to say I'm supportive of such checking for breakpoints; I think dynamically injecting the special ops is a better idea (as you mentioned doing with spesh, and as my earlier idea was assuming you'd have to do) | 18:59 | |
timotimo | hm, so every annotation for a line number would get a little "line number changed" op called that'd be hooked into for things | 19:03 | |
diakopter | maybe, but why not use those markers instead to just know where to inject your "breakpointbreak" op | 19:04 | |
that reminds me to watch Point Break again. | |||
afk& | |||
nwc10 | jnthn: I can see why ASAN is still barfing. I'm not sure what the right fix is | 19:22 | |
there's a GC run while attempting to throw the exception in late_bound_find_method_return(), and the GC mark routine assumes that frame->special_return_data is still pointing to something valid, because mark_special_return_data is not 0 | 19:24 | ||
somehow the free-ing and the NULL-ing need to end up in the same function | |||
jnthn | ah... | 19:30 | |
Or we NULL before we ever call the function... | 19:31 | ||
Since we're passing the data into it anyway | |||
And the only reasonable thing for it to do is clear up | |||
Since it's not like it can be called again. | |||
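Editorial sketch of the "NULL before we ever call the function" ordering, under the assumption that a GC mark can run while the special return handler is still executing. The struct and every `demo_*` name are hypothetical stand-ins; only the field names echo the ones mentioned in the discussion, and this is not the actual fix that landed in MoarVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct DemoSrFrame DemoSrFrame;
typedef void (*DemoSrHandler)(DemoSrFrame *f, void *data);
typedef void (*DemoSrMarker)(void *data);

/* Hypothetical frame with just the three fields relevant here. */
struct DemoSrFrame {
    DemoSrHandler special_return;
    DemoSrMarker  mark_special_return_data;
    void         *special_return_data;
};

int demo_marks_seen = 0;   /* how often the marker touched the data */
int demo_last_marked = 0;

/* Simulated GC mark of a frame: follows the data pointer only when a
 * marker is installed, mirroring the assumption nwc10 describes. */
void demo_gc_mark_frame(DemoSrFrame *f) {
    if (f->mark_special_return_data) {
        f->mark_special_return_data(f->special_return_data);
        demo_marks_seen++;
    }
}

/* A marker that dereferences the data, so it must never see freed memory. */
void demo_mark_int(void *data) {
    demo_last_marked = *(int *)data;
}

/* A handler that frees its data and then does something that "allocates",
 * modelled here as an immediate GC mark of the same frame. */
void demo_free_then_gc(DemoSrFrame *f, void *data) {
    free(data);
    demo_gc_mark_frame(f);
}

/* Detach handler, marker and data from the frame first, then run the
 * handler on local copies. A GC during the handler finds nothing to mark. */
void demo_run_special_return(DemoSrFrame *f) {
    DemoSrHandler handler;
    void *data;
    assert(f != NULL);
    handler = f->special_return;
    data    = f->special_return_data;
    f->special_return           = NULL;
    f->mark_special_return_data = NULL;
    f->special_return_data      = NULL;
    if (handler)
        handler(f, data);
}
```

The trade-off is that the handler becomes the sole owner of the data for the duration of the call; if it held GC-managed references it would have to root them itself.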
brrt | ahah... reprs can be assigned from deserialization too | 19:32 | |
much great | |||
ugh | 19:34 | ||
this is harder than it looks | |||
(this = refactoring all get_storage_specs) | |||
timotimo | got a hot tip how i could instrumentalize the GC to collect stats like how long does it take, how many objects (of what sizes) got promoted to {the next nursery, gen2}, what kind of allocations cause gc runs most often, ... | 19:35 | |
hm. i should probably look into perf some more. | 19:37 | ||
it may just be able to do half of that stuff by itself | |||
nwc10 | jnthn: if that works, yes, sounds right. I wasn't sure if it was how it worked | 19:38 | |
timotimo | oh wow | 19:41 | |
i can just put a perf probe into any currently running program | |||
all it needs is debug symbols, it seems | |||
FROGGS | Program received signal SIGSEGV, Segmentation fault. | 19:45 | |
0x00007ffff79c78ec in gc_mark (tc=0x603650, st=<optimized out>, data=<optimized out>, worklist=0x379d700) at src/6model/reprs/SCRef.c:75 | |||
75 MVM_gc_worklist_add(tc, worklist, &(sc->sr->root.sc)); | |||
timotimo | interesting, there's a -ggdb flag for gcc | ||
FROGGS | sc->sr->root is accessible, sc->sr->root.sc points to invalid mem | ||
jnthn | ooh | ||
That's a good find | |||
FROGGS | coincidence :o) | ||
nwc10 | same bug, or different? | 19:46 | |
jnthn | Will look in a moment, trying my second attempt at fixing the nwc10++ reported thing first... :) | ||
FROGGS | nwc10: related to what I saw yesterday me thinks | ||
:o) | |||
19:47
itz_ joined
|
|||
brrt | y have i an all zero p6num repr_data | 19:49 | |
where are repr_data's created | |||
jnthn | brrt: Allocated in compose or deserialize_repr_data normally | 19:52 | |
brrt | hmmm | ||
doesn't seem like they're called | 19:53 | ||
jnthn | It's possible (though maybe a bit odd) that get_storage_spec somehow is called before those... | 19:55 | |
I think get_storage_spec has a code-path to handle that | |||
And hands back a fairly "default" answer in that case. | |||
brrt | yeah | 19:56 | |
but still i get a p6num with a zeroed-out storage spec | |||
which is just | |||
weird | |||
imho | |||
jnthn | It is a bit | 19:58 | |
jnthn | I don't quite see how that can happen. | ||
Apart from a serialization oddity | |||
Like, we're asked for the storage spec before having deserialized. | |||
Do you know what is asking for the storage spec at this point? | 20:01 | ||
dalek | MoarVM/moar-jit: cb9e151 | jnthn++ | src/6model/6model.c: Complain properly about missing late-bound methods. |
||
MoarVM/moar-jit: 5dccd58 | (Timo Paulssen)++ | src/jit/ (2 files): | |||
MoarVM/moar-jit: add an impl of callercode to the jit | |||
brrt | yeah, smart numify | ||
the thing is in fact deserialized ahead of time | |||
jnthn | Ah | 20:02 | |
20:02
dalek joined
|
|||
jnthn | Then...odd. | 20:02 | |
brrt | yes | ||
anyway, this should be the bulk of the work, i'll investigate the rest tomorrow | |||
good night :-) | |||
jnthn | 'night o/ | ||
timotimo | jnthn: i can just inject a probe into any given process and then i can use perf to get counters for that! | ||
20:02
brrt left
|
|||
timotimo | like, i can select a function and then an intra-function spot (like @return) | 20:03 | |
or per line number | 20:06 | ||
jnthn | timotimo: Sounds useful to know... | ||
timotimo | i just don't know what to try it out on ... | ||
jnthn | Full GC collects? Deopts? Lexical vivifies? :) | 20:07 | |
timotimo | mhm, mhm | 20:08 | |
bugger, i have to do these as root it seems | 20:19 | ||
otherwise i get a permission denied for the trace events i added | |||
which is understandable, i guess | |||
oooooh! | 20:24 | ||
perf_events has JIT support to solve this, which requires the VM to maintain a /tmp/perf-PID.map file for symbol translation. Java can do this with perf-map-agent, and Node.js 0.11.13+ with --perf_basic_prof. I'll write up instructions for these when I get a chance. | 20:25 | ||
perf also has a "trace" command that is like strace, but with less overhead and prettier output | 20:49 | ||
*wow*, you can apparently ask perf to give you a list of variables for a given function and then you can trace the values ... | 20:51 | ||
jnthn | o.O | 20:57 | |
timotimo | like ... wow. | 20:58 | |
so with a simple set of commands we can get a histogram of what sizes get allocated in the nursery or gen2 or fixed-size-allocator or ... stuff | 20:59 | ||
i don't know exactly what this perf map is supposed to look like on the inside | 21:01 | ||
oh, it's just "startaddr size name" | |||
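Editorial sketch of formatting one entry of such a map file. The "startaddr size name" shape comes from the line above; the detail that start and size are written as hexadecimal without a `0x` prefix is my reading of perf's JIT interface notes and should be checked against them. The helper names and the sample symbol name are hypothetical, and nothing here is existing MoarVM code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build the per-process map file path, e.g. /tmp/perf-1234.map. */
int demo_perf_map_path(char *buf, size_t buflen, long pid) {
    assert(buf != NULL);
    return snprintf(buf, buflen, "/tmp/perf-%ld.map", pid);
}

/* Format one map entry: "<start> <size> <name>\n", with start and size
 * in hex and no 0x prefix (assumption, see the note above). */
int demo_perf_map_line(char *buf, size_t buflen,
                       unsigned long start, unsigned long size,
                       const char *name) {
    assert(buf != NULL && name != NULL);
    return snprintf(buf, buflen, "%lx %lx %s\n", start, size, name);
}
```

A JIT would append one such line per compiled frame and flush the file, so that perf can resolve addresses inside generated code to names.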
jnthn | FROGGS: Am currently doing a v5 build | 21:02 | |
FROGGS | ohh | ||
jnthn | FROGGS: I guess nqp_to_perl6 is the correct branch? | ||
FROGGS | correct | ||
jnthn | hmm | ||
Compiling (43) warnings::register to mbc | |||
===SORRY!=== | |||
FROGGS | but gimme a sec to push something | ||
jnthn | Unable to write bytecode to 'lib/Perl5/warnings/register.pm.moarvm' | ||
oh, that came after an error | 21:03 | ||
"The syntax of the command is incorrect." | |||
I don't suppose you've got a mkdir -p in there? :P | |||
FROGGS | ohh | ||
*g* | |||
jnthn: I have :o( | 21:04 | ||
Build.pm line 73 | |||
jnthn | Well, creating it manually worked for now | 21:06 | |
I just want to get to your SEGV. | |||
FROGGS | when you have worked around that, you can test with: perl6 Build.pm summary, then abort and run: perl6 -Ilib t/spec/op/read.v5 | 21:07 | |
(to get the fudged test file) | |||
jnthn | Well, I'm up to module 335 by now... :P | 21:08 | |
FROGGS | you can abort that too | ||
you normally only need to build up to Config.pm | |||
jnthn | C:\Perl64\bin\perl.exe t/spec/fudgeall --add_use_v5 v5 t/*/*.t t/*/*/*.t | 21:09 | |
Can't open perl script "t/spec/fudgeall": No such file or directory | |||
That's what I get from summary | |||
Indeed, I've no t/spec | 21:10 | ||
FROGGS | jnthn: pull and run the summary again | 21:11 | |
jnthn | Did that | 21:12 | |
bah! | |||
C:/consulting/perl6/v5/t/spec/fudge: No such test file 't/*/*.t' | |||
C:/consulting/perl6/v5/t/spec/fudge: No such test file 't/*/*/*.t' | |||
so fail! | |||
FROGGS | damn | ||
:o/ | |||
jnthn: I'll fix all issues on windows and will tell you then | |||
but this won't be today most likely :/ | 21:13 | ||
jnthn | OK. I tried to manually fudge it | 21:14 | |
By making a .v5 copy and putting "use v5" as the first thing after the shebang | |||
When I do that I get | 21:15 | ||
===SORRY!=== | |||
This type does not support associative operations | |||
FROGGS | hmmm, I don't see that here... but perhaps on windows... | ||
ohh, there are some "#v5 emit #" | 21:16 | ||
jnthn | I don't see them? | ||
FROGGS | hmmm, weird | ||
jnthn | Do I need some branch of spectest repo too? | 21:17 | |
FROGGS | no | ||
I'll check that too | |||
jnthn | origin git://github.com/rakudo-p5/roast5.git (fetch) | ||
Is that the right thing? | |||
(for my t/spec) | |||
FROGGS | okay, pushed the fudge lines | 21:18 | |
Build.pm is fixed, will fix now t/test_summary | 21:23 | ||
jnthn | Yeah, I got a fudged one now. Same error. | 21:24 | |
FROGGS | need to build a fresh rakudo on my windows box first... | 21:25 | |
jnthn | This type does not support associative operations at src/Perl6/Actions.nqp:4675 (C:\consulting\MoarVM\install\languages\nqp\lib/Perl6/Actions.moarvm:circumfix:sym<{ }>:180) | ||
Further back down is a | 21:26 | ||
from src/Perl5/Grammar.pm:1456 (C:\consulting\perl6\v5\lib\Perl5\Grammar.pm.moarvm::135) | |||
from src/Perl5/Grammar.pm:1449 (C:\consulting\perl6\v5\lib\Perl5\Grammar.pm.moarvm:FOREIGN_LANG:108) | |||
FROGGS | jnthn: you are using an installed v5 | 21:27 | |
the old nqp version | |||
jnthn: -Ilib should help | |||
jnthn | perl6-m --ll-exception -Ilib t\spec\op\read.v5 | 21:29 | |
FROGGS | that should do | ||
do you have a lib\Perl5\Actions.pm.moarvm file? | |||
jnthn | Yes | 21:30 | |
Should I just do a git clean -fdx and build stuff again? :) | |||
FROGGS | no, let me fix this first, so you don't waste more time :o) | 21:31 | |
jnthn | OK | ||
21:41
avuserow joined
|
|||
timotimo | how should i detect whether or not MoarVM should write a /tmp/perf-PID.map file? | 21:57 | |
(in its jit compiler) | |||
also, i'd like to force a flush, but not make the moarvm process wait for it to complete ... | 21:58 | ||
FROGGS | op/read.v5 passes on my windows box :/ | 22:00 | |
timotimo | can we do anything to make inc/dec frame less costly? | 22:05 | |
inline more stuff? :) | |||
jnthn | timotimo: How costly does your profiling think they are, ooc? | 22:07 | |
inc should be really cheap | |||
dec includes the logic to release/clean up things | |||
timotimo | in "say [+] rand xx 100000" i get 14% interp_run, 9.1% dec_ref, 4.88% inc_ref | ||
jnthn | There are various things we could do, though. Inlining is one. Detecting frames that don't need an ->outer and not bothering with it is another. | 22:08 | |
Really? An atomic increment is so costly? | |||
timotimo | 95.65 times (whatever that means?) it hits the pop op after the lock xadd in MVM_incr | 22:09 | |
jnthn | Would be worth taking a look at what libatomicops is doing there | ||
Ah, so it is using lock xadd | |||
timotimo | yes | 22:10 | |
jnthn | I know that's not *free*, but I'm surprised it's showing up so highly. | ||
timotimo | yeah | ||
since we now always have a second thread for the libuv event thread, we can't even switch out inc_ref and dec_ref to use non-atomic ops instead :\ | 22:11 | ||
jnthn | You only get that thread started if you do something that needs it. | ||
timotimo | oh | 22:12 | |
jnthn | But still, I'm surprised it comes out so costly. | ||
timotimo | so would it be sensible to swap out frame_inc_ref and dec_ref? that would prevent inlining and such :\ | ||
and gives an indirection for ... all the time | |||
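Editorial sketch of what an atomic frame reference count looks like, written with C11 `<stdatomic.h>` rather than libatomic_ops, which is a substitution on my part. `DemoRcFrame` and the `demo_*` functions are hypothetical. On x86-64 a compiler will typically lower the fetch-and-add to a `lock`-prefixed instruction such as `lock xadd`, which is the instruction showing up in the profile above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted frame; not MoarVM's MVMFrame. */
typedef struct {
    atomic_uint ref_count;
    int         payload;
} DemoRcFrame;

/* Allocate a frame that starts out with a single owner. */
DemoRcFrame *demo_frame_new(void) {
    DemoRcFrame *f = malloc(sizeof *f);
    if (f) {
        atomic_init(&f->ref_count, 1);
        f->payload = 0;
    }
    return f;
}

/* Take another reference: one atomic add, nothing else. */
DemoRcFrame *demo_inc_ref(DemoRcFrame *f) {
    assert(f != NULL);
    atomic_fetch_add(&f->ref_count, 1);
    return f;
}

/* Drop a reference. atomic_fetch_sub returns the previous value, so a
 * previous count of 1 means this caller released the last reference and
 * is responsible for cleanup. Returns 1 if the frame was freed. */
int demo_dec_ref(DemoRcFrame *f) {
    assert(f != NULL);
    if (atomic_fetch_sub(&f->ref_count, 1) == 1) {
        free(f);
        return 1;
    }
    return 0;
}
```

This matches the asymmetry jnthn describes: the increment is a single atomic operation, while the decrement also carries the release and cleanup logic.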
jnthn | It'd be interesting to compare what other profilers make of it on the same benchmark. | ||
timotimo | right; perf is probabilistic | 22:13 | |
say [+] rand xx 100000 is the exact code i used | 22:14 | ||
i thought it was kinda interesting that free from libc only ended up at about 0.4% | 22:15 | ||
so the win from a separate thread to do the free() calls is not going to help this very benchmark much | |||
but it's one of these microbenchmark things | |||
i suppose this is just a case of having a ridiculous amount of calls and returns when we do [+] rand xx ... | 22:16 | ||
maybe gather/take makes things jump back and forth all the time? | |||
jnthn | Oh...gather/take does some inc/dec, yes | 22:17 | |
xx being implemented in terms of gather/take is silly. | |||
timotimo | mhh | ||
jnthn | Should really do it with a loop iter | ||
Apart from we don't have those yet. | |||
timotimo | oh, interesting | 22:18 | |
[+] do rand for ^100000 gives me dec_ref in the 2nd place, but inc_ref is way lower | |||
interestingly, __lll_lock_elision from libpthread shows up high-ish | |||
diakopter | my guess is cache thrash since there are *so* many atomically incremented/decremented values.. might be worth centralizing them all in an array accessed via an offset instead of inside each frame's struct..? | 22:30 | |
jnthn | Well, the more general issue is that MVMFrame carries around a bunch of stuff that many frames don't need. | 22:31 | |
diakopter | since the cleanup-related thingies could be handled in a background thread eventually, could actually farm out all the increment/decrement actions to another thread | ||
jnthn | Except in a tight invoke loop we'd like to re-use the same frame right away for the next invocation. | 22:32 | |
diakopter | o yeah | 22:33 | |
jnthn | I count 12 or so fields in MVMFrame that could easily be pushed out to an allocated-if-we-need-it data structure hanging off frame. | ||
timotimo | could we just keep the reference count of a frame the same and if we know it's one we'll just re-use it for the next invocation? or something like that? | 22:35 | |
i haven't actually looked at what's inside such a MVMFrame | 22:36 | ||
23:02
avuserow joined
|
|||
jnthn | 'night o/ | 23:13 |