00:54
diakopter____ joined
01:23
diakopter___ joined
01:45
ShimmerFairy joined
02:00
llfourn joined
02:21
rjbs joined
|
|||
rjbs | moarvm seems to use -mt as a cc switch on Solaris, even when using gcc | 02:22 | |
02:24
FROGGS_ joined
|
|||
rjbs | It's in the ccmiscflags in nqp/MoarVM/build/setup.pm | 02:25 | |
I don't know how to make it depend on cc, but I reckon somebody does. That is all. :-/ | |||
diakopter | rjbs: hunh | 03:14 | |
rjbs | I did get rakudo built on illumos with gcc, though. Only test that failed was native call. | 03:16 | |
diakopter | which ones? (all?) | ||
rjbs | t/04-nativecall/11-cpp.t (Wstat: 65280 Tests: 16 Failed: 0) | ||
diakopter | ah, same one that fails for me on windows | 03:17 | |
er. mac. | |||
rjbs: do you dare try spectest? | |||
rjbs | I'm not on that host anymore, let me see if that workspace still exists. | 03:18 | |
It does, running. | |||
diakopter | restart with TEST_JOBS=CORESPLUS1 or something | ||
it's a lot faster | |||
rjbs | k! | 03:19 | |
diakopter | I mean, an actual number there | ||
rjbs | yeah | ||
I'm running 9. Will let you know when it says anything useful. | |||
well, when I notice it do; I'm also writing some other very exciting billing code | |||
diakopter | people need their bills | ||
03:35
vendethiel joined
|
|||
rjbs | Files=1085, Tests=50349, 654 wallclock secs (16.39 usr 9.32 sys + 2200.27 cusr 480.14 csys = 2706.13 CPU) | 03:48 | |
Result: FAIL | |||
good night! -- gist.github.com/afe2ff3639e36d3f7ab2 | 03:50 | ||
diakopter | yeesh | 03:51 | |
that's quite the smattering | |||
06:55
domidumont joined
07:01
domidumont joined
07:15
vendethiel joined
07:17
kjs_ joined
07:27
FROGGS joined
07:53
kjs_ joined
08:36
zakharyas joined
09:54
brrt joined
|
|||
brrt | timotimo: maybe continue here | 10:49 | |
you hadn't merged it yet? | |||
timotimo | right, it seems like i haven't! | 10:50 | |
it seems to be on my laptop | |||
i thought i put it onto my desktop, too | |||
brrt | i'll review it if you want | ||
JimmyZ | \o brrt++ # new post | ||
brrt | thanks JimmyZ :-) | ||
i've... also made some progress at the implied cost problem | 10:51 | ||
in that i can reorder the tiler table to calculate them | |||
only i can do it now just for the 'first order' | |||
i.e. i can take into account the (load reg) in the (add reg reg) candidate for the (add reg (load reg)) tree; but not further | |||
my running example is (nz (and (load (addr reg)) (const))) | 10:52 | ||
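A minimal sketch of the implied cost idea brrt describes, assuming made-up node kinds and costs (`OP_REG`, `COST_LOAD` and friends are hypothetical, not MoarVM's). It uses plain recursion instead of a precomputed tiler table, so it only illustrates why the (add reg reg) candidate has to account for the (load reg) tile beneath it:

```c
#include <stddef.h>

/* Hypothetical node kinds and costs, for illustration only. */
enum op { OP_REG, OP_LOAD, OP_ADD };

struct node {
    enum op op;
    const struct node *left;   /* NULL for OP_REG */
    const struct node *right;  /* only used by OP_ADD */
};

enum { COST_LOAD = 3, COST_ADD_REG_REG = 1, COST_ADD_REG_MEM = 2 };

/* Minimum cost of getting the value of n into a register. */
int min_cost(const struct node *n) {
    switch (n->op) {
    case OP_REG:
        return 0;
    case OP_LOAD:
        /* tile (load reg): the address operand must be in a register first */
        return COST_LOAD + min_cost(n->left);
    case OP_ADD: {
        /* candidate (add reg reg): its implied cost includes every tile
           needed to bring both operands into registers */
        int best = COST_ADD_REG_REG + min_cost(n->left) + min_cost(n->right);
        /* candidate (add reg (load reg)): this tile covers the load too */
        if (n->right->op == OP_LOAD) {
            int folded = COST_ADD_REG_MEM + min_cost(n->left)
                       + min_cost(n->right->left);
            if (folded < best)
                best = folded;
        }
        return best;
    }
    }
    return -1; /* not reached for well-formed trees */
}
```

With these assumed costs, the tree (add reg (load reg)) costs 4 through the (add reg reg) candidate (1 for the add plus 3 for the load) and 2 through the folded candidate, so the folded tile wins.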
dalek | arVM/no_atomic_if_single_threaded: c044b2a | timotimo++ | src/core/frame.c: WIP cheaper access to ref counts if only 1 thread exists. currently causes livelocks on multi-threaded situations |
||
brrt | timotimo, correct me if i'm wrong, but i thought locking in a single-thread case wasn't so costly? | 10:53 | |
timotimo | it is only not costly if you throw out the explicit lock instructions | 10:54 | |
every instance of sp_deref_get_i64 is currently preceded by a getlex and then a sp_guardrwconc | 11:01 | ||
so the exprjit probably aborts before that | |||
huh, spec tests aren't clean | 11:03 | ||
holy fuck, t/spec/S06-operator-overloading/sub.rakudo.moar compiles SLOW | 11:04 | ||
brrt | are they clean relative to master? | ||
timotimo | must be that NFA optimization thing | ||
i have to check | |||
brrt | yes, yes, i can compile getlex in the expr jit, if you really want it :-P | ||
just hasn't been a priority up until now | |||
jnthn | o/ #moarvm | 11:05 | |
timotimo | well, if it doesn't compile the guard op, it's not enough ;) | ||
and that contains a branch, doesn't it? | |||
yo jn | 11:07 | ||
jnthn: | |||
dergh. | |||
jnthn | Typing skills. YOu has them. :P | 11:08 | |
hah :) | |||
timotimo | unintentional autopun? | ||
brrt | yes, the guard can't be compiled just yet | 11:10 | |
and that's not a very high priority so far | |||
(you may notice my list of priorities is rather long...) | 11:11 | ||
good * jnthn | |||
timotimo | hah :) | ||
i understand why your priority list looks like that, though | |||
jnthn | No, laptop keyboard, which I ain't used for two months | 11:12 | |
timotimo | makes sense | ||
jnthn | And I type on a natural/split keyboard at home, so this feels...unnatural :) | ||
timotimo | did you get the results of my little bit of utf8-c8 prodding? | 11:13 | |
jnthn | "Initializing the thingy to -1 helps, but doesn't fix everything" or so? | 11:14 | |
timotimo | yeah | 11:15 | |
and "the segv comes from the loop running without bounds, because the ++pos < other_pos never stops being true" | 11:16 | ||
or something | |||
brrt: the spec tests are just one test more broken than on master, but that test file runs cleanly when i run it on its own ... | 11:18 | ||
brrt | hmmmm | 12:10 | |
that's no proof it's correct though | |||
(i'll finally look at it now) | |||
timotimo | :) | 12:11 | |
brrt | looks fairly sane... although i do have a question | 12:12 | |
why not simply do two sp_get_i64, with the intermediary result being an integer | 12:13 | ||
or is that overly evil | |||
timotimo | yes, that is very evil. | ||
brrt | well, that's just me | 12:14 | |
timotimo | hum. a loop with nothing to do but $a++ doesn't benefit :\ | 12:15 | |
brrt | fwiw, i think you're right, we don't want to tell people to use nqp to get performance | ||
also, i actually don't know that much about performance | |||
timotimo | but i can put the thing into a sub and use $a = $a + 1 to not have the invocation overhead | 12:16 | |
but still get the "is rw" thing | |||
diakopter | well, I swapped out uthash for tommyds, but no core setting compilation improvement at first glance | 12:20 | |
jnthn: I guess you saw rjbs' spectest gist above | 12:21 | ||
brrt | still good to try diakopter :-) | 12:25 | |
tommyds also has other interesting data structures | 12:26 | ||
diakopter | the one I tried was hashlin | ||
I guess I could try another one of them | |||
brrt | yeah, sure | 12:27 | |
brrt wonders if the little dynamic array thingy is useful enough to merge into master | |||
jnthn | diakopter: Yes, noticed it | 12:37 | |
brrt | timotimo: looks quite sane to me | 12:39 | |
13:03
zakharyas joined
13:10
domidumont joined
|
|||
brrt | actually, the performance benchmarks of tommyds are quite remarkable, considering that the implementation seems.. quite normal | 13:48 | |
diakopter | yeah, but they may be a little too randomized | 13:50 | |
nine | brrt: just read your most recent blog post. Glad you're making progress :) | 13:55 | |
brrt | thanks :-) | 13:57 | |
i'm nearing the phase in which i can usefully say, 'here's this heap of work you can do, and i can help you do it' | 13:58 | ||
jnthn | :) | 14:00 | |
jnthn may join in with that in the new year :) | |||
brrt | that would be very welcome :-) | 14:01 | |
i have a nqp todo passing, is that an error? | |||
in t/qregex | 14:02 | ||
jnthn | No, I saw that too, but couldn't work out when it popped up | 14:03 | |
And wanted to make sure it wasn't a regression | |||
14:06
donaldh joined
|
|||
brrt | hmmm | 14:06 | |
14:11
domidumont joined
|
|||
timotimo | with the deref_get ops, perl6 -e 'my int $a = 0; sub do-loop(int $a is rw) { while $a < 500_000_000 { $a = $a + 1 } }; do-loop($a)' gets about 6x slower :\ | 14:20 | |
and i don't know why | |||
but i gotta be AFK for lots of driving soon | |||
brrt | hmmm, that is weird | 14:21 | |
15:04
domidumont joined
|
|||
dalek | arVM/even-moar-jit: bb79f5d | brrt++ | src/jit/ (3 files): Implement getlex and bindlex for expr JIT Compute address for lexical registers using a loop that loads the outer frame. Does not yet work for object getlex, which is unfortunately the majority of getlexes. |
15:17 | |
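The commit message above describes computing a lexical's address by looping over outer frames. A rough sketch of that walk in plain C, assuming a hypothetical `struct frame` with an `outer` link and an `env` array (not MoarVM's actual frame layout) and integer lexicals only:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame layout: each frame owns an array of lexical slots
   and points at the frame that lexically encloses it. */
struct frame {
    struct frame *outer;
    int64_t *env;
};

/* Returns the address of lexical slot `idx` in the frame `outers` levels
   out from `f`, or NULL if the outer chain is shorter than that. */
int64_t *lexical_address(struct frame *f, unsigned outers, unsigned idx) {
    while (outers > 0) {
        if (f == NULL)
            return NULL;
        f = f->outer;   /* load the outer frame, one level per iteration */
        outers--;
    }
    return f != NULL ? &f->env[idx] : NULL;
}
```

A getlex would then read through the returned address and a bindlex would write through it. The NULL checks are an addition for the sketch; generated code would normally know the chain is long enough.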
brrt | i think i should have a append_template_during_traversal function or something like that | 15:18 | |
17:48
kjs_ joined
17:49
JimmyZ_ joined
18:34
kjs_ joined
18:38
FROGGS joined
18:39
vendethiel joined
18:42
domidumont joined
18:52
zakharyas joined
19:03
lizmat joined
19:21
ggoebel7 joined
20:02
domidumont1 joined
20:04
Peter_R joined
20:18
vendethiel joined
|
|||
dalek | arVM: 6e4b90f | FROGGS++ | src/core/nativecall (3 files): handle "is rw" pointers in native routines |
20:21 | |
timotimo | i'm back at the computer, so i'll be able to have a closer look at the deref problem again | 20:51 | |
21:08
kjs_ joined
21:18
kjs_ joined
21:24
kjs_ joined
|
|||
diakopter | timotimo: welcome back To The Computer | 21:39 | |
21:40
njmurphy_ joined
|
|||
timotimo | holy smokes | 21:41 | |
'my int $n = 0; sub do-loop(int $n is rw) { while $n < 500_000_000 { $n = $n +1; } }; do-loop($n)' | |||
^- 14305 gc runs | |||
diakopter | awesome | ||
timotimo | the allocations tab doesn't show a single one of 'em | ||
"of the allocations that cause this BS" | 21:42 | ||
jnthn | That's why $native-int++ isn't fast yet | ||
It allocates 500 million references there | 21:43 | ||
Once the inliner gets to that stuff, it'll be able to allocate zero of them | |||
timotimo | this isn't using ++ though :) | ||
diakopter | .karma $native-int | ||
jnthn | Oh...the call to do-loop is outside the loop | 21:44 | |
Hmm | |||
timotimo | yup | ||
it's because i want to have an "is rw" thing involved | |||
timotimo grabs the speshlog | |||
at first i thought "oh lord, it's not getting jitted at all!" but that was just because do-loop does OSR and the overview page only shows a single frame jitted out of 4 | 21:45 | ||
gist.github.com/timo/d9ba0cde4833cbddaba2 - go wild if you'd like | 21:47 | ||
the only thing i could imagine allocates there would be getlex? | |||
diakopter | timotimo: wait, how long does that snippet take to run for you | ||
timotimo | 100s | 21:48 | |
21:49
kjs_ joined
|
|||
diakopter | why does camelia run the 50_000_000 version in 1s | 21:49 | |
timotimo | because it doesn't have the patches that were meant to make stuff like this faster | ||
diakopter | o_O | ||
timotimo | yeah. my face looked like that when i found out, too. | 21:50 | |
diakopter | faster than 1s for 50M iterations? | ||
m: my int $n = 0; sub do-loop(int $n is rw) { while $n < 50_000_000 { $n = $n +1; } }; do-loop($n); say now - BEGIN now | 21:51 | ||
camelia | rakudo-moar 2b5c41: OUTPUT«1.137019␤» | ||
timotimo | yeah, why not? | ||
diakopter | I guess I don't see how jitting more operations could make a 800% pessimization | ||
timotimo | the other version doesn't GC at all, i think | ||
didn't --profile it yet | |||
diakopter | ohhh what prevents it GCing | 21:52 | |
what other optimizations are disabled by jitting | |||
timotimo | gist.github.com/timo/d9ba0cde4833cbddaba2 it has a speshlog of the faster run up top | 21:53 | |
okay, the only differences i see: guard_rwconc and the _deref_ ops | 21:55 | ||
aha! | |||
um, no. not aha. | 21:56 | ||
huh. | |||
guardrwconc runs a fetch on the thing | |||
diakopter | so.. need to jit those? | ||
timotimo | it gets jitted | 21:57 | |
but perhaps the fetch causes an allocation because we don't know better? | |||
thing is: the optimization of decont_i and assign_i through the REPR will get_and_use_facts, which includes "is an rw container" | 21:58 | ||
that causes the guardrwconc to appear in the output instead of being thrown out | |||
diakopter | what's the purpose of guardrwconc | 21:59 | |
a lock? | |||
timotimo | no | ||
diakopter | what is conc | ||
timotimo | it assures something is a read-writable container that contains a concrete value | ||
OSLT :P | |||
diakopter | oh | 22:00 | |
timotimo | aha | ||
my implementation of guardrwconc in the jit is bogus | |||
diakopter | oh | ||
timotimo | um. no, it's not | ||
it's a 1:1 translation of the C in the interpreter | |||
if (check && IS_CONCRETE(check) && STABLE(check) == want_c) { | |||
MVMContainerSpec const *contspec = STABLE(check)->container_spec; | |||
if (contspec->can_store(tc, check)) { | |||
MVMRegister r; | |||
contspec->fetch(tc, check, &r); | |||
if (r.o && IS_CONCRETE(r.o) && STABLE(r.o) == want_v) | |||
ok = 1; | |||
} | 22:01 | ||
see the fetch in there? we create an object there. | |||
diakopter | what kind of object | ||
timotimo | whatever's in that container | ||
jnthn: how can we get around this problem? | |||
normally a container will just ->fetch by returning a reference to an object | 22:02 | ||
but native refs don't contain objects, they contain native values | |||
so they have to box the native value | |||
just so we can then check "was it an Int in there?" | |||
diakopter | sounds like the ops could be combined | ||
timotimo | "the ops"? | 22:03 | |
diakopter | if you make the check part of the container spec function set | ||
(in addition to fetch) | 22:04 | ||
jnthn | timotimo: Perhaps we need guards for is native rw | ||
diakopter | hey, that's what I said | ||
timotimo | jnthn: right, and then instead of want_v checking against an STABLE, it'd check against the INT_BP | ||
i forget what those are called | |||
jnthn | I don't know you'd have to care | 22:05 | |
timotimo | but i have to go sleep soon and get up super early in the morning | ||
jnthn | In fact | ||
The type of the native is encoded as part of the container type for native containers, iirc | |||
timotimo | hm, probably don't have to care, because code-gen already encoded that info in the code | ||
that's true, too. it's in the STABLE | |||
jnthn | So you might be able to just use a normal guard. | ||
Rather than a container one | |||
Though please document in the code why that's OK :) | |||
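To make the allocation problem and jnthn's suggestion concrete, here is a heavily simplified model in C. Every name in it (`struct object`, `fetch_boxed`, `guard_by_fetch`, `guard_by_type`, the `allocations` counter) is hypothetical, and `malloc`/`free` merely stand in for GC-managed allocation. It only shows that a guard which fetches from a native reference allocates a box per check, while a guard on the container's own type does not:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified model; not MoarVM's real structures. */
struct type { const char *name; };

struct object {
    const struct type *type;
    int64_t value;    /* payload of a boxed int */
    int64_t *target;  /* what a native int reference points at */
};

const struct type boxed_int_type = { "Int" };
const struct type int_ref_type   = { "NativeIntRef" };

unsigned long allocations = 0;

/* Fetching from a native reference has no object to hand back,
   so it must box the native value into a fresh object. */
struct object *fetch_boxed(const struct object *ref) {
    struct object *box = malloc(sizeof *box);
    if (box == NULL)
        return NULL;
    allocations++;
    box->type = &boxed_int_type;
    box->value = *ref->target;
    box->target = NULL;
    return box;
}

/* Container-style guard: checks the container, then fetches and
   checks the type of the fetched value. One allocation per check. */
int guard_by_fetch(const struct object *ref, const struct type *want_c,
                   const struct type *want_v) {
    int ok = 0;
    if (ref != NULL && ref->type == want_c) {
        struct object *box = fetch_boxed(ref);
        ok = box != NULL && box->type == want_v;
        free(box);
    }
    return ok;
}

/* Plain type guard: assumes the native kind is implied by the
   container's type, so no fetch and no allocation is needed. */
int guard_by_type(const struct object *ref, const struct type *want_c) {
    return ref != NULL && ref->type == want_c;
}
```

Calling `guard_by_fetch` once per loop iteration bumps `allocations` every time, which matches the shape of the problem seen in the profile; `guard_by_type` leaves the counter untouched.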
timotimo | that'd go into args.c or something, right? | 22:06 | |
it'd have to throw out the "is rw container" fact, otherwise we'd still be emitting the guard; so would we want another flag for that? | |||
hold on ... i think this actually comes from logging information, because it goes through a lexical | 22:07 | ||
jnthn | Maybe...needs some thinking :) | ||
And I'm tired :) | |||
timotimo | me tired, too | 22:08 | |
good night! | |||
.o( and why is our optimizer emitting a lexical there in the first place? because nobody made lexref-to-localref work yet ... that nobody is me :/ ) | |||
jnthn | sleep well | 22:09 | |
o/ | |||
22:17
colomon joined
23:18
colomon joined
|
|||
diakopter | anyone seen this.. www.oracle.com/technetwork/oracle-l...index.html | 23:26 | |
ilmari | gcc 5.2/glibc 2.19 seems to require _GNU_SOURCE to be defined to provide a definition for pthread_yield | 23:59 |
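One way around the missing declaration, sketched here as an assumption and not as what MoarVM does, is to call POSIX `sched_yield()` from `<sched.h>`, which needs no feature-test macro. The wrapper name `yield_cpu` is hypothetical. The other route is to define `_GNU_SOURCE` before the first system header is included so that glibc declares `pthread_yield`.

```c
#include <sched.h>

/* Hypothetical wrapper: gives up the CPU so another thread can run.
   sched_yield() is specified by POSIX and returns 0 on success. */
int yield_cpu(void) {
    return sched_yield();
}
```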