00:29
MasterDuke joined
00:39
quotable6 joined,
tangible6 joined,
releasable6 joined,
coverable6 joined,
nativecallable6 joined,
bloatable6 joined,
committable6 joined,
bisectable6 joined,
unicodable6 joined,
greppable6 joined,
statisfiable6 joined,
benchable6 joined,
evalable6 joined,
squashable6 joined
01:00
lizmat joined
01:35
evalable6 joined
02:05
MasterDuke_ joined
02:40
lizmat joined
02:55
ilbot3 joined
03:41
evalable6 joined
09:29
lizmat joined
09:51
domidumont joined
09:56
domidumont joined
12:16
robertle joined
12:59
benchable6 joined
|
|||
timotimo | it occurs to me that if we know the types of our p6 bigints, i.e. Int, we could spesh the "get_boxed_ref" part out of all the bigint ops | 13:10 | |
it's not responsible for a very large chunk, but it's big enough for my tastes | 13:14 | ||
16 billion incl for the whole MVM_bigint_add and 702 million incl for get_boxed_ref | |||
of course when we have our jit be very smart about bigint calculations, we might compile a path that doesn't even allocate Int objects at all? | 13:15 | ||
jnthn | With EA we can at least "allocate" them on the stack | 13:18 | |
timotimo | right | 13:19 | |
in this benchmark a significant chunk of time is spent in mp_set_long :\ | 13:20 | ||
6 billion Ir out of the 16 | 13:21 | ||
and out of those 6 billion, 5 billion are in mp_mul_2d | |||
bbl | 13:22 | ||
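Editorial note: the profile figures timotimo quotes above can be put in proportion with a little arithmetic. This is only a sketch over the rounded numbers from the chat ("16 billion", "702 million", "6 billion", "5 billion"); the variable names are illustrative, and the percentages are approximate because the inputs are rounded.

```python
# Inclusive instruction counts (Ir) as quoted in the chat; all are rounded figures.
bigint_add_total = 16_000_000_000  # whole MVM_bigint_add, inclusive
get_boxed_ref = 702_000_000        # get_boxed_ref, inclusive
mp_set_long = 6_000_000_000        # mp_set_long, "6 billion Ir out of the 16"
mp_mul_2d = 5_000_000_000          # mp_mul_2d, "out of those 6 billion"


def share(part, whole):
    """Return part / whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


shares = {
    "get_boxed_ref of MVM_bigint_add": share(get_boxed_ref, bigint_add_total),
    "mp_set_long of MVM_bigint_add": share(mp_set_long, bigint_add_total),
    "mp_mul_2d of mp_set_long": share(mp_mul_2d, mp_set_long),
}

for name, percent in shares.items():
    print(f"{name}: {percent}%")
```

On these numbers, get_boxed_ref is a few percent of the add, while mp_set_long accounts for well over a third of it, most of that inside mp_mul_2d.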
13:29
evalable6 joined
|
|||
nine | So a MVM_SPESH_ANN_DEOPT_INLINE annotation is really just a "the inlined code had _any_ kind of deopt annotation" marker? | 13:38 | |
That means the annotation of the original code could have been a MVM_SPESH_ANN_DEOPT_ALL_INS, i.e. the goto carrying the annotation used to be an invoke_*. In that case the real case to look at is the MVM_SPESH_ANN_DEOPT_ALL_INS one, i.e. entering an inlined BB. | 13:44 | ||
dogbert17 | doc question: what would be a good, one line, description of the Encoding class. Currently we don't have any | 14:13 | |
nine | Another question: does it really matter what I do? A no_op should already be much better than a goto and will be turned into literally nothing by the JIT. So trying to get rid of it doesn't sound like useful use of my time. | 14:18 | |
lizmat | nine: could it be that Inline::Python still depends on Panda? | 14:34 | |
nine | lizmat: I wouldn't be surprised | 14:35 | |
MasterDuke | nine: don't you only get failures in a very few spots? how hard is it to benchmark no_op vs removed entirely? | 14:36 | |
jnthn | nine: Well, it's kicking whatever deeper problem exists here down the road for you or somebody else to solve, I guess... | 14:37 | |
nine | jnthn: and the point of the exercise is actually for me to learn about inlining in spesh. And the no_op trick doesn't work in all cases anyway... | 14:38 | |
jnthn | Ah, if it doesn't work in all cases, it's less attractive. | ||
nine | jnthn: what really scares me is that you meant this to be a kind of beginner exercise as an entry for me. How hard will my actual goal be then?? | 14:39 | |
Geth | MoarVM/inline_in_place: 4b25ee8603 | (Timo Paulssen)++ (committed by Stefan Seifert) | 4 files. Put inlined blocks between their caller and its succ. Previously inlined callees were added to the end of the basic block list. We now put the inlined blocks into the list at the position of the invoke op. However we cannot yet get rid of the goto ops entering and exiting the inlined code as that would lead to odd bugs possibly related to the annotations on these ops. |
||
MoarVM/inline_in_place: b4b8c5765e | (Stefan Seifert)++ | src/spesh/optimize.c: Turn inline_end annotated pointless gotos into no_ops. We can't remove them completely as that causes weird deopt issues. But turning them into no_ops should give almost the same benefits without the cost. |
|||
nine | I pushed it anyway as it documents the findings in a way. | ||
jnthn | You'd not be alone in putting things off for later in spesh. A good while back I marked lastexpayload :noinline to avoid having to debug the issues inlining it brought up. Only in the last week or so did I fix one, golf another into something in the JIT that brrt++ fixed, and there's still one more issue to hunt down. | 14:41 | |
And the issues rarely show up in nice, simple, isolated examples. | 14:42 | ||
As to how hard things are - this tends to hinge on whether in the course of the task, something turns up that makes an in-theory straightforward change expose some other problem, and potentially an existing bug. | 14:47 | ||
Which I guess is the issue here | |||
bbiab | 14:48 | ||
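Editorial note: the idea in the second commit message above (turn a pointless, inline_end-annotated goto into a no_op instead of deleting it) can be illustrated with a toy model. This is not MoarVM's code: the `Ins` and `Block` classes, the string opcodes, and the rule that "pointless" means "jumps to the next block in linear order" are assumptions made for the sketch. The one property it aims to show is that the instruction, and therefore its annotations, stays in place.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Ins:
    """Hypothetical instruction: an opcode, an optional branch target, annotations."""
    op: str
    target: Optional[str] = None
    annotations: Set[str] = field(default_factory=set)


@dataclass
class Block:
    """Hypothetical basic block: a label and its instructions in order."""
    label: str
    ins: List[Ins]


def neutralize_pointless_gotos(blocks):
    """Turn trailing gotos that only jump to the next block into no_ops.

    Only gotos annotated with "inline_end" are touched. The instruction is
    kept (not deleted), so whatever annotations it carries survive.
    Returns the number of instructions rewritten.
    """
    changed = 0
    for i, bb in enumerate(blocks[:-1]):
        if not bb.ins:
            continue
        last = bb.ins[-1]
        jumps_to_next = last.op == "goto" and last.target == blocks[i + 1].label
        if jumps_to_next and "inline_end" in last.annotations:
            last.op = "no_op"
            last.target = None
            changed += 1
    return changed


# Inlined callee placed directly before its caller's successor block.
blocks = [
    Block("inlinee", [Ins("add_i"), Ins("goto", "after_call", {"inline_end"})]),
    Block("after_call", [Ins("goto", "exit")]),  # real jump, not annotated
    Block("other", [Ins("add_i")]),
    Block("exit", [Ins("return")]),
]
changed = neutralize_pointless_gotos(blocks)
```

After the pass, the last instruction of `inlinee` is a `no_op` that still carries `inline_end`, while the unannotated goto in `after_call` is left alone.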
14:48
evalable6 joined
|
|||
timotimo | so anyway, this for loop code, it passes the 32bit boundary if you set $n to 2047 (surely a coincidence!!), so the default of 10⁴ is spending the vast majority of its time doing big int math for under 64 bit numbers | 15:14 | |
er, it passes the boundary if you set it to 2048, 2047 is the last one below the boundary | 15:17 | ||
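Editorial note: timotimo's point is that values which no longer fit in 32 bits take the big integer path even though they would still fit in 64 bits. A minimal sketch of that three-way split follows. The signed 32-bit threshold is taken from the chat; the function name and category labels are hypothetical, and this says nothing about how MoarVM actually represents the values.

```python
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def classify(value):
    """Classify an integer by the narrowest signed range it fits in.

    "small32"      : fits in a signed 32-bit integer (the cheap path in the chat)
    "big_under_64" : past the 32-bit boundary, but would still fit in 64 bits
    "big"          : needs more than 64 bits
    """
    if INT32_MIN <= value <= INT32_MAX:
        return "small32"
    if INT64_MIN <= value <= INT64_MAX:
        return "big_under_64"
    return "big"


# Values on either side of the two boundaries.
examples = [2**31 - 1, 2**31, 2**63 - 1, 2**63]
labels = [classify(v) for v in examples]
```

The middle category is the one the benchmark spends most of its time in, according to the discussion.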
15:45
zakharyas joined
15:50
zakharyas joined
16:22
evalable6 joined
16:35
domidumont joined
|
|||
dogbert17 | added some, possibly interesting, info to the thread on github.com/rakudo/rakudo/issues/1202 | 16:46 | |
20:40
Ven joined
21:00
MasterDuke joined
|
|||
dogbert17 | interesting, stresstesting on a 64 bit VM uncovers problems I don't see on my 32 bit VM | 21:03 | |
here's one example, gist.github.com/dogbert17/cebea492...1a183813fc | 21:07 | ||
21:13
Ven joined
22:11
MasterDuke joined
22:30
unicodable6 joined
|