[07:37] *** soverysour joined
[07:42] *** soverysour left
[09:22] *** rakkable left
[09:22] *** rakkable joined
[09:47] *** [Coke]_ joined
[09:47] *** [Coke] left
[11:34] *** soverysour joined
[11:42] *** soverysour left
[11:48] <disbot2> <librasteve> yeah, I get that this is not at the raku level … hardcore FP folks will fail Raku for that (rightly) … really Raku(do) is a GC language with a Functional attitude. Which is a fine thing btw since the GC will recover the memory anyway if you have very complex recursive play with e.g. multiple objects.

[11:58] <timo> with non-tail-recursion-optimized code you're accumulating stack frames which hold on to references to objects, so the GC will not recover the memory of those objects and also not the memory of the stack frames

[12:21] <timo> with a language that has procedural features like loops, the lack of tail-call optimization is not such a big issue

[12:22] <timo> in a language where you have to loop by doing a call to the loop body from the end of the loop body, you've got a problem when there isn't a tail call optimization that can throw away the stack frames from previous iterations

[12:24] <timo> in a language where you have a syntactic difference between loops and recursion you may be very surprised when your compiler or runtime decides that your call was actually an optimizable tail call and suddenly your stack traces have what looks like unexplainable holes in them

[12:25] <timo> C is going to get explicit tail calls with a syntax "return goto" (or "goto return"?) where you explicitly opt into the semantics of a tail call where your previous stack frames disappear

[12:28] <timo> now the interpreter loop, especially with computed goto, is already very similar to what the tail call version of the interpreter looks like, but the important bit is apparently that the compiler has a lot less difficulty optimizing the little snippets, instead of having to optimize the huge function that has a boatload of internal labels and jumps in it

[12:28] <timo> interp.c is already the longest file to compile by far in some versions of some compilers

[12:28] <timo> like, a ridiculous amount of time spent compiling just that file in some circumstances

[12:29] <timo> though i think there used to be a compiler bug (maybe it was clang?) that made it especially bad, which was presumably fixed some time ago? maybe i'm misremembering

[13:29] <disbot2> <librasteve> thanks for the detailed explanation… I’m going to study it carefully

[13:33] <timo> sure, do feel free to ask follow-up questions, too

[13:34] <timo> i tend to write these explanations in the hopes of other readers of the channel also benefitting even if they don't speak up, so more questions from readers - especially those not intimately familiar with the details yet - are also useful

[13:34] <disbot2> <librasteve> =b

[13:51] *** soverysour joined
[13:51] *** soverysour left
[13:51] *** soverysour joined
[13:56] *** soverysour left
[14:12] <disbot2> <librasteve> So my initial view that GC recovers the memory anyway was incorrect (recursion uses stack memory; GC recovers heap memory). That said, since Raku has a procedural loop syntax (unlike pure FP, I infer), it does not suffer the stack bloat that tail-call optimization solves. All that at the Raku level, nothing to do with MoarVM internals.

[14:18] <timo> yeah, that sounds about right

[14:22] <timo> turning the interpreter loop from the "big function with labels that we goto to" into "a bunch of functions that tail-call into each other" has given good performance improvements in other projects such as CPython

[14:22] <timo> primarily by allowing the C compiler to do a better job

[14:31] <Geth> ¦ MoarVM/tail_call_interpreter: 5d286163f6 | (Timo Paulssen)++ | 4 files

[14:31] <Geth> ¦ MoarVM/tail_call_interpreter: Split MVM_CGOTO into "has feature" and "use for interp"

[14:31] <Geth> ¦ MoarVM/tail_call_interpreter: 

[14:31] <Geth> ¦ MoarVM/tail_call_interpreter: so that we can have interp.c as musttail and program.c as Computed Goto

[14:31] <Geth> ¦ MoarVM/tail_call_interpreter: review: https://github.com/MoarVM/MoarVM/commit/5d286163f6

[15:11] *** [Coke]_ is now known as [Coke]

[22:39] <timo> lizmat: can you test if this commit ^ gives you a performance improvement over main now?

[23:34] <Geth> ¦ MoarVM/spesh_str_eq: b2983533ea | (Timo Paulssen)++ | 10 files

[23:34] <Geth> ¦ MoarVM/spesh_str_eq: Spesh-optimize string comparisons against spesh-time-known strings

[23:34] <Geth> ¦ MoarVM/spesh_str_eq: 

[23:34] <Geth> ¦ MoarVM/spesh_str_eq: Put the length and cached hash code of the constant string into the op as

[23:34] <Geth> ¦ MoarVM/spesh_str_eq: arguments and use that to quickly reject strings that don't match.

[23:34] <Geth> ¦ MoarVM/spesh_str_eq: 

[23:34] <Geth> ¦ MoarVM/spesh_str_eq: In case the constant string is 8 characters or shorter and the graphemes

[23:34] <Geth> ¦ MoarVM/spesh_str_eq: fit into 8bit storage, we drop the usage of the constant string's register

[23:34] <Geth> ¦ MoarVM/spesh_str_eq: completely and instead store the content as a 64bit integer constant in

[23:34] <Geth> ¦ MoarVM/spesh_str_eq: the op arguments.

[23:34] <Geth> ¦ MoarVM/spesh_str_eq: review: https://github.com/MoarVM/MoarVM/commit/b2983533ea

[23:43] <timo> 44.774 +- 0.647 seconds time elapsed  -->  43.949 +- 0.206

[23:43] <timo> not sure it's a decisive improvement

