Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
Nicholas [* GOOD *] 06:06
(especially for ducks)
Nicholas blog.pyston.org/2022/06/08/announc...on-module/ -- not that I have an answer, but if you can install 80% of the benefits as a module, what's the business model? 06:40
moon-child business model? Afaik they made it mainly for their own use 06:45
nine Looking at pyston.org, if they're trying to make money off this, they're not trying hard enough 06:46
Nicholas IIRC. 06:47
(oops, I started too soon. I remembered a citation exists) 06:48
blog.pyston.org/2020/10/ -- Our plan is to open-source the code in the future, but since compiler projects are expensive and we no longer have benevolent corporate sponsorship, it is currently closed-source while we iron out our business model. 06:49
[time passes with no updates]
blog.pyston.org/2021/05/05/pyston-...en-source/ -- what it says
(what it doesn't say is that Facebook revealed and open sourced Cinder about the same time.) 06:50
[time passes with no update]
blog.pyston.org/2021/08/30/pyston-...-anaconda/ -- We talked to a couple of companies about a possible joint future for Pyston -- Now that we have Anaconda’s sponsorship, we are planning out a short-term roadmap for the project. 06:51
I can't find the citation, but some cynic roughly defined "startup" as "a company in search of a business model" 06:54
anyway, the take away from the 3 blog posts (in sequence) is that the intent is to get paid to make python faster, which should pan out, because it saves enough money on hardware costs for some firms 06:55
s/some firms/sufficient firms/
to be viable
but trying to get enough of those firms to see it that way is much harder than actually the engineering task 06:56
(oops, grammar)
and similar funding problems seem to happen for all open source infrastructure projects 06:57
nine The good news is that I got rid of almost all regressions caused by my BEGIN time execution work. The bad news is that I realized that my approach is flawed and cannot be the real solution. 07:03
Nicholas but you still have coffee? 07:04
nine I do. And an idea that might actually work. 07:07
lizmat ++nine++ 07:51
timo just replace your python code with starlark :P 11:31
MasterDuke wow. `my $OUT = $*OUT; $OUT.put("1") for ^1_000_000` takes ~0.34s, but `put("1") for ^1_000_000` takes ~1.30s, and `$*OUT.put("1") for ^1_000_000` takes ~1.26s 12:12
japhb Yeah, dynamic variable lookup is still painfully slow. I've gotten in the habit (where I need the performance) of doing I/O via objects that cache their intended IO::Handle's and use methods on those. 12:19
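A minimal C sketch of the caching idea described above, assuming a toy model in which a dynamic variable is found by walking the chain of caller frames and comparing names. The `Frame` struct, `find_dynamic`, and the two `write_*` helpers are hypothetical illustrations, not MoarVM or Rakudo code. It only shows why hoisting the lookup out of the loop (as in `my $OUT = $*OUT`) cuts the number of frames visited.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a call frame that may declare one dynamic variable. */
typedef struct Frame {
    const char   *dyn_name;   /* name of the dynamic declared here, or NULL */
    void         *dyn_value;  /* value bound to that name */
    struct Frame *caller;     /* next frame up the call chain, or NULL */
} Frame;

/* Counts frames inspected, so the cost of each strategy can be compared. */
static unsigned long frames_visited = 0;

/* Walk the caller chain until some frame declares `name`. */
static void *find_dynamic(Frame *start, const char *name) {
    assert(name != NULL);
    for (Frame *f = start; f != NULL; f = f->caller) {
        frames_visited++;
        if (f->dyn_name != NULL && strcmp(f->dyn_name, name) == 0)
            return f->dyn_value;
    }
    return NULL;
}

/* Repeats the lookup on every iteration, like `$*OUT.put(...)` in a loop. */
static unsigned long write_uncached(Frame *cur, int iterations) {
    unsigned long writes = 0;
    for (int i = 0; i < iterations; i++) {
        if (find_dynamic(cur, "$*OUT") != NULL)
            writes++;
    }
    return writes;
}

/* Looks the handle up once and reuses it, like `my $OUT = $*OUT`. */
static unsigned long write_cached(Frame *cur, int iterations) {
    unsigned long writes = 0;
    void *out = find_dynamic(cur, "$*OUT");
    for (int i = 0; i < iterations; i++) {
        if (out != NULL)
            writes++;
    }
    return writes;
}
```

With a handle declared three frames up, the uncached version visits three frames per iteration, while the cached version visits three frames in total. The real lookup path involves more than a name comparison per frame, so this says nothing about the actual timings quoted above.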
MasterDuke all the dynamic lookups are the source of those stats i posted earlier. but i have no idea if there's an easy improvement 12:21
i guess all the hash/cache misses are because they're dynamics being looked up, so i don't know what could be done 12:29
timo can you trace back where these strings come from that don't have the string hash code cached? 12:49
maybe they are for some reason created from concatenation or substringing over and over again
japhb MasterDuke: The part I don't understand is how you got millions of cases of *both* hashed and not hashed. 12:50
MasterDuke japhb: to confuse things, i've both talked about the string having a hash code, and the existence of a hash used as a cache of lexical names. so not sure which exactly your question is about 12:52
japhb The second one, sorry
MasterDuke timo: 1000002 '$code' has no cached_hash_code, 1000005 'utf8' has no cached_hash_code, 1000006 '$*LIBPATH' has no cached_hash_code 12:54
that's when running `put "1\n" for ^1_000_000`
japhb: i guess it's because sometimes the variable is in a frame with enough lexicals to cause the hash/cache to be created? 12:57
github.com/MoarVM/MoarVM/blob/mast...#L647-L661
brrt ohai #moarvm 13:00
Nicholas good *, brrt 13:03
lizmat brrt o/
timo interesting, do you want to try lowering that number and see how it affects both run time and the hashing details? 13:19
there's no reason not to calculate&cache hash codes for stuff we get from get_heap_string in that piece of code maybe 13:20
MasterDuke you mean try fewer iterations of the loop? 13:21
timo the "num_lexicals <= 5" number
just the one on line 656
damn... what if there was an irc bot or something else where you could just say "apply s/foo/bar/ to moarvm file blah.c on line 1234, build a raku with it, then benchmark this snippet" 13:23
MasterDuke ah, now just 1000005 'utf8' has no cached_hash_code (and the same other random much smaller cases)
timo could you imagine
MasterDuke didn't el_che show something like that?
timo i dunno! maybe! 13:24
was that long ago?
MasterDuke year+ iirc
fyi, i changed the 5 to 2 13:25
with just that change, `put("1") for ^1_000_000` drops to ~1.24s 13:26
timo i don't know what exact function you need to call if you want to ensure a string has a hash code 13:31
but if you can find that, put the number back to 5, and then in an else branch to the piece that adds stuff into the hash, only create the hash code 13:32
that may give the same improvement, perhaps even better
also, maybe nicholas can tell us how good our new hash impl is when the hash has really few entries, and if there is perhaps an improvement to be had there 13:33
and maybe even something for when you know the strings up front and that the hash shouldn't grow ever?
brrt is way out of date... 13:34
timo now i have a very good understanding of "got a job, can't do my favourite open source project any more" 13:35
MasterDuke ah. i put the number back up to 5, but then added `if (!name->body.cached_hash_code) MVM_string_compute_hash_code(tc, name);` right after `MVMString *name = get_heap_string(tc, cu, NULL, pos, 6 * j + 2);` and now it's just `1000005 'utf8' has no cached_hash_code` 13:56
`put("1") for ^1_000_000` looks to be just a tiny bit faster, maybe ~1.32s now 13:57
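The "compute the hash code once if it is missing" pattern from the change above can be sketched in isolation. This is a stand-alone illustration under stated assumptions: the `Str` struct, the FNV-1a hash, and the helper names are all hypothetical, and it does not use `MVM_string_compute_hash_code` or MoarVM's real string layout. A value of 0 is assumed to mean "no hash cached yet".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical string type; cached_hash == 0 means "not computed yet". */
typedef struct {
    const char *chars;
    size_t      len;
    uint64_t    cached_hash;
} Str;

/* Counts how often the full hash is actually computed. */
static unsigned long hash_computations = 0;

/* FNV-1a over the bytes, used here only as a stand-in hash function. */
static uint64_t compute_hash(const Str *s) {
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < s->len; i++) {
        h ^= (unsigned char)s->chars[i];
        h *= 1099511628211ULL;
    }
    hash_computations++;
    return h != 0 ? h : 1;  /* keep 0 reserved as the "unset" marker */
}

/* Recomputes on every call: the cost paid when nothing is cached. */
static uint64_t hash_uncached(const Str *s) {
    assert(s != NULL);
    return compute_hash(s);
}

/* Computes on first use only, then returns the stored value. */
static uint64_t hash_cached(Str *s) {
    assert(s != NULL);
    if (s->cached_hash == 0)
        s->cached_hash = compute_hash(s);
    return s->cached_hash;
}
```

Calling `hash_cached` a million times on the same `Str` hashes the bytes once, whereas `hash_uncached` hashes them a million times. That would be the saving if a string such as '$*LIBPATH' kept its hash code across lookups.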
brrt timo: the thing is that open source projects don't start shouting at you, when you fail to make progress, whereas $dayjob-managers do 14:25
nine I can't imagine any company can afford to have even halfway decent software engineers get shouted at. Feels like the only thing one has to do to get a new and better job is to stop saying no to offers. 18:50
lizmat agree 18:57
nine Corollary is: realize how much power you actually have in your position and use it to stop any abuse right in its tracks, or just leave. There's just no reason to compromise there. 18:59
nine Finally! Got a working version of BEGIN time code execution without regressions, without totally messing up the compilation phases and with a code structure that I can live with. Passing 450 spec test files now 21:14
[Coke] nine++ 21:15
MasterDuke nice 21:16
japhb Found a deadlock in my code that eventually boiled down to MVM_io_read_bytes and MVM_io_is_tty trying to lock the same (per-filehandle) mutex. I can totally see *most* IO operations being mutexed per-filehandle for safety ... but does MVM_io_is_tty need that treatment? Are there any OSen where checking whether an fd is a TTY is an operation that would mess with other IO? 22:19
(I introduced the deadlock by adding a debug assertion in my Raku code that wouldn't actually trigger during my test case, but included a .t on a filehandle already being .read in another thread -- so the assertion test itself triggered the deadlock.) 22:21
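A rough C sketch of the contention described above, assuming POSIX threads and a hypothetical `Handle` type with one mutex per file handle. None of these names are MoarVM's. It uses `pthread_mutex_trylock` so that the conflict shows up as a return value instead of a hang, whereas the real case blocked a second thread. It does not settle whether dropping the lock around a TTY check would be safe in MoarVM.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical file handle guarded by a single per-handle mutex. */
typedef struct {
    int             fd;
    pthread_mutex_t lock;
} Handle;

static void handle_init(Handle *h, int fd) {
    h->fd = fd;
    pthread_mutex_init(&h->lock, NULL);
}

/* Stand-ins for a read that holds the handle's mutex while in progress. */
static void read_begin(Handle *h) { pthread_mutex_lock(&h->lock); }
static void read_end(Handle *h)   { pthread_mutex_unlock(&h->lock); }

/* TTY check that takes the same mutex as reads.
 * Returns 1 or 0 for tty / not tty, or -1 if the handle is busy. */
static int is_tty_locked(Handle *h) {
    int rc = pthread_mutex_trylock(&h->lock);
    if (rc == EBUSY)
        return -1;           /* a blocking lock would wait here instead */
    assert(rc == 0);
    int result = isatty(h->fd) ? 1 : 0;
    pthread_mutex_unlock(&h->lock);
    return result;
}

/* TTY check that only queries the descriptor and takes no lock. */
static int is_tty_unlocked(const Handle *h) {
    return isatty(h->fd) ? 1 : 0;
}
```

While a read holds the mutex, `is_tty_locked` reports the handle as busy, and the unlocked variant still answers. With a blocking lock and a read that never returns, the locked variant would wait indefinitely, which matches the symptom reported above.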