Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes.
Set by lizmat on 24 May 2021.
Nicholas good *, #moarvm 06:33
lizmat good after* ! 10:26
jnthnwrthngtn might find this PR interesting: github.com/rakudo/rakudo/pull/4918
vrurg Anybody to advise on serialization? 19:45
Nicholas I don't know what the question is yet, so I don't know if I can actually answer usefully 19:46
"use Sereal rather than Storable" - oh, very wrong channel 19:47
vrurg ;)
I just wonder if it is possible to detach an object from serialization context. 19:48
Nicholas I have no idea about that
(Sorry)
vrurg nine: ping? 19:49
Too late, perhaps.
Nicholas I sort of assumed the same - he's not normally around at this time of the morning 19:50
vrurg C'mon, it's just 11 in Germany! Even me is woken up at this time! 19:51
Normally...
MasterDuke Nicholas: do you know anything about MVM_string_compute_hash_code and whether there are any easy optimization possibilities? 19:52
hm, it doesn't have a whole lot of code in it... 19:53
Nicholas I don't know anything about that code (as in, I've read it briefly previously, but never needed to touch it) 19:56
nine pong 19:57
vrurg nine: if I do `nqp::setobjsc($obj, nqp::null())` – would it throw, or exclude the object from serialization? 19:58
nine I don't think it's possible to detach an object once it's in a serialization context. It is possible to prevent it from getting attached. There's also nqp::neverrepossess 19:59
Maybe take a step back and describe your problem?
vrurg I'm trying to make symbol merging on Stash more thread-safe by clone - replace $!storage. But when it happens at compile time the clone gets lost. If I add it to serialization – the original would still pollute the bytecode. 20:00
nine What do you mean by "pollute the bytecode"?
vrurg Both hashes: the original $!storage and its modified clone would be serialized. 20:01
nine Why would the clone get lost if it's referenced from the Stash object? 20:02
MasterDuke just throwing out random words here, but are nqp::scwbdisable and nqp::scwbenable relevant?
vrurg It does. Until I added $*W.add_object_if_no_sc($storage) I was experiencing lost symbol tables for `use NativeCall::Compiler::GNU` and alike. 20:04
nine But why?
vrurg MasterDuke: no because there is a need to replace an already serialized object.
nine: No idea. Let me create a gist with the method code.
gist.github.com/vrurg/b7dcda63d288...9c4151d6a7 20:06
It is experimental. The idea for now is ModuleLoader.merge_globals calls this method on Stash'es to let them wrap merging into a lock. 20:07
MasterDuke Nicholas: btw, what does Perl use?
Nicholas what what does Perl use? hashing algorithm?
vrurg nine: oops, just've noticed that I forgot to uncomment cloning. Perhaps I'm wrong. 20:08
MasterDuke yeah 20:09
vrurg is recompiling...
MasterDuke particularly for strings, if those are done differently
a perf report of `say "big2.txt".IO.slurp.lc.words.Bag.elems` has MVM_string_compute_hash_code as the second most expensive function 20:10
and MVM_string_compute_hash_code kind of has just two parts to it, the string iteration and the hashing. i can't easily do anything about the string iteration, but maybe there's something better than siphash 20:12
vrurg nine: sorry, false alarm. It's not about serialization, after all.
Nicholas all things in Perl are hashed as strings. And I've lost track of what is now used. Yves did a lot of work to make it flexible (at C compile time)
but the choices weren't just Siphash. (And IIRC "Siphash" defaults to something like Siphash 2-4, where the "2" and "4" refer to rounds for each loop iteration, and rounds during finalisation.) 20:13
MasterDuke perldoc.perl.org/perlsec#Algorithm...ty-Attacks says siphash is the fallback, but the default is much faster (though it doesn't say what that is...) 20:15
ah, for my machine at least (using the system perl), `PERL_HASH_SEED_DEBUG=1 perl -e ''` gives `HASH_FUNCTION = SBOX32_WITH_SIPHASH_1_3` 20:17
Nicholas I should go to bed (evil alarm clock goes off too soon) but I *thought* the default was 1-3, not 2-4 20:18
the Perl 5 default, that is.
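For readers unfamiliar with the "c-d" naming discussed above, here is a minimal Python sketch of SipHash with both round counts as parameters: `c` compression rounds per 8-byte block and `d` finalisation rounds. It follows the published SipHash construction over raw bytes; it is not MoarVM's or Perl's actual code (MoarVM hashes graphemes, and Perl's SBOX32_WITH_SIPHASH_1_3 adds a separate fast path for short keys). The function names and the key used in the usage lines are illustrative only.

```python
MASK64 = (1 << 64) - 1


def rotl(x, b):
    """Rotate a 64-bit value left by b bits."""
    return ((x << b) | (x >> (64 - b))) & MASK64


def sipround(v0, v1, v2, v3):
    """One SipRound: the add-rotate-xor mixing step."""
    v0 = (v0 + v1) & MASK64
    v1 = rotl(v1, 13) ^ v0
    v0 = rotl(v0, 32)
    v2 = (v2 + v3) & MASK64
    v3 = rotl(v3, 16) ^ v2
    v0 = (v0 + v3) & MASK64
    v3 = rotl(v3, 21) ^ v0
    v2 = (v2 + v1) & MASK64
    v1 = rotl(v1, 17) ^ v2
    v2 = rotl(v2, 32)
    return v0, v1, v2, v3


def siphash(key, data, c=2, d=4):
    """SipHash-c-d of `data` under a 16-byte `key`, as a 64-bit int."""
    if len(key) != 16:
        raise ValueError("key must be exactly 16 bytes")
    k0 = int.from_bytes(key[:8], "little")
    k1 = int.from_bytes(key[8:], "little")
    v0 = k0 ^ 0x736F6D6570736575
    v1 = k1 ^ 0x646F72616E646F6D
    v2 = k0 ^ 0x6C7967656E657261
    v3 = k1 ^ 0x7465646279746573

    # Full 8-byte little-endian words, then a final word that carries the
    # leftover bytes plus the message length (mod 256) in its top byte.
    full = len(data) - len(data) % 8
    words = [int.from_bytes(data[i:i + 8], "little") for i in range(0, full, 8)]
    words.append(int.from_bytes(data[full:], "little") | ((len(data) & 0xFF) << 56))

    for m in words:
        v3 ^= m
        for _ in range(c):          # the "c" in SipHash-c-d
            v0, v1, v2, v3 = sipround(v0, v1, v2, v3)
        v0 ^= m

    v2 ^= 0xFF
    for _ in range(d):              # the "d" in SipHash-c-d
        v0, v1, v2, v3 = sipround(v0, v1, v2, v3)
    return v0 ^ v1 ^ v2 ^ v3


key = bytes(range(16))              # illustrative key, not a real seed
h24 = siphash(key, b"hello", c=2, d=4)
h13 = siphash(key, b"hello", c=1, d=3)
```

Dropping from 2-4 to 1-3 halves the mixing work per block, which is where the speed difference comes from; for very short strings the finalisation rounds are a large share of the total.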
japhb remembers back when it was "multiply by 33, add, shift"
Nicholas :-) 20:19
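A minimal Python sketch of the "multiply by 33, add, shift" style of hash mentioned above. The loop is the classic times-33 recurrence; the final fold of the high bits via `>> 5` and the 32-bit width are assumptions about what "shift" refers to, not a reproduction of any specific Perl release.

```python
MASK32 = 0xFFFFFFFF


def times33(data):
    """Times-33 string hash over bytes, truncated to 32 bits."""
    h = 0
    for byte in data:
        h = (h * 33 + byte) & MASK32   # multiply by 33, add
    return (h + (h >> 5)) & MASK32     # assumed final "shift" step


bucket = times33(b"moarvm") % 64       # e.g. picking one of 64 buckets
```

It is far cheaper per byte than SipHash, but it is unkeyed, so anyone can precompute colliding inputs; that weakness is what the algorithmic complexity attack section of perlsec linked above is about.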
Nicholas remembers when it was Kindergarten, and it didn't totally matter when you arrived, but "before 9" was generally good.
I don't appreciate the "good morning" from the alarm clock
japhb My biggest problem with alarm clocks is training myself to hit snooze when I'm mostly asleep rather than turning it off completely. 20:21
vrurg Respect to iphone, where 'stop' is a little gray button, hard to click accidentally. Though, I have this achievement unlocked for several times... 20:23
MasterDuke hm, github.com/google/highwayhash looks interesting
japhb vrurg: On Pixel, it's swipe center-to-left or swipe center-to-right. Which is especially annoying when the alarm is going off and you're too tired to get your finger to sufficiently match "center" at the start 20:24
(Meanwhile your panicking mind is reminding you that your partner is *also* having to hear the alarm that won't stop.) 20:25
MasterDuke sadly, i sort of look back fondly on when i used to set an alarm clock. i now only set one a couple times a year, kids get us up early enough (for pretty much anything scheduled during the day) otherwise 20:27
nine We now make it through stage parse of Test.pm6 on RakuAST (including BEGIN time effects like traits and proto auto-generation). 20:45
MasterDuke cool beans 20:46
jnthnwrthngtn nine: Wow, that's some progress :) 21:20
MasterDuke Nicholas: does Perl ever not compute the hash code if the string is too large? 22:01
jnthnwrthngtn: you might have some intuition here. for this example `say "big2.txt".IO.slurp.lc.words.Bag.elems` (where big2.txt is a collection of random stuff, e.g., project gutenberg text, wikipedia text), a lot of time is spent in MVM_string_compute_hash_code 22:03
in 1095695 of the cases that MVM_string_compute_hash_code is called, the storage type is MVM_STRING_STRAND, two strands, with 4 graphs in the first strand and 6488666 graphs in the second 22:05
the next most common case for two strands with 21 occurrences is 7 graphs in the first and 15 graphs in the second 22:06
any idea why the case with 6488666 graphs in the second strand would be happening so frequently? 22:08
jnthnwrthngtn MasterDuke: First guess: Bag uses an object Hash, and thus .WHERE/ObjAt. The WHERE value is formed of "Str|the value", the "Str|" is the first entry in the strand. Look carefully at the second strand, I bet it's got offsets, since substr is generally non-copying rather rather produces a strand. 22:09
s:1st/rather/and
sourceable6 jnthnwrthngtn, No idea, boss. Can you give me a Code object? Output: ===SORRY!=== Error while compiling /tmp/YzdXnm6TNm␤Confused␤at /tmp/YzdXnm6TNm:1␤------> 1⏏st/rather/and␤ expecting any of:␤ whitespace␤
jnthnwrthngtn These indices are what I refer to: github.com/MoarVM/MoarVM/blob/d11b...ring.h#L58 22:10
MasterDuke hm, let's see what those values are... 22:11
ah yes. now i see that all the calls have different start and ends 22:14
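A rough Python model of the situation described above, assuming a strand is just a (backing text, start, end) view and that hashing walks only the view. The `Strand` class and helper names are hypothetical and far simpler than MoarVM's real MVMString structures; the toy hash stands in for the real one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strand:
    blob: str    # shared backing text, never copied
    start: int   # first grapheme of the view
    end: int     # one past the last grapheme of the view


def substr_strand(blob, start, length):
    """Non-copying substr: returns a view into the backing text."""
    return Strand(blob, start, start + length)


def where_key(prefix, value):
    """Two-strand key such as 'Str|' followed by the value."""
    return [Strand(prefix, 0, len(prefix)), value]


def graphs_visited(strands):
    """How many graphemes a hash over these strands actually reads."""
    return sum(s.end - s.start for s in strands)


def hash_strands(strands):
    """Toy hash that walks each strand's view, not its whole blob."""
    h = 0
    for s in strands:
        for i in range(s.start, s.end):
            h = (h * 33 + ord(s.blob[i])) & 0xFFFFFFFF
    return h


text = "the quick brown fox"                     # stands in for the slurped file
key = where_key("Str|", substr_strand(text, 4, 5))   # view of "quick"
visited = graphs_visited(key)                    # 4 + 5, not 4 + len(text)
```

Under this reading, every word from `.words` shares one huge backing string, so a count taken from the backing text looks identical across calls, while the start and end offsets show that each call only hashes a handful of graphemes.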
jnthnwrthngtn lizmat: Missed this before, but: "what makes the NEXT phaser different from e.g. LAST" - it's that in NEXT we can be 100% sure that we ran the outer block (because if there's a next value, there was a loop iteration), but `LAST` needs to run even if we didn't have any values, so we need some kind of hack to cope with it 22:16
lizmat ah, ok
well, I figured it out in github.com/rakudo/rakudo/pull/4918 22:17
which in turn led me to github.com/rakudo/rakudo/pull/4919 22:18
meanwhile, /me is going to get some shuteye & 22:19
MasterDuke hm, so i guess the time spent in MVM_string_compute_hash_code is because of the number of calls and their total cost, not because it's being called (multiple times) on a large string 22:20
yeah, vast majority of the number of graphs actually being checked are <20 22:22