github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
00:05 vrurg joined 00:09 vrurg left 00:10 MasterDuke left 00:46 leont left 01:03 vrurg joined 03:01 klapperl left, klapperl joined 03:04 avar left 03:16 avar joined, avar left, avar joined 04:03 bartolin left 04:17 bartolin joined 06:33 frost-lab joined 07:58 domidumont joined 08:26 patrickb joined 08:31 MasterDuke joined
patrickb o/ 08:35
MasterDuke \o 08:39
08:55 zakharyas joined 08:57 sortiz left 08:59 sena_kun left 09:02 sena_kun joined 09:07 leont joined 09:12 Geth joined
MasterDuke i'm looking at a perf report of a tiny script that defines some custom operators (essentially taken from www.khanate.co.uk/blog/2021/02/23/...-part-2/), but takes 9s to run 09:47
lizmat could you see if it is as slow when the code is precomped? 09:48
MasterDuke there are 1.35m calls to MVM_coerce_smart_intify, hitting this case github.com/MoarVM/MoarVM/blob/mast...#L417-L418 and i logged the types: `1349339 intifying an object of type P6int (BOOTInt)`
is that `REPR(obj)->box_funcs.get_int` faster than the above block github.com/MoarVM/MoarVM/blob/mast...#L393-L409 which i would have thought would have been hit? 09:50
i assume the get_int is just github.com/MoarVM/MoarVM/blob/mast....c#L89-L97 which should be pretty fast, but MVM_coerce_smart_intify was higher in the perf report than i'd expect 09:54
lizmat: the runtime isn't very much (so i'm sure precompiling would make the entire script much faster), but it's the compiling of it i'm trying to speed up 09:56
lizmat ack ++MasterDuke
MasterDuke run time is 41 ms, compile time is 9s 09:59
6s of which are spent in github.com/Raku/nqp/blob/master/sr...L813-L1007 but i don't know if there are any more micro-optimizations to be done there, i suspect something algorithmic is needed 10:01
10:01 frost-lab left
lizmat Looks to me like line 966 is dead code? 10:05
nine indeed 10:06
MasterDuke yeah, same with 873. but i don't think those are going to significantly impact the time 10:07
lizmat lemme remove them and run the tests
nine I'm pretty sure spesh would optimize away 873 and likely 966, too
OTOH removed code is debugged code 10:08
lizmat 873 appears to be needed for debugging
nine both are 10:09
MasterDuke same with 966 (used in 969)
lizmat aaahh ok
nine but they may be commented out
MasterDuke but they should be commented out with the debugging code 10:10
lizmat doing that now
MasterDuke fwiw, process_worklist, MVM_VMArray_at_pos, VMArray_gc_mark are the top moarvm functions according to perf 10:11
nine I'm a bit surprised at MVM_VMArray_at_pos, since that ought to get devirtualized in the JIT 10:12
Or, no
10:13 frost-lab joined
nine Well it would get devirtualized, but that only means that we get rid of MVM_repr_at_pos and replace it with MVM_VMArray_at_pos. And the latter is probably a bit too complicated to implement in the JIT directly 10:13
jnthn The array element type is, however, hung off the REPR data, and so if we know the type we know the element type, so any branching on the array kind goes away 10:14
I'd suggest we get the re-org of VMArray done first 10:15
MasterDuke re-org? the FSA work i've yet to finish?
nine Yeah, looking at it, the code is very repetitive and the only real difference is the multiplier for the index to get the memory address
jnthn MasterDuke: Yes, and I think moving the length information into the chunk allocated with the FSA, to give us safety 10:16
After that I guess it JITs into an offset calculation and a bounds check
Then the deref
MasterDuke yeah, that's the part i haven't finished yet
that is, moving the length information into the chunk allocated with the FSA
jnthn OK. The JIT output would change with that, so probably it's worth doing first. 10:17
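A minimal C sketch of the idea discussed above: once the length sits in the same chunk as the elements, an element read reduces to a bounds check, an offset calculation (index times element size), and a dereference. The names `SketchChunk`, `sketch_chunk_new`, `sketch_bind_pos` and `sketch_at_pos` are hypothetical and are not MoarVM's VMArray API; plain `calloc` stands in for the FSA here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout, NOT MoarVM's real VMArray body: the length is stored
 * in the same allocation as the element storage. */
typedef struct {
    size_t        elems;      /* number of elements in this chunk */
    size_t        elem_size;  /* bytes per element (the "multiplier") */
    unsigned char slots[];    /* element storage follows the header */
} SketchChunk;

/* Allocate a zeroed chunk; calloc is an assumption standing in for the FSA. */
static SketchChunk *sketch_chunk_new(size_t elems, size_t elem_size) {
    SketchChunk *c = calloc(1, sizeof(SketchChunk) + elems * elem_size);
    if (c) {
        c->elems     = elems;
        c->elem_size = elem_size;
    }
    return c;
}

/* Store one element; returns 1 on success, 0 if the index is out of bounds. */
static int sketch_bind_pos(SketchChunk *c, size_t index, const void *in) {
    if (index >= c->elems)
        return 0;                                   /* bounds check */
    memcpy(c->slots + index * c->elem_size, in, c->elem_size);
    return 1;
}

/* Read one element; returns 1 on success, 0 if the index is out of bounds. */
static int sketch_at_pos(const SketchChunk *c, size_t index, void *out) {
    if (index >= c->elems)
        return 0;                                   /* bounds check */
    /* offset calculation, then the "deref" (a copy of elem_size bytes) */
    memcpy(out, c->slots + index * c->elem_size, c->elem_size);
    return 1;
}
```

When the element type is known ahead of time, `elem_size` is a constant, so no branching on the array kind is left in this path.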
I'll try and backlog here a bit later. Still suffering limited keyboard time due to wrist.
nine oh...still not better? 10:18
MasterDuke i think the remove candidates PR is pretty close to done, hope to finish up the VMArray FSA stuff next
jnthn nine: Well, in the absolute sense no, in the relative sense, improvement since I started using some gel and realizing that less keyboard time is useless if I don't also do less smartphone time :) 10:19
MasterDuke (just distracted from either this morning by the slow custom operator compiling)
jnthn Custom op compilation is slow more 'cus of the NFA design than anything, I suspect 10:20
MasterDuke heh, yeah, i just keep hoping to find something lower hanging than the very top of the tree...
nine jnthn: I can tell from experience that playing piano is also not the smartest idea in that case :) 10:21
MasterDuke or throwing/catching an (american) football
nine MasterDuke: too late! Considering the path you're on, you'll keep jumping from tree top to tree top
MasterDuke ha 10:23
10:35 frost-lab left
nine Why does CStruct's storage spec claim that it needs only space the size of a pointer for inlining: github.com/MoarVM/MoarVM/blob/mast...uct.c#L793 when its body actually contains 2 pointers? github.com/MoarVM/MoarVM/blob/mast...ruct.h#L21 10:38
Ah, because it cannot be inlined according to the same storage spec 10:42
10:46 patrickb left
MasterDuke btw, is there anything that can be done for all those MVM_coerce_smart_intify calls where the type is P6int (BOOTInt)? don't know if it matters, but they're coming from that NFA optimize method and NQPMu's BUILDALL 11:12
12:08 zakharyas left
MasterDuke oh, github.com/MoarVM/MoarVM/blob/mast...ze.c#L1006 is supposed to take care of it. but then why is there still a smrt_intify in the 'after' of NQPMu's BUILDALL...? 12:33
smrt_intify      r13(3),   r8(9) 12:37
...
r8(9): usages=1, deopt=9, flags=0
...
r13(3): usages=6, deopt=53,51,50,49,48,47,46,45,44,43,42,41,40,39,38,37,36,35,34,32,31,30,29,28,27,25,26,22,21,20,19,18,17,16,15,14,13,12,11,10, flags=0
i don't know how to read facts
lizmat resists the urge to mention something with "alternative" in it 12:39
MasterDuke heh
fwiw, it's this line github.com/Raku/nqp/blob/master/sr...Mu.nqp#L23 12:42
lizmat you could try removing the "int" part ? 12:46
but I guess that will cascade into a lot of calls in the nqp::iseq_i 's ? 12:47
MasterDuke that's what i assume
12:53 patrickb joined
MasterDuke i think it doesn't get into the body of the optimize_coerce because it fails `if (facts->flags & (MVM_SPESH_FACT_KNOWN_TYPE | MVM_SPESH_FACT_CONCRETE) && facts->type) {` 13:08
`facts->flags & (MVM_SPESH_FACT_KNOWN_TYPE | MVM_SPESH_FACT_CONCRETE)` is 0 and `facts->type` is 0x0
lizmat so is the location of the code the reason? 13:10
MasterDuke i assume it's because spesh doesn't know for sure what the type is of the thing it's pulling out with the nqp::atpos. compared to atpos_i which would always be an int 13:12
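The guard quoted above can be sketched in isolation. This is a rough illustration, assuming made-up flag values and a stand-in `SketchFacts` struct; the real `MVM_SPESH_FACT_*` constants and the spesh facts struct in MoarVM may look different. It shows why facts with `flags=0` and a null type never reach the body of the optimization.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; MoarVM's real MVM_SPESH_FACT_* constants may differ. */
#define SKETCH_FACT_KNOWN_TYPE 1u
#define SKETCH_FACT_CONCRETE   2u

/* Stand-in for the per-register facts, reduced to the two fields tested. */
typedef struct {
    uint32_t    flags;
    const void *type;   /* NULL when the type is not known */
} SketchFacts;

/* Same shape as the quoted condition: (flags & mask) && type.
 * Note the mask test passes when EITHER flag is set, and the type
 * pointer must additionally be non-NULL. */
static int sketch_can_optimize_coerce(const SketchFacts *facts) {
    return (facts->flags & (SKETCH_FACT_KNOWN_TYPE | SKETCH_FACT_CONCRETE))
        && facts->type;
}
```

With `flags` of 0 and a NULL `type`, as reported in the log, the function returns 0, so the coercion is left as a generic `smrt_intify`.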
lizmat yup, I'd say 13:15
sadly, that array cannot be turned into a list_i, as it can also contain code objects for BUILD and TWEAK, if I recall correctly
and in any case, in Rakudo this is all codegenned for each class, so it actually won't run BUILDALL in most cases 13:16
well, *that* BUILDALL
MasterDuke luckily this isn't the most expensive thing going on, but i thought it might be more easily optimized 13:18
lizmat but you raise a good point 13:20
I wonder though if porting the "create a custom BUILDALL method for a class" approach to NQP would be worthwhile, or even possible? 13:21
MasterDuke i'm afraid i can only be a rubber duck here, i know absolutely nothing about the BUILDALL stuff 13:24
lizmat how big a part is BUILDALL execution of what you're benchmarking ? 13:25
MasterDuke it's the 6th most expensive function, but in absolute time values almost nothing in comparison 13:26
lizmat ok, then let's focus on the top 5 :-) 13:27
MasterDuke 6.2s, 4.4s, 2s, 440ms, 240ms, then BUILDALL at 190ms 13:28
lizmat and the 6.2s one is? 13:30
MasterDuke those top three are optimize gen/moar/stage2/QRegex.nqp:817, mergesubstates gen/moar/stage2/QRegex.nqp:665, mergesubrule gen/moar/stage2/QRegex.nqp:560
i've looked at all three a bunch and i think the micro-optimizations are pretty much all found. like jnthn++ said, a re-design is needed 13:31
lizmat line 384 maybe better written as "elsif $to && @edges[0] == $EDGE_FATE" and lose the inner if ? 13:33
that would first check $to (which is probably cheaper), and only then index into @edges 13:34
also: it's checking twice for @edges[0], maybe store that in a temp ?
MasterDuke isn't the second one @sedges (note the initial 's')? 13:36
oh, you mean 831 and 834?
lizmat 830 yeah 13:37
sorry, what was I typing
13:42 zakharyas joined
MasterDuke no noticeable change in time with those changes 13:49
lizmat yeah, it was a long shot: those were in the initial setup, not in the actual optimization 13:52
afk for a few hours&
MasterDuke jnthn: i'm not sure about github.com/MoarVM/MoarVM/pull/1426...2c17b3e282 but that was needed to actually trigger the optimization being removed in that example you gave 14:12
nine I'm pretty sure all the regex code is from the era of "make it work somehow" rather than the later "make it work efficiently" stage 14:34
MasterDuke i know there's that "passing fates somewhere" optimization i've asked about before, i've been thinking about trying to give that a go once i finish up the current stuff 14:37
regexes/nfas/etc are not really my area of expertise, but then again, none of the rakudo/moarvm stuff i've done really is either... 14:39
nine I think regexes/nfas/etc are the last large white area on my map, too :) 14:44
MasterDuke i really like using regexes (mastering regular expressions was only the second programming reference book i read cover-to-cover besides programming perl), but have never really spent any time with their implementation 14:47
nine At university (I studied on-the-job with already 15 years of experience) as a beginner example for programming C we had to implement a grep-like tool and just for fun I implemented a very basic backtracking regex engine :) It's really the best way to understand how these things work 15:16
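For readers who have not seen one, a very basic backtracking matcher fits in a few lines of C. The sketch below is not nine's code; it follows the well-known style of the small matcher by Kernighan and Pike and supports only literal characters, `.`, `*`, `^` and `$`. The function names are chosen for this illustration.

```c
#include <stddef.h>

static int match_here(const char *re, const char *text);

/* Match c* followed by the rest of the pattern: try zero occurrences of c
 * first, then consume one more matching character on each failure. */
static int match_star(int c, const char *re, const char *text) {
    do {
        if (match_here(re, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}

/* Match the pattern re at the very start of text. */
static int match_here(const char *re, const char *text) {
    if (re[0] == '\0')
        return 1;                               /* pattern exhausted */
    if (re[1] == '*')
        return match_star(re[0], re + 2, text);
    if (re[0] == '$' && re[1] == '\0')
        return *text == '\0';                   /* anchor at end of text */
    if (*text != '\0' && (re[0] == '.' || re[0] == *text))
        return match_here(re + 1, text + 1);    /* one character matched */
    return 0;
}

/* Search for the pattern anywhere in text; returns 1 on a match, else 0. */
int sketch_match(const char *re, const char *text) {
    if (re[0] == '^')
        return match_here(re + 1, text);
    do {                                        /* try each start position */
        if (match_here(re, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}
```

The backtracking happens in `match_star`: each failed attempt of the remaining pattern is retried after consuming one more character. An NFA-based engine instead tracks all possible states in parallel rather than retrying.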
15:22 linkable6 left, evalable6 left, linkable6 joined 15:24 evalable6 joined
MasterDuke my algorithms course had some stuff about NFAs/DFAs, but i don't remember any particularly practical exercises 15:32
15:35 patrickb left 15:37 patrickb joined 16:09 sortiz joined 16:54 patrickb left 17:05 cog left 17:06 cog joined 18:59 zakharyas left 19:33 domidumont left 20:21 sxmx left 20:38 sxmx joined 20:55 zakharyas joined 21:48 zakharyas left
lizmat eprint.iacr.org/2021/232 # RIP RSA ? 22:28
moritz the linked PDF says "work in progress 31.10.2019" 22:33
if it's as revolutionary as the abstract claims, why hasn't it destroyed RSA yet in the last year?
I don't understand number theory, so cannot judge the paper on its contents 22:34
leont Given the name on it, I would take it seriously 22:35
Any idea I may have had of understanding a little number theory was quickly put to rest by that paper… 22:37
moritz aye, it's pretty dense :D
MasterDuke dense is the word i was just going to use
moritz and according to the Wikipedia, it does look like he's got some good credentials in the field 22:38
leont Then again, my understanding of number theory is based on having read Applied Cryptography like 15 years ago :-p
(and that's probably also why I had heard of Schnorr signatures) 22:47
23:10 Kaiepi left 23:11 Kaiepi joined