01:08 maggotbrain joined
Geth ¦ nqp-configure: vrurg self-assigned Cannot determine the brand of your nmake utility github.com/Raku/nqp-configure/issues/19 02:20
06:57 hungryd12 joined 07:00 hungrydonkey left 07:33 lichtkind joined 08:01 hungrydonkey joined 08:03 hungryd12 left
AlexDaniel Altai-man: just noticed that 2020.02.1 release announcement talks a bit too much about Perl 6 08:04
tellable6 AlexDaniel, I'll pass your message to Altai-man_
08:10 hungryd57 joined 08:12 hungryd57 left 08:13 hungrydonkey left, hungrydonkey joined 08:14 AlexDaniel left 08:27 AlexDaniel joined, AlexDaniel left, AlexDaniel joined 08:32 Kaeipi left 08:33 Kaiepi joined
lizmat Files=1306, Tests=111242, 215 wallclock secs (29.11 usr 8.47 sys + 3017.92 cusr 278.84 csys = 3334.34 CPU) 08:46
09:26 squashable6 left 09:27 squashable6 joined 09:30 sena_kun joined
lizmat MasterDuke: funny piece of code, that, using a native str as an index in a list 09:35
also, "if nqp::ord($name) == 36 && ($name eq '$!from' || $name eq '$!to') {" feels like a premature optimization nowadays ?
MasterDuke well, the native str would be one of the ascii digits at that point `if nqp::ord($name) < 58`, but yeah 09:41
lizmat m: use nqp; my int $a; my str $name = q/$bar/; for ^1000000 { ++$a if nqp::ord($name) == 36 && ($name eq q/$foo/ || $name eq q/$bar/) }; say $a; say now - INIT now
camelia 1000000
lizmat m: use nqp; my int $a; my str $name = q/$bar/; for ^1000000 { ++$a if $name eq q/$foo/ || $name eq q/$bar/ }; say $a; say now - INIT now
camelia 1000000
lizmat that's about 2.5x faster
I don't know how hot that loop is, but that would appear to be an easy win 09:42
MasterDuke that whole MATCH function is expensive and called a lot when parsing, so could be good
lizmat I think that piece of code is a leftover from the Parrot days 09:43
when string matching was relatively expensive, and ord was relatively cheap
now that we have proper grapheme support, I think it's the other way around now
MasterDuke: also another win: I see several places checking for CCLASS_ALPHANUMERIC and then checking for underscore: nqp::ord(95) 09:46
there's actually a CCLASS_WORD nowadays that combines them
so the extra check for nqp::ord(95) can then be removed 09:47
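A minimal sketch of the simplification lizmat describes, assuming a Rakudo where `use nqp` exposes `nqp::iscclass`, `nqp::ordat` and the `nqp::const::CCLASS_*` constants. The helper names `is-word-old` and `is-word-new` are hypothetical and are not taken from Cursor.nqp:

```raku
use nqp;

# Hypothetical helpers for illustration only.
sub is-word-old(str $s, int $pos) {
    # Old pattern: alphanumeric check plus an explicit test for '_' (codepoint 95).
    so nqp::iscclass(nqp::const::CCLASS_ALPHANUMERIC, $s, $pos)
      || nqp::iseq_i(nqp::ordat($s, $pos), 95)
}

sub is-word-new(str $s, int $pos) {
    # CCLASS_WORD already covers alphanumerics and the underscore.
    so nqp::iscclass(nqp::const::CCLASS_WORD, $s, $pos)
}

# Compare both checks on a few sample characters.
for "a", "Z", "7", "_", "-", " " -> $c {
    say "'$c': old=", is-word-old($c, 0), " new=", is-word-new($c, 0);
}
```

Note that this only holds for CCLASS_ALPHANUMERIC; a CCLASS_ALPHABETIC check does not include digits, so swapping in CCLASS_WORD there would change behavior.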
09:57 ggoebel joined
MasterDuke nice 10:02
ggoebel .tell nine we may need to treat compiled scripts differently than precompiled modules because the end-goal deliverable of the gsoc project was to "Add a --compile option to perl6 which will generate a foo executable PE or ELF formatted binary for a given foo.pl6 user program which will facilitate self-contained deployment and/or deployment dependent upon the system shared version of Perl 6 which is installed." 10:05
tellable6 ggoebel, I'll pass your message to nine
nine ggoebel: I don't read anything in that which would warrant different treatment. After all the self-contained program will want to use modules as well 10:07
ggoebel the project summary references the .net single file distribution... but the link is broken. I think I've found it here: github.com/dotnet/designs/blob/mas.../design.md 10:12
nine If I were to implement self-contained executables, I'd implement a new CompUnit::Repository and a new CompUnit::PrecompilationStore and fix the few remaining places in CompUnit::PrecompilationRepository where we still assume that we're dealing with files. 10:14
ggoebel that was the fuzzy direction I thought things might need to go. but not being intimately familiar with everything, I was hesitant to suggest it. 10:15
where the CU:PrecompilationStore would allow you to bundle the module dependencies within the self-contained executable and load them directly into memory from there 10:17
here is the link to the gsoc project summary: yakshavingcream.blogspot.com/2019/...eview.html
10:18 lichtkind_ joined 10:20 lichtkind left
ggoebel the self-contained program may at one extreme end of the spectrum want to bundle _everything_ into it. So that you have a stand-alone relocatable binary with no dependency on raku being installed on the system 10:20
nine Precompiled scripts sound rather orthogonal to self-contained programs to me anyway 10:25
ggoebel yes Perl5's PAR provides self-contained executables without precompilation 10:27
nine As do you apparently :) "And with that, I had a way to embed Perl 6 source code into an ELF file, and run it! It was time to move onto the next steps, figuring out how to run MBC directly, and adding the --compile flag to perl6." 10:31
ggoebel I imagine you might be able to load the precompiled bytecode in a self-contained program without performing whatever tests are required to decide when to recompile things? 10:32
FWIW: I am related to, but am not pamplemouse (the gsoc implementer) 10:34
nine A couple years ago I realized that we're actually already at a point very close to being able to ship binary only distributions. In a normal program run we never touch the installed modules' source files.
Ah, that clears that confusion :)
I thought you have just changed nick or so
ggoebel I encouraged them to get involved with raku via the gsoc project, but it is their work not mine. 10:35
is there much overhead in deciding when things need to be re-precompiled? I've never really understood how that works. 10:36
nine The overhead is reasonably small. It's all based on "trust the SHA". 10:37
Once you understand the problem, the solution should be rather straightforward to understand. 10:38
ggoebel so a checksum for each precompiled module?
I guess I should go read the code... 10:39
nine To decide whether a precomp file is up to date, we need to know whether it still reflects the sources or not and whether its dependencies are in the exact same state as when we precompiled
Therefore precomp files can become invalid when you change the source (of course) or when the result of resolving the dependency specifications would change from the time we precompiled to the time we loaded. 10:40
The latter may change simply because one installed a newer version of a dependency (the classic case). It can also change if a new repository was added to the $*REPO chain and this repository contains a different version that would be preferred. I.e. raku -I/some/dir 10:41
So to a first approximation, a precomp file is safe if all repositories have the exact same state as when we precompiled. I.e. nothing got installed or uninstalled and the sources are identical. This is what I implemented first. 10:42
ggoebel Does this mean we perform a checksum on every source file before using the associated precompiled bytecode? I appreciate you taking the time to answer my questions, but I don't want to take your time if there is a design doc or I should be figuring it out for myself by reading the code. 10:43
nine For this each repository generates an identifier (id) which reflects the full contents of the repository. Think a SHA hash of all installed modules. We record this hash in the precomp file and can compare when loading
Well, this may actually become the new "general overview" documentation :)
This first implementation works, but of course it has a major drawback: whenever we install any module, all precomp files will be invalidated. So we need to be a bit smarter. 10:44
When we detect that there was a change in the repository chain's identity - either by a repository getting added or removed or by a change in its contents, i.e. a module getting installed - we re-resolve the precomp file's dependencies. 10:45
ggoebel so when the repo id SHA checksum fails, all the precomp files get rewritten. I think I'm with you. 10:47
nine So we take the dependency specification as given in the use statement (use Foo:auth/me/:ver(v1.*)) and check what precomp file we'd end up with. If it's the same, everything is fine. To do this we actually record this CompUnit::DependencySpecification in the precomp file's header.
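The two-step check nine describes (compare the repository chain's id first, and only re-resolve the recorded dependencies when it differs) could be modeled roughly as below. This is a toy sketch and not Rakudo's actual implementation: a repository is assumed to be a plain list of distribution names, and `chain-id`, `resolve`, `still-valid` and the `%header` layout are all hypothetical. Only `nqp::sha1` is a real op.

```raku
use nqp;

# Hypothetical model: a repository is a list of distribution names,
# and the repository chain is a list of such repositories.
sub chain-id(@repos) {
    nqp::sha1(@repos.map({ .sort.join("\n") }).join("\n--\n"))
}

# The first distribution in the chain whose name matches wins.
sub resolve(Str $name, @repos) {
    for @repos -> @repo {
        for @repo -> $dist {
            return $dist if $dist.starts-with($name ~ ':');
        }
    }
    Nil
}

sub still-valid(%header, @repos) {
    # Fast path: nothing in the chain changed since precompilation.
    return True if chain-id(@repos) eq %header<chain-id>;
    # Slow path: re-resolve each recorded dependency and compare.
    for %header<deps>.kv -> $name, $dist {
        return False unless (resolve($name, @repos) // '') eq $dist;
    }
    True
}

my @chain  = ['Foo:ver<0.1>', 'Bar:ver<1.0>'], ['Baz:ver<2.3>'];
my %header = chain-id => chain-id(@chain), deps => { Foo => 'Foo:ver<0.1>' };

@chain[1].push: 'Quux:ver<0.2>';     # unrelated install: id changes, Foo resolves the same
say still-valid(%header, @chain);

@chain.unshift: ['Foo:ver<0.2>'];    # a repository with a preferred Foo is added up front
say still-valid(%header, @chain);
```

In this model the unrelated install keeps the precomp file valid, while the added repository invalidates it because `Foo` now resolves to a different distribution.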
As a side note: this re-checking of dependencies can actually be turned off via the RAKUDO_RERESOLVE_DEPENDENCIES=0 environment variable as a workaround for BEGIN time EVAL issues. That's because we EVAL the stringified CompUnit::DependencySpecification to be able to work with it. 10:48
MasterDuke lizmat: i made the $name eq change, but changing the 3 instances of CCLASS_ALPHANUMERIC + nqp::ord(95) to just CCLASS_WORD caused the nqp compile to die with `Use of undeclared variable '$diff-1' at line 321, near ") ;\n " at gen/moar/stage1/NQPHLL.nqp:1028 ...` 10:49
lizmat hmmm... that's weird
nine There are situations where the user knows that the dependencies are the same, like when packaging modules for Linux distros. In this case the dependencies are handled by a trustworthy mechanism outside rakudo.
MasterDuke pretty sure i didn't have a typo, but i'm going to double check now
lizmat then either the documentation is wrong, or something else is not properly understood 10:50
ggoebel nine: thank you for taking the time to explain. and thank you for your work on raku 10:51
nine There's another optimization we use that has visible side effects: module distributions are specified as immutable, i.e. Foo:ver<0.1> must always contain the exact same sources. The module loader relies on this. This means that we don't actually have to look at the source files at all. Instead we just take a SHA of the distribution's meta data. We could even get by with just the distro's long name 10:52
Oh, we actually do. The SHA identifying a distribution is generated by Distribution.id which only looks at name, auth, ver and api 10:54
MasterDuke lizmat: well, the docs have CCLASS_WORD also including 'Nd' 10:55
10:56 Altai-man_ joined
lizmat MasterDuke: so does ALPHANUMERIC 10:56
only ALPHABETIC doesn't have that
ggoebel I was hoping there would be performance gains to be had from self-contained programs that didn't have to perform checksums or file io before loading the precompiled bytecode. But it sounds like there might not be much.
MasterDuke oh, two of the places are ALPHABETIC, one is ALPHANUMERIC. i'll back out those two places
lizmat yeah, you should only do this optimization for ALPHANUMERIC, *not* ALPHABETIC 10:57
10:58 sena_kun left
MasterDuke yeah, now it's fine 10:58
nine I've literally spent hundreds of hours on getting the module loader as fast and well scaling as possible while still allowing distro packages to just put files into place without having to update some registry. 10:59
I seriously hope that I have not left any low hanging fruit. Though of course that would be fantastic :)
The upshot is that we're actually much faster than Perl and Python when loading large applications, as long as they are precompiled. 11:00
lizmat m: dd 1,2,3 ... * > 0 # this feels wrong 11:02
camelia (1,).Seq
nine Really what happens, when you load a precompiled module with a huge dependency tree is that we do a directory listing of all CompUnit::Repository::Installation's "dist" directories (no need to actually read those files) to generate the repo ids. We read all files in the appropriate "short" directory of those repositories. 11:03
lizmat m: dd 1,2,3 ... * > 5 # guess it's a consequence of this
camelia (1, 2, 3, 4, 5, 6).Seq
nine Then we already have the desired precomp file. This contains a list of exact file names of the whole dependency tree. So read the precomp file, get a full list of everything else you need to read + checksums of those files.
Then we read those dependencies which have their own checksum right there in the header, so we just need to read that line to be able to compare with our list of checksums. Then we continue to load the byte code. 11:04
I see a possibility for improving performance mostly by parallelizing this check of precomp files, which may be faster once that list is in the 100s of modules (like in one of our applications) 11:05
Also MoarVM verifies bytecode when loading a module. This may be run in parallel as well.
jnthn That's only half true 11:11
It verifies the bytecode lazily, on the first time a frame is hit
nine Oh, yes, there's that
jnthn So if you load a module with 100 frames but only call 5 of them, you'll only verify those 5. 11:12
nine We could actually add Yet Another Thread (tm) which verifies in the background ;)
jnthn Hm, but is it a win?
iirc we also lazily build some data structures in memory at the time we verify that relate to the frame 11:13
nine It may be but my gut also says that it won't be worth it for a long while. That C code is pretty fast. Lots more important places to improve
jnthn So if we do it speculatively on frames we don't need, we blow up our memory use
Even before considering we mmap the bytecode so we can demand-page it
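The lazy verification jnthn describes can be pictured with a toy model like the one below. It is only an illustration in Raku, not MoarVM's C code: the `Frame` class, its `verified` flag and the `invoke` method are hypothetical stand-ins.

```raku
# Hypothetical stand-in for a bytecode frame that is verified on first use.
class Frame {
    has Str  $.name;
    has Bool $.verified = False;

    method invoke() {
        unless $!verified {
            # Real verification (and building per-frame data structures)
            # would happen here, once, on the first call.
            $!verified = True;
        }
        "ran $!name"
    }
}

# Load 100 frames but only call 5 of them.
my @frames = (^100).map: { Frame.new(name => "frame-$_") };
.invoke for @frames.head(5);

say @frames.grep(*.verified).elems;   # 5
```

Verifying all frames eagerly in a background thread would flip the flag for all 100 frames, which is the memory cost jnthn is pointing at.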
11:45 pamplemousse joined 11:48 lichtkind_ left 11:57 MasterDuke left
pamplemousse nine: Do you know where that directory file lives? I was on the path to essentially duplicating the work for finding all used files, so finding and using that instead would be wonderful 12:07
nine pamplemousse: a CompUnit::PrecompilationUnit has a dependencies method that gives you the full list 12:16
AlexDaniel lizmat: it's not wrong
lizmat: unless I'm missing something 12:17
12:17 MasterDuke joined
lizmat AlexDaniel: yeah, just use ...^ if you don't want the last one 12:17
AlexDaniel lizmat: … includes the last element that matched, …^ doesn't
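A short side-by-side of the two operators discussed here, reusing the `* > 5` endpoint from the earlier evaluation:

```raku
# ... keeps the first element for which the endpoint matches.
say (1, 2, 3 ...  * > 5);   # (1 2 3 4 5 6)

# ...^ stops just before that element.
say (1, 2, 3 ...^ * > 5);   # (1 2 3 4 5)
```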
pamplemousse nine: Nifty, thanks
nine I'm not sure we actually have a public API that will give you a PrecompilationUnit though... 12:19
lizmat maybe we should :-)
nine Well, the precompilation store's load-unit method returns one 12:20
Geth nqp: e4db2f51f6 | (Daniel Green)++ | src/QRegex/Cursor.nqp
Some minor optimizations to NQP matching

Nowadays checking string equality is faster than nqp::ord, so get rid of an old optimization.
Also, CCLASS_WORD is the same as CCLASS_ALPHANUMERIC + '_', so just use that.
lizmat MasterDuke: any visible improvements ? 12:45
MasterDuke no, but i wasn't doing really extensive benchmarking
12:51 ggoebel_ joined 12:54 ggoebel left 12:57 sena_kun joined 12:58 Altai-man_ left 13:07 pamplemousse left 13:09 pamplemousse joined 13:59 ggoebel__ joined 14:01 ggoebel_ left 14:06 robertle joined 14:28 ggoebel_ joined 14:30 ggoebel__ left
[Tux] Rakudo version 2020.02.1-357-g5610416c8 - MoarVM version 2020.02.1-153-g9bb7a1850
csv-ip5xs         0.946 - 0.949
csv-ip5xs-20     10.243 - 10.394
csv-parser       26.668 - 27.582
csv-test-xs-20    0.379 - 0.392
test              8.278 - 8.620
test-t            2.826 - 2.938
test-t --race     1.016 - 1.032
test-t-20        49.179 - 49.798
test-t-20 --race 13.982 - 14.084
14:43 robertle left 14:56 Altai-man_ joined 14:58 sena_kun left 16:00 hungrydonkey left 16:03 ggoebel__ joined 16:06 ggoebel_ left 16:20 ggoebel joined, ggoebel__ left 16:50 cognomin_ joined 16:53 ggoebel left 16:54 cognominal left 16:57 sena_kun joined 16:59 Altai-man_ left 17:13 maggotbrain left 17:32 cognominal joined 17:36 cognomin_ left 17:51 ggoebel joined 18:02 ggoebel_ joined 18:04 ggoebel left 18:56 Altai-man_ joined 18:58 sena_kun left
MasterDuke i'm not really sure what TimToady meant here colabti.org/irclogger/irclogger_lo...-10-04#l14 "caching the first decision based on initial character would cut down the number of initial states we have to examine, for instance" 19:36
so i can't tell, is it still relevant? 19:37
lizmat I guess in the case of: 19:49
foo | bar | baz 19:50
caching the first character somewhere would only need to examine "foo" if the first is an "f", and "bar" and "baz" if the first is a "b", and none if something else ?
but whether that is still relevant? I'm not sure 19:51
not a lot has changed in that code for the past 5 years afaik
but perhaps current speshing / inlining / JITting has made these caches more of a burden than of a boon 19:52
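lizmat's reading of the "cache the first decision based on initial character" idea could be sketched like this. It only illustrates the lookup for `foo | bar | baz`; it is not how NQP's NFA code works, and `%by-first` and `candidates` are hypothetical names:

```raku
# Group the alternatives by their first character once, up front.
my @alternatives = <foo bar baz>;
my %by-first = @alternatives.classify(*.substr(0, 1));

# Only the alternatives sharing the input's first character need examining.
sub candidates(Str $input) {
    %by-first{$input.substr(0, 1)} // ()
}

say candidates("food");   # only "foo" needs checking
say candidates("bat");    # "bar" and "baz"
say candidates("quux");   # nothing to examine
```

Whether such a cache still pays off with current spesh, inlining and JIT is exactly the open question raised above.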
20:04 MasterDuke left 20:12 MasterDuke joined 20:13 ggoebel__ joined 20:16 ggoebel_ left
lizmat m: dd ("c","c","c" ... *).head(10) 20:48
camelia ("c", "c", "c", "c", "c", "c", "c", "c", "c", "c").Seq
lizmat m: dd ("c","c","c" ... "g").head(10)
camelia ("c", "c", "c", "d", "e", "f", "g").Seq
lizmat m: dd (1,1,1 ... *).head(10)
camelia (1, 1, 1, 1, 1, 1, 1, 1, 1, 1).Seq
lizmat m: dd (1,1,1 ... 10).head(10)
camelia ().Seq
lizmat these all feel very inconsistent
m: dd (1,1,1 ... 1).head(10) 20:49
camelia (1,).Seq
lizmat m: dd ("c","c","c" ... "c").head(10)
camelia ("c",).Seq
lizmat at least these two last ones are consistent 20:50
20:56 pamplemousse left 20:57 sena_kun joined 20:58 Altai-man_ left
AlexDaniel lizmat: which is why I say that we should define some easy to understand behavior first 21:10
instead of trying to DWIM in all cases in which case nobody has a clue what it actually does 21:11
sourceable6: <a b c>.kv()
sourceable6 AlexDaniel, github.com/rakudo/rakudo/blob/5610...t.pm6#L746
22:01 sena_kun left