#raku-dev on 29 April 2020 - Raku Programming Language Log

01:08 maggotbrain joined

Geth

¦ nqp-configure: vrurg self-assigned Cannot determine the brand of your nmake utility github.com/Raku/nqp-configure/issues/19

02:20

06:57 hungryd12 joined 07:00 hungrydonkey left 07:33 lichtkind joined 08:01 hungrydonkey joined 08:03 hungryd12 left

AlexDaniel

Altai-man: just noticed that 2020.02.1 release announcement talks a bit too much about Perl 6

08:04

tellable6

AlexDaniel, I'll pass your message to Altai-man_

08:10 hungryd57 joined 08:12 hungryd57 left 08:13 hungrydonkey left, hungrydonkey joined 08:14 AlexDaniel left 08:27 AlexDaniel joined, AlexDaniel left, AlexDaniel joined 08:32 Kaeipi left 08:33 Kaiepi joined

lizmat

Files=1306, Tests=111242, 215 wallclock secs (29.11 usr 8.47 sys + 3017.92 cusr 278.84 csys = 3334.34 CPU)

08:46

09:26 squashable6 left 09:27 squashable6 joined 09:30 sena_kun joined

lizmat

MasterDuke: funny piece of code, that, using a native str as an index in a list

09:35

also, "if nqp::ord($name) == 36 && ($name eq '$!from' || $name eq '$!to') {" feels like a premature optimization nowadays ?

MasterDuke

well, the native str would be one of the ascii digits at that point `if nqp::ord($name) < 58`, but yeah

09:41

lizmat

m: use nqp; my int $a; my str $name = q/$bar/; for ^1000000 { ++$a if nqp::ord($name) == 36 && ($name eq q/$foo/ || $name eq q/$bar/) }; say $a; say now - INIT now

camelia

1000000
0.2328733

lizmat

m: use nqp; my int $a; my str $name = q/$bar/; for ^1000000 { ++$a if $name eq q/$foo/ || $name eq q/$bar/ }; say $a; say now - INIT now

camelia

1000000
0.08986747

lizmat

that's about 2.5x faster

I don't know how hot that loop is, but that would appear to be an easy win

09:42

MasterDuke

that whole MATCH function is expensive and called a lot when parsing, so could be good

lizmat

I think that piece of code is a leftover from the Parrot days

09:43

when string matching was relatively expensive, and ord was relatively cheap

now that we have proper grapheme support, I think it's the other way around now

MasterDuke: also another win: I see several places checking for CCLASS_ALPHANUMERIC and then checking for underscore: nqp::ord(95)

09:46

there's actually a CCLASS_WORD nowadays that combines them

so the extra check for nqp::ord(95) can then be removed

09:47

09:57 ggoebel joined

MasterDuke

nice

10:02

ggoebel

.tell nine we may need to treat compiled scripts differently that precompiled modules because the end goal of the gsoc project was deliverable was to "Add a --compile option to perl6 which will generate a foo executable PE or ELF formatted binary for a given foo.pl6 user program which will facilitate self-contained deployment and/or deployment dependent upon the system shared version of Perl 6 which is installed."

10:05

tellable6

ggoebel, I'll pass your message to nine

nine

ggoebel: I don't read anything in that which would warrant different treatment. After all the self-contained program will want to use modules as well

10:07

ggoebel

the project summary references the .net single file distribution... but the link is broken. I think I've found it here: github.com/dotnet/designs/blob/mas.../design.md

10:12

nine

If I were to implement self-contained executables, I'd implement a new CompUnit::Repository and a new CompUnit::PrecompilationStore and fix the few remaining places in CompUnit::PrecompilationRepository where we still assume that we're dealing with files.

10:14

ggoebel

that was the fuzzy direction I thought things might need to go. but not being intimately familiar with everything, I was hesitant to suggest it.

10:15

where the CU:PrecompilationStore would allow you to bundle the module dependencies within the self-contained executable and load them directly into memory from there

10:17

here is the link to the gsoc project summary: yakshavingcream.blogspot.com/2019/...eview.html

10:18 lichtkind_ joined 10:20 lichtkind left

ggoebel

the self-contained program may at one extreme end of the spectrum want to bundle _everything_ into it. So that you have a stand-alone relocatable binary with no dependency on raku being installed on the system

10:20

nine

Precompiled scripts sound rather orthogonal to self-contained programs to me anyway

10:25

ggoebel

yes Perl5's PAR provides self-contained executables without precompilation

10:27

nine

As do you apparently :) "And with that, I had a way to embed Perl 6 source code into an ELF file, and run it! It was time to move onto the next steps, figuring out how to run MBC directly, and adding the --compile flag to perl6."

10:31

ggoebel

I imagine you might be able to load the precompiled bytecode in a self-contained program without performing whatever tests are required to decide when to recompile things?

10:32

FWIW: I am related to, but am not pamplemouse (the gsoc implementer)

10:34

nine

A couple years ago I realized that we're actually already at a point very close to being able to ship binary only distributions. In a normal program run we never touch the installed modules' source files.

Ah, that clears that confusion :)

I thought you have just changed nick or so

ggoebel

I encouraged them to get involved with raku via the gsoc project, but it is their work not mine.

10:35

is there much overhead in deciding when things need to be re-precompiled? I've never really understood how that works.

10:36

nine

The overhead is reasonably small. It's all based on "trust the SHA".

10:37

Once you understand the problem, the solution should be rather straight forward to understand.

10:38

ggoebel

so a checksum for each precompiled module?

I guess I should go read the code...

10:39

nine

To decide whether a precomp file is up to date, we need to know whether it still reflects the sources or not and whether its dependencies are in the exact same state as when we precompiled

Therefore precomp files can become invalid when you change the source (of course) or when the result of resolving the dependency specifications would change from the time we precompiled to the time we loaded.

10:40

The latter may change simply because one installed a newer version of a dependency (the classic case). It can also change if a new repository was added to the $*REPO chain and this repository contains a different version that would be preferred. I.e. raku -I/some/dir

10:41

So to a first approximation, a precomp file is safe if all repositories have the exact same state as when we precompiled. I.e. nothing got installed or uninstalled and the sources are identical. This is what I implemented first.

10:42

ggoebel

Does this mean we perform a checksum on every source file before using the associated precompiled bytecode? I appreciate you taking the time to answer my questions, but I don't want to take your time if there is a design doc or I should be figuring it out for myself by reading the code.

10:43

nine

For this each repository generates an identifier (id) which reflects the full contents of the repository. Think a SHA hash of all installed modules. We record this hash in the precomp file and can compare when loading

Well, this may actually become the new "general overview" documentation :)

This first implementation works, but of course it has a major drawback: whenever we install any module, all precomp files will be invalidated. So we need to be a bit smarter.

10:44

When we detect that there was a change in the repository chain's identity - either by a repository getting added or removed or by a change in its contents, i.e. a module getting installed - we re-resolve the precomp file's dependencies.

10:45

ggoebel

so when the repo id SHA checksum fails, all the precomp files get rewritten. I think I'm with you.

10:47

nine

So we take the dependency specification as given in the use statement (use Foo:auth/me/:ver(v1.*)) and check what precomp file we'd end up with. If it's the same, everything is fine. To do this we actually record this CompUnit::DependencySpecification in the precomp file's header.

As a side node: this re-rechecking of dependencies can actually be turned off via the RAKUDO_RERESOLVE_DEPENDENCIES=0 environment variable as a workaround for BEGIN time EVAL issues. That's because we EVAL the stringified CompUnit::DependencySpecification to be able to work with it.

10:48

MasterDuke

lizmat: i made the $name eq change, but changing the 3 instances of CCLASS_ALPHANUMERIC + nqp::ord(95) to just CCLASS_WORD caused the nqp compile to die with `Use of undeclared variable '$diff-1' at line 321, near ") ;\n " at gen/moar/stage1/NQPHLL.nqp:1028 ...`

10:49

lizmat

hmmm... that's weird

nine

There are situations where the user knows that the dependencies are the same, like when packaging modules for Linux distros. In this case the dependencies are handled by a trustworthy mechanism outside rakudo.

MasterDuke

pretty sure i didn't have a typo, but i'm going to double check now

lizmat

then either the documentation is wrong, or something else is not properly understood

10:50

ggoebel

nine: thank you for taking the time to explain. and thank you for your work on raku

10:51

nine

There's another optimization we use that has visible side effects: module distributions are specified as immutable, i.e. Foo:ver<0.1> must always contain the exact same sources. The module loader relies on this. This means that we don't actually have to look at the source files at all. Instead we just take a SHA of the distribution's meta data. We could even get by with just the distro's long name

10:52

Oh, we actually do. The SHA identifying a distribution is generated by Distribution.id which only looks at name, auth, ver and api

10:54

MasterDuke

lizmat: well, the docs have CCLASS_WORD also including 'Nd'

10:55

10:56 Altai-man_ joined

lizmat

MasterDuke: so does ALPHANUMERIC

10:56

only ALPHABETIC doesn't have that

ggoebel

I was hoping there would be performance gains to be had from self-contained programs that didn't have to perform checksums or file io before loading the precompiled bytecode. But it sounds like there might not be much.

MasterDuke

oh, two of the places are ALPHABETIC, one is ALPHANUMERIC. i'll back out those two places

lizmat

yeah, you should only do this optimization for ALPHANUMERIC, *not* ALPHABETIC

10:57

10:58 sena_kun left

MasterDuke

yeah, now it's fine

10:58

nine

I've literally spent hundreds of hours on getting the module loader as fast and well scaling as possible while still allowing distro packages to just put files into place whithout having to update some registry.

10:59

I seriously hope that I have not left any low hanging fruit. Though of course that would be fantastic :)

The upshot is that we're actually much faster than Perl and Python when loading large applications, as long as they are precompiled.

11:00

lizmat

m: dd 1,2,3 ... * > 0 # this feels wrong

11:02

camelia

(1,).Seq

nine

Really what happens, when you load a precompiled module with a huge dependency tree is that we do a directory listing of all CompUnit::Repository::Installation's "dist" directories (no need to actually read those files) to generate the repo ids. We read all files in the appropriate "short" directory of those repositories.

11:03

lizmat

m: dd 1,2,3 ... * > 5 # guess it's a consequence of this

camelia

(1, 2, 3, 4, 5, 6).Seq

nine

Then we already have the desired precomp file. This contains a list of exact file names of the whole dependency tree. So read the precomp file, get a full list of everything else you need to read + checksums of those files.

Then we read those dependencies which have their own checksum right there in the header, so we just need to read that line to be able to compare with our list of checksums. Then we continue to load the byte code.

11:04

I see a possibility for improving performance mostly be parallelizing this check of precomp files which may be faster once that list is in the 100s of modules (like in one of our applications)

11:05

Also MoarVM verifies bytecode when loading a module. This may be run in parallel as well.

jnthn

That's only half true

11:11

It verifies the bytecode lazily, on the first time a frame is hit

nine

Oh, yes, there's that

jnthn

So if you load a module with 100 frames but only call 5 of them, you'll only verify those 5.

11:12

nine

We could actually add Yet Another Thread (tm) which verifies in the background ;)

jnthn

Hm, but is it a win?

iirc we also lazily build some data structures in memory at the time we verify that relate to the frame

11:13

nine

It may be but my gut also says that it won't be worth it for a long while. That C code is pretty fast. Lots more important places to improve

jnthn

So if we do it speculatively on frames we don't need, we blow up our memory use

Even before considering we mmap the bytecode so we can demand-page it

11:45 pamplemousse joined 11:48 lichtkind_ left 11:57 MasterDuke left

pamplemousse

nine: Do you know where that directory file lives? I was on the path of to essentially duplicating the work for finding all used files, so finding and using that instead would be wonderful

12:07

nine

pamplemousse: a CompUnit::PrecompilationUnit has a dependencies method that gives you the full list

12:16

AlexDaniel

lizmat: it's not wrong

lizmat: unless I'm missing something

12:17

12:17 MasterDuke joined

lizmat

AlexDaniel:, yeah, just use ...^ if you don't want the last one

12:17

AlexDaniel

lizmat: … includes the last element that matched, …^ doesn't

pamplemousse

nine: Nifty, thanks

nine

I'm not sure we actually have a public API that will give you a PrecompilationUnit though...

12:19

lizmat

maybe we should :-)

nine

Well, the precompilation store's load-unit method returns one

12:20

Geth

nqp: e4db2f51f6 | (Daniel Green)++ | src/QRegex/Cursor.nqp
Some minor optimizations to NQP matching

Nowadays checking string equality is faster than nqp::ord, so get rid of an old optimization.
Also, CCLASS_WORD is the same was CCLASS_ALPHANUMERIC + '_', so just use that.

12:42

lizmat

MasterDuke: any visible improvements ?

12:45

MasterDuke

no, but i wasn't doing really extensive benchmarking

12:51 ggoebel_ joined 12:54 ggoebel left 12:57 sena_kun joined 12:58 Altai-man_ left 13:07 pamplemousse left 13:09 pamplemousse joined 13:59 ggoebel__ joined 14:01 ggoebel_ left 14:06 robertle joined 14:28 ggoebel_ joined 14:30 ggoebel__ left

[Tux]

Rakudo version 2020.02.1-357-g5610416c8 - MoarVM version 2020.02.1-153-g9bb7a1850

csv-ip5xs	0.946	-	0.949
csv-ip5xs-20	10.243	-	10.394
csv-parser	26.668	-	27.582
csv-test-xs-20	0.379	-	0.392
test	8.278	-	8.620
test-t	2.826	-	2.938
test-t --race	1.016	-	1.032
test-t-20	49.179	-	49.798
test-t-20 --race	13.982	-	14.084

14:39

14:43 robertle left 14:56 Altai-man_ joined 14:58 sena_kun left 16:00 hungrydonkey left 16:03 ggoebel__ joined 16:06 ggoebel_ left 16:20 ggoebel joined, ggoebel__ left 16:50 cognomin_ joined 16:53 ggoebel left 16:54 cognominal left 16:57 sena_kun joined 16:59 Altai-man_ left 17:13 maggotbrain left 17:32 cognominal joined 17:36 cognomin_ left 17:51 ggoebel joined 18:02 ggoebel_ joined 18:04 ggoebel left 18:56 Altai-man_ joined 18:58 sena_kun left

MasterDuke

i'm not really sure what TimToady meant here colabti.org/irclogger/irclogger_lo...-10-04#l14 "caching the first decision based on initial character would cut down the number of initial states we have to examine, for instance"

19:36

so i can't tell, is it still relevant?

19:37

lizmat

I guess in the case of:

19:49

foo | bar | baz

19:50

caching the first character somewhere would only need to examine "foo" if the first is an "f", and "bar" and "baz" if the first is a "b", and none if something else ?

but wheyer

19:51

but whether that is still relevant? I'm not sure

not a lot has changed in that code for the past 5 years afaik

but perhaps current speshing / inlining / JITting has made these caches more of a burden than of a boon

19:52

20:04 MasterDuke left 20:12 MasterDuke joined 20:13 ggoebel__ joined 20:16 ggoebel_ left

lizmat

m: dd ("c","c","c" ... *).head(10)

20:48

camelia

("c", "c", "c", "c", "c", "c", "c", "c", "c", "c").Seq

lizmat

m: dd ("c","c","c" ... "g").head(10)

camelia

("c", "c", "c", "d", "e", "f", "g").Seq

lizmat

m: dd (1,1,1 ... *).head(10)

camelia

(1, 1, 1, 1, 1, 1, 1, 1, 1, 1).Seq

lizmat

m: dd (1,1,1 ... 10).head(10)

camelia

().Seq

lizmat

these all feel very inconsistent

m: dd (1,1,1 ... 1).head(10)

20:49

camelia

(1,).Seq

lizmat

m: dd ("c","c","c" ... "c").head(10)

camelia

("c",).Seq

lizmat

at least these two last ones are consistent

20:50

20:56 pamplemousse left 20:57 sena_kun joined 20:58 Altai-man_ left

AlexDaniel

lizmat: which is why I say that we should define some easy to understand behavior first

21:10

instead of trying to DWIM in all cases in which case nobody has a clue what it actually does

21:11

sourceable6: <a b c>.kv()

sourceable6

AlexDaniel, github.com/rakudo/rakudo/blob/5610...t.pm6#L746

22:01 sena_kun left

Powered by Cro and the Raku® Programming Language.

Please report any issues / comments / feature requests as an issue on App::Raku::Log.

Thank you!