Welcome to the main channel on the development of MoarVM, a virtual machine for NQP and Rakudo (moarvm.org). This channel is being logged for historical purposes. Set by lizmat on 24 May 2021. |
|||
00:06
reportable6 left
01:09
reportable6 joined
|
|||
timo | japhb: but isn't over-conservative the opposite of what's happening here? | 02:21 | |
02:28
frost joined
|
|||
japhb | timo: Ugh, English. :-/ There are two ways to interpret "conservative" here: 1. Assume only provably wrong code is wrong, conservatively assuming the warning checker is lacking sufficient nuance and would otherwise over-warn; 2. Assume only provably correct code is correct, and conservatively warn of ambiguity any time the code roughly matches the warning situation. | 02:45 | |
I had meant it in the second sense. | |||
timo | haha | 02:52 | |
it's a very inflammable language | |||
moon-child | let us burn it, then, and start fresh | 03:06 | |
04:41
squashable6 left,
notable6 left,
linkable6 left,
sourceable6 left,
benchable6 left,
quotable6 left,
releasable6 left,
coverable6 left,
shareable6 left,
tellable6 left,
bisectable6 left,
bloatable6 left,
nativecallable6 left,
unicodable6 left,
statisfiable6 left,
committable6 left,
evalable6 left,
reportable6 left,
greppable6 left,
sourceable6 joined,
benchable6 joined
04:42
bloatable6 joined,
quotable6 joined,
evalable6 joined,
reportable6 joined
04:43
releasable6 joined
05:41
notable6 joined
05:42
committable6 joined
05:43
coverable6 joined,
greppable6 joined,
linkable6 joined
05:44
nativecallable6 joined,
statisfiable6 joined,
tellable6 joined
06:08
reportable6 left
07:08
linkable6 left,
evalable6 left
07:09
evalable6 joined
07:10
reportable6 joined
08:42
bisectable6 joined
08:43
unicodable6 joined,
shareable6 joined
|
|||
Nicholas | ah OK, if "burn it, then, and start fresh" is the plan, apparently we now go with: bona universala saluttempo #moarvm | 08:44 | |
but apparently Esperanto is too irregular for some, so maybe it should be xamgu munje prami tcika | 08:48 | ||
(Google translate can't do Lojban, so I have no idea if those make any real sense) | 08:49 | ||
09:10
linkable6 joined
|
|||
MasterDuke | pretty sure timo knows lojban | 09:34 | |
timo | aye | 09:50 | |
esperanto is in a way "the php of conlangs" :P | |||
i once made someone *very* angry | |||
MasterDuke | gist.github.com/MasterDuke17/ea694...8f446cf4ce is what i have so far for mimalloc. still in the situation where rakudo builds, but linking the runner complains `/usr/bin/ld: inst-rakudo.o: in function 'MVM_malloc': /home/dan/Source/perl6/install/include/moar/core/alloc.h:2: undefined reference to 'mi_malloc'` | 09:58 | |
adding -lmimalloc makes the rakudo linking complete, but then ldd shows inst-rakudo depending on /usr/lib/libmimalloc.so.2.0 instead of the statically included libmimalloc.a i built+used for moarvm | 10:31 | ||
timo | do you have like output from make that shows all the commands it issues? like no-silent-build or what it's called? | 10:41 | |
MasterDuke | this step succeeds: gcc -c -DSTATIC_EXEC_PATH='/home/dan/Source/perl6/install/bin/rakudo' -DSTATIC_NQP_HOME='/home/dan/Source/perl6/install/share/nqp' -DSTATIC_RAKUDO_HOME='/home/dan/Source/perl6/install/share/perl6' -std=gnu99 -Wextra -Wall -Wno-unused-parameter -Wno-unused-function -Wno-missing-braces -Werror=pointer-arith -Werror=vla -O3 | 10:54 | |
-DNDEBUG -g3 -D_REENTRANT -D_FILE_OFFSET_BITS=64 -fPIC -DDEBUG_HELPERS -DMVM_DTRACE_SUPPORT -DHAVE_TELEMEH -DMVM_HEAPSNAPSHOT_FORMAT=3 -march=native -D_GNU_SOURCE -I'/home/dan/Source/perl6/install/include' -I'/home/dan/Source/perl6/install/include/moar' -I'/home/dan/Source/perl6/install/include/libuv' | |||
-I'/home/dan/Source/perl6/install/include/libatomic_ops' -I'/home/dan/Source/perl6/install/include/libtommath' -I'/home/dan/Source/perl6/install/include/dyncall' -I'/home/dan/Source/perl6/install/include/mimalloc' -o inst-rakudo.o src/vm/moar/runner/main.c | |||
this next step fails: gcc -o inst-rakudo -O3 -DNDEBUG -g3 -Wl,-rpath,"//home/dan/Source/perl6/install/lib" inst-rakudo.o -L"/home/dan/Source/perl6/install/lib" -lmoar | |||
but mostly afk for a bit | 10:55 | ||
11:14
sena_kun joined
|
|||
timo | ok would be interested to see how libmoar.so or whatever is built i guess? | 12:00 | |
12:07
reportable6 left,
reportable6 joined
|
|||
MasterDuke | timo: gist.github.com/MasterDuke17/ea694...-moar-make | 12:10 | |
timo | ok, i wonder if nm would show the mimalloc functions in libmoar.so or maybe some other file is responsible for handling -l like a .la or so? it's been a while since i had to dig this deep haha | 12:14 | |
oh, could it possibly have something to do with -rpath and friends? | 12:16 | ||
oh it could also be that src/vm/moar/runner/main.c simply doesn't use malloc at all? | 12:17 | ||
it does tho. just directly malloc, not MVM_malloc | |||
hum. and putting blah.a in a gcc linker commandline should be roughly equivalent to putting all the .o files that were made to build that .a file i think? | 12:21 | ||
12:41
squashable6 joined
|
|||
MasterDuke | libmoar.so has a whole bunch of mi_* symbols | 14:24 | |
nine | including mi_malloc? | 14:40 | |
MasterDuke | yep | 14:47 | |
15:02
Kaiepi joined
|
|||
MasterDuke | i don't quite understand why the changes in github.com/MoarVM/MoarVM/pull/1402/files work, but the ones i've done for mimalloc don't (even though i tried to copy them from that gmp pr) | 15:18 | |
and just to be sure i double checked with ldd and the libmoar from the gmp branch is *not* referencing a system gmp | 15:20 | ||
and neither is the inst-rakudo | |||
15:26
squashable6 left
16:29
squashable6 joined
17:56
sena_kun left
18:07
reportable6 left
18:09
reportable6 joined
20:35
linkable6 left,
evalable6 left
20:36
linkable6 joined
20:37
evalable6 joined
|
|||
dogbert17 | can anything useful be gleaned from the following gist? gist.github.com/dogbert17/6e366928...391550f08f | 20:55 | |
Nicholas | GC panics are not anything I know how to diagnose | 20:59 | |
timo | but what is the panic message? | 21:00 | |
Nicholas | (I read it. You are not alone, but I can't help) | ||
timo | certainly we're not panicking with just a null pointer instead of a bit of text? | ||
dogbert17 | timo: MoarVM panic: Invalid owner in item added to GC worklist | 21:15 | |
timo | do you perhaps have that caught in rr? then you could watch and reverse-continue to see what writes the bogus value, which is probably "free uncopied" or some allocation function from the gc, and then further back to see what the original object was | 21:24 | |
dogbert17 | :-) | 21:30 | |
nine | rr really is a game changer for these kinds of issues | 21:32 | |
dogbert17 | if you can run it | 21:33 | |
nine | You can't? | ||
dogbert17 | it's not supported under Virtualbox as far as I can remember | 21:34 | |
perhaps I can entice one of you to take a look instead :) | 21:36 | ||
this is another one of where I've set MVM_GC_DEBUG=3 | 21:37 | ||
s/of// | 21:38 | ||
MasterDuke | timo: any interest in turning zstd into a submodule? maybe if you do that i can see what you did and figure out my mimalloc problem | ||
timo: another project idea. a new profile type of sqlite, where instead of storing all the data in memory and then writing out a sql file, directly use the sqlite api to create a database and do inserts to it on the fly | 21:50 | ||
timo | i've shied away from that, since nativecall from nqp seems just a little .. interesting :) | 22:35 | |
MasterDuke | why nativecall instead of just implementing in moarvm? it could be a backend-specific option | 22:51 | |
i.e., make sqlite a new 3rdparty module | |||
timo | a little odd maybe. we'd definitely want to load sqlite3 only when it's actually used | 22:52 | |
also, some of the data we'll still have to keep around .. but i guess not that much | |||
MasterDuke | i was thinking have moarvm do the inserts at profile collection time, not all at once at the end | ||
japhb | And those inserts would need to be batched -- sqlite (like many SQL engines) does not perform well with a great many tiny inserts. | 22:53 | |
MasterDuke | sure, do it at gc runs or something like that | ||
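(Editorial sketch, not from the channel.) The batching point can be illustrated with Python's standard `sqlite3` module: rows are buffered and written with one `executemany` call per batch inside a single transaction, rather than one commit per row. The table name `calls`, its columns, and the helper names `write_samples` and `_flush` are hypothetical and are not part of MoarVM or its profiler output format.

```python
import sqlite3


def _flush(conn, batch):
    """Write the buffered rows in one transaction and empty the buffer."""
    if not batch:
        return 0
    with conn:  # commits once for the whole batch, rolls back on error
        conn.executemany("INSERT INTO calls VALUES (?, ?)", batch)
    written = len(batch)
    batch.clear()
    return written


def write_samples(conn, samples, batch_size=1000):
    """Insert (routine, inclusive_us) rows in batches of batch_size."""
    # Hypothetical schema, used only for this illustration.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS calls (routine TEXT, inclusive_us INTEGER)"
    )
    batch = []
    written = 0
    for row in samples:
        batch.append(row)
        if len(batch) >= batch_size:
            written += _flush(conn, batch)
    written += _flush(conn, batch)  # flush the final partial batch
    return written
```

In this sketch a flush could be triggered at whatever point the profiler already hands over data (for example after a GC run, as suggested above), though the later remarks about stop-the-world pause length would still apply.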
timo | at gc runs is when we do the collecting already anyway | 22:54 | |
wait | |||
i was thinking heap snapshots the whole time | |||
MasterDuke | could do those too, but i'm thinking of regular profiles | 22:55 | |
japhb | MasterDuke: I can see snapshotting the list at GC time, but not actually doing the SQL writes during stop-the-world; making GC pauses longer would make profiling hard to use for some usecases. | 22:56 | |
timo | kicking off the writing to sql on a separate thread can also very well interfere with performance measurement | 22:57 | |
japhb | True .... | ||
No winning | |||
GAH, our GC pause time drives me nuts but I haven't the mental cycles available to shrink that STW time. | 22:59 | ||
timo | got any ideas? :) | ||
japhb | I mean, the old school solution was to mark during STW, and sweep concurrently, and then optimize the heck out of the mark phase ... but we don't use a mark-and-sweep GC and never have. | 23:01 | |
We could get less exact about finding garbage, by doing something like card/page marking, in order to do all GC marking writes in the CPU cache. | |||
That would be way less space-efficient though. | 23:02 | ||
timo | haven't heard that yet | 23:03 | |
japhb | We could use processor virtual memory tricks, but those aren't super easily portable I think. (Though I'm sure someone has figured out how to fake some of it with mmap and mprotect shenanigans) | ||
timo: Which, card marking? | |||
timo | yeah, though page marking also not sure | 23:05 | |
japhb | IIRC (and it's been a while since I read The Garbage Collection Handbook), the idea is that instead of marking individual objects, you mark by memory blocks, saying "This (1K or 4K or 64K or whatever) block contains at least one live object". And then instead of marking on the data structures themselves, you mark on a bitmap that is mapped back to your whole heap, on a 1- or 2-bits-per-block basis. So now | 23:09 | |
all the marking writes happen in that bitmap, which is compact and probably fits in cache. Separate the heaps (and the mark bitmaps) by CPU core, and suddenly you have no-contention, very compact mark phases. | |||
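(Editorial sketch, not from the channel.) A minimal model of the block-level marking japhb describes, assuming one bit per fixed-size block and a flat heap addressed from 0. The class name `BlockMarkBitmap` and the 4096-byte block size are illustrative assumptions, not MoarVM's GC design.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes; 1K or 64K would work the same way


class BlockMarkBitmap:
    """One mark bit per heap block, kept in a compact side bitmap."""

    def __init__(self, heap_size, block_size=BLOCK_SIZE):
        self.block_size = block_size
        # Round up so a partial trailing block still gets a bit.
        self.num_blocks = (heap_size + block_size - 1) // block_size
        self.bits = bytearray((self.num_blocks + 7) // 8)

    def mark(self, address):
        """Record that the block containing address holds a live object."""
        block = address // self.block_size
        self.bits[block >> 3] |= 1 << (block & 7)

    def is_marked(self, block):
        return bool(self.bits[block >> 3] & (1 << (block & 7)))

    def free_blocks(self):
        """Blocks with no live object; only these can be reclaimed whole."""
        return [b for b in range(self.num_blocks) if not self.is_marked(b)]
```

All mark writes land in `bits`, which for a 64 KiB heap with 4 KiB blocks is only two bytes. The cost is that a single live object keeps its entire block marked, so the reclaimable space is limited to fully dead blocks.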
BUT: It's very easy to end up with memory that looks like Havarti cheese. | |||
And we've generally biased in favor of keeping main memory more compact, rather than GC overhead memory. | 23:10 | ||
TGCH has a few improvements on that, I believe -- like ability to skip clearing the mark bitmap ever, by redefining the bit patterns at each GC instead. | 23:11 | ||
timo | ah, hehe | 23:12 | |
like with three colored approaches or so | |||
japhb | Right | ||
timo | i wonder how we could win something for stuff like "out of these last 100 objects, fifty were pointers to the Int type object" and making that faster | 23:16 | |
but that's kind of already a thing the cpu cache can do | |||
japhb | Well, I suppose we could put all type objects, HOWs and REPRs in a single pool which we consider "always live, never recurse" and special case that memory block in the GC | 23:26 | |
Or at least the HOWs and REPRs. | 23:27 | ||
(Because I can see someone try to DOS that pool by constantly making trivially new types) | |||
moon-child | probably not 'always live', but assume it's live except for some rare major gc | 23:28 | |
japhb | moon-child: You're thinking throughput GC though. I see "major gc" and see "sudden latency spike". | 23:30 | |
moon-child | when do we care about latency? | 23:31 | |
web applications? Do it during downtime | |||
video games? Do it on a loading screen | |||
there was some bank application that GCed once per day (or was it once per week?) | 23:32 | ||
japhb | Although I would 100% love the ability to say "Please do a major GC right *now*, and not otherwise." | ||
moon-child: Because right now we can't actually control when a major GC happens as opposed to a minor one. We can only request that *a* GC happen. | |||
moon-child | sure. It doesn't seem particularly hard to add | 23:33 | |
23:33
evalable6 left,
linkable6 left
|
|||
moon-child | compared with implementing the mechanism | 23:33 | |
japhb | For web apps, thread recycling after NN requests is certainly a reasonable thing, when you have almost all varying memory in per-thread pools | ||
timo | nah, making new types is common enough so that we want old types to be easily reclaimable | 23:34 | |
japhb | For video games, it can be hard because there can be a LOT of frames and corresponding network traffic before the next loading screen. | ||
moon-child | how many types are you creating during those frames? | ||
japhb | I think we're mixing two threads of conversation here. | 23:35 | |
timo | you're suggesting to make that a user-configurable thing at runtime? | ||
japhb | I'm not creating tons of types during gameplay. But MoarVM's GC has to support all the use cases. | ||
timo: There's precedent. Before it got really good, it was apparently not unusual to select the appropriately tuned GC for each JVM process launch | 23:36 | ||
timo | have a good spot to run gc like at the start of "the main loop" when all the variables from the last go around the loop have gone out of scope and barely any new ones have been filled yet | ||
yeah | 23:37 | ||
moon-child | sure. All I'm saying is that, given a choice between: 1) certain objects are always assumed to be live; and 2) certain objects are assumed to be alive _except_ for the purpose of some major gc which is going to take a long time to run anyway, I prefer option 2 | ||
timo | we already have the inter-generational set which kinda does what number 2 is | ||
moon-child | timo: related fun thing would be to tune nursery size to the amount you allocate over the course of a frame | ||
japhb | Actually, I can see an option 3: Be able to request a GC of *only* a single pool, and rotate which pool gets GC'd. | 23:38 | |
moon-child: nodnod | |||
MasterDuke | re kicking off a separate thread impacting perf measurements and not wanting to make gc pauses longer, i'm by no means certain this idea would work at all, but i wonder if it might make sense as sort of an alternate profiler (like the discussion about having alternate gcs). one that trades something (e.g., performance, latency, whatever) for | 23:49 | |
reduced memory use, so we can profile something like the rakudo compilation without requiring a 64g swapfile in addition to my 32g ram | |||
off to bed, but interesting discussion about gcs. i feel 20 years ago we wouldn't have had so many good examples to steal from, but .net and jvm have done some really cool stuff in the recent past | 23:51 | ||
timo | right, the profiler's main data usage is probably from the call graph, right? | 23:54 | |
can we somehow easily figure out what hasn't been touched in a while? | |||
in theory we could "swap" them out with like zstd | |||
especially compilation literally has these "phases" | 23:57 |