Topic: github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm (set by AlexDaniel on 12 June 2018)
00:01 anatofuz joined, anatofuz left
00:29 anatofuz joined
00:30 <timotimo> uuuuuh
<timotimo> looking at the presentation on libmesh
<timotimo> cool trick they have
00:51 anatofuz left
00:53 anatofuz joined
00:54 Kaiepi left
00:58 Kaiepi joined
01:02 <timotimo> #define voidptr void *)(intptr_t
<timotimo> wow.
01:07 <timotimo> TLDR: pages of the same size class are regularly scanned to see whether overlaying them would produce no conflicts. If that's the case, all but one of the matching pages are made read-only, everything is copied over to the surviving page, the virtual addresses of the old pages are remapped to point at the page that now contains everything, and the physical memory can finally be freed
01:08 <timotimo> no pointers have to be updated, because the virtual addresses stay correct
01:16 <timotimo> www.youtube.com/watch?v=c1UBJbfR-H0 - the talk i watched
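The trick timotimo summarizes works because two virtual pages can be made to alias a single physical page, so existing pointers into either page stay valid after the merge. A minimal, Linux-specific sketch of that aliasing, assuming memfd_create and MAP_SHARED are available; this is illustrative, not code from the talk or from MoarVM:

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* back the "heap pages" with a file descriptor so they can be aliased */
    int fd = memfd_create("mesh-demo", 0);
    if (fd < 0 || ftruncate(fd, (off_t)page) != 0)
        return 1;

    /* two distinct virtual pages... */
    char *a = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* ...backed by the same physical page: a write through one virtual
     * address is visible through the other, so no pointers need fixing up */
    strcpy(a, "written through a");
    printf("seen through b: %s\n", b);

    munmap(a, page);
    munmap(b, page);
    close(fd);
    return 0;
}
```

An allocator doing this for real would first verify that the pages' occupied slots do not overlap and copy the live objects across before remapping and releasing the physical memory.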
01:32 anatofuz left
01:54 anatofuz joined
03:49 anatofuz left
03:50 anatofuz joined
03:55 anatofuz left
04:10 anatofuz joined
04:14 anatofuz left
04:20 anatofuz joined
05:24 anatofuz left, anatofuz joined
05:29 anatofuz left
05:40 robertle left
05:58 anatofuz joined
06:00 anatofuz left
06:02 domidumont joined
06:04 domidumont left
06:10 anatofuz joined
06:14 anatofuz left
06:15 harrow left
06:21 harrow joined
06:32 domidumont joined
06:35 anatofuz joined
06:36 <nine> If I read this code correctly, MVM_gc_gen2_transfer just adds the pointers to the source thread's pages to the destination thread's bins. But then MVM_gc_gen2_destroy will still free those pages.
<nine> The overflows (objects that are too large for the size bins) are not transferred at all.
06:40 anatofuz left
06:43 <nine> Looks like that whole function got just a single fix after the initial commit which is just called "progress"
06:48 anatofuz joined
06:53 anatofuz left
06:54 sena_kun joined
06:58 anatofuz joined
07:03 anatofuz left
07:08 <nwc10> That's definitely an LTA commit message.
07:09 <nwc10> I hope that there's an innocent reason, such as it being on a branch that was meant to get rebased and reviewed, and that one got missed
07:14 anatofuz joined
07:16 <nine> It's from January 2013. Commit messages were rather basic back then and diakopter's even more so
07:20 anatofuz left
07:21 anatofuz joined
07:22 anatofuz_ joined, anatofuz_ left
07:23 <nine> Oh, my reading was indeed wrong. The bins are cleared at the end of MVM_gc_gen2_transfer, so actually no pages get freed: MVM_free(gen2->size_classes[bin].pages); gen2->size_classes[bin].pages = NULL;
<nine> But this still leaves the overflows
07:24 <nine> And indeed the STable in question gets freed by MVM_free(al->overflows[j]); in MVM_gc_gen2_destroy
07:25 <nine> A bit ironic: in collect.c there's this comment regarding overflows: /* Dead over-sized object. We know if it's this big it cannot be a type object or STable, so only need handle the simple object case. */
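The direction this points in, transferring the overflow pointers to the destination allocator and emptying the source list so a later destroy pass no longer frees live objects, can be sketched generically like this (the struct and field names are illustrative, not MoarVM's actual gen2 structures):

```c
#include <stdlib.h>

/* illustrative stand-in for a gen2-style allocator's overflow list */
typedef struct {
    void  **overflows;       /* objects too large for the size bins */
    size_t  num_overflows;
    size_t  alloc_overflows;
} Gen2Overflows;

static void transfer_overflows(Gen2Overflows *src, Gen2Overflows *dst) {
    for (size_t i = 0; i < src->num_overflows; i++) {
        /* grow the destination list if needed */
        if (dst->num_overflows == dst->alloc_overflows) {
            dst->alloc_overflows = dst->alloc_overflows ? dst->alloc_overflows * 2 : 16;
            dst->overflows = realloc(dst->overflows,
                                     dst->alloc_overflows * sizeof(void *));
        }
        dst->overflows[dst->num_overflows++] = src->overflows[i];
    }
    /* the crucial part: empty the source list so the destroy pass that runs
     * on the exiting thread's allocator has nothing left to free */
    src->num_overflows = 0;
}
```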
07:26 anatofuz left
07:28 <nine> Oh, could be that I'm not actually looking for an STable but for the meta class object
07:39 anatofuz joined
07:46 anatofuz left, anatofuz joined
07:47 <nine> I think implementing transfer of the overflows has improved things, but there are still other issues that are now cropping up
07:49 zakharyas joined
07:59 anatofuz left
08:16 anatofuz joined
08:17 domidumont left
08:18 <Geth_> MoarVM: 366a7b7a3e | (Stefan Seifert)++ | src/6model/6model.c: Add missing write barrier to MVM_6model_get_how
<Geth_> MoarVM: fd2659ed57 | (Stefan Seifert)++ | src/6model/containers.c: Fix possible access to fromspace in code_pair_configure_container_spec. Allocating the fetch string may cause st to move. This could lead us to overwriting unrelated memory. Fix by MVMROOTing st.
<Geth_> MoarVM: 6866683276 | (Stefan Seifert)++ | src/gc/collect.c: Fix segfaults in GCDEBUG_LOGs
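The fd2659ed57 fix is an instance of MoarVM's usual rooting pattern: an allocation may trigger GC and move a collectable, so a pointer held in a C local has to be registered as a root across the allocation. A rough sketch of the pattern; MVMROOT is real MoarVM API, but allocate_some_string is a hypothetical helper standing in for whatever allocating call is made:

```c
MVMString *value_str;
/* 'st' points at a collectable; the call inside the block may allocate and
 * thus trigger GC, which can move it */
MVMROOT(tc, st, {
    /* hypothetical allocating call, for illustration only */
    value_str = allocate_some_string(tc, "value");
});
/* 'st' is safe to dereference again from here on */
```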
08:19 <lizmat> nine+++++
08:19 domidumont joined
08:20 <Geth_> MoarVM: aa2270c760 | (Stefan Seifert)++ | src/gc/gen2.c: Fix overflow objects getting freed prematurely on thread exit. While we already transferred live objects from the exiting thread's gen2 size bins, we did not do the same for the overflows (objects too large for the size bins). These objects were then freed by MVM_gc_gen2_destroy. Fix by implementing transfer of those overflows.
08:27 <Geth_> MoarVM: d52282b201 | (Stefan Seifert)++ | src/6model/parametric.c: Fix possible deadlock in finish_parameterizing. finish_parameterizing can trigger garbage collection while holding the mutex_parameterization_add lock. So when multiple threads create mixins, it's possible for one thread to wait for the lock while another one who holds it triggers GC, but GC will wait for the thread that waits for mutex_parameterization_add, which is a classic deadlock. Fix by marking the thread as blocked while waiting for mutex_parameterization_add. Since the GC may now run while we're waiting for the lock, we need to MVMROOT parameters and parametric_type even for the call to MVM_6model_parametric_try_find_parameterization.
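The d52282b201 commit uses the standard pattern for taking a lock that another thread might hold across a GC trigger: mark this thread blocked so a GC run can proceed without it, acquire the lock, then mark it unblocked (which may join a GC run already in flight). A rough sketch, assuming MVM_gc_mark_thread_blocked/unblocked and the libuv mutex calls have their usual shapes and that the mutex lives on the VM instance:

```c
/* parameters and parametric_type must already be MVMROOTed here, since GC
 * may run while we wait for the lock */
MVM_gc_mark_thread_blocked(tc);
uv_mutex_lock(&tc->instance->mutex_parameterization_add);
MVM_gc_mark_thread_unblocked(tc);   /* may itself join a GC run in progress */

/* ... look up or record the parameterization ... */

uv_mutex_unlock(&tc->instance->mutex_parameterization_add);
```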
08:27 travis-ci joined
08:27 <travis-ci> MoarVM build canceled. Stefan Seifert 'Fix overflow objects getting freed prematurely on thread exit
<travis-ci> travis-ci.org/MoarVM/MoarVM/builds/586901984 github.com/MoarVM/MoarVM/compare/6...2270c76087
08:27 travis-ci left
08:28 <nine> More fixes coming up... It's funny how many issues can be brought up by a shell one liner :)
08:40 anatofuz left
08:41 anatofuz joined
08:45 anatofuz left, anatofuz joined
08:49 <Geth_> MoarVM: bcc785d23b | (Stefan Seifert)++ | src/gc/gen2.c: Fix overflow objects still belonging to exiting thread. Need to update the owner when transferring overflow objects from an exiting thread.
09:11 <Geth_> MoarVM: ea9b29e850 | (Stefan Seifert)++ | src/gc/orchestrate.c: Fix destroyed thread context being accessible through the thread list. MVM_ASSERT_NOT_FROMSPACE goes through the list of threads and checks each one's nursery. Since destruction of a thread is done when the world is no longer stopped, it is possible that a thread context is already destroyed but still referenced through the thread list, causing MVM_ASSERT_NOT_FROMSPACE to access freed memory. Fix by first removing the pointer and only then destroying the tc.
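The ea9b29e850 change is essentially a publication-ordering fix; schematically, in generic C (not the MoarVM code, and a production version would also want the appropriate memory barriers):

```c
typedef struct ThreadContext ThreadContext;

typedef struct {
    ThreadContext *tc;   /* NULL once the thread's context has been torn down */
} ThreadEntry;

static void retire_thread(ThreadEntry *entry, void (*destroy)(ThreadContext *)) {
    ThreadContext *tc = entry->tc;

    /* unpublish the pointer first, so code walking the thread list can no
     * longer reach the context... */
    entry->tc = NULL;

    /* ...and only then tear it down */
    destroy(tc);
}
```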
09:13 anatofuz left
09:15 anatofuz joined
09:18 anatofuz left, anatofuz joined
09:20 anatofuz left
09:25 <nine> With this my test case is very stable! I can run it with 1000 threads and it will succeed most of the time, while initially 10 threads were enough to cause it to explode quite reliably.
09:26 <nine> I can occasionally reproduce a: MoarVM panic: Collectable 0x7fffe006f5f0 in fromspace of thread 44080720 accessed
09:28 anatofuz joined
09:33 <jnthn> nine++ # great finds
<nine> Oh...of course. While my last commit improved things, it's not actually a fix. The thread could still have been destroyed after another thread read cur_thread->body.tc. With 1000 threads on the scheduler, some may have to wait quite some time
09:34 <nine> jnthn: ohai! Why do we actually destroy threads when the world is already running?
09:36 <jnthn> nine: Dunno, maybe because in theory we evacuated everything from them at that point, so think we can get away with not having the world stopped?
09:37 <nine> Evacuation is also done only when the world is already running: github.com/MoarVM/MoarVM/blob/mast...ate.c#L220
09:38 <nine> Though since evacuation involves only the exiting thread and the one doing the evacuation, this should not lead to any explosions.
09:39 <jnthn> Hm, so the comment there is inaccurate?
<jnthn> /* Reset GC status flags. This is also where thread destruction happens,
<jnthn>  * and it needs to happen before we acknowledge this GC run is finished. */
09:40 <nine> Well, those threads do not look very stopped to me: gist.github.com/niner/d8149c397a3a...d22194e8a8
09:41 <jnthn> Hm, indeed
09:43 <jnthn> And yeah, reading through the code, I see we aren't waiting on any condvar
<jnthn> The last place we do so is some lines back
09:44 <jnthn> So, naughty misleading comment, I think
<nwc10> bad programmer, more coffee
<jnthn> (Caveat: I lay awake most of the night because my throat was too sore to sleep, so I'm not liable to be the most use today...)
<nwc10> :-(
09:57 domidumont left
10:04 anatofuz left
10:06 <nine> And I don't know enough about GC thread synchronization to add the waiting
10:09 anatofuz joined
10:11 <nine> Well, I have now managed by re-using cond_gc_intrays_clearing which at that point should be guaranteed to be unused.
10:18 <nine> Or not, because I forgot to re-set the gc_intrays_clearing flag, so no one's actually waiting. That means that I cannot just re-use the thing, because re-setting that flag would be racy.
10:21 <jnthn> nine: I wonder if we can move that code back before that step...hmm
<nine> jnthn: you mean destroy the thread before we processed intrays? That'd scare me :)
<jnthn> Hmm, yeah, the sequencing is actually important
10:22 <jnthn> You might need an extra condvar
<jnthn> That said...
<jnthn> We're working around a thread being discoverable in the thread list while it's getting destroyed
<jnthn> iiuc
<jnthn> And perhaps we should solve *that* problem instead of making GC sync more costly
10:23 <jnthn> I thought we already had a multi-step thread destruction sequence (spread over multiple GC runs) to help with this
<jnthn> So perhaps if it's in a late enough state, then we know it can't have anything in its in-tray
<timotimo> for STables iirc?
<jnthn> Yes, but also for threads
<jnthn> That's why there are phases like "clearing nursery"
10:24 <timotimo> ah, OK
<jnthn> If there is - or if we add - a phase that's like "everything's done except destroying the thread", then in principle it'd have no chance of participating in the in-tray clearing process, and so we could deal with it in the world stopped
10:27 <nine> FWIW the extra condvar seems to have done the trick: the thing seems quite stable now
10:28 <jnthn> Yeah, I guess that's workable, but costs something
<jnthn> So we might want to see if we can find an alternative later
10:29 <timotimo> maybe with more stable thread destruction we'll put in "destroy worker threads if there have been enough for an hour or so"
10:36 <nine> The thing just won't fail anymore :)
10:37 <timotimo> i'll have to finally implement that "stash profiler data away when a thread gets destroyed" thing
10:51 <Geth_> MoarVM: 329f669c35 | (Stefan Seifert)++ | 3 files: Really fix race condition with exited threads still in the thread list. While commit ea9b29e8506820598d9fc209898f79d43bb5138d improved things, there was still a race condition where the thread context may have been destroyed between another frame checking for its existence and accessing it. Fix by adding another cond var signalling the GC run's completion. Smarter structuring of the code and employment of the different GC phases may render this expensive cond var obsolete in the future. For now we at least gain stability.
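A generic illustration of the extra condition variable the commit describes, in plain pthreads rather than the actual MoarVM code: whoever wants to tear down an exited thread's context first waits for the current GC run to signal completion, so nothing can still be walking the thread list when the context is freed.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t gc_lock    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gc_done    = PTHREAD_COND_INITIALIZER;
static bool            gc_running = false;

/* GC coordinator: called once the run, including in-tray clearing, is over */
void signal_gc_run_complete(void) {
    pthread_mutex_lock(&gc_lock);
    gc_running = false;
    pthread_cond_broadcast(&gc_done);
    pthread_mutex_unlock(&gc_lock);
}

/* a thread wanting to destroy an exited thread's context waits until no GC
 * run can still be looking at the thread list */
void destroy_when_gc_quiet(void (*destroy)(void *ctx), void *ctx) {
    pthread_mutex_lock(&gc_lock);
    while (gc_running)
        pthread_cond_wait(&gc_done, &gc_lock);
    destroy(ctx);
    pthread_mutex_unlock(&gc_lock);
}
```

Whatever starts a GC run would set gc_running to true under the same lock; as the commit note says, finer use of the existing GC phases could make this extra synchronization unnecessary.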
11:12 <nine> I guess running non-reentrant code on the ThreadPoolScheduler is a bad idea?
11:14 zakharyas left
11:15 <timotimo> if you only run one instance of it, it could be fine :)
11:21 <nine> Well, I don't. The idea was to attach one Inline::Perl5 instance to each thread and use that to run Perl 5 code in a Cro application. With today's fixes, I can get this to work when running 50 threads via Thread.start, but with the thread pool I assume that individual tasks may be interrupted and a switch to a different task may occur on the same thread.
<timotimo> right
<timotimo> you'll want your own awaiter or a different thread pool or whatever
11:30 anatofuz left
11:33 anatofuz joined
11:38 <nine> If tasks get interrupted, why does RAKUDO_MAX_THREADS=1 fix things? Shouldn't that actually make things worse?
11:42 <nwc10> nine: IIRC "you" can remap which OS thread is associated with an ithread (OK, strictly MULTIPLICITY) and I think that mod_perl2 does this for the threaded backend.
<nwc10> I can't promise you that this is sane
11:43 <nwc10> but I think it does mean that an m:n mapping of Perl 5 interpreters to OS threads is possible
<nine> nwc10: isn't that what PERL_SET_CONTEXT(my_perl) is about?
<nwc10> it might even be *necessary* here, because I could see that it might make the glue implementation easier if one is getting [INTERRUPT]
<nwc10> yes, I think that it is
11:44 <nwc10> but I think that only mod_perl was nuts enough to use it
<nwc10> ... easier if one is in MoarVM and getting something that schedules a MoarVM context onto "A thread from the MoarVM pool" and then that code realises that it needs to enter Perl 5
11:45 <nwc10> but I'm probably repeating something you already know?
11:48 <nine> I already use PERL_SET_CONTEXT extensively. Without that, even in non-threaded programs, running multiple interpreters will cause explosions
11:50 <nwc10> OK
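For reference, PERL_SET_CONTEXT is how an embedder tells a multiplicity/ithreads-enabled perl which interpreter the calling OS thread is about to use. A minimal sketch of the pattern; eval_pv and PERL_SET_CONTEXT are documented perl embedding API, while the wrapper function itself is illustrative:

```c
#include <EXTERN.h>
#include <perl.h>

/* run a snippet of Perl 5 code on whatever OS thread we happen to be on */
static void run_perl5_here(PerlInterpreter *my_perl, const char *code) {
    /* point the thread-local interpreter context at this interpreter before
     * making any calls into it from this OS thread */
    PERL_SET_CONTEXT(my_perl);
    eval_pv(code, TRUE);   /* TRUE: croak (abort) on error */
}
```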
11:50 anatofuz left
11:56 anatofuz joined
12:00 anatofuz left
12:03 domidumont joined
12:18 anatofuz joined
12:20 <jnthn> nine: Thread pool threads will only yield control on an `await`; if you're just calling into Perl 5 code and staying in it, that shouldn't happen (however, it is the case that you can't know which thread you're on in the first place)
12:21 <jnthn> nine: I wonder if a better design is to maintain a pool of Perl 5 worker threads, and dispatch work to those. So they sit in a loop pulling from a Channel, say, and running the thing in question and then keeping a `Promise` with the result.
<nine> Ok, that refutes my hypothesis why it breaks on the thread pool while it works with Thread.start
<jnthn> (So you'd send the cdoe and the Promise)
<jnthn> code
<jnthn> .oO( a cdoe! a cdeer! a sigsegv deer )
12:24 <jnthn> Note that if you're emitting into a `Supply` then that can use `await` if there's a logjam downstream
12:26 <nine> I don't. There shouldn't be any Supplies or Channels or awaits involved. It just runs my $schema = p5.invoke('Atikon::DB::Timemngt', 'connect', 'dbi:Pg:database=timemngt'); $schema.resultset('Contact').count.say;
<jnthn> That looks innocent enough...though how do you obtain p5?
<nine> With p5 being: sub p5() { my role Perl5Interpreter { has $.p5 = Inline::Perl5.new; } my $thread = $*THREAD; unless $thread.does(Perl5Interpreter) { $thread does Perl5Interpreter; $thread.p5.use('Atikon::DB::Timemngt'); } $thread.p5 }
12:27 <jnthn> Hmm
<jnthn> Don't see anything obviously wrong there
12:34 <nine> The thread pool is innocent after all!
12:35 <nine> I managed to reproduce similar issues by adding a sleep rand /3; to the start of the threads in the Threads.start version
13:08 zakharyas joined
13:10 <Geth_> MoarVM: 47cd6c664b | (Ben Davies)++ | src/6model/reprs/CStr.c: Fix the CStr REPR's copy_to function. Its code was copy-pasted from CPointer with no changes whatsoever. This changes it so it actually does what it's supposed to be doing.
<Geth_> MoarVM: 270f1b0beb | niner++ (committed using GitHub Web editor) | src/6model/reprs/CStr.c: Merge pull request #1132 from Kaiepi/cstr: Fix the CStr REPR's copy_to function
13:33 lucasb joined
14:35 anatofuz left
14:36 anatofuz joined
14:43 anatofuz left
14:45 zakharyas left
15:16 anatofuz joined
15:17 domidumont left
15:37 domidumont joined
15:49 anatofuz left
15:50 domidumont left
15:51 <timotimo> gist.github.com/timo/82c4baddc7a08...7a989cee40
<timotimo> so much better than before
15:55 <jnthn> timotimo++
15:57 <timotimo> the view of "all references this object has" also wants an improvement for the case when it's pages and pages worth of output
16:01 zakharyas joined
16:08 domidumont joined
16:08 <timotimo> added another file for "show"; just the data::dump::tree output of the structure i'm considering
16:09 <timotimo> ugh, it looks bad on gist, the lines don't connect, and of course the color's missing, too
16:11 <lizmat> perhaps looks better in "raw" mode ?
<timotimo> gist.githubusercontent.com/timo/82...tfile2.txt
<timotimo> much better already
16:12 <timotimo> still worse than in the terminal
<timotimo> it must be mixing different fonts or something
16:15 domidumont left
17:00 patrickb joined
17:34 domidumont joined
17:45 anatofuz joined
17:46 zakharyas left
18:04 chloekek joined
18:19 anatofuz left
18:46 MasterDuke joined
18:55 anatofuz joined
18:57 domidumont left
19:00 Ven`` joined
19:29 anatofuz left
19:35 chloekek left
20:30 anatofuz joined
21:15 sena_kun left
21:34 anatofuz left
21:47 patrickb left
21:50 anatofuz joined
22:05 <timotimo> cdn.discordapp.com/attachments/614...nknown.png - moarperf right now displays three columns worth of different collectables, clicking on each number will navigate that column to that object
22:06 <timotimo> changing the number in the input field lets you jump directly to a number
<timotimo> not completely satisfied yet
22:09 <MasterDuke> it's a little odd looking that the boxes/numbers aren't aligned
22:11 <timotimo> yeah, it's not terribly easy to get that kind of thing to work properly
<timotimo> this was using flexbox
22:12 anatofuz left
22:13 anatofuz joined
22:15 Ven`` left
22:16 Ven`` joined, Ven`` left
22:17 anatofuz left
22:36 Kaiepi left
22:37 Kaiepi joined
22:38 Kaiepi left
22:39 Kaiepi joined
22:48 anatofuz joined
23:02 anatofuz left
23:03 anatofuz joined
23:45 lucasb left
23:57 anatofuz left