| IRC logs at
Set by AlexDaniel on 12 June 2018.
00:14 sena_kun joined 00:16 Altai-man left 01:11 AlexDaniel left 02:14 Altai-man joined 02:16 sena_kun left 02:53 guifa2 left 02:55 guifa2 joined 03:52 MasterDuke left 04:14 sena_kun joined 04:17 Altai-man left 05:27 leont joined
nwc10 good *, #moarvm 05:29
nine nwc10: that sounds like a fantastic idea! 05:44
06:14 Altai-man joined 06:16 sena_kun left 07:30 zakharyas joined 08:14 sena_kun joined 08:16 Altai-man left 08:35 domidumont joined 08:59 MasterDuke joined
jnthn nwc10: On the flag bit split: that only works if memory operations on an 8-bit region are atomic, and I think on some architectures sub-register-size writes are actually read+bit-twiddle+write or something. 09:15
nwc10 hmm. other than crays? 09:16
right now I can't spot a better way, other than aaaaaaaaaaaaaaaargh-atomic-ops 09:17
obsolete crays, that was
having *just* got to the end of figuring out what I could about the 13 used flag bits 09:18
I think we could get away with jsut using 8. Apart from this data race.
but it would mean overloading 4 bits to hav different meanings for gen2 and nursery
and squeezing all the type object/frame/stable/BORING into 2 bits
OK, and some microconrolers... 09:21
OK, I have several tabs to go through, but seems to be suggesting that "everything" "modern" that has byte access instructions can do it. With Alpha as the example of "not" 09:23
and possibly Cray K series *could* do it. It was 2 and 4 byte addressing that they could not (if it's K that I'm remembering correctly as the problem architecture), and this meant that you could not take pointers to *members* of socket structures, because unlike every other OS known to humanity, there, the socket structure fields were integer whatever-they-are-called with size constraints. bitfields? 09:25
also, if we can prove that libuv already assumes it, then we're golden ;-)
09:31 domidumont left
jnthn Ah, I guess what I said is true in so far as "that's how it'll work at the CPU cache level", but that's not going to be a semantic issue. 09:32
nwc10 yes, pretty sure what you said was true at cache level. Soemthing else can still get a "stale" read later
but seems that even the CPUs in some current arduinos can't do byte access. 09:33
But IIRC there was (at least) one microcontroller CPU wehre sizeof(long) == 1, because char was 32 bits
Geth MoarVM/MVM_malloc_trim-after-MVM_gc_collect_free_gen2_unmarked: 09812dfda4 | (Nicholas Clark)++ | src/gc/orchestrate.c
MVM_malloc_trim would be better after MVM_gc_collect_free_gen2_unmarked

MVM_gc_collect_free_gen2_unmarked can free memory (particularly "overflow" allocations for large objects), so the greatest chance of free memory at the top of the address space will be here.
nwc10 ooops. that wasn't on master. I shall neuralise it 09:41
09:47 domidumont joined 10:07 Altai-man joined, sena_kun left
Geth MoarVM/MVM_malloc_trim-after-MVM_gc_collect_free_gen2_unmarked: da23771762 | (Nicholas Clark)++ | src/gc/orchestrate.c
MVM_malloc_trim would be better after MVM_gc_collect_free_gen2_unmarked

MVM_gc_collect_free_gen2_unmarked can free memory (particularly "overflow" allocations for large objects), so the greatest chance of free memory at the top of the address space will be here.
MoarVM: nwc10++ created pull request #1335:
MVM_malloc_trim would be better after MVM_gc_collect_free_gen2_unmarked
nwc10 that's better. A Pull Request, because I'm not sure if I missed soemthing. 10:09
11:15 sena_kun joined 11:16 Altai-man left, zakharyas left
Geth MoarVM: da23771762 | (Nicholas Clark)++ | src/gc/orchestrate.c
MVM_malloc_trim would be better after MVM_gc_collect_free_gen2_unmarked

MVM_gc_collect_free_gen2_unmarked can free memory (particularly "overflow" allocations for large objects), so the greatest chance of free memory at the top of the address space will be here.
MoarVM: 03d3e43fa1 | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/gc/orchestrate.c
Merge pull request #1335 from MoarVM/MVM_malloc_trim-after-MVM_gc_collect_free_gen2_unmarked

MVM_malloc_trim would be better after MVM_gc_collect_free_gen2_unmarked
timotimo wow, the very long branch name doesn't make my irc client very happy %)
11:36 MasterDuke left 11:53 patrickb joined
timotimo the "use MVM_alloc_array" thing could be done at any point, but of course it'll modify almost every file all over, so ideally i'd do it when no conflicts are to be expected for long-running branches like the dispatch chain one 12:15
12:26 domidumont left
lizmat And another Rakudo Weekly News hits the Net: 12:32
12:41 zakharyas joined 12:50 patrickb left
timotimo lizmat: where is the nwc change from? was that for mvm_malloc_trim_would_be_better_after_... branch? 12:52
nwc10 yes 13:00
timotimo that's more about optimizing deallocation i thought - just from the name 13:04
13:14 Altai-man joined
timotimo it's not actually important :D 13:16
13:16 domidumont joined 13:17 sena_kun left
lizmat timotimo: there wasn't a lot of other core developments this week 13:19
timotimo i'll be sure to mention when my mental state goes brrrrrrrrrrrrrrrrrrrr 13:20
13:26 lucasb joined
nwc10 I'm reducing the number of open tabs. This was interesting: -- Another consideration is that while CPU a may have byte load and store instructions, compiler isn't required to use them. A compiler, for example, could still generate the code Stroustrup describes, loading both b and c using 13:31
a single word load instruction as an optimization.
which would be like sparc gcc using a 64 bit load for two 32 bit struct members in a struct that it knwos has to be 8 byte aligned 13:32
(and going SIGBUS when the program/programmer is naughty and tried to cheat on alignment)
so I think here it won't matter, as the two (racing) flag bit swaps are in two functions. So won't be optimisable 13:33
oh, and one is inside a region with a mutex
13:42 domidumont left 13:56 domidumont joined 14:45 zakharyas left 15:14 sena_kun joined 15:17 Altai-man left 15:48 domidumont left 15:49 domidumont joined 16:39 zakharyas joined 17:00 MasterDuke joined 17:14 Altai-man joined 17:17 sena_kun left 17:31 Kaiepi left, Kaiepi joined 18:24 zakharyas left 18:27 vrurg_ is now known as vrurg 19:01 domidumont left 19:09 Kaiepi left 19:10 Kaiepi joined 19:14 sena_kun joined 19:16 Altai-man left 19:25 zakharyas joined
nwc10 sigh. the downside of continuing to use Perl 5 as a bootstrapping language is that one is still exposed to distros such as Fedora that like shipping a stripped down "perl" 19:31
aargh, our alignment is still fragile, 5 years later... 20:04
causing MVMCollectable to be the "wrong" size breaks arm, but not i386 20:05
aynway, at least I found my awkward bug.
20:39 zakharyas left
Geth MoarVM/flags-split: 7cb2332fe2 | (Nicholas Clark)++ | 23 files
Split `flags` in struct MVMCollectable into `flags1` and `flags2`.

This permits us to avoid a data race when one thread needs to set MVM_CF_HAS_OBJECT_ID and a second needs to set MVM_CF_REF_FROM_GEN2.
For now, don't change the flag bit values used in the two new members.
MoarVM/flags-split: 7728b30187 | (Nicholas Clark)++ | src/6model/6model.h
Shrink `flags1` and `flags2` in struct MVMCollectable down to MVMuint8.

Change the values of the MVM_CF_* flags to sit in the range 1-128.
nwc10 there will be a proper pull request tomorrow. But there might be some fun, as I found that I had to do this: 20:56
because it turns out that the ops p6setfirstflag and p6takefirstflag (ab)use `flags` in MVMThreadContext to set a "private" flag bit
obviously not "pick a number and hope that it works", but it isn't "officially" allocated or any sort of documented acceptable API being used here 20:58
and I realised that I'm a bit surprised that this isn't all being done with attributes attached to the MVMObject
because surely *that* would be easier to spesh and JIT
(fixing that isn't a blocker on this - I can see how to do the rakudo stuff with C #define ugliness) 21:03
jnthn The C extops in Rakudo are meant to be going away (though they're making a slow job of doing so :)) 21:06
So ugliness is tolerable in that it should be at least somewhat temporary
nwc10 the ugnliness plan was
1) first patch rakudo to do something like that diff, but as #if/#else/#endif 21:07
2) change MoarVM
(With a #define to trigger the #else side of rakudo)
3) remove the #if...#else part and the #endif
4) clean up MoarVM to remove the #define
and then 21:08
5) some point later the rakudo folks get rid of the C
that branch works but I've not yet tried to provoke the race condition
or built that branch with tsan 21:09
and it "has" to work for any platform conformat with the C++11 memory model. (If I understand all the jargon correctly)
and to a CPU with 64 byte cache lines - modifying 1 byte isn't massively different from modifying just 8.
that was the best explanation of all this stuff.
the cache management hardware has to be (Sort of) doing atomic read-modify-swap at the cache line level, whether you are writing 1 byte or 8. (or any other supported size) 21:10
21:14 Altai-man joined 21:16 sena_kun left 21:33 raku-bridge left, raku-bridge joined, raku-bridge left, raku-bridge joined 21:45 raku-bridge left, raku-bridge joined, raku-bridge left, raku-bridge joined 21:47 raku-bridge left 21:49 raku-bridge joined 21:52 raku-bridge left 21:55 raku-bridge joined, raku-bridge left, raku-bridge joined 23:05 leont left 23:14 vrurg left 23:15 vrurg_ joined