github.com/moarvm/moarvm | IRC logs at colabti.org/irclogger/irclogger_logs/moarvm
Set by AlexDaniel on 12 June 2018.
nwc10 good *, #moarvm 05:29
nine nwc10: that sounds like a fantastic idea! 05:44
jnthn nwc10: On the flag bit split: that only works if memory operations on an 8-bit region are atomic, and I think on some architectures sub-register-size writes are actually read+bit-twiddle+write or something. 09:15
nwc10 hmm. other than crays? 09:16
right now I can't spot a better way, other than aaaaaaaaaaaaaaaargh-atomic-ops 09:17
obsolete crays, that was
having *just* got to the end of figuring out what I could about the 13 used flag bits 09:18
I think we could get away with just using 8. Apart from this data race.
but it would mean overloading 4 bits to have different meanings for gen2 and nursery
and squeezing all the type object/frame/stable/BORING into 2 bits
OK, and some microcontrollers... 09:21
OK, I have several tabs to go through, but stackoverflow.com/questions/467210...-to-memory seems to be suggesting that "everything" "modern" that has byte access instructions can do it. With Alpha as the example of "not" 09:23
and possibly Cray K series *could* do it. It was 2 and 4 byte addressing that they could not (if it's K that I'm remembering correctly as the problem architecture), and this meant that you could not take pointers to *members* of socket structures, because unlike every other OS known to humanity, there, the socket structure fields were integer whatever-they-are-called with size constraints. bitfields? 09:25
also, if we can prove that libuv already assumes it, then we're golden ;-)
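The "atomic-ops" fallback mentioned above could look roughly like the following with C11 `<stdatomic.h>`. This is only a sketch: the `Demo*` names and the flag values are hypothetical and are not taken from MoarVM, which may well use different atomic primitives.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only (not MoarVM's). */
#define DEMO_FLAG_HAS_OBJECT_ID ((uint16_t)0x0001)
#define DEMO_FLAG_REF_FROM_GEN2 ((uint16_t)0x0100)

/* One shared 16-bit flags word, as in the pre-split layout. */
typedef struct {
    _Atomic uint16_t flags;
} DemoAtomicHeader;

/* Atomically OR a flag into the shared word; returns the previous value.
 * Two threads setting different bits can no longer lose each other's update. */
static uint16_t demo_atomic_set_flag(DemoAtomicHeader *h, uint16_t flag) {
    return atomic_fetch_or(&h->flags, flag);
}
```

The cost is that every flag update becomes an atomic read-modify-write, which is what the byte split is trying to avoid.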
jnthn Ah, I guess what I said is true in so far as "that's how it'll work at the CPU cache level", but that's not going to be a semantic issue. 09:32
nwc10 yes, pretty sure what you said was true at cache level. Something else can still get a "stale" read later
but seems that even the CPUs in some current arduinos can't do byte access. 09:33
But IIRC there was (at least) one microcontroller CPU where sizeof(long) == 1, because char was 32 bits
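The `sizeof(long) == 1` remark follows from `sizeof` counting in units of `char`, whose width is `CHAR_BIT` bits. A small sketch (the helper name `demo_bits` is hypothetical):

```c
#include <limits.h>
#include <stddef.h>

/* Width in bits of an object that occupies `size_in_chars` chars
 * on this implementation. sizeof reports sizes in chars, not octets. */
static size_t demo_bits(size_t size_in_chars) {
    return size_in_chars * CHAR_BIT;
}
```

On a platform where `CHAR_BIT` is 32, `demo_bits(sizeof(long))` can still be 32 even though `sizeof(long)` is 1, and there is then no 8-bit store to rely on.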
Geth MoarVM/MVM_malloc_trim-after-MVM_gc_collect_free_gen2_unmarked: 09812dfda4 | (Nicholas Clark)++ | src/gc/orchestrate.c
MVM_malloc_trim would be better after MVM_gc_collect_free_gen2_unmarked

MVM_gc_collect_free_gen2_unmarked can free memory (particularly "overflow" allocations for large objects), so the greatest chance of free memory at the top of the address space will be here.
09:40
nwc10 ooops. that wasn't on master. I shall neuralise it 09:41
Geth MoarVM/MVM_malloc_trim-after-MVM_gc_collect_free_gen2_unmarked: da23771762 | (Nicholas Clark)++ | src/gc/orchestrate.c
MVM_malloc_trim would be better after MVM_gc_collect_free_gen2_unmarked

MVM_gc_collect_free_gen2_unmarked can free memory (particularly "overflow" allocations for large objects), so the greatest chance of free memory at the top of the address space will be here.
10:08
MoarVM: nwc10++ created pull request #1335:
MVM_malloc_trim would be better after MVM_gc_collect_free_gen2_unmarked
nwc10 that's better. A Pull Request, because I'm not sure if I missed something. 10:09
Geth MoarVM: da23771762 | (Nicholas Clark)++ | src/gc/orchestrate.c
MVM_malloc_trim would be better after MVM_gc_collect_free_gen2_unmarked

MVM_gc_collect_free_gen2_unmarked can free memory (particularly "overflow" allocations for large objects), so the greatest chance of free memory at the top of the address space will be here.
11:19
MoarVM: 03d3e43fa1 | (Jonathan Worthington)++ (committed using GitHub Web editor) | src/gc/orchestrate.c
Merge pull request #1335 from MoarVM/MVM_malloc_trim-after-MVM_gc_collect_free_gen2_unmarked

MVM_malloc_trim would be better after MVM_gc_collect_free_gen2_unmarked
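The ordering argued for in that commit can be sketched as "free first, trim afterwards". The example below assumes that `MVM_malloc_trim` wraps something like glibc's `malloc_trim`; that is an assumption, and `demo_free_then_trim` is a hypothetical stand-in rather than MoarVM code.

```c
#include <stdlib.h>
#if defined(__GLIBC__)
#include <malloc.h>   /* malloc_trim is a glibc extension */
#endif

/* Free a batch of blocks (standing in for large "overflow" allocations),
 * and only then ask the allocator to hand free memory at the top of the
 * heap back to the OS. Trimming before the frees would find less to release. */
static int demo_free_then_trim(void **blocks, size_t count) {
    for (size_t i = 0; i < count; i++) {
        free(blocks[i]);
        blocks[i] = NULL;
    }
#if defined(__GLIBC__)
    return malloc_trim(0);   /* 1 if memory was released to the system, else 0 */
#else
    return 0;                /* no trim facility assumed on other libcs */
#endif
}
```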
timotimo wow, the very long branch name doesn't make my irc client very happy %)
timotimo the "use MVM_alloc_array" thing could be done at any point, but of course it'll modify almost every file all over, so ideally i'd do it when no conflicts are to be expected for long-running branches like the dispatch chain one 12:15
lizmat And another Rakudo Weekly News hits the Net: rakudoweekly.blog/2020/08/10/2020-...ey-please/ 12:32
timotimo lizmat: where is the nwc change from? was that for mvm_malloc_trim_would_be_better_after_... branch? 12:52
nwc10 yes 13:00
timotimo that's more about optimizing deallocation i thought - just from the name 13:04
timotimo it's not actually important :D 13:16
lizmat timotimo: there wasn't a lot of other core developments this week 13:19
timotimo i'll be sure to mention when my mental state goes brrrrrrrrrrrrrrrrrrrr 13:20
nwc10 I'm reducing the number of open tabs. This was interesting: stackoverflow.com/questions/467210...0#46722180 -- Another consideration is that while a CPU may have byte load and store instructions, a compiler isn't required to use them. A compiler, for example, could still generate the code Stroustrup describes, loading both b and c using a single word load instruction as an optimization. 13:31
which would be like sparc gcc using a 64 bit load for two 32 bit struct members in a struct that it knows has to be 8 byte aligned 13:32
(and going SIGBUS when the program/programmer is naughty and tried to cheat on alignment)
so I think here it won't matter, as the two (racing) flag bit swaps are in two functions. So won't be optimisable 13:33
oh, and one is inside a region with a mutex
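For context on the compiler side of this: under the C11 memory model, adjacent non-zero-width bit-fields form a single memory location, while separate scalar members are separate memory locations, so a conforming compiler must not let a store to one member rewrite its neighbour. A minimal sketch with hypothetical `Demo*` types:

```c
#include <stdint.h>
#include <stddef.h>

/* Adjacent bit-fields: ONE memory location. Two threads writing a and b
 * concurrently would be a data race. */
struct DemoPacked {
    unsigned int a : 4;
    unsigned int b : 4;
};

/* Two byte-sized members: TWO memory locations. A thread that only writes a
 * and a thread that only writes b do not race with each other. */
struct DemoSplit {
    uint8_t a;
    uint8_t b;
};

/* Each helper touches exactly one member, as the two racing functions would. */
static void demo_set_a(struct DemoSplit *s, uint8_t bit) { s->a |= bit; }
static void demo_set_b(struct DemoSplit *s, uint8_t bit) { s->b |= bit; }
```

Reads may still be widened by the compiler; it is the stores that must stay confined to their own member.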
nwc10 sigh. the downside of continuing to use Perl 5 as a bootstrapping language is that one is still exposed to distros such as Fedora that like shipping a stripped down "perl" 19:31
aargh, our alignment is still fragile, 5 years later... 20:04
causing MVMCollectable to be the "wrong" size breaks arm, but not i386 20:05
anyway, at least I found my awkward bug.
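One way to catch this kind of size fragility at compile time is a static assertion on the header size. The layout below is a hypothetical stand-in, not the real MVMCollectable, and the explanation is an assumption: on i386 a 64-bit integer inside a struct typically needs only 4-byte alignment, while 32-bit ARM EABI requires 8, so a header whose size is not a multiple of 8 can go unnoticed on one and break on the other.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header, NOT the real MVMCollectable layout. */
typedef struct {
    uint8_t  flags1;
    uint8_t  flags2;
    uint16_t size;
    uint32_t owner;
} DemoHeader;

/* A body that follows the header and contains a 64-bit value. */
typedef struct {
    DemoHeader header;
    int64_t    value;
} DemoBoxedInt;

/* Fails to compile if someone changes the header so that bodies placed
 * directly after it would no longer start on an 8-byte boundary. */
_Static_assert(sizeof(DemoHeader) % 8 == 0,
               "DemoHeader size must stay a multiple of 8");
```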
Geth MoarVM/flags-split: 7cb2332fe2 | (Nicholas Clark)++ | 23 files
Split `flags` in struct MVMCollectable into `flags1` and `flags2`.

This permits us to avoid a data race when one thread needs to set MVM_CF_HAS_OBJECT_ID and a second needs to set MVM_CF_REF_FROM_GEN2.
For now, don't change the flag bit values used in the two new members.
20:53
MoarVM/flags-split: 7728b30187 | (Nicholas Clark)++ | src/6model/6model.h
Shrink `flags1` and `flags2` in struct MVMCollectable down to MVMuint8.

Change the values of the MVM_CF_* flags to sit in the range 1-128.
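Once each member is a single byte, every flag stored in it has to be a single bit in the range 1-128. A small sketch of that constraint follows; the `DEMO_CF_*` values are hypothetical, since the log does not show the real MVM_CF_* assignments or which member each flag lives in.

```c
/* Hypothetical values for illustration only. */
enum {
    DEMO_CF_HAS_OBJECT_ID = 1,    /* assumed to live in flags1 */
    DEMO_CF_REF_FROM_GEN2 = 128   /* assumed to live in flags2 */
};

/* True if v is a single bit that fits in an 8-bit member,
 * i.e. one of 1, 2, 4, 8, 16, 32, 64, 128. */
static int demo_fits_flag_byte(unsigned v) {
    return v != 0 && v <= 128 && (v & (v - 1)) == 0;
}
```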
nwc10 there will be a proper pull request tomorrow. But there might be some fun, as I found that I had to do this: paste.scsys.co.uk/592401 20:56
because it turns out that the ops p6setfirstflag and p6takefirstflag (ab)use `flags` in MVMThreadContext to set a "private" flag bit
obviously not "pick a number and hope that it works", but it isn't "officially" allocated or any sort of documented acceptable API being used here 20:58
and I realised that I'm a bit surprised that this isn't all being done with attributes attached to the MVMObject
because surely *that* would be easier to spesh and JIT
(fixing that isn't a blocker on this - I can see how to do the rakudo stuff with C #define ugliness) 21:03
jnthn The C extops in Rakudo are meant to be going away (though they're making a slow job of doing so :)) 21:06
So ugliness is tolerable in that it should be at least somewhat temporary
nwc10 the ugliness plan was
1) first patch rakudo to do something like that diff, but as #if/#else/#endif 21:07
2) change MoarVM
(With a #define to trigger the #else side of rakudo)
3) remove the #if...#else part and the #endif
4) clean up MoarVM to remove the #define
and then 21:08
5) some point later the rakudo folks get rid of the C
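The plan above can be sketched with the preprocessor as follows. Every name here is hypothetical (`DEMO_MVM_FLAGS_SPLIT`, `DEMO_FIRST_FLAG`, the struct and the helpers), and which member would hold the private bit is also an assumption; the real diff is in the paste linked earlier.

```c
#include <stdint.h>

/* Step 2: the changed MoarVM header would define a marker like this.
 * Step 4 later removes it again. */
#define DEMO_MVM_FLAGS_SPLIT 1

typedef struct {
#ifdef DEMO_MVM_FLAGS_SPLIT
    uint8_t  flags1;   /* new layout */
    uint8_t  flags2;
#else
    uint16_t flags;    /* old layout */
#endif
} DemoCollectable;

#define DEMO_FIRST_FLAG 0x40   /* hypothetical "private" bit */

/* Step 1: the Rakudo-side code builds against either layout.
 * Step 3 deletes the #if/#else/#endif and keeps only the new branch. */
static void demo_set_first_flag(DemoCollectable *c) {
#ifdef DEMO_MVM_FLAGS_SPLIT
    c->flags1 |= DEMO_FIRST_FLAG;
#else
    c->flags |= DEMO_FIRST_FLAG;
#endif
}

static int demo_has_first_flag(const DemoCollectable *c) {
#ifdef DEMO_MVM_FLAGS_SPLIT
    return (c->flags1 & DEMO_FIRST_FLAG) != 0;
#else
    return (c->flags & DEMO_FIRST_FLAG) != 0;
#endif
}
```

The point of the ordering is that Rakudo keeps building against both old and new MoarVM at every step.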
that branch works but I've not yet tried to provoke the race condition
or built that branch with tsan 21:09
and it "has" to work for any platform conformant with the C++11 memory model. (If I understand all the jargon correctly)
and to a CPU with 64 byte cache lines - modifying 1 byte isn't massively different from modifying just 8.
that was the best explanation of all this stuff.
the cache management hardware has to be (sort of) doing atomic read-modify-swap at the cache line level, whether you are writing 1 byte or 8. (or any other supported size) 21:10