timotimo also, a tiny bit of compression would do wonders to our .moarvm files, since we decode that part anyway 00:01
like, a quite big chunk of the serialized blob is just AAA due to 64bit numbers that are 0 00:03
or "small"
jnthn We shouldn't be storing serialization stuff in .moarvm any more 00:24
As in, as strings we should not
Should look into that
I wish people would remember that if you compress, you can't mmap.
As soon as your 2 processes in, you're a 50% reduction ahead. 00:25
And s/your/you're/, but grammar is optional after drinking an eviltwin. : 00:26
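A back-of-the-envelope reading of the mmap remark above, as a sketch rather than a measurement. It assumes that read-only mmapped pages are shared between processes, while a decompressed (or base64-decoded) copy is private to each process. The function names and the 10 MiB size are hypothetical.

```python
def resident_bytes(file_size: int, processes: int, mmapped: bool) -> int:
    """Toy model of memory held for one bytecode file across several processes."""
    if mmapped:
        # Read-only mapped pages are backed by the same physical memory.
        return file_size
    # Each process keeps its own decoded copy.
    return file_size * processes


def mmap_saving(processes: int, file_size: int = 10 * 1024 * 1024) -> float:
    """Fraction of memory saved by sharing pages instead of private copies."""
    shared = resident_bytes(file_size, processes, mmapped=True)
    private = resident_bytes(file_size, processes, mmapped=False)
    return 1 - shared / private


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, "process(es): saving", mmap_saving(n))
```

Under this model the saving is zero for one process and grows as more processes load the same file. That is why compressing the whole file, which prevents mapping it, can cost more than it gains.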
As for recording where code was when a GC occurred - probably not useful. You just need to modify a tiny thing to make that happen at a different point. Or another thread can affect it. Maybe over a program that runs for a very long time you can get something interesting statistically, but the existing allocation profile already gives you data like that. 00:28
Put another way, GC is amortized over all allocations, so which one triggers it is not really so interesting. 00:29
timotimo OK, fair enough 01:13
my point about the base64 thing is: we can't mmap it anyway because it's base64 and we turn it into regular data right away 01:14
and going from base64 to regular binary data will get us from 4/3 down to 1
and since we are already base64-decoding before we read stuff from the serialized blob, we could just as well replace base64 with some compression scheme 01:25
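A small sketch of the two observations above: base64 inflates binary data by a factor of 4/3, and 64-bit integers that are zero or small show up as long runs of "A". The payload here is made up for illustration and is not taken from a real .moarvm file.

```python
import base64
import struct

# Hypothetical payload: six little-endian 64-bit integers, mostly zero or small.
values = [0, 0, 1, 0, 42, 0]
raw = struct.pack("<6q", *values)            # 6 * 8 = 48 bytes of binary data
encoded = base64.b64encode(raw).decode("ascii")

if __name__ == "__main__":
    print(len(raw), "raw bytes ->", len(encoded), "base64 characters")
    print(encoded)
    print("share of 'A' characters:", encoded.count("A") / len(encoded))
```

Each base64 character carries 6 bits, so every 3 input bytes become 4 output characters. A group of six zero bits maps to "A", which is why zero-heavy data looks like a wall of that letter.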
01:44 Util joined 02:45 JimmyZ joined
JimmyZ_ timotimo: I think jnthn++ meant that compressing the whole .moarvm file, like jar does, blocks mmap 02:46
and you meant compressing only the string part 02:47
timotimo yes, only the serialisation blobs 02:55
JimmyZ_ timotimo: + 1 to it :) 02:57
timotimo what kind of compression library can we use without adding a dependency that our users may not have or want? 02:59
I think RLE is not good enough
JimmyZ_ pack ? 03:00
as in, just write the strings as binary strings? 03:01
timotimo well yeah 03:04
we would put the serialised blob somewhere where binary data can live
but on top of that, having small numbers represented with fewer bytes may be worth something 03:05
JimmyZ_ that is, out of .moarvm files?
timotimo maybe just encoding every 64 bit chunk as a varint
no, inside these files
JimmyZ_ jnthn++ said 'We shouldn't be storing serialization stuff in .moarvm any more' 03:06
timotimo like, the bytecode section already contains binary data
JimmyZ_ I couldn't follow him
timotimo he corrected himself
he meant we should not use the string heap
JimmyZ_ so why? 03:07
timotimo so why what?
JimmyZ_ why shouldn't we store strings in .moarvm? 03:08
timotimo no no no
we should not store serialised objects in the strings 03:09
but strings will stay where they are
JimmyZ_ oh
timotimo :)
JimmyZ_ that is, replace base64 code with binary data in Strings heap? 03:10
timotimo I mean there is nothing wrong with storing the serialised blob as "packed" data 03:11
JimmyZ_ ok
timotimo but there are so many null bytes
we should store the serialised blob somewhere in the moarvm file that is not the string heap
make a new section for serialised blobs 03:12
JimmyZ_ fair enough
timotimo I am very tired and should go to bed soon
JimmyZ_ and still use base64 ?
good night
timotimo maybe you want to build a simple compression by using write varint 03:13
no, get rid of base64
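A minimal sketch of the varint idea discussed above, assuming an unsigned LEB128-style encoding (7 payload bits per byte, with the high bit set when more bytes follow). This is a generic technique chosen for illustration, not MoarVM's actual serialization format; the names `encode_varint` and `decode_varint` are hypothetical.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer using 7 bits per byte, low bits first."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)          # last byte, high bit clear
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# Hypothetical values that would each take 8 bytes as fixed 64-bit integers.
values = [0, 0, 1, 0, 42, 0, 300, 2**40]
packed = b"".join(encode_varint(v) for v in values)

if __name__ == "__main__":
    print(8 * len(values), "fixed-width bytes ->", len(packed), "varint bytes")
```

Zero and other small values take a single byte each, so the many zero-valued 64-bit fields shrink by a factor of eight with no external compression library. Negative values would need an extra step such as zigzag mapping, which this sketch leaves out.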
dalek MoarVM: a533a69 | jimmy++ | src/ (2 files):
Small fixes.
MoarVM: 228e07e | jimmy++ | / (5 files):
rename nodes_moar.h to nodes.h, since there is no other nodes_*.h files.
07:46 JimmyZ_ joined
dalek MoarVM: 8fa0c7d | TimToady++ | src/6model/reprs/NFA. (2 files):
detect useless SUBRULE edges
09:11 JimmyZ_ joined 09:13 lizmat joined 09:22 woolfy joined 09:32 JimmyZ_ joined 09:36 JimmyZ_ joined 10:11 brrt joined 10:40 brrt left 11:51 JimmyZ_ joined
JimmyZ_ timotimo: about base64, I couldn't figure out how to replace it with varint :( 12:32
timotimo OK, maybe i'll have a look 12:36
the problem is also that we can't just put the data back into the string heap
or at least we shouldn't
12:36 kjs_ joined
timotimo that means we need to have another way to give the serialized blob to the deserialize instruction 12:36
JimmyZ_ why can't we? 12:37
jnthn But...we already *do* have that way 12:38
timotimo can the string heap store all-binary data without trouble?
jnthn That's why I can't figure out why you're seeing base64 strings.
The string heap stores strings in utf-8
The serialized data has its own binary section of the bytecode file 12:39
timotimo oh?
i must be speaking of something else, then
jnthn And has for ages
Well, we may have had some regression at some point that makes it do the base-64 thing...
timotimo check_and_dissect_input does a base64 decode on its input
and one of the instructions in a moarvm --dump of a module reads 00803 deserialize loc_1_str, loc_12_obj, loc_6_obj, loc_10_obj, loc_13_obj 12:40
where loc_1_str is the base64 encoded blob
JimmyZ_ Yeah, I saw it, I wasn't even sure what base64_encode encodes..
oh, it encodes serialized things 12:42
not strings 12:43
timotimo yes 12:44
that's what i was trying to explain %)
i was saying we store a serialized blob on the string heap
jnthn Yes, in all .moarvm files, or just a handful? 12:45
timotimo i shall see. 12:46
jnthn OK. I gotta head to the station fairly soon, but will be online on the train. :) 12:47
timotimo cairo has it
HTTP::Server::Async has a null string that it passes to deserialize 12:48
same with ::Request 12:49
could be about repossession conflicts, actually
SDL2::Raw has a base64 blob 12:50
zavolaj has a null string, too 12:51
12:54 colomon joined 13:53 ggoebel111111114 joined 14:53 zakharyas joined 15:03 kjs_ joined 15:42 vendethiel joined 15:51 kjs_ joined 16:25 woolfy left 16:30 zakharyas joined 17:52 FROGGS__ joined 18:45 zakharyas joined 20:39 FROGGS__ joined 20:47 vendethiel- joined