timotimo | now that unicode sequence names are case insensitive we can encode them with the same table we've used for the unicode names themselves | 00:08 | |
encode/compress | 00:09 | ||
though the shift-1 table from the names is probably almost 100% worthless for the sequences | |||
samcv | heh | 00:24 | |
not totally | |||
not for the non-emoji ones | |||
timotimo | how many of those exist? | 00:25 | |
samcv | www.unicode.org/Public/UCD/latest/u...uences.txt | ||
like 200? 300? | |||
lots of them have LETTER in it | |||
here is the gist gist.github.com/samcv/a5a8bc715d39...8e8483597c with the shift's | 00:26 | ||
timotimo | oh, and all these syllables | 00:28 | |
OK, cool | |||
and also WITH | |||
samcv | yea | 00:30 | |
oh btw | 00:31 | ||
here are the provisional ones www.unicode.org/Public/UCD/latest/u...esProv.txt | |||
compare | |||
geekosaur | that kind of thing was why I poked at compressing by words; there's a *lot* of reuse | ||
samcv | KEYCAP NUMBER SIGN, to the emoji listing | ||
keycap: # | |||
is what it is in the emoji | |||
m: say "\c[keycap: *]" | |||
camelia | rakudo-moar d06d7c: OUTPUTĀ«*ļøā£ā¤Ā» | ||
samcv | m: FE0F.uniname.say | 00:32 | |
camelia | rakudo-moar d06d7c: OUTPUTĀ«===SORRY!=== Error while compiling <tmp>ā¤Undeclared name:ā¤ FE0F used at line 1ā¤ā¤Ā» | ||
samcv | m: 0xFE0F.uniname.say | ||
camelia | rakudo-moar d06d7c: OUTPUTĀ«VARIATION SELECTOR-16ā¤Ā» | ||
timotimo | that's a really short list | 00:34 | |
samcv | the provisional is | ||
timotimo | yup | ||
m: say uniname 0xFE0F | |||
camelia | rakudo-moar d06d7c: OUTPUTĀ«VARIATION SELECTOR-16ā¤Ā» | ||
samcv | but those are www.unicode.org/Public/emoji/4.0/em...uences.txt here | ||
timotimo | ah | ||
that's more like it | |||
samcv | keycap: 2, vs KEYCAP DIGIT TWO | 00:35 | |
seems fine for us to support both | |||
timotimo | yeah | ||
i'm going to bed early today | |||
seeya! | |||
samcv | as long as the numbers are allowed otherwise can escape them | ||
bye! | |||
timotimo | o/ | ||
wait a minute ... we can't represent * or # in our base40 yet :o | 00:36 | ||
another shift level, then? | |||
samcv | hah | ||
or just don't use shift level for it and store as normal string | |||
there's also non-ascii characters | 00:37 | ||
timotimo | if we can signal up front that we want a normal string there | ||
oh, right, you mentioned that | |||
samcv | i mean they're gonna be stored in a separate structure | ||
timotimo | 1F1E8 1F1EE ; Emoji_Flag_Sequence ; CĆ“te dāIvoire # 6.0 [1] (šØš®) | ||
samcv | the sequences anyway | ||
timotimo | and also Ć land Islands | ||
samcv | utf-8 encoded between "" works fine in C yes? | 00:38 | |
it just literally takes the data put there | |||
into the char * | |||
timotimo | i've never tried :) | ||
samcv | i'm pretty sure that's true | ||
would make less sense it it wasn't | |||
timotimo | stackoverflow would know. also, we want to be compatible to a whole bunch of platforms and their compilers | ||
to be fair, that's gcc on almost all platforms, otherwise clang or msvc | 00:39 | ||
samcv | yeh it works perfect :) just tried with š§ | ||
well i think C just puts whatever the fuck data you have in between | |||
so you can put really anything that uses 8 bits per thing | |||
timotimo | oh | ||
samcv | as long as it fits in 8 bits | ||
timotimo | we could use strings for our 16bit pieces, then :P | 00:40 | |
samcv | since it's totally naive | ||
heh | |||
timotimo | except it'll probably be confused by \0 in the middle, and it'll put a \0 at the end, too | ||
samcv | madness | ||
timotimo | anyway, thinking about this is what costs me a lot of sleep | ||
so i'll just drop it :) | |||
samcv | it's not stored literally as a \0 though | ||
only in the table | |||
and that's as a char | |||
hahah! | |||
sleep well | |||
timotimo | i'll try | ||
have a good day :) | 00:41 | ||
MasterDuke | samcv: in radix(), for every char MVM_string_get_grapheme_at_nocheck() is called twice, which switches on the string's body.storage_type. so i pulled that check out and have a different while loop for every type | 01:43 | |
samcv | nice | 01:44 | |
MasterDuke | so it's definitely more efficient by work being done, but it makes the function about 3 times as big code-wise | ||
which is why i'm guessing it isn't all that much faster | |||
samcv | hm | 01:45 | |
link to code? | |||
MasterDuke | diff: gist.github.com/MasterDuke17/58754...2751650bcd | 01:47 | |
01:50
pyrimidine joined
|
|||
samcv | MasterDuke, would a list of unicode codepoints which have numeric values speed things up | 02:02 | |
for the non-ascii side | 02:03 | ||
like if we had a switch | |||
actually the switch could have the ascii digits too | |||
then just go the length of the string and pass value to the switch of the grapheme | 02:04 | ||
and the compiler can optimize it | |||
02:22
pyrimidine joined
|
|||
MasterDuke | samcv: maybe | 03:29 | |
03:33
pyrimidine joined
|
|||
samcv | MasterDuke, yeah i think that may be the fastest way to do it | 06:49 | |
07:04
pyrimidine joined
07:11
domidumont joined
07:18
domidumont joined
07:51
pyrimidine joined
08:22
domidumont joined
08:23
domidumont joined
08:44
zakharyas joined
08:55
pyrimidine joined
09:28
pyrimidine joined
09:42
Ven joined
10:28
pyrimidine joined
10:53
pyrimidine joined
11:20
ilmari[m] joined
11:43
pyrimidine joined
12:16
pyrimidine joined
12:35
Ven joined
13:41
pyrimidine joined
13:44
pyrimidi_ joined
|
|||
samcv | .ask timotimo so I have added some indexes to names.c for every other codepoint | 13:54 | |
yoleaux2 | samcv: I'll pass your message to timotimo. | ||
samcv | see here: gist.github.com/samcv/ebbcf638b925...4dde1a510f | 13:55 | |
also we don't store any codepoints without names anymore | |||
we call uninames with a codepoint number, if it returns 0 then uninames has generated the codepoint's name such as <control-0000>, otherwise it returns an offset, which is the offset for the total number of unicode names (not an offset for the codeme array per se) | 13:57 | ||
the name_index array holds the location in the codeme array of every other codepoint which HAS a name in the codeme table | 13:58 | ||
so the 0th item in the codeme table, is at 0, the 2nd one is at 4 | |||
14:30
pyrimidine joined
14:33
Ven joined
14:47
pyrimidine joined
15:01
pyrimidine joined
15:16
Ven joined
15:41
pyrimidine joined
|
|||
samcv | i gotta go to bed, but I updated the gist, and the repo so it should be basically ready for you to see if you can get working | 15:42 | |
15:45
pyrimidine joined
15:58
domidumont joined
16:50
brrt joined
|
|||
brrt | good *, #moarvm | 16:50 | |
brokenchicken | \o | 16:51 | |
jnthn | o/ | 16:52 | |
brrt | \o | 16:54 | |
soā¦ what i wanted to report about | |||
is that explaining register allocation is actually really, really hard | |||
the processs itself isn't, but it requires a lot of assumptions and background about how compilation works in the first place | 16:55 | ||
soā¦. i'm meaning to write a blog post about having implemented linear scan, but i find myself unable to explain what i did | |||
this bothers me | |||
brokenchicken | :) | 17:00 | |
brrt | maybe i can try here and if it works, i can transform it into a post | ||
basically; register allocation is all about ensuring that the right values are in the right registers in time for use by the instructions of the program | 17:01 | ||
there's two trivial ways to do that | 17:04 | ||
one is to compute and store every value ever in a register | |||
that is of course not correct for larger programs because the set of available registers is pretty small | 17:05 | ||
the second is to store-and-load every value ever from memory just before they are used | |||
that's correct (there is usually enough memory available to do this) but it's very much suboptimal, because a): your bytecode size grows by a bunch for all the load-and-store instructions, and b): memory operations are pretty slow if cached, and disastrously slow if not | 17:06 | ||
so what you want to do is to find the solution wherein you minimize the number of memory traffic required | |||
s/number/volume/ | |||
so the first thing you need to know, is which values are necessary when | 17:07 | ||
and you typicallly have to take flow control into account with that | |||
i.e. you need to 'deliver' the value that was computing using whatever flow control the programmer has designed | 17:08 | ||
one of the solutions is to divide the *variables* into a set that you'll store in registesr, and a set that you'll keep in memory | 17:09 | ||
but that's not very cool either, since it's well possible that different control flow paths use different sets of variables; you might end up having to 'sacrifice' one path for the other | 17:10 | ||
so the solution to that is to stop looking at variables but just at the values used | 17:11 | ||
that's achieved by the SSA transformation | |||
SSA = single static assignment, and it transforms every variable assignment into a declaration-and-assignment of a new variable; it makes variables immutable | 17:12 | ||
or in other words, it replaces variables with their values | |||
the expression JIT already does this | |||
so to find out where a register is required for a value analyze the 'liveness' of program values; the result of this analysis is a 'live range' for every value used | 17:23 | ||
a live range is basically the set of all instructions (or in our case, things-that-represent-instructions) that either have the value as output (definitions) or as input (uses) | 17:25 | ||
that set can contain more than one definition - contrary to my earlier statement! - because definitions can be connected using PHI nodes | |||
and a PHI node is basically something that describes a value defined in two or more different control flow paths | 17:26 | ||
nine | .oO(Memory traffic volume depends not only on how often in the code a memory slot is accessed but also how often that code is run (e.g. loops). So it's gonna be a trade off between local and global optimization) |
17:27 | |
brrt | that's exactly right :-) | ||
basically, you have to compute the expected cost of spilling | |||
decommuting & | |||
nine | .tell brrt so as we're talking about a JIT, do you actually have some information about runtime behavior accessible in the register allocator? I.e. does it have an idea how often loops may run? | 17:29 | |
yoleaux2 | nine: I'll pass your message to brrt. | ||
18:00
pyrimidine joined
18:50
ggoebel joined
19:18
zakharyas joined
19:23
dogbert17 joined
19:26
ggoebel joined
20:29
pyrimidine joined
21:19
pyrimidine joined
21:49
domidumont joined
22:08
pyrimidi_ joined
22:18
Ven joined
22:26
ggoebel joined
|
|||
MasterDuke | samcv: turns out replacing this ( github.com/MoarVM/MoarVM/blob/mast...#L375-L392 ) with a giant switch statement is 25% slower | 22:33 | |
22:33
Ven joined
|
|||
MasterDuke | for 1m iterations of nqp::radix() it resulted in 300m more instructions, 250m more branches, 8m more missed branches | 22:35 |