| timotimo | now that unicode sequence names are case insensitive we can encode them with the same table we've used for the unicode names themselves | 00:08 | |
| encode/compress | 00:09 | ||
| though the shift-1 table from the names is probably almost 100% worthless for the sequences | |||
| samcv | heh | 00:24 | |
| not totally | |||
| not for the non-emoji ones | |||
| timotimo | how many of those exist? | 00:25 | |
| samcv | www.unicode.org/Public/UCD/latest/u...uences.txt | ||
| like 200? 300? | |||
| lots of them have LETTER in it | |||
| here is the gist gist.github.com/samcv/a5a8bc715d39...8e8483597c with the shift's | 00:26 | ||
| timotimo | oh, and all these syllables | 00:28 | |
| OK, cool | |||
| and also WITH | |||
| samcv | yea | 00:30 | |
| oh btw | 00:31 | ||
| here are the provisional ones www.unicode.org/Public/UCD/latest/u...esProv.txt | |||
| compare | |||
| geekosaur | that kind of thing was why I poked at compressing by words; there's a *lot* of reuse | ||
| samcv | KEYCAP NUMBER SIGN, to the emoji listing | ||
| keycap: # | |||
| is what it is in the emoji | |||
| m: say "\c[keycap: *]" | |||
| camelia | rakudo-moar d06d7c: OUTPUTĀ«*ļøā£ā¤Ā» | ||
| samcv | m: FE0F.uniname.say | 00:32 | |
| camelia | rakudo-moar d06d7c: OUTPUTĀ«===SORRY!=== Error while compiling <tmp>ā¤Undeclared name:⤠FE0F used at line 1ā¤ā¤Ā» | ||
| samcv | m: 0xFE0F.uniname.say | ||
| camelia | rakudo-moar d06d7c: OUTPUTĀ«VARIATION SELECTOR-16ā¤Ā» | ||
| timotimo | that's a really short list | 00:34 | |
| samcv | the provisional is | ||
| timotimo | yup | ||
| m: say uniname 0xFE0F | |||
| camelia | rakudo-moar d06d7c: OUTPUTĀ«VARIATION SELECTOR-16ā¤Ā» | ||
| samcv | but those are www.unicode.org/Public/emoji/4.0/em...uences.txt here | ||
| timotimo | ah | ||
| that's more like it | |||
| samcv | keycap: 2, vs KEYCAP DIGIT TWO | 00:35 | |
| seems fine for us to support both | |||
| timotimo | yeah | ||
| i'm going to bed early today | |||
| seeya! | |||
| samcv | as long as the numbers are allowed otherwise can escape them | ||
| bye! | |||
| timotimo | o/ | ||
| wait a minute ... we can't represent * or # in our base40 yet :o | 00:36 | ||
| another shift level, then? | |||
| samcv | hah | ||
| or just don't use shift level for it and store as normal string | |||
| there's also non-ascii characters | 00:37 | ||
| timotimo | if we can signal up front that we want a normal string there | ||
| oh, right, you mentioned that | |||
| samcv | i mean they're gonna be stored in a separate structure | ||
| timotimo | 1F1E8 1F1EE ; Emoji_Flag_Sequence ; CĆ“te dāIvoire # 6.0 [1] (šØš®) | ||
| samcv | the sequences anyway | ||
| timotimo | and also Ć land Islands | ||
| samcv | utf-8 encoded between "" works fine in C yes? | 00:38 | |
| it just literally takes the data put there | |||
| into the char * | |||
| timotimo | i've never tried :) | ||
| samcv | i'm pretty sure that's true | ||
| would make less sense it it wasn't | |||
| timotimo | stackoverflow would know. also, we want to be compatible to a whole bunch of platforms and their compilers | ||
| to be fair, that's gcc on almost all platforms, otherwise clang or msvc | 00:39 | ||
| samcv | yeh it works perfect :) just tried with š§ | ||
| well i think C just puts whatever the fuck data you have in between | |||
| so you can put really anything that uses 8 bits per thing | |||
| timotimo | oh | ||
| samcv | as long as it fits in 8 bits | ||
| timotimo | we could use strings for our 16bit pieces, then :P | 00:40 | |
| samcv | since it's totally naive | ||
| heh | |||
| timotimo | except it'll probably be confused by \0 in the middle, and it'll put a \0 at the end, too | ||
| samcv | madness | ||
| timotimo | anyway, thinking about this is what costs me a lot of sleep | ||
| so i'll just drop it :) | |||
| samcv | it's not stored literally as a \0 though | ||
| only in the table | |||
| and that's as a char | |||
| hahah! | |||
| sleep well | |||
| timotimo | i'll try | ||
| have a good day :) | 00:41 | ||
| MasterDuke | samcv: in radix(), for every char MVM_string_get_grapheme_at_nocheck() is called twice, which switches on the string's body.storage_type. so i pulled that check out and have a different while loop for every type | 01:43 | |
| samcv | nice | 01:44 | |
| MasterDuke | so it's definitely more efficient by work being done, but it makes the function about 3 times as big code-wise | ||
| which is why i'm guessing it isn't all that much faster | |||
| samcv | hm | 01:45 | |
| link to code? | |||
| MasterDuke | diff: gist.github.com/MasterDuke17/58754...2751650bcd | 01:47 | |
|
01:50
pyrimidine joined
|
|||
| samcv | MasterDuke, would a list of unicode codepoints which have numeric values speed things up | 02:02 | |
| for the non-ascii side | 02:03 | ||
| like if we had a switch | |||
| actually the switch could have the ascii digits too | |||
| then just go the length of the string and pass value to the switch of the grapheme | 02:04 | ||
| and the compiler can optimize it | |||
|
02:22
pyrimidine joined
|
|||
| MasterDuke | samcv: maybe | 03:29 | |
|
03:33
pyrimidine joined
|
|||
| samcv | MasterDuke, yeah i think that may be the fastest way to do it | 06:49 | |
|
07:04
pyrimidine joined
07:11
domidumont joined
07:18
domidumont joined
07:51
pyrimidine joined
08:22
domidumont joined
08:23
domidumont joined
08:44
zakharyas joined
08:55
pyrimidine joined
09:28
pyrimidine joined
09:42
Ven joined
10:28
pyrimidine joined
10:53
pyrimidine joined
11:20
ilmari[m] joined
11:43
pyrimidine joined
12:16
pyrimidine joined
12:35
Ven joined
13:41
pyrimidine joined
13:44
pyrimidi_ joined
|
|||
| samcv | .ask timotimo so I have added some indexes to names.c for every other codepoint | 13:54 | |
| yoleaux2 | samcv: I'll pass your message to timotimo. | ||
| samcv | see here: gist.github.com/samcv/ebbcf638b925...4dde1a510f | 13:55 | |
| also we don't store any codepoints without names anymore | |||
| we call uninames with a codepoint number, if it returns 0 then uninames has generated the codepoint's name such as <control-0000>, otherwise it returns an offset, which is the offset for the total number of unicode names (not an offset for the codeme array per se) | 13:57 | ||
| the name_index array holds the location in the codeme array of every other codepoint which HAS a name in the codeme table | 13:58 | ||
| so the 0th item in the codeme table, is at 0, the 2nd one is at 4 | |||
|
14:30
pyrimidine joined
14:33
Ven joined
14:47
pyrimidine joined
15:01
pyrimidine joined
15:16
Ven joined
15:41
pyrimidine joined
|
|||
| samcv | i gotta go to bed, but I updated the gist, and the repo so it should be basically ready for you to see if you can get working | 15:42 | |
|
15:45
pyrimidine joined
15:58
domidumont joined
16:50
brrt joined
|
|||
| brrt | good *, #moarvm | 16:50 | |
| brokenchicken | \o | 16:51 | |
| jnthn | o/ | 16:52 | |
| brrt | \o | 16:54 | |
| so⦠what i wanted to report about | |||
| is that explaining register allocation is actually really, really hard | |||
| the processs itself isn't, but it requires a lot of assumptions and background about how compilation works in the first place | 16:55 | ||
| soā¦. i'm meaning to write a blog post about having implemented linear scan, but i find myself unable to explain what i did | |||
| this bothers me | |||
| brokenchicken | :) | 17:00 | |
| brrt | maybe i can try here and if it works, i can transform it into a post | ||
| basically; register allocation is all about ensuring that the right values are in the right registers in time for use by the instructions of the program | 17:01 | ||
| there's two trivial ways to do that | 17:04 | ||
| one is to compute and store every value ever in a register | |||
| that is of course not correct for larger programs because the set of available registers is pretty small | 17:05 | ||
| the second is to store-and-load every value ever from memory just before they are used | |||
| that's correct (there is usually enough memory available to do this) but it's very much suboptimal, because a): your bytecode size grows by a bunch for all the load-and-store instructions, and b): memory operations are pretty slow if cached, and disastrously slow if not | 17:06 | ||
| so what you want to do is to find the solution wherein you minimize the number of memory traffic required | |||
| s/number/volume/ | |||
| so the first thing you need to know, is which values are necessary when | 17:07 | ||
| and you typicallly have to take flow control into account with that | |||
| i.e. you need to 'deliver' the value that was computing using whatever flow control the programmer has designed | 17:08 | ||
| one of the solutions is to divide the *variables* into a set that you'll store in registesr, and a set that you'll keep in memory | 17:09 | ||
| but that's not very cool either, since it's well possible that different control flow paths use different sets of variables; you might end up having to 'sacrifice' one path for the other | 17:10 | ||
| so the solution to that is to stop looking at variables but just at the values used | 17:11 | ||
| that's achieved by the SSA transformation | |||
| SSA = single static assignment, and it transforms every variable assignment into a declaration-and-assignment of a new variable; it makes variables immutable | 17:12 | ||
| or in other words, it replaces variables with their values | |||
| the expression JIT already does this | |||
| so to find out where a register is required for a value analyze the 'liveness' of program values; the result of this analysis is a 'live range' for every value used | 17:23 | ||
| a live range is basically the set of all instructions (or in our case, things-that-represent-instructions) that either have the value as output (definitions) or as input (uses) | 17:25 | ||
| that set can contain more than one definition - contrary to my earlier statement! - because definitions can be connected using PHI nodes | |||
| and a PHI node is basically something that describes a value defined in two or more different control flow paths | 17:26 | ||
| nine | .oO(Memory traffic volume depends not only on how often in the code a memory slot is accessed but also how often that code is run (e.g. loops). So it's gonna be a trade off between local and global optimization) |
17:27 | |
| brrt | that's exactly right :-) | ||
| basically, you have to compute the expected cost of spilling | |||
| decommuting & | |||
| nine | .tell brrt so as we're talking about a JIT, do you actually have some information about runtime behavior accessible in the register allocator? I.e. does it have an idea how often loops may run? | 17:29 | |
| yoleaux2 | nine: I'll pass your message to brrt. | ||
|
18:00
pyrimidine joined
18:50
ggoebel joined
19:18
zakharyas joined
19:23
dogbert17 joined
19:26
ggoebel joined
20:29
pyrimidine joined
21:19
pyrimidine joined
21:49
domidumont joined
22:08
pyrimidi_ joined
22:18
Ven joined
22:26
ggoebel joined
|
|||
| MasterDuke | samcv: turns out replacing this ( github.com/MoarVM/MoarVM/blob/mast...#L375-L392 ) with a giant switch statement is 25% slower | 22:33 | |
|
22:33
Ven joined
|
|||
| MasterDuke | for 1m iterations of nqp::radix() it resulted in 300m more instructions, 250m more branches, 8m more missed branches | 22:35 | |