01:16
geekosaur joined
01:55
ilbot3 joined
|
|||
Geth | MoarVM: samcv++ created pull request #685: Merge collation-arrays branch |
05:02 | |
MoarVM: 088aa0a022 | Skarsnik++ (committed using GitHub Web editor) | src/6model/reprs/CArray.c Fix a leak in CArray repr. When expand is called (via at-pos) it make room for more perl6 object in the body, but this does not get free if the array is not managed by MoarVM. |
07:30 | ||
MoarVM: 1059eed1cc | niner++ (committed using GitHub Web editor) | src/6model/reprs/CArray.c Merge pull request #684 from Skarsnik/patch-1 Fix a leak in CArray repr. |
|||
07:58
domidumont joined
08:03
domidumont joined
|
|||
nine | ugexe: it's taken me years to get this into my head: MVMROOT is less about the value than it is about the variable. It's for having the GC update your local pointer to the object after it has moved the object to the second generation, as the object's address will change. | 08:32 | |
yoleaux | 8 Sep 2017 15:53Z <tbrowder> nine: I just filed issue #101 with Inline::Perl5; failure using Expect::Simple | ||
8 Sep 2017 19:19Z <tbrowder> nine: my version was too old, closing issue 101 | |||
08:52
leont joined
|
|||
Geth | MoarVM: 866623d933 | (Samantha McVey)++ (committed using GitHub Web editor) | 8 files Merge Full Unicode Collation Algorithm Implementation This is a full implementation of the Unicode Collation Algorithm. We iterate by codepoint and put this into a ring buffer. The ring buffers hold the exact number of codepoints which comprise the longest sequence of codepoints which map to its own collation keys in the Unicode Collation Algorithm. As of Unicode 9.0 this number was 3. When Generate-Collation-Data.p6 is run, this number will ... (42 more lines) |
10:18 | |
MoarVM: 0b81969db2 | (Samantha McVey)++ | 2 files Remove unneeded file from UCA implementation This file isn't needed for the generation script. |
10:21 | ||
timotimo | nine: not only will the address change when the object is moved to the second gen, but we also move it from one half of the nursery to the other half at least once | 10:45 | |
nine | Oh, I didn't know that! | 10:57 | |
timotimo | yeah, we have "semispace copying nurseries" or whatever it's called | 10:58 | |
jnthn | Yeah, the nursery is a semispace; we allocate in half of it, then evacuate the objects still alive, but never spotted by GC before, into the other half, and move the ones we did see once before into gen2 | 11:00 | |
So each thread switches between the spaces per GC | |||
This is a really nice scheme in that 1) you get cheap bump-the-pointer allocation, 2) for objects that die right away you do close to zero work | 11:02 | ||
And 1 implies good cache locality also | |||
Though you can't do it for the entire heap, otherwise your memory overhead becomes immediately at least 2 * what the program actually needs | 11:03 | ||
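The scheme jnthn describes can be sketched as a toy in C. This is not MoarVM's collector: `ToyHeap`, `toy_alloc` and `toy_collect` are hypothetical names, the capacities are arbitrary, and the sketch assumes every live object is referenced by exactly one root (no forwarding pointers, no object graph).

```c
#include <assert.h>
#include <stddef.h>

#define HALF_CAP 4   /* objects per nursery half; arbitrary for this sketch */
#define GEN2_CAP 16  /* objects in the toy second generation; also arbitrary */

typedef struct {
    int value;
    int age;  /* 0 = fresh, 1 = survived one GC (still in nursery), 2 = in gen2 */
} ToyObj;

typedef struct {
    ToyObj half[2][HALF_CAP];  /* the two halves of the nursery */
    int    active;             /* which half we currently allocate in */
    size_t used;               /* bump pointer into the active half */
    ToyObj gen2[GEN2_CAP];
    size_t gen2_used;
} ToyHeap;

static void toy_init(ToyHeap *h) {
    h->active = 0;
    h->used = 0;
    h->gen2_used = 0;
}

/* Bump-the-pointer allocation: NULL means the half is full and a GC is due. */
static ToyObj *toy_alloc(ToyHeap *h, int value) {
    if (h->used == HALF_CAP)
        return NULL;
    ToyObj *o = &h->half[h->active][h->used++];
    o->value = value;
    o->age = 0;
    return o;
}

/* Copy each rooted object out of the active half and rewrite the root.
 * Objects nobody roots are never touched: dying young costs nothing. */
static void toy_collect(ToyHeap *h, ToyObj **roots[], size_t nroots) {
    int to = 1 - h->active;
    size_t used = 0;
    for (size_t i = 0; i < nroots; i++) {
        ToyObj *old = *roots[i];
        ToyObj *moved;
        if (old->age == 2)
            continue;                           /* already in gen2, stays put */
        if (old->age == 1) {
            assert(h->gen2_used < GEN2_CAP);
            moved = &h->gen2[h->gen2_used++];   /* seen once before: promote */
        } else {
            moved = &h->half[to][used++];       /* first survival: other half */
        }
        *moved = *old;
        moved->age = old->age + 1;
        *roots[i] = moved;                      /* the address has changed */
    }
    h->active = to;  /* flip the halves */
    h->used = used;
}
```

In this toy, an object that stays rooted changes address twice: once to the other nursery half, once to gen2. Doing the same for a whole heap would mean keeping a second, empty copy of everything, which is the 2x overhead mentioned above.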
11:09
Skarsnik joined
11:44
vendethiel- joined
|
|||
MasterDuke | samcv: your latest commit or two broke the moarvm build for me | 11:46 | |
dir | 11:48 | ||
wrong window | |||
src/strings/unicode.c:78279:43: error: unknown type name ‘sub_node’ | 11:49 | ||
src/strings/unicode.c:78324:24: error: ‘codepoint_sequence_no_max’ undeclared here (not in a function); did you mean ‘codepoints_by_name’? | |||
anyone else getting those? | 11:59 | ||
12:13
brrt joined
|
|||
timotimo | let me look | 12:14 | |
MasterDuke | i did make realclean, but still seeing it | 12:15 | |
timotimo | yeah i get these errors too | ||
this starter_main_elems is a #define in unicode_uca.c | 12:16 | ||
MasterDuke | huh, it builds fine on my laptop | ||
timotimo | easy fix | 12:17 | |
rm unicode.c; make | |||
Skarsnik | what MVMROOT does? | 12:39 | |
timotimo | what about it? | 12:40 | |
Skarsnik | Dunno I am reading CArray.c at_pos function, since I get 2 crash pointing at this | 12:41 | |
well pointing at the NC.pm operator [] for array | 12:42 | ||
timotimo | oh you mean "what does it do" | ||
Skarsnik | Yeah sorry x) | ||
timotimo | it tells the GC "this local variable points at a GC-managed object. i want this object to stay alive and i want you to update the pointer when the object gets moved" | ||
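A toy model of that contract, in C. It only imitates the idea (register the address of a local, let the collector rewrite it, unregister afterwards); `TOY_ROOT`, `toy_root_push`, `toy_gc_move_all` and the two spaces are hypothetical and are not MoarVM's macro or GC.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int payload; } ToyObj;

#define MAX_ROOTS 8
static ToyObj **temp_roots[MAX_ROOTS];  /* addresses of rooted local variables */
static size_t   num_temp_roots = 0;

static ToyObj from_space[MAX_ROOTS];
static ToyObj to_space[MAX_ROOTS];

static void toy_root_push(ToyObj **local) {
    assert(num_temp_roots < MAX_ROOTS);
    temp_roots[num_temp_roots++] = local;
}

static void toy_root_pop(void) {
    assert(num_temp_roots > 0);
    num_temp_roots--;
}

/* Root a local variable for the duration of a block. */
#define TOY_ROOT(var, block) do { \
    toy_root_push(&(var));        \
    block                         \
    toy_root_pop();               \
} while (0)

/* Toy "GC": moves every rooted object and rewrites the rooted variable.
 * The old copy is poisoned so that stale pointers become visible. */
static void toy_gc_move_all(void) {
    for (size_t i = 0; i < num_temp_roots; i++) {
        ToyObj *old = *temp_roots[i];
        to_space[i] = *old;
        old->payload = -1;
        *temp_roots[i] = &to_space[i];
    }
}

/* Reads the same object through a rooted local and an unrooted alias. */
static void toy_demo(int *via_rooted, int *via_stale) {
    ToyObj *obj = &from_space[0];
    obj->payload = 42;
    ToyObj *stale = obj;  /* a second local that is NOT rooted */
    TOY_ROOT(obj, {
        toy_gc_move_all();  /* anything that may allocate could trigger this */
    });
    *via_rooted = obj->payload;    /* obj was updated to the new address */
    *via_stale  = stale->payload;  /* stale still points at the old copy */
}
```

The rooted variable follows the object to its new home, while the unrooted alias keeps pointing at the poisoned old copy. That is the sense in which rooting is about the variable rather than the value.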
Skarsnik | hm, I'm not sure I understand. Does this mark the whole child_objs array and not the new object created in it? github.com/MoarVM/MoarVM/blob/mast...ray.c#L296 | 12:45 | |
timotimo | the storage array is likely handled by gc_mark | 12:47 | |
the object we're calling at_pos on is "root", that gets pointed out to the GC and the GC will then do the rest | |||
but it doesn't look like the storage array is gc-relevant | 12:48 | ||
Skarsnik | storage is handled by C in this part. As I understand it, the child_objs array is what Moar adds to track the perl6 objects it returns | 12:51 | |
timotimo | mhm | ||
oh | 12:55 | ||
the reason why the crash points at the MVMROOT | |||
is because gdb is stupid | |||
it treats the whole macro as a single line | |||
so where the crash happens exactly is hidden | 12:56 | ||
Skarsnik | I was not clear, it does not crash on MVMROOT but the backtrace pointed me at this function x) | 12:57 | |
So I try to understand it x) | |||
timotimo | oh ok | 13:00 | |
13:02
dogbert2 joined
|
|||
dogbert2 | c: MVM_SPESH_NODELAY=1 HEAD use Test; sub try_eval($str) { try EVAL $str }; is(try_eval('my @x = <a b c>; sub y (@z) { @z[0] }; y(@x)'), "a b c", "NO-BREAK SPACE") for ^10; | 14:08 | |
committable6 | dogbert2, ¦HEAD(9b42484): «ok 1 - NO-BREAK SPACE␤ok 2 - NO-BREAK SPACE␤ok 3 - NO-BREAK SPACE␤ok 4 - NO-BREAK SPACE␤ok 5 - NO-BREAK SPACE␤ok 6 - NO-BREAK SPACE␤ok 7 - NO-BREAK SPACE␤ok 8 - NO-BREAK SPACE␤ok 9 - NO-BREAK SPACE␤ok 10 - NO-BREAK SPACE» | ||
dogbert2 | hmm | ||
c: MVM_SPESH_NODELAY=1 HEAD use Test; sub try_eval($str) { try EVAL $str }; is(try_eval('my @x = <a b c>; sub y (@z) { @z[0] }; y(@x)'), "a b c", "NO-BREAK SPACE") for ^10; | 14:09 | ||
committable6 | dogbert2, ¦HEAD(9b42484): «ok 1 - NO-BREAK SPACE␤ok 2 - NO-BREAK SPACE␤ok 3 - NO-BREAK SPACE␤ok 4 - NO-BREAK SPACE␤ok 5 - NO-BREAK SPACE␤ok 6 - NO-BREAK SPACE␤ok 7 - NO-BREAK SPACE␤ok 8 - NO-BREAK SPACE␤ok 9 - NO-BREAK SPACE␤ok 10 - NO-BREAK SPACE» | ||
dogbert2 | this code is part of t/spec/S02-lexical-conventions/unicode-whitespace.t and on my system it behaves very badly when MVM_SPESH_NODELAY=1 | 14:10 | |
lots of test failures, although the first few work. Interestingly, if I add MVM_SPESH_BLOCKING=1 the problem vanishes | 14:12 | ||
for me the output from running the above mentioned file looks like this: gist.github.com/dogbert17/cb708cb3...0ff0a668e5 | 14:21 | ||
14:21
robertle joined
15:06
zakharyas joined
15:56
zakharyas joined
|
|||
dogbert2 | should it be possible to decode any Buf, containing bytes, to a Str with utf8-c8 ? | 16:28 | |
timotimo | should be, yeah, but currently we'll bail if there's "almost valid" utf8 | 16:31 | |
dogbert2 | ah, I was looking at one of [Tux]'s test files and it generates random buffers and tries to decode them to Str with utf8-c8 | 16:33 | |
so if the buffer contains "almost valid" utf-8 then the utf8-c8 decoder will fail? | 16:34 | ||
timotimo | right | 16:36 | |
for you see | |||
there's encoding-wise valid utf8 that's still invalid because the codepoints encoded aren't proper | |||
we'd still want to encode these with synthetics, though | |||
nobody implemented that yet, though | |||
dogbert2 | aha, that explains it, many thanks | 16:37 | |
16:39
domidumont joined
|
|||
timotimo | sure | 16:39 | |
m: say chr(0x10ffffff + 1) | 16:43 | ||
camelia | Error encoding UTF-8 string: could not encode codepoint 285212672 (0x11000000), codepoint out of bounds. Cannot encode higher than 1114111 (0x10FFFF) in block <unit> at <tmp> line 1 |
||
timotimo | m: say chr(0x10ffff + 1) | ||
camelia | Error encoding UTF-8 string: could not encode codepoint 1114112 (0x110000), codepoint out of bounds. Cannot encode higher than 1114111 (0x10FFFF) in block <unit> at <tmp> line 1 |
||
timotimo | if i'm not mistaken, utf8 can no-problem represent codepoints higher than this, but they are "forbidden" | 16:47 | |
dogbert2 | that's mean :) | ||
timotimo | so if you encounter a properly encoded value above 0x10ffff we won't create a synthetic because the encoding is correct, but it's still explosive in a "later stage" | ||
with the cold i'm having i don't have the necessary brain grease to step in and fix it | |||
dogbert2 | do you use any house remedies to get well, e.g. c-vitamins and such | 16:48 | |
timotimo | i got aspirin complex which is painkiller + cough suppressant + something else | 16:49 | |
something that prevents the nose from running as much | 16:50 | ||
dogbert2 | cool, I believe that these combo meds are forbidden where I live, don't know why though | ||
timotimo | huh? weird. | 16:51 | |
dogbert2 | indeed, so you have to get several meds instead of one | ||
timotimo | i wonder if the combo meds are noticeably more expensive than getting the individual parts and mixing them by yourself | 16:52 | |
dogbert2 | good question | 16:53 | |
leont | utf-8 isn't really well-defined beyond that point | 17:04 | |
timotimo | oh | 17:15 | |
17:18
pmurias joined
|
|||
pmurias | does MoarVM want all non-static functions to be MVM_-prefixed? | 17:19 | |
leont | Start bytes F0-F4 are defined as encoding for 4 byte codepoints, and logically F5-F8 would do the same (IMHO), but what should F9 do? 5 bytes? | 17:27 | |
Erm, that's a one off, F5-F7, and F8 | |||
(in the C0-F4 range, the number of initial 1s is the number of bytes for the character) | 17:29 | ||
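The leading-ones rule can be written down as a short C sketch. The function names are hypothetical. The legality check follows RFC 3629, which permits only lead bytes C2 to F4 for multi-byte sequences (C0 and C1 could only start overlong forms); the 5 and 6 byte patterns come from the older, now obsolete definition of UTF-8.

```c
#include <assert.h>
#include <stdint.h>

/* Number of leading 1 bits in a byte (0..8). */
static int leading_ones(uint8_t b) {
    int n = 0;
    while (n < 8 && (b & (0x80u >> n)))
        n++;
    return n;
}

/* Sequence length implied by the bit pattern of a lead byte alone:
 * 1 for ASCII, 0 for a continuation byte (10xxxxxx), else the number
 * of leading 1 bits. This ignores whether the byte is actually legal. */
static int utf8_pattern_length(uint8_t lead) {
    int ones = leading_ones(lead);
    if (ones == 0) return 1;
    if (ones == 1) return 0;
    return ones;
}

/* Lead bytes that RFC 3629 allows: ASCII, or C2..F4. */
static int utf8_lead_is_legal(uint8_t lead) {
    return lead < 0x80 || (lead >= 0xC2 && lead <= 0xF4);
}

/* Largest value the bit pattern of a len-byte sequence can carry:
 * (7 - len) payload bits in the lead byte plus 6 per continuation byte. */
static uint32_t utf8_pattern_max_codepoint(int len) {
    assert(len >= 1 && len <= 6);
    if (len == 1)
        return 0x7F;
    int bits = (7 - len) + 6 * (len - 1);
    return (UINT32_C(1) << bits) - 1;
}
```

By the pattern alone, F5 to F7 would still start 4 byte sequences and F8 or F9 would start 5 byte ones, and a 4 byte sequence has room for 21 bits (up to 0x1FFFFF). The 0x10FFFF ceiling is therefore a rule imposed on top of the encoding rather than something the bit layout forces.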
The reason for the 0x10ffff limit is that UTF-16 can't express any higher codepoints, because it's an idiotic encoding | 17:32 | ||
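The arithmetic behind that limit, as a small C sketch with hypothetical function names: a surrogate pair carries 10 bits in each half, offset by 0x10000, so the largest pair decodes to exactly 0x10FFFF.

```c
#include <assert.h>
#include <stdint.h>

/* Combine a high surrogate (D800..DBFF) and a low surrogate (DC00..DFFF). */
static uint32_t utf16_decode_pair(uint16_t hi, uint16_t lo) {
    assert(hi >= 0xD800 && hi <= 0xDBFF);
    assert(lo >= 0xDC00 && lo <= 0xDFFF);
    return UINT32_C(0x10000)
         + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
}

/* Split a codepoint above the BMP into its surrogate pair. */
static void utf16_encode_pair(uint32_t cp, uint16_t *hi, uint16_t *lo) {
    assert(cp >= 0x10000 && cp <= 0x10FFFF);
    cp -= 0x10000;                           /* 20 bits remain */
    *hi = (uint16_t)(0xD800 + (cp >> 10));   /* top 10 bits */
    *lo = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* bottom 10 bits */
}
```

Feeding the largest surrogates, DBFF and DFFF, into `utf16_decode_pair` gives 0x10000 + 0xFFFFF = 0x10FFFF, the same bound camelia reports above.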
20:19
robertle joined
20:27
robertle_ joined
20:31
zakharyas joined
21:58
dogbert2 joined
23:02
bisectable6 joined,
benchable6 joined
23:17
MasterDuke joined
|