01:16 geekosaur joined 01:55 ilbot3 joined
Geth MoarVM: samcv++ created pull request #685:
Merge collation-arrays branch
05:02
MoarVM: 088aa0a022 | Skarsnik++ (committed using GitHub Web editor) | src/6model/reprs/CArray.c
Fix a leak in CArray repr.

When expand is called (via at-pos) it makes room for more perl6 objects in the body, but this does not get freed if the array is not managed by MoarVM.
07:30
MoarVM: 1059eed1cc | niner++ (committed using GitHub Web editor) | src/6model/reprs/CArray.c
Merge pull request #684 from Skarsnik/patch-1

Fix a leak in CArray repr.
07:58 domidumont joined 08:03 domidumont joined
nine ugexe: it's taken me years to get this into my head: MVMROOT is less about the value than it is about the variable. It's for having the GC update your local pointer to the object after it moved the object to the second generation as the object's address will change. 08:32
yoleaux 8 Sep 2017 15:53Z <tbrowder> nine: I just filed issue #101 with Inline::Perl5; failure using Expect::Simple
8 Sep 2017 19:19Z <tbrowder> nine: my version was too old, closing issue 101
08:52 leont joined
Geth MoarVM: 866623d933 | (Samantha McVey)++ (committed using GitHub Web editor) | 8 files
Merge Full Unicode Collation Algorithm Implementation

This is a full implementation of the Unicode Collation Algorithm. We iterate by codepoint and put this into a ring buffer. The ring buffers hold the exact number of codepoints which comprise the longest sequence of codepoints which map to its own collation keys in the Unicode Collation Algorithm. As of Unicode 9.0 this number was 3. When Generate-Collation-Data.p6 is run, this number will ... (42 more lines)
10:18
MoarVM: 0b81969db2 | (Samantha McVey)++ | 2 files
Remove unneeded file from UCA implementation

This file isn't needed for the generation script.
10:21
timotimo nine: not only will the address change when the object is moved to the second gen, but we also move it from one half of the nursery to the other half at least once 10:45
nine Oh, I didn't know that! 10:57
timotimo yeah, we have "semispace copying nurseries" or what it's called 10:58
jnthn Yeah, the nursery is a semispace; we allocate in half of it, then evacuate the objects still alive, but never spotted by GC before, into the other half, and move the ones we did see once before into gen2 11:00
So each thread switches between the spaces per GC
This is a really nice scheme in that 1) you get cheap bump-the-pointer allocation, 2) for objects that die right away you do close to zero work 11:02
And 1 implies good cache locality also
Though you can't do it for the entire heap, otherwise your memory overhead becomes immediately at least 2 * what the program actually needs 11:03
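A toy sketch of the scheme jnthn describes above, written in Python for readability. This is not MoarVM's code: the class names, the `seen` flag, and the one-element lists standing in for rooted local variables are all assumptions made for illustration. It shows bump-the-pointer allocation, evacuation of first-time survivors into the other half, promotion of second-time survivors into gen2, and the root being rewritten each time the object moves.

```python
# Toy model only; none of these names exist in MoarVM.
class Obj:
    def __init__(self, name):
        self.name = name
        self.seen = False  # becomes True once the object survives one GC run


class Nursery:
    def __init__(self, size):
        self.size = size
        self.fromspace = [None] * size  # the half we currently allocate in
        self.tospace = [None] * size    # the idle half
        self.top = 0                    # bump pointer into fromspace
        self.gen2 = []                  # stand-in for the old generation

    def alloc(self, name):
        # Bump-the-pointer allocation: take the next slot, advance the pointer.
        if self.top == self.size:
            raise MemoryError("nursery half is full; a collection is needed")
        addr = self.top
        self.fromspace[addr] = Obj(name)
        self.top += 1
        return ("nursery", addr)

    def deref(self, ref):
        space, addr = ref
        return self.gen2[addr] if space == "gen2" else self.fromspace[addr]

    def collect(self, roots):
        # Each root is a one-element list, a stand-in for a rooted local
        # variable whose pointer the GC is allowed to rewrite.
        moved = {}    # old nursery address -> new reference
        new_top = 0
        for root in roots:
            space, addr = root[0]
            if space != "nursery":
                continue  # already in gen2, nothing to move
            if addr not in moved:
                obj = self.fromspace[addr]
                if obj.seen:
                    # Survived before: promote to gen2.
                    self.gen2.append(obj)
                    moved[addr] = ("gen2", len(self.gen2) - 1)
                else:
                    # First survival: evacuate into the other half.
                    obj.seen = True
                    self.tospace[new_top] = obj
                    moved[addr] = ("nursery", new_top)
                    new_top += 1
            root[0] = moved[addr]  # update the caller's pointer
        # Unreached objects are never touched; the old half is simply dropped.
        self.fromspace, self.tospace = self.tospace, [None] * self.size
        self.top = new_top


nursery = Nursery(4)
nursery.alloc("garbage")           # never rooted, dies for free
keep = [nursery.alloc("keep")]     # rooted local
nursery.collect([keep])            # keep moves to the other half
nursery.collect([keep])            # keep is promoted to gen2
```

Note that the dead object costs nothing during collection, and that the reference held in `keep` changes on both runs, which is why a local pointer has to be registered with the GC.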
11:09 Skarsnik joined 11:44 vendethiel- joined
MasterDuke samcv: your latest commit or two broke the moarvm build for me 11:46
dir 11:48
wrong window
src/strings/unicode.c:78279:43: error: unknown type name ‘sub_node’ 11:49
src/strings/unicode.c:78324:24: error: ‘codepoint_sequence_no_max’ undeclared here (not in a function); did you mean ‘codepoints_by_name’?
anyone else getting those? 11:59
12:13 brrt joined
timotimo let me look 12:14
MasterDuke i did make realclean, but still seeing it 12:15
timotimo yeah i get these errors too
this starter_main_elems is a #define in unicode_uca.c 12:16
MasterDuke huh, it builds fine on my laptop
timotimo easy fix 12:17
rm unicode.c; make
Skarsnik what MVMROOT does? 12:39
timotimo what about it? 12:40
Skarsnik Dunno I am reading CArray.c at_pos function, since I get 2 crash pointing at this 12:41
well pointing at the NC.pm operator [] for array 12:42
timotimo oh you mean "what does it do"
Skarsnik Yeah sorry x)
timotimo it tells the GC "this local variable points at a GC-managed object. i want this object to stay alive and i want you to update the pointer when the object gets moved"
Skarsnik hm I am not sure I understand. This marks the whole child_objs array and not the new object created in it? github.com/MoarVM/MoarVM/blob/mast...ray.c#L296 12:45
timotimo the storage array is likely handled by gc_mark 12:47
the object we're calling at_pos on is "root", that gets pointed out to the GC and the GC will then do the rest
but it doesn't look like the storage array is gc-relevant 12:48
Skarsnik storage is handled by C in this part. What I get is that the child_objs array is what Moar adds to track the perl6 objects it returns 12:51
timotimo mhm
oh 12:55
the reason why the crash points at the MVMROOT
is because gdb is stupid
it treats the whole macro as a single line
so where the crash happens exactly is hidden 12:56
Skarsnik I was not clear, it does not crash on MVMROOT but the backtrace pointed me at this function x) 12:57
So I try to understand it x)
timotimo oh ok 13:00
13:02 dogbert2 joined
dogbert2 c: MVM_SPESH_NODELAY=1 HEAD use Test; sub try_eval($str) { try EVAL $str }; is(try_eval('my @x = <a b c>; sub y (@z) { @z[0] }; y(@x)'), "a b c", "NO-BREAK SPACE") for ^10; 14:08
committable6 dogbert2, ¦HEAD(9b42484): «ok 1 - NO-BREAK SPACE␤ok 2 - NO-BREAK SPACE␤ok 3 - NO-BREAK SPACE␤ok 4 - NO-BREAK SPACE␤ok 5 - NO-BREAK SPACE␤ok 6 - NO-BREAK SPACE␤ok 7 - NO-BREAK SPACE␤ok 8 - NO-BREAK SPACE␤ok 9 - NO-BREAK SPACE␤ok 10 - NO-BREAK SPACE»
dogbert2 hmm
c: MVM_SPESH_NODELAY=1 HEAD use Test; sub try_eval($str) { try EVAL $str }; is(try_eval('my @x = <a b c>; sub y (@z) { @z[0] }; y(@x)'), "a b c", "NO-BREAK SPACE") for ^10; 14:09
committable6 dogbert2, ¦HEAD(9b42484): «ok 1 - NO-BREAK SPACE␤ok 2 - NO-BREAK SPACE␤ok 3 - NO-BREAK SPACE␤ok 4 - NO-BREAK SPACE␤ok 5 - NO-BREAK SPACE␤ok 6 - NO-BREAK SPACE␤ok 7 - NO-BREAK SPACE␤ok 8 - NO-BREAK SPACE␤ok 9 - NO-BREAK SPACE␤ok 10 - NO-BREAK SPACE»
dogbert2 this code is part of t/spec/S02-lexical-conventions/unicode-whitespace.t and on my system it behaves very badly when MVM_SPESH_NODELAY=1 14:10
lots of test failures, although the first few work. Interestingly, if I add MVM_SPESH_BLOCKING=1 the problem vanishes 14:12
for me the output from running the above mentioned file looks like this: gist.github.com/dogbert17/cb708cb3...0ff0a668e5 14:21
14:21 robertle joined 15:06 zakharyas joined 15:56 zakharyas joined
dogbert2 should it be possible to decode any Buf, containing bytes, to a Str with utf8-c8 ? 16:28
timotimo should be, yeah, but currently we'll bail if there's "almost valid" utf8 16:31
dogbert2 ah, I was looking at one of [Tux]'s test files and it generates random buffers and tries to decode them to Str with utf8-c8 16:33
so if the buffer contains "almost valid" utf-8 then the utf8-c8 decoder will fail? 16:34
timotimo right 16:36
for you see
there's encoding-wise valid utf8 that's still invalid because the codepoints encoded aren't proper
we'd still want to encode these with synthetics, though
nobody implemented that yet, though
dogbert2 aha, that explains it, many thanks 16:37
16:39 domidumont joined
timotimo sure 16:39
m: say chr(0x10ffffff + 1) 16:43
camelia Error encoding UTF-8 string: could not encode codepoint 285212672 (0x11000000), codepoint out of bounds. Cannot encode higher than 1114111 (0x10FFFF)
in block <unit> at <tmp> line 1
timotimo m: say chr(0x10ffff + 1)
camelia Error encoding UTF-8 string: could not encode codepoint 1114112 (0x110000), codepoint out of bounds. Cannot encode higher than 1114111 (0x10FFFF)
in block <unit> at <tmp> line 1
timotimo if i'm not mistaken, utf8 can no-problem represent codepoints higher than this, but they are "forbidden" 16:47
dogbert2 that's mean :)
timotimo so if you encounter a properly encoded value above 0x10ffff we won't create a synthetic because the encoding is correct, but it's still explosive in a "later stage"
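A small Python sketch of the point timotimo makes above: a four-byte sequence can follow the UTF-8 bit layout perfectly and still carry a value above 0x10FFFF. The helper `decode_4byte` is hypothetical, written only for this illustration; it checks the bit pattern and nothing else, which is not how a conforming decoder behaves and is not MoarVM's utf8-c8 code.

```python
def decode_4byte(seq):
    """Decode a 4-byte sequence using only the UTF-8 bit layout.

    Deliberately skips the range check a conforming decoder performs,
    so it accepts values above 0x10FFFF.
    """
    if len(seq) != 4:
        raise ValueError("expected exactly four bytes")
    b0, b1, b2, b3 = seq
    if b0 & 0xF8 != 0xF0:
        raise ValueError("lead byte is not of the form 11110xxx")
    for b in (b1, b2, b3):
        if b & 0xC0 != 0x80:
            raise ValueError("continuation byte is not of the form 10xxxxxx")
    # 3 payload bits from the lead byte, 6 from each continuation byte.
    return ((b0 & 0x07) << 18) | ((b1 & 0x3F) << 12) | ((b2 & 0x3F) << 6) | (b3 & 0x3F)


highest_legal = decode_4byte(b"\xf4\x8f\xbf\xbf")   # 0x10FFFF
one_past = decode_4byte(b"\xf4\x90\x80\x80")        # 0x110000, same layout

# Python's own strict decoder rejects the second sequence.
try:
    b"\xf4\x90\x80\x80".decode("utf-8")
    strict_accepts = True
except UnicodeDecodeError:
    strict_accepts = False
```

The second sequence is the "encoding-wise valid" case: the bit pattern is fine, but the value it carries is one past the last allowed codepoint.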
with the cold i'm having i don't have the necessary brain grease to step in and fix it
dogbert2 do you use any house remedies to get well, e.g. c-vitamins and such 16:48
timotimo i got aspirin complex which is painkiller + cough suppressant + something else 16:49
something that prevents the nose from running as much 16:50
dogbert2 cool, I believe that these combo meds are forbidden where I live, don't know why though
timotimo huh? weird. 16:51
dogbert2 indeed, so you have to get several meds instead of one
timotimo i wonder if the combo meds are noticeably more expensive than getting the individual parts and mixing them by yourself 16:52
dogbert2 good question 16:53
leont utf-8 isn't really well-defined beyond that point 17:04
timotimo oh 17:15
17:18 pmurias joined
pmurias does MoarVM want all non-static functions to be MVM_ prefixed? 17:19
leont Start bytes F0-4 are defined as encoding for 4 byte codepoints, and logically F5-F8 would do the same (IMHO), but what should F9 do? 5 bytes? 17:27
Erm, that's a one off, F5-F7, and F8
(in the C0-F4 range, the number of initial 1s is the number of bytes for the character) 17:29
The reason for the 0x10ffff limit is that UTF-16 can't express any higher codepoints, because it's an idiotic encoding 17:32
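Two short Python helpers illustrating leont's remarks above; both function names are made up for this sketch. The first counts the leading 1 bits of a lead byte, which in the C0-F4 range gives the length of the sequence. The second computes the codepoint encoded by a UTF-16 surrogate pair, which shows where the 0x10FFFF ceiling comes from.

```python
def leading_ones(byte):
    """Count the 1 bits at the top of a byte before the first 0 bit."""
    count = 0
    mask = 0x80
    while mask and byte & mask:
        count += 1
        mask >>= 1
    return count


def from_surrogates(high, low):
    """Combine a UTF-16 high/low surrogate pair into a codepoint."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("high surrogate out of range")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("low surrogate out of range")
    # Each surrogate carries 10 payload bits, offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


two_byte = leading_ones(0xC2)      # 110xxxxx
three_byte = leading_ones(0xE2)    # 1110xxxx
four_byte = leading_ones(0xF4)     # 11110xxx
beyond = leading_ones(0xF8)        # 111110xx, outside the defined range

utf16_max = from_surrogates(0xDBFF, 0xDFFF)
```

With the largest possible pair, `utf16_max` comes out as 0x10FFFF: twenty payload bits on top of the 0x10000 offset leave nothing higher for UTF-16 to express.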
20:19 robertle joined 20:27 robertle_ joined 20:31 zakharyas joined 21:58 dogbert2 joined 23:02 bisectable6 joined, benchable6 joined 23:17 MasterDuke joined