samcv jnthn, what would be the best way to store multiple codepoints for emoji sequences? 00:33
for decomposition we just store a string and then parse it. but there has to be a better way to store multiple things?
bye bye dalek 00:37
00:37 dalek joined 00:41 pyrimidine joined 02:04 lizmat_ joined, pyrimidine joined 02:30 pyrimidine joined 02:48 ilbot3 joined 03:29 pyrimidine joined 04:01 geekosaur joined 04:25 geekosaur joined 04:40 pyrimidine joined 05:14 pyrimidine joined 05:36 pyrimidine joined
samcv jnthn, so i got it fully working \o/ 06:59
07:05 domidumont joined 07:10 domidumont joined
nwc10 good *, #moarvm 07:11
samcv hi nwc10 :) 07:12
github.com/MoarVM/MoarVM/pull/492 \o/ 07:18
07:28 Ven joined 07:37 brrt joined
brrt good * #moarvm 07:41
samcv waves 07:42
brrt i'm checking your PR 07:45
14 files changed, oh my
samcv need to fix a few things but
most of those are just adding the op
brrt oh, wait, indeed
brrt would prefer that the oplist were generated at build time....
samcv what's the best way to construct a string from codepoints btw? i don't think i did it the best way
brrt honestly, i don't know 07:46
let's check first before i give you an answer to that
samcv trying to debug a problem when MVM errors for some of them
i must not be constructing the grapheme properly
but seems to happen when there's more than 2 codepoints in the sequence 07:47
brrt just for my info, why place getstrbyname before indexat in src/core/interp.c
samcv dunno
brrt i'd be surprised if that had any ill effect on the compiled code, but i had expected them to be in bytecode order
hmmmm
samcv interp.c doesn't have to be in order
brrt no
but i would still expect it to be 07:48
samcv hmm it seems to return the grapheme fine but.
it errors later on
MoarVM panic: MVM_nfg_get_synthetic_info called with out-of-range synthetic
so it's not erroring in the code i made
well maybe it is. maybe gdb is being weird
may have not break'd at the right spot 07:49
yeah it seems to panic after adding the 2nd codepoint to the buffer
brrt i assume you know why the enums have changed, so i'm not going to comment on that, either
samcv well i think it's erroring when adding the 3rd
what about the enums?
brrt have you compiled with --debug
samcv yep
brrt hmmm 07:50
have you compiled with --optimize=0
samcv i need to go into unicode.c which is generated from a compilation of multiple
plus there's macros
oh it looks like it's not generating the right number of array items 07:51
ah i see. because my original testing code didn't use * things 07:52
brrt hmmpf 07:53
i honestly have no comments on that PR
:-)
well, some things
samcv uhm how do i get the size of this structure properly: static const MVMint32 uni_seq_16[] = {0x1F487,0x1F3FF} 07:56
from const MVMint32 * uni_seq = uni_seq_enum[result->structitem]; 07:57
maybe there is not a way. or maybe there is. i could always store the number of codepoints as the first item in the structure if i cannot
uni_seq_enum[] just stores pointers to the uni_seq_xx 07:58
brrt what do you mean by 'size of this structure'
i see no reason why sizeof() wouldn't give you the right thing 07:59
namely, 8
samcv ah yeah. i see what i was doing wrong 08:00
brrt, well i think i got it 08:23
well not the size part yet. but the other crashy
08:26 zakharyas joined
samcv brrt, sizeof(uni_seq)/sizeof(MVMint32); gives me half the size i want 08:27
that's what i thought would give me the number of items in the array, but it returns a number half that 08:28
why is this?
because uni_seq is a 64 bit pointer?
lizmat that would be my first guess ? 08:29
samcv ok yeah. i don't want to divide by anything just want to dereference it 08:30
getting the right number now 08:31
brrt well, you just defined uni_seq_16, not uni_seq
samcv brrt, i want to get the number of elements in the array. still not working argh 08:35
sizeof(*uni_seq) gives me size of the MVMint32 type because uni_seq is MVMint32 *
arnsholt If uni_seq is declared as a pointer, there's no way to figure out the length of the array 08:37
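A minimal C sketch of why the sizeof division stops working once the array is only reachable through a pointer. It uses int32_t as a stand-in for MVMint32, and the helper names (ARRAY_LEN, count_from_array, size_seen_through_pointer) are hypothetical, not part of MoarVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the generated table entry quoted in the log. */
static const int32_t uni_seq_16[] = {0x1F487, 0x1F3FF};

/* Element count; only valid where the array type is still visible. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Computed from the array itself, so this yields the real element count. */
size_t count_from_array(void) {
    return ARRAY_LEN(uni_seq_16);
}

/* Once the array has decayed to a pointer, sizeof reports the size of
 * the pointer, which says nothing about how many elements it points at. */
size_t size_seen_through_pointer(void) {
    const int32_t *uni_seq = uni_seq_16;  /* array-to-pointer conversion */
    assert(uni_seq != NULL);
    return sizeof(uni_seq);
}
```

Dividing the second result by sizeof(int32_t) gives the same number for every sequence, whatever its length, which is why the count has to be stored somewhere explicitly.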
samcv ok that's what i thought originally
well actually i declare static const MVMint32 uni_seq_449[] = {0x1F3C4,0x1F3FB,0x200D,0x2640,0xFE0F} 08:38
and then i have a struct which contains uni_seq_449 uni_seq_450 etc
so i access the uni_seq_xx from the struct
brrt uhuh
hmm
i see 08:39
no, you can't do that
samcv uni_seq = uni_seq_enum[blah];
ok i will just have the 1st item be the length
brrt was just about to suggest that
samcv yea
brrt alternatively, have a sentinel value at the end
samcv yeah
arnsholt Yeah, those are the standard solutions
Pascal arrays (length first), or NULL-terminated
samcv i'm gonna go with length first 08:40
arnsholt Yeah, I like length first too, TBH
brrt as long as you don't forget to bias your indexes
samcv yea 08:42
arnsholt brrt: There's always "real_array = &data[1]" =) 08:49
brrt true 08:50
although i've started to prefer: "real_array = data+1"
arnsholt Yeah, that'll work too 08:51
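The two layouts discussed above could look roughly like this in C. This is a sketch only: int32_t stands in for MVMint32, the function and array names are hypothetical, and using 0 as the sentinel assumes that U+0000 never appears inside a sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length-first layout: element 0 is the number of codepoints that follow. */
const int32_t seq_len_first[] = {5, 0x1F3C4, 0x1F3FB, 0x200D, 0x2640, 0xFE0F};

/* Sentinel layout: a trailing 0 marks the end (assumed never a real member). */
const int32_t seq_sentinel[] = {0x1F3C4, 0x1F3FB, 0x200D, 0x2640, 0xFE0F, 0};

/* Length is read in constant time from the first slot. */
size_t len_first_count(const int32_t *data) {
    assert(data != NULL && data[0] >= 0);
    return (size_t)data[0];
}

/* "Biasing the index": skip the length slot; data + 1 equals &data[1]. */
const int32_t *len_first_payload(const int32_t *data) {
    assert(data != NULL);
    return data + 1;
}

/* Length has to be found by scanning for the sentinel. */
size_t sentinel_count(const int32_t *data) {
    size_t n = 0;
    assert(data != NULL);
    while (data[n] != 0)
        n++;
    return n;
}
```

Length-first costs one extra slot and gives the count without a scan; the sentinel layout needs a value that can never be valid data.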
brrt register arithmetic is surprisingly elegant if you get the hang of it 08:52
08:52 domidumont joined
brrt imagines a thousand rustaceans fainting reading that 08:52
arnsholt Register arithmetic? 08:53
brrt eh, pointer arithmetic
hehe
arnsholt Oh, right =)
Yeah, it's not too bad once you get used to it 08:54
brrt yeah, my bad, i'm working on a blog post
arnsholt But I do think a language like Rust has the potential to kill off entire classes of problems
brrt well, it goes hand in hand with certain patterns (of memory management / layout), and if you're not into those patterns, then it's going to suck
hmmm. no doubt
on the other hand 08:55
arnsholt And a problem with pointer arithmetic is that if you fuck it up, all kinds of weird shit can happen
brrt i've had to fix many, many errors in the register allocator before it worked
i think just 2 of these were actual honest memory corruption / overflow errors
and they were swiftly caught by ASAN 08:56
moritz dishonest memory corruption errors are the worst :-)
brrt one of these was actually a data-structure-and-algorithm-choice error, at the root of it
the other was a noninitialized value
all other issues were logic issues
so.....
it's undoubtedly true that rust relieves programmers of whole classes of errors 08:57
samcv ok it actually really works now \o/
brrt what is not so self-evident is that those classes of errors are the most frequent or most important errors
samcv++
(although I guess you could point to a number of CVE's which prove me wrong)
on the other, other hand 08:58
remember shell-shock
nothing buffer overrunny about that
was a logic error
arnsholt Yeah, Rust won't save you from those
brrt rust won't save you from phishing, either
arnsholt Nope
brrt so i'm *a bit* annoyed about the hype surrounding 'rust = safety' 08:59
that doesn't mean i don't want to try it out sometime :-)
arnsholt I think it's not too far off the mark
brrt it's a correct statement. it is the hype which is unreasonable
arnsholt Especially when you get things like memory shenanigans in file(1) and friends
Yeah, hype is hype, I guess
brrt (this too shall pass :-)) 09:00
moritz it's really that Rust offers compile-time abstractions without (much) of a runtime cost 09:03
without the crazy subtle semantics that C++ has, too :-)
brrt that's pretty cool, yes
09:09 pyrimidine joined 09:24 brrt joined
samcv now just time to make spectest :) 09:28
brrt make spectest, not segv 09:32
samcv heh
09:37 jnthn joined, Util joined, mst joined, nine_ joined 09:39 nwc10_ joined, moritz_ joined, ggoebel joined 09:40 camelia joined 09:44 japhb joined
jnthn moarning o/ 09:53
samcv morning jnthn
brrt moarning jnthn 10:02
jnthn catches up with backlog here and on #perl6-dev to see what happened during the night :) 10:03
samcv: So, any leftover questions, or is it now at the point of "review my PR"? :)
samcv yeah. review my PR :-D
it works fully and is gud
let me know if there's something i did you don't like though 10:04
spectest just finished and pass 10:05
jnthn Alrighty
samcv oh there's one thing MVM_string_from_grapheme i just copied it into that file
other than that 10:06
jnthn Working example: nqp-m -e "say(nqp::getstrbyname(''person golfing: medium-light skin tone'))"
samcv err maybe it was already there
jnthn I...uh...doubt this works, due to the extra quote at the start? :)
samcv err wait where is it from
lies!
jnthn From the PR description ;) 10:07
samcv heh yeah whatever the double quotes
oh i know where i stole it from 10:08
it's MVM_string_chr except without checking to make sure there are no negative graphemes :)
maybe should have MVM_string_chr call that one? anyway check out the PR and let me know 10:09
(so we don't duplicate code)
and move it to ops.c or something
10:13 pyrimidine joined
jnthn Yeah, currently reviewing 10:26
OK, review done 10:38
samcv This is called string_from_grapheme, but actually is taking a codepoint, which is not always a grapheme. 10:39
but uhm. it takes both?
synthetic and non synthetic's
idk what it should be called then
jnthn That's not what grapheme means.
Grapheme means "in NFG form"
The positive integers of the NFG representation all just happen to align with NFC codepoints. 10:40
samcv so what are the negative ones?
jnthn Also graphemes
samcv those are graphemes yes?
jnthn Yes
We use "synthetics" to talk about the negatives. 10:41
But I think the routine being called string_from_grapheme is fine
samcv ok
jnthn It should just take MVMGrapheme32 and it doesn't need to run it through the normalizer at all
Because it's already NFG
Note that while having such a function in MoarVM is fine, we shouldn't expose that one directly to the outside world 10:42
samcv yeah
jnthn (We never expose synthetics, because we don't want people to rely on their integer values.)
samcv yep
uhm so how do i do it without MVM_unicode_normalizer_process_codepoint
i tried without it but i kept running into issues 10:43
jnthn Which "it"? :)
How to implement string_from_grapheme? 10:44
samcv uh adding to buffer.
also yes that
err. no.
but also i'm using MVM_unicode_normalizer_process_codepoint just because i don't want any issues if we run into the cases where we don't correctly break in emoji sequences 10:45
there are still a few that don't work properly
jnthn I'd just change the signature to take `MVMGrapheme32 g` and then get rid of the use of the normalizer
And then it's already correct
samcv ok how do i do it without the normalizer? 10:46
jnthn Because s->body.storage.blob_32[0] = g; does the right thing
(What I just said is about inside of string_from_grapheme)
It's totally reasonable to use the normalizer in MVM_unicode_string_from_name 10:47
samcv oh just don't use it in MVM_string_from_grapheme
jnthn Right
Because by the time you call that you already have a grapheme :)
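A toy C sketch of the sign convention described above: non-negative graphemes coincide with codepoints, negative ones are synthetics, and a value that is already a grapheme can be stored directly with no normalizer pass. The type and function names (Grapheme32, OneGraphemeString and so on) are hypothetical stand-ins, not MoarVM's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MVMGrapheme32. */
typedef int32_t Grapheme32;

/* Hypothetical one-grapheme string; real VM strings are heap objects. */
typedef struct {
    size_t     num_graphs;
    Grapheme32 storage[1];
} OneGraphemeString;

/* Negative values are synthetics; both signs are graphemes. */
int is_synthetic(Grapheme32 g) {
    return g < 0;
}

/* A non-synthetic grapheme has the same integer value as its codepoint. */
int32_t codepoint_of(Grapheme32 g) {
    assert(!is_synthetic(g));
    return g;
}

/* The input is already in grapheme form, so it is stored as-is. */
OneGraphemeString string_from_grapheme(Grapheme32 g) {
    OneGraphemeString s;
    s.num_graphs = 1;
    s.storage[0] = g;
    return s;
}
```

Normalization belongs in the caller that starts from raw codepoints; by the time a single grapheme is handed over there is nothing left to normalize.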
samcv and this will also work if i have multiple graphemes right?
well er probably not 10:48
jnthn Well, not at the moment, because the signature is MVMGrapheme32
samcv but can cross that road when we come to it
jnthn Yeah
Though fixing it now isn't so hard
Lemme find a good example 10:49
samcv ok
jnthn github.com/MoarVM/MoarVM/blob/mast...lize.c#L88
This function actually already nearly does what you ned
*need
It's just that it takes an MVMObject * as its input and pulls data out of that 10:50
samcv yes i saw that
jnthn But we could split it into two parts
One that works on a C-level array
samcv that would be cool
jnthn And takes a length
And then you can just feed the codepoint array you've got into it
samcv yeah i had seen that function but it didn't do exactly what i wanted
well my array's 1st item is the number of items in it, but i can always move the pointer by 1 10:51
and already have the length
jnthn Sure, just move the pointer by 1 and pass in that and the length
Though I was a tad confused about the length
samcv hm? 10:52
jnthn Whether it includes the element specifying the length or not
samcv no it does not
it's the number of codepoints
jnthn for (int i = 1; i < array_size; i++) {
So isn't this an off-by-one, or do I need another coffee? :) 10:53
(If we start at 1 to skip the length, then it'd need to be <= ? )
samcv nope
almost certain not 10:54
i thought a similar thing at first but, that is off by one if i do <=
but i will 2x check
jnthn m: my @a = 2, 100, 101; my $array_size = @a[0]; loop (my $i = 1; $i < $array_size; $i++) { say @a[$i] } 10:56
camelia rakudo-moar ed5c86: OUTPUT«100␤»
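The same check as a C sketch, using the array from the evaluation above ({2, 100, 101}, length stored first). The function names are hypothetical; int32_t stands in for MVMint32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length-first array: 2 payload elements follow the length slot. */
const int32_t example_seq[] = {2, 100, 101};

/* Loop bound as written in the reviewed code: starts at 1, stops at i < n.
 * Because index 0 holds the length, this never reaches the last element. */
int64_t sum_with_less_than(const int32_t *data) {
    int64_t sum = 0;
    assert(data != NULL);
    for (int32_t i = 1; i < data[0]; i++)
        sum += data[i];
    return sum;
}

/* Corrected bound: payload occupies indexes 1..n inclusive. */
int64_t sum_with_less_equal(const int32_t *data) {
    int64_t sum = 0;
    assert(data != NULL);
    for (int32_t i = 1; i <= data[0]; i++)
        sum += data[i];
    return sum;
}
```

With the `<` bound only 100 is visited, matching the single line of output above; with `<=` both 100 and 101 are visited.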
dalek MoarVM: 8bfbb0e | jnthn++ | src/gc/orchestrate.c:
Tweak full collection criteria in heap profiling.

The recording of heap snapshots will of course use memory, which will throw off the RSS heuristic and make us a *lot* less likely to ever do a full collection, distorting the profiles. This is also a bit of a distortion (to more regular heap profiles being taken), but it's an improvement. (To do better, we could try tracking RSS before/after snapshots and excluding that memory from the calculation. Patches welcome if anyone tries it and finds that a viable approach.)
10:59
MoarVM: 68b5e35 | jnthn++ | src/profiler/heapsnapshot.c:
Null-check the *correct* thread's ->cur_frame.

539346d | jnthn++ | src/io/ (3 files): Take into account actual allocated of I/O buffers.
It seems libuv suggest we allocate 64KB sometimes, even when the input we get is tiny. While I'm not sure second-guessing it is wise, we should at least be honest internally about what's allocated. By storing the actual allocated size, the GC can track it as part of the gen2 promotion statistics, and be smarter about triggering full collections. This reduces memory overhead.
samcv ok it is off by one now jnthn i must have changed something else
MoarVM: 80c8044 | jnthn++ | src/io/ (3 files):
MoarVM: Merge pull request #488 from MoarVM/more-pressure
MoarVM:
MoarVM: Take into account actual allocated of I/O buffers.
jnthn samcv: Phew, I didn't need stronger coffee after all :-)
11:00 zakharyas joined
samcv you still need stronger coffee though 11:00
just because why not
jnthn The stuff I'm drinking now is quite a bit weaker than my regular... 11:01
I was given a box set of coffees at Christmas.
I'm used to drinking a 5. If I found the 3 a bit weak, I dunno what I'll make of the 1s. :P 11:02
Hm, let's switch to using Geth instead of dalek here...seems to be working fine for other projects
Geth MoarVM: 4d87b1cc70 | (Jonathan Worthington)++ | src/spesh/candidate.c
Free up spesh log slots after specialization.

Spesh logging keeps values alive, preventing the GC from collecting them. It logs values to sample what types show up, which is fine, but we should not hang on to them beyond the point the specializer has used them in its analysis. This reduces memory overhead, perhaps quite notably in some applications that have large objects (for example, RT #130494 leaked many objects in this way). On CORE.setting compilation it saves ~3MB - not much in the scheme of things, but nice to win.
11:04
MoarVM: c670eadf6b | (Jonathan Worthington)++ | src/spesh/candidate.c
Merge pull request #490 from MoarVM/free-spesh-log-slots

Free up spesh log slots after specialization.
jnthn Nice...now our commits are reported by a bot running on MoarVM :) 11:05
samcv jnthn, That's reasonable, but we should at the very least stick in an assert that we really get 0 back from this.
what do we want to do in case it's not 0?
return empty string?
jnthn Well, if the plan is that we'll re-use the code inside of MVM_unicode_codepoints_to_nfg_string it'll be fine 11:06
Since it handles cases where the sequence produces multiple graphemes.
If it's non-zero it'd mean we were about to silently lose a grapheme. 11:07
samcv yeah 11:08
jnthn But really, I'd break MVM_unicode_codepoints_to_nfg_string into two pieces
Everything below input_codes = ((MVMArray *)codes)->body.elems; can be factored out
And then called as with input and input_codes
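A rough C sketch of the split being proposed: a worker that only sees a C array plus a length, a thin wrapper that unpacks an object and delegates, and a second caller that feeds in a length-first table entry by moving the pointer by one. Everything here is hypothetical: BoxedIntArray is not MoarVM's MVMArray, and counting UTF-16 code units is only a placeholder for the real normalization work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a VM-level array object. */
typedef struct {
    size_t         elems;
    const int32_t *slots;
} BoxedIntArray;

/* Worker: knows nothing about objects, only a C array and its length.
 * Placeholder task: count the UTF-16 code units the codepoints need. */
size_t utf16_units_c_array(const int32_t *codes, size_t n) {
    size_t units = 0;
    assert(codes != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        units += codes[i] > 0xFFFF ? 2 : 1;
    return units;
}

/* Original entry point: pulls data out of the object, then delegates. */
size_t utf16_units_boxed(const BoxedIntArray *arr) {
    assert(arr != NULL);
    return utf16_units_c_array(arr->slots, arr->elems);
}

/* New caller: a length-first table entry; skip slot 0 and pass the length. */
size_t utf16_units_len_first(const int32_t *entry) {
    assert(entry != NULL && entry[0] >= 0);
    return utf16_units_c_array(entry + 1, (size_t)entry[0]);
}

/* Demo data taken from the sequences quoted earlier in the log. */
const int32_t       demo_codes[] = {0x1F487, 0x1F3FF};
const BoxedIntArray demo_boxed   = {2, demo_codes};
const int32_t       demo_entry[] = {5, 0x1F3C4, 0x1F3FB, 0x200D, 0x2640, 0xFE0F};
```

Both callers share one implementation, so a fix in the worker reaches the object-based path and the table-based path at once.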
samcv can this be done later? 11:09
jnthn I guess, but it'd avoid the need to introduce MVM_string_from_grapheme and resolve all the issues I had in MVM_unicode_string_from_name except the off-by-one :) 11:12
And result in less code overall 11:13
samcv i will look into it tomorrow most likely since it's 3am here now 11:14
we won't need MVM_string_from_grapheme then anymore right? 11:15
also aside from splitting unicode_codepoints_to_nfg_string, i think i've made all the changes you requested now 11:16
jnthn Right 11:19
OK, sounds good.
Rest well :)
timotimo o/ 11:21
samcv not asleep yet :P
but i'm mostlyish done coding for the day
timotimo i do wonder what causes all our bots to consume more and more memory 11:24
i'd need to run them myself to figure that out
jnthn Well, lemme merge work-lifetime first :) 11:25
My first attempt to rebase stuff to clean up resulted in SEGV...
arnsholt Whee! =) 11:26
samcv hehehe 11:27
timotimo wow,oops
jnthn I probably did something silly :) 11:28
Works on second attempt
All I intended to do was strip out a commit that shouldn't have been in and part of another. 11:29
timotimo i think i already asked for it a long time ago ... someone could implement abs_i for our jit and it'd positively impact something inside commitable
i mean, i already mentioned abs_i could be done
but i don't know how to do that properly
jnthn Aww, where went Geth? 11:32
arnsholt Ping timeout, apparently
jnthn Anyway, just pushed the rebase of work-lifetime fixing the thing timotimo++ mentioned ;)
timotimo wait, i mentioned what? ;) 11:33
oh the typo?
i mean ... switcho? switcheroo?
samcv work-lifetime sounds sort of ominous. as if that concludes all work jnthn will do on mvm lol
jnthn Yeah, that.
:D
samcv work-lifetime pushed. nothing more to do!
jnthn Yup. All done. Now I can go to the Alps and spend my days sipping beer and enjoying the view. :) 11:36
Well, NQP and Rakudo builds seem happy post-rebase 11:37
timotimo MoarViem
jnthn At first I was like "huh, got a few seconds slower again??", then realized I've got IntelliJ running outside of the VM which is probably hogging an amount of resources... 11:38
11:38 pyrimidine joined
brrt feels for jnthn's computer 11:38
jnthn It leads a busy life :) 11:40
11:41 Geth joined
notviki aww 11:41
Ping timeout... unsure why 11:42
samcv jnthn, how to name MVM_unicode_codepoints_to_nfg_string that takes in a unicode string 11:44
err that takes a c array
can i just uhm. make a new one and change the op mapping
timotimo just put a _v at the end, just like OpenGL uses :P
samcv v? 11:45
timotimo "vector"
samcv no i get that but
but why vector
timotimo another word for contiguous array
samcv it's two dimensional i guess… but
timotimo wow. i was hoping to find an example by searching for "glgetv", but it seems like there's shoes that are called that
jnthn samcv: I'd leave the original one as is and call the factored out bit MVM_unicode_codepoints_c_array_to_nfg_string or so :) 11:47
Seems work-lifetime is good for merge :) 11:48
brrt \o/
lizmat is looking forward :-) 11:49
brrt apparently can't write short blog posts…. 11:51
timotimo it is difficult 11:52
samcv i will have to make a blog post once this is all done on all this unicode things
brrt i'd be interested in that. are you syndicated on pl6anet.org?
timotimo ... "The Syndicate" title theme song plays in the distance ... 11:53
samcv nope brrt how do i get that
timotimo i think stmuk can add your .rss to the list 11:54
brrt you should ask moritz, I think
moritz can't do anything on pl6anet 11:55
yes, stmuk is the one to talk to
brrt (pointer following :-)) 11:56
notviki samcv: just add yours to this file: github.com/stmuk/pl6anet.org/blob/...perlanetrc 11:57
samcv sweet
moritz ooh, nice 11:58
maybe add a link to the github repo to the website, while you're at it? :-)
Geth MoarVM: samcv++ created pull request #493:
Refactor MVM_unicode_codepoints_to_nfg_string
12:03
samcv woah. fancy
anyway jnthn here you go
spectest almost done completing, so should be ready to merge if you have no problems with it 12:04
ok spectest pass. that one is ready for Merge 12:07
timotimo nice
jnthn Travis is having a go slow... 12:11
jnthn spectests a fix for github.com/MoarVM/MoarVM/issues/482 12:21
lunch, bbi30
samcv jnthn, fixed now. also i've rewritten the new get_string_from_name or whatever it's called to use the new function 12:24
will rebase the string from name one once the newest PR is accepted 12:26
12:42 pyrimidine joined
lizmat needs some help with a codegen issue in Actions 12:50
samcv night all o/
Geth MoarVM/master: 17 commits pushed by jnthn++
review: github.com/MoarVM/MoarVM/compare/e...f712c6a777
lizmat good night, samcv 12:51
jnthn samcv: Newest PR looks good, I will merge it once Travis checks OK
Thanks; 'night o/
lizmat basically, I need to get Zop to call infix:<Z>(...,:with(op)) 12:52
instead of somehow working something with METAOP_ZIP 12:53
line 6893 in Actions
oops, 6983
feels like I'm trying to work this at the wrong place 12:54
oddly enough, bare Z does codegen to a direct call to &infix:<Z>
would appreciate any help there :-) 12:55
Geth MoarVM/utf8-c8-boundary-fix: 9475d8db4c | (Jonathan Worthington)++ | src/strings/utf8_c8.c
Decode (hopefully) all NFC UTF8 to NFG in UTF8-C8

In the last round of tweaks to UTF8-C8, we fixed some sequences that would not round-trip properly due to being mis-represented in UTF8. The fix dealt with those cases, but was a bit too sweeping. UTF8-C8 aims to decode everything that's both valid UTF8 and in NFC as the UTF8 decoder would, and express everything else as synthetics that will ensure round-tripping. This fix deals with the issue raised in MoarVM Issue #482, while not regressing any of the UTF8-C8 roundtrip tests.
13:00
MoarVM: jnthn++ created pull request #494:
Decode (hopefully) all NFC UTF8 to NFG in UTF8-C8
13:01
jnthn arnsholt: I think github.com/MoarVM/MoarVM/pull/494 does what you were suggesting; please take a glance if you've a moment :)
lizmat: What does your diff to do it look like?
I'd expect it to be mostly setting .named('with') 13:02
lizmat well, yes and no:
I think the thinko I made is that METAOP_ZIP returns a block that takes a lol
jnthn Yes
lizmat whereas &infix:<Z>:with returns a Seq
jnthn Oh
Yes
So it won't work to do that simple rewrite 13:03
lizmat indeed
jnthn We need the extra level of thunk for nested meta-ops
lizmat why? it apparently isn't needed for a bare Z?
or do you mean a ZZ ? 13:04
or a ZX ?
jnthn Yes, any of those 13:05
Or ZZ :)
lizmat ah, so maybe I should codegen a call to Rakudo::Internals.ZipIterator... directly
jnthn Will that return a block? 13:07
lizmat atm that returns a Seq
no, Iterator
arnsholt jnthn: Yeah, that looks right to me! 13:08
jnthn arnsholt: OK, thanks. :)
lizmat: I think whatever we code-gen meta-ops to, we'd need to have it be a block except at the top level 13:09
lizmat: We may be able to do smarter at the top level
(But would also need a mechanism to detect it)
lizmat ok, lemme digest that for a bit :-) 13:10
jnthn OK
Going to switch to $other-job for a bit :)
lizmat thanks so far! 13:11
jnthn But will be about :) 13:12
Will merge stuff when Travis is happy
And bump MOAR/NQP revisions, so hopefully everyone can enjoy the fixes :)
timotimo for cases like my &bleh = &[Z,] and such 13:14
jnthn Oh, that also :)
Geth MoarVM: dd7d4d086d | (Samantha McVey)++ | 2 files
Refactor MVM_unicode_codepoints_to_nfg_string

Separate out the section which involves MVMObject so we can re-use this function in other places with native C data structures.
13:47
MoarVM: 37bb9737bd | (Jonathan Worthington)++ | 2 files
Merge pull request #493 from samcv/MVM_unicode_codepoints_to_nfg_string

Refactor MVM_unicode_codepoints_to_nfg_string
brrt something extra to ponder 13:49
how am i going to extend the linear scan allocator (and the expr jit in general) to work with SSE registers 13:50
im not at all sure that the rex byte will work for those
matter of fact 13:51
i know nothing about SSE registers and their encoding
jnthn I figure this is something we can worry about once we've got stuff working at all 13:52
(as in, post-merge)
afaik we don't have any code that uses those today?
So we won't miss out on anything?
(anything we're already getting, that is)
brrt yes, definitely 13:57
but i make plans long ahead
:-) 13:58
i've more or less figured out how to implement ARGLIST
as I said, that's the last essential bit before we can really consider merging 13:59
by the way, the current JIT *does* work with SSE registers
jnthn Oh? What for ooc?
brrt for floating point calculations :-) 14:00
the alternative would be x87 coprocessor calculations. don't use those
jnthn heh, I didn't realize we weren't using those :P
brrt well, it's only for a few things 14:01
jnthn Ah, just some floating point ops?
So the basic things like + doesn't use them?
Compilation completed successfully with 3,719 warnings in 20m 25s 687ms (moments ago)
oops 14:02
ww
brrt :-) 14:03
no, regular integer addition doesn't 14:04
jnthn But floating point addition?
brrt floating point addition does
brrt looks for example
github.com/MoarVM/MoarVM/blob/even....dasc#L987 14:07
jnthn Hm 14:08
OK, so we probably do need to think about that at some point sooner rather than later if we want nice JIT of floating point code :) 14:09
brrt well, yeah
i don't expect a terror, though
i may need to extend dasm a bit again
but the allocator shouldn't have to change (much) 14:10
an extra stack for the additional registers, a few extra definitions, and some more care in accessors... 14:11
14:13 pyrimidine joined
brrt oh, and passing floating point args, of course 14:17
14:22 pyrimidine joined
lizmat so, why is there no METAOP_REDUCE_NON, and why doesn't the lack of that not break Z.. ? 14:25
m: find-reducer-for-op(&[..]) 14:26
camelia rakudo-moar ed5c86: OUTPUT«No such symbol '&METAOP_REDUCE_NON'␤ in block <unit> at <tmp> line 1␤␤Actually thrown at:␤ in block <unit> at <tmp> line 1␤␤»
moritz m: say 1 Z.. 2 14:29
camelia rakudo-moar ed5c86: OUTPUT«(1..2)␤»
moritz wow
lizmat looks to me these operators only can take 2 iterators 14:30
ever
m: say 1 Z.. 2 Z.. 3
camelia rakudo-moar ed5c86: OUTPUT«Range objects are not valid endpoints for Ranges␤ in block <unit> at <tmp> line 1␤␤»
lizmat m: dd 1 Zcmp 2 Zcmp 3 14:31
camelia rakudo-moar ed5c86: OUTPUT«(Order::Less,).Seq␤»
lizmat m: dd 1 Zcmp 2 Zcmp 1
camelia rakudo-moar ed5c86: OUTPUT«(Order::Less,).Seq␤»
lizmat m: dd 1 Zcmp -1 Zcmp 1
camelia rakudo-moar ed5c86: OUTPUT«(Order::Same,).Seq␤»
lizmat m: dd 1 Zcmp 1 Zcmp -1
camelia rakudo-moar ed5c86: OUTPUT«(Order::More,).Seq␤»
lizmat m: dd 1 Zcmp 1 Zcmp 0 14:32
camelia rakudo-moar ed5c86: OUTPUT«(Order::Same,).Seq␤»
lizmat m: dd 1 Zcmp 2 Zcmp 0
camelia rakudo-moar ed5c86: OUTPUT«(Order::Less,).Seq␤»
lizmat m: dd 1 Zcmp 2 Zcmp -1
camelia rakudo-moar ed5c86: OUTPUT«(Order::Same,).Seq␤»
lizmat yeah, that feels faulty 14:33
14:51 brrt left, brrt joined 16:02 domidumont joined 16:04 zakharyas joined
TimToady yeah, should disallow more than 2 for non-assocs 16:58
16:59 zakharyas joined 17:25 zakharyas joined 17:59 pyrimidine joined
Geth MoarVM: 9475d8db4c | (Jonathan Worthington)++ | src/strings/utf8_c8.c
Decode (hopefully) all NFC UTF8 to NFG in UTF8-C8

In the last round of tweaks to UTF8-C8, we fixed some sequences that would not round-trip properly due to being mis-represented in UTF8. The fix dealt with those cases, but was a bit too sweeping. UTF8-C8 aims to decode everything that's both valid UTF8 and in NFC as the UTF8 decoder would, and express everything else as synthetics that will ensure round-tripping. This fix deals with the issue raised in MoarVM Issue #482, while not regressing any of the UTF8-C8 roundtrip tests.
18:27
MoarVM: f9e14e9ca8 | (Jonathan Worthington)++ | src/strings/utf8_c8.c
Merge pull request #494 from MoarVM/utf8-c8-boundary-fix

Decode (hopefully) all NFC UTF8 to NFG in UTF8-C8
18:29 pyrimidine joined 18:45 camelia joined 18:48 camelia joined 18:54 pyrimidine joined 19:01 Geth joined 19:13 brrt joined
brrt ohai #moarvm 19:13
i'm writing a longish blog post on the new register allocator and i've figured out a bug
it is an extremely annoying bug, which is why i want to tell you about it 19:14
if you read the literature about linear scan, the received wisdom is: expire registers prior to allocating a new one, so that if one of the input registers has its last use in creating this live range, you can reuse its register 19:15
especially for two-operand instruction sets like x86-64, that's great, because that matches well with how the architecture works
however, to get that effect, you need to arrange registers in a stack, not a register buffer 19:16
s/register buffer/ring buffer/
so, that's one thing, but things get worse 19:17
suppose you have no registers left and need to spill a value
19:18 pyrimidine joined
brrt suppose you pick to spill a value which is used for the next instruction (i.e. where the new live range starts) 19:18
so then we split the live range into 'atomic' ranges 19:19
since the new 'atomic' range is not in the past, it can't be retired, and must be put on the worklist
however, once that's done, it can be immediately expired 19:20
so suppose i have *two* such 'atomic' live ranges 19:22
then one is allocated, e.g. to register rcx; before I allocate the second, this is expired, rcx is returned to the stack; the second value is also loaded into rcx, and my program is wrong 19:24
... and yes, as usual, i know how to fix this 19:26
but i'm *annoyed*
also because the literature is just wrong about this 19:27
TimToady .oO("We don't know why your Fortran program crashes, but if you just throw in a few extra 'continue' statements, it should start working again.") 19:28
notviki :o 19:29
This all sounds fancy pants.
19:29 zakharyas joined
notviki brrt: is that stuff hard to learn? :) 19:30
TimToady and yes, I heard that when I was (quite a bit) younger
brrt JIT compilers have many moving parts. that makes them kind of hard to explain
each of the individual things is bog-standard. binary heap, disjoint set, linked lists 19:31
TimToady: how.. even
TimToady probably buffer boundary issues 19:32
brrt how does 'continue' make it work then :-o 19:33
did that actually help
notviki: i'm kind of hoping my blog has some practical hints on how you can do things. and i try to keep the LoC of the jit low 19:34
that's also because it's just me writing it now
notviki brrt: do you have a CS degree? 19:35
brrt (the proper fix, by the way, is to do two things; expire values *after* rather than *at* their last use; and add a special 'reuse' mechanism that checks if a register can be reused *at* its last use; alternatively, expire a register only once at a given code point) 19:39
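A small C simulation of the failure described above, under stated assumptions: live ranges are closed intervals of code positions, free registers are kept on a stack, there is no spilling, and two 'atomic' ranges both cover only position 5. None of this is the MoarVM JIT's code; LiveRange, ExpirePolicy and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REGS 8

typedef struct { int start, end, reg; } LiveRange;

typedef enum { EXPIRE_AT_LAST_USE, EXPIRE_AFTER_LAST_USE } ExpirePolicy;

/* Assigns registers to ranges sorted by start. Returns 0 on success,
 * -1 if no register is free (this sketch does not spill). */
int linear_scan(LiveRange *ranges, size_t n, int num_regs, ExpirePolicy policy) {
    int    free_stack[MAX_REGS];
    size_t active[MAX_REGS];
    int    top = 0;
    size_t num_active = 0;

    assert(ranges != NULL && num_regs > 0 && num_regs <= MAX_REGS);
    for (int r = num_regs - 1; r >= 0; r--)
        free_stack[top++] = r;              /* register 0 is popped first */

    for (size_t i = 0; i < n; i++) {
        size_t kept = 0;
        /* Expire step: push registers of finished ranges back on the stack. */
        for (size_t j = 0; j < num_active; j++) {
            const LiveRange *a = &ranges[active[j]];
            int dead = (policy == EXPIRE_AT_LAST_USE)
                     ? a->end <= ranges[i].start
                     : a->end <  ranges[i].start;
            if (dead) free_stack[top++] = a->reg;
            else      active[kept++] = active[j];
        }
        num_active = kept;

        if (top == 0) return -1;
        ranges[i].reg = free_stack[--top];  /* most recently freed register */
        active[num_active++] = i;
    }
    return 0;
}

/* Two ranges conflict if they share a position and a register. */
int ranges_conflict(const LiveRange *a, const LiveRange *b) {
    int overlap = a->start <= b->end && b->start <= a->end;
    return overlap && a->reg == b->reg;
}

/* Two atomic ranges at the same position, allocated under the given policy. */
int demo_conflict(ExpirePolicy policy) {
    LiveRange r[2] = { {5, 5, -1}, {5, 5, -1} };
    int rc = linear_scan(r, 2, 2, policy);
    assert(rc == 0);
    return ranges_conflict(&r[0], &r[1]);
}
```

Expiring at the last use hands the first range's register straight to the second range; expiring only after the last use avoids that, at the price of losing the reuse that a separate 'reuse' check would have to restore.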
notviki: not really, i've a degree in environmental science :-) 19:41
i've kind of learned by brute force. perhaps not the most efficient way of doing it 19:42
TimToady brrt: it doesn't have to be continue, it can be anything that shifts the positions of the rest of the program, but continue is a no-op 19:43
but it's also possible they used it to mark basic block boundaries, or some such 19:44
brrt hmm, that makes some sense 19:45
TimToady ancient Fortran optimizers were scary good, except when they weren't
brrt :-) 19:46
these days i think we have a bit more theory behind it
i kind of like the 'expire once per codepoint' solution best
simplest to implement
notviki :)
brrt and sufficient. but brittle 19:47
19:47 mtj_ joined
brrt otoh, everything about compilers is brittle 19:47
TimToady .oO(that's why compiler writers make peanuts...) 19:49
brrt i like peanut butter, so that's something 19:51
also, i think that actual compiler writers (that know what they are doing) actually have reasonable salaries
samcv ok i'm back 20:17
well uhm. i mean. good morning 20:18
o/
20:23 pyrimidine joined
brrt good * samcv 20:39
evening for me
samcv i'll brb in an hour or so 20:43
brrt in an hour or 10 or so :-) 20:44
sleep &
20:50 pyrimidine joined
jnthn o/ samcv 20:50
20:57 pyrimidine joined 21:28 pyrimidine joined 21:33 pyrimidine joined 22:02 pyrimidine joined 22:47 pyrimidine joined 23:26 tbrowder joined
samcv jnthn, i have rewritten it to use the new reworked string creation function. waiting for travis builds to complete now 23:30
renamed it getstrfromname instead of getstrbyname, because of the existing getcpfromname
jnthn samcv++ 23:36
Will look in the morning :)
samcv aww ok
jnthn Super sleepy :)
samcv k
jnthn Found a moment to look anyway 23:40
Spotted one more thing
But overall looks close
Anything else before I attempt sleep? )
samcv uhm i think that's it 23:42
jnthn OK
Be back in the morning then :) 23:43
samcv sleep well :)
jnthn Thanks...here's hoping :) 23:44
o/
samcv o/ 23:45
timotimo sleep wellthn 23:53