01:49 ilbot3 joined 02:20 colomon joined 03:16 vendethiel joined 03:43 AlexDaniel joined
Geth_ MoarVM: 9531ccc915 | (Samantha McVey)++ | 3 files
Use normalize_should_break to decide if concat needs normalization

This greatly increases the speed of normalization in many cases when using non-ASCII characters.
We rename should_break to normalize_should_break and save a lot of non-needed normalization, only renormalizing when the normalized form would actually be different.
05:48
06:05 brrt joined
Geth_ MoarVM/even-moar-jit: 6854e97ffd | (Bart Wiegmans)++ | 4 files
A real parser for s-expressions

With a tokenizer, lookahead, and recursive descent, alright. Also, supports strings, which I think can be pretty useful, and is considerably more flexible as to what is or is not a 'word'.
06:20
MoarVM/even-moar-jit: 6a3471a348 | (Bart Wiegmans)++ | 5 files
Flatten LABEL nodes

We used to have the labels wrap a const, but that's really redundant, as the label can't be anything but a constant itself. So instead lets use the label as the const.
06:53 domidumont joined 06:55 domidumont joined 07:10 domidumont joined 07:12 brrt joined 07:20 brrt joined 07:47 zakharyas joined
brrt 'optimistic' store insertion is finicky, it seems.. 07:53
samcv brrt, so i'm exploring ways to not have to renormalize an entire string just when concatting it 08:14
so. what i will need to do though is dealloc the last grapheme/first grapheme on string a or string b
brrt ohai samcv 08:15
allow me to help
samcv so i want to know if there's anything special i should know in addition to what you would do with C
brrt okay, ehm, let me think
samcv well i guess i don't need to dealloc it. though i'd like to know how to do that too. but. i need to make a new string which contains the data of first and second strings except not the end or beginning of one of them 08:16
and it was my impression when you concat things, you create a new string with two strands. string a and string b
brrt that's usually what happens, yes
samcv ok
so that is simple. but i am going to make a new string that contains part of each string
idk is there anything special i should know about 08:17
brrt i'm actually not super familiar with the layout of MVM strings 08:19
samcv ok well. they're just int arrays
brrt uhuh
samcv as far as we care at this moment
so i just need to make sure that the GC will know that i'm only using a certain section of that string
brrt if allocated via malloc, you can 'dealloc' with realloc, just with a smaller number
aha
why?
samcv my $a = "blah"; my $b = 'stuff'; my $c = $a ~ $b; so ok we end up having to change the normalization 08:20
hypothetically. so i end up wanting 'bla' + X + 'tuff'
so i make a new MVM string and set the data to point to the right point and set the grapheme count properly etc 08:21
i know how to do that
timotimo when you call realloc on a piece of memory and you only reduce it by a little bit, usually the allocator decides to just do nothing
samcv timotimo, well i just want to not mess with $a and $b
timotimo don't forget that you're not allowed to share buffers between MVMStrings
samcv until it gets GC'd
timotimo yeah
samcv i can't share the buffer?
that doesn't make sense from what i'm reading here in concat 08:23
or is it because the strands literally point to the *same* string
brrt yeah, i'm not sure either. i thought that's what the whole strands thing did 08:24
samcv also is there not some fancy thing i can do to share the buffers?
so that the gc knows that x1-x3 memory address go to string $a and x1-x2 goes to string $c 08:25
or can we not do that
because i really want to do that
timotimo that only works for reference-counted or otherwise gc-managed things
samcv i mean if the GC kept track properly it should be possible
hm
timotimo i don't think strands are gc-managed, though?
samcv i mean strands are made up of strings 08:26
literally the same strings that it started with
in simple cases
brrt i think i'm confused, in geneal 08:27
*general
anyway
you'd never 'dealloc' the memory, you'd just use a smaller portion of it
samcv yes exactly
ignore the part about deallocing
yes i use a smaller portion of it 08:28
timotimo oh, strands do point at other MVMString objects
that'd be fine, then
samcv yes
but. i want to use part of another string
brrt a substring
yes
timotimo i had a very bad night with not actually sleeping :<
samcv and i'm just concerned about what happens to make sure that if string $a gets gc'd that i don't get things exploding 08:29
since string $c contains a substring of it
brrt well as long as you point to the full collectable...
samcv though tbh it would be fine to waste one or 4 bytes of space in most cases
in almost all cases the substring will be almost the whole string 08:30
so i suppose it would be ok if it kept around that extra stuff
timotimo we might want a little heuristic so we don't make strands if the full buffer itself would take up less space than building a strands array would
samcv well yes 08:31
09:10 robertle joined 09:38 domidumont joined 09:56 domidumont1 joined
samcv ok so guys github.com/MoarVM/MoarVM/compare/m...130f8cR558 10:32
this is my 1/2 actual code and the ending is pseudocode for what i want to do
for now going to only cover the case where we have two non-synthetics 10:33
maybe jnthn can help out since he knows strands better than me
timotimo newstring = a[0..*-2] ~ r_string ~ b[1..*-1]; ... this is not C %) 10:34
samcv i said it was half pseudocode 10:35
timotimo ah
sorry, i somehow read about half of that text
don't allow me to operate a motorized vehicle today
samcv i mean there i just need to compose a string that is all but the last cp of string a + the new chr + all but the first char of string b
otherwise the code is functional 10:36
timotimo so you'd create a new MVMString for the in-between grapheme, and a three-strand MVMString that has n-1 from the first string, the new chr, and all but the first ... yeah
brrt yes. it is quite clever 10:37
the only thing i'm not sure about is how you'd actually 'do' the substringing 10:38
samcv yes
well i can make a new string that's the same but has slightly different offsets
though i don't know how the gc knows about strings. when does it learn of them?
i mean if the gc thinks strand a of the new string is the SAME as string a from the concat we will be okay 10:39
brrt i see what you mean
and sadly, i don't know how to resolve that
samcv you guys know better about how gc works than i do 10:40
brrt well, to be fair, the bus-factor on our GC is about a single jtnhtn 10:42
jnthn
samcv what if jnthn gets GC'd :( 10:43
can we call it the GC factor? :P
heh
timotimo sorry, what is the question now? 10:44
a strand holds on to a full MVMString but it also holds an individual start and length field 10:45
that's how you get a substring from another string
brrt so
if you can help samcv do that, then we're done for today
samcv ah so strands already do what i want timotimo ?
timotimo now if you have a 500 megabytes long string and you take a 10 character substring and it ends up as a strand, you'll keep the 500 megabytes alive through your tiny substring
does that give you a better feel for the thing?
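To make the point about strands concrete, here is a minimal sketch of a substring expressed as a view onto another string. The `FlatString` and `Strand` types and the `strand_*` helpers are hypothetical, simplified stand-ins, not the real MoarVM definitions; the sketch assumes a strand records a start and an end offset and holds a reference to the whole underlying string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real MoarVM structures. */
typedef struct {
    const int32_t *graphemes;  /* flat buffer of grapheme codes */
    size_t         num_graphs; /* number of graphemes in the buffer */
} FlatString;

typedef struct {
    const FlatString *blob; /* the whole underlying string; this reference keeps it alive */
    size_t start;           /* first grapheme covered (inclusive) */
    size_t end;             /* one past the last grapheme covered */
} Strand;

/* Describe graphemes [start, end) of s without copying any of them. */
static Strand strand_substr(const FlatString *s, size_t start, size_t end) {
    assert(start <= end && end <= s->num_graphs);
    Strand st = { s, start, end };
    return st;
}

static size_t strand_length(const Strand *st) {
    return st->end - st->start;
}

static int32_t strand_get(const Strand *st, size_t i) {
    assert(i < strand_length(st));
    return st->blob->graphemes[st->start + i];
}
```

Because the strand points at the full string rather than at a slice of its buffer, the collector only ever sees one owner for that buffer. The cost is the one timotimo describes: a tiny view keeps the entire underlying string reachable.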
samcv yeah that won't happen though
but i see your point
but yeah that's what i want :) 10:46
will start at single character and then once that's confirmed working will work on making it longer based on how much is needed for the specific string
to be renormalized or whatever
timotimo don't forget to have MVMROOT everywhere 10:47
samcv yep 10:54
well i must sleep now :)
thx for the info timotimo 10:55
timotimo i hope you sleep better than i didn't
samcv and if jnthn gets around please message me useful tips so i get it when i wake up :)
i hope i don't sleep as bad as you did
Zzzz night o/ 10:59
Geth_ MoarVM: bfaeb405d4 | (Samantha McVey)++ | 3 files
Rename should_break to MVM_unicode_normalize_should_break

  @zhuomingliang++
11:21
samcv ok now i'm really really going to bed ;)
brrt yes! 11:43
what is it with moarvm developers and sleep 11:44
11:46 AlexDaniel joined
nwc10 seems to be a known failure mode of humans. 11:54
12:05 brrt1 joined 12:13 domidumont joined 12:22 robertle_ joined 12:23 domidumont1 joined 12:27 Ven joined, domidumont joined 12:59 domidumont1 joined
jnthn is back 13:12
Glad to report I didn't get GC'd :)
samcv: You can safely do something like allocate a new strand string, make the first strand a substring of the first string, the third strand a substring of the second string, and then allocate another string for the second strand with the twiddled bits in it. That'd work, you just need to MVMROOT things you'll talk about after allocating the strand and new middle string 13:15
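The three-strand layout jnthn describes can be sketched as follows. All names here (`FlatString`, `Strand`, `ConcatResult`, `concat_with_fixup`, `concat_get`) are hypothetical and simplified; the sketch ignores GC rooting entirely and assumes exactly one grapheme is dropped from each side of the seam.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real MoarVM structures. */
typedef struct { const int32_t *graphemes; size_t num_graphs; } FlatString;
typedef struct { const FlatString *blob; size_t start, end; } Strand;
typedef struct { Strand strands[3]; } ConcatResult;

/* Result = a without its last grapheme, then the renormalized middle,
 * then b without its first grapheme. Nothing in a or b is copied or changed. */
static ConcatResult concat_with_fixup(const FlatString *a,
                                      const FlatString *middle,
                                      const FlatString *b) {
    assert(a->num_graphs >= 1 && b->num_graphs >= 1);
    ConcatResult r;
    r.strands[0] = (Strand){ a, 0, a->num_graphs - 1 };
    r.strands[1] = (Strand){ middle, 0, middle->num_graphs };
    r.strands[2] = (Strand){ b, 1, b->num_graphs };
    return r;
}

static size_t concat_length(const ConcatResult *r) {
    size_t total = 0;
    for (size_t s = 0; s < 3; s++)
        total += r->strands[s].end - r->strands[s].start;
    return total;
}

/* Walk the strands to find the one that contains index i. */
static int32_t concat_get(const ConcatResult *r, size_t i) {
    for (size_t s = 0; s < 3; s++) {
        size_t len = r->strands[s].end - r->strands[s].start;
        if (i < len)
            return r->strands[s].blob->graphemes[r->strands[s].start + i];
        i -= len;
    }
    assert(!"index out of range");
    return -1;
}
```

With "blah", "X" and "stuff" as inputs this yields the 'bla' + X + 'tuff' shape from earlier in the log. In the real VM the new strand string and the new middle string are both allocations, which is why the inputs need MVMROOT around them.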
13:18 colomon joined
Geth_ MoarVM: a3a58c82dd | (Jonathan Worthington)++ | src/core/continuation.c
Clear cached dynamics when taking continuation.

They may be from frames that are no longer visible at the location that the continuation will be invoked. Fixes problems with $*THREAD being outdated after an `await` in Perl 6.d.PREVIEW.
14:24
dogbert17 jnthn, github.com/MoarVM/MoarVM/issues/612 might be of interest wrt to your current spesh work 14:31
jnthn Hm, doubt it's on the path my current work would touch 14:38
But given designing/implementing the type planner isn't really something to do on an afternoon after returning from vacation, will take a look at it now :) 14:39
dogbert17 :) 14:40
hope you had a nice vacation
jnthn Yeah, we made a short trip to Olomouc, which turns out to be a very pleasant city. 14:41
It's most famous for having a very smell kind of cheese. :) 14:42
*smelly
Gee, this bug doesn't golf terribly far 14:47
14:49 AlexDaniel joined
Geth_ MoarVM: 1fe53f6d7d | (Jonathan Worthington)++ | src/core/args.c
Missing MVMROOT during an allocation.
15:05
jnthn I found that, but sadly it ain't The Bug we're hutning
*hunting 15:06
15:06 brrt joined
dogbert17 jnthn++, you probably solved some other reported bugs with that fix :) 15:07
jnthn MVMCallCapture sure has its share of historical cruft 15:11
brrt \o/ 15:14
i need some thinking assistance today 15:15
jnthn o/ brrt
Geth_ MoarVM: 352d42c40b | (Jonathan Worthington)++ | src/6model/reprs/MVMCallCapture.h
Remove a flag option that is no longer mentioned.
15:17
MoarVM: 5dc3389fea | (Jonathan Worthington)++ | 2 files
Remove never-assinged use_mode_frame field.

A leftover from when the call capture has a use mode other than the one where a copy of the args was taken for safety reasons.
brrt \o jnthn 15:20
the thinking assistance that is required
i have decided to move towards 'optimistic' store insertion
roughly, i currently immediately issue a STORE node for every generated value 15:21
that's easy and correct
nine But generates lots of unnecessary STOREs? 15:23
brrt yes 15:28
now, i want to move to a scheme in which we only insert them when required
(currently, at the end of every basic block)
but even that isn't necessary if we start to handle multiple basic blocks, although that is some way off 15:29
nine Well, one step at a time
Geth_ MoarVM: 2f9be082a3 | (Jonathan Worthington)++ | 4 files
Remove now-unused cur_usecapture per thread.
15:30
brrt indeed 15:31
but on the other hand, as long as we still issue a store after every computed value...
well, all the other optimizations we might do are not very sensible then
now, as to 'where am i' 15:32
the basic objective is to have a 'flush' function, which makes the in-memory state compatible with what the interpreter expects
Geth_ MoarVM: 8aa657d60a | (Jonathan Worthington)++ | 3 files
Eliminate mode flag from CallCapture.

They are all save-mode now.
15:33
brrt that's a repurposing of the existing 'flush' function, which makes sure that nodes defined previously are no longer seen as available because control flow might be interrupted
jnthn Sorry, bit headachey (probably tired from travel), gonna have to go rest some 15:36
bbiab
dogbert17 jnthn: rest well 15:41
brrt rest well yes 15:44
continuing: so the way to do that, is to keep track of which node last computed a value, and wrap that with a STORE
well, that's where i am, but... 15:48
i'm already wrapping some nodes with a store
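One way to picture the bookkeeping brrt describes is the toy model below: remember which node last computed each register, and only decide what to wrap in a STORE when a flush happens. Everything here (`StoreTracker`, `tracker_*`, integer node ids, a fixed register count) is a hypothetical illustration, not the expression JIT's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS 8
#define NO_NODE  (-1)

/* Hypothetical model: for each register, the id of the node that last
 * computed its value and has not yet been stored, or NO_NODE. */
typedef struct {
    int last_def[NUM_REGS];
} StoreTracker;

static void tracker_init(StoreTracker *t) {
    for (size_t r = 0; r < NUM_REGS; r++)
        t->last_def[r] = NO_NODE;
}

/* A node computed a new value for reg; no store is emitted yet. */
static void tracker_define(StoreTracker *t, size_t reg, int node) {
    assert(reg < NUM_REGS && node >= 0);
    t->last_def[reg] = node;
}

/* Flush: collect, in register order, the nodes that need wrapping in a
 * STORE so memory matches what the interpreter expects. Returns how many
 * stores are needed and clears the pending state. */
static size_t tracker_flush(StoreTracker *t, int out_nodes[NUM_REGS]) {
    size_t count = 0;
    for (size_t r = 0; r < NUM_REGS; r++) {
        if (t->last_def[r] != NO_NODE) {
            out_nodes[count++] = t->last_def[r];
            t->last_def[r] = NO_NODE;
        }
    }
    return count;
}
```

In this model a register that is redefined three times inside a basic block costs one store at the flush instead of three, which is the saving nine asks about.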
timotimo rested up some 16:03
17:24 domidumont joined 18:17 TimToady joined 18:46 zakharyas joined 19:08 colomon joined 19:36 brrt joined 19:49 AlexDaniel joined
samcv good * 20:09
brrt ohai 20:14
okay, 9 hours is reasonable for sleep
samcv good hi 20:34
jnthn o/ samcv
samcv hi jnthn :) 20:35
jnthn, where should i look to see how i make strands a substring of another? do i want to look at .substr ?
jnthn Yeah, see .substr
Or just read the MVMStrand or whatever it is data structure
samcv ok :) 20:36
yay!
jnthn github.com/MoarVM/MoarVM/blob/mast...ring.h#L53
samcv sexy
jnthn start and end you set to indicate the regions
repetitions isn't useful for what you're doing, I suspect, but it's the thing that lets us attempt to support the Perl 6 x operator efficiently 20:37
samcv attempt heh
jnthn Well, in the best case it saves us allocating memory for the repetitions, an in the worst case it just puts off when we have to 20:38
*and
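The idea behind a repetition count can be sketched like this. The types and helpers are hypothetical stand-ins, and the sketch assumes `repetitions` counts the extra copies beyond the first, so a value of 0 means the region appears once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real MoarVM structures. */
typedef struct { const int32_t *graphemes; size_t num_graphs; } FlatString;

typedef struct {
    const FlatString *blob;
    size_t   start, end;   /* region [start, end) of blob */
    uint32_t repetitions;  /* assumed: number of EXTRA copies of the region */
} Strand;

/* Logical length: the region repeated (repetitions + 1) times. */
static size_t strand_total_length(const Strand *st) {
    return (st->end - st->start) * ((size_t)st->repetitions + 1);
}

/* Index into the repeated region without ever materializing the copies. */
static int32_t strand_get(const Strand *st, size_t i) {
    size_t span = st->end - st->start;
    assert(span > 0 && i < strand_total_length(st));
    return st->blob->graphemes[st->start + i % span];
}
```

Under that assumption, something like `"abc" x 3` needs only the three original graphemes plus a count of 2, and memory for the repeated text is needed only if the string is later flattened.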
samcv jnthn, ok so i have it almost done. but i'm not sure how to deal with copy_strands 21:25
so i want to copy the strands and then afterward i need to edit the strands i guess right?
so that they refer to a different section of the string
timotimo you should be able to construct the strands array from scratch, too 21:26
jnthn samcv: that's only relevant if your input is a strand string itself 21:30
samcv yep i know
jnthn samcv: We don't do tree-like structures
We only allow a single "level" of strands 21:31
samcv ok so uh
well yeah i know it's not a tree
it copies the strands
jnthn So if you have one string coming in with strands, you'd use copy_strands to stick those strands into the result
It's really just a wrapper around memcpy that does size calcs :)
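A "memcpy with size calcs" helper of the kind jnthn mentions might look roughly like this. The signature and the `Strand` type are hypothetical; the real `copy_strands` in MoarVM takes different arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified strand: a region [start, end) of some blob. */
typedef struct { const void *blob; size_t start, end; } Strand;

/* Copy count strand descriptors from from[from_offset..] into
 * to[to_offset..]. Only the metadata is copied, never the string data.
 * The two arrays must not overlap. */
static void copy_strands(const Strand *from, size_t from_offset,
                         Strand *to, size_t to_offset, size_t count) {
    assert(from != NULL && to != NULL);
    memcpy(to + to_offset, from + from_offset, count * sizeof(Strand));
}
```

Since only descriptors are copied, the cost depends on the number of strands rather than on the length of the strings they describe.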
samcv github.com/MoarVM/MoarVM/compare/m...130f8cR455 21:32
if you want to see what i have so far
jnthn, well yes i know that
so then i just need to go to the last strand of string a and change the length of the cp's
jnthn OK, I'm not sure what more there is to know :)
Ah, right
Yes, you'd do that after the copy :)
samcv check out what i have so far
timotimo i didn't realize we don't tree it up in there 21:36
samcv probably nicer to just do a small amount of copying i'd think
to keep things simple since we're just copying the metadata
jnthn Hm, would renormalized_section that is passed to strand_renormalize_part not need to be &renormalized_section so it can be updated?
samcv MVMString *result, *renormalized_section; 21:37
hm
it might
jnthn I think around line 551 after copy_strands, you just want to take the final strand and tweak its .end though
using consumed_a
And similar around 540
samcv ok tweak them instead of originally setting them different 21:38
that could be cleaner
jnthn Yeah, you can tweak it because after the copy you're tweaking "your" things
samcv maybe make a function that tweaks them?
jnthn And so not violating the immutability
I dunno if a function is worth it 'cus the tweaks are different? 21:39
samcv we will see
jnthn It's last strand last char in one case, and first strand first char in the other
Which doesn't feel like it leaves a lot to factor out :)
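The copy-then-tweak approach discussed here can be sketched as below. `join_trimmed` and its parameters are hypothetical names (only `consumed_a` appears in the conversation), the strand type is simplified, and the sketch assumes none of the strands involved use repetitions and that the output array is large enough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified strand: a region [start, end) of some blob. */
typedef struct { const void *blob; size_t start, end; } Strand;

/* Write a's strands, then the renormalized middle strand, then b's strands
 * into out, trimming consumed_a graphemes off the end of a's last strand
 * and consumed_b graphemes off the start of b's first strand.
 * out must have room for a_n + 1 + b_n strands. Returns the count written. */
static size_t join_trimmed(const Strand *a, size_t a_n, size_t consumed_a,
                           const Strand *middle,
                           const Strand *b, size_t b_n, size_t consumed_b,
                           Strand *out) {
    assert(a_n > 0 && b_n > 0);
    memcpy(out, a, a_n * sizeof(Strand));
    out[a_n] = *middle;
    memcpy(out + a_n + 1, b, b_n * sizeof(Strand));

    /* These are the result's own copies, so adjusting them does not
     * change a or b, which stay immutable. */
    Strand *last_a  = &out[a_n - 1];
    Strand *first_b = &out[a_n + 1];
    assert(last_a->end - last_a->start >= consumed_a);
    assert(first_b->end - first_b->start >= consumed_b);
    last_a->end    -= consumed_a;
    first_b->start += consumed_b;
    return a_n + 1 + b_n;
}
```

The two tweaks touch opposite ends of different strands, which matches jnthn's remark that there is little to factor out into a shared helper.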
timotimo still wondering if i'll ever get in-situ storage for strings and/or mvmarrays working 21:40
21:43 colomon_ joined
jnthn suspects he should go and rest so he stands a chance of getting anything working tomorrow. :) 21:44
lizmat feels likewise
jnthn 'night o/ 21:49
samcv jnthn, nothing bad happens if i go over max strands y/n 21:52
yay lots of segfaults :) 22:18
22:32 d4l3k_ joined 22:35 arnsholt joined, nwc10 joined