00:33
eternaleye joined
TimToady | I'd think hash functions would be a good use for polymorphism based on the storage type | 00:47 | |
if you're using conditionals for that, it's kinda smelly | 00:48 | ||
but maybe the rules are different for in-lined stuff... | 00:49 | ||
00:50
jnap joined
TimToady | (just spouting off, haven't actually looked at what you're doing :) | 00:50 | |
01:30
lizmat_ joined
04:45
lizmat joined
05:50
woolfy joined
06:07
woolfy joined,
woolfy left,
colomon joined
06:58
zakharyas joined
timotimo | it's kind of hard to do since the uthash is implemented with c preprocessor macros, so if the hash function to implement is conditional upon the type, some kind of conditional needs to remain in the code | 07:07 | |
we could of course put a function pointer into the struct that you have to put into the hashable entries anyway, but that is quite a lot of overhead - or at least it seems to me. | 07:08 | ||
07:10
brrt joined
07:11
lizmat joined
brrt | \o #moarvm | 07:20 | |
jnthn | o/ brrt | 07:21 | |
nwc10 | \o/ | ||
timotimo | o | 07:22 | |
moritz | \|o|/ | ||
brrt | wow, such happiness :-) | ||
brrt reading backlog | 07:23 | ||
jnthn | so joy | 07:24 | |
brrt | i'm at a loss what the goal of timotimo's last commit is tbh | 07:27 | |
timotimo | the one from uthash_padding? | 07:28 | |
brrt | yes | ||
doesn't seem to be used anyway? | |||
anywhere | |||
timotimo | not yet | ||
we are currently forced to turn all our strings into the 32bit representation so that two equal strings will hash to the same value | |||
otherwise we wouldn't be able to use hashes any more unless we force every string that can be in 8 bits to always be in 8 bits | 07:29 | ||
brrt | oh…. | ||
timotimo | that's also why you see the MVM_string_flatten call everywhere | ||
that gets rid of ropes and forces the 32bit thing | |||
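One way to avoid forcing every string into the 32-bit representation would be to hash over code points instead of raw storage bytes. The sketch below illustrates only that idea and is not MoarVM's or uthash's code: `DemoString`, `StorageKind`, and `demo_hash` are hypothetical names, and FNV-1a is swapped in for brevity rather than because either project uses it. A conditional on the storage type still remains in the code path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical storage kinds; the names are illustrative, not MoarVM's. */
typedef enum { STORAGE_8BIT, STORAGE_32BIT } StorageKind;

typedef struct {
    StorageKind kind;
    size_t      length;          /* number of code points */
    union {
        const uint8_t  *narrow;  /* one byte per code point */
        const uint32_t *wide;    /* four bytes per code point */
    } data;
} DemoString;

/* Fetch code point i regardless of how the string is stored. */
static uint32_t demo_codepoint_at(const DemoString *s, size_t i) {
    assert(s != NULL && i < s->length);
    return s->kind == STORAGE_8BIT ? s->data.narrow[i] : s->data.wide[i];
}

/* FNV-1a over code points, each fed as four bytes (low byte first), so the
 * result depends only on the code points and not on the storage width. */
uint32_t demo_hash(const DemoString *s) {
    uint32_t h = 2166136261u;                 /* FNV offset basis */
    for (size_t i = 0; i < s->length; i++) {
        uint32_t cp = demo_codepoint_at(s, i);
        for (int shift = 0; shift < 32; shift += 8) {
            h ^= (cp >> shift) & 0xFFu;
            h *= 16777619u;                   /* FNV prime */
        }
    }
    return h;
}
```

With this shape, an 8-bit and a 32-bit string holding the same code points hash identically; the price is a branch per code point inside the hash loop.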
brrt | hmmm | 07:30 | |
jnthn | also 'cus the ropes code is...uh...ropey | ||
brrt | but then you have 32 bit strings … everywhere | ||
jnthn | brrt: Right | 07:31 | |
timotimo | that's true | ||
uthash doesn't make it very easy to swap out the hash function for different things | |||
jnthn | brrt: Which in terms of "get the right answers up to codepoint level" and "constant time indexing" is a fine choice. | ||
brrt | i guess thats somewhere between acceptable and annoying | ||
jnthn | Just not for efficiency. But optimization always comes after working. :) | ||
brrt | true enough | 07:32 | |
timotimo | one way to "defeat" this problem is to write a "key extractor" function that turns the string into the 32bit representation for hashing, but that's an immense amount of overhead | ||
jnthn | uthash.h is like, epic macros | ||
timotimo | maybe i can factor out the "calculate the hash bucket" part of all the hash functions and allow the user to calculate the hash bucket directly | ||
jnthn | yeah | ||
timotimo | that would make it much less hacky i think | ||
jnthn | ok, I should teach stuff | ||
timotimo | except then you can do even more terrible things :) | 07:33 | |
jnthn | bbiab | ||
brrt | what today? | ||
you want the bucket rather than the hash to be computed? | |||
jnthn | brrt: MVC web dev stuff | ||
Nothing too thrilling ;) | |||
brrt | well, stay strong | ||
:-p | |||
jnthn | yeah, it's going fine :) | 08:39 | |
Well, only web-y course I gotta do this month | |||
timotimo | jnthn: putting nameds into callsites ... won't that make our callsite interning strategies much less successful? | 09:03 | |
jnthn | timotimo: ONLY IF WE DON'T UPDAET THE INTERN CODE | ||
uh, ooops] | |||
timotimo | was scared there for a little moment | 09:04 | |
the intern code can still only intern callsites if they have the same exact set of nameds, no? | |||
jnthn | hit caps lock instead of tab, then didn't look at what I'd typed :) | ||
not only exact same name, but also exact same order | 09:05 | ||
timotimo | right | ||
jnthn | That's what allows us to do the names => positions opt | ||
timotimo | mhm | ||
but at least the callsite doesn't contain a reference to what is being called, right? | |||
(except a little cache for invocants) | 09:06 | ||
jnthn | Right. :) | ||
timotimo | what file do i look at to find the callsite writing code? | ||
jnthn | src/mast/compiler.c | ||
I'd start by updating docs/bytecode.markdown or so | |||
Just to get the format clear. | |||
timotimo | oh, get_callsite_id also writes the callsite to the bytecode file | 09:07 | |
brrt | that would be awesome jnthn :-) | ||
timotimo | that explains why i overlooked it | ||
what exactly? | |||
brrt | updating the docs ;_) | ||
timotimo | oh, now i get it! | ||
brrt | ;-) | ||
jnthn | bytecode.markdown is actually quite up to date, afaik. | ||
timotimo | we're actually turning nameds into something that looks exactly like a positional (because really it is) | 09:08 | |
jnthn | brrt: Do you want a spesh doc? | ||
brrt: If so, what parts do you most want a document on? | |||
Same question to timotimo I guess :) | |||
I'm happy to write something up, but it's good to know what the goal is :) | |||
(Which should be "make things clear that the code doesn't make clear".) | 09:09 | ||
brrt is thinking about how i'd attack it | |||
timotimo | an index to the string heap is a 16 bit integer, isn't it? | 09:11 | |
jnthn | 32 now | 09:17 | |
dalek | arVM/named_to_positional: aa55bbc | (Timo Paulssen)++ | docs/bytecode.markdown: fix highlighting in vim ~_~ |
||
arVM/named_to_positional: 738a4cd | (Timo Paulssen)++ | docs/bytecode.markdown: initial spec attempt for callsites storing named arg names. |
|||
timotimo | OK | ||
jnthn | after doing r-j where JVM has 16-bit ones I...learned ;) | ||
dalek | arVM/named_to_positional: 5eedf49 | (Timo Paulssen)++ | docs/bytecode.markdown: index to string heap is 32bit big. |
09:18 | |
timotimo | should i store the names so that they line up with argument numbers or should i "compact" them? | ||
(in the actual in-memory callsite, not the bytecode format) | 09:19 | ||
i.e. will the MVMString **named_names; begin with one NULL per positional? | |||
jnthn | Hmmm | ||
timotimo | could potentially set the whole thing to NULL and not allocate at all if there's only positionals as a slight optimization | 09:20 | |
jnthn | well, yeah, you only need it if there are names, for sure | ||
I think it'll be a question of looking at what args.c needs to be fast/easily done, tbh. | |||
timotimo | oh, actually, how about this crazy idea: | 09:21 | |
... yeah, actually a crazy idea, not a good one. | |||
jnthn | Data structures need designing around use cases :) | ||
timotimo | forget about it :) | ||
jnthn | I *think* that args.c may be efficiently implementable with them compacted. | 09:22 | |
timotimo | i also just realized that it'd always be a number of NULLs, then a number of names | ||
rather than any kind of mixture | |||
so just knowing the index of the first named will be sufficient to calculate every named index | 09:23 | ||
jnthn | and you do know that 'cus we cache num_pos | ||
timotimo | sounds great tehn | ||
then | |||
jnthn | We don't build a hash table for looking things up 'cus there's not enough names to be worth it almost all the time | 09:24 | |
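The compacted layout and the linear lookup discussed here can be sketched as follows. This is a rough illustration under stated assumptions, not the actual callsite struct: `DemoCallsite`, `arg_names`, and `demo_find_named` are hypothetical, and plain C strings stand in for `MVMString`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callsite layout: names are stored compacted, one entry per
 * named argument, with no NULL placeholders for the positionals. */
typedef struct {
    size_t       num_pos;    /* cached count of positional arguments */
    size_t       num_named;  /* number of named arguments */
    const char **arg_names;  /* num_named entries, or NULL if there are none */
} DemoCallsite;

/* Linear scan over the names. Because all positionals come first, the
 * i-th name maps to overall argument index num_pos + i.
 * Returns that index, or -1 if the name is not present. */
long demo_find_named(const DemoCallsite *cs, const char *name) {
    assert(cs != NULL && name != NULL);
    for (size_t i = 0; i < cs->num_named; i++) {
        if (strcmp(cs->arg_names[i], name) == 0)
            return (long)(cs->num_pos + i);
    }
    return -1;
}
```

A callsite with only positionals can leave `arg_names` as NULL and skip the allocation, since the loop never runs when `num_named` is zero.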
brrt | can we do symbolic lookups? | 09:25 | |
timotimo | huh? where would that hash table live/what would it be used by? | 09:26 | |
jnthn | well, potentially you could have a name => index hash on a callsite | 09:27 | |
but it's not gonna be worth it | |||
lunch time; bbiab | |||
timotimo | ah, aye | ||
a linear scan would probably always be better | |||
except if you have something like "Hash.new(:foo<bar>, [ and 1000 others ])" | |||
that could potentially wreak some havoc | 09:28 | ||
brrt | you know what might be worth it? | 09:29 | |
timotimo | do tell :) | ||
brrt | sorting the symbols by (symbol pointer value or something like that) | ||
so that you could - if callsite name lists get very large - always resort to binary search | 09:30 | ||
timotimo | you'll have to give me a bit more context | ||
symbol means name of named parameter here? | |||
brrt | yes | ||
with the added notion that the named parameter should / could be 'normalized' - i.e. all instances refer to the same in-memory-object, so that comparison is pointer-comparison | 09:31 | ||
(that is to say, equality is pointer-equality, not what i said) | |||
binary search is really cheap and memory-efficient | 09:32 | ||
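A minimal sketch of that idea, assuming names are already interned so that equal names share one pointer. The helper names are hypothetical, and how sorting would interact with argument order is left open, just as it is in the discussion. Pointers are compared through `uintptr_t`, since relational comparison of unrelated pointers is not defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Order two interned-name pointers by their address value. */
static int demo_ptr_cmp(const void *a, const void *b) {
    uintptr_t x = (uintptr_t)*(const char *const *)a;
    uintptr_t y = (uintptr_t)*(const char *const *)b;
    return (x > y) - (x < y);
}

/* Sort a callsite's name list by pointer value (done once, up front). */
void demo_sort_names(const char **names, size_t count) {
    assert(names != NULL && count > 0);
    qsort(names, count, sizeof *names, demo_ptr_cmp);
}

/* Binary search by pointer identity. Returns the position in the sorted
 * list, or -1 if this exact pointer is not present. */
long demo_find_interned(const char *const *names, size_t count,
                        const char *name) {
    const char *const *hit =
        bsearch(&name, names, count, sizeof *names, demo_ptr_cmp);
    return hit ? (long)(hit - names) : -1;
}
```

The lookup never reads string contents, so a name with equal text but a different address is reported as missing; the scheme only works if interning is complete.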
timotimo | i'm not entirely sure how exactly that plays together with all other pieces of the puzzle | ||
brrt | neither do i | 09:33 | |
timotimo has much code to read | 09:34 | ||
brrt | :-) | ||
timotimo | hey, if you want to do something string-related that'll probably pay off in memory usage: | ||
there's strings that show up incredibly often, over and over again | 09:35 | ||
i've tried to add a string interning step to the minor collection of the nursery before, that didn't work out well | |||
brrt | such as? | ||
hmmm | |||
timotimo | "dotty" | ||
(this is just from settings compilation, though) | |||
brrt | as part of collection that wouldn't be ideal i imagine | 09:36 | |
you'd ideally want the compiler / mast to fix that | |||
timotimo | not possible | ||
"foobar".substr(1, 2) ← how to? :) | |||
brrt | … possible for spesh? | ||
timotimo | the compiler/mast already has a string heap | ||
gist.github.com/timo/e1af6d5c10a4e34d6cb0 ← check it. | 09:37 | ||
that is a random sub-sample of the gen2 | |||
jnthn | I think we need to keep GC simple, fwiw. | 10:28 | |
Otherwise we make pause times worse. | |||
10:45
lizmat joined
10:50
lizmat_ joined
11:11
brrt joined
brrt is checking | 11:13 | ||
wow | |||
these are all different strings? | |||
tadzik | so "OPER" is there 876 times? | 11:14 | |
brrt | as in, different sections of memory | 11:22 | |
tadzik | that's my understanding | ||
brrt | wow | ||
(again) | |||
ok, and name-lookups are string comparisons? | 11:25 | ||
12:22
lizmat joined
jnthn | I suspect the OPER thing gets fixed with the arnsholt++ work on O | 12:38 | |
mmmm...cheesecake | |||
13:52
vendethiel joined
14:04
jnap joined
14:31
dalek joined,
btyler joined
14:36
jnap1 joined
14:38
synopsebot joined
14:40
masak joined
timotimo | and the lowered_param_N thing i've fixed manually by stashing these strings in the Actions (i think) where they were previously generated with string concat | 15:12 | |
jnthn: we don't inline method calls yet; is that something the specializer will do at some point? or will we leave that for the jit? | 15:30 | ||
hmm. maybe the callsite used in a specialized piece of bytecode could have extra information attached to it that the specializer could then use to improve calls that come from the specialized bytecode | 15:32 | ||
oh, i think i understand why it's hard; in many cases we just put a slot for a little cache into the specialized bytecode to be filled later | 15:36 | ||
so at specialize time we may not even know a single likely candidate | |||
15:41
lizmat joined
16:01
cognominal joined
jnthn | timotimo: Well, the specializer and the JIT are very related | 16:54 | |
Such that if the specializer learns to inline then that's done the work for the JIT to also, I expect | 16:55 | ||
Or most of the work at least | |||
The way we de-opt in the two cases may be different. | |||
timotimo | mhh, okay | 16:56 | |
jnthn | On "don't know what's likely", one of the later things we'll do is 2-stage spesh | 16:57 | |
timotimo | jnthn: how do you recommend i tackle the kind of daunting task to change argument handling from "two arguments in order" to "names in the callsite"? | 16:58 | |
jnthn | The first will introduce various "recording" instructions; if the spesh remains hot enough then we'll use them to emit an even specialer version with guard clauses. | ||
timotimo | ah, these recording instructions will be using the spesh slot mechanism? | 16:59 | |
jnthn | timotimo: Well, my plan was to get the bytecode writer to emit the right thing first, and then update the bytecode reader to read them in. | ||
yeah, will use spesh slots for it. | |||
timotimo | i am not quite convinced that this will work fine without a new stage0 | 17:00 | |
jnthn | And then rebootstrap so we always have them. | ||
And then switch args.c over | |||
timotimo | oh, that makes sense | ||
jnthn | oh, it'll want a new stage0 | ||
Maybe twice over. | |||
timotimo | even if i have to make two, should i commit both to the repository? | 17:01 | |
jnthn | Anyway, my idea was to ween us off needing to put the names in. | ||
Sure | |||
You'll be doing it in a branch | |||
timotimo | i already am :) | ||
jnthn | They should be the only NQP changes needed for this. | ||
So then we just merge --squash the branch. | 17:02 | ||
timotimo | ah, fair enough | ||
what does "ween us off needing to put the names in" mean? o_O | |||
jnthn | Hack until deleting the argconst_s instructions for names doesn't break anything:) | 17:03 | |
Can leave the holes | |||
And deal with them once everything works. | 17:04 | ||
timotimo | so 1) write the names into the callsites and serialize that, 2) build a new stage0 with the names in the callsites, 3) remove the argconst_s bytecodes that put the names in the even slots | 17:05 | |
and then build an even newer stage0 that doesn't have the argconst_s bytecodes at all any more | |||
jnthn | uh, 1 is really "and deserialize too" | 17:07 | |
timotimo | ah, yeah | ||
jnthn | And 3 is a lot of work to mkake it possible :) | ||
*make | |||
timotimo | i should have put a "just" in there for good measure :D | ||
run-time named arguments and |%foo are handled by creating a new callsite on the fly, right? | 17:10 | ||
jnthn | yeaah | ||
We ignore any flattening callsites in spesh | |||
timotimo | that'll want fixed, too somewhere between 1 and 3 | ||
jnthn | And will do for the future | ||
timotimo | oh, hold on; i thought i was supposed to change named argument passing for everything ever | ||
jnthn | you still need to update it :) | 17:11 | |
Just saying that spesh isn't going to try and deal with | | |||
Not for the time being anyway. | |||
timotimo | update what now? | 17:12 | |
sorry, i seem to be having a brainfart or something | |||
.o( brainfort, much cooler than a blanketfort ) | |||
spesh isn't, but the regular code is | |||
jnthn | update the flattener | 17:14 | |
timotimo | yea. i was going to do that | ||
jnthn | Note you'll need to do GC updates also. | ||
timotimo | oh? how so? | ||
jnthn | 'cus callsites now point to MVMString which is collectable | 17:15 | |
timotimo | ah, that makes sense | 17:16 | |
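The GC consequence can be pictured with a toy worklist. Everything below is a hypothetical stand-in (`DemoCollectable`, `DemoWorklist`, `demo_mark_callsite`) and does not reflect MoarVM's actual GC API; it only shows that a structure which is not itself collectable still has to hand the addresses of its collectable pointers to the collector.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a GC header; a real string object would embed one. */
typedef struct { int marked; } DemoCollectable;

#define DEMO_WORKLIST_MAX 64

/* The worklist stores addresses of pointers rather than the pointers
 * themselves, so a moving collector could update each slot in place. */
typedef struct {
    DemoCollectable **items[DEMO_WORKLIST_MAX];
    size_t            count;
} DemoWorklist;

static void demo_worklist_add(DemoWorklist *wl, DemoCollectable **slot) {
    assert(wl->count < DEMO_WORKLIST_MAX);
    if (*slot != NULL)              /* nothing to trace for empty slots */
        wl->items[wl->count++] = slot;
}

/* Hypothetical callsite that owns references to collectable names. */
typedef struct {
    size_t            num_named;
    DemoCollectable **arg_names;    /* NULL when there are no nameds */
} DemoGCCallsite;

/* Called while gathering roots: report every name slot to the collector. */
void demo_mark_callsite(DemoWorklist *wl, DemoGCCallsite *cs) {
    assert(wl != NULL && cs != NULL);
    for (size_t i = 0; i < cs->num_named; i++)
        demo_worklist_add(wl, &cs->arg_names[i]);
}
```

If the name array is left uninitialised, a loop like this passes garbage addresses to the collector, which is one plausible way to end up with a crash inside the worklist code.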
17:44
lizmat joined
17:45
lizmat joined
17:55
benabik joined
18:47
woolfy joined
20:07
zakharyas joined
20:18
brrt joined
timotimo | the interning mechanism doesn't count nameds? | 20:29 | |
the arity it considers seems to be only the number of positionals | |||
jnthn | we don't intern named things, right. | 20:30 | |
timotimo | oh, i see that now | 20:31 | |
arg_count != num_pos | |||
that line, i overlooked | |||
so i won't need to teach the interner about nameds yet | |||
jnthn | no, that can come later | 20:49 | |
timotimo | is the "names_used" mechanism still needed the way it is right now after the refactor? | 20:50 | |
brrt | (reading backlog) yes, its deopt that will differ rather substantially | 20:54 | |
jnthn | timotimo: Well, we need it to behave the same. Doesn't have to work exactly the same. | 20:55 | |
timotimo | mhm | ||
args_proc_init would introspect the args MVMRegister assuming it's an array-like and get the strings from the args list, yes? | 20:56 | ||
or am i looking at the wrong thing? | |||
jnthn | sounds like | 20:58 | |
I mean, it doesn't introspect args today | |||
timotimo | i feel kinda dumb. maybe today isn't a good day | ||
jnthn | It knows what's there from arg_flags | ||
timotimo | i thought that thing is what creates the callsite and the callsite is supposed to have the nameds set | ||
21:00
lizmat joined
21:07
brrt joined
timotimo | huh. so MASTNode *args is supposed to be some kind of array of MVMObjects | 21:12 | |
among them should be strings for the nameds, right? | |||
when i use ATPOS_S_C, the get_str gets passed a null pointer | 21:14 | ||
oh, huh. | |||
yeah, i confused flags and args | 21:15 | ||
there's one flag for two args | |||
21:19
woolfy joined
timotimo | i'm not doing very well right now >_< | 21:22 | |
jnthn | Well, it's probably a little tricky... | ||
timotimo | what i'm trying to do here should be simple, though :) | ||
jnthn | timotimo: Did you give up on the NQP regex opt thingy? | ||
timotimo | for now, yes | 21:23 | |
jnthn | OK, I may task steal that. | ||
timotimo | right now i want to teach the mast compiler to write out the nameds into the callsite | ||
thus, when looping through the flags, i num_nameds++ if i see MVM_CALLSITE_ARG_NAMED | _FLAT_NAMED | |||
then, i go from 0 to num_nameds and get args[(elems - num_nameds) + i * 2] | |||
i tried that with ATPOS_S_C, but that didn't work, neither did ATPOS_S | 21:24 | ||
and (MVMString *)ATPOS(...) didn't work either | |||
what i get from there is a p6opaque at least, so it *could* be a P6str | 21:25 | ||
21:26
brrt joined
timotimo | huh, the object i get there has no unbox_str_slot, though | 21:26 | |
say, could we perhaps put the name of the class we've created into the MVMP6opaqueREPRdata? | 21:27 | ||
jnthn | Oh | 21:29 | |
It's perhaps a MAST::SVal | |||
Which contains the string | |||
timotimo | ah, that would be helpful | ||
jnthn | For "what class is this" I think it's more "what type is this" and belongs on the STable. | 21:30 | |
timotimo | hm. yeah probably | ||
value = <error reading variable: Cannot access memory at address 0x3f> | 21:31 | ||
that doesn't seem right | |||
AFK for a bit | 21:33 | ||
oh btw | 21:52 | ||
i've looked and it *is* a p6opaque | |||
interesting. a few times the function i'm hacking in actually succeeds, even with nameds | 21:55 | ||
22:19
brrt left
timotimo | well, i'm certainly at a loss here. | 22:42 | |
ah, ok | 22:43 | ||
it's due to the fact that it's FLAT_NAMED | |||
or rather: the first one that is FLAT_NAMED breaks my code | |||
jnthn | ah | ||
timotimo | how are those treated? | ||
jnthn | A flat named is really a positional... | ||
That will be flattened. | |||
timotimo | so i shouldn't be looking at the object that's put there at all to extract a name from it | 22:44 | |
in what combinations do those flattened nameds appear? | |||
always in the last slot? always at most one? | |||
jnthn | Not sure off hand... | 22:47 | |
See QASTOperationsMAST.nqp | |||
timotimo | thanks | ||
jnthn | Near call and callmethod there's some code that sorts things out | ||
I know it makes sure nameds come later. | |||
timotimo | ah, so maybe all i'll need to do is just ignore the NAMED_FLAT flag | 22:48 | |
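Putting the pieces of this exchange together, the flag walk could look roughly like the sketch below. It assumes that a positional flag covers one arg slot, that a named flag covers two (name, then value), and that a flattening entry carries no name and occupies a single slot like a positional. The flag constants and `demo_collect_name_slots` are hypothetical, not the MAST compiler's real identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, for illustration only. */
enum { DEMO_ARG_NAMED = 1, DEMO_ARG_FLAT = 2 };

/* Walk the flags and record, for each real named argument, the index in the
 * args list where its name lives. Returns the number of names found. */
size_t demo_collect_name_slots(const unsigned char *flags, size_t num_flags,
                               size_t *name_slots, size_t max_names) {
    size_t num_nameds = 0;  /* must be initialised: C leaves locals indeterminate */
    size_t arg_index  = 0;  /* running position in the args list */
    assert(flags != NULL && name_slots != NULL);
    for (size_t i = 0; i < num_flags; i++) {
        int is_named = (flags[i] & DEMO_ARG_NAMED) && !(flags[i] & DEMO_ARG_FLAT);
        if (is_named) {
            assert(num_nameds < max_names);
            name_slots[num_nameds++] = arg_index;
            arg_index += 2;  /* name slot plus value slot */
        } else {
            arg_index += 1;  /* positional or flattened entry: one slot */
        }
    }
    return num_nameds;
}
```

Tracking a running `arg_index` avoids recomputing offsets from the total element count, and it keeps working if a flattened entry sits between two nameds.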
wow, that way i actually get one file to compile and the next error i'll have to hunt is All positional args must appear first | 22:50 | ||
jnthn | That's in bytecode.c iirc | 22:51 | |
timotimo | the verifyer, aye | ||
jnthn | Hmmm | 22:53 | |
grr | |||
timotimo | ah, that's probably because it's trying to read the callsites | 22:54 | |
and it's not reading the named args | |||
it just treats them as if they were the next callsite :) | |||
jnthn | Oh! | 22:55 | |
Meanwhile, I got a version of the regex thingy that calls back into the main optimizer | |||
timotimo | also: i said "the verifyer, aye" which was untrue | ||
i had one, too. it was just that it horribly broke compilation :) | |||
jnthn | Well, mine does pass the NQP tests and build Rakudo. | ||
It for some reason doesn't ever lower self. | 22:56 | ||
timotimo | well, that's much better than what i had already! | 22:57 | |
cool :) | |||
jnthn | ohh | 23:00 | |
tssk | |||
timotimo | ah dangit | 23:02 | |
now i'm trying to read in the stage0 and of course i can't | |||
i'll need a bytecode version increase
jnthn | Yes, you need to bump the bytecode version and maintain backcompat. | 23:03 | |
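A version-gated reader might be shaped like the sketch below. The record layout (a 16-bit flag count, one byte per flag, then one 32-bit string heap index per named flag in newer files) is invented for illustration and is not the format in docs/bytecode.markdown; `DEMO_VERSION_WITH_NAMES` and `demo_callsite_size` are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_VERSION_WITH_NAMES 2u  /* hypothetical first version storing names */
#define DEMO_FLAG_NAMED         1u  /* hypothetical "named argument" flag bit */

/* Compute how many bytes one callsite record occupies, so the reader can
 * step to the next record. Older files have no name entries, so they are
 * skipped only when the file's version says they exist.
 * Returns 0 if the buffer is too short. */
size_t demo_callsite_size(const uint8_t *buf, size_t len, uint32_t version) {
    assert(buf != NULL);
    if (len < 2)
        return 0;
    size_t num_flags = (size_t)buf[0] | ((size_t)buf[1] << 8);
    size_t size = 2 + num_flags;
    if (size > len)
        return 0;
    if (version >= DEMO_VERSION_WITH_NAMES) {
        size_t num_nameds = 0;
        for (size_t i = 0; i < num_flags; i++)
            if (buf[2 + i] & DEMO_FLAG_NAMED)
                num_nameds++;
        size += 4 * num_nameds;     /* one 32-bit string heap index per name */
        if (size > len)
            return 0;
    }
    return size;
}
```

Reading a new-format record with the old size is exactly the failure where the name entries get interpreted as the start of the next callsite; gating on the version keeps an existing stage0 loadable while new files carry the extra data.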
timotimo | yup | ||
very obvious in retrospect :) | |||
jnthn | .oO( A description of much of computer science... ) |
||
Figured out why it didn't lower self. | 23:04 | ||
However, the cursor match variable is gonna be a stiffer challenge. | |||
timotimo | mhm :| | ||
but it'll probably really be worth it | |||
jnthn | yeah, 'cus it's the only remaining lexical now. | 23:06 | |
In most rules | |||
23:08
FROGGS joined
jnthn is pondering an asserttype op for conveying stuff that should always be true. | 23:10 | ||
spesh can leave it in place (unless it can prove it's not needed), and then copy the type info. | 23:12 | ||
Essentially a way for code-gens and optimizers to convey to the VM, "I'm really really sure this is true even if you can't prove it; blow up if it ever ain't, and optimize assuming it is" | 23:13 | ||
timotimo | right | ||
jnthn | Immediate use case is that !cursor_start always returns something of the same type as self. | 23:14 | |
Granted some day inlining might get it. | |||
But !cursor_start is perhaps a little big to be an obvious inline. | 23:15 | ||
Well, patch passes spectest too | 23:17 | ||
timotimo | now i'm getting bogus data in my arg_name array :\ | ||
i'm setting it with get_heap_string, that seems correct, aye? | |||
(gdb) print frame->cur_args_callsite->arg_name[0] | 23:18 | ||
$4 = <error reading variable: Cannot access memory at address 0x34> | |||
just ... huh? | |||
i must be creating a callsite somewhere and forgetting to null that out or something | 23:20 | ||
jnthn | that looks...odd, yeah. | ||
0x34 is clearly not a pointer. | 23:21 | ||
I just pushed my NQP patches | |||
23:22
woolfy left
jnthn | No significant effect yet. | 23:23 | |
timotimo | is it aborting for some other reason? | 23:24 | |
jnthn | Well, it can't lower $¢ yet | 23:25 | |
But also it could benefit from the asserttype thing I mentioned | |||
timotimo | i'm segfaulting and i have no idea where this data could possibly come from | 23:27 | |
jnthn | Anyway, need sleep...still exhausted from last day's teaching/bad sleep... | ||
'night | |||
and good luck ;) | |||
timotimo | oh, huh | 23:28 | |
it's actually a straight-up null pointer | |||
good night and rest well! | |||
(gdb) print num_nameds | 23:31 | ||
$8 = 32783 | |||
yeah, that's not so probable | |||
ah, yeah, C doesn't null out variables on the stack for you | 23:32 | ||
what in the ... | 23:34 | ||
i'm getting a segfault in MVM_gc_worklist_add | 23:36 | ||
and, this happens: (gdb) print /r **(MVMCollectable **)(frame->cur_args_callsite->arg_name[0]) | |||
Cannot access memory at address 0x38000800000001 | |||
other places use MVM_ASSIGN_REF for the result of get_heap_string, but I can't do it here, because the root, in this case the callsite, isn't an MVMCollectable | 23:42 | ||
does everything that has MVMCollectables inside them have to be an MVMCollectable? | |||
i guess i'll go to sleep now | 23:55 |