|
00:33
eternaleye joined
|
|||
| TimToady | I'd think hash functions would be a good use for polymorphism based on the storage type | 00:47 | |
| if you're using conditionals for that, it's kinda smelly | 00:48 | ||
| but maybe the rules are different for in-lined stuff... | 00:49 | ||
|
00:50
jnap joined
|
|||
| TimToady | (just spouting off, haven't actually looked at what you're doing :) | 00:50 | |
|
01:30
lizmat_ joined
04:45
lizmat joined
05:50
woolfy joined
06:07
woolfy joined,
woolfy left,
colomon joined
06:58
zakharyas joined
|
|||
| timotimo | it's kind of hard to do since the uthash is implemented with c preprocessor macros, so if the hash function to implement is conditional upon the type, some kind of conditional needs to remain in the code | 07:07 | |
| we could of course put a function pointer into the struct that you have to put into the hashable entries anyway, but that is quite a lot of overhead - or at least it seems to me. | 07:08 | ||
|
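The function-pointer idea timotimo mentions can be sketched roughly as follows. This is a minimal illustration, not uthash or MoarVM code: the `entry` struct, `hash_fn`, and the two hashers are hypothetical names, and FNV-1a is swapped in as a stand-in hash. Each entry pays for one extra pointer, which is the overhead being weighed in the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Signature shared by all per-storage hash functions (hypothetical). */
typedef uint32_t (*hash_fn)(const void *data, size_t len);

/* A hashable entry that carries its own hash function pointer. */
typedef struct {
    const void *data;  /* string body, 8-bit or 32-bit storage */
    size_t      len;   /* length in codepoints */
    hash_fn     hash;  /* picked once, when the entry is created */
} entry;

/* Mix one codepoint into an FNV-1a state as four bytes, low byte first. */
static uint32_t mix_codepoint(uint32_t h, uint32_t cp) {
    for (int i = 0; i < 4; i++) {
        h ^= (cp >> (8 * i)) & 0xFFu;
        h *= 16777619u;
    }
    return h;
}

/* Hash a string stored as one byte per codepoint. */
static uint32_t hash_8bit(const void *data, size_t len) {
    const uint8_t *s = data;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++)
        h = mix_codepoint(h, s[i]);
    return h;
}

/* Hash a string stored as four bytes per codepoint. */
static uint32_t hash_32bit(const void *data, size_t len) {
    const uint32_t *s = data;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++)
        h = mix_codepoint(h, s[i]);
    return h;
}

/* Dispatch through the pointer stored in the entry; no conditional needed. */
static uint32_t entry_hash(const entry *e) {
    assert(e != NULL && e->hash != NULL);
    return e->hash(e->data, e->len);
}
```

Because both hashers feed codepoints through the same mixing step, equal strings hash equally whichever storage width they use.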
07:10
brrt joined
07:11
lizmat joined
|
|||
| brrt | \o #moarvm | 07:20 | |
| jnthn | o/ brrt | 07:21 | |
| nwc10 | \o/ | ||
| timotimo | o | 07:22 | |
| moritz | \|o|/ | ||
| brrt | wow, such happiness :-) | ||
| brrt reading backlog | 07:23 | ||
| jnthn | so joy | 07:24 | |
| brrt | i'm at a loss what the goal of timotimo's last commit is tbh | 07:27 | |
| timotimo | the one from uthash_padding? | 07:28 | |
| brrt | yes | ||
| doesn't seem to be used anyway? | |||
| anywhere | |||
| timotimo | not yet | ||
| we are currently forced to turn all our strings into the 32bit representation so that two equal strings will hash to the same value | |||
| otherwise we wouldn't be able to use hashes any more unless we force every string that can be in 8 bits to always be in 8 bits | 07:29 | ||
| brrt | oh…. | ||
| timotimo | that's also why you see the MVM_string_flatten call everywhere | ||
| that gets rid of ropes and forces the 32bit thing | |||
| brrt | hmmm | 07:30 | |
| jnthn | also 'cus the ropes code is...uh...ropey | ||
| brrt | but then you have 32 bit strings … everywhere | ||
| jnthn | brrt: Right | 07:31 | |
| timotimo | that's true | ||
| uthash doesn't make it very easy to swap out the hash function for different things | |||
| jnthn | brrt: Which in terms of "get the right answers up to codepoint level" and "constant time indexing" is a fine choice. | ||
| brrt | i guess thats somewhere between acceptable and annoying | ||
| jnthn | Just not for efficiency. But optimization always comes after working. :) | ||
| brrt | true enough | 07:32 | |
| timotimo | one way to "defeat" this problem is to write a "key extractor" function that turns the string into the 32bit representation for hashing, but that's an immense amount of overhead | ||
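A rough sketch of the "key extractor" approach, to show where the overhead comes from. The names `widen_to_32` and `hash_bytes` are hypothetical, and FNV-1a stands in for whichever hash function is actually configured; the point is only that every lookup of an 8-bit string costs an allocation and a copy before any hashing happens.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Widen an 8-bit string body to 32-bit codepoints.
 * The caller frees the result; returns NULL if allocation fails. */
static uint32_t *widen_to_32(const uint8_t *src, size_t len) {
    assert(src != NULL || len == 0);
    uint32_t *dst = malloc((len ? len : 1) * sizeof *dst);
    if (dst == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i];
    return dst;
}

/* FNV-1a over the raw bytes of a key, the way a byte-oriented
 * hash table would see a 32-bit string body. */
static uint32_t hash_bytes(const void *key, size_t nbytes) {
    const uint8_t *p = key;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < nbytes; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}
```

After widening, an 8-bit string and a natively 32-bit string with the same codepoints present identical bytes to the hash function, so they land in the same bucket.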
| jnthn | uthash.h is like, epic macros | ||
| timotimo | maybe i can factor out the "calculate the hash bucket" part of all the hash functions and allow the user to calculate the hash bucket directly | ||
| jnthn | yeah | ||
| timotimo | that would make it much less hacky i think | ||
| jnthn | ok, I should teach stuff | ||
| timotimo | except then you can do even more terrible things :) | 07:33 | |
| jnthn | bbiab | ||
| brrt | what today? | ||
| you want the bucket rather than the hash to be computed? | |||
| jnthn | brrt: MVC web dev stuff | ||
| Nothing too thrilling ;) | |||
| brrt | well, stay strong | ||
| :-p | |||
| jnthn | yeah, it's going fine :) | 08:39 | |
| Well, only web-y course I gotta do this month | |||
| timotimo | jnthn: putting nameds into callsites ... won't that make our callsite interning strategies much less successful? | 09:03 | |
| jnthn | timotimo: ONLY IF WE DON'T UPDAET THE INTERN CODE | ||
| uh, ooops] | |||
| timotimo | was scared there for a little moment | 09:04 | |
| the intern code can still only intern callsites if they have the same exact set of nameds, no? | |||
| jnthn | hit caps lock instead of tab, then didn't look at what I'd typed :) | ||
| not only exact same name, but also exact same order | 09:05 | ||
| timotimo | right | ||
| jnthn | That's what allows us to do the names => positions opt | ||
| timotimo | mhm | ||
| but at least the callsite doesn't contain a reference to what is being called, right? | |||
| (except a little cache for invocants) | 09:06 | ||
| jnthn | Right. :) | ||
| timotimo | what file do i look at to find the callsite writing code? | ||
| jnthn | src/mast/compiler.c | ||
| I'd start by updating docs/bytecode.markdown or so | |||
| Just to get the format clear. | |||
| timotimo | oh, get_callsite_id also writes the callsite to the bytecode file | 09:07 | |
| brrt | that would be awesome jnthn :-) | ||
| timotimo | that explains why i overlooked it | ||
| what exactly? | |||
| brrt | updating the docs ;_) | ||
| timotimo | oh, now i get it! | ||
| brrt | ;-) | ||
| jnthn | bytecode.markdown is actually quite up to date, afaik. | ||
| timotimo | we're actually turning nameds into something that looks exactly like a positional (because really it is) | 09:08 | |
| jnthn | brrt: Do you want a spesh doc? | ||
| brrt: If so, what parts do you most want a document on? | |||
| Same question to timotimo I guess :) | |||
| I'm happy to write something up, but it's good to know what the goal is :) | |||
| (Which should be "make things clear that the code doesn't make clear".) | 09:09 | ||
| brrt is thinking about how i'd attack it | |||
| timotimo | an index to the string heap is a 16 bit integer, isn't it? | 09:11 | |
| jnthn | 32 now | 09:17 | |
| dalek | arVM/named_to_positional: aa55bbc | (Timo Paulssen)++ | docs/bytecode.markdown: fix highlighting in vim ~_~ |
||
| arVM/named_to_positional: 738a4cd | (Timo Paulssen)++ | docs/bytecode.markdown: initial spec attempt for callsites storing named arg names. |
|||
| timotimo | OK | ||
| jnthn | after doing r-j where JVM has 16-bit ones I...learned ;) | ||
| dalek | arVM/named_to_positional: 5eedf49 | (Timo Paulssen)++ | docs/bytecode.markdown: index to string heap is 32bit big. |
09:18 | |
| timotimo | should i store the names so that they line up with argument numbers or should i "compact" them? | ||
| (in the actual in-memory callsite, not the bytecode format) | 09:19 | ||
| i.e. will the MVMString **named_names; begin with one NULL per positional? | |||
| jnthn | Hmmm | ||
| timotimo | could potentially set the whole thing to NULL and not allocate at all if there's only positionals as a slight optimization | 09:20 | |
| jnthn | well, yeah, you only need it if there are names, for sure | ||
| I think it'll be a question of looking at what args.c needs to be fast/easily done, tbh. | |||
| timotimo | oh, actually, how about this crazy idea: | 09:21 | |
| ... yeah, actually a crazy idea, not a good one. | |||
| jnthn | Data structures need designing around use cases :) | ||
| timotimo | forget about it :) | ||
| jnthn | I *think* that args.c may be efficiently implementable with them compacted. | 09:22 | |
| timotimo | i also just realized that it'd always be a number of NULLs, then a number of names | ||
| rather than any kind of mixture | |||
| so just knowing the index of the first named will be sufficient to calculate every named index | 09:23 | ||
| jnthn | and you do know that 'cus we cache num_pos | ||
| timotimo | sounds great tehn | ||
| then | |||
| jnthn | We don't build a hash table for looking things up 'cus there's not enough names to be worth it almost all the time | 09:24 | |
| brrt | can we do symbolic lookups? | 09:25 | |
| timotimo | huh? where would that hash table live/what would it be used by? | 09:26 | |
| jnthn | well, potentially you could have a name => index hash on a callsite | 09:27 | |
| but it's not gonna be worth it | |||
| lunch time; bbiab | |||
| timotimo | ah, aye | ||
| a linear scan would probably always be better | |||
| except if you have something like "Hash.new(:foo<bar>, [ and 1000 others ])" | |||
| that could potentially wreak some havoc | 09:28 | ||
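The compacted layout plus linear scan could look something like the sketch below. The `callsite` struct here is a simplified stand-in, not MoarVM's `MVMCallsite`: names are plain C strings rather than `MVMString`, and it assumes one argument slot per named value once names live in the callsite.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical callsite: names are stored compacted,
 * with no leading NULL entries for the positionals. */
typedef struct {
    size_t       num_pos;    /* cached count of positional args */
    size_t       num_named;  /* number of entries in names */
    const char **names;      /* NULL when there are only positionals */
} callsite;

/* Linear scan for a named argument. Returns the argument index
 * (num_pos + position in names), or -1 if the name is absent. */
static long find_named(const callsite *cs, const char *name) {
    assert(cs != NULL && name != NULL);
    for (size_t i = 0; i < cs->num_named; i++)
        if (strcmp(cs->names[i], name) == 0)
            return (long)(cs->num_pos + i);
    return -1;
}
```

Since all positionals come first, knowing `num_pos` is enough to turn a position in `names` into an argument index, and a positional-only callsite never allocates the array at all.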
| brrt | you know what might be worth it? | 09:29 | |
| timotimo | do tell :) | ||
| brrt | sorting the symbols by (symbol pointer value or something like that) | ||
| so that you could - if callsite name lists get very large - always resort to binary search | 09:30 | ||
| timotimo | you'll have to give me a bit more context | ||
| symbol means name of named parameter here? | |||
| brrt | yes | ||
| with the added notion that the named parameter should / could be 'normalized' - i.e. all instances refer to the same in-memory-object, so that comparison is pointer-comparison | 09:31 | ||
| (that is to say, equality is pointer-equality, not what i said) | |||
| binary search is really cheap and memory-efficient | 09:32 | ||
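brrt's suggestion might be sketched like this, assuming names are interned so that equal names share one address. All function names are hypothetical. Pointers are compared via `uintptr_t` because relational comparison of pointers to unrelated objects is not defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Order two interned-name pointers by their address value. */
static int cmp_ptr(const void *a, const void *b) {
    uintptr_t x = (uintptr_t)*(const char *const *)a;
    uintptr_t y = (uintptr_t)*(const char *const *)b;
    return (x > y) - (x < y);
}

/* Sort a callsite's name list by address, once, at creation time. */
static void sort_names(const char **names, size_t n) {
    qsort(names, n, sizeof *names, cmp_ptr);
}

/* Binary search for an interned name. Returns its position in the
 * sorted list, or -1 if the callsite does not pass that name. */
static long find_name(const char *const *names, size_t n,
                      const char *interned) {
    assert(names != NULL || n == 0);
    const char *const *hit =
        bsearch(&interned, names, n, sizeof *names, cmp_ptr);
    return hit ? (long)(hit - names) : -1;
}
```

Two caveats apply to this sketch: sorting reorders the names, so the argument values would have to be permuted the same way, and the ordering only stays valid as long as the interned strings never move in memory.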
| timotimo | i'm not entirely sure how exactly that plays together with all other pieces of the puzzle | ||
| brrt | neither do i | 09:33 | |
| timotimo has much code to read | 09:34 | ||
| brrt | :-) | ||
| timotimo | hey, if you want to do something string-related that'll probably pay off in memory usage: | ||
| there's strings that show up incredibly often, over and over again | 09:35 | ||
| i've tried to add a string interning step to the minor collection of the nursery before, that didn't work out well | |||
| brrt | such as? | ||
| hmmm | |||
| timotimo | "dotty" | ||
| (this is just from settings compilation, though) | |||
| brrt | as part of collection that wouldn't be ideal i imagine | 09:36 | |
| you'd ideally want the compiler / mast to fix that | |||
| timotimo | not possible | ||
| "foobar".substr(1, 2) ← how to? :) | |||
| brrt | … possible for spesh? | ||
| timotimo | the compiler/mast already has a string heap | ||
| gist.github.com/timo/e1af6d5c10a4e34d6cb0 ← check it. | 09:37 | ||
| that is a random sub-sample of the gen2 | |||
| jnthn | I think we need to keep GC simple, fwiw. | 10:28 | |
| Otherwise we make pause times worse. | |||
|
10:45
lizmat joined
10:50
lizmat_ joined
11:11
brrt joined
|
|||
| brrt is checking | 11:13 | ||
| wow | |||
| these are all different strings? | |||
| tadzik | so "OPER" is there 876 times? | 11:14 | |
| brrt | as in, different sections of memory | 11:22 | |
| tadzik | that's my understanding | ||
| brrt | wow | ||
| (again) | |||
| ok, and name-lookups are string comparisons? | 11:25 | ||
|
12:22
lizmat joined
|
|||
| jnthn | I suspect the OPER thing gets fixed with the arnsholt++ work on O | 12:38 | |
| mmmm...cheesecake | |||
|
13:52
vendethiel joined
14:04
jnap joined
14:31
dalek joined,
btyler joined
14:36
jnap1 joined
14:38
synopsebot joined
14:40
masak joined
|
|||
| timotimo | and the lowered_param_N thing i've fixed manually by stashing these strings in the Actions (i think) where they were previously generated with string concat | 15:12 | |
| jnthn: we don't inline method calls yet; is that something the specializer will do at some point? or will we leave that for the jit? | 15:30 | ||
| hmm. maybe the callsite used in a specialized piece of bytecode could have extra information attached to it that the specializer could then use to improve calls that come from the specialized bytecode | 15:32 | ||
| oh, i think i understand why it's hard; in many cases we just put a slot for a little cache into the specialized bytecode to be filled later | 15:36 | |
| so at specialize time we may not even know a single likely candidate | |||
|
15:41
lizmat joined
16:01
cognominal joined
|
|||
| jnthn | timotimo: Well, the specializer and the JIT are very related | 16:54 | |
| Such that if the specializer learns to inline then that's done the work for the JIT to also, I expect | 16:55 | ||
| Or most of the work at least | |||
| The way we de-opt in the two cases may be different. | |||
| timotimo | mhh, okay | 16:56 | |
| jnthn | On "don't know what's likely", one of the later things we'll do is 2-stage spesh | 16:57 | |
| timotimo | jnthn: how do you recommend i tackle the kind of daunting task to change argument handling from "two arguments in order" to "names in the callsite"? | 16:58 | |
| jnthn | The first will introduce various "recording" instructions; if the spesh remains hot enough then we'll use them to emit an even specialer version with guard clauses. | ||
| timotimo | ah, these recording instructions will be using the spesh slot mechanism? | 16:59 | |
| jnthn | timotimo: Well, my plan was to get the bytecode writer to emit the right thing first, and then update the bytecode reader to read them in. | ||
| yeah, will use spesh slots for it. | |||
| timotimo | i am not quite convinced that this will work fine without a new stage0 | 17:00 | |
| jnthn | And then rebootstrap so we always have them. | ||
| And then switch args.c over | |||
| timotimo | oh, that makes sense | ||
| jnthn | oh, it'll want a new stage0 | ||
| Maybe twice over. | |||
| timotimo | even if i have to make two, should i commit both to the repository? | 17:01 | |
| jnthn | Anyway, my idea was to ween us off needing to put the names in. | ||
| Sure | |||
| You'll be doing it in a branch | |||
| timotimo | i already am :) | ||
| jnthn | They should be the only NQP changes needed for this. | ||
| So then we just merge --squash the branch. | 17:02 | ||
| timotimo | ah, fair enough | ||
| what does "ween us off needing to put the names in" mean? o_O | |||
| jnthn | Hack until deleting the argconst_s instructions for names doesn't break anything:) | 17:03 | |
| Can leave the holes | |||
| And deal with them once everything works. | 17:04 | ||
| timotimo | so 1) write the names into the callsites and serialize that, 2) build a new stage0 with the names in the callsites, 3) remove the argconst_s bytecodes that put the names in the even slots | 17:05 | |
| and then build an even newer stage0 that doesn't have the argconst_s bytecodes at all any more | |||
| jnthn | uh, 1 is really "and deserialize too" | 17:07 | |
| timotimo | ah, yeah | ||
| jnthn | And 3 is a lot of work to mkake it possible :) | ||
| *make | |||
| timotimo | i should have put a "just" in there for good measure :D | ||
| run-time named arguments and |%foo are handled by creating a new callsite on the fly, right? | 17:10 | ||
| jnthn | yeaah | ||
| We ignore any flattening callsites in spesh | |||
| timotimo | that'll want fixing too, somewhere between 1 and 3 | ||
| jnthn | And will do for the future | ||
| timotimo | oh, hold on; i thought i was supposed to change named argument passing for everything ever | ||
| jnthn | you still need to update it :) | 17:11 | |
| Just saying that spesh isn't going to try and deal with | | |||
| Not for the time being anyway. | |||
| timotimo | update what now? | 17:12 | |
| sorry, i seem to be having a brainfart or something | |||
| .o( brainfort, much cooler than a blanketfort ) | |||
| spesh isn't, but the regular code is | |||
| jnthn | update the flattener | 17:14 | |
| timotimo | yea. i was going to do that | ||
| jnthn | Note you'll need to do GC updates also. | ||
| timotimo | oh? how so? | ||
| jnthn | 'cus callsites now point to MVMString which is collectable | 17:15 | |
| timotimo | ah, that makes sense | 17:16 | |
|
17:44
lizmat joined
17:45
lizmat joined
17:55
benabik joined
18:47
woolfy joined
20:07
zakharyas joined
20:18
brrt joined
|
|||
| timotimo | the interning mechanism doesn't count nameds? | 20:29 | |
| the arity it considers seems to be only the number of positionals | |||
| jnthn | we don't intern named things, right. | 20:30 | |
| timotimo | oh, i see that now | 20:31 | |
| arg_count != num_pos | |||
| that line, i overlooked | |||
| so i won't need to teach the interner about nameds yet | |||
| jnthn | no, that can come later | 20:49 | |
| timotimo | is the "names_used" mechanism still needed the way it is right now after the refactor? | 20:50 | |
| brrt | (reading backlog) yes, its deopt that will differ rather substantially | 20:54 | |
| jnthn | timotimo: Well, we need it to behave the same. Doesn't have to work exactly the same. | 20:55 | |
| timotimo | mhm | ||
| args_proc_init would introspect the args MVMRegister assuming it's an array-like and get the strings from the args list, yes? | 20:56 | ||
| or am i looking at the wrong thing? | |||
| jnthn | sounds like | 20:58 | |
| I mean, it doesn't introspect args today | |||
| timotimo | i feel kinda dumb. maybe today isn't a good day | ||
| jnthn | It knows what's there from arg_flags | ||
| timotimo | i thought that thing is what creates the callsite and the callsite is supposed to have the nameds set | ||
|
21:00
lizmat joined
21:07
brrt joined
|
|||
| timotimo | huh. so MASTNode *args is supposed to be some kind of array of MVMObjects | 21:12 | |
| among them should be strings for the nameds, right? | |||
| when i use ATPOS_S_C, the get_str gets passed a null pointer | 21:14 | ||
| oh, huh. | |||
| yeah, i confused flags and args | 21:15 | ||
| there's one flag for two args | |||
|
21:19
woolfy joined
|
|||
| timotimo | i'm not doing very well right now >_< | 21:22 | |
| jnthn | Well, it's probably a little tricky... | ||
| timotimo | what i'm trying to do here should be simple, though :) | ||
| jnthn | timotimo: Did you give up on the NQP regex opt thingy? | ||
| timotimo | for now, yes | 21:23 | |
| jnthn | OK, I may task steal that. | ||
| timotimo | right now i want to teach the mast compiler to write out the nameds into the callsite | ||
| thus, when looping through the flags, i num_nameds++ if i see MVM_CALLSITE_ARG_NAMED | _FLAT_NAMED | |||
| then, i go from 0 to num_nameds and get args[(elems - num_nameds) + i * 2] | |||
| i tried that with ATPOS_S_C, but that didn't work, neither did ATPOS_S | 21:24 | ||
| and (MVMString *)ATPOS(...) didn't work either | |||
| what i get from there is a p6opaque at least, so it *could* be a P6str | 21:25 | ||
|
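The indexing scheme timotimo describes can be written out as a small sketch. `ARG_NAMED` is a hypothetical flag bit standing in for `MVM_CALLSITE_ARG_NAMED`, and the code assumes the layout discussed in the log: one flag per argument, positionals first, and each named argument occupying a (name, value) pair in the args list. Flattened nameds are left out of the count here, on the assumption that they carry no name slot of their own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit; a stand-in for MVM_CALLSITE_ARG_NAMED. */
#define ARG_NAMED 0x20u

/* Count the flags that mark a named argument. */
static size_t count_named(const uint8_t *flags, size_t elems) {
    assert(flags != NULL || elems == 0);
    size_t num_nameds = 0;  /* locals are not zeroed automatically in C */
    for (size_t i = 0; i < elems; i++)
        if (flags[i] & ARG_NAMED)
            num_nameds++;
    return num_nameds;
}

/* Index in the args list of the name of the i-th named argument.
 * elems is the number of flags, so elems - num_nameds positionals
 * come first, followed by (name, value) pairs. */
static size_t name_slot(size_t elems, size_t num_nameds, size_t i) {
    assert(num_nameds <= elems && i < num_nameds);
    return (elems - num_nameds) + i * 2;
}
```

With two positionals and two nameds there are four flags but six args, and the names sit at indexes 2 and 4.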
21:26
brrt joined
|
|||
| timotimo | huh, the object i get there has no unbox_str_slot, though | 21:26 | |
| say, could we perhaps put the name of the class we've created into the MVMP6opaqueREPRdata? | 21:27 | ||
| jnthn | Oh | 21:29 | |
| It's perhaps a MAST::SVal | |||
| Which contains the string | |||
| timotimo | ah, that would be helpful | ||
| jnthn | For "what class is this" I think it's more "what type is this" and belongs on the STable. | 21:30 | |
| timotimo | hm. yeah probably | ||
| value = <error reading variable: Cannot access memory at address 0x3f> | 21:31 | ||
| that doesn't seem right | |||
| AFK for a bit | 21:33 | ||
| oh btw | 21:52 | ||
| i've looked and it *is* a p6opaque | |||
| interesting. a few times the function i'm hacking in actually succeeds, even with nameds | 21:55 | ||
|
22:19
brrt left
|
|||
| timotimo | well, i'm certainly at a loss here. | 22:42 | |
| ah, ok | 22:43 | ||
| it's due to the fact that it's FLAT_NAMED | |||
| or rather: the first one that is FLAT_NAMED breaks my code | |||
| jnthn | ah | ||
| timotimo | how are those treated? | ||
| jnthn | A flat named is really a positional... | ||
| That will be flattened. | |||
| timotimo | so i shouldn't be looking at the object that's put there at all to extract a name from it | 22:44 | |
| in what combinations do those flattened nameds appear? | |||
| always in the last slot? always at most one? | |||
| jnthn | Not sure off hand... | 22:47 | |
| See QASTOperationsMAST.nqp | |||
| timotimo | thanks | ||
| jnthn | Near call and callmethod there's some code that sorts things out | ||
| I know it makes sure nameds come later. | |||
| timotimo | ah, so maybe all i'll need to do is just ignore the NAMED_FLAT flag | 22:48 | |
| wow, that way i actually get one file to compile and the next error i'll have to hunt is All positional args must appear first | 22:50 | ||
| jnthn | That's in bytecode.c iirc | 22:51 | |
| timotimo | the verifyer, aye | ||
| jnthn | Hmmm | 22:53 | |
| grr | |||
| timotimo | ah, that's probably because it's trying to read the callsites | 22:54 | |
| and it's not reading the named args | |||
| it just treats them as if they were the next callsite :) | |||
| jnthn | Oh! | 22:55 | |
| Meanwhile, I got a version of the regex thingy that calls back into the main optimizer | |||
| timotimo | also: i said "the verifyer, aye" which was untrue | ||
| i had one, too. it was just that it horribly broke compilation :) | |||
| jnthn | Well, mine does pass the NQP tests and build Rakudo. | ||
| It for some reason doesn't ever lower self. | 22:56 | ||
| timotimo | well, that's much better than what i had already! | 22:57 | |
| cool :) | |||
| jnthn | ohh | 23:00 | |
| tssk | |||
| timotimo | ah dangit | 23:02 | |
| now i'm trying to read in the stage0 and of course i can't | |||
| i'll need a bytecode version increase | |||
| jnthn | Yes, you need to bump the bytecode version and maintain backcompat. | 23:03 | |
| timotimo | yup | ||
| very obvious in retrospect :) | |||
| jnthn | .oO( A description of much of computer science... ) |
||
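A version-gated reader is one way to keep old stage0 files loadable. The sketch below only computes how many bytes one callsite record occupies. Everything about the layout is an assumption made for illustration (a 16-bit little-endian flag count, one flag byte per argument, padding to an even size, then a 32-bit string heap index per named flag in newer files), and `NAMES_IN_CALLSITE_VERSION` is a made-up version number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARG_NAMED                 0x20u  /* hypothetical flag bit */
#define NAMES_IN_CALLSITE_VERSION 3u     /* hypothetical version number */

/* Size in bytes of one serialized callsite under the assumed layout. */
static size_t callsite_size(const uint8_t *buf, uint32_t version) {
    assert(buf != NULL);
    size_t elems  = (size_t)buf[0] | ((size_t)buf[1] << 8);
    size_t size   = 2 + elems;  /* count field plus flag bytes */
    size_t nameds = 0;
    for (size_t i = 0; i < elems; i++)
        if (buf[2 + i] & ARG_NAMED)
            nameds++;
    size += size & 1;           /* pad flags to an even boundary */
    if (version >= NAMES_IN_CALLSITE_VERSION)
        size += 4 * nameds;     /* one string heap index per named */
    return size;
}
```

A reader that ignores the version and always uses the old size would start parsing the next callsite in the middle of the name indexes, which matches the symptom described earlier in the log.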
| Figured out why it didn't lower self. | 23:04 | ||
| However, the cursor match variable is gonna be a stiffer challenge. | |||
| timotimo | mhm :| | ||
| but it'll probably really be worth it | |||
| jnthn | yeah, 'cus it's the only remaining lexical now. | 23:06 | |
| In most rules | |||
|
23:08
FROGGS joined
|
|||
| jnthn is pondering an asserttype op for conveying stuff that should always be true. | 23:10 | ||
| spesh can leave it in place (unless it can prove it's not needed), and then copy the type info. | 23:12 | ||
| Essentially a way for code-gens and optimizers to convey to the VM, "I'm really really sure this is true even if you can't prove it; blow up if it ever ain't, and optimize assuming it is" | 23:13 | ||
| timotimo | right | ||
| jnthn | Immediate use case is that !cursor_start always returns something of the same type as self. | 23:14 | |
| Granted some day inlining might get it. | |||
| But !cursor_start is perhaps a little big to be an obvious inline. | 23:15 | ||
| Well, patch passes spectest too | 23:17 | ||
| timotimo | now i'm getting bogus data in my arg_name array :\ | ||
| i'm setting it with get_heap_string, that seems correct, aye? | |||
| (gdb) print frame->cur_args_callsite->arg_name[0] | 23:18 | ||
| $4 = <error reading variable: Cannot access memory at address 0x34> | |||
| just ... huh? | |||
| i must be creating a callsite somewhere and forgetting to null that out or something | 23:20 | ||
| jnthn | that looks...odd, yeah. | ||
| 0x34 is clearly not a pointer. | 23:21 | ||
| I just pushed my NQP patches | |||
|
23:22
woolfy left
|
|||
| jnthn | No significant effect yet. | 23:23 | |
| timotimo | is it aborting for some other reason? | 23:24 | |
| jnthn | Well, it can't lower $¢ yet | 23:25 | |
| But also it could benefit from the asserttype thing I mentioned | |||
| timotimo | i'm segfaulting and i have no idea where this data could possibly come from | 23:27 | |
| jnthn | Anyway, need sleep...still exhausted from last day's teaching/bad sleep... | ||
| 'night | |||
| and good luck ;) | |||
| timotimo | oh, huh | 23:28 | |
| it's actually a straight-up null pointer | |||
| good night and rest well! | |||
| (gdb) print num_nameds | 23:31 | ||
| $8 = 32783 | |||
| yeah, that's not so probable | |||
| ah, yeah, C doesn't null out variables on the stack for you | 23:32 | ||
| what in the ... | 23:34 | ||
| i'm getting a segfault in MVM_gc_worklist_add | 23:36 | ||
| and, this happens: (gdb) print /r **(MVMCollectable **)(frame->cur_args_callsite->arg_name[0]) | |||
| Cannot access memory at address 0x38000800000001 | |||
| other places use MVM_ASSIGN_REF for the result of get_heap_string, but I can't do it here, because the root, in this case the callsite, isn't an MVMCollectable | 23:42 | ||
| does everything that has MVMCollectables inside them have to be an MVMCollectable? | |||
| i guess i'll go to sleep now | 23:55 | ||