00:27 vendethiel joined
dalek arVM: ee2e258 | timotimo++ | src/6model/reprs/MVMMultiCache.c:
MVMMultiCache gets an unmanaged_size
02:02
arVM: e2b01a9 | timotimo++ | src/6model/reprs/MVMStaticFrame.c:
the first ~half of MVMStaticFrame's unmanaged_size
arVM: 6b90ccb | timotimo++ | src/6model/reprs/MVMCompUnit.c:
measure unmanaged size for MVMCompUnit
arVM: 9136a3e | timotimo++ | src/6model/reprs/SCRef.c:
measure SCRef's unmanaged size (potentially incomplete)
timotimo still missing: MVMHash and MultiDimArray
and there's a few XXX's here and there
but at least it's only ever underestimating, and it definitely gets us closer to the truth 02:05
oh, whoops, what a silly mistake i made in there 02:06
dalek arVM: ad01448 | timotimo++ | src/6model/reprs/MVMCompUnit.c:
fix drastic copy-pasto in MVMCompUnit.c
02:11 colomon joined
dalek arVM: 9872121 | timotimo++ | src/6model/reprs/MVMCompUnit.c:
some of these callsites are null, so skip those.
02:15
timotimo "considering the snapshot ..." %) 02:21
i wsan't quick enough to close the window when it was taking the snapshots in the terminal
5941561 142M -rw-r--r--. 1 timo timo 142M Mar 24 03:12 heap-snapshot-1458785563.50406 02:22
Total heap size: 4,179,359,695,703,278,305 bytes 02:26
m: say "4,179,359,695,703,278,305".split(",").join("").Int / 1024
camelia rakudo-moar 6732d2: OUTPUTĀ«4081405952835232.719727ā¤Ā»
timotimo m: say "4,179,359,695,703,278,305".split(",").join("").Int / (1024 * 1024)
camelia rakudo-moar 6732d2: OUTPUTĀ«3985748000815.65695286ā¤Ā»
timotimo ... ?!?
SCRef 4,179,359,695,653,307,760 bytes 02:27
that's probably where that error comes from %)
m: say "4,179,359,695,703,278,305".split(",").join("").Int / (1024 * 1024 * 1024) 02:29
camelia rakudo-moar 6732d2: OUTPUTĀ«3892332032.04653999303ā¤Ā»
timotimo m: say "4,179,359,695,703,278,305".split(",").join("").Int / (1024 * 1024 * 1024 * 1024)
camelia rakudo-moar 6732d2: OUTPUTĀ«3801105.50004544921194ā¤Ā»
timotimo so, like, a couple thousand terabytes? 02:30
m: say "4,179,359,695,703,278,305".split(",").join("").Int / (1024 * 1024 * 1024 * 1024 * 1024)
camelia rakudo-moar 6732d2: OUTPUTĀ«3712.01708988813399603ā¤Ā»
timotimo is that 3 exabytes?
geekosaur sounds more like total allocations, not active heap 02:33
02:33 synopsebot6 joined
timotimo github.com/MoarVM/MoarVM/commit/91...f504f879da - i wonder how i overestimate the size so drastically with this simple code? 02:34
SCRef size: 1 objects, 1 stables, size is -1358922752 02:38
... errr .... ?!?!
am i casting the object wrongly?
that still doesn't make any sense 02:39
oops, wrong number of arguments to fprint :)
02:47 ilbot3 joined
timotimo how does this go wrong? :o 02:49
somehow 144115737831669776 and 144116837343297552 seem to have ended up in the dump 03:01
those values actually are in the data 03:07
diakopter interesting 03:11
timotimo i don't quite get it
SCRef size: 1 objects, 1 stables, size is 16
diakopter 4,179,359,695,653,307,760 bytes 4,179,359,695,653,307,760 bytes
timotimo this object gave a humongous result: 144115737831669776; SCRef
diakopter that's a lot of bytes 03:12
timotimo that's inside the "unmanaged_size" function and then right outside it
so on the way out it gets misinterpreted
diakopter well there must be some uninitialized memory
03:12 mojca joined
diakopter so one of the unmanaged_size routines is returning garbage sometimes 03:13
timotimo let me show you the code i'm looking at
github.com/MoarVM/MoarVM/blob/mast...hot.c#L304
here i dump the REPR of the thing that we call unmanaged_size on, and the result if it's above a certain threshold 03:14
github.com/MoarVM/MoarVM/blob/mast...Ref.c#L137 - here i fprintf the size before it gets returned 03:15
oh, well, i print it with %d 03:17
that's not the right one to use here
oh 03:18
PRId64 is actually for signed int, it seems like?
diakopter (don't know) 03:19
timotimo and PRIu64 seems to be for unsigned
which is what we have here
diakopter well in SCRef's initialize it needs to zero out the num_objects and num_stables 03:21
timotimo no, those values are fine 03:22
something's still off 03:23
anyway, PRIu64 is for unsigned values. we have those here, so putting that into the format strings is more correct. 03:24
SCRef size: 1 objects, 1 stables, size is 144115737831669776
this object gave a humongous result: 144115737831669776; SCRef
this is with the right format specifier, and it reads the exact same values that get multiplied with the sizeof those pointers 03:25
timotimo needs to print those numbers with PRIu64, too 03:26
OK, you were right, those numbers are actually bogus
even though they looked harmless (but only because i printed them with %d)
so ... good catch! :)
i'm a bit surprised we don't allocate the SCRef object in a zeroed-out area 03:27
i put = 0 into the initialize function, but i still get those "interesting" results 03:28
heap me diakopter senpai; what's going on here? :\ 03:29
geekosaur "a heap of trouble" 03:30
timotimo we have many places where SC indexes are MVMint64 03:31
like MVM_sc_get_object, which is, AFAIK, an API function 03:32
so ... ugh
i guess i'll open an issue about these numbers being not so right all over the place? 03:34
i mean, mixed signed and unsigned
github.com/MoarVM/MoarVM/issues/348 03:36
dalek arVM: ef856ae | timotimo++ | src/6model/reprs/SCRef.c:
disable unmanaged size for SCRef untix #348 gets some attention
03:42
arVM: ae196c6 | timotimo++ | src/profiler/heapsnapshot.c:
use proper unsigned formatting codes in collectables_str
arVM: 543c727 | timotimo++ | src/profiler/heapsnapshot.c:
use unsigned formatting codes in types_str, too

though we are very unlikely to hit the point where that makes a difference
03:57 vendethiel joined
timotimo diakopter: do you think you can make moarop_mapper and add_core_moarop_mapping keep around less memory? 04:01
like, do you have time to do that this week or something?
dalek arVM/annotate_refs_reprapi: 43f5701 | timotimo++ | src/6model/ (44 files):
give all REPRs a "describe_refs" function
04:21
05:14 vendethiel joined
diakopter timotimo: dunno 05:39
also dunno if it's worth it
timotimo: ohh (looking at #348) 05:43
05:58 vendethiel joined 06:21 vendethiel joined 07:03 vendethiel joined 07:09 mojca joined 07:16 domidumont joined 07:21 domidumont joined
timotimo i'm having difficulty falling asleep :| 08:01
so i'm banging my head against the describe_ref functions
MVMArray seems okay, but i couldn't actually see it put any indices anywhere so far 08:02
P6opaque gives me segfaults
and SCRef is just totally b0rked
08:04 vendethiel joined
dalek arVM/annotate_refs_reprapi: 9d355e7 | timotimo++ | src/6model/6model.h:
describe_refs ought to be void.
08:07
arVM/annotate_refs_reprapi: 842eb4e | timotimo++ | src/6model/reprs/MVMCompUnit.c:
describe refs of MVMCompUnit
arVM/annotate_refs_reprapi: 138f13e | timotimo++ | src/profiler/heapsnapshot.c:
actually call describe_refs for things
arVM/annotate_refs_reprapi: e91e879 | timotimo++ | src/6model/reprs/MVMArray.c:
describe_refs in MVMArray

0953d68 | timotimo++ | src/6model/reprs/SCRef.c: improvement to SCRef's describe_refs
timotimo the b0rken parts are commented out, so compiling with this branch doesn't give you crashes.
well, one thing that's wrong is having alloc_objects and alloc_stables in SCRef 08:10
it should be num_*
$5 = {handle = pointer to '', description = <error reading variable: Cannot access memory at address 0x20>, 08:12
num_objects = 15762615875665921, alloc_objects = 6316048, root_objects = 0x38, root_stables = 0x0,
num_stables = 56, alloc_stables = 7601200, root_codes = 0x0, rep_indexes = 0x20000400000001, rep_scs = 0x606550,
owned_objects = 0xa6d540, sc = 0x0, hash_handle = {tbl = 0x38000400000001, hh_prev = 0x606010, hh_next = 0xa9, 08:13
key = 0x0, keylen = 169, hashv = 0}, sc_idx = 13310112, sr = 0x0, mutex = 0x20000400000001}
which of these fields is supposed to tell me "do not try to look into this object's things"?
08:14 FROGGS joined
timotimo oooooh 08:15
there's a level of indirection there that i hadn't noticed!
08:15 domidumont joined 08:16 zakharyas joined
dalek arVM/annotate_refs_reprapi: 2a873b0 | timotimo++ | src/6model/reprs/SCRef.c:
finally noticed SCRef's body is one more pointer away!

fixes SCRef's unmanaged_size as well as describe_refs
08:17
timotimo > find 100 objects type="SCRef" 08:23
fish: Job 2, ā€œperl6 -Ilib bin/moar-ha ~/perl6ā€¦ā€ terminated by signal SIGSEGV (Address boundary error)
*cough cough* :)
but probably because moar re-compiled in the background?
08:45 mojca joined
timotimo gist.github.com/timo/1d9d45ae647a870799eb - not quite sure how exactly this is going wrong, but it's kinda neat to see at least 08:48
i'd like it to also output the index of the other objects if that's not too much 08:49
github.com/timo/p6-app-moarvm-heapanalyzer - fork'd 08:51
jnthn: even though having a collectable already in the seen hash isn't a problem for describe_refs, the refs that get described end up getting added another time on top of the already existing one, because add_reference doesn't check that yet 08:57
timotimo tries to sleep again
why did i even try *sigh* 09:51
09:55 harrow joined
timotimo is implementing deduplication of references 10:03
jnthn timotimo: Why do we get duplicate references?
timotimo because first it gets added in gc_mark, then it gets added again in describe_refs 10:04
jnthn ...
Yes, you're mean tto call gc_mark *or* describe_refs!
Not both!
describe_refs if available, gc_mark as a fallback
timotimo oh, then describe_refs has to implement all of what gc_mark does
jnthn Yes.
timotimo then every one of them has to be perfect :| 10:05
jnthn Right :)
So add them where they're useful
And not where they ain't :)
timotimo the one for p6opaque is useful, but also b0rked
if you could look into my p6opaque describe_refs, that'd be nice :) 10:10
jnthn: is there anything bad about having "index" for multiple different things in the same object? 10:16
like, objects and stables in the SCRef
jnthn timotimo: Well, it won't complain if you do, it'll just be a bit confusing in the output, but tbh SCs reference so much it's going to be a lot of output if you try and ls one :) 10:18
timotimo fair enough 10:19
i put a "show" (basically ls) into the analyzer in my fork on github
jnthn Yeah, I saw
show is a nice name
timotimo it ought to show some more stuff, too 10:20
size, unmanaged size, stuff like that
jnthn yeah 10:21
And also only show some refs if there are a gazzilion 10:22
(And a number or refs up top)
We can show the refs on a single line in show too, I think
as in --[ ... ]--> foo
And yeah, include the collectable ID 10:23
as in --[ ... ]--> 1234 moarop_mapper (QAST.nqp:666) (Frame)
So then you can navigate :)
timotimo right 10:24
also, we may want a "next page" command for "find" and friends
jnthn Maybe show without an arg should default to show 0 - that is, the root :)
dalek arVM/annotate_refs_reprapi: 5765e61 | timotimo++ | src/ (4 files):
describe_refs is supposed to be an alternative to gc_mark

not an addition to it.
timotimo what's the thing that makes startup so slow, btw? 10:26
we just don't read lines fast?
jnthn Yeah, and I think it's in MoarVM 10:27
Maybe something pathological about really long lines
timotimo OK, sounds good
maybe we only grow our line buffer linearly
instead of 2* it every time?
jnthn It's actually managed as a linked list, iirc
Until we assemble the string and then we know how many things we have
I suspect the code that looks for the separator 10:28
Need to do a C profile though
But got other job to work on today :)
timotimo it's going to be interesting to support describe_refs for p6opaque 10:30
as it has flattened STables in it
well, *can have*
jnthn :) 10:31
timotimo i can think of a few different ways to hack around that potential problem
we could remember what reference ID we were at when we started, then we can pass a "success" flag upwards from describe_refs
we could have an extra repr api function to do a check up-front, recursively, if we'd succeed or not 10:32
or the describe_refs function could also take a worklist and do gc_mark when it can't do describe_refs for children
that last one seems the least painful 10:33
jnthn OK, leave it for now, I'll have to think about it a bit :)
But I'm doing other things now.
timotimo on top of that, my implementation of p6opaque caused segfaults and i have no clue why :)
jnthn Note that unmanaged_size will also need to factor in flattened in stuff
(for P6opaque) 10:34
timotimo aye
10:35 harrow joined
dalek arVM/annotate_refs_reprapi: d5de54b | timotimo++ | src/6model/reprs/P6opaque.c:
restore a NULL that went missing
10:36
timotimo yes, 66% in find_separator 10:37
jnthn aha :)
I have a vague memory I cheated in there somewhere 10:38
timotimo the hottest instruction in there is the jne from the if (sep_spec->sep_lengths[j] == 1) { ... } 10:39
jnthn :) 10:40
OK. I'll look into it
timotimo how do we feel about finding all incoming references to a given object? 10:54
jnthn Do we have a use case? 10:56
timotimo just navigation in general, nothing very important
jnthn I don't think it's worth it unless we need it 10:57
I think if we did it then a paths 1234 would do it
That is, list all paths
But usually shortest path is the easiest way to understand a leak 10:58
timotimo hm, all paths could be *really* much :) 10:59
i think i want VMString to show up with its contents, does that make sense? 11:00
jnthn No
timotimo right, if it's huge, that'd be bad
jnthn I didn't do that on purpose
The point of this isn't to see the exact data on the heap 11:02
It's to understand the objects and their relationshps
And magnitudes
Once you start including data too it'll get incredibly large
Plus imagine somebody taking heap snapshots of code that has, for example, passwords or keys in memory 11:03
timotimo that's very true
oh
well, that's true
jnthn Today you don't have to worry you'll leak them if you share the snapshot with somebody for debugging.
timotimo that also means i don't want to give MVMHash's keys into the string heap 11:05
jnthn Yeah, maybe we should add key, value, key, value, etc. 11:06
timotimo i can definitely do that
or give both the key and the value the same index 11:07
though won't the key always be a string anyway? even with object hashes? 11:10
jnthn Yeah, for now 11:14
dalek arVM/annotate_refs_reprapi: 7c369e5 | timotimo++ | src/6model/reprs/MVMHash.c:
give MVMHash an unmanaged_size
11:15
timotimo pushed some more commits up to my fork of your heap analyzer. feel free to pull those in :) 11:28
i'll try to get a bit of shut-eye
jnthn Good luck! :)
.oO( The timotimo is a nocturnal species... )
11:29
12:31 vendethiel joined
nine /win 37 12:36
jnthn /fail 43 12:37
ilmari /lose 42 12:40
nwc10 /lunch 12:46
not sure what number goes with that
moritz at least 30 (minutes) :-)
17:10 geekosaur joined 17:18 domidumont joined 17:45 dalek joined 18:15 dalek joined 18:55 Ven joined 19:13 geekosaur joined 19:23 vendethiel joined 19:29 FROGGS joined 19:37 Ven joined 20:12 dalek joined 20:18 dalek joined
timotimo .o( and now i slept far too much ) 20:42
20:50 mojca joined
jnthn Turns out there's more than one reason that EVAL leaks... 21:12
But at least I know I fixed the first one ('cus the heap snapshot analysis tells me so) and where to start looking for the second :) 21:13
timotimo did the amount of leakage get reduced at all, or is something else holding on to the same sub-tree? 21:14
jnthn The latter 21:15
timotimo feared as much :(
would the imagined "paths" command start from every single root (i.e. those things that are referenced by the Root collectable) and see if the object can be reached from there exclusively? or what? 21:16
jnthn You can probably do it with a modified bfs... 21:17
That keeps arrays of preds instead of just the best pred
timotimo mhm 21:19
then we'll have to be careful about circularity, though 21:20
perhaps it's enough to just not accept links that have a difference in "distance" that goes the wrong way? 21:21
jnthn It's a bfs, the color prevents you going in circles
timotimo oh
that makes sense 21:22
jnthn wtf, these paths are bizzare 21:26
timotimo show one? :) 21:27
jnthn gist.github.com/jnthn/5e6c8f84a9e44fd9966b 21:28
How on earth does ABC end up in the same SC has Int?
timotimo argh!
jnthn It's not reposeesion.
timotimo well, the boxed integer cache doesn't know about SCs at all 21:29
so whoever calls box_i at the right time gets it
jnthn oh...
timotimo that's something i didn't even think about when i wrote that code 21:30
jnthn Yes. Damn.
timotimo we could just empty out the BIC when we pop an SC, fwiw
or unset the sc in the cached boxed integers, if that makes any sense at all?
jnthn Well, an SC needs to "own" the thing really 21:31
timotimo a fake SC could do that, if need be
how does scdisclaim do it? 21:32
jnthn We really do need separate constants per SC, I think 21:33
timotimo OK, so the cache gets an extra "layer"/key and we'll have to throw out SC entries when we pop SCs?
21:36 ilmari joined, dalek joined
jnthn Maybe...or just clear it on push/pop of SCs 21:37
timotimo it could also get a hit/miss counter for SCs and an entry for what SC is currently being accepted and when the misses go much above the hits, it just gets dumped out completely
multiple threads would increment those counters, though
but i think it already has a lock that prevents competitive additions?
jnthn yeah 21:38
timotimo i'd go with the "dump when SC changes", fwiw
i mean, when push or pop get executed
jnthn Think I'll sleep on it. Currently checking if disabling it totally makes a difference.
timotimo OK. if not, we'll just have another place to look at ;)
jnthn It makes the heap snapshot a lot bigger for one.
timotimo yeah, many more Int objects 21:39
maybe an #ifdef for the int cache might be a good idea in general? i could put that in if you'd like
jnthn Seems I didn't disable it well enough... 21:40
Anyway, the short answer is that lexotic_cache is also guilty 21:41
timotimo if you'd like you can just gist the whole test setup including the ABC script
also, i think the int cache can only contribute to a limited amount of old SCs being kept around, no? 21:42
jnthn It's the same test as yesterday :) 21:43
I'd think so
timotimo isn't in my irc backlog any more :S
or was it not a gist? 21:44
jnthn gist.github.com/jnthn/f67379ff234861729f34
timotimo lovely, thanks
jnthn Make sure you have the NQP patch I did too
To fix the immediate leak
timotimo oh, it *was*!
aye, i built that on at least one of my two machines
oh, that one 21:45
yeah, will surely get that. i thought you meant the stage0 update and such
jnthn Oh, those are in a branch :) 21:46
'cus I didn't do the JVM updates for them
And don't want to bust JVM
timotimo understandably
jnthn If you fancy doing that we can merge it
timotimo otoh, #?if moar :)
jnthn Would prefer to at least stub the op on JVM, even if it does nothing :) 21:51
timotimo understood 21:52
i'll have to run emergency-grocery-shopping
feel free to outline some more tasks
jnthn Well, removign the int cache helps but...then we just get other odd traces :) 21:53
timotimo i know nothing about the lexotic cache fwiw :)
jnthn That don't make sense, but I'm tired, so I'll leave it there for now. But feel free to see if you can figure them out
Yeah, that one I need to ponder
But there are *others* :) 21:54
timotimo but it can probably just be dumped at the same time the int cache gets dumped?
OK
i'll at least take a few looks
rest well!
jnthn I think I may come up with a better design for the lexotic cache :)
timotimo is it easy-ish to just chop the cache out of moar and see what happens?
or will that slow execution down 1000x? 21:55
jnthn No, I did exactly that
It makes bigger heap snapshots is all
timotimo: gist.github.com/jnthn/0e9274240cb928c31cfd was my patch 21:56
timotimo oh, i meant the lexotic cache
jnthn 'night
22:16 mojca joined 22:20 geekosaur joined