#moarvm on 29 July 2017 - Raku Programming Language Log

samcv	now let me see how many spectests this breaks :)	00:00
jnthn	samcv: (most codepoints) sounds fine	00:06
	timotimo: No, I'm a bit confused by how that'd happen
samcv	ok cool no spectests failed
jnthn	timotimo: Will have to look, but it's decidedly bed time for me :)	00:07
timotimo	well, it doesn't look like we ever root (or even mark) a ThreadContext	00:13
samcv	cool. i'm adding back the power to configure primary/secondary/tertiary level sort :)	00:18
	though it's going to have to work slightly differently. though it may give the same result. because all primary levels are higher than all secondary levels and all secondary are higher than all tertiary levels	00:19
	it will only apply your settings if comparing between the same level. otherwise it will compare normally
	so if you reverse primary level, your change only takes effect when comparing primary level vs primary level
	and... it works! wow	00:23
	wait why does MVM_string_codes return s->body.num_graphs and MVM_string_graphs also returns body.num_graphs?	00:49
timotimo	that seems wrong	01:19
	but we might not have that function used anywhere?
01:52 ilbot3 joined
samcv	hmm	01:52
	well there's nqp::codes_s but it's not used in nqp	01:53
	what's the nqp op to get the number of codepoints?
	or the MVM function?
	o-oh it does self.NFC.codes	01:54
	in rakudo
	could that be slow though?
	so MVM_string_codes is totally bogus	01:56
timotimo	it could be slow, yes
	it allocates, for one	01:57
samcv	we don't just keep track of the number of codepoints?
	that shouldn't be that hard to do i would think	01:58
timotimo	how often do we need that info?	02:02
samcv	idk
	but it is an op. though it's not added to nqp	02:03
	seems bad to have it be totally wrong
timotimo	right, it is	02:12
	can probably replace it with a NYI exception	02:13
	bedtime for me	02:14
	long day ahead of me
	o/
04:15 deep-book-gk_ joined 04:18 deep-book-gk_ left 06:42 robertle joined 06:54 statisfiable6 joined 09:07 praisethemoon joined 11:12 colomon joined
jnthn	codes_s should probably just grab a codepoint iter and loop	11:31
	And increment for each
	We could do an optimized path for some of the cases
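A minimal sketch of the "loop and increment" idea, written against a simplified model rather than MoarVM's real string representation. The `Synthetic` and `GraphemeString` types and both function names are hypothetical; the only assumption carried over from the discussion is that a grapheme is either a plain codepoint or a synthetic standing for several codepoints.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model, NOT MoarVM's actual structures: a grapheme is either a
 * plain codepoint (>= 0) or a synthetic (< 0) that indexes a table of
 * multi-codepoint sequences. */
typedef struct {
    const int32_t *codes;      /* codepoints the synthetic stands for */
    size_t         num_codes;
} Synthetic;

typedef struct {
    const int32_t *graphs;     /* one entry per grapheme */
    size_t         num_graphs;
} GraphemeString;

/* Grapheme count is stored, so this is O(1). */
static size_t string_graphs(const GraphemeString *s) {
    return s->num_graphs;
}

/* Codepoint count: walk every grapheme and add up what it expands to. */
static size_t string_codes(const GraphemeString *s, const Synthetic *table) {
    size_t total = 0;
    for (size_t i = 0; i < s->num_graphs; i++) {
        int32_t g = s->graphs[i];
        if (g >= 0)
            total += 1;                        /* plain codepoint */
        else
            total += table[-g - 1].num_codes;  /* synthetic -1 is table[0] */
    }
    return total;
}
```

In this model the two counts only agree when the string holds no synthetics, which is one case where an optimized path could simply return the stored grapheme count.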
nine	jnthn: you talked about aliasing and scalars yesterday. Did you mean scalars in general, like ones coming from outside the block spesh is looking at, or scalars created in that block?	11:33
11:34 vendethiel joined
jnthn	nine: Ones coming outside of the block we pretty much have to assume are aliased	11:36
	But in a huge number of cases the first thing we do upon receiving them is decont
11:38 lizmat joined
nine	jnthn: but if you meant the ones created in the block I don't understand how the "block it if we see a call" rule can be too weak, since I think there's no way to pass the scalar to a different thread without making a call?	11:40
	It's easy to see how it's too strong though.	11:41
	(for the "it's only deconted" reason)
jnthn	nine: Ah, I was talking then about where we're heading, not what we have today. Today's one doesn't try to track if the thing might be aliased.	11:47
nine	Is there any documentation about how spesh works? I do understand its job but would like to learn more about how it's implemented.	11:51
jnthn	Not a great deal; I'm sure I had some slides with the basics, beyond that the key data structures involved are decently described in the header files. src/spesh/graph.h is the best place to start reading.	11:58
12:03 colomon joined
nine	Ok, thanks!	12:06
12:10 colomon joined 12:46 colomon joined 12:52 dogbert2 joined
Geth	MoarVM: abc38137b3 | (Samantha McVey)++ | src/strings/ops.c Fix MVM_string_compare to support deterministic comparing of synthetics Previously we compared naively by grapheme, and ended up comparing synthetic codepoints with non-synthetics. This would cause synthetics to be sorted incorrectly, in addition to it making comparing things non-deterministic; if the synthetics were added in a different order, you would get a different result with MVM_string_compare. ... (6 more lines)	13:40
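The commit message is truncated here, so the following is only an illustration of the problem it describes, not the actual change. It assumes a simplified model where synthetics are negative numbers handed out in creation order, and shows one way to make ordering independent of that order: compare the codepoints a grapheme expands to. The `Synthetic` type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model, NOT MoarVM's actual structures. */
typedef struct {
    const int32_t *codes;
    size_t         num_codes;
} Synthetic;

/* Expand a grapheme to its codepoints; a plain codepoint expands to itself. */
static size_t expand(const int32_t *g, const Synthetic *table,
                     const int32_t **out) {
    if (*g >= 0) {
        *out = g;
        return 1;
    }
    *out = table[-*g - 1].codes;       /* synthetic -1 is table[0] */
    return table[-*g - 1].num_codes;
}

/* Compare two graphemes by their codepoints. Returns -1, 0 or 1.
 * The numeric value of a synthetic never takes part in the comparison. */
static int compare_graphemes(int32_t a, int32_t b, const Synthetic *table) {
    const int32_t *ca, *cb;
    size_t na = expand(&a, table, &ca);
    size_t nb = expand(&b, table, &cb);
    size_t n  = na < nb ? na : nb;
    for (size_t i = 0; i < n; i++)
        if (ca[i] != cb[i])
            return ca[i] < cb[i] ? -1 : 1;
    if (na == nb)
        return 0;
    return na < nb ? -1 : 1;           /* shorter sequence sorts first */
}
```

With this approach, registering "e + U+0301" before or after "a + U+0301" gives the same ordering, whereas comparing the raw negative values would flip the result depending on which was created first.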
13:46 brrt joined
nine	Comparing the precomp file of NativeCall::Types with just a recompile, the files differ by 8 32 bit values and 2 64 bit values spread out between 0x0000ff70 and 0x00010150 which is about 73 % into the file. What could those be?	13:47
	Don't look like strings or time stamps and it's not just a different order either.	13:48
jnthn	If you valgrind it, does it warn about write getting uninitialized bytes?	13:49
	I think there's still some case that wasn't yet tracked down where things are aligned in the output, but the padding bytes aren't zeroed, and the memory was malloc'd	13:50
	It's fine in that we ignore them when reading
	But maybe not so fine for what you're doing?
nine	This is just a plain unmodified MoarVM. Wouldn't the uninitialized read have popped up time and again?
jnthn	I see them in the occasional valgrind output alongside the actual things I've been hutning	13:51
	*hunting
nine	I'm investigating reproducible builds as distros like Debian are pushing strongly into that direction.
jnthn	But since I knew they were harmless I learned to disregard them.
	Yeah, then they're not so harmless for that
	It'll be somewhere in src/mast/compiler.c that'll want fixing, I'd expect
13:53 colomon joined
nine	Ok, that explanation does fit with the data I'm seeing. Though that doesn't explain those 2 64 bit differences	13:53
jnthn	No, those are a bit more odd	13:54
nine	I do get a "Syscall param write(buf) points to uninitialised byte(s)"	14:12
	Looks like it's in the SC data (surprise, surprise)	14:17
jnthn	oh	14:18
	Wasn't quite expecting it in SC data
Geth	MoarVM/even-moar-jit: 23 commits pushed by (Jonathan Worthington)++, (Jimmy Zhuo)++, (Timo Paulssen)++, (Samantha McVey)++, (Bart Wiegmans)++ review: github.com/MoarVM/MoarVM/compare/5...328d2e1c74	14:19
nine	Well vm->serialized starts at 0x0b7d8 and is size 0x0a6b8. The values in question are between 0x0ff70 and 0x10150	14:21
jnthn	Seems guilty then
nine	also 0x0b7d8+0x0a6b8 comes just a couple bytes short of the mbc file's size, so I guess the numbers make sense	14:23
	And when I compile with all correct arguments, the size matches exactly even.	14:31
	So....what's the story behind this? #define vm tc	14:34
jnthn	Once upon a time, the MAST assembler was compiled both into MoarVM and into a Parrot dynops library	14:38
	That's how we bootstrapped off NQP on Parrot.
nine	Oh...that does sound kinda horrible	14:39
jnthn	Yeah. Well, that code is a victim of its own correctness I guess.	14:40
	It's required very few changes/fixes, those it did need were very localized, so there was never really an incentive to sink time into erasing this history :)	14:41
	Not to mention that in the long term we should probably drop MAST altogether and just produce the bytes :)
nine	Ok, narrowed it down to writer->root.stables_data	14:43
jnthn	Hmmm
	jnthn still has no good guesses	14:44
	Though it's starting to sound a bit less like padding
nine	Turning the MVM_mallocs in MVM_serialization_serialize into MVM_callocs seems to have improved things. Though I do see additional differences in these tests.	14:51
	As opposed to my very first one	14:52
	As the sizes are "Some guesses." it's not that surprising. Not all compilation units will fill those default buffers completely.	14:58
15:00 colomon joined 15:08 dogbert2 joined 15:17 praisethemoon joined
dogbert2	Created an issue for the problem uncovered yesterday. github.com/MoarVM/MoarVM/issues/620	15:23
15:48 colomon joined
nine	One of the remaining bits seems to be in a method cache	15:59
	Removing the 2 ^parameterize methods in the file seems to improve things.	16:36
	The same is not true for the other methods
16:39 colomon joined 17:52 dogbert2 joined
nine	No wonder this makes no sense! The sizes for the sections are calculated in a different order than the sections are written, so all positions were far off.	17:53
	Now this makes more sense: it's the very last 4 bytes of writer->root.objects_data	18:00
18:03 statisfiable6 joined
nine	And it iiiiis..... the padding between sections. Who'd have thought? We write writer->root.objects_data bytes but advance the offset by MVM_ALIGN_SECTION(writer->objects_data_offset)	18:15
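A small sketch of the padding issue described above: if a section's bytes are copied into a malloc'd output buffer and the offset is then rounded up to the next alignment boundary, the gap holds whatever was in memory unless it is cleared explicitly. The `ALIGN_SECTION` macro, the 8-byte alignment and `write_section` are stand-ins chosen for illustration, not MoarVM's definitions.

```c
#include <stddef.h>
#include <string.h>

/* Assumed alignment for this sketch only. */
#define SECTION_ALIGNMENT 8
#define ALIGN_SECTION(off) \
    (((off) + SECTION_ALIGNMENT - 1) & ~((size_t)SECTION_ALIGNMENT - 1))

/* Copy one section into out at *offset, zero the padding up to the next
 * aligned position, and advance *offset there. The caller must make sure
 * the buffer has room for the aligned end of the section. */
static void write_section(unsigned char *out, size_t *offset,
                          const void *data, size_t len) {
    size_t end  = *offset + len;
    size_t next = ALIGN_SECTION(end);
    memcpy(out + *offset, data, len);
    memset(out + end, 0, next - end);   /* makes the padding deterministic */
    *offset = next;
}
```

Allocating the whole output buffer with calloc would have the same effect for the padding, at the cost of clearing bytes that are about to be overwritten anyway.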
	Only one difference left, apparently in the closures table	18:19
	And that can be fixed by initializing the full memory after reallocing the closures table. Now why that's necessary is an open question.	18:31
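For reference, realloc leaves the newly added part of a grown buffer uninitialized, so any slot that is never written ends up in the output as-is. A generic sketch of growing a table and zeroing the new tail follows; `grow_zeroed` is a hypothetical helper, not a MoarVM function.

```c
#include <stdlib.h>
#include <string.h>

/* Grow buf from old_size to new_size bytes and zero the newly added tail.
 * Returns NULL if realloc fails, in which case the old buffer is still
 * valid and still owned by the caller. */
static void *grow_zeroed(void *buf, size_t old_size, size_t new_size) {
    void *grown = realloc(buf, new_size);
    if (grown != NULL && new_size > old_size)
        memset((char *)grown + old_size, 0, new_size - old_size);
    return grown;
}
```

This only makes the unused bytes reproducible; it does not answer the open question of why some of them are never written.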
18:55 zakharyas joined 18:58 zakharyas joined
timotimo	cool of you to investigate	19:02
	i was wondering if we should have a stage mibus one
	minus	19:03
	a slimmed down nqp without optimizer and repl and maybe some other things you could leave out
	then verify it
	with --dump
	plus something that dumps the sc
19:05 zakharyas joined 19:08 greppable6 joined, committable6 joined
nine	Now there's still the time_n() in alt_nfas getting baked into $ast.name(QAST::Node.unique('alt_nfa_') ~ '_' ~ ~nqp::time_n());	19:17
	Why would that be necessary when there's already a call to unique()?	19:18
jnthn	I can't remember how unique those have to be but I think they may have to be unique across compilation units, not just per compilation unit	19:19
nine	As we're talking about regexes here, would they have to be unique even between identical regexes?	19:23
jnthn	A regex can have many alternations... iirc, but I may not, the issue is that they're cached in the meta-object by the name, and so if you subclass then re-used names between the super and child grammars if they're in different compilation units would be problematic	19:26
	Gotta be afk for a bit now, though, so can't look in detail... Surely we can eliminate the timestamp though :)
	bbiab
20:05 colomon joined 20:08 dogbert2 joined 21:16 Geth joined 22:18 dogbert2 joined
timotimo	If the program really needs this behavior there is no really easy way out. One possibility is to create an anonymous file (just unlink it after creation), size the file using ftruncate, and then map the file in two places. In one place map it with MAP_SHARED and write permission but without execution. For the second mapping use execution permissions but no write permissions. This might be a bit confusing at	23:31
	first but can be handled. The program must be adjusted to write to one location and expect to execute code in another one. This is reasonably safe in case the two mappings are allowed to be randomized. The example code in the next section illustrates how this should work.
	"Using this approach instead of one mapping which is writable and executable at the same time is safer because the attacker has to know two independently randomized addresses (this assumes mmap is allowed to perform the randomizations)."	23:32
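The example the quoted text refers to is not part of this log. As a rough sketch of the technique it describes, assuming a POSIX system where /tmp allows executable mappings: create a temporary file, unlink it, size it with ftruncate, then map it twice with different protections. The `DualMapping` type, the function names and the temp path are made up for this illustration, and the sketch stops short of actually executing anything.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    void  *write_view;  /* PROT_READ | PROT_WRITE, never executable */
    void  *exec_view;   /* PROT_READ | PROT_EXEC, never writable   */
    size_t size;
} DualMapping;

/* Returns 0 on success, -1 if the file or either mapping cannot be set up
 * (for example when executable mappings are denied by policy). */
static int dual_mapping_create(DualMapping *m, size_t size) {
    char path[] = "/tmp/dualmap-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                      /* file now lives only through fd */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }
    m->write_view = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    m->exec_view  = mmap(NULL, size, PROT_READ | PROT_EXEC,  MAP_SHARED, fd, 0);
    close(fd);                         /* the mappings keep the file alive */
    if (m->write_view == MAP_FAILED || m->exec_view == MAP_FAILED) {
        if (m->write_view != MAP_FAILED) munmap(m->write_view, size);
        if (m->exec_view  != MAP_FAILED) munmap(m->exec_view, size);
        return -1;
    }
    m->size = size;
    return 0;
}

static void dual_mapping_destroy(DualMapping *m) {
    munmap(m->write_view, m->size);
    munmap(m->exec_view, m->size);
}
```

Code would be emitted through `write_view` and called through `exec_view` at the same offset; because both are MAP_SHARED views of one file, bytes written through the first are visible through the second.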
	though i thought we had a write-only mapping first and then turn it exec-only?	23:33
	well, read-write
Geth	MoarVM: b07acdfd92 | (Timo Paulssen)++ | src/jit/compile.c disable jit when we're not allowed to make memory executable	23:52
timotimo	turn on "deny_execmem" and watch one program after the other crash	23:53
