TimToady | m: my $a = 0; my @a = eager ($a++,) Zxx 42; say $a; say @a; say $a | 00:01 | |
camelia | 0 [(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41)] 42 |
||
TimToady | it does seem a bit antisocial to ignore eagerness | 00:03 | |
m: my $a = 0; my @a = ($a++,) Zxx 42; say @a.is-lazy | |||
camelia | False | ||
00:40
agentzh joined
02:49
ilbot3 joined
02:51
dogbert2 joined
02:53
mtj_ joined
03:34
geekosaur joined
|
|||
Geth | MoarVM: c9b3a35a51 | (Samantha McVey)++ | docs/ChangeLog Add more details on speed improvement to changelog |
04:47 | |
05:20
agentzh joined
06:06
agentzh joined
|
|||
samcv | jnthn, putty probably doesn't use fallback fonts or something. i never liked putty because it wasn't very full features. noto fonts is not just one font because it covers so much of unicode it can't fit in one font | 07:10 | |
07:40
brrt joined
08:09
agentzh joined
08:50
domidumont joined
|
|||
Geth | MoarVM: 7d7a6b1881 | (Samantha McVey)++ | src/strings/ops.c Fix bug in indexic_s if expanding cp is the last cp There was a bug where if a codepoint such as 'st' which expands under foldcase was the last codepoint, it would return -1 instead of returning the correct index. Fixes the nqp tests that used to be todo'd. |
10:23 | |
jnthn | m: my $a = 0; say (($a++,) Zxx 42).is-lazy | 11:18 | |
camelia | False | ||
jnthn | Hmm | 11:19 | |
Ah, there was a revert | |||
MasterDuke | jnthn: i think i'm running into the bug you did this workaround for github.com/perl6/nqp/commit/79e10b...8c774ed165 | 12:14 | |
which i think is related to this: | 12:16 | ||
m: class :: { has uint64 $.x; }.new( x => 2**64-1 ).x.say | |||
camelia | -1 | ||
12:32
ilmari[m] joined
|
|||
jnthn | MasterDuke: Yeah, I think that's 'cus we had no getattr_u yet | 13:00 | |
13:00
geekosaur joined
|
|||
MasterDuke | yeah, pretty sure that's one i added in my `add a whole bunch of _u ops` branch | 13:09 | |
also trying to figure out this one now: | 13:10 | ||
m: my uint64 $x = 2**64-1; say $x | |||
camelia | -1 | ||
MasterDuke | i'm returning the right value out of mp_get_int64, but haven't yet figured out where it gets lost | 13:11 | |
14:11
agentzh joined
14:37
geekosaur joined
|
|||
MasterDuke | if i don't care about fixing attributes and the more general _u problem, i have a fix for github.com/MoarVM/MoarVM/issues/513, that make two todos now pass | 14:37 | |
however, this test github.com/perl6/roast/blob/master...ive.t#L167 `is class :: { has uint64 $.x; }.new( x => 2**64-1 ).x, 2**64-1, 'uint64 attributes don\'t get sign-extended';`, which is currently todoed, now dies | 14:38 | ||
because attributes are all treated as signed, and since that value is too large for a signed native, my fix for #513 causes it to throw | 14:39 | ||
my fix for #513 wouldn't cause it to throw if unsigned attributes worked correctly | 14:42 | ||
so would modifying that test as a workaround be ok? | |||
timotimo | src\jit\emit_win32_x64.c(8) : warning C4129: 'j' : unrecognized character escape sequence | 16:00 | |
src\jit\emit_win32_x64.c(8) : warning C4129: 'e' : unrecognized character escape sequence | |||
what could this be? | |||
16:44
agentzh joined
17:30
zakharyas joined
17:32
agentzh joined
18:46
agentzh joined
|
|||
MasterDuke | timotimo: any idea why copying this github.com/MoarVM/MoarVM/blob/mast...1244-L1252 and changing 32s to 8s causes errors like `Unhandled exception: Error encoding UTF-8 string: could not encode codepoint 1836085614`? | 19:26 | |
19:34
Ven joined
|
|||
MasterDuke | .ask samcv parsing the json from a travis log with JSON::Fast spends 46% of its time here github.com/MoarVM/MoarVM/blob/mast...#L86-L139, which mentions it could be smarter. see anything you could do to optimize it? | 19:46 | |
yoleaux2 | MasterDuke: I'll pass your message to samcv. | ||
timotimo | can you give me the actual code you used, MasterDuke? | ||
MasterDuke | gist.github.com/MasterDuke17/80bcd...915c563c8d | 19:49 | |
timotimo | result->body.storage_type = MVM_STRING_GRAPHEME_32; | 19:51 | |
that's why it's erroring out | |||
MasterDuke | ah | 19:53 | |
timotimo | i'd expect that re_nfg could be implemented to detect that no change is necessary at all | 19:59 | |
MasterDuke | so should the MVM_STRING_GRAPHEME_8 case have the exact same body as the MVM_STRING_GRAPHEME_32 case? | ||
timotimo | alternatively it can use almost-all-of the previous string by refering to it from a rope and then a little piece in the middle that combines the end and beginning into a grapheme, and then most-of the second string | 20:00 | |
almost the same body. the pgraphs has a different meaning based on what the storage type is | |||
so you have to copy pgraphs * sizeof(MVMGrapheme32) in one case and pgraehs * sizeof(MVMGrapheme8) | 20:01 | ||
the heap analyzer ought to help here :) | |||
MasterDuke | same error | 20:02 | |
timotimo | are you also setting the correct storage type for result? | ||
MasterDuke | nope | ||
timotimo | well, that's the problem, then | 20:03 | |
though of course both parts must have the same storage type | 20:04 | ||
MasterDuke | but i could only do that if i know that all the pieces and the separator are MVM_STRING_GRAPHEME_8 right? | ||
timotimo | right | 20:10 | |
otherwise you'll have to do a loop and assign | |||
memcpy can't do upscaling like that | |||
MasterDuke | think the loop is faster than the current default case? | ||
timotimo | maybe there's some fast way to do it, and if you have a tight loop it could very much be that gcc gives you that for free | 20:11 | |
oh i didn't know about nfg_is_concat_stable | |||
do we check that everywhere before we re_nfg? | 20:12 | ||
yeah, we do | |||
so it could be interesting to figure out why our json stuff thinks the concat isn't stable | |||
is there any reason to assume that? | |||
MasterDuke | i hope that was a rhetorical question, i don't know this nfg stuff at all | 20:15 | |
timotimo | i didn't look at the function yet, but i assume it checks the last grapheme of a and the first of b | 20:17 | |
and if the first of b is a combiner without its base, it'll complain | 20:19 | ||
or if a ends in a pre-pendable combiner | |||
MasterDuke | i'm looking at it, and without knowing these things at all, the comments seem to suggest what you just said | 20:20 | |
timotimo | OK, how about this: | 20:22 | |
we should have code to encode a substr of a string to utf8, right? | |||
when we hit the case that concat isn't stable, we output the string in question | |||
or maybe even the graphemes as integers | |||
MasterDuke | there is a MVM_string_utf8_encode_substr | 20:23 | |
timotimo | fantastic | ||
though maybe using the ints directly is more helpful | |||
because the strings are supposed to have "magical" unicode chars in them | |||
and they might turn invisible or hard-to-see when we just encode&print them | |||
MasterDuke | in moar or json::fast? | 20:25 | |
timotimo | in moar | ||
MasterDuke | for debugging purposes? | 20:29 | |
timotimo | yes, only for that | 20:30 | |
MasterDuke | ha, i was a bit confused there at first | 20:35 | |
timotimo | will you try doing that? | 20:38 | |
oh, interesting | 20:43 | ||
look, it thinks as soon as there's a synthetic it wouldn't be stable | |||
MasterDuke | so if not nfg_is_concat_stable, print out the MVM_string_utf8_encode_substr of the piece and separator? | ||
timotimo | a newline is rather common, but it's a synthetic | 20:44 | |
maybe we'll want to special-case something there | |||
(but first see what it's actually all about) | |||
oh, also ... we'll want both the encoded and the pure graphemes, i think | 20:45 | ||
MasterDuke | why do we need the *_encode_substr function? | 20:49 | |
timotimo | it's not strictly necessary, but if you see something like "-324" in the string, you don't necessarily know what's up | ||
MasterDuke | segv, did something wrong | 20:59 | |
timotimo | perhaps, let me see your code? | ||
MasterDuke | fprintf(stderr, "pieces[i - 1]: %s, p: %s\n", MVM_string_utf8_encode(tc, pieces[i - 1], MVM_string_graphs(tc, pieces[i - 1]), 0), MVM_string_utf8_encode(tc, piece, MVM_string_graphs(tc, piece), 0)); | 21:00 | |
oh, i need to encode to C str? | 21:01 | ||
geekosaur | fprintf isn't going to understand lists of codepoints, much less lists of graphemes; you need to feed it UTF8 in C's (char []) | 21:06 | |
timotimo | maybe we'll use only the graphemes for now | 21:09 | |
MasterDuke | so that printed a whole lot | 21:11 | |
timotimo | mhm | ||
that's only if it's not stable, right? | |||
can has a bit of example? | |||
MasterDuke | with pieces[i - 1] == `^M` a lot | 21:12 | |
timotimo | which one is that again ... | ||
MasterDuke | `else if (!MVM_nfg_is_concat_stable(tc, pieces[i - 1], piece)) {` | ||
timotimo | that's a control character? | ||
MasterDuke | isn't that a \r that get left from bad dos <-> unix line endings? | 21:14 | |
here's the first couple lines of output gist.github.com/MasterDuke17/d75d9...747f1dddf0 | 21:15 | ||
timotimo | maybe you're accessing one past the string? | 21:16 | |
MasterDuke | i changed the format string to "pieces[i - 1]: |%s|, p: |%s|\n" | ||
timotimo | though it should have a bounds check | ||
but yeah | |||
those are all newlines | |||
newline is a synthetic grapheme, because \r\n and \n are supposed to be the same thing | |||
too many things at once, where do i get the json we have again? | 21:17 | ||
MasterDuke | api.travis-ci.org/jobs/212455149 | 21:18 | |
timotimo | oh, i know | 21:19 | |
that's because of how json_fast parses strings | |||
it goes until it sees the first \* | |||
and then it parses chunks and joins them | |||
so you'll pretty much always have something like a newline where join is used | 21:20 | ||
i don't see a "Done" in there at all | |||
MasterDuke | there's at least one line with "Done" according to grep | 21:21 | |
timotimo | i must be doing something wrong | 21:22 | |
{"id":212455149,"number":"7436.3","config":{"language":"perl","os":"linux","sudo":false,"fast_finish":true,"install":"echo","dist":"trusty","addons":{"apt":{"sources":["ubuntu-toolchain-r-test"],"packages":["gcc-6","openjdk-7-jdk"]}},"script":["perl Configure.pl $RAKUDO_OPTIONS --moar-option=--cc=$(command -v gcc-6 >/dev/null 2>&1; if [ $? -eq 1 ]; then printf \"gcc\"; else printf \"gcc-6\"; fi)","make | |||
test","make install"],"branches":{"only":["nom","/smoke-me/"]},"notifications":{"irc":{"channels":["irc.freenode.net#perl6-dev"],"on_success":"change","on_failure":"always","template":["Rakudo build %{result}. %{author} '%{commit_message}'","%{build_url} %{compare_url}"]}},"env":"RAKUDO_OPTIONS=\"--backends=moar --gen-nqp=master | |||
--gen-moar=master\"",".result":"configured","group":"stable"},"repository_id":1242635,"build_id":212455146,"state":"finished","result":1,"status":1,"started_at":"2017-03-18T14:22:12Z","finished_at":"2017-03-18T14:29:07Z","log":"","commit":"dd3f22875c8f803ec97e0d4e39eef2c29ff49695","branch":"nom","message":"Log all commits to date\n\nDocuments commits:\ne1ebb50 f94a2c7 3de5fb2 79bb179 65b0040 10f5f74 | |||
9644fc3","committed_at":"2017-03-18T14:21:35Z","author_name":"Zoffix Znet","author_email":"cpan@zoffix.com","committer_name":"Zoffix Znet","committer_email":"cpan@zoffix.com","compare_url":"github.com/rakudo/rakudo/compare/3...ker":null} | |||
this be all i get | |||
MasterDuke | added the file i have to my gist | 21:24 | |
timotimo | thanks | ||
MasterDuke | and there's definitely one and only one "Done" in it | ||
timotimo | yeah that's like 100x as long as what i get | 21:25 | |
MasterDuke | wait, nm, could be more | ||
timotimo | hm? | ||
IOninja | travis trimmed an old job log? | ||
I see no log in browser in that url... nor Done | |||
timotimo | *shrug*, maybe i'm not logged in or whatever? | 21:26 | |
IOninja | gonna try on my box in ~10m (on phone right now) | ||
timotimo | i'll see what i can do about nfg_concat_is_stable | 21:27 | |
because we definitely know that the \n grapheme is okay (as it's a control characterr | |||
) | |||
MasterDuke | would it make sense for json::fast to also grab the next char if it sees a '\r'? | ||
timotimo | well, to do that, it'll have to concat, too | 21:28 | |
we wouldn't escape that problem | |||
"escape" | 21:29 | ||
badum-tss | |||
MasterDuke | but wouldn't "foo\r\nbar" get split into "foo", "\r", "\nbar" right now? | ||
or "foo", "\r", "\n", "bar"? | |||
timotimo | yup | 21:30 | |
the latter | |||
MasterDuke | wouldn't you want "foo", "\r\n, "bar"? | 21:31 | |
"foo", "\r\n", "bar" | |||
timotimo | if you want to write the code that handles consecutive escape sequences like that, sure. | 21:32 | |
MasterDuke | heh, i do not | 21:35 | |
timotimo | so there :) | ||
but we don't actually have \r\n, right? | |||
in our file i mean | 21:36 | ||
oh crap, we totally do | 21:37 | ||
a whole fricking lot of 'em | |||
in that case we always have to renormalize | |||
okay, it might be worthwhile to special-case \r\n | |||
i'll try implementing it | |||
IOninja | yeah, that link expired. tis empty on my box too | 21:38 | |
nearly empty | |||
MasterDuke | glad i saved it off then | 21:39 | |
timotimo | could have just used a newer link | 21:40 | |
may want to put a fresher link into tickets that are about this | |||
IOninja | I just grabbed one from a new job; noms 2.3GB there too: gist.githubusercontent.com/zoffixz...tfile1.txt | ||
I've updated my snippet to use ^ that permalink now: github.com/timo/json_fast/issues/2...-287570921 | 21:41 | ||
timotimo | i forgot to use -Ilib | ||
so i was using system-wide installed JSON::Fast %) | |||
IOninja | heh | ||
timotimo | stripped off half a gig | 21:42 | |
time perl6 -Ilib -e 'use JSON::Fast; from-json(slurp("/tmp/mem_leak.json"))' | |||
2.83user 0.63system 0:03.49elapsed 99%CPU (0avgtext+0avgdata 2166808maxresident)k | |||
0inputs+0outputs (0major+537101minor)pagefaults 0swaps | |||
timo@schmetterling ~/p/e/json_fast (master)> time perl6 -e 'use JSON::Fast; from-json(slurp("/tmp/mem_leak.json"))' | |||
2.56user 0.79system 0:03.38elapsed 99%CPU (0avgtext+0avgdata 2620180maxresident)k | |||
0inputs+0outputs (0major+650303minor)pagefaults 0swaps | |||
MasterDuke | nice | 21:43 | |
IOninja | \o/ | 21:44 | |
timotimo | not good enough, of course | 21:48 | |
but a bit better | |||
# at t/06-control-characters.t line 42 | 21:49 | ||
# expected: '"\r\n"' | |||
# got: '" | |||
"' | 21:50 | ||
welp :| | |||
that isn't right | |||
okay, fixed it | |||
pushed it, you can try if it makes things better | 21:51 | ||
it could be that it has been handling \r\n wrong for a while | 21:52 | ||
yup, totally | 21:54 | ||
sounds like a version bump is in order | |||
MasterDuke | seeing less mem usage also | ||
re_nfg is still 49% according to perf though | |||
timotimo | right | 21:55 | |
not unexpected | |||
we only reduced the number of concat operations | |||
not gotten around the problem in general | 21:56 | ||
MasterDuke | yup | ||
fwiw, here's heaptrack's summary now i.imgur.com/4bw1Bup.png | 21:58 | ||
for just parsing it once | |||
samcv | MasterDuke, so it may be from \r \n it having to renormalize so much? | 22:05 | |
and we call re_nfg so much when we concatenate yes? | 22:06 | ||
MasterDuke | believe so | 22:08 | |
but timotimo's really the one who knows what's going on | 22:09 | ||
samcv | yeah that's pretty wasteful | ||
i'm pretty sure i understand what's happening too | |||
from the scrollback | |||
MasterDuke | heh, i was having the conversation and don't understand | 22:10 | |
samcv | i'll look at it later today. doing stuff for the next few hours | ||
:) | |||
MasterDuke | cool | ||
timotimo | i found it a little bit surprising that re_nfg allocated such a gigantic amount of data | 22:14 | |
samcv | how much is gigantic? | ||
MasterDuke | 2g to parse a 200k json file | 22:15 | |
samcv | lol. | ||
ok that's gigantic | |||
timotimo | a little bit :3 | ||
MasterDuke | i'm kind of surprised re_nfg hasn't cropped up in any other profiles | 22:16 | |
samcv | maybe because we've had \n line endings? | 22:17 | |
timotimo | it had lots of \r\n in it | 22:20 | |
i improved the memory usage by half a gig (a quarter in total) by special-casing \r\n to be done in one step | |||
jnthn | Huh, how on earth is re_nfg managing that... | 22:25 | |
yoleaux2 | 16:58Z <japhb> jnthn: 6guts.wordpress.com/2017/03/16/con...semantics/ keeps asserting incorrect arithmetic (6+5=12,9+5=15,9+5=16). | ||
19:53Z <IOninja> jnthn: recall I kept mentioning buggable leaking? Seems it's just something with JSON::Fast (and it doesn't seem to be leaking but just using lots of RAM). Swapping to JSON::Tiny avoids the issue. | |||
MasterDuke | i think heaptrack said most of the allocations happened here github.com/MoarVM/MoarVM/blob/mast...ops.c#L100 `MVMGrapheme32 *out_buffer = MVM_malloc(bufsize * sizeof(MVMGrapheme32));` | ||
and bufsize == in->body.num_graphs | 22:26 | ||
jnthn | Oh heh | 22:27 | |
Yeah, that's a pretty lazy implementation :) | |||
Though I'm now curious how JSON::Fast triggers that case :) | 22:28 | ||
MasterDuke | i think it's lots of concats where !nfg_is_concat_stable | 22:29 | |
in MVM_string_join | |||
timotimo | it sees \r\n which is < 0 and so it never says is_stable | ||
jnthn | Ah | ||
But that can never combine with anything else, so we could potentially special-case it. | 22:30 | ||
But still, re_nfg could be done in a smarter way | |||
MasterDuke | most were here github.com/MoarVM/MoarVM/blob/mast...ps.c#L1237 `else if (!MVM_nfg_is_concat_stable(tc, pieces[i - 1], piece))` | ||
jnthn has had a headache all evening, so will be pretty useless for thinking much about this :/ | 22:32 | ||
timotimo | yup | 22:33 | |
i had headache yesterday and was also rather useless | 22:36 | ||
jnthn | Had sore throat/ears the last couple of days, and somewhat today, and today gained a stuffy nose, so I guess I'm doing some kind of cold. | 22:37 | |
timotimo | ugh, i hate that kind of thing | ||
oh, it seems like i recurse in parse-string instead of looping somehow | 22:38 | ||
that could explain why it joins often | 22:39 | ||
yeah, if it doesn't see a " after the escape sequence, it'll recurse | 22:41 | ||
that's Bad | |||
but i have a fix | 22:45 | ||
time perl6 -Ilib -e 'use JSON::Fast; from-json(slurp("/tmp/mem_leak.json"))' | 22:48 | ||
1.39user 0.05system 0:01.46elapsed 98%CPU (0avgtext+0avgdata 92652maxresident)k | |||
how do you like this? | |||
MasterDuke | hot damn! | 22:49 | |
timotimo | now only has to be correct | ||
jnthn | Did it get any faster with all the reduction in allocations? :) | 22:50 | |
timotimo | yeah, a bit | 22:57 | |
let me do proper measurements | |||
samcv | timotimo, can it be any json file with \r\n lineendings? | 22:58 | |
timotimo | um ... huh. | ||
no, it got significantly slower | 22:59 | ||
somehow it also reached almost a gig again?!? | 23:05 | ||
grr | 23:08 | ||
i haven't a good clue what's going wrong | 23:10 | ||
can't get it down any further :( | 23:11 | ||
not sure why it was 1/10 in between | |||
MasterDuke | 1/20 even, right? | 23:17 | |
edit history you can undo through? | 23:18 | ||
timotimo | yeah, well | ||
MasterDuke | samcv: if you want the file we've been testing with you can find it here gist.github.com/MasterDuke17/d75d9...747f1dddf0 | 23:24 | |
samcv | mem_leak.json? | ||
there's two there | 23:25 | ||
MasterDuke | yeah | ||
samcv | what's the profiling program yoeu're using MasterDuke | 23:27 | |
heh 2.4GB that's insane | |||
MasterDuke | the memory one? heaptrack | 23:28 | |
you're using arch right? it's in the aur | |||
samcv | yeah | ||
MasterDuke | you have to run moar directly though | ||
i.e., the command that actually is exec'ed in perl6-m | 23:29 | ||
samcv | yeah i get what you mean | ||
;) | |||
why do we even have to re-nfg it. unless it's trying to concat two codepoints that may combine | 23:36 | ||
anyway i've gotto go to dinner | |||
renfging the whole string seems wasteful, but i don't think it should allocate that much space.. | 23:37 | ||
so that needs to be fixed regardless i guess | |||
MasterDuke | well, the comment right above the allocate says `/* Create the output buffer. We used to believe it can't ever be bigger than the initial estimate, but utf8-c8 showed us otherwise. */` | 23:38 | |
but maybe that allocation can be delayed until we know more info and can be more precise about it? | 23:39 | ||
samcv | but that shouldn't account for all the 1.2GB though? | 23:41 | |
it's not that big of a file | 23:42 | ||
MasterDuke | but it's joining lots of pieces, potentially having to re_nfg each piece and the separator each time | 23:43 | |
samcv | yeah | 23:46 | |
i mean we really should only renormalize the end of it | 23:47 | ||
not the whole thing | |||
MasterDuke | what "it"? | 23:49 | |
samcv | uh. string `in` | 23:51 | |
i'm not sure how strands work tbh. but we just need to go back along the string until we're sure it can't combine differently with the thing on the right | 23:52 | ||
and then normalize that section | |||
though i guess it's fed the string after it has already been concatted? | |||
MasterDuke | that certainly sounds faster | ||
well, i think there are two fixes to be made. re_nfg can be made better/faster, but MVM_string_join also shouldn't be calling it in all the cases it does | 23:53 | ||
samcv | yeah | ||
will be back in a few hours or something. bbl dinner | 23:54 | ||
MasterDuke | later... |