02:08
Kaiepi joined
|
|||
timotimo | we used to turn spesh off after removing profiler instrumentation "due to bugs", but since the "spesh worker" rework, i don't think we respect a change in spesh_enabled | 02:12 | |
MasterDuke: i think that's just the browser not having run the angular code yet to hide all the "only show one tab at a time" code, or maybe some function somewhere throws an exception before it can do that or something | 02:15 | ||
do you remember when i said graphviz might be done when i come back? | 02:20 | ||
killed it now; it also used up about 4 gigs of ram per process - including the one gig each graphviz process was using | 02:21 | ||
dogbert11, MasterDuke: well, i figured out how to make it about 3x faster and the profile also shrinks significantly; just turn $num into a my uint8 @digits and using .skip and .head to get the subrange | 02:32 | ||
i.e. the profile is tiny even if you use the full number | 02:35 | ||
at this point it's impossible to get an accurate time measurement for this script, it's just too fast :) | 02:36 | ||
02:56
ilbot3 joined
|
|||
MasterDuke | timotimo: huh. any idea what causes the profile to blow up with the original script? | 03:09 | |
timotimo | not really, no :( | 03:33 | |
MasterDuke | timotimo: well, want a completely unrelated question? | 03:35 | |
timotimo | maybe :) | ||
MasterDuke | RefiedList's push-all, github.com/rakudo/rakudo/blob/mast...2560-L2570 | 03:36 | |
to make that faster, would we need the nqp::subbuf (or something like that) that lizmat was mentioning recently? | |||
so we could just append the entire $!reified to $target at once, instead of pushing an nqp::atpos in a loop | 03:38 | ||
timotimo | yeah. though perhaps looking at spesh log and jit output points out something interesting | ||
MasterDuke | i don't really know enough about those to get much out of them | 03:39 | |
fwiw, this is what i'm testing with: `my @a = ^100000; my @b; @b = @a[$_, $_+1, $_+2] for ^99997; say @b` | |||
timotimo | is the push-all part where it assigns ^100_000 to @a? wouldn't that use a custom iterator for Range? | 03:42 | |
MasterDuke | no, it's getting called 99997 times | 03:43 | |
wait, maybe it's 199994 times | 03:45 | ||
well, i'm off to sleep, later... | 03:51 | ||
03:54
mtj_ joined
|
|||
timotimo | later | 04:00 | |
06:17
dalek joined
06:56
domidumont joined
07:04
domidumont joined
07:26
hoelzro_ joined,
SmokeMachine joined
07:28
dalek joined,
synopsebot joined,
SourceBaby joined
07:30
shareable6 joined,
bloatable6 joined,
releasable6 joined,
evalable6 joined,
reportable6 joined,
unicodable6 joined,
greppable6 joined,
benchable6 joined,
statisfiable6 joined
08:01
robertle joined
|
|||
samcv | jnthn: if we are in decodestream and it steps decoding since it pulled the specified number of bytes requested, but it expected a following byte since the last byte was a starter for a two byte sequence | 08:09 | |
should we throw? | |||
timotimo | i think we'd hold on to the last grapheme until more bytes are fed | 08:10 | |
samcv | timotimo: you mean there's another invocation of the function? | ||
i mean at the end of the function | |||
timotimo | which function are we talking about exactly? | ||
samcv | github.com/samcv/MoarVM/blob/shift...#L416-L420 | 08:12 | |
shiftjis_decodestream | |||
timotimo | yeah, i do believe decodestream will be called more than once | ||
samcv | really? | ||
hm | |||
timotimo | that's why it's "discard_to" and not "discard_everything" at the end | 08:13 | |
samcv | so what do i want to do? | ||
it should still throw if it reaches the end of the stream and Shift_JIS_lead is set. and i'd need to retain the value across calls | |||
timotimo | save the Shift_JIS_lead in the decodestream for the next time bytes are fed into the stream | ||
you don't have to retain the value specifically, just leave the last byte in the buffer for the next time | 08:14 | ||
that's what discard_to already does | |||
so the next time bytes have been fed they should already be in the right place | |||
samcv | ah ok. i believe it would be DECODE_CONTINUE returned from the decoder_handler function | 08:15 | |
so then it should exit the loop i guess | |||
and won't have increased the last_accept_pos | |||
timotimo | i didn't look closely, but that sounds right | 08:16 | |
samcv | so i think it should be already good then. timotimo how do i make sure it throws when we reach the end of the stream? | ||
timotimo | i could imagine when something like decodestream_eof is called and there's still bytes left in the buffer it'll throw by itself? i'd have to look through the code | ||
samcv | hmm | 08:17 | |
timotimo | or maybe rakudo is responsible for throwing when there's bytes left after an eof + decode | 08:18 | |
samcv | hmm. timotimo how would i test this. and make sure it doesn't throw when we try pulling it by byte | 08:20 | |
and throws when it should | |||
i'm not sure how to control it from nqp/perl6 like that | 08:21 | ||
08:22
committable6 joined
|
|||
timotimo | probably using decoderaddbytes and then decodertakeallchars | 08:23 | |
samcv | gonna try removing that throw at the end and see if it ends up throwing by itself | 08:25 | |
08:26
bisectable6 joined,
squashable6 joined
|
|||
samcv | yeah nope it doesn't throw | 08:27 | |
timotimo | hm, OK | 08:28 | |
samcv | and last_accept_pos is set to 0 on return, so that should indicate it didn't process any bytes | 08:29 | |
maybe hmm | 08:30 | ||
maybe i need to pos-- if there's a byte left over | |||
timotimo | hm, how does it behave with utf8 with a two-byte-starter-beginning-byte at the end and then EOF? | 08:31 | |
samcv | ah so it was fine already. last_accept_pos == 0 is where it starts out. since it does a continue it doesn't set the last accept pos variable | 08:33 | |
timotimo: i don't see any throwing inside the utf8 decodestream function at the end | 08:36 | ||
timotimo | and nothing in the rest of moar or rakudo either? | 08:37 | |
hmm. | |||
could just be an oversight, or could be intentional | |||
samcv | and it doesn't throw | ||
if i write invalid bytes and then read it | |||
Buf.new(0xe2, 0x99).decode #> throws | |||
my $fh = open '/tmp/temppp', :w; $fh.write: Buf.new(0xe2, 0x99); $fh.close; my $fh2 = open '/tmp/temppp', :r; $fh2.slurp | 08:38 | ||
no throwing | |||
timotimo | hm, so decodestream and decoder-without-stream behave differently? | ||
samcv | yep | 08:39 | |
only at the end of the stream though | |||
both will throw if it happens in the middle. but if it's at the end and expecting more bytes it seems to not throw | |||
timotimo | perhaps something somewhere needs to have a nqp::decoderempty call and throw if it returns 0 | 08:44 | |
08:53
coverable6 joined,
quotable6 joined
|
|||
Geth | MoarVM: a3bd17b4f0 | (Samantha McVey)++ | src/strings/windows1252.c Fix windows1252/1 decodestream replacement with last pos repl. If the last position byte is bad and there is a replacement, we would exit the while loop since we had moved past the last position. Ensure that the loop continues until we write the full replacement. This bug only occured with replacements longer than one character. Example: "abcX" where X is the bad character and replacement = "123" would result in "abc1" instead of the correct result of "abc123" |
09:00 | |
samcv | after shiftjis gets merged moarvm will support 98.1% of the websites on the internet | 09:05 | |
timotimo | what encodings are missing? | 09:06 | |
samcv | w3techs.com/technologies/overview/...coding/all | ||
timotimo | "all" :D | ||
samcv | GB2312 is the chinese non-unicode encoding | ||
i think the main difference between how shiftjis is implemented original standard JIS X 0208:1997 is that \ is replaced with ¥ and ~ is an overscore | 09:10 | ||
i guess i could have a variation, but i think the variants that leave ascii alone are more common | |||
may have to look into that a bit more. but i think that is how it is | 09:11 | ||
ah no. there's just an issue with my code | 09:12 | ||
encoding.spec.whatwg.org/#shift_jis-encoder i think step 2 should be moved after step 4 | 09:14 | ||
or i guess it just needs to move after step 3 | |||
though it's also that way for the decoder.. | 09:15 | ||
09:27
yoleaux joined
|
|||
samcv | hmm ok it seems like my encoder is doing the same thing that iconv does. it just turns a \ into a yen sign | 10:07 | |
since maybe shiftjis has no backslash at all | |||
ok so this i implemented is windows-31J which keeps the ascii mappings | 10:15 | ||
10:32
dogbert2 joined
10:37
dogbert2 joined
|
|||
dogbert2 | timotimo: so perhaps it's best to run the profiler with spesh disabled for the time being ? | 10:56 | |
lizmat | samcv: got a consistent error on S32-str/encode.t now: # Looks like you planned 38 tests, but ran 36 | 11:53 | |
12:45
AlexDaniel joined
13:28
SourceBaby joined
13:32
SourceBaby joined
13:35
SourceBaby joined
14:57
unicodable6 joined
16:01
Voldenet joined
16:46
robertle joined
17:58
Kaiepi joined
18:05
committable6 joined
18:11
bisectable6 joined
18:16
domidumont joined
18:25
domidumont joined
18:26
Kaiepi joined
18:31
domidumont joined
18:33
domidumont joined
19:04
domidumont joined
|
|||
dogbert11 profiling a script generated a 155 meg html file | 19:05 | ||
dogbert11 and a SEGV | 19:23 | ||
19:45
Kaiepi joined
|
|||
samcv | lizmat: hmm i'm not seeing that | 20:15 | |
MasterDuke | samcv: think it was fixed with github.com/perl6/roast/commit/64a0eae869 | 20:18 | |
20:23
Kaiepi joined
21:07
Kaiepi joined
|