| Geth | MoarVM: 32d66d5683 | (Samantha McVey)++ | src/strings/ops.c Speed up index 50% for flat haystack and diff type needle Convert the needle to match the haystack's data type when encountering a flat haystack (very common since Perl 6 flattens the haystack during regex). Also fix a problem in the 32 bit memmem loop. The loop runs memmem again ... (9 more lines) |
00:15 | |
|
00:16
AlexDaniel joined
|
|||
| MasterDuke | samcv: src/strings/ops.c: In function ‘MVM_string_index’: src/strings/ops.c:498:51: error: pointer of type ‘void *’ used in arithmetic [-Werror=pointer-arith] && ( start_ptr = mm_return_32 + 1) /* Set the new start pointer right after where we left off */ | 00:21 | |
| samcv | yep working on it | 00:22 | |
| Geth | MoarVM: afdcad424e | (Samantha McVey)++ | src/strings/ops.c Use char* for pointer addition to please MSVC |
00:23 | |
| MasterDuke | not just msvc, gcc complained for me | ||
| samcv | ah ok | ||
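The compiler error above comes from doing arithmetic on a `void *`, which standard C does not allow; casting to `char *` makes the "+ 1" a well-defined one-byte step. Below is a minimal sketch of that kind of search loop, not the actual MoarVM code. `find_bytes` and `index_32` are hypothetical names, and `find_bytes` is a portable stand-in for `memmem`, which is a GNU/BSD extension. The sketch assumes the loop retries because a byte-wise match can land off a 32-bit boundary; the truncated commit message does not confirm that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: naive byte search, standing in for memmem(). */
static const void *find_bytes(const void *hay, size_t hay_len,
                              const void *needle, size_t needle_len) {
    const char *h = (const char *)hay;
    size_t i;
    if (needle_len == 0) return hay;
    if (hay_len < needle_len) return NULL;
    for (i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(h + i, needle, needle_len) == 0)
            return h + i;
    return NULL;
}

/* Hypothetical: index of the first needle match in a flat 32-bit haystack,
 * counted in 32-bit units, or -1 if there is none. */
static long index_32(const int32_t *hay, size_t hay_graphs,
                     const int32_t *needle, size_t needle_graphs) {
    /* All pointer arithmetic is done on char *, never on void *. */
    const char *base  = (const char *)hay;
    const char *start = base;
    const char *end   = base + hay_graphs * sizeof(int32_t);

    assert(needle_graphs > 0);
    while (start < end) {
        const char *hit = (const char *)find_bytes(
            start, (size_t)(end - start),
            needle, needle_graphs * sizeof(int32_t));
        size_t off;
        if (hit == NULL)
            return -1;
        off = (size_t)(hit - base);
        /* Accept only matches that start on a 32-bit boundary. */
        if (off % sizeof(int32_t) == 0)
            return (long)(off / sizeof(int32_t));
        /* Misaligned hit: resume one byte after where we left off. */
        start = hit + 1;
    }
    return -1;
}
```

With a haystack of `{0x00000100, 0, 1}` and a needle of `{1}`, a little-endian machine first sees a byte match at offset 1, rejects it as misaligned, and then finds the real match at index 2.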
| MasterDuke | hm, didn't really seem to make a difference for me on that code from earlier | 00:24 | |
| samcv | MasterDuke: yeah it doesn't change that code because it is already doing memmem on 8 bit haystack and 8bit needle i believe | ||
| but should improve a ton of other code | 00:25 | ||
| MasterDuke | ah, the 32bit part gets converted to 8bit when flattened? | ||
| samcv | yeah | ||
| MasterDuke | oh well, a good optimization anyway | 00:26 | |
| samcv | it makes indexing a single codepoint needle 2x as fast cool | 00:33 | |
| index with word needle from 2.0685436 to 0.3240215 | |||
| that's pretty big change | |||
| this is my test file gist.github.com/adc8d50df303e457ed...b85b66e94e | 00:35 | ||
| japhb | samcv: Ready to bump nqp/rakudo? | ||
| samcv | that is fine with me. i gotta go to dinner but feel free | 00:36 | |
| 0.5x-3x faster it seems heh for the newly handled conditions | 00:37 | ||
| not bad | |||
|
01:25
travis-ci joined
|
|||
| travis-ci | MoarVM build failed. Samantha McVey 'Speed up index 50% for flat haystack and diff type needle | 01:25 | |
| travis-ci.org/MoarVM/MoarVM/builds/359632075 github.com/MoarVM/MoarVM/compare/1...d66d568376 | |||
|
01:25
travis-ci left
01:54
FROGGS joined
01:56
ilbot3 joined
|
|||
| AlexDaniel | well, there we go: R#1667 | 02:07 | |
| synopsebot | R#1667 [open]: github.com/rakudo/rakudo/issues/1667 [perf] Some string benchmark | ||
| MasterDuke | AlexDaniel: are those numbers for the other langs from the article? or did you run them yourself? | 02:10 | |
| AlexDaniel | I did run myself of course | ||
| MasterDuke | cool | ||
| AlexDaniel | edited the issue a bit to clarify that… | 02:11 | |
| MasterDuke | oh, samcv said 32d66d5683 wouldn't really help there | ||
| i think looking at the code inspired the change though | 02:12 | ||
| AlexDaniel | oh | ||
| MasterDuke: where? | 02:13 | ||
| MasterDuke | AlexDaniel: irclog.perlgeek.de/moarvm/2018-03-29#i_15978231 | 02:14 | |
| AlexDaniel | ok | 02:16 | |
|
03:52
nativecallable6 joined,
reportable6 joined,
quotable6 joined
03:54
camelia joined
04:25
bartolin joined
04:49
notable6 joined
|
|||
| samcv | MasterDuke, AlexDaniel` working on the collapse_strands issue now | 05:12 | |
| going to extend the grapheme iterator functions to let me move to the next strand. what we already do is start with a MVMGrapheme8 string, and iterate the string into it. if we get something that won't fit in 8 bits we abort and copy that 8 bit buffer into a 32 bit buffer then continue with the iterator | 05:14 | |
| copying from an 8bit buffer into a 32 bit buffer is pretty fast. much much faster than using an iterator since it's a very tight loop. instead what we will do is use memcpy to copy any 8bit strands into the new buffer instead of using the iterator | |||
| then using a new graphemeiterator_next_strand function to move to the next strand/repetition after we've done the memcpy | 05:15 | ||
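The plan described above can be sketched roughly as follows. This is an illustration under stated assumptions, not MoarVM's implementation: `Strand`, `Flat`, `fits_8bit` and `collapse` are hypothetical names, graphemes are modelled as plain `int8_t` / `int32_t`, and strand repetitions are ignored. It starts with an 8-bit output buffer, uses `memcpy` for 8-bit strands while the output is still 8-bit, and widens to 32-bit in one tight loop the first time a grapheme does not fit.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical strand: a run of graphemes stored as 8-bit or 32-bit. */
typedef struct {
    int            is_8bit;
    const int8_t  *g8;   /* used when is_8bit is nonzero */
    const int32_t *g32;  /* used when is_8bit is zero */
    size_t         len;
} Strand;

/* Hypothetical flat result; exactly one of g8 / g32 is non-NULL on success. */
typedef struct {
    int      is_8bit;
    int8_t  *g8;
    int32_t *g32;
    size_t   len;
} Flat;

static int fits_8bit(int32_t g) { return g >= -128 && g <= 127; }

/* Collapse strands into one flat buffer. Returns 0 on success, -1 on OOM. */
static int collapse(const Strand *strands, size_t n, Flat *out) {
    size_t total = 0, pos = 0, i, j, k;
    for (i = 0; i < n; i++) total += strands[i].len;

    out->is_8bit = 1;
    out->len     = total;
    out->g32     = NULL;
    out->g8      = (int8_t *)malloc(total ? total : 1);
    if (!out->g8) return -1;

    for (i = 0; i < n; i++) {
        const Strand *s = &strands[i];
        if (s->is_8bit) {
            if (out->is_8bit) {
                /* Fast path: bulk copy, no per-grapheme iteration. */
                memcpy(out->g8 + pos, s->g8, s->len);
            } else {
                /* Output already widened: tight 8 -> 32 bit loop. */
                for (j = 0; j < s->len; j++) out->g32[pos + j] = s->g8[j];
            }
            pos += s->len;
            continue; /* move on to the next strand */
        }
        for (j = 0; j < s->len; j++) {
            int32_t g = s->g32[j];
            if (out->is_8bit && !fits_8bit(g)) {
                /* First grapheme that does not fit: switch to 32-bit. */
                int32_t *wide = (int32_t *)malloc((total ? total : 1) * sizeof *wide);
                if (!wide) { free(out->g8); out->g8 = NULL; return -1; }
                for (k = 0; k < pos; k++) wide[k] = out->g8[k];
                free(out->g8);
                out->g8      = NULL;
                out->g32     = wide;
                out->is_8bit = 0;
            }
            if (out->is_8bit) out->g8[pos++] = (int8_t)g;
            else              out->g32[pos++] = g;
        }
    }
    return 0;
}
```

The 8-to-32-bit loops are the part that a SIMD widening routine could later replace without changing the rest of the structure.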
| MasterDuke: and i think i figured out how to get converting 8bit to 32bit strings to use vector SIMD operations | 05:22 | ||
| not going to alter that until i finish this. but would be cool if that can speed things up a lot | 05:23 | ||
| 6 | 05:58 | ||
|
06:27
domidumont joined
06:32
robertle joined
06:33
domidumont joined
06:53
dogbert17 joined
07:48
zakharyas joined
08:00
zakharyas joined
08:07
zakharyas joined
08:30
dogbert17 joined
09:00
zakharyas joined
09:07
zakharyas joined
09:59
brrt joined
|
|||
| brrt | good * | 09:59 | |
|
10:07
zakharyas joined
|
|||
| timotimo | yo brrt :) | 10:19 | |
| brrt: did you see my question about interp_cur_op and the jit? | 10:21 | ||
| brrt | i did not | ||
| timotimo | basically, i'm writing code that'll be running when throwpayloadlexcaller runs and i'm wondering if using interp_cur_op gives me a sensible idea of what handlers (only inlines are relevant for this) we're currently in | 10:22 | |
| brrt | it does not | 10:23 | |
| we should have a function spesh_get_inline_by_position | |||
| and a delegate, jit_get_inline_by_position | |||
| timotimo | i need the amount of inlines we're in at that point, though :) | ||
| brrt | if the current frame is JITTed | ||
| then you need a spesh_get_inline_depth(inline_nr) | 10:24 | ||
| in general, though | |||
| screw interp_cur_op :-P | |||
| especially from the PoV of the JIT | |||
| timotimo | i just need something, anything ;) | 10:25 | |
| doesn't have to be interp_cur_op | |||
| brrt | why do you need to know about the inline structure? | 10:31 | |
| what do you need to know about it | |||
| timotimo | well, there's this thing in the profiler where we call prof_exit whenever we leave a frame | 10:32 | |
| but if a frame is left via throwpayloadlexcaller, we skip over a prof_exit command | |||
| (they are inserted into the bytecode) | |||
| so when we unwind, we have to realize that and properly remove the exact right amount of inlined frames | |||
| lizmat | hmmmm "perl6 --profile -Msnapper -e 'start {}; sleep 1'" reliably segfaults for me | 10:38 | |
| should I make a ticket, timotimo? | 10:39 | ||
| timotimo | huh, that's funny | 10:44 | |
| it calls an extop that's out of bounds or something? | |||
| lizmat | perhaps starting a thread at compile time ? | 10:47 | |
| hmmm... | |||
| timotimo | no, threads just call into existing bytecode, and even if we do bytecode generation at run time it goes through the validator | 10:48 | |
| we're possibly jumping into bytecode at an improper alignment and reading one byte off to the side or something | |||
|
10:50
dalek joined,
Geth joined,
p6lert joined,
synopsebot joined
10:51
SourceBaby_ joined
10:52
SourceBaby joined
11:47
domidumont joined
12:00
Voldenet joined
13:23
zakharyas joined
13:40
zakharyas joined
14:01
Util joined
14:07
AlexDaniel joined
14:27
zakharyas joined
14:29
FROGGS joined
|
|||
| dogbert17 | .seen timotimo | 15:31 | |
| yoleaux | I saw timotimo 11:14Z in #perl6-dev: <timotimo> eating memory really, really, really fast is often an infinite recursion | ||
|
16:07
zakharyas joined
16:12
domidumont joined
16:13
zakharyas joined
16:54
zakharyas joined
17:04
zakharyas joined
17:13
zakharyas joined
18:28
Kaiepi joined
|
|||
| Kaiepi | i still don't quite understand what was meant by github.com/MoarVM/MoarVM/pull/824#...-375955795 | 18:39 | |
| can someone explain in more detail? | 18:40 | ||
|
19:16
Kaiepi joined
19:27
zakharyas joined
19:30
robertle_ joined
19:47
Kaiepi joined
19:48
zakharyas joined
|
|||
| timotimo | i'm not sure if niner is correct in his assumption here | 19:51 | |
| but he has done a whole lot of nativecall hacking, whereas i did not | |||
| Kaiepi | i need to test more cases, but i was able to get this to work with the additional changes to nqp and rakudo needed hastebin.com/niqucumaju.cpp | 19:56 | |
| yeah, it complains about malformed utf8 when i test for Str with characters outside ascii's range | 20:07 | ||
| timotimo | well, you'll still need to set its encoding to something other than utf8 | 20:12 | |
| Kaiepi | like this? hastebin.com/kimeginata.pl | 20:30 | |
| oh i got it | 20:33 | ||
| or not | 20:34 | ||
| it only works the first time running test...? hastebin.com/ijobayuluz.pl | 20:36 | ||
| lizmat | if the JIT log says something like: "Cannot get template for: gt_n", it means what it says, right? | 20:43 | |
| that nobody has implemented a JIT template for nqp::gt_n ? | |||
|
20:50
FROGGS joined
|
|||
| timotimo | do we have floating point stuff in the expr jit yet? | 20:54 | |
|
21:03
Kaiepi joined
|
|||
| lizmat | other terms I got: clone, prepargs, sp_findmeth, checkarity, coerce_ni, bindattrs_o etc | 21:04 | |
| lizmat takes an early night | 21:15 | ||
|
21:16
Kaiepi joined
|
|||
| Kaiepi | for the wchar_t stuff, how would i go about debugging what's going on in moarvm? | 21:30 | |
| should moar be built with -j<core count> by default? | 22:13 | ||
| i tested make -j8 and moar built much more quickly without any of the files getting compiled out of order | 22:16 | ||
| timotimo | i usually make -j which basically starts all jobs immediately | 22:39 | |
| it's nice and fast and doesn't go wrong at all ever | |||
| (in moarvm) | |||
|
22:46
lizmat joined
22:47
MasterDuke joined
|
|||
| Kaiepi | i think i might leave it to the --make flag, or add a --makejobs flag | 23:13 | |
| detecting how many cores to use isn't feasible for certain oses without Sys::CPU or Sys::Info being available in their package manager | 23:14 | ||
| MasterDuke | timotimo: i just realized your branch to fix large profiles might mean i can profile the rakudo build again | ||
| timotimo | oh, you can try | ||
| it's not entirely correct, though | 23:15 | ||
| MasterDuke | gonna spin up the vm and give it a shot | ||
| timotimo: `Stage optimize : Profiling is already started at <unknown>:1 (<ephemeral file>:) ...` | 23:30 | ||
| timotimo | oh, look, that's fascinating | 23:31 | |
| MasterDuke | oh. i had --profile-stage=optimize | ||
| timotimo | so maybe that feature is currently also busted. one more for the road | 23:32 | |
| MasterDuke | `MoarVM panic: Profiler lost sequence` with just --profile-compile | ||
| timotimo | can you turn off inlining? | ||
| MasterDuke | hasn't died yet... | 23:35 | |
| heh, `Stage parse : 604.735` | 23:44 | ||
| timotimo: huh. just --profile-compile didn't die, and took much longer like it usually does, but no profile was created | 23:52 | ||
| this is with your branch and MVM_SPESH_INLINE_DISABLE=1 | |||