01:19 zakharyas joined
02:48 ilbot3 joined
03:39 agentzh joined
06:28 mtj_ joined
06:30 mtj_ joined
07:06 domidumont joined
07:11 domidumont joined
08:05 brrt joined
brrt | good *, #moarvm | 08:06
| i almost had a fix for the round-to-zero error... | 08:17
samcv | jnthn: any reason failed UTF-8 kills the VM? is there a way to make it not totally die? | 08:34
| hello brrt
| err, i guess you're gone
nwc10 | good *, #moarvm | 08:35
samcv | :)
08:58 brrt joined
brrt | but, it's not correct | 08:59
| the tl;dr is: the interpreter uses one algorithm for round-towards-minus-infinity and the JIT uses another, which causes the JIT and the interpreter to disagree in the range [-1, 0] | 09:00
| i replaced that with a check of the operand and modulus, but that doesn't work either, since if you divide by a negative number the modulus may be positive | 09:03
| so it needs to be (result < 0 && modulus != 0 || result == 0 && modulus < 0) | 09:04
| but there's not really a cute and quick expression for that... | 09:05
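For reference, a minimal standalone C sketch of the adjustment brrt is describing (the textbook way to derive floored division from C's truncating division; this is not MoarVM's actual interpreter or JIT code):

    #include <stdio.h>

    /* C's / and % truncate towards zero; to round towards negative
     * infinity instead, decrement the truncated quotient whenever the
     * remainder is nonzero and the operands have opposite signs. */
    long div_floor(long num, long denom) {
        long quot = num / denom;   /* truncated quotient */
        long rem  = num % denom;   /* remainder carries the sign of num */
        if (rem != 0 && ((num < 0) != (denom < 0)))
            quot -= 1;
        return quot;
    }

    int main(void) {
        /* the disagreement range mentioned above: -1/2 truncates to 0
         * but floors to -1 */
        printf("%ld %ld\n", div_floor(-1, 2), div_floor(1, -2)); /* -1 -1 */
        return 0;
    }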
jnthn | samcv: Uh... I thought we already fixed all the places where that happened? | 09:45
| Unless somebody introduced a new one :P | 09:46
| There is still some async string I/O code that can do so, but all of the ops leading to it are deprecated and can be tossed already. | 09:47
10:18 brrt joined
10:25 dogbert17_ joined
samcv | oh | 10:39
| dunno. but i remember it crashing
jnthn | Now I've had coffee, the obvious question I shoulda asked is "what code did you run to make this happen" :) | 10:40
samcv | m: Buf.new((0..255).pick(100)).decode('utf8'); say 'hi' | 10:45
camelia | Malformed UTF-8 at line 1 col 1 in block <unit> at <tmp> line 1
samcv | it never gets to the `say 'hi'` part
dogbert17_ | m: say 'Hi' | 10:50
camelia | Hi
jnthn | You didn't catch the exception
| m: try Buf.new((0..255).pick(100)).decode('utf8'); say 'hi' | 10:51
camelia | hi
dogbert17_ | jnthn: here's idiomatic :) code you can run in order to repro the decoderstream bug: gist.github.com/dogbert17/037f464c...95fae63ed9
jnthn | samcv: Maybe you're wanting a way to do a replacement char or something, though? | 10:55
| In which case that's NYI | 10:56
samcv | not really. just want it to continue with the rest of the script
| so maybe i'm wrong, but i thought i put it in a try block and it still messed it all up. i'll let you know if i have any code | 10:57
jnthn | OK :) | 10:59
| Yeah, it's meant to throw a catchable exception
brrt | so, here's an idea | 11:09
| what if we add an opcode that reliably prevents JITting?
| so you could add nqp::dont_jit_me() or nqp::no_jit();
| and just always bail when we see that | 11:10
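A hypothetical, self-contained C sketch of that idea; every name here is invented for illustration, and none of this is MoarVM's actual JIT graph builder:

    #include <stddef.h>

    /* invented opcode set; OP_NO_JIT stands in for nqp::no_jit() */
    enum op { OP_ADD, OP_NO_JIT, OP_RETURN };

    /* returns 1 if the instruction stream may be JIT-compiled, or 0 to
     * bail out and leave the frame to the interpreter */
    static int can_jit(const enum op *ops, size_t n) {
        for (size_t i = 0; i < n; i++)
            if (ops[i] == OP_NO_JIT)
                return 0;   /* always bail when we see the marker op */
        return 1;
    }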
dogbert17_ | jnthn, timotimo: could this be the break we've been looking for? gist.github.com/dogbert17/e0a63ea2...f657313c1e | 11:18
jnthn | Yes, that looks very wrong o.O | 11:22
| Can you MVM_dump_backtrace(tc) on each of those threads?
Geth | MoarVM: be37105fea | (Bart Wiegmans)++ | src/jit/emit_x64.dasc | 11:29
| Fix div_i JIT round to negative infinity
| Compute a bias as (modulo < 0) & ((num < 0) ^ (denom < 0)); this gets rid of a conditional move and branches that would achieve the same.
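In plain C, a branchless bias in the same spirit could look like the sketch below (a generic reconstruction, not the DynASM code in src/jit/emit_x64.dasc; this version derives the bias from rem != 0 rather than from the sign of the modulo):

    /* branchless floored division: subtract a 0-or-1 bias from the
     * truncated quotient when truncation rounded the wrong way */
    long div_floor_branchless(long num, long denom) {
        long quot = num / denom;
        long rem  = num % denom;
        long bias = (long)(rem != 0) & (long)((num < 0) ^ (denom < 0));
        return quot - bias;   /* e.g. -1/2: quot 0, bias 1, result -1 */
    }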
brrt | i should probably update MoarVM and rakudo to point to this new version | 11:30
IOninja | This fixes the div_i JIT bug? | 11:31
brrt | ... how is that done, anyway?
| yes
IOninja | brrt++ sweet
brrt | IOninja: if you could do it in my stead, i'd be grateful
| (i would go to lunch, in that case)
IOninja | Version bumps? I need to get ready for work | 11:32
| I can do it in 1.5 hours.
brrt | hmm
| i can probably do it as well
dogbert17_ | sry, had to run away for a while | 12:11
| jnthn, I could only get one trace: gist.github.com/dogbert17/e0a63ea2...f657313c1e | 12:37
| I have yet another gist which I believe clarifies things a bit, don't you think? gist.github.com/dogbert17/4d360c57...dd30703b93 | 13:25
jnthn | Hmm... yes, that's interesting | 13:28
dogbert17_ | how should that sequence of events be avoided? | 13:36
dogbert17_ is happy that progress has been made wrt the decoderstream bug :) | 13:51
jnthn | dogbert17_: I'll have to take a closer look to figure out exactly which of the troublesome scenarios is going on | 13:55
timotimo | i've opened this gist three times in a row now and i'm not quite sure i understand it... but it's gone into GC in the middle?
| i mean... it went into GC in one thread while it was (still? already?) doing something with the same decoder in another? | 13:56
13:59 Guest45345 joined
dogbert17_ | one thread sets the 'in-use' flag (via MVM_decoder_take_available_chars), then goes into GC, and when the other thread tries to get into single-user mode things asplode :) | 14:03
dogbert17_ | I think :) | 14:04
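A generic C11 sketch of the single-user guard being described (MoarVM's Decoder.c uses its own atomics wrappers; the names here are illustrative). The failure mode above falls out directly: the flag stays set while its owner is parked in GC, so any other thread's claim fails in the meantime:

    #include <stdatomic.h>

    typedef struct {
        atomic_flag in_use;   /* the 'in-use' flag; init with ATOMIC_FLAG_INIT */
    } decoder_t;

    /* returns 0 when another thread already holds the decoder; the real
     * code then throws "Decoder may not be used concurrently" */
    static int enter_single_user(decoder_t *d) {
        return !atomic_flag_test_and_set(&d->in_use);
    }

    /* if the owning thread is interrupted for GC between enter and exit,
     * the flag remains set and every other thread's enter fails */
    static void exit_single_user(decoder_t *d) {
        atomic_flag_clear(&d->in_use);
    }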
jnthn | It's possible that the decoder object moves and we end up zeroing the wrong thing at exit, thanks to said GC | 14:12
| But that may not entirely explain it | 14:13
| Because we're *in* GC
| And in the decoder
| And nothing else should be
| So even if we've got a single-use bug, I still think we've got a real bug too
dogbert17_ | could it be bad enough to mess up bigints? | 14:18
timotimo | .o( when do bigints come out of the decoder? ) | 14:19
jnthn | If it's really a "thread is running when it should be GCing or blocked" bug then yes
| But I'd expect it to fail much more explosively and variedly than we're seeing
dogbert17_ | we've seen lots of nasty stuff when running harness6: SEGVs, corrupted linked lists, etc. | 14:20
| I still have gdb running (the latest gist); which thread should I run MVM_dump_backtrace on? | 14:21
dogbert17_ notes that another thread seems to be doing gc as well, thread 2 | 14:22
14:23 lizmat joined
dogbert17_ meant thread 3 | 14:23
| #0 AO_load_acquire (addr=0x804c248) at 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/gcc/generic-small.h:490 # thread 3 | 14:24
| #1 finish_gc (is_coordinator=<optimized out>, gen=<optimized out>, tc=<optimized out>) at src/gc/orchestrate.c:195
| #2 run_gc (tc=tc@entry=0x9fa41c0, what_to_do=what_to_do@entry=1 '\001') at src/gc/orchestrate.c:333
| #3 0xb7c5a7e6 in MVM_gc_enter_from_interrupt (tc=tc@entry=0x9fa41c0) at src/gc/orchestrate.c:511
| i.e. to the untrained eye it seems as if two threads are doing gc at the same time (which must be wrong) | 14:25
timotimo | no, gc is multithreaded
dogbert17_ | aha, thx. I did update the gist with data from two other threads; there's lots of gc happening at the same time | 14:28
timotimo | that's interesting. two threads are waiting for gc to finish, but two other threads are still working towards their goals | 14:30
| finish_gc is basically "let's wait for all other threads to signal that they're finished"
| and collect_free_nursery_uncopied is for cleaning up stuff that has died but can't just be ignored (i.e. objects that have pointers to buffers or own file handles) | 14:31
dogbert17_ | timotimo, here's the golfed code if you wish to try it: gist.github.com/dogbert17/037f464c...95fae63ed9 | 14:33
timotimo | how many runs did it take you to crash that? | 14:34
| oh, that was quick | 14:35
dogbert17_ | it crashes after a while, just let it run (check that you have jnthn's fixes for src/6model/reprs/Decoder.c) | 14:36
timotimo | oh, i might not have it
| no, i do have it | 14:37
dogbert17_ | cool
| it should throw an exception after a while | 14:39
| unless you have a breakpoint around line 107 | 14:40
timotimo | yeah, it gets to the decoder-multi-use exception quickly | 14:41
| am i supposed to get that from multiple threads at the same time? | 14:42
dogbert17_ | good question :)
timotimo | but you get it too?
dogbert17_ | yes
timotimo | i.e. multiple lines of "unhandled exception on thread n"?
dogbert17_ | yup
| sometimes it shows up immediately | 14:43
timotimo | and only one of them has "decoder may not be used concurrently"
dogbert17_ | yes
| sometimes it crashes on the first run, like this: | 14:45
| Starting...
| Unhandled exception in code scheduled on thread 3
| Unhandled exception in code scheduled on thread 8
| Decoder may not be used concurrently
timotimo | have you ever gotten it to crash under valgrind? | 14:55
dogbert17_ | haven't tried | 14:56
dogbert17_ tries
| valgrind and mt-programs are a bit meh | 14:57
timotimo | yeah
dogbert17_ | ASAN might be better | 14:58
timotimo | (quoting the helgrind docs) "It is also possible to mark up the effects of thread-safe reference counting using the ANNOTATE_HAPPENS_BEFORE, ANNOTATE_HAPPENS_AFTER and ANNOTATE_HAPPENS_BEFORE_FORGET_ALL macros. Thread-safe reference counting using an atomically incremented/decremented refcount variable causes Helgrind problems because a one-to-zero transition of the reference count means the accessing thread has exclusive ownership of the associated resource (normally, a C++ object) and can therefore access it (normally, to run its destructor) without locking. Helgrind doesn't understand this, and markup is essential to avoid false positives." | 15:02
| do we have anything that'd have the same problem as this here?
| "However, it is common practice to implement memory recycling schemes. In these, memory to be freed is not handed to free/delete, but instead put into a pool of free buffers to be handed out again as required. The problem is that Helgrind has no way to know that such memory is logically no longer in use, and its history is irrelevant. Hence you must make that explicit, using the VALGRIND_HG_CLEAN_MEMORY client request to specify the relevant address ranges. It's easiest to put these requests into the pool manager code, and use them either when memory is returned to the pool, or is allocated from it." | 15:03
| ^- it feels like this definitely applies
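For the refcounting half, the markup the docs describe might be applied like this (a sketch with a hypothetical refcounted object; the macros themselves are real and come from <valgrind/helgrind.h>):

    #include <stdatomic.h>
    #include <stdlib.h>
    #include <valgrind/helgrind.h>

    typedef struct {
        atomic_int refs;
        /* ... payload ... */
    } obj_t;

    void obj_release(obj_t *o) {
        /* order all our prior accesses to o before whoever observes
         * the decrement */
        ANNOTATE_HAPPENS_BEFORE(&o->refs);
        if (atomic_fetch_sub(&o->refs, 1) == 1) {
            /* one-to-zero transition: we now own o exclusively without
             * a lock, which helgrind cannot infer without this markup */
            ANNOTATE_HAPPENS_AFTER(&o->refs);
            free(o);
        }
    }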
timotimo | okay, the program dies under valgrind, but it doesn't output anything interesting | 15:07
| oh, neat | 15:11
| my valgrind support branch has made it into master
| i wasn't sure if it had
dogbert17_ | what does that support branch do? | 15:12
timotimo | you can call Configure.pl with --valgrind | 15:13
| and it'll put redzones into the fixed-size allocator (FSA) and also tell valgrind more clearly how that allocator operates
| i.e. instead of getting "address is 10000000 bytes into a block of size 99999999" it'll give a fine-grained resolution of those chunks of data | 15:14
| and when something writes past its allowed size in the FSA, it'll asplode
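The "tell valgrind how the allocator operates" part amounts to describing each handed-out chunk as a malloc-like block with redzones, roughly as below (a simplified sketch: take_from_freelist and return_to_freelist are hypothetical stand-ins for the FSA's internals, while the client requests in <valgrind/memcheck.h> are real):

    #include <stddef.h>
    #include <valgrind/memcheck.h>

    struct pool;  /* hypothetical pool type standing in for the FSA */
    char *take_from_freelist(struct pool *p, size_t bytes);         /* hypothetical */
    void  return_to_freelist(struct pool *p, char *c, size_t bytes);

    enum { REDZONE = 16 };  /* illustrative redzone size */

    void *pool_alloc(struct pool *p, size_t size) {
        char *chunk = take_from_freelist(p, size + 2 * REDZONE);
        char *user  = chunk + REDZONE;
        /* register the user area as its own allocation, with REDZONE
         * bytes of no-man's-land either side: overruns now "asplode",
         * and error reports get per-chunk offsets */
        VALGRIND_MALLOCLIKE_BLOCK(user, size, REDZONE, 0 /* not zeroed */);
        return user;
    }

    void pool_free(struct pool *p, void *user, size_t size) {
        VALGRIND_FREELIKE_BLOCK(user, REDZONE);
        return_to_freelist(p, (char *)user - REDZONE, size + 2 * REDZONE);
    }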
dogbert17_ | very cool | 15:15
timotimo | ayup | 15:16
| it's not quite as simple to make the same thing work for the nursery, though :(
| so i didn't tackle that the last time i worked with this code
| but telling helgrind that the fromspace is "now completely freed" should be easy | 15:17
| i was hoping i could give locks names with annotations, but it doesn't seem to allow for that
timotimo | oh, would you look at that | 15:18
| helgrind's docs point out that posix condition variables aren't properly supported
| but libuv uses them for stuff
| the docs say it'll generate lots of false positives if you do use them, and i do see reports from inside pthread_cond_wait and such | 15:19
jnthn | We use them for "stuff" too
| Like, every Promise :P | 15:20
timotimo | oh, ok | 15:21
| helgrind docs say to please use semaphores instead
| here i see it's detecting a data race on fsa->freelist_spin %)
jnthn | Does it understand our CAS functions? | 15:24
timotimo | (from the docs) "Do not roll your own threading primitives (mutexes, etc) from combinations of the Linux futex syscall, atomic counters, etc. These throw Helgrind's internal what's-going-on models way off course and will give bogus results." | 15:25
| we can call ANNOTATE_RWLOCK_CREATE/_DESTROY and ANNOTATE_RWLOCK_ACQUIRED/_RELEASED around our spinlock | 15:27
| that should make it understand
| oh, rwlock may not be the right one | 15:28
| then it'd be VALGRIND_HG_MUTEX_* that i'd have to use | 15:29
| i.e. INIT_POST, LOCK_PRE, LOCK_POST, UNLOCK_PRE, UNLOCK_POST, and DESTROY_PRE
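Wrapped around a CAS spinlock such as fsa->freelist_spin, those client requests might be placed like this (a sketch using C11 atomics in place of MoarVM's own CAS wrappers; the VALGRIND_HG_MUTEX_* requests are real and live in <valgrind/helgrind.h>):

    #include <stdatomic.h>
    #include <valgrind/helgrind.h>

    typedef struct { atomic_int held; } spinlock_t;

    void spin_init(spinlock_t *l) {
        atomic_init(&l->held, 0);
        VALGRIND_HG_MUTEX_INIT_POST(l, 0 /* not recursive */);
    }

    void spin_lock(spinlock_t *l) {
        VALGRIND_HG_MUTEX_LOCK_PRE(l, 0 /* not a trylock */);
        int expected = 0;
        while (!atomic_compare_exchange_weak(&l->held, &expected, 1))
            expected = 0;   /* CAS failed: reset expected value and spin */
        VALGRIND_HG_MUTEX_LOCK_POST(l);
    }

    void spin_unlock(spinlock_t *l) {
        VALGRIND_HG_MUTEX_UNLOCK_PRE(l);
        atomic_store(&l->held, 0);
        VALGRIND_HG_MUTEX_UNLOCK_POST(l);
    }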
timotimo | i'm not sure how to annotate our race to add stuff to the freelist | 15:33
| wow. | 15:35
| googling "helgrind trycas" finds just a single result
| and it's a tumblr tag, "really-bad-hair-day." ?!?!
jnthn | lol | 15:38
timotimo also added a MEMORY_CLEAN annotation to where we allocate a frame from our framestack | 15:45
| hum. | 15:49
| could our decodestream thing have to do with passing work to other threads? :\
| because finish_gc is also the function that calls into "process_in_tray"
| it's complaining that the memset that zeroes a block allocated in the FSA can potentially race with a read by process_worklist, and i'm not sure how to interpret this | 15:53
| huh. that fixed_size_alloc_zeroed was called by allocate_frame, where it sets up the ->work block
| maybe i should set the memory to cleaned again after the memset finishes | 15:54
| oh, huh. i clean when i free, not when i alloc. maybe i should actually do that differently
| "It's easiest to put these requests into the pool manager code, and use them either when memory is returned to the pool, or is allocated from it." | 15:55
| so i ought to decide whether i want to clean the memory when i alloc or when i free, but not both | 15:56
| IIUC
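The "clean on alloc, not on free" choice might look like this in the pool manager (a sketch; pool_take is a hypothetical stand-in for the allocator's hand-out path, while VALGRIND_HG_CLEAN_MEMORY is the real helgrind client request the docs describe):

    #include <stddef.h>
    #include <valgrind/helgrind.h>

    struct pool;                                /* hypothetical */
    void *pool_take(struct pool *p, size_t n);  /* hypothetical */

    void *recycled_alloc(struct pool *p, size_t size) {
        void *block = pool_take(p, size);
        /* erase helgrind's history for this range exactly once, at
         * hand-out time: the previous owner's accesses must not be
         * reported as racing with the new owner's */
        VALGRIND_HG_CLEAN_MEMORY(block, size);
        return block;
    }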
| (insert obligatory "i don't know what i'm doing" picture-of-dog here) | 15:57
IOninja | candlesandherbs.files.wordpress.co...5efd84.jpg | 15:58
timotimo | TYVM
| i1.kym-cdn.com/photos/images/facebo...39/fa5.jpg - i think this is my fav
16:08 brrt joined
timotimo | with the incredible amount of output helgrind spits at me, i find it very hard to verify whether my added annotations help at all | 16:17
dogbert17_ relocates | 16:30
16:41 zakharyas joined
17:54 Geth joined
18:15 lizmat joined
18:18 vendethiel joined
18:25 domidumont joined
18:28 domidumont1 joined
18:29 Ven joined
18:30 domidumont joined
18:31 domidumont joined
18:36 domidumont1 joined
18:49 Ven joined
18:53 domidumont joined
19:09 Ven joined
19:14 TimToady joined
19:29 Ven joined
19:40 lizmat_ joined
19:49 Ven joined
20:09 Ven joined
20:23 lizmat joined
20:29 Ven joined
20:49 Ven joined
21:02 Ven joined
23:26 mst joined