01:19 zakharyas joined 02:48 ilbot3 joined 03:39 agentzh joined 06:28 mtj_ joined 06:30 mtj_ joined 07:06 domidumont joined 07:11 domidumont joined 08:05 brrt joined
brrt good * #moarvm 08:06
i almost had a fix for the round to zero error.. 08:17
samcv jnthn, any reason failed UTF-8 kills the VM, is there a way to make it not totally die? 08:34
hello br
err i guess you're gone
*brrt
nwc10 good *, #moarvm 08:35
samcv :)
08:58 brrt joined
brrt but, it's not correct 08:59
the tl;dr is, the interpreter uses one algorithm for round-towards-minus-infinity, the JIT uses another, this causes the JIT and the interpreter to disagree in the range [-1,0] 09:00
i replaced that with a check of the operand and modulus; but that doesn't work either, since, if you divide by a negative number, the modulus may be positive 09:03
so it needs to be (result < 0 && modulus != 0 || result == 0 && modulus < 0) 09:04
but there's not really a cute and quick expression for that... 09:05
jnthn samcv: Uh...I thought we already fixed all places that happened? 09:45
Unless somebody introduced a new one :P 09:46
There is still some async string I/O code that can do so, but all of the ops leading to it are deprecated and can be tossed already. 09:47
10:18 brrt joined 10:25 dogbert17_ joined
samcv oh 10:39
dunno. but i remember it crashing
jnthn Now I've had coffee, the obvious question I shoulda asked is "what code did you run to make this happen" :) 10:40
samcv m: Buf.new((0..255).pick(100)).decode('utf8'); say 'hi' 10:45
camelia Malformed UTF-8 at line 1 col 1
in block <unit> at <tmp> line 1
samcv it never gets to the `say 'hi'` part
dogbert17_ m: say 'Hi' 10:50
camelia Hi
jnthn You didn't catch the exception
m: try Buf.new((0..255).pick(100)).decode('utf8'); say 'hi' 10:51
camelia hi
dogbert17_ jnthn: here's idiomatic :) code you can run in order to repro the decoderstream bug gist.github.com/dogbert17/037f464c...95fae63ed9
jnthn samcv: Maybe you're wanting a way to do a replacement char or something, though? 10:55
In which case that's NYI 10:56
samcv not really. just wanting it to continue the rest of the script
so maybe i am wrong, but i thought i put it in a try block and it still messed it all up. but i'll let you know if i have any code 10:57
jnthn OK :) 10:59
Yeah, it's meant to throw a catchable exception
brrt so, here's an idea 11:09
what if we add an opcode that reliably prevents JITting
so you could add nqp::dont_jit_me() or nqp::no_jit();
and just always bail when we see that 11:10
dogbert17_ jnthn, timotimo: could this be the break we've been looking for ? gist.github.com/dogbert17/e0a63ea2...f657313c1e 11:18
jnthn Yes, that looks very wrong o.O 11:22
Can you MVM_dump_backtrace(tc) on each of those threads?
Geth MoarVM: be37105fea | (Bart Wiegmans)++ | src/jit/emit_x64.dasc
Fix div_i JIT round to negative infinity

Compute a bias as (modulo < 0) & ((num < 0) ^ (denom < 0)); this gets rid of a conditional move and branches that would achieve the same.
11:29
brrt i should probably update MoarVM and rakudo to point to this new version 11:30
IOninja This fixes the div_i JIT bug? 11:31
brrt ā€¦ how is that done, anyway
yes
IOninja brrt++ sweet
brrt IOninja; if you could in my stead, i'd be grateful
(i would go to lunch, in that case)
IOninja Version bumps? I need to get ready to work 11:32
I can do it in 1.5 hours.
brrt hmm
i can probably do it as well
dogbert17_ sry, had to run away for a while 12:11
jnthn, I could only get one trace: gist.github.com/dogbert17/e0a63ea2...f657313c1e 12:37
have yet another gist which I believe clarifies things a bit, don't you think? gist.github.com/dogbert17/4d360c57...dd30703b93 13:25
jnthn Hmm...yes, that's interesting 13:28
dogbert17_ how should that sequence of events be avoided? 13:36
dogbert17_ is happy that progress has been made wrt the decoderstream bug :) 13:51
jnthn dogbert17_: I'll have to take a closer look to figure out exactly which of the troublesome scenarios is going on 13:55
timotimo i've opened this gist three times in a row now and i'm not quite sure i understand it ... but it's gone into GC in the middle?
i mean ... it went GC in one thread and it was (still? already?) doing something with the same decoder in another? 13:56
13:59 Guest45345 joined
dogbert17_ one thread sets the 'in-use' flag (via MVM_decoder_take_available_chars) then goes in to GC and when the other thread tries to get into single user mode things asplode :) 14:03
dogbert17_ I think :) 14:04
jnthn It's possible that the decoder object moves and we end up zeroing the wrong thing at exit thanks to said GC 14:12
But that may not entirely explain it 14:13
Because we're *in* GC
And in the decoder
And nothing else should be
So even if we've got a single-use bug, I still think we've got a real bug too
dogbert17_ could it be bad enough to mess up bigints? 14:18
timotimo .o( when do bigints come out of the decoder? ) 14:19
jnthn If it's really a "thread is running when it should be GCing or blocked" then yes
But I'd expect it to fail much more explosively and variedly than we're seeing
dogbert17_ we've seen lots of nasty stuff when running harness6, SEGV's, corrupted linked lists etc 14:20
I still have gdb running (the latest gist), which thread should I run a MVM_dump_backtrace on? 14:21
notes that another thread seems to be doing gc as well, thread 2 14:22
14:23 lizmat joined
dogbert17_ meant thread 3 14:23
#0 AO_load_acquire (addr=0x804c248) at 3rdparty/libatomic_ops/src/atomic_ops/sysdeps/gcc/generic-small.h:490 # thread 3 14:24
#1 finish_gc (is_coordinator=<optimized out>, gen=<optimized out>, tc=<optimized out>) at src/gc/orchestrate.c:195
#2 run_gc (tc=tc@entry=0x9fa41c0, what_to_do=what_to_do@entry=1 '\001') at src/gc/orchestrate.c:333
#3 0xb7c5a7e6 in MVM_gc_enter_from_interrupt (tc=tc@entry=0x9fa41c0) at src/gc/orchestrate.c:511
i.e. to the untrained eye it seems as if two threads are doing gc at the same time (must be wrong) 14:25
timotimo no, gc is multithreaded
dogbert17_ aha thx, I did update the gist with data from two other threads, there's lots of gc happening at the same time 14:28
timotimo that's interesting. two threads are waiting for gc to finish, but two other threads are still working towards their goals 14:30
finish_gc is basically "let's wait for all other threads to signal that they're finished"
and collect_free_nursery_uncopied is for cleaning up stuff that has died but can't just be ignored (i.e. they have pointers to buffers or own file handles) 14:31
dogbert17_ timotimo, here's the golfed code if you wish to try it gist.github.com/dogbert17/037f464c...95fae63ed9 14:33
timotimo how many runs did it take you to crash that? 14:34
oh that was quick 14:35
dogbert17_ it crashes after a while, just let it run (check that you have jnthn's fixes for src/6model/reprs/Decoder.c) 14:36
timotimo oh, i might not have it
no, i do have it 14:37
dogbert17_ cool
it should throw an exception after a while 14:39
unless you have a breakpoint around line 107 14:40
timotimo yeah, it gets to the decoder-multi-use exception quickly 14:41
am i supposed to get that from multiple threads at the same time? 14:42
dogbert17_ good question :)
timotimo but you get it, too?
dogbert17_ yes
timotimo i.e. multiple lines of "unhandled exception on thread n"
dogbert17_ yup
sometimes it shows up immediately 14:43
timotimo and only one time it has "decoder may not be used concurrently"
dogbert17_ yes
sometimes it crashes on the first run like this: 14:45
Starting...
Unhandled exception in code scheduled on thread 3
Unhandled exception in code scheduled on thread 8
Decoder may not be used concurrently
timotimo have you ever gotten it to crash under valgrind? 14:55
dogbert17_ haven't tried 14:56
dogbert17_ tries
valgrind and mt-programs are a bit meh 14:57
timotimo yeah
dogbert17_ ASAN might be better 14:58
timotimo It is also possible to mark up the effects of thread-safe reference counting using the ANNOTATE_HAPPENS_BEFORE, ANNOTATE_HAPPENS_AFTER and ANNOTATE_HAPPENS_BEFORE_FORGET_ALL, macros. Thread-safe reference counting using an atomically incremented/decremented refcount variable causes Helgrind problems because a one-to-zero transition of the reference count means the accessing thread has exclusive ownership of 15:02
the associated resource (normally, a C++ object) and can therefore access it (normally, to run its destructor) without locking. Helgrind doesn't understand this, and markup is essential to avoid false positives.
do we have anything that'd have the same problem as this here?
However, it is common practice to implement memory recycling schemes. In these, memory to be freed is not handed to free/delete, but instead put into a pool of free buffers to be handed out again as required. The problem is that Helgrind has no way to know that such memory is logically no longer in use, and its history is irrelevant. Hence you must make that explicit, using the VALGRIND_HG_CLEAN_MEMORY client 15:03
request to specify the relevant address ranges. It's easiest to put these requests into the pool manager code, and use them either when memory is returned to the pool, or is allocated from it.
^- it feels like this definitely applies
okay, the program dies under valgrind, but it doesn't output anything interesting 15:07
oh, neat 15:11
my valgrind support branch has made it into master
i wasn't sure if it did
dogbert17_ what does that support branch do? 15:12
timotimo you can call Configure.pl with --valgrind 15:13
and it'll put redzones into the fixedsize allocator and also tells valgrind more clearly how that allocator operates
i.e. instead of getting "address is 10000000 bytes into block of size 99999999" it'll give a fine-grained resolution of those chunks of data 15:14
and when something writes past its allowed size in the FSA, it'll asplode
dogbert17_ very cool 15:15
timotimo ayup 15:16
it's not quite as simple to make the same thing work for the nursery, though :(
so i haven't tackled that last time i worked with this code
but telling helgrind that the fromspace is "now completely freed" should be easy 15:17
i was hoping i could give locks names with annotations, but it doesn't seem to allow for that
oh, would you look at that 15:18
helgrind's docs point out that posix condition variables aren't properly supported
but libuv uses them for stuff
the docs say it'll generate lots of false positives if you do use them. and i do see reports from inside pthread_cond_wait and such 15:19
jnthn We use them for "stuff" doo
*too
Like, every Promise :P 15:20
timotimo oh, ok 15:21
helgrind docs say to please use semaphores instead
here i see it's detecting a data race for fsa->freelist_spin %)
jnthn Does it understand our CAS functions? 15:24
timotimo Do not roll your own threading primitives (mutexes, etc) from combinations of the Linux futex syscall, atomic counters, etc. These throw Helgrind's internal what's-going-on models way off course and will give bogus results. 15:25
we can call ANNOTATE_RWLOCK_CREATE/_DESTROY and ANNOTATE_RWLOCK_ACQUIRED/_RELEASED around our spinlock 15:27
that should make it understand
oh, rwlock may not be the right one 15:28
then it'd be VALGRIND_HG_MUTEX_* that i'd have to use 15:29
INIT_POST, LOCK_PRE, LOCK_POST, UNLOCK_PRE, UNLOCK_POST, and DESTROY_PRE
i'm not sure how to annotate our race to add stuff to the freelist 15:33
wow. 15:35
googling "helgrind trycas" finds just a single result
and it's a tumblr tag "really-bad-hair-day." ?!?!
jnthn lol 15:38
timotimo also added a MEMORY_CLEAN annotation to where we allocate a frame from our framestack 15:45
hum. 15:49
could our decodestream thing have to do with passing work to other threads? :\
because finish_gc is also the function that calls into "process_in_tray"
it's complaining that the memset that zeroes a block allocated in the FSA can potentially race with a read by process_worklist, and i'm not sure how to interpret this 15:53
huh. that fixed_size_alloc_zeroed was called by allocate_frame where it sets up the ->work block
maybe i should set the memory to cleaned again after the memset finishes 15:54
oh, huh. i clean when i free, not when i alloc. maybe i should do that differently actually
"It's easiest to put these requests into the pool manager code, and use them either when memory is returned to the pool, or is allocated from it." 15:55
so i ought to decide whether i want to clean the memory when i alloc, or when i free, but not both 15:56
IIUC
(insert obligatory "i don't know what i'm doing" picture-of-dog here) 15:57
IOninja candlesandherbs.files.wordpress.co...5efd84.jpg 15:58
timotimo TYVM
i1.kym-cdn.com/photos/images/facebo...39/fa5.jpg - i think this is my fav
16:08 brrt joined
timotimo with the incredible amount of output helgrind spits at me i find it very hard to verify if my additions of annotations help at all 16:17
dogbert17_ relocates 16:30
16:41 zakharyas joined 17:54 Geth joined 18:15 lizmat joined 18:18 vendethiel joined 18:25 domidumont joined 18:28 domidumont1 joined 18:29 Ven joined 18:30 domidumont joined 18:31 domidumont joined 18:36 domidumont1 joined 18:49 Ven joined 18:53 domidumont joined 19:09 Ven joined 19:14 TimToady joined 19:29 Ven joined 19:40 lizmat_ joined 19:49 Ven joined 20:09 Ven joined 20:23 lizmat joined 20:29 Ven joined 20:49 Ven joined 21:02 Ven joined 23:26 mst joined