dalek MoarVM: 8379b7c | (Matt Oates)++ | src/core/nativecall.c:
Fix parens warning on OSX

Get a warning on OSX that the == will be evaluated before the &
src/core/nativecall.c:112:17: warning: & has lower precedence than ==; == will be evaluated first [-Wparentheses]
src/core/nativecall.c:112:17: note: place parentheses around the '==' expression to silence this warning
   ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
src/core/nativecall.c:112:17: note: place parentheses around the & expression to evaluate it first
MoarVM: dd87968 | FROGGS++ | src/core/nativecall.c:
guard for the case that a foreign call NULLs an rw argument
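The -Wparentheses warning quoted in the 8379b7c commit is about C operator precedence: == binds tighter than &. The log does not show the expression at nativecall.c:112, so the following is only a minimal sketch with a hypothetical flag name (FLAG_RW) showing how the two readings differ.

```c
#include <assert.h>

#define FLAG_RW 0x100  /* hypothetical flag bit, for illustration only */

/* How the compiler parses `flags & FLAG_RW == FLAG_RW`:
 * == is evaluated first, so this is flags & 1. */
static int check_as_parsed(int flags) {
    return flags & (FLAG_RW == FLAG_RW);
}

/* What is usually meant: mask first, then compare. */
static int check_as_intended(int flags) {
    return (flags & FLAG_RW) == FLAG_RW;
}

/* The two readings disagree as soon as the flag is not bit 0. */
static void precedence_demo(void) {
    assert(check_as_parsed(0x100) == 0);    /* flag set, but test fails */
    assert(check_as_intended(0x100) == 1);  /* flag set, test passes */
    assert(check_as_parsed(0x001) == 1);    /* unrelated bit passes */
    assert(check_as_intended(0x001) == 0);
}
```

Writing the parentheses explicitly, as in check_as_intended, both silences the warning and makes the intent visible.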
FROGGS .tell abraxxa 'is rw' native int/num params are now supported, though unsigned ints come back as signed and therefore get their values mangled 07:37
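To illustrate the kind of mangling FROGGS mentions, here is a small sketch of what happens when the bits of an unsigned 32-bit value are read back through a signed type. The function name is made up for this example, and it does not reproduce the actual NativeCall code path.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32 bits of an unsigned value as a signed one.
 * memcpy keeps the reinterpretation well defined in C. */
static int32_t read_back_as_signed(uint32_t value) {
    int32_t out;
    assert(sizeof out == sizeof value);
    memcpy(&out, &value, sizeof out);
    return out;
}
```

Small values survive the round trip (42 stays 42), but anything with the top bit set turns negative: 4000000000 reads back as -294967296, which is 4000000000 minus 2^32.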
nwc10 jnthn: I am made of some fail - in those commit messages the numbers stated for "Called N times" are at the Tiobe end of the scale. However, the important bit, the total CPU count drops are accurate. 07:42
jnthn ah :) 07:43
the numbers are CPU counts, no?
nwc10 I think I found another one, that's about 0.25%
yes, I screwed up, and misread CPU counts as call counts
nwc10 but what's strange is that even under callgrind / cachegrind, the numbers fluctuate. 07:43
jnthn If you'd found/submitted these in the opposite order, they'd have each sounded more impressive... :P 07:44
nwc10 What varies? Is there an event loop that can decide to trigger more times in some cases?
brrt libuv probably has something like that? 07:51
jnthn Yeah, it does...but we only do so much I/O
brrt hmm 07:52
i'm wondering at other sources of variability we have
nwc10 I was confused because I thought it crazy that -e 1 would ever be anything other than CPU bound (at least at the event-loop level) hence would never yield and give the event loop different amounts of time 07:54
brrt well, i'm not saying you're wrong. but clearly there are more sources of runtime variability than we'd guess 07:59
nwc10 um, me wrong - I said I'm confused. :-)
brrt confusion reigns 08:00
nwc10 good *, Ven 08:13
Ven \o, nwc10
brrt good * too :-) 08:25
FROGGS I hope this is not a brain-o 09:08
dalek MoarVM: ef8293e | jnthn++ | / (6 files):
Stub new codepoint/normalization related ops.
MoarVM: 36d56f7 | FROGGS++ | src/core/nativecall.c:
access correct read/writable argument
brrt what, NFG is getting progress? many wow :-) 09:34
tadzik what a time to be alive :)
jnthn :) 09:35
FROGGS jnthn: I can bump revisions now safely to get my latest fix in? 09:47
jnthn FROGGS: Yes 09:48
FROGGS: I'm going to be doing Rakudo stuff in a branch.
arnsholt No hilarity from afl-fuzz yet 09:49
It reports a couple of hangs, but I guess those are most likely infinite loops in the bytecode
And it's kind of slow, as I started with simple ("1;" and "nqp::say(1)") NQP programs compiled to MBC 09:50
jnthn hah...so I was like "OK, let's generate NFC, NFD etc. spectests from the Unicode database's NormalizationTest.txt test suite" 09:58
And wrote a script to do it 10:00
Which is awesome, except
plan 18581;
And then multiply by 4 :) 10:01
jnthn: put it in S05-mass :o)
jnthn Well, S15-normalization more like :) 10:02
But I may mark them #stress :)
nwc10 jnthn: is it possible to put the test suite in one file, and then have 4 or 8 (or whatever) wrapper scripts that do a relevant quarter (or eighth etc.) 10:07
so that parallel tests are for the win?
jnthn nwc10: Well, I'm already putting NFC/NFD/NFKC/NFKD in a file each
nwc10: Which will get us 4-way :)
nwc10 that's enough to be useful, I hope 10:08
jnthn Well, since I'm generating them from the Unicode data set, it's not going to be too hard to refactor later if we find it's not good enough.
nwc10 is your generator script suitably dogfood? 10:09
jnthn nwc10: As in "written in Perl 6"? :) 10:10
nwc10 yes, that question
"acceptable" answers are "yes" or "patches welcome"
jnthn yes, it's written in Perl 6
nwc10 \o/
jnthn Just 40 lines of it even :) 10:11
jnthn Hm, we end up with 1.5MB .t files. I think I'll just get the generator to break 'em up into 2000s or so 10:45
nwc10 is there an easy way to split them logically? 10:46
or is "2000, 4000, 6000, ..." the most sensible? 10:47
and, even if you split them logically, will Unicode updates touch all the files at the same time?
jnthn Logically there are sections but two are tiny and the other two are huge so it's not that much use. 10:49
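The "2000, 4000, 6000, ..." split is plain ceiling division. A minimal sketch, assuming the 18581 figure from the plan line is the test count for one normalization form and that 2000 is the chunk size (function names are made up for this example):

```c
#include <assert.h>
#include <stddef.h>

/* Number of files needed to hold `total` tests at `per_file` tests each. */
static size_t chunk_count(size_t total, size_t per_file) {
    assert(per_file > 0);
    return (total + per_file - 1) / per_file;
}

/* Number of tests in chunk `index` (0-based); the last chunk may be short. */
static size_t chunk_length(size_t total, size_t per_file, size_t index) {
    assert(index < chunk_count(total, per_file));
    size_t start = index * per_file;
    size_t left = total - start;
    return left < per_file ? left : per_file;
}
```

With those numbers, chunk_count(18581, 2000) is 10 files per form, nine full ones and a final one holding 581 tests, so roughly 40 files across NFC/NFD/NFKC/NFKD.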
lizmat jnthn: would you be able to generate the .t files at install time ?
jnthn lizmat: Oh, I'll commit them.
lizmat: We only need to update them if we get new Unicode versions.
Will include the script to help with that scenario. 10:50
lizmat I don't have a problem with 1.5MB .t files: it is dogfooding of the best order
jnthn We can re-combine them later, I just think it's going to make my life a bit easier to work with them in chunks while I implement stuff.
lizmat otoh, the core setting is now just under 900K, and that takes about half a minute to parse 10:51
jnthn Yeah, I'm considering making these # stress 10:53
So we only run them all in a stresstest
And finding some way to "sample" 1000 or so into a quick test to run normally
nwc10 if you're going to sample, and it's "Random", I'd strongly suggest generating an "actual" "random" seed, then seeding the PRNG with it, and making the seed visible in the test results somehow, so that it can be re-run on failure 10:55
on the other hand
if it's just a sample
that's not needed, because the debugging step is "run all of them"
jnthn *nod*
lizmat yes, you need re-runnability
nwc10 fingers faster than brain
lizmat nwc10: even "run all of them" *could* give a different result
nwc10 or still insufficient cofree
jnthn Well, you'll have stability in so far as we will only re-generate the file containing the sampled ones when we update to a new unicode version :)
nwc10 aha. 10:56
that's the sample.
jnthn But I may well take the easy way and take every 10th test or something
lizmat would prefer repeatable randomness to flush any problems out 10:57
please note, the seed should be visible in the "make spectest" or "make stresstest" output as well
I'm not sure how to do that at present 10:58
jnthn lizmat: I won't do it in a way that you can't reproduce trivially.
lizmat ++jnthn
jnthn lizmat: The randomness - if we even have any - will be in the thing that generates the .t file, and the generated things will be checked in.
lizmat ah, ok, gotcha :-) 10:59
jnthn lizmat: Otherwise we have to ship a big file from the Unicode database. :)
lizmat fwiw, I think that compresses very well, so I don't really see that as a problem
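A rough sketch of the "visible seed" idea nwc10 and lizmat describe: pick a sample of test indices from a seed, and print that seed so the same sample can be regenerated. Everything here (the xorshift32 generator, the function names, the TAP-style comment line) is an assumption made for illustration; the log does not say how the real generator script samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Small xorshift32 generator: the same seed gives the same sequence on
 * every platform, unlike rand(), whose algorithm is implementation-defined. */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Pick `want` distinct test indices out of `total`, in ascending order,
 * using selection sampling (slight modulo bias, fine for picking tests).
 * Prints the seed as a comment line so a failing run can be repeated.
 * Returns the number of indices written to `picked`. */
static size_t sample_tests(uint32_t seed, size_t total, size_t want,
                           size_t *picked) {
    assert(seed != 0);       /* xorshift must not start at zero */
    assert(want <= total);
    printf("# sample seed: %u\n", (unsigned)seed);
    uint32_t state = seed;
    size_t chosen = 0;
    for (size_t i = 0; i < total && chosen < want; i++) {
        size_t remaining = total - i;
        size_t needed = want - chosen;
        /* Select index i with probability needed / remaining. */
        if (next_random(&state) % remaining < needed)
            picked[chosen++] = i;
    }
    return chosen;
}
```

Calling sample_tests twice with the same seed yields the same indices, which is the re-runnability lizmat asks for. jnthn's simpler alternative, taking every 10th test, needs no seed at all.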
arnsholt jnthn: What do you think would be a good way to generate very simple bytecode examples to use with afl-fuzz? 12:19
I've got two simple NQP snippets running ATM, but that depends on the whole compiled NQP stuff on top, which makes it iterate quite a bit slower than it could
jnthn arnsholt: It's possible to generate MAST directly 12:49
arnsholt: Giving an .moarvm file 12:51
nwc10 maybe what we need is an assembly language to make it easier to write bytecode. We could call it, say, PIR, for Pwns Infinite Resources, as a rough estimate of the future cost/benefit ratio :-) 12:56
MIR would also be a nice name :o)
tadzik we need Moar Intermediate Representations 12:57
nwc10 what happens when progress crashes into MIR?
is it a good thing? :-)
nwc10 do I win the space cadet award for a non-obvious and therefore non-funny joke? 13:06
or is the problem simply that everyone is asleep? 13:07
or gainfully employed
FROGGS $work :o( 13:09
jnthn Or trying to implement Unicode normalization :P 13:10
.oO( MIR would at least bring world peace... )
JimmyZ MIT 13:11
nwc10 visualises that as some sort of device combining a piano roll to drive a row of big hammers
to flatten things into the right patterns
brrt we'll have Moar Intermediate Nearly Machine Level Representation :-P 13:15
Moar Very Low Level Representation
Moar Machine Like Representation 13:16
[Coke] (update if new unicode versions) do we have a story yet on what versions of unicode a particular version of p6 supports? 13:21
jnthn [Coke]: No, though I suspect "r-m for The Release will do Unicode 7" may be enough to kick that can down the road a little way at least. 13:22
nwc10 unicode.org/versions/beta-8.0.0.html -- The next version of the Unicode Standard will be Version 8.0.0, planned for release in June, 2015. 13:23
the relevant discussion in Perl 5 land is that we're not even there yet on getting all the fiddly bits of *one* version done 13:24
[Coke] *curses*
nwc10 So it's very wishful thinking to be considering offering more than one
but, that's potentially a side effect/pain point of retrofitting Unicode 13:25
IIRC the Unicode consortium has some sort of backcompat guarantee that things won't change. Only well defined spaces will get things added to them
jnthn Bah, fine, then given it's currently "easy" maybe we'll bump to 8. :)
nwc10 but they *are* human
jnthn My main point was we can pick one thing for r-m for the point we need to start caring more deeply about backcompat. 13:26
[Coke] nwc10: oh, *there's* your problem
nwc10 jnthn: the best wording is probably something more like "we intend to support whatever is the current Unicode release at the time of the beta. We think it likely that this will be 8.0.0" 13:27
jnthn nwc10: That works :) 13:28
nwc10 I don't (offhand) know if the Unicode folks do minor bumps, hence whether it will be 8.0.small 13:29
arnsholt jnthn: Yes, of course. Creating MAST and compiling that is pretty obviously the best approach, I think 14:04
dalek MoarVM: b75ad46 | jnthn++ | / (6 files):
Add basic normalization infrastructure.

We don't actually perform any normalization yet, this just gets the various bits of framework in place for us to do so, plus provides the normalizecodes op so we can run the normalization tests.
MoarVM: 6d6150a | jnthn++ | tools/ucd2c.p6:
Remove unused script.
MoarVM: 2bcdf64 | jnthn++ | src/strings/normalize. (2 files):
Normalization buffer setup.
dalek MoarVM: 79b6087 | jnthn++ | src/strings/normalize. (2 files):
Implement fast "nothing to do" normalization.

When two codepoints in a row are below the threshold of significance for the target normalization form, we can immediately give the first one back. For now, the slow path doing the full check does exactly the same; that's where we'll add the interesting bits soon.
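The fast path in that commit message can be sketched roughly as follows. This is not the code in src/strings/normalize.c: the struct, the function names, and the threshold value are assumptions for illustration (0x0300, the first combining mark, is a plausible cut-off for NFC). The reason two codepoints are needed is that the first one can only be released once we know its successor cannot combine with it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: codepoints below it need no work for the target
 * form. Treat the value as an assumption, not as what MoarVM uses. */
#define FIRST_SIGNIFICANT 0x0300

typedef struct {
    uint32_t pending;      /* codepoint waiting for its successor */
    int      has_pending;
    size_t   fast_hits;    /* how often the fast path was taken */
} Normalizer;

/* Slow path placeholder. As in the commit, it currently does the same as
 * the fast path; the real normalization logic would go here. */
static int slow_path(Normalizer *n, uint32_t in, uint32_t *out) {
    *out = n->pending;
    n->pending = in;
    return 1;
}

/* Feed one codepoint. Returns 1 and sets *out when a codepoint is ready. */
static int process_codepoint(Normalizer *n, uint32_t in, uint32_t *out) {
    assert(n != NULL && out != NULL);
    if (!n->has_pending) {
        n->pending = in;
        n->has_pending = 1;
        return 0;
    }
    if (n->pending < FIRST_SIGNIFICANT && in < FIRST_SIGNIFICANT) {
        /* Neither codepoint can affect the other: give back the first. */
        *out = n->pending;
        n->pending = in;
        n->fast_hits++;
        return 1;
    }
    return slow_path(n, in, out);
}

/* Flush the last held codepoint at end of input. */
static int finish(Normalizer *n, uint32_t *out) {
    if (!n->has_pending)
        return 0;
    *out = n->pending;
    n->has_pending = 0;
    return 1;
}
```

Feeding 'a', 'b', U+0301, 'c' takes the fast path once (for 'a', when 'b' arrives); as soon as the combining acute shows up, every decision goes through slow_path.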
FROGGS src/strings/normalize.c: In function ‘assert_codepoint_array’: 17:52
src/strings/normalize.c:23:5: warning: format not a string literal and no format arguments [-Wformat-security]
MVM_exception_throw_adhoc(tc, error);
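The -Wformat-security warning fires because a non-literal string is passed where a printf-style format is expected. A minimal sketch of the usual fix, using a hypothetical stand-in (report_adhoc) rather than the real MVM_exception_throw_adhoc:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a printf-style error function. It only formats
 * the message into a caller-supplied buffer. */
static void report_adhoc(char *buf, size_t size, const char *fmt, ...) {
    va_list args;
    assert(buf != NULL && size > 0);
    va_start(args, fmt);
    vsnprintf(buf, size, fmt, args);
    va_end(args);
}

/* Risky:  report_adhoc(buf, size, error);
 *   If `error` ever contains a '%', it is parsed as a conversion and the
 *   function reads arguments that were never passed.
 * Safe:   pass the message as data behind a literal "%s" format. */
static void report_message(char *buf, size_t size, const char *error) {
    report_adhoc(buf, size, "%s", error);
}
```

With the "%s" form, a message such as "100%s done" is copied through unchanged instead of being interpreted as a format.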
brrt \o 20:46
timotimo o/ 20:47
brrt many wow, by the way: developerblog.redhat.com/2015/04/07...g-gcc-5-2/
although i'd really like it if people didn't use brainf*ck as an example all the time 21:00
why not perl6-as-an-example 21:01
would be awesome
timotimo hah 21:06
brainfuck is very small 21:07
quite unlike perl6; even if you only take MoarVM's set of ops
as you probably are well aware :)
but yeah, brainfuck has pretty much zero semantic requirements that the jit has to uphold
brrt for one thing :-) 21:10
oh darn, it's GPL then 21:46
that will never work out
licenses! fuuuu
timotimo :\
brrt i'm tired and going to sleep 21:48
see you :-) 21:49