saint- Also, so does raku automatically add a new line at the end of a file when you slurp it? 00:01
stevied The .* will suck up everything. 00:21
You need \N* 00:22
More concise way: `token TOP { <line>+ %% \v }` 00:26
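A minimal sketch of the `.*` versus `\N*` point, in case it helps to see it run. The grammar name `Lines` and the `line` token body (`\N+`) are assumptions made up for this illustration; only the `<line>+ %% \v` shape comes from the message above.

```raku
my $text = "first line\nsecond line\n";

# `.` matches any character, newlines included, so `.*` runs to the end.
say ($text ~~ / ^ .* /).Str eq $text;    # True

# `\N` matches anything except a newline, so `\N*` stops at the first "\n".
say ($text ~~ / ^ \N* /).Str;            # first line

# Hypothetical grammar: lines separated by vertical whitespace,
# where `%%` also allows one trailing separator.
grammar Lines {
    token TOP  { <line>+ %% \v }
    token line { \N+ }
}

my $m = Lines.parse($text);
say $m<line>.map(*.Str).join(" | ");     # first line | second line
```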
saint- Gotcha, yeah I got it to work with \N 00:33
Is \N just a perl regex thing? 00:34
I'm guessing \V or \v is just as good just a bit safer? 00:35
Looks like it, I guess I never really knew pcre, just the standard one-ish 00:36
Is there a purpose to the ^ and $ anchors in TOP? 00:39
00:44 zacts joined
saint- Is there a way to say not a token in the grammar? 00:50
For example is there any way to get this to work? I want to be able to have a character to be any character except for a double line-break 01:00
www.toptal.com/developers/hastebin...vimofib.pl
The idea is to like make a paragraph as a single token
I'm not sure how to get that to work
01:01 zacts left
klebs in this error: `===SORRY!===Object of type X in QAST::WVal, but not in SC` what does SC stand for? 01:28
I have worked around this issue by splitting the whole text on \n\n and processing each chunk separately 01:34
saint- klebs, yeah I figured I could do that, but was just curious as well how to do it in pure raku 01:36
Or rather pure raku grammar
Think I figured it out if anyone is interested www.toptal.com/developers/hastebin...emaguh.xml 01:48
klebs i have run into problems doing this sort of thing when you want to complexify what you parse as a "block" 01:59
saint- I've made it a bit more complex and so far so good
klebs 👍 02:00
saint- Will let you know if it starts to break haha 02:02
klebs what made it break for me was when i started introducing grammar `rules`, which don't technically care about whitespace. i know you can set the ws token to \h*, but sometimes that isn't what the problem needs either. 02:07
there might have been some other details as well to consider... i forget
i was basically attempting to mix preprocessing logic and grammar rules
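A small sketch of the `ws` issue klebs mentions, under made-up names (`Loose`, `Strict`, `pair`, `key`, `value` are all hypothetical). With the default `ws`, the implicit whitespace at the end of a `rule` can eat the newline that the surrounding token wanted as a separator; overriding `ws` to `\h*` keeps it on the line.

```raku
# Hypothetical key = value grammar; pairs are separated by a newline.
grammar Loose {
    token TOP   { <pair>+ % \n }
    rule  pair  { <key> '=' <value> }   # whitespace here becomes <.ws>
    token key   { \w+ }
    token value { \w+ }
}

# Same grammar, but whitespace between atoms may only be horizontal.
grammar Strict is Loose {
    token ws { \h* }
}

my $text = "a = 1\nb = 2";

# Default ws swallows the "\n" after the first value, so the separator
# never matches and the parse stops short of the end of the input.
say so Loose.parse($text);     # False

say so Strict.parse($text);    # True
```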
saint- Interesting
klebs while that works to a certain extent, sometimes the bugs when they show up are mega subtle
like, in which cases does something in a block parse as a rule and which cases does it trip your section snipping logic 02:08
as things get more complex, sometimes your snipping logic might trip in odd circumstances, then you try to change your snipping logic, and something you had working before might break 02:09
doesn't happen on small complexity grammars as much, but if you start trying to add more detail to your block parser, you might hit situations like that 02:10
hopefully not -- most of the time it happened to me was when i was trying to parse python doc comments, and extract the relevant items from them (i.e. when the author describes parameters, function outputs, etc) 02:11
what worked for me was to just preprocess everything up front and break it into smaller chunks, which could each be parsed against one of several smaller less complex grammars
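Roughly what the preprocess-then-parse approach could look like. This is only a sketch: the `Chunk` grammar and the sample text are invented here, and a real setup would presumably pick between several small grammars rather than one.

```raku
# Hypothetical grammar for a single chunk: an optional "(n)" reference
# marker followed by the rest of the paragraph.
grammar Chunk {
    token TOP  { <ref>? <body> }
    token ref  { '(' \d+ ')' }
    token body { .+ }
}

my $text = "(1) first paragraph\nstill first\n\nplain paragraph\n\n(2) last one";

# Preprocess: cut the input at blank lines, then parse each piece alone.
my @kinds = $text.split(/\n\n+/, :skip-empty).map: -> $chunk {
    my $m = Chunk.parse($chunk);
    $m<ref> ?? "ref" !! "plain"
};

say @kinds.join(",");    # ref,plain,ref
```

The grammar never has to know about paragraph breaks, which is the point of doing the split up front.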
02:13 razetime joined
saint- Based on this code www.toptal.com/developers/hastebin...igojiq.xml do you know how I would print out all refBlock matches only? 02:28
klebs you can write an actions class for this purpose 02:43
do you want the match or the text that matches? 02:46
something like: 02:48
```raku
grammar Lit {
    token TOP { <block>+ }
    proto token block {*}
    token block:sym<section> { <sectionBreak> }
    token block:sym<ref> { <refBlock> }
    token block:sym<plain> { <plainBlock> }
    token sectionBreak { \n\n }
    token notSectionBreak { [<!sectionBreak> .] }
    token char { <notSectionBreak> | <ref> }
    token ref { \(\d+\) }
    token refBlock { <ref> <char>+ }
    token plainBlock { <char>+ }
}
```
saint- klebs either the match or the text that matches, the text ideally 02:57
Ah hmm 02:58
Thanks I'll save that and look at it more 02:59
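A sketch of the actions-class idea for getting just the text of each refBlock. To keep the behaviour easy to follow it uses a simplified grammar (`Para`, with `||` instead of a proto token) and a made-up actions class `CollectRefs`; neither is from the paste, so treat the names as assumptions.

```raku
# Simplified, hypothetical grammar: blocks separated by blank lines.
grammar Para {
    token TOP          { <block>+ % <sectionBreak> }
    token block        { <refBlock> || <plainBlock> }
    token sectionBreak { \n\n+ }
    token ref          { '(' \d+ ')' }
    token char         { <!sectionBreak> . }
    token refBlock     { <ref> <char>+ }
    token plainBlock   { <char>+ }
}

# The method named after a token runs each time that token matches.
class CollectRefs {
    has @.refs;
    method refBlock($/) { @!refs.push: ~$/ }   # ~$/ is the matched text
}

my $text    = "(1) first ref\nmore\n\nplain text\n\n(2) second ref";
my $actions = CollectRefs.new;
Para.parse($text, :$actions);

.say for $actions.refs;
# (1) first ref
# more
# (2) second ref
```

Pushing `$/` instead of `~$/` would keep the Match objects rather than the text.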
klebs np 03:24
03:26 Guest35 left 04:35 saint- left 04:41 saint- joined 05:15 zacts joined 05:26 razetime_ joined, razetime_ left 05:52 zacts left 07:17 frost joined, frost left 07:32 frost joined 10:10 zacts joined 12:10 zacts left 12:44 frost left
deadmarshal I found something weird :|. I used a range inside lines method (i know i shouldn't), but it gave back some lines which didn't correspond to the numbers in the range. and it didn't throw any errors or something. is this normal? 12:56
like this for example: 'input.txt'.IO.lines(4..12); 12:57
MasterDuke lines takes a limit, so it probably just numified the range (which would be the number of elements) and used that 13:09
m: say +(4..12)
camelia 9
MasterDuke you could argue that maybe it should use a range as indices (which would likely involve adding a new multi candidate for lines), feel free to open an issue in the rakudo repo about it 13:11
deadmarshal ok thank you 13:37
13:50 Guest35 joined 14:10 razetime left
gfldex .lines returns a Positional, so a subscript should work 14:36
lizmat yeah, 'input.txt'.IO.lines[4..12] 16:30
deadmarshal awesome ;) 16:40
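A quick sketch contrasting the two calls. The temp file name is made up, and the limit is written as `+(4..12)` to show the numification MasterDuke describes explicitly rather than relying on how `lines` treats a Range argument.

```raku
# Write a throwaway 20-line file so the example is self-contained.
my $path = $*TMPDIR.add("lines-demo-{$*PID}.txt");
$path.spurt: (1..20).map({ "line $_" }).join("\n") ~ "\n";

# A Range numifies to its element count, so as a limit it means "first 9".
say +(4..12);                              # 9
my @limited = $path.lines(+(4..12));
say @limited.tail;                         # line 9

# Subscripting picks zero-based indices 4 through 12 instead.
my @sliced = $path.lines[4..12];
say @sliced.head;                          # line 5
say @sliced.tail;                          # line 13

$path.unlink;
```

Since the subscript is zero-based, `[4..12]` gives the 5th through 13th lines of the file.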
22:39 zacts joined 23:12 zacts left