17:18 discord-raku-bot left, discord-raku-bot joined
tope hi, totally new to raku and a question about grammars: is there some guide for translating my "context-free grammar" thinking into thinking about Raku grammars? I'm struggling with non-greedy matching in particular. I want to match something of the form `<text> <stuff>?` where `<stuff>` has a strict "structure" that is easy to match, but `<text>` is essentially "anything, but non-greedily" like `<-[\n]>*?`, i.e. 17:42
Anton Antonov "[...] translating my "context-free grammar" thinking into thinking about grammars" -- Hmmm... should be straightforward. Do you think in BNF or EBNF? 17:47
thowe I haven't read the Lenz Grammars book yet... 17:50
I have it, but haven't gotten that far. 17:51
Anton Antonov Here is a more useful answer: `rule org-section { <stars-spec> <org-todo-marker>? <text> <org-tag-list>?}`
``` 17:53
rule text { \w+ }
rule org-todo-marker { 'TODO' | 'DONE' | 'CANCEL' }
rule org-tag { \w+ }
rule org-tag-list { <org-tag>+ % ':' }
```
Well I have to say the above Raku code is just an example -- some small important tweaks might have to be done in order to work on actual org-mode sections.
(Meaning, I just wrote that code here in the chat, did not try to run it.)
tope @Anton Antonov#7232 But I'd like for `<text>` to maybe contain `:` and the like. E.g. `** Tags on the form :ARCHIVE: are special :tag1:tag2:` 17:55
Anton Antonov @thowe Moritz Lenz's book "Parsing with Perl 6 Regexes and Grammars" was the first Perl 6 book I read.
thowe I got both of his books and "Learning Perl 6". 17:56
I started with his basics book, but then didn't touch it for a while and started again with Learning. Going through it now. Also watching a lot of Raku YouTube videos. 17:57
Anton Antonov Then include ':' in the text rule: `rule text { [ ':' | \w ]+ }`.
tope but won't that just swallow the tag?
my starting point was something like `^^ <stars> " " <todo>? <priority>? <text> <tags>? $$` with `token text { <-[\n]>*? }`. however, the non-greedy matching `*?` doesn't seem to work with tokens?
Anton Antonov @tope#9134 Sure, it can do the swallowing, but you can use the appropriate regexes. I responded to your context-free-grammar thinking request. 17:59
tope hmm yeah come to think of it I'm not sure how I'd translate this to context-free either. I'd maybe write something like `text_and_tags -> <tags> $ | $ | . <text_and_tags>`, so it keeps trying to match the ending or tags and only falls back to adding chars to the text when it fails (where alternative tries left-to-right in order until success) 18:05
but I guess my problem is doing this kind of non-greedy matching in general
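The "try the ending first, otherwise take one more character" idea can be written inside a non-backtracking `token` with a negative lookahead. This is a minimal sketch, not anyone's actual org-mode grammar: the rule names `Heading`, `tag-suffix` and the reduced TODO list are assumptions made for illustration, and the sample line is the one quoted earlier in the chat.

```raku
# Sketch only: rule names are illustrative assumptions.
grammar Heading {
    token TOP        { <stars> \h+ [ <todo> \h+ ]? <text> <tag-suffix>? \h* }
    token stars      { '*'+ }
    token todo       { 'TODO' | 'DONE' }
    # Take one non-newline character at a time, but stop as soon as the
    # rest of the line is nothing but a tag list.
    token text       { [ <!before <.tag-suffix> \h* $ > \N ]+ }
    token tag-suffix { \h+ <tags> }
    token tags       { ':' [ <tag> ':' ]+ }
    token tag        { <[\w @ %]>+ }
}

my $m = Heading.parse('** Tags on the form :ARCHIVE: are special :tag1:tag2:');
my $title = ~$m<text>;
my @tags  = $m<tag-suffix><tags><tag>.map(~*);
say $title;   # Tags on the form :ARCHIVE: are special
say @tags;    # [tag1 tag2]
```

Because the lookahead only fires when the tag list runs to the end of the line, `:ARCHIVE:` in the middle of the title stays part of `<text>`.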
Anton Antonov @tope#9134 Ok, I will post a response within the next hour. 18:06
tope or basic question: should I be using `regex` instead of `token` or `rule`? can token/rules do non-greedy matching?
Anton Antonov @tope#9134 I see -- `regex` backtracks, so it is the one for comprehensive matching; `token` and `rule` do not backtrack (they ratchet), which makes them more streamlined for parsing, and `rule` additionally treats whitespace in the pattern as significant. 18:08
Sorry, for being too vague, but the precise definitions are given in the documentation at raku.org . 18:09
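To make the difference concrete for the `*?` question, here is a small sketch. The names `text-r`, `text-t` and `tags` and the sample line are hypothetical; the point is only that a frugal quantifier inside a `token` settles on its shortest match and is never extended afterwards, while the same pattern declared as a `regex` can be re-entered on backtracking.

```raku
my token tags   { ':' [ \w+ ':' ]+ }
my regex text-r { \N*? }   # may be extended when the rest of the pattern fails
my token text-t { \N*? }   # ratchets: keeps the empty match it found first

my $line = 'First section :tag1:tag2:';

my $with-regex = ($line ~~ / ^ <text-r> \h+ <tags> $ /).so;
my $with-token = ($line ~~ / ^ <text-t> \h+ <tags> $ /).so;

say $with-regex;   # True
say $with-token;   # False
```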
tope yeah I've gone through the grammar tutorial & the grammar page, though I feel I have this curse where what I want to do never matches the examples given in tutorials. and/or I'm not smart enough to see how / translate it. 18:10
I could try searching github for people who have written other, more complicated parsers using grammar {} 18:11
Anton Antonov @tope#9134 -- Ahh, I know that curse too! 🙂 18:12
@tope#9134 So, are you a fan of org-mode, or just cursed with it? (Because of a certain project or else...)
tope No, I actually love org-mode. But use it mostly in lieu of markdown for writing rather than as an organizational tool, as I find markdown way too simplistic, but org-mode has more flexibility and formatting options as it comes with latex, tables, (advanced) footnotes, etc etc. 18:14
and just in general prefer the org-mode syntax over markdown's, find it useful to have a separation of =typewriter= from ~code~ in formatting, prefer /italics/ etc. 18:16
Anton Antonov @tope#9134 Ok. I use org-mode a lot in my projects.
tope yeah and so I have a sort of blog/homepage/technical writeups written in markdown currently (mdbook), and looking into making some tools for myself to feed org-mode instead of markdown into mdbook -- and thought it could be a sort of beginner-project for learning raku 18:18
Anton Antonov Ok, I am doing kind of the same things -- if you want we can "join forces."
tope I used and loved perl5 15+ years ago, but since then all I've done is mostly python/rust/c++, however I do love a lot of the language features in raku
esp. nice that raku is one of the few languages doing operators The Right Way(tm), i.e. completely free user-defined operators like in Haskell, rather than some static list of accepted symbols that can be overloaded. 18:20
Anton Antonov @tope#9134 Here is what I have so far: 18:24
```
grammar OrgMode {
    rule TOP { [ <org-section-header> | <line-spec> ]+ % "\n" }
    regex org-section-header {
        | <org-section-header-not-tags> \h+ <org-tag-list>
        | <org-section-header-not-tags>
    }
    regex org-section-header-not-tags { <org-stars-spec> [\h+]? <org-todo-marker>? [\h+]? <text> }
    token org-stars-spec { '*'+ }
    regex text { [\V]+ }
    token org-todo-marker { 'TODO' | 'DONE' | 'CANCEL' }
    token org-tag { ':' \w+ }
    regex org-tag-list { <org-tag>+ }
}
```
Produces this output: 18:25
```
｢** TODO First section｣
 org-section-header => ｢** TODO First section｣
  org-section-header-not-tags => ｢** TODO First section｣
   org-stars-spec => ｢**｣
   org-todo-marker => ｢TODO｣
   text => ｢First section｣
｢** TODO First section :TAG1:TAG2｣
 org-section-header => ｢** TODO First section :TAG1:TAG2｣
  org-section-header-not-tags => ｢** TODO First section｣
   org-stars-spec => ｢**｣
   org-todo-marker => ｢TODO｣
   text => ｢First section｣
```
tope ah hehe, you weren't kidding, you're actually writing an org-mode parser too 18:26
Anton Antonov I did not define the `<line-spec>`, but yeah, why not write an org-mode parser.
Note that I am dealing with the greediness in a somewhat ad hoc manner with the rule `<org-section-header>`. 18:27
tope yeah you're dealing with it by using `regex` that backtracks I guess? 18:28
Anton Antonov Sure, that, but I also give precedence to parsing section specs that have a tag list. 18:30
It is not a well-thought-out grammar yet -- I just wrote it... 18:31
But, as you suggested, I strongly suspect someone has already written an org-mode grammar in Raku and posted it on the web... 18:32
tope ```
grammar Test {
token TOP { ^^ "*"+ <todo>? <prio>? <tt> $$ }
token todo { "TODO" || "DONE" || "IDEA" || "KILL" || "PROG" }
token prio { "[#" <[\V]> "]" }
token tt { <tags>? $$ || <[\V]> <tt> }
token tags { ":" <tag>+ % ":" ":" }
token tag { <[\w] + [\# @ %]>+ }
}
```
this was my only idea, to use a sort of muncher. would need the actions to handle collecting the chars into the title i guess
but of course this is just for a simple heading. my next quest is to figure out how one would express the fact that the <section> after a heading with X stars should only match headings with X+1 or more stars.. 18:36
i.e. if that sort of logic can be taken care of by the grammar-actions, without requiring extra logic outside of the grammar stuff 18:37
lizmat and yet another Rakudo Weekly News hits the Net: rakudoweekly.blog/2021/12/27/2021-...-released/
Anton Antonov Yes, this can be done with grammars. I have implementations that have conditional parsing similar to what you describe. The book by M. Lenz mentioned above has examples for that kind of parsing. 18:42
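One way the "only deeper headings may nest" constraint could be kept inside the grammar is a parameterized token plus a code assertion. This is a sketch under assumptions: the grammar name `Outline`, the rule names, and the bare-bones heading format (stars, a space, a title, a newline, no body text) are all hypothetical and much simpler than real org-mode.

```raku
grammar Outline {
    token TOP { <section(1)>* }

    # A section is a heading with at least $min stars, followed by any
    # number of strictly deeper sections.
    token section($min) {
        :my $depth;
        <heading> { $depth = $<heading><stars>.chars }
        <?{ $depth >= $min }>
        <section($depth + 1)>*
    }

    token heading { <stars> \h+ <title> \n }
    token stars   { '*'+ }
    token title   { \N+ }
}

my $doc = "* A\n** B\n*** C\n** D\n* E\n";
my $m   = Outline.parse($doc);

my @top        = $m<section>.list;
my @children-a = @top[0]<section>.list;
say @top.elems;                                  # 2
say @children-a.map({ ~$_<heading><title> });    # (B D)
```

The assertion `<?{ ... }>` makes a too-shallow heading fail as a child, so it is left for an enclosing level to pick up. No logic outside the grammar is needed for the nesting itself.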
tope ah, thanks, I'll see if I can procure it and add it to my bedtime reading list 18:43
Anton Antonov Here is my org-mode grammar so far: 18:44
```
grammar OrgMode {
    regex TOP { [ <empty-line> | <org-section-header> | <line-spec> ]* % \v }
    regex org-section-header {
        | <org-section-header-not-tags> \h+ <org-tag-list>
        | <org-section-header-not-tags>
    }
    regex org-section-header-not-tags { <org-stars-spec> [\h+]? <org-todo-marker>? [\h+]? <text> }
    token org-stars-spec { '*'+ }
    regex text { [\V]+ }
    token org-todo-marker { 'TODO' | 'DONE' | 'CANCEL' }
    token org-tag { ':' \w+ }
```
Here is a parsing result of org-mode text: 18:45
```
｢** TODO First section :TAG1:TAG2
** TODO Second section :TAG1:TAG3
- This text 1
```
18:45 discord-raku-bot left, discord-raku-bot joined
@tobe This package of mine evaluates Raku code sections in both Markdown and org-mode: modules.raku.org/dist/Text::CodePr...an:ANTONOV 18:49
@tobe Do you use Babel-org-mode ? 18:50
19:05 mmat joined, mmat left
tope @Anton Antonov#7232 thanks for the link, I'll look at that too! as for babel I'm not sure -- is babel just the part that allows you to evaluate code directly in org-mode? If so, then yes -- I've used it extensively to write literate documents with code. but if it's something else then no, I don't think so 19:33
Anton Antonov @tobe -- Yes, that Babel is for "literate programming." 19:34
tope @Anton Antonov#7232 btw it's `tope` not `tobe` so I missed the @ notifications. but yes, cool, then I do. (I went through and solved all the problems on cryptohack.org interactively in an org document, for example.) Though I'm far from an emacs expert, I'm mostly just using doom-emacs + a few personal settings. 19:37
Anton Antonov @tope#9134 Sorry for misspelling your identifier / name. 19:40
tope no problem
Anton Antonov @tope#9134 As for cryptography -- have you tried to use Mathematica for cryptography? I am a big fan of Mathematica and I have programmed several types of connections between Mathematica and Raku. (Documenting them right now...) 19:42
So, basically I want to use literate programming with Raku through Mathematica notebooks.
tope Nah, I haven't, but I'm very comfortable with Python/SageMath and have built up a ton of personal tooling for cryptography-related stuff over some years where I've participated in CTF competitions, so never felt the need to look elsewhere 19:47
Anton Antonov Ok, I am interested in comparing programming languages over different computational workflows. I plan to design a conversational agent for cryptography workflows in the next few months. (Using Raku of course.) So, I might ask for input from you. 19:49
tope have a lot of writeups for crypto-related problems at franksh.gitlab.io/ctf/ tho the old articles are probably a bit quirky since they were forcibly converted from org-mode documents to markdown+katex.
(and they're probably not very useful to people who didn't play those CTFs and are thus not familiar with the problems) 19:51
Anton Antonov Ah, back to org-mode?! 🙂 See this please: github.com/alainbebe/org-mode-gtk.raku
tope ah yes, that's great, thanks
thowe what does -Ofun mean? 21:56
as seen in the weekly news
gfldex thowe: Raku is optimised for fun. 22:02
thowe I get that, but what is "-O" ? 22:05
is that not a Raku lang thing?
I feel that is some kind of idiom or inside joke that had something to do with Raku, but I don't see that when trying to search the docs for it. 22:06
gfldex gcc -O3 your-file.cc 22:22
tope syntax question:
```
> say(<a b> X~ ^2); say(<a b>X~^2)
(a0 a1 b0 b1)
(S P)
```
the first result I understand (and expected), but I have no idea what the second result means or what happens there.
gfldex c++ is clearly optimised so you need 3 ppl to review your code :) 22:23
gfldex m: dd <a b>X~^2; 22:34
m: dd <a b>(X~^2); 22:36
thowe Ah, compiler switch. Thanks. makes sense.
tope yeah never mind I'm dumb, I thought X~ was a separate operator, but X is a metaop, which modifies the infix `~^` to apply across the lists, basically xoring the chars 22:40
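A short sketch of the two parses, spelled out with explicit spacing. The two operator expressions are the ones from the question; the variable names and the `+^` lines on code points are added only to show where `S` and `P` come from.

```raku
# X is a metaoperator: it applies the infix that follows it to every pair.
my @concat = <a b> X~ ^2;    # X with ~  (concatenation), right side is the range 0, 1
my @xored  = <a b> X~^ 2;    # X with ~^ (string XOR), right side is the single value 2

say @concat;   # [a0 a1 b0 b1]
say @xored;    # [S P]

# The same XOR written on code points: 0x61 xor 0x32 = 0x53, 0x62 xor 0x32 = 0x50.
say ('a'.ord +^ '2'.ord).chr;   # S
say ('b'.ord +^ '2'.ord).chr;   # P
```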