tope | hi, totally new to raku and a question about grammars: is there some guide translating my "context-free grammar" thinking into thinking about grammars? I'm struggling with non-greedy matching in particular. I want to match something of the form `<text> <stuff>?` where `<stuff>` has a strict "structure" that is easy to match, but `<text>` is essentially "anything, but non-greedily" like `<-[\n]>*?`, i.e. | 17:42 | |
Anton Antonov | "[...] translating my "context-free grammar" thinking into thinking about grammars" -- Hmmm... should be straightforward. Do you think in BNF or EBNF? | 17:47 | |
Here is a more useful answer: `rule org-section { <stars-spec> <org-todo-marker>? <text> <org-tag-list>?}` | 17:49 | ||
thowe | I haven't read the Lenz Grammars book yet... | 17:50 | |
I have it, but haven't gotten that far. | 17:51 | ||
Anton Antonov | ``` | 17:53 | |
rule text { \w+ } | |||
rule marker { 'TODO' | 'DONE' | 'CANCEL' } | |||
rule org-tag { \w+ } | |||
rule org-tag-list { <org-tag>+ % ':' } | |||
``` | |||
Well I have to say the above Raku code is just an example -- some small important tweaks might have to be done in order to work on actual org-mode sections. | |||
(Meaning, I just wrote that code here in the chat, did not try to run it.) | |||
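A short sketch of the `%` separator used in the tag-list rule above: in Raku regexes `%` modifies a quantifier, so a separated list is usually written `<tag>+ % ':'`. The grammar name `Tags` and the sample string are made up for illustration, and `token` is used instead of `rule` to keep whitespace handling out of the picture.

```raku
# Hypothetical mini-grammar: one or more \w+ tags separated by ':'.
grammar Tags {
    token TOP { <tag>+ % ':' }
    token tag { \w+ }
}

my $m = Tags.parse('tag1:tag2:tag3');
# <tag> sits under a + quantifier, so $m<tag> is a list of Match objects.
say $m<tag>.map(*.Str).join(', ');   # tag1, tag2, tag3
```

Note that this accepts `tag1:tag2` but not the org-mode form with leading and trailing colons; that would need explicit `':'` literals around the list.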
tope | @Anton Antonov#7232 But I'd like for `<text>` to maybe contain `:` and the like. E.g. `** Tags on the form :ARCHIVE: are special :tag1:tag2:` | 17:55 | |
Anton Antonov | @thowe Moritz Lenz' book "Parsing with Perl 6 Regexes and Grammars" was the first Perl 6 book I read. | ||
thowe | I got both of his books and "Learning Perl 6". | 17:56 | |
I started with his basics book, but then didn't touch it for a while and started again with Learning. Going through it now. Also watching a lot of Raku YouTube videos. | 17:57 | ||
Anton Antonov | Then include ':' in the text rule : ` rule text { [ ':' | \w]+ }`. | ||
tope | but won't that just swallow the tag? | ||
my starting point was something like `^^ <stars> " " <todo>? <priority>? <text> <tags>? $$`. but `token text { <-[\n]>*? }` however the non-greedy matching *? doesn't seem to work with tokens? | |||
Anton Antonov | @tope#9134 Sure, it can do the swallowing, but you can use the appropriate regexes. I responded to your context-free-grammar thinking request. | 17:59 | |
tope | hmm yeah come to think of it I'm not sure how I'd translate this to context-free either. I'd maybe write something like `text_and_tags -> <tags> $ | $ | . <text_and_tags>`, so it keeps trying to match the ending or tags and only falls back to adding chars to the text when it fails (where alternative tries left-to-right in order until success) | 18:05 | |
but I guess my problem is doing this kind of non-greedy matching in general | |||
Anton Antonov | @tope#9134 OK, I will post a response within the next hour. | 18:06 | |
tope | or basic question: should I be using `regex` instead of `token` or `rule`? can token/rules do non-greedy matching? | ||
Anton Antonov | @tope#9134 I see -- `regex` is for comprehensive matching, since it backtracks; `rule` and `token` are for more streamlined and optimized parsing: they do not backtrack by default, and `rule` also makes whitespace in the pattern significant. | 18:08 | |
Sorry for being too vague, but the precise definitions are given in the documentation at raku.org. | 18:09 | ||
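To make the `regex` versus `token` difference concrete, here is a minimal sketch. The two grammar names are hypothetical; the only difference between them is the declarator used for `TOP`.

```raku
# Two hypothetical grammars that differ only in the declarator of TOP.
grammar WithToken { token TOP { \w+ 'c' } }
grammar WithRegex { regex TOP { \w+ 'c' } }

# token: \w+ takes all of 'abc' and never gives the final 'c' back.
say so WithToken.parse('abc');   # False
# regex: \w+ backtracks to 'ab', so the literal 'c' can match.
say so WithRegex.parse('abc');   # True
```

The same effect applies to `rule`, which behaves like `token` with significant whitespace added.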
tope | yeah I've gone through the grammar tutorial & the grammar page, though I feel I have this curse where what I want to do never matches the examples given in tutorials. and/or I'm not smart enough to see how / translate it. | 18:10 | |
I could try searching github for people who have written other, more complicated parsers using grammar {} | 18:11 | ||
Anton Antonov | @tope#9134 -- Ahh, I know that curse too! 🙂 | 18:12 | |
@tope#9134 So, are you a fan of org-mode, or just cursed with it? (Because of a certain project or something else...) | |||
tope | No, I actually love org-mode. But use it mostly in lieu of markdown for writing rather than as an organizational tool, as I find markdown way too simplistic, but org-mode has more flexibility and formatting options as it comes with latex, tables, (advanced) footnotes, etc etc. | 18:14 | |
and just in general prefer the org-mode syntax over markdown's, find it useful to have a separation of =typewriter= from ~code~ in formatting, prefer /italics/ etc. | 18:16 | ||
Anton Antonov | @tope#9134 Ok. I use org-mode a lot in my projects. | ||
tope | yeah and so I have a sort of blog/homepage/technical writeups written in markdown currently (mdbook), and looking into making some tools for myself to feed org-mode instead of markdown into mdbook -- and thought it could be a sort of beginner-project for learning raku | 18:18 | |
Anton Antonov | Ok, I am doing kind of the same things -- if you want we can "join forces." | ||
tope | I used and loved perl5 15+ years ago, but since then all I've done is mostly python/rust/c++, however I do love a lot of the language features in raku | ||
esp. nice that raku is one of the few languages doing operators The Right Way(tm), i.e. completely free user-defined operators like in Haskell, rather than some static list of accepted symbols that can be overloaded. | 18:20 | ||
Anton Antonov | @tope#9134 Here is what I have so far: | 18:24 | |
``` | |||
grammar OrgMode { | |||
rule TOP { [ <org-section-header> | <line-spec> ]+ % "\n" } | |||
regex org-section-header { | |||
| <org-section-header-not-tags> \h+ <org-tag-list> | |||
| <org-section-header-not-tags> } | |||
regex org-section-header-not-tags { <org-stars-spec> [\h+]? <org-todo-marker>? [\h+]? <text> } | |||
token org-stars-spec { '*'+ } | |||
regex text { [\V]+ } | 18:25 | ||
token org-todo-marker { 'TODO' | 'DONE' | 'CANCEL' } | |||
token org-tag { ':' \w+ } | |||
regex org-tag-list { <org-tag>+ } | |||
} | |||
```
Produces this output:
```
｢** TODO First section｣
 org-section-header => ｢** TODO First section｣
  org-section-header-not-tags => ｢** TODO First section｣
   org-stars-spec => ｢**｣
   org-todo-marker => ｢TODO｣
   text => ｢First section｣
｢** TODO First section :TAG1:TAG2｣
 org-section-header => ｢** TODO First section :TAG1:TAG2｣
  org-section-header-not-tags => ｢** TODO First section｣
   org-stars-spec => ｢**｣
   org-todo-marker => ｢TODO｣
   text => ｢First section｣
```
tope | ah hehe, you weren't kidding, you're actually writing an org-mode parser too | 18:26 | |
Anton Antonov | I did not define the `<line-spec>`, but yeah, why not write an org-mode parser. | ||
Note that I am dealing with the greediness in a sort of ad hoc manner with the rule `<org-section-header>`. | 18:27 | ||
tope | yeah you're dealing with it by using `regex` that backtracks I guess? | 18:28 | |
Anton Antonov | Sure, that, but I also give precedence to parsing section specs that have a tag list. | 18:30 | |
It is not a well-thought-out grammar yet -- I just wrote it... | 18:31 | ||
But, as you suggested, I strongly suspect someone has already written an org-mode grammar in Raku and posted it on the web... | 18:32 | ||
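For the original question (a title that may itself contain `:`, followed by optional tags), one possible sketch is a frugal title inside `regex` declarations, so the outer pattern can re-enter the title and let it grow one character at a time. All rule names here are illustrative, tags are simplified to `\w+`, and TODO markers and priorities are left out.

```raku
# Sketch only: rule names are illustrative, tags are simplified to \w+.
grammar Heading {
    # TOP and title are `regex`, so TOP may re-enter <title> and let it grow.
    regex TOP   { ^ <stars> \h+ <title> <tags>? \h* $ }
    token stars { '*'+ }
    regex title { \V+? }                      # frugal: shortest title first
    token tags  { \h+ ':' [ <tag> ':' ]+ }    # e.g. " :tag1:tag2:"
    token tag   { \w+ }
}

my $m = Heading.parse('** Tags on the form :ARCHIVE: are special :tag1:tag2:');
say ~$m<title>;                               # Tags on the form :ARCHIVE: are special
say $m<tags><tag>.map(*.Str).join(', ');      # tag1, tag2
```

The explicit `$` in `TOP` matters: a shorter title followed by `:ARCHIVE:` is tried first, fails to reach the end of the line, and the match then falls back to a longer title.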
tope | ``` | ||
grammar Test { | |||
token TOP { ^^ "*"+ <todo>? <prio>? <tt> $$ } | |||
token todo { "TODO" || "DONE" || "IDEA" || "KILL" || "PROG" } | |||
token prio { "[#" <[\V]> "]" } | |||
token tt { <tags>? $$ || <[\V]> <tt> } | |||
token tags { ":" <tag>+ % ":" ":" } | |||
token tag { <[\w] + [\# @ %]>+ } | |||
} | |||
``` | |||
this was my only idea, to use a sort of muncher. would need the actions to handle collecting the chars into the title i guess | |||
but of course this is just for a simple heading. my next quest is to figure out how one would express the fact that the <section> after a heading with X stars should only match headings with X+1 or more stars.. | 18:36 | ||
i.e. if that sort of logic can be taken care of by the grammar-actions, without requiring extra logic outside of the grammar stuff | 18:37 | ||
lizmat | and yet another Rakudo Weekly News hits the Net: rakudoweekly.blog/2021/12/27/2021-...-released/ | ||
Anton Antonov | Yes, this can be done with grammars. I have implementations that have conditional parsing similar to what you describe. The book by M. Lenz mentioned above has examples for that kind of parsing. | 18:42 | |
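One way the "X+1 or more stars" constraint might be expressed inside the grammar itself is a parameterized token plus a code assertion. This is only a sketch under simplifying assumptions: the input consists of headings only (no body text), and the names `Outline`, `section` and `$parent-depth` are invented for the example.

```raku
# Sketch only: headings without body text; names are illustrative.
grammar Outline {
    token TOP { <section(0)>+ }

    # A section is a heading with more stars than its parent, followed by
    # any number of sub-sections that are deeper still.
    token section($parent-depth) {
        <heading>
        <?{ $<heading><stars>.chars > $parent-depth }>
        <section($<heading><stars>.chars)>*
    }

    token heading { <stars> \h+ <title> \n? }
    token stars   { '*'+ }
    token title   { \V+ }
}

my $m = Outline.parse("* A\n** B\n** C\n* D\n");
say $m<section>.elems;               # number of top-level sections (A and D)
say $m<section>[0]<section>.elems;   # number of sub-sections under A (B and C)
```

The assertion `<?{ ... }>` makes a `section` fail when the heading is not deep enough, so the heading is left for an enclosing level to pick up; no logic outside the grammar is needed for the nesting.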
tope | ah, thanks, I'll see if I can procure it and add it to my bedtime reading list | 18:43 | |
Anton Antonov | Here is my org-mode grammar so far: | 18:44 | |
``` | |||
grammar OrgMode { | |||
regex TOP { [ <empty-line> | <org-section-header> | <line-spec> ]* % \v } | |||
regex org-section-header { | |||
| <org-section-header-not-tags> \h+ <org-tag-list> | |||
| <org-section-header-not-tags> } | |||
regex org-section-header-not-tags { <org-stars-spec> [\h+]? <org-todo-marker>? [\h+]? <text> } | 18:45 | ||
token org-stars-spec { '*'+ } | |||
regex text { [\V]+ } | |||
token org-todo-marker { 'TODO' | 'DONE' | 'CANCEL' } | |||
token org-tag { ':' \w+ } | |||
```
Here is a parsing result of org-mode text:
```
｢** TODO First section :TAG1:TAG2
** TODO Second section :TAG1:TAG3
- This text 1
```
@tobe This package of mine evaluates Raku code sections in both Markdown and org-mode: modules.raku.org/dist/Text::CodePr...an:ANTONOV | 18:49 | ||
@tobe Do you use Babel-org-mode ? | 18:50 | ||
tope | @Anton Antonov#7232 thanks for the link, I'll look at that too! as for babel I'm not sure -- is babel just the part that allows you to evaluate code directly in org-mode? If so, then yes -- I've used it extensively to write literate documents with code. but if it's something else then no, I don't think so | 19:33 | |
Anton Antonov | @tobe -- Yes, that Babel is for "literate programming." | 19:34 | |
tope | @Anton Antonov#7232 btw it's `tope` not `tobe` so I missed the @ notifications. but yes, cool, then I do. (I went through and solved all the problems on cryptohack.org interactively in an org document, for example.) Though I'm far from an emacs expert, I'm mostly just using doom-emacs + a few personal settings. | 19:37 | |
Anton Antonov | @tope#9134 Sorry for misspelling your identifier / name. | 19:40 | |
tope | no problem | ||
Anton Antonov | @tope#9134 As for cryptography -- have you tried to use Mathematica for cryptography? I am a big fan of Mathematica and I have programmed several types of connections between Mathematica and Raku. (Documenting them right now...) | 19:42 | |
So, basically I want to use literate programming with Raku through Mathematica notebooks. | |||
tope | Nah, I haven't, but I'm very comfortable with Python/SageMath and have built up a ton of personal tooling for cryptography-related stuff over some years where I've participated in CTF competitions, so never felt the need to look elsewhere | 19:47 | |
Anton Antonov | Ok, I am interested in comparing programming languages over different computational workflows. I plan to design a conversational agent for Cryptography workflows in the next few months. (Using Raku of course.) So, I might ask for input from you. | 19:49 | |
tope | have a lot of writeups for crypto-related problems at franksh.gitlab.io/ctf/ tho the old articles are probably a bit quirky since they were forcibly converted from org-mode documents to markdown+katex. | ||
(and they're probably not very useful to people who didn't play those CTFs and are thus not familiar with the problems) | 19:51 | ||
Anton Antonov | Ah, back to org-mode?! 🙂 See this please: github.com/alainbebe/org-mode-gtk.raku | ||
tope | ah yes, that's great, thanks | ||
thowe | what does -Ofun mean? | 21:56 | |
as seen in the weekly news | |||
gfldex | thowe: Raku is optimised for fun. | 22:02 | |
thowe | I get that, but what is "-O" ? | 22:05 | |
is that not a Raku lang thing? | |||
I feel that is some kind of idiom or inside joke that had something to do with Raku, but I don't see that when trying to search the docs for it. | 22:06 | ||
gfldex | gcc -O3 your-file.cc | 22:22 | |
tope | syntax question: | ||
``` | |||
> say(<a b> X~ ^2); say(<a b>X~^2) | |||
(a0 a1 b0 b1) | |||
(S P) | |||
``` | |||
the first result I understand (and expected); the second result I have no idea what it means / what happens. | |||
gfldex | c++ is clearly optimised so you need 3 ppl to review your code :) | 22:23 | |
gfldex | m: dd <a b>X~^2; | 22:34 | |
m: dd <a b>(X~^2); | 22:36 | ||
thowe | Ah, compiler switch. Thanks. makes sense. | ||
tope | yeah never mind I'm dumb, I thought X~ was a separate operator, but X is a metaop, which modifies the infix `~^` to apply across the lists, basically xoring the chars | 22:40 | |
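A small sketch of the two parses discussed above, with spacing added to make the operator boundaries visible. The first two outputs are the ones reported in the chat; the last two lines spell out the per-character XOR.

```raku
say(<a b> X~ ^2);    # (a0 a1 b0 b1)  cross-concatenate with the range ^2 (0, 1)
say(<a b> X~^ 2);    # (S P)          cross with ~^, string bitwise XOR
say('a' ~^ '2');     # S              0x61 XOR 0x32 = 0x53
say('b' ~^ '2');     # P              0x62 XOR 0x32 = 0x50
```

Without whitespace the parser picks the longest operator it can, which is why `X~^2` is read as `X` applied to `~^` rather than `X~` followed by `^2`.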