🦋 Welcome to the MAIN() IRC channel of the Raku Programming Language (raku.org). Log available at irclogs.raku.org/raku/live.html . If you're a beginner, you can also check out the #raku-beginner channel! Set by lizmat on 6 September 2022. |
|||
00:52
Manifest0 left
01:01
MasterDuke joined
01:19
Chanakan joined
02:00
hulk joined,
kylese left
02:03
Guest24 joined
02:05
Guest24 left
02:15
hulk left,
kylese joined
02:40
eseyman left
02:42
eseyman joined
04:46
ACfromTX joined
05:30
Sgeo left
06:02
Chanakan5591 joined
06:03
Chanakan left
07:17
wayland joined
|
|||
wayland | [Coke]: Regarding tables, the short version is think of something like an SQL table. The medium version is that I want to write a more generic class that can use the same interface for in-memory tables, CSV files, SQL tables, and the like. The long version is a journey that starts at wayland.github.io/table-oriented-p...n/What.xml . The really long version starts at wayland.github.io/table-oriented-programming/ . | 07:23 | |
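The "same interface for in-memory tables, CSV files, SQL tables" idea in the message above can be sketched roughly as follows. This is only an illustration in Python, not wayland's actual design: the names `Table`, `MemoryTable`, `CsvTable`, `rows()` and `select()` are all hypothetical, and the sample data is made up.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List


class Table(ABC):
    """Hypothetical common interface: every backend yields rows as dicts."""

    @abstractmethod
    def rows(self) -> Iterator[Dict[str, str]]:
        ...

    def select(self, **conditions: str) -> List[Dict[str, str]]:
        # Generic filter written once and reused by every backend.
        return [r for r in self.rows()
                if all(r.get(k) == v for k, v in conditions.items())]


class MemoryTable(Table):
    """Backend over a plain list of row dicts."""

    def __init__(self, data: List[Dict[str, str]]) -> None:
        self._data = data

    def rows(self) -> Iterator[Dict[str, str]]:
        return iter(self._data)


class CsvTable(Table):
    """Backend over CSV text; the header line supplies the column names."""

    def __init__(self, text: str) -> None:
        self._text = text

    def rows(self) -> Iterator[Dict[str, str]]:
        return csv.DictReader(io.StringIO(self._text))


mem = MemoryTable([{"name": "ann", "lang": "raku"},
                   {"name": "bob", "lang": "perl"}])
csv_table = CsvTable("name,lang\nann,raku\nbob,perl\n")

# The same call works regardless of where the rows come from.
mem_hits = mem.select(lang="raku")
csv_hits = csv_table.select(lang="raku")
```

An SQL backend would, under the same assumption, only need its own `rows()` (or an overridden `select()` that pushes the filter down to the database).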
tellable6 | 2024-06-05T06:43:59Z #raku <ab5tract> wayland76: why would the use of | be a requirement for using RakuAST? Your code could have used unshift, for example | ||
wayland | ab5tract: I got the same results with unshift, I'm pretty sure. | ||
ab5tract: No, oddly, unshift seems to do nothing. I tried using @astparams.unshift, and I got the exact same results as if I used @astparams (ie. they're still in [ ] ). I even tried @astparams.unshift.unshift, and that was also the same as just @astparams. | 07:46 | ||
07:48
derpydoo joined
|
|||
wayland | librasteve: Oh, nice! That looks like it's for array-oriented programming instead of table-oriented programming, but I'd like to see all branches of the data-oriented programming family (Array, Table, and Tree) implemented in Raku :) | 07:53 | |
07:54
jpn joined
|
|||
wayland | librasteve: After more of a look at the DataFrames stuff, it looks like it's going in the direction I was going in before I started trying to integrate Logic Programming into the whole concept. | 07:58 | |
But it looks like it's covering some of the Table-Oriented stuff. Not all, by any means, but definitely some. | 08:00 | ||
And, I'm definitely stronger on table-oriented and even tree-oriented than on array-oriented. | |||
08:00
Manifest0 joined
08:01
sena_kun joined
|
|||
wayland | How would you feel about backends for Postgres and the like? | 08:01 | |
08:14
dakkar joined
09:30
haxxelotto joined
|
|||
antononcube | @wayland I also deal with tabular data in Raku. I am using “datasets” not data frames. And instead of Logic Programming, I use a natural DSL. Doing data wrangling with Raku though, is more of an “afterthought” — I use Raku to generate data wrangling code for other languages. | 10:00 | |
Here are the definitions: discord.com/channels/5384078799804...1060760706 | 10:03 | ||
11:31
derpydoo left
|
|||
wayland | That link seems to have been converted into something that only works in discord, from what I can see. | 11:33 | |
antononcube | @wayland Sorry. Here is the corresponding text: > - A dataset is a table that as a data structure is most naturally interpreted as an array of hashes, each hash representing a row in the table. > - A data frame is a table that as a data structure is most naturally interpreted as an array of hashes, each hash representing a column in the table. | ||
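The row-wise versus column-wise distinction in the two definitions above can be illustrated with a small sketch. It is written in Python rather than Raku, the sample data is invented, and the column layout (one hash per column, keyed by row position and stored under the column name) is an assumed reading of the definition, not something taken from Data::Reshapers or Dan.

```python
from typing import Any, Dict, List

# Dataset reading: an array of hashes, one hash per row.
dataset: List[Dict[str, Any]] = [
    {"name": "ann", "age": 31},
    {"name": "bob", "age": 27},
]


def to_columns(rows: List[Dict[str, Any]]) -> Dict[str, Dict[int, Any]]:
    """Build one hash per column, keyed by row position (assumed layout)."""
    columns: Dict[str, Dict[int, Any]] = {}
    for i, row in enumerate(rows):
        for key, value in row.items():
            columns.setdefault(key, {})[i] = value
    return columns


def to_rows(columns: Dict[str, Dict[int, Any]]) -> List[Dict[str, Any]]:
    """Inverse: rebuild the array of row hashes (assumes rows are numbered 0..n-1)."""
    n = max((len(col) for col in columns.values()), default=0)
    return [{key: col[i] for key, col in columns.items() if i in col}
            for i in range(n)]


columns = to_columns(dataset)
round_trip = to_rows(columns)
```

Both layouts hold the same table; the difference is whether a single lookup hands back a whole row or a whole column.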
wayland | Oh, thanks! | 11:35 | |
Well, I regard it as a flaw that Raku, one of the most multi-paradigm languages in the world (behind only Julia and Wolfram, according to Wikipedia) supports zero of the data-oriented paradigms (though to be fair, the data-oriented paradigms have not represented themselves well). I'm attempting to rectify that flaw :) . | 11:39 | ||
To be fair, it *does* have most of the stuff needed to do Array-Oriented Programming. | 11:42 | ||
antononcube | The workflows described (contained) in this flowchart are fully supported in Raku: raw.githubusercontent.com/antononc...kflows.jpg | 11:47 | |
@wayland I read the Table Oriented Programming (TOP) Why.html -- I am not sure whether this project is Raku-only, Raku-centric, or neither. (Meaning, cross-language.) | 11:53 | ||
wayland | Oh, yes, I agree all these things can be done (it being Turing-complete), but as far as making the easy things easy, I don't think it does well. | ||
Regarding the Table-Oriented programming, apologies that that website is poorly designed (I did it). It functions as a few related websites. So the sections (accessible from the top menu) are basically a) General table-oriented programming (ie. advocacy for the paradigm), b) Raku TOP (table-oriented programming), which is what I was hoping to do before I got sidetracked by the declarative stuff, and c) the RAD section (which is much less fleshed out than the other parts) | 11:56 | ||
So the general Table-Oriented Programming section is mostly cross-language. The Raku TOP section is Raku-specific, but may make a lot more sense after reading the general TOP section. | 12:03 | ||
antononcube | @wayland “[…] I agree all these things can be done […] but as far as making the easy things easy, I don't think it does well.” — I would argue it does pretty well for small to medium size data with “Data::Reshapers”, which is a pure Raku implementation. | 12:20 | |
I do have problems with the ingestion speed — even ability — to ingest large data files. | 12:21 | ||
Well, my data size ingestion observations, are actually at least a year old… | 12:22 | ||
wayland | antononcube: OK, maybe I just need to learn the Raku non-core packages better :) . | ||
12:35
haxxelotto left
12:37
xinming left
12:38
xinming joined
12:41
haxxelotto joined
12:48
wayland76 joined
12:49
wayland left
12:57
jpn left
13:09
jpn joined
|
|||
antononcube | @wayland I am curious what would happen if your TOP writings are fed to an LLM. | 13:12 | |
13:15
goblin joined,
HobGoblin left
|
|||
librasteve | wayland: not sure that you were around at the time, but it took a while for me to compare / contrast anton's Data::Shapers (which I gather are similar to Mathematica DataSets) and my Dan, Dan::Pandas, Dan::Polars modules (which are a work in progress to support the long established DataFrame and Series structures found in Python Pandas and Rust Polars). The idea is that Dan provides a raku only abstraction | 13:29 | |
tellable6 | librasteve, I'll pass your message to wayland | ||
for DataFrame and Series, Dan::Pandas lets you hook into the Python library to get fast processing and Dan::Polars similar to Rust Polars. My mental model is that Data::Shapers are awesome for data wrangling - by which I mean a loosely structured and evolving workflow, often in a Jupyter style notebook setting, to rapidly iterate to converge on the best way to extract meaningful results from a dataset, so the emphasis is on a | |||
tool to make very flexible, decantable, hierarchical data structures "on the fly". In contrast, the Dan family is intended to ape the much more structured DataFrame / Series model that can drive fast, vector execution for large datasets in the setting of performant libraries. So, I feel that the two approaches are complementary, with Data::Shapers helping to speed progress towards the best analytic workflow and Dan::xx helping | |||
to conform the data to structures that can be fed to the fast execution engines. I would say that for raku to support Data Science workflows, Dan is kind of table stakes to avoid sucking at execution speed and Data::Shapers is a super power that makes raku much more effective for whipping up models in the wrangling phase. If you look at the Nutshell here github.com/librasteve/raku-Dan-Polars, you see that there is a | |||
dedicated method so you can go df.to-dataset to export a DataFrame to an Anton compatible DataSet. | |||
librasteve | since I think that you are asking whether it would make sense to have a TOP (I skim read the "Why" doc you linked), working with a Dan DataFrame, my answer is a resounding "don't know" - I do know that there are some Pandas libraries that allow SQL to be run against a Pandas DataFrame, but I suspect that take up of these is a tiny fraction of the Data Science / Pandas world ... the reason afaik that Pandas won is | 13:33 | |
that despite the pain of columnar data from a coding point of view, the execution engines are fast | |||
apart from that TOP looks very interesting and I look forward to playing with it ... I think that raku's mutability and Slang abilities makes it a great choice (and I suppose you can hook in a real database under the hood to make it go fast) | 13:35 | ||
@antononcube this is my first look at Data::TypeSystem, thanks for sharing! It looks quite interesting ... since Wolfram says Dataset[data] represents a structured dataset based on a hierarchy of lists and associations. this is very familiar to perl and raku coders with Arrays and Hashes being able to represent almost anything found in the wild. I am a bit wary that you have added a bunch of "near raku" types | 13:39 | ||
like Tuples, Atoms ... but I trust that these are the "right" selection for the problem domain. | |||
antononcube | The review of my Raku-dataset-wrangling efforts by @librasteve is correct. (Except, the package is "Data::Reshapers", not "Data::Shapers".) | 13:40 | |
librasteve | sorry | ||
antononcube | I named it after a popular R-package "reshape2" : seananderson.ca/2013/10/19/reshape/ | 13:41 | |
@librasteve "I am a bit wary that you have added a bunch of "near raku" types like Tuples, Atoms [...]" -- Well, I had no idea what is a good design / names, and I had to make an MVP in order to get some other projects "moving forward." And, yes, I mostly followed Mathematica's conventions. Those types though should not be that hard to change or restyle. | 13:44 | ||
As for Wolfram Language (WL) documentation on dataset -- yeah, WL-datasets are general data structures, not just tabular ones. To some extent, I would say, that is why teaching WL-datasets is cumbersome. (Well, compared to R-data-frames and Python-pandas.) | 13:46 | ||
Actually, I find Python-pandas cumbersome compared to R-data-frames, too. | |||
librasteve | I think it's fairly common in raku land to have types that fill in the gaps (that often get adopted into core in one way or another), Sets, Bags, Mixes; Tuples; ValueTypes and so on … so perhaps not that unexpected to extend the type classes in play when one is extending the features in module space… | 13:48 | |
14:27
MasterDuke left
14:52
haxxelotto left
15:10
Sgeo joined
15:29
soverysour joined,
soverysour left,
soverysour joined
16:12
soverysour left
16:34
dakkar left
16:49
soverysour joined,
soverysour left,
soverysour joined
17:07
rir left
17:12
teatime left
17:16
haxxelotto joined
17:25
teatime joined,
teatime left
17:26
teatime joined
17:35
soverysour left
18:05
soverysour joined,
soverysour left,
soverysour joined
18:11
jpn left
18:15
haxxelotto left
18:52
soverysour left
19:32
haxxelotto joined
20:23
jpn joined
20:58
jpn left
|
|||
wayland76 | Other reading I've found useful/interesting in this discussion: a) phoenixnap.com/kb/rdd-vs-dataframe-vs-dataset because it talks about Spark 2 merging dataframes and datasets, but that typed/untyped languages access via different APIs, which means Raku could use either, and b) en.wikipedia.org/wiki/Array_programming for those who haven't run across the term (we probably all have, but just in case). | 21:25 | |
tellable6 | hey wayland76, you have a message: gist.github.com/108d4b8653000dc562...647d42e790 | ||
wayland76 | antononcube: Any reading I've done about datasets/dataframes seems to be specific to Apache Spark. Are they the same in Wolfram? | 21:32 | |
librasteve: Yeah, I feel like both your Dan, and Array-Oriented Programming belong to the "Make it fast" space. Table-oriented programming was originally seen as a competitor to object-oriented programming (though mainly successful amongst business programming types), but my site re-envisions it as one paradigm among many that sits on top of OOP. | 21:37 | ||
antononcube | @wayland76 "Are they the same in Wolfram?" -- No. Datasets both in WL and Raku are just arrays of hashes. Or hashes of hashes. | 21:39 | |
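As a rough illustration of the "arrays of hashes, or hashes of hashes" remark, the sketch below shows the same small table in both shapes and converts between them. It is in Python rather than Raku or WL, the data is invented, and the helper names `flatten` and `index_by` are hypothetical.

```python
from typing import Any, Dict, List

# Hash of hashes: the outer keys act as row identifiers.
by_id: Dict[str, Dict[str, Any]] = {
    "r1": {"name": "ann", "age": 31},
    "r2": {"name": "bob", "age": 27},
}


def flatten(hoh: Dict[str, Dict[str, Any]], key_name: str = "id") -> List[Dict[str, Any]]:
    """Turn a hash of hashes into an array of hashes, keeping the outer key as a column."""
    return [{key_name: k, **row} for k, row in hoh.items()]


def index_by(rows: List[Dict[str, Any]], key_name: str = "id") -> Dict[str, Dict[str, Any]]:
    """Turn an array of hashes back into a hash of hashes keyed by one column."""
    return {row[key_name]: {k: v for k, v in row.items() if k != key_name}
            for row in rows}


flat = flatten(by_id)
restored = index_by(flat)
```

The array form keeps row order explicit, while the hash-of-hashes form gives direct access to a row by its key.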
@wayland76 see the examples here: cdn.discordapp.com/attachments/633...2b8cb& | 21:40 | ||
wayland76 | The TOP sector has had 3 main branches a) SQL databases b) Spreadsheets, and c) general-purpose TOP languages (eg. dBase, Clipper, xHarbour), but this last group has languished because everyone (rightly) went in an OO direction; I'd like to try to bring the advantages of that third group to Raku. | ||
Right. So Googling Dataframe vs. Dataset was counter-productive then :) | 21:42 | ||
librasteve: Not sure what you mean by "table stakes" | 21:44 | ||
21:47
bpalmer joined
|
|||
wayland76 | Interestingly, I was planning to use names like Tuple and Atom in TOP as well. Well, Tuple anyway: wayland.github.io/table-oriented-p...uction.xml -- for atom, I was planning to make it a control structure: wayland.github.io/table-oriented-p...olFlow.xml | 21:54 | |
22:13
haxxelotto left
|
|||
[Coke] | table stakes is the minimum to be considered a viable option as that thing. | 22:49 | |
brandmarketingblog.com/articles/br...-business/ | 22:51 | ||
wayland76 | Oh, that kind of table! I've been thinking about table-oriented programming for too long :) | ||
Thanks! | |||
22:54
sena_kun left,
wayland76 left
|