This channel is intended for people just starting with the Raku Programming Language (raku.org). Logs are available at irclogs.raku.org/raku-beginner/live.html Set by lizmat on 8 June 2022. |
|||
00:04
Heptite left
00:46
apathor joined
00:55
MasterDuke joined
01:05
Manifest0 left
02:16
hythm joined
|
|||
codesections | Oh, interesting – if I run the version I posted on gfldex's data, I get segfaults. Fun | 05:26 | |
They go away if I drop the `hyper`, though | |||
I guess it's sharing arrays in a non-threadsafe way, oops | |||
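A minimal sketch of the point about `hyper` and shared arrays, not the code codesections actually ran (that code is not in this log). The names `@shared` and `@doubled` are made up for illustration. The idea is that each `hyper` worker returns a value and the results are collected afterwards, instead of several workers pushing into one array at the same time.

```raku
# Risky pattern, shown only as a comment: several hyper workers
# pushing into the same array concurrently.
#   my @shared;
#   (1..10_000).hyper.map({ @shared.push($_ * 2) });

# Safer pattern: each call only returns a value; nothing is shared.
my @doubled = (1..10_000).hyper.map({ $_ * 2 });

say @doubled.elems;      # number of results collected
say @doubled[0, 1, 2];   # hyper keeps the input order (race would not)
```

If a shared structure really is needed, wrapping the update in `Lock.protect` is one option, at the cost of serializing that part of the work.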
06:29
hythm left
|
|||
gfldex | codesections: aye, recursion gets you separate stacks and thus non-sharing data structures for free :) | 06:39 | |
However, `start` and Phasers appear a bit wonky right now. | |||
06:56
ab5tract joined
07:09
ab5tract left
07:11
ab5tract joined
07:17
kjp left
07:19
kjp joined
07:59
famra left,
jerome7059 joined
|
|||
jerome7059 | Hello | 07:59 | |
yabobay | hi | ||
jerome7059 | Are there any resources to use parquet files in rakudo? | 08:00 | |
yabobay | doesn't seem to be | 08:01 | |
but you can use Inline::Python or Inline::Perl5 to use a module from those languages | |||
jerome7059 | Ok thanks | 08:02 | |
08:17
jerome7059 left
08:23
lizmat left,
lizmat joined,
MasterDuke left
|
|||
nemokosch | What are parquet files? | 09:25 | |
yabobay | a weird data format by apache that apparently exists | 09:26 | |
nemokosch | What is it used for? | 09:27 | |
yabobay | idk you just enter data into it | 09:29 | |
kjp | I ran across the parquet file format a couple of weeks ago when the Overture Maps project had their first data release. The data is stored on AWS S3 in parquet files. For this purpose it stores rather large amounts of data. | 09:38 | |
I had the vague thought of writing some Raku to read it, but haven't gone beyond that yet. | 09:39 | ||
nemokosch | Hadoop, Pandas kind of stuff | ||
It would be useful probably, I remember the Arrow format as something that you need to care about if you want to wrap Pandas | 09:40 | ||
kjp | I don't think so; just the raw data. The download instructions use DuckDB to access the actual data. | ||
The format is described at parquet.apache.org/docs/file-format | 09:42 | ||
nemokosch | Maybe it wasn't Pandas but NumPy, I don't remember much about that part. What I do remember is that librasteve wrapped a Python data library and the interface to the data didn't make much sense without understanding Arrow | ||
kjp | Sounds like our memories are a bit vague. | 09:43 | |
nemokosch | It was Polars actually 😆 | 09:45 | |
09:49
Manifest0 joined
|
|||
kjp | Ah, that's right -- it stores data in columnar format. Some people seem to like the idea. | 09:51 | |
nemokosch | Apparently this parquet format is a possible "backend" of Arrow, and Arrow is quite big in the dataframe scene | 09:52 | |
kjp | That's not an area I'm particularly familiar with. | 09:58 | |
nemokosch | same 😄 | 10:22 | |
10:35
ab5tract left
11:24
ab5tract joined
11:26
lizmat left
12:19
NemokoschKiwi joined
12:51
NemokoschKiwi left
13:04
ab5tract left
|
|||
rcmlz | It helps me, but I was focussed on parallelizing the recursion on the Less and More part, not during the classification. PS: I believe a solution using just Before/After and a single pivot element is only correct for lists of unique elements. | 13:35 | |
The spawn() part is cool - I was worried that if I use Promises manually on Less and More, I will create too many processes (that's why I thought using map() is better). Thank you @codesections and @gfldex, I will re-think my approach and come back. | 13:39 | ||
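A small sketch of the point about duplicates, assuming "Before/After" refers to Raku's `before` and `after` operators. The sample data and variable names are made up. Partitioning with only those two tests keeps a single copy of the pivot, while classifying by `cmp` also yields a `Same` bucket that holds every copy.

```raku
my @input = 3, 1, 3, 2, 3;
my $pivot = 3;

# Two-way split: elements equal to the pivot fall into neither list.
my @less = @input.grep({ $_ before $pivot });
my @more = @input.grep({ $_ after  $pivot });
say @less.elems + 1 + @more.elems;   # 3, although the input has 5 elements

# Three-way split: cmp returns Less, Same or More.
my %parts = @input.classify({ $_ cmp $pivot });
say %parts<Same>.elems;              # 3, every copy of the pivot is kept
```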
14:13
ab5tract joined
14:27
Heptite joined
16:07
elcaro left,
elcaro joined
|
|||
rcmlz | Here is what I got now using your hints: sub quicksort-recursive-parallel(@input) { return @input if @input.elems < 2; my $pivot = @input.pick; @input.hyper.classify(-> $element { $element cmp $pivot }) andthen { my %partiton = $_; my $less = start { %partiton{Less}:exists ?? samewith(%partiton{Less}) !! [] }; my $more = start { %partiton{More}:exists ?? | 16:22 | |
samewith(%partiton{More}) !! [] }; await $less, $more; |$less.result, |%partiton{Same}, |$more.result } } I also put a more evolved version, which tunes concurrency and limits the number of parallel threads, at gist.github.com/rcmlz/552726f93e01...159947901b - that was fun, thank you @codesections and @gfldex. | |||
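For readability, here is the same idea laid out over several lines. This is a sketch rather than rcmlz's exact code: the name `quicksort-sketch` is made up, the recursion is by name instead of `samewith`, and the `hyper` on the classification step is left out because of the segfaults mentioned earlier in the log. It starts one Promise per partition with no upper bound, so it is only meant for small inputs.

```raku
sub quicksort-sketch(@input) {
    # Lists of zero or one element are already sorted.
    return @input if @input.elems < 2;

    my $pivot = @input.pick;

    # Three-way partition: the keys are Less, Same and More.
    my %partition = @input.classify({ $_ cmp $pivot });

    # Sort both sides concurrently; a missing bucket becomes an empty array.
    my $less = start { quicksort-sketch(%partition<Less> // []) };
    my $more = start { quicksort-sketch(%partition<More> // []) };
    await $less, $more;

    # Slip the three parts into one flat list.
    return |$less.result, |%partition<Same>, |$more.result;
}

say quicksort-sketch([3, 1, 3, 2, 3, 5, 4]);
```

The `Same` bucket always exists because the pivot is picked from the input itself.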
16:49
ab5tract left,
ab5tract joined
18:04
Heptite left
20:58
lizmat joined
22:45
saint- joined
22:49
MasterDuke joined
23:25
Manifest0 left
|