This channel is intended for people just starting with the Raku Programming Language ( Logs are available at
Set by lizmat on 8 June 2022.
00:04 Heptite left 00:46 apathor joined 00:55 MasterDuke joined 01:05 Manifest0 left 02:16 hythm joined
codesections Oh, interesting – if I run the vesion I posted on gfldex's data, I get segfaults. Fun 05:26
They go away if I drop the `hyper`, though
I guess it's sharing arrays in a non-theadsafe way, oops
06:29 hythm left
gfldex codesections: aye, recursion gets you separate stacks and thus non-sharing data structures for free :) 06:39
However, `start` and Phasers appear a bit wonky right now.
06:56 ab5tract joined 07:09 ab5tract left 07:11 ab5tract joined 07:17 kjp left 07:19 kjp joined 07:59 famra left, jerome7059 joined
jerome7059 Hello 07:59
yabobay hi
jerome7059 Are there any resources to use parquet files in rakudo? 08:00
yabobay doesn't seem to be 08:01
but you can use Inline::Python or Inline::Perl5 to use a module from those languages
jerome7059 Ok thanks 08:02
08:17 jerome7059 left 08:23 lizmat left, lizmat joined, MasterDuke left
nemokosch What are parquet files? 09:25
yabobay a weird data format by apache that apparently exists 09:26
nemokosch What is it used for? 09:27
yabobay idk you just enter data into it 09:29
kjp I ran across the parquet file format a couple of weeks when the Overture Maps project had there first data release. The data is stored on AWS S3 in parquet files. For this purpose it stores rather large amounts of data. 09:38
I had the vague thought of writing some Raku to read it, but haven't gone beyopnd that yet. 09:39
nemokosch Hadoop, Pandas kind of stuff
It would be useful probably, I remember the Arrow format as something that you need to care about if you want to wrap Pandas 09:40
kjp I don't think so; juist the raw data. The download instructions use DuckDB to access the actual data.
The format is described at 09:42
nemokosch Maybe it wasn't Pandas but NumPy, I don't remember much about that part. What I do remember is that librasteve wrapped a Python data library and the interface to the data didn't make much sense without understanding Arrow
kjp Sounds like our memories are a bit vague. 09:43
nemokosch It was Polars actually 😆 09:45
09:49 Manifest0 joined
kjp Ah, that's right -- it stores data in columnar format. Some people seem to like the idea. 09:51
nemokosch Apparently this parquet format is a possible "backend" of Arrow, and Arrow is quite big in the dataframe scene 09:52
kjp That's ot an area I'm particularly familiar with. 09:58
nemokosch same 😄 10:22
10:35 ab5tract left 11:24 ab5tract joined 11:26 lizmat left 12:19 NemokoschKiwi joined 12:51 NemokoschKiwi left 13:04 ab5tract left
rcmlz It helps me, but I was focussed on parallizing the recursion on the Less and More part, not during the classification. PS: I beliefe a solution using just Before/After and a single pivot element is only correct for list of unique elements. 13:35
The spawn() part is cool - I was worried if I use Promises manually on Less and More, I will create to many processes. (thats why I thought using map() is better). Thank you @codesections and @gfldex, I will re-think my approach an come back. 13:39
14:13 ab5tract joined 14:27 Heptite joined 16:07 elcaro left, elcaro joined
Here is what I got now using your hints sub quicksort- recursive-parallel(@input) { return @input if @input.elems < 2; my $pivot = @input.pick; @input.hyper.classify(-> $element { $element cmp $pivot }) andthen { my %partiton = $_; my $less = start { %partiton{Less}:exists ?? samewith(%partiton{Less}) !! [] }; my $more = start { %partiton{More}:exists ?? 16:22
samewith(%partiton{More}) !! [] }; await $less, $more; |$less.result, |%partiton{Same}, |$more.result } } I put also a more evolved version tuning concurrency and limiting the number of parallel threads into - that was fun, thank you @codesections and @gfldex.
16:49 ab5tract left, ab5tract joined 18:04 Heptite left 20:58 lizmat joined 22:45 saint- joined 22:49 MasterDuke joined 23:25 Manifest0 left