#raku-beginner on 29 September 2023 - Raku Programming Language Log

This channel is intended for people just starting with the Raku Programming Language (raku.org). Logs are available at irclogs.raku.org/raku-beginner/live.html Set by lizmat on 8 June 2022.
00:04 Heptite left 00:46 apathor joined 00:55 MasterDuke joined 01:05 Manifest0 left 02:16 hythm joined
codesections	Oh, interesting – if I run the version I posted on gfldex's data, I get segfaults. Fun	05:26
	They go away if I drop the `hyper`, though
	I guess it's sharing arrays in a non-thread-safe way, oops
06:29 hythm left
gfldex	codesections: aye, recursion gets you separate stacks and thus non-sharing data structures for free :)	06:39
	However, `start` and Phasers appear a bit wonky right now.
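A minimal sketch of the thread-safety point above. It assumes the crashes came from several `hyper` workers pushing into one shared array, which is codesections' guess and is not confirmed in the log. The names `@numbers`, `@shared` and `@squares` are illustrative only.

```raku
my @numbers = 1..1000;

# Risky pattern (left commented out): every worker pushes into the same array,
# and Array.push is not thread-safe.
# my @shared;
# @numbers.hyper.map({ @shared.push: $_ ** 2 });

# Safer: each worker only returns a value and never touches shared state.
# hyper delivers the results in the same order as the input.
my @squares = @numbers.hyper.map({ $_ ** 2 });

say @squares.elems;
```

The design choice is the same one gfldex describes for recursion: keep the data each worker mutates private, and combine results only after the workers are done.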
06:56 ab5tract joined 07:09 ab5tract left 07:11 ab5tract joined 07:17 kjp left 07:19 kjp joined 07:59 famra left, jerome7059 joined
jerome7059	Hello	07:59
yabobay	hi
jerome7059	Are there any resources to use parquet files in rakudo?	08:00
yabobay	doesn't seem to be	08:01
	but you can use Inline::Python or Inline::Perl5 to use a module from those languages
jerome7059	Ok thanks	08:02
08:17 jerome7059 left 08:23 lizmat left, lizmat joined, MasterDuke left
nemokosch	What are parquet files?	09:25
yabobay	a weird data format by apache that apparently exists	09:26
nemokosch	What is it used for?	09:27
yabobay	idk you just enter data into it	09:29
kjp	I ran across the parquet file format a couple of weeks ago when the Overture Maps project had their first data release. The data is stored on AWS S3 in parquet files. For this purpose it stores rather large amounts of data.	09:38
	I had the vague thought of writing some Raku to read it, but haven't gone beyond that yet.	09:39
nemokosch	Hadoop, Pandas kind of stuff
	It would be useful probably, I remember the Arrow format as something that you need to care about if you want to wrap Pandas	09:40
kjp	I don't think so; just the raw data. The download instructions use DuckDB to access the actual data.
	The format is described at parquet.apache.org/docs/file-format	09:42
nemokosch	Maybe it wasn't Pandas but NumPy, I don't remember much about that part. What I do remember is that librasteve wrapped a Python data library and the interface to the data didn't make much sense without understanding Arrow
kjp	Sounds like our memories are a bit vague.	09:43
nemokosch	It was Polars actually 😆	09:45
09:49 Manifest0 joined
kjp	Ah, that's right -- it stores data in columnar format. Some people seem to like the idea.	09:51
nemokosch	Apparently this parquet format is a possible "backend" of Arrow, and Arrow is quite big in the dataframe scene	09:52
kjp	That's not an area I'm particularly familiar with.	09:58
nemokosch	same 😄	10:22
10:35 ab5tract left 11:24 ab5tract joined 11:26 lizmat left 12:19 NemokoschKiwi joined 12:51 NemokoschKiwi left 13:04 ab5tract left
rcmlz	It helps me, but I was focused on parallelizing the recursion on the Less and More part, not during the classification. PS: I believe a solution using just Before/After and a single pivot element is only correct for lists of unique elements.	13:35
	The spawn() part is cool - I was worried that if I use Promises manually on Less and More, I will create too many processes (that's why I thought using map() is better). Thank you @codesections and @gfldex, I will re-think my approach and come back.	13:39
14:13 ab5tract joined 14:27 Heptite joined 16:07 elcaro left, elcaro joined
	Here is what I got now using your hints: sub quicksort-recursive-parallel(@input) { return @input if @input.elems < 2; my $pivot = @input.pick; @input.hyper.classify(-> $element { $element cmp $pivot }) andthen { my %partiton = $_; my $less = start { %partiton{Less}:exists ?? samewith(%partiton{Less}) !! [] }; my $more = start { %partiton{More}:exists ??	16:22
	samewith(%partiton{More}) !! [] }; await $less, $more; |$less.result, |%partiton{Same}, |$more.result } } I also put a more evolved version, tuning concurrency and limiting the number of parallel threads, into gist.github.com/rcmlz/552726f93e01...159947901b - that was fun, thank you @codesections and @gfldex.
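A sequential sketch of the three-way partition that the function above relies on, written to illustrate rcmlz's earlier remark about duplicate elements. It is not rcmlz's parallel version: `quicksort-three-way` is a hypothetical name, and plain `grep` calls stand in for `hyper` and `classify`. Elements equal to the pivot go into their own `@same` bucket, so they are neither lost nor recursed on forever.

```raku
sub quicksort-three-way(@input) {
    # Lists of zero or one element are already sorted.
    return @input.List if @input.elems < 2;

    my $pivot = @input.pick;

    # Three buckets: smaller than, equal to, and greater than the pivot.
    my @less = @input.grep({ $_ before $pivot });
    my @same = @input.grep({ ($_ cmp $pivot) == Same });
    my @more = @input.grep({ $_ after $pivot });

    # @same always holds at least the pivot, so both recursive calls shrink.
    |quicksort-three-way(@less), |@same, |quicksort-three-way(@more)
}

say quicksort-three-way([3, 1, 3, 2, 1]).join(', ');   # 1, 1, 2, 3, 3
```

With only a "before" and an "after" bucket, the second `3` and the second `1` in this input would either be dropped or need special handling, which is why the `Same` bucket matters for lists with repeated values.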
16:49 ab5tract left, ab5tract joined 18:04 Heptite left 20:58 lizmat joined 22:45 saint- joined 22:49 MasterDuke joined 23:25 Manifest0 left
