Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> No, polars or spark is not a good answer, those are optimized for data engineering performance, not a holistic approach to data science.

Can you expand on why Polars isn't optimised for a holistic approach to data science?



I have not work with Polars, but I would imagine any incompatibility with existing libraries (e.g. plotting libraries like plotnine, bokeh) would quickly put me off.

It is a curse I know. I would also choose a better interface. Performance is meh to me, I use SQL if i want to do something at scale that involves row/column data.


This is a non-issue with Polars dataframes to_pandas() method. You get all the performance of Polars for cleaning large datasets, and to_pandas() gives you backwards compatibility with other libraries. However, plotnine is completely compatible with Polars dataframe objects.


You can always convert from Polars to Pandas. Plotnine will do it automatically for you, even.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: