Description:
Watch the live stream:
Watch on YouTube
About the show
Sponsored by us! Support our work through:
Brian #1: Polars: Lightning-fast DataFrame library for Rust and Python
- Suggested by several listeners
- “Polars is a blazingly fast DataFrames library implemented in Rust using Apache Arrow Columnar Format as memory model.
- Lazy | eager execution
- Multi-threaded
- SIMD (Single Instruction/Multiple Data)
- Query optimization
- Powerful expression API
- Rust | Python | ...”
- The Python API syntax is set up to allow parallel execution while sidestepping GIL issues, for both lazy and eager use cases. From the docs: "Do not kill parallelization"
The syntax is very functional and pipeline-esque:
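import polars as pl

q = (
    pl.scan_csv("iris.csv")  # lazy scan: nothing is read yet
    .filter(pl.col("sepal_length") > 5)
    .groupby("species")  # renamed to group_by in later Polars releases
    .agg(pl.all().sum())
)
df = q.collect()  # run the optimized query and materialize a DataFrame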
- The Polars User Guide is excellent and appears to be written entirely with Python examples.
- Includes a 30 min intro video from PyData Global 2021
Michael #2: PSF Survey is out
- Have a look; their page summarizes it better than my bullet points will.
Brian #3: Gin Config: a lightweight configuration framework for Python
Michael #4: Performance benchmarks for Python 3.11 are amazing
- via Eduardo Orochena
- Performance may be the biggest feature of all
- Python 3.11 has
- task groups in asyncio
- fine-grained error locations in tracebacks
- the Self type, for annotating methods that return an instance of their class
- The "Faster CPython Project" to speed up the reference implementation.
- See my interview with Guido and Mark: talkpython.fm/339
- Python 3.11 is 10-60% faster than Python 3.10 according to the official figures
- And a 1.22x speed-up on their standard benchmark suite.
- Not arriving as a stable release until October
Extras
Michael:
Joke: Why wouldn't you choose a parrot for your next application?