Polars Sorting Data
last modified March 1, 2025
Polars is a fast and efficient DataFrame library in Python. Sorting data is a common operation in data analysis. This tutorial covers how to sort data in Polars with practical examples.
Sorting helps organize data for better analysis and visualization. Polars
provides methods like sort
and sort_by
for sorting
DataFrames.
Basic Sorting
This example shows how to sort a DataFrame by a single column.
import polars as pl data = { 'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [25, 30, 22, 35] } df = pl.DataFrame(data) sorted_df = df.sort('Age') print(sorted_df)
The sort('Age')
sorts the DataFrame by the 'Age' column in ascending
order. This is useful for organizing data by a specific attribute.
Sorting in Descending Order
This example demonstrates sorting a DataFrame in descending order.
import polars as pl data = { 'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [25, 30, 22, 35] } df = pl.DataFrame(data) sorted_df = df.sort('Age', reverse=True) print(sorted_df)
The sort('Age', reverse=True)
sorts the DataFrame by the 'Age'
column in descending order. This is useful for ranking data.
Sorting by Multiple Columns
This example shows how to sort a DataFrame by multiple columns.
import polars as pl data = { 'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [25, 30, 22, 25], 'Salary': [50000, 60000, 45000, 55000] } df = pl.DataFrame(data) sorted_df = df.sort(['Age', 'Salary']) print(sorted_df)
The sort(['Age', 'Salary'])
sorts the DataFrame first by 'Age' and
then by 'Salary'. This is useful for hierarchical sorting.
Sorting with Null Values
This example demonstrates sorting a DataFrame with null values.
import polars as pl data = { 'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [25, None, 22, 35] } df = pl.DataFrame(data) sorted_df = df.sort('Age', nulls_last=True) print(sorted_df)
The sort('Age', nulls_last=True)
sorts the DataFrame with null
values placed at the end. This is useful for handling missing data.
Sorting by Expression
This example shows how to sort a DataFrame using an expression.
import polars as pl data = { 'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [25, 30, 22, 35] } df = pl.DataFrame(data) sorted_df = df.sort(pl.col('Age') * -1) print(sorted_df)
The sort(pl.col('Age') * -1)
sorts the DataFrame by 'Age' in
descending order using an expression. This is useful for custom sorting logic.
Sorting with Custom Order
This example demonstrates sorting a DataFrame with a custom order.
import polars as pl data = { 'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [25, 30, 22, 35] } df = pl.DataFrame(data) sorted_df = df.sort('Name', descending=[False, True]) print(sorted_df)
The sort('Name', descending=[False, True])
sorts the DataFrame by
'Name' with a custom order. This is useful for specific sorting requirements.
Sorting by Index
This example shows how to sort a DataFrame by its index.
import polars as pl data = { 'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [25, 30, 22, 35] } df = pl.DataFrame(data) sorted_df = df.sort(by='index', reverse=True) print(sorted_df)
The sort(by='index', reverse=True)
sorts the DataFrame by its index
in descending order. This is useful for reordering rows.
Best Practices for Sorting Data
- Understand Data: Analyze data structure before sorting.
- Choose Appropriate Columns: Select columns that align with your analysis goals.
- Handle Null Values: Use
nulls_last
to manage missing data. - Validate Results: Check sorted data for accuracy and consistency.
Source
In this article, we have explored how to sort data in Polars.
Author
List all Polars tutorials.