Polars Datetime Operations
last modified March 1, 2025
Polars is a fast DataFrame library written in Rust with Python bindings. It is designed for efficient data manipulation and analysis. This tutorial covers datetime operations in Polars, with practical examples.
Datetime operations are essential for time series analysis, data filtering, and feature engineering. Polars supports them through native Datetime and Duration data types and the dt expression namespace.
Creating a DataFrame with Datetime
This example shows how to create a Polars DataFrame with a datetime column.
import polars as pl
from datetime import datetime

data = {
    "date": [datetime(2023, 1, 1), datetime(2023, 1, 2), datetime(2023, 1, 3)],
    "value": [10, 20, 30]
}

df = pl.DataFrame(data)
print(df)
The datetime module is used to create datetime objects. Polars stores them in a column with the Datetime data type, which is then available for further analysis.
Filtering by Date
This example demonstrates filtering rows based on a specific date.
import polars as pl
from datetime import datetime

data = {
    "date": [datetime(2023, 1, 1), datetime(2023, 1, 2), datetime(2023, 1, 3)],
    "value": [10, 20, 30]
}

df = pl.DataFrame(data)

filtered_df = df.filter(pl.col("date") == datetime(2023, 1, 2))
print(filtered_df)
The filter method keeps only the rows where the date equals the given value. The comparison is against the full timestamp, so datetime(2023, 1, 2) matches midnight on that day only.
Extracting Date Components
This example shows how to extract year, month, and day from a datetime column.
import polars as pl
from datetime import datetime

data = {
    "date": [datetime(2023, 1, 1), datetime(2023, 1, 2), datetime(2023, 1, 3)],
    "value": [10, 20, 30]
}

df = pl.DataFrame(data)

df = df.with_columns([
    pl.col("date").dt.year().alias("year"),
    pl.col("date").dt.month().alias("month"),
    pl.col("date").dt.day().alias("day")
])
print(df)
The dt.year, dt.month, and dt.day methods extract date components. These are useful for grouping or analysis.
Calculating Date Differences
This example demonstrates calculating the difference between dates.
import polars as pl
from datetime import datetime

data = {
    "start_date": [datetime(2023, 1, 1), datetime(2023, 1, 2)],
    "end_date": [datetime(2023, 1, 3), datetime(2023, 1, 5)]
}

df = pl.DataFrame(data)

df = df.with_columns([
    (pl.col("end_date") - pl.col("start_date")).alias("date_diff")
])
print(df)
The difference between two datetime columns is calculated using subtraction. The result is a column of the Duration type, which is useful for measuring time intervals.
Adding Time Intervals
This example shows how to add a time interval to a datetime column.
import polars as pl
from datetime import datetime, timedelta

data = {
    "date": [datetime(2023, 1, 1), datetime(2023, 1, 2)]
}

df = pl.DataFrame(data)

df = df.with_columns([
    (pl.col("date") + timedelta(days=5)).alias("new_date")
])
print(df)
The timedelta object is used to add a 5-day interval to the datetime column. This is useful for forecasting or scheduling.
Grouping by Date
This example demonstrates grouping data by a date component.
import polars as pl
from datetime import datetime

data = {
    "date": [datetime(2023, 1, 1), datetime(2023, 1, 2), datetime(2023, 1, 1)],
    "value": [10, 20, 30]
}

df = pl.DataFrame(data)

# group_by replaces the older groupby spelling in current Polars releases
grouped_df = df.group_by("date").agg(pl.col("value").sum())
print(grouped_df)
The group_by method groups rows by the date column. Aggregations like sum can then be applied to each group. The order of the resulting groups is not guaranteed unless the result is sorted.
Resampling Time Series Data
This example shows how to resample time series data to a different frequency.
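import polars as pl
from datetime import datetime

data = {
    "date": [datetime(2023, 1, 1), datetime(2023, 1, 2), datetime(2023, 1, 3)],
    "value": [10, 20, 30]
}

df = pl.DataFrame(data)

# group_by_dynamic needs the index column to be sorted
resampled_df = (
    df.sort("date")
    .group_by_dynamic("date", every="1d")
    .agg(pl.col("value").sum())
)
print(resampled_df)
The group_by_dynamic method (named groupby_dynamic in older releases) groups rows into daily windows, which resamples the data to a daily frequency. The index column must be sorted, which is why the frame is sorted first. This is useful for time series analysis.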
Handling Time Zones
This example demonstrates converting datetime columns to different time zones.
import polars as pl
from datetime import datetime

data = {
    "date": [datetime(2023, 1, 1), datetime(2023, 1, 2)]
}

df = pl.DataFrame(data)

df = df.with_columns([
    pl.col("date").dt.convert_time_zone("UTC").alias("utc_date"),
    pl.col("date").dt.convert_time_zone("America/New_York").alias("ny_date")
])
print(df)
The dt.convert_time_zone method converts datetime columns to different time zones. The date column here has no time zone, and recent Polars versions treat such values as UTC when converting. This is useful for global data analysis.
Best Practices for Datetime Operations
- Use Consistent Formats: Ensure datetime data is in a consistent format.
- Handle Time Zones: Convert time zones when working with global data.
- Optimize Performance: Prefer Polars' built-in dt expressions over row-by-row Python code on large datasets.
- Validate Data: Check for missing or invalid datetime values.
In this article, we have explored datetime operations in Polars.