Migrating from 0.27 to 0.28

Pose* component types have been removed

The following component types have been removed in favor of their more general counterparts:

  • components.PoseTranslation3D → components.Translation3D
  • components.PoseRotationQuat → components.RotationQuat
  • components.PoseTransformMat3x3 → components.TransformMat3x3
  • components.PoseRotationAxisAngle → components.RotationAxisAngle
  • components.PoseScale3D → components.Scale3D

Existing .rrd files will be automatically migrated when opened.

Transform3D no longer supports axis_length for visualizing coordinate axes

The axis_length parameter/method has been moved from Transform3D to a new TransformAxes3D archetype, which you can log alongside Transform3D. This new archetype also works with the CoordinateFrame archetype.

Existing .rrd recordings will be automatically migrated when opened (the migration converts Transform3D:axis_length components to TransformAxes3D:axis_length).
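
A minimal sketch of the new split, assuming the Python SDK exposes a TransformAxes3D archetype whose axis_length parameter matches the one that previously lived on Transform3D:

import rerun as rr

rr.init("rerun_example_transform_axes")  # illustrative recording id

# The transform itself no longer carries axis visualization...
rr.log("robot/arm", rr.Transform3D(translation=[1.0, 2.0, 3.0]))

# ...instead, log the axes alongside it on the same entity path:
rr.log("robot/arm", rr.TransformAxes3D(axis_length=0.5))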

CoordinateFrame::frame_id has been renamed to CoordinateFrame::frame

The frame_id component of CoordinateFrame has been renamed to just frame, since the component type TransformFrameId already conveys that this is an ID.

Existing .rrd recordings will be automatically migrated when opened (the migration renames the frame_id component).
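
In the Python SDK, the rename might look as follows (a sketch, assuming the constructor argument follows the component rename):

import rerun as rr

# Before (0.27)
rr.log("world/robot", rr.CoordinateFrame(frame_id="base_link"))

# After (0.28)
rr.log("world/robot", rr.CoordinateFrame(frame="base_link"))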

Changes to Transform3D/InstancePose3D and Pinhole's transform properties are now treated transactionally by the Viewer

If you previously updated only certain components of Transform3D/InstancePose3D and relied on previously logged values remaining in effect, you must now re-log those values every time you update the Transform3D/InstancePose3D.

If you always logged the same transform components on every log/send call or used the standard constructor of Transform3D, no changes are required!

import rerun as rr

rr.log("simple", rr.Transform3D(translation=[1.0, 2.0, 3.0]))

# Note that we explicitly only set the scale here:
# Previously, this would have meant that the translation logged above is kept.
# However, starting with 0.28 the Viewer no longer applies the previous translation.
rr.log("simple", rr.Transform3D.from_fields(scale=2))

Pinhole's transform-related properties (resolution and image_from_plane, as well as its new parent_frame and child_frame fields) are also affected by this change. Again, this means that any change to any of Pinhole's resolution/image_from_plane/parent_frame/child_frame fields will reset all of these fields.
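
For example, the following sketch (assuming Pinhole.from_fields accepts a resolution field, mirroring the Transform3D example above) resets all other Pinhole fields rather than keeping them:

rr.log("camera", rr.Pinhole(focal_length=300, width=640, height=480))

# Updating only the resolution now also resets image_from_plane,
# parent_frame, and child_frame:
rr.log("camera", rr.Pinhole.from_fields(resolution=[1280, 960]))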

Details & motivation

We changed the way Transform3D, InstancePose3D & Pinhole are queried under the hood!

Usually, when querying any collection of components with latest-at semantics, we look for the latest update of each individual component. This is useful, for example, when you log a mesh and only change its texture over time: a latest-at query at any point in time gets all the same vertex information, but the texture that is active at any given point in time may change.

However, for Transform3D, this behavior can be very surprising, as the typical expectation is that logging a Transform3D with only a rotation will not inherit previously logged translations to the same path. Previously, to work around this, all SDKs implemented the constructor of Transform3D such that it set all components to empty arrays, thereby clearing everything that was logged before. This caused significant memory (and networking) bloat, as well as needlessly convoluted displays in the viewer. With the arrival of explicit ROS-style transform frames, per-component latest-at semantics can cause even more surprising side effects.

Therefore, we decided to change the semantics of Transform3D such that any change to any of its components fully resets the transform state.

For example, if you change its rotation and scale fields but do not write to translation, we will not look further back in time to find the previous value of translation. Instead, we assume that translation is not set at all (i.e., zero), deriving the new overall transform state only from rotation and scale. Naturally, if every update to a transform changes the same components, nothing changes for you, except that you no longer have to clear out all other components that may ever have been set, which reduces memory bloat both on send and on query!

URDF loader: sending transform updates now requires parent_frame and child_frame fields to be set

Previous versions of the built-in URDF data-loader in Rerun required you to send transform updates with implicit frame IDs, i.e. each joint transform had to be sent on a specific entity path. Depending on the complexity of your robot model, this could quickly lead to long entity paths, e.g. when updating a joint deep in your model hierarchy.

In 0.28, this is dropped in favor of transforms with named frame IDs (parent_frame, child_frame). This is more in line with the TF2 system in ROS and allows you to send all transform updates on a single entity (e.g. a transforms entity).

In particular, this results in two changes after you load a URDF model into Rerun, compared to previous releases (both are illustrated in the sketch after this list):

  1. To update a joint with a Transform3D, the parent_frame and child_frame fields need to be set (analogous to how the joint is specified in the URDF file).
  2. The transformation must have both rotation and translation (again, analogous to the URDF). Updating only the rotation is no longer supported.
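
A sketch of such an update, assuming a URDF joint connecting base_link to shoulder_link and that Transform3D's constructor exposes the new parent_frame/child_frame fields:

import rerun as rr

# All joint updates can go to one entity; the named frames identify the joint.
rr.log(
    "transforms",
    rr.Transform3D(
        translation=[0.0, 0.0, 0.3],  # must be set (point 2 above)
        rotation=rr.RotationAxisAngle(axis=[0, 0, 1], radians=0.5),  # must be set (point 2 above)
        parent_frame="base_link",  # point 1 above
        child_frame="shoulder_link",
    ),
)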

For more details about loading & updating URDF models, we added a "Loading URDF models" page to our documentation in this release.

Python SDK: "partition" renamed to "segment" in catalog APIs python-sdk-partition-renamed-to-segment-in-catalog-apis

In the rerun.catalog module, all APIs using "partition" terminology have been renamed to use "segment" instead. The old APIs are deprecated and will be removed in a future release.

  • DatasetEntry.partition_ids() → DatasetEntry.segment_ids()
  • DatasetEntry.partition_table() → DatasetEntry.segment_table()
  • DatasetEntry.partition_url() → DatasetEntry.segment_url()
  • DatasetEntry.download_partition() → DatasetEntry.download_segment()
  • DatasetEntry.default_blueprint_partition_id() → DatasetEntry.default_blueprint()
  • DatasetEntry.set_default_blueprint_partition_id() → DatasetEntry.set_default_blueprint()
  • DataframeQueryView.filter_partition_id() → DataframeQueryView.filter_segment_id()

The DataFusion utility functions in rerun.utilities.datafusion.functions.url_generation have also been renamed:

  • partition_url() → segment_url()
  • partition_url_udf() → segment_url_udf()
  • partition_url_with_timeref_udf() → segment_url_with_timeref_udf()

The segment table's rerun_partition_id column has also been renamed to rerun_segment_id.
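
If you filter on that column with DataFusion, the rename is mechanical (a sketch; the segment ID value is illustrative):

from datafusion import col, lit

df = dataset.segment_table()
df = df.filter(col("rerun_segment_id") == lit("recording_0"))  # was: rerun_partition_id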

Additionally, the partition_id field on viewer event classes has been renamed to segment_id:

# Before (0.27)
def on_event(event):
    print(event.partition_id)

# After (0.28)
def on_event(event):
    print(event.segment_id)

This affects PlayEvent, PauseEvent, TimeUpdateEvent, TimelineChangeEvent, SelectionChangeEvent, and RecordingOpenEvent.

Python SDK: segment_table() and manifest() now return DataFrame directly

The DatasetEntry.segment_table() and DatasetEntry.manifest() methods now return datafusion.DataFrame directly instead of a DataFusionTable. The .df() method call is no longer needed:

# Before (0.27)
df = dataset.partition_table().df()
manifest_df = dataset.manifest().df()

# After (0.28)
df = dataset.segment_table()
manifest_df = dataset.manifest()

Additionally, segment_table() now accepts optional join_meta and join_key parameters to join with external metadata:

# Join segment table with a metadata table
df = dataset.segment_table(join_meta=metadata_table, join_key="rerun_segment_id")

Python SDK: catalog entry listing APIs renamed

The CatalogClient methods for listing catalog entries have been renamed for clarity:

  • CatalogClient.all_entries() → CatalogClient.entries()
  • CatalogClient.dataset_entries() → CatalogClient.datasets()
  • CatalogClient.table_entries() → CatalogClient.tables()

The old methods are deprecated and will be removed in a future release.

Additionally, the new methods accept an optional include_hidden parameter (see the sketch after this list):

  • datasets(include_hidden=True): includes blueprint datasets
  • tables(include_hidden=True): includes system tables (e.g., __entries)
  • entries(include_hidden=True): includes both
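
For example (a sketch; the server address and the name attribute on entries are illustrative):

from rerun.catalog import CatalogClient

client = CatalogClient("rerun+http://localhost:51234")

# Returns lists of entry objects rather than DataFrames (see the next section):
for dataset in client.datasets():
    print(dataset.name)

# Also include system tables such as __entries:
tables = client.tables(include_hidden=True)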

Python SDK: removed DataFrame-returning entry listing methods

The following methods that returned datafusion.DataFrame objects have been removed without deprecation:

  • CatalogClient.entries() (returning DataFrame) → CatalogClient.get_table(name="__entries").reader()
  • CatalogClient.datasets() (returning DataFrame) → CatalogClient.get_table(name="__entries").reader(), filtered by entry kind
  • CatalogClient.tables() (returning DataFrame) → CatalogClient.get_table(name="__entries").reader(), filtered by entry kind

The new entries(), datasets(), and tables() methods now return lists of entry objects (DatasetEntry and TableEntry) instead of DataFrames. If you need DataFrame access to the raw entries table, use client.get_table(name="__entries").reader().
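
A sketch of the DataFrame path (the entry-kind column name and its values are assumptions):

from datafusion import col, lit

# Raw entries table, replacing the removed DataFrame-returning methods:
entries_df = client.get_table(name="__entries").reader()

# Mimic the removed datasets()/tables() by filtering on the entry kind:
datasets_df = entries_df.filter(col("entry_kind") == lit("dataset"))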

Python SDK: entry name listing methods now support include_hidden

The CatalogClient methods for listing entry names now accept an optional include_hidden parameter, matching the behavior of entries(), datasets(), and tables() (see the sketch after this list):

  • entry_names(include_hidden=True): includes hidden entries (blueprint datasets and system tables like __entries)
  • dataset_names(include_hidden=True): includes blueprint datasets
  • table_names(include_hidden=True): includes system tables (e.g., __entries)
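
For example:

# Only user-visible entries:
names = client.entry_names()

# Also include blueprint datasets and system tables such as __entries:
all_names = client.entry_names(include_hidden=True)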

Python SDK: entry access methods renamed

The CatalogClient methods for accessing individual entries have been renamed:

  • CatalogClient.get_dataset_entry() → CatalogClient.get_dataset()
  • CatalogClient.get_table_entry() → CatalogClient.get_table()
  • CatalogClient.create_table_entry() → CatalogClient.create_table()

The existing CatalogClient.create_dataset() method is already aligned with the new naming scheme and remains unchanged. The old methods are deprecated and will be removed in a future release.
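
In practice this is a drop-in rename, e.g. (assuming the same name= keyword as get_table()):

# Before (0.27)
dataset = client.get_dataset_entry(name="my_dataset")

# After (0.28)
dataset = client.get_dataset(name="my_dataset")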

Python SDK: get_table() now returns TableEntry instead of DataFrame

The CatalogClient.get_table() method has been changed to return a TableEntry object instead of a datafusion.DataFrame. This is a breaking change.

# Before (0.27)
df = client.get_table(name="my_table")  # returns DataFrame

# After (0.28)
table_entry = client.get_table(name="my_table")  # returns TableEntry
df = table_entry.reader()  # call reader() to get the DataFrame

This change aligns get_table() with get_dataset(), which returns a DatasetEntry. Both methods now consistently return entry objects that provide access to metadata and data.

Python SDK: table write operations moved to TableEntry

Write operations for tables have been moved from CatalogClient to TableEntry. The new methods provide a cleaner API that operates directly on table entries:

  • CatalogClient.write_table(name, batches, mode) → TableEntry.append(batches), TableEntry.overwrite(batches), or TableEntry.upsert(batches), depending on the insert mode
  • CatalogClient.append_to_table(name, batches) → TableEntry.append(batches)
  • CatalogClient.update_table(name, batches) → TableEntry.upsert(batches)

The old methods are deprecated and will be removed in a future release.

# Before (0.27)
client.write_table("my_table", batches, TableInsertMode.APPEND)
client.append_to_table("my_table", batches)
client.update_table("my_table", batches)

# After (0.28)
table = client.get_table(name="my_table")
table.append(batches)
table.overwrite(batches)
table.upsert(batches)

The new TableEntry methods also support writing Python objects directly via keyword arguments:

table.append(col1=[1, 2, 3], col2=["a", "b", "c"])

Note: TableInsertMode is no longer needed with the new API and will be removed in a future release.

Python SDK: schema and column types moved to rerun.catalog

The Schema class and related column descriptor/selector types have moved from rerun.dataframe to rerun.catalog.

  • from rerun.dataframe import Schema → from rerun.catalog import Schema
  • from rerun.dataframe import ComponentColumnDescriptor → from rerun.catalog import ComponentColumnDescriptor
  • from rerun.dataframe import ComponentColumnSelector → from rerun.catalog import ComponentColumnSelector
  • from rerun.dataframe import IndexColumnDescriptor → from rerun.catalog import IndexColumnDescriptor
  • from rerun.dataframe import IndexColumnSelector → from rerun.catalog import IndexColumnSelector

The previous import paths are still supported but will be removed in a future release.

Python SDK: DatasetEntry.dataframe_query_view() replaced by DatasetEntry.reader()

The DatasetEntry.dataframe_query_view() method and the DataframeQueryView class have been removed. Their functionality is now available through DatasetEntry.reader(), optionally combined with DatasetEntry.filter_segments() and DatasetEntry.filter_contents() which return a DatasetView.

Migration paths

Index selection: Now a parameter to reader():

# Before (0.27)
view = dataset.dataframe_query_view(index="timeline")
df = view.df()

# After (0.28)
df = dataset.reader(index="timeline")

Content filtering: Use filter_contents() to create a DatasetView:

# Before (0.27)
view = dataset.dataframe_query_view(index="timeline", contents={"/points": ["Position2D"]})
df = view.df()

# After (0.28)
view = dataset.filter_contents(["/points/**"])
df = view.reader(index="timeline")

Segment filtering: Use filter_segments():

# Before (0.27)
view = dataset.dataframe_query_view(index="timeline")
df = view.filter_partition_id(["recording_0"]).df()

# After (0.28)
view = dataset.filter_segments(["recording_0"])
df = view.reader(index="timeline")

Row filtering: Use DataFusion filtering on the returned DataFrame:

# Before (0.27)
view = dataset.dataframe_query_view(index="timeline")
df = view.filter_is_not_null(...).df()

# After (0.28)
from datafusion import col
df = dataset.reader(index="timeline").filter(col("/world/robot:Points3D:positions").is_not_null())

Latest-at fill: Now a parameter to reader():

# Before (0.27)
view = dataset.dataframe_query_view(index="timeline", fill_latest_at=True)
df = view.df()

# After (0.28)
df = dataset.reader(index="timeline", fill_latest_at=True)

Python SDK: TableEntry.df() renamed to TableEntry.reader()

The TableEntry.df() method has been renamed to TableEntry.reader() for consistency with DatasetView.reader(). This is a breaking change.

# Before (0.27)
table = client.get_table(name="my_table")
df = table.df()

# After (0.28)
table = client.get_table(name="my_table")
df = table.reader()

Python SDK: DatasetEntry.download_segments() is deprecated

The DatasetEntry.download_segments() method is deprecated and will be removed in a future release.

Python SDK: blueprint APIs simplified

The blueprint-related methods on DatasetEntry have been simplified:

  • DatasetEntry.default_blueprint_partition_id() → DatasetEntry.default_blueprint()
  • DatasetEntry.set_default_blueprint_partition_id() → DatasetEntry.set_default_blueprint()
  • DatasetEntry.blueprint_dataset_id() → removed (use blueprint_dataset())

New methods have been added for common blueprint operations:

# Register a blueprint and set it as default
dataset.register_blueprint("s3://bucket/blueprint.rbl")

# Register a blueprint without setting it as default
dataset.register_blueprint("s3://bucket/blueprint.rbl", set_default=False)

# List all registered blueprints
blueprint_names = dataset.blueprints()

# Get/set the default blueprint
current = dataset.default_blueprint()
dataset.set_default_blueprint("my_blueprint")

Python SDK: register() and register_batch() merged into unified register() API

The DatasetEntry.register() and DatasetEntry.register_batch() methods have been merged into a single register() method that returns a RegistrationHandle. DatasetEntry.register_prefix() now also returns a RegistrationHandle. The Tasks and Task classes have been removed, and the recording_layer parameter has been renamed to layer_name.

RegistrationHandle has a wait() method similar to the former Tasks.wait() method. It no longer requires a timeout value, and it now returns a RegisteredSegments object containing the segment IDs of the registered segments.

Single URI registration:

# Before (0.27)
segment_id = dataset.register("s3://bucket/recording.rrd")

# After (0.28)
segment_id = dataset.register("s3://bucket/recording.rrd").wait().segment_ids[0]

Batch registration:

# Before (0.27)
dataset.register_batch(["file:///uri1.rrd", "file:///uri2.rrd"], recording_layers=["base", "base"]).wait()
# Note: no direct way to get segment IDs

# After (0.28)
handle = dataset.register(["file:///uri1.rrd", "file:///uri2.rrd"], layer_name="base")
result = handle.wait()
segment_ids = result.segment_ids

Streaming results with progress tracking:

from tqdm import tqdm
from rerun.catalog import SegmentRegistrationResult  # each result yielded below is a SegmentRegistrationResult

uris = ["file:///uri1.rrd", "file:///uri2.rrd"]
handle = dataset.register(uris, layer_name="base")

for result in tqdm(handle.iter_results(), total=len(uris), desc="Registering"):
    if result.is_error:
        print(f"Failed to register {result.uri}: {result.error}")
    else:
        print(f"Registered {result.uri} as {result.segment_id}")

Python SDK: search index APIs

So far, our APIs used the term "index" for two distinct things: the dataset index columns, and the FTS/vector indexes. To reduce this ambiguity, we renamed our vector and FTS APIs to use the term "search index":

  • DatasetEntry.create_fts_index() → DatasetEntry.create_fts_search_index()
  • DatasetEntry.create_vector_index() → DatasetEntry.create_vector_search_index()
  • DatasetEntry.list_indexes() → DatasetEntry.list_search_indexes()
  • DatasetEntry.delete_indexes() → DatasetEntry.delete_search_indexes()

The old methods are deprecated and will be removed in a future release.

In addition, for consistency with the other API updates, the DatasetEntry.search_fts() and DatasetEntry.search_vector() methods now return datafusion.DataFrame directly instead of DataFusionTable; the DataFusionTable class no longer exists. This is a breaking change.

# Before (0.27)
result = dataset.search_fts("query", column).df()
result = dataset.search_vector(embedding, column, top_k=10).df()

# After (0.28)
result = dataset.search_fts("query", column)
result = dataset.search_vector(embedding, column, top_k=10)