API

Opening Measurement Sets

Use the standard xarray.backends.api.open_dataset() and xarray.backends.api.open_datatree() functions to open a Measurement Set as a Dataset or a DataTree, respectively.

>>> import xarray
>>> dataset = xarray.open_dataset(
...     "/data/data.ms",
...     partition_columns=["DATA_DESC_ID", "FIELD_ID"])
>>> datatree = xarray.backends.api.open_datatree(
...     "/data/data.ms",
...     partition_columns=["DATA_DESC_ID", "FIELD_ID"])

These functions defer to the corresponding methods on the Entrypoint Class. Consult those method signatures for the extra keyword arguments that can be passed.
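As a hedged illustration of passing extra arguments, the sketch below gathers keyword arguments named in the Entrypoint Class signatures and forwards them through xarray.open_dataset(). The helpers msv2_open_kwargs and open_partition are hypothetical names introduced here and are not part of xarray-ms. The path is a placeholder, and the partition key is assumed to be a tuple of (column, value) pairs as in the documented example.

```python
from __future__ import annotations

from typing import Any


def msv2_open_kwargs(
    partition_columns: list[str] | None = None,
    partition_key: tuple[tuple[str, int], ...] | None = None,
    auto_corrs: bool = True,
    ninstances: int = 8,
) -> dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for the MSv2 backend."""
    kwargs: dict[str, Any] = {"auto_corrs": auto_corrs, "ninstances": ninstances}
    # Leave out None values so the backend applies its documented defaults.
    if partition_columns is not None:
        kwargs["partition_columns"] = partition_columns
    if partition_key is not None:
        kwargs["partition_key"] = partition_key
    return kwargs


def open_partition(path: str, **backend_kwargs: Any):
    """Hypothetical helper: forward the keyword arguments to xarray.open_dataset()."""
    import xarray  # imported lazily so msv2_open_kwargs works without xarray installed

    return xarray.open_dataset(path, **backend_kwargs)


kwargs = msv2_open_kwargs(
    partition_columns=["DATA_DESC_ID", "FIELD_ID"],
    partition_key=(("DATA_DESC_ID", 0), ("FIELD_ID", 0)),
    auto_corrs=False,
)
print(kwargs)

# Opening the data needs xarray-ms and a real Measurement Set on disk:
# dataset = open_partition("/data/data.ms", **kwargs)
```

Keeping the argument collection separate from the call makes it easy to inspect what will be forwarded before any file is touched.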

Entrypoint Class

Entrypoint class for the MSv2 backend.

class xarray_ms.backend.msv2.entrypoint.MSv2PartitionEntryPoint
open_dataset(filename_or_obj: str | os.PathLike[Any] | BufferedIOBase | AbstractDataStore, *, drop_variables: str | Iterable[str] | None = None, partition_columns: List[str] | None = None, partition_key: PartitionKeyT | None = None, auto_corrs: bool = True, ninstances: int = 8, epoch: str | None = None, structure_factory: MSv2StructureFactory | None = None) → Dataset

Create a Dataset presenting an MSv4 view over a partition of an MSv2 CASA Measurement Set.

Parameters:
  • filename_or_obj – The path to the MSv2 CASA Measurement Set file.

  • drop_variables – Variables to drop from the dataset.

  • partition_columns – The columns to use for partitioning the Measurement Set. Defaults to ['DATA_DESC_ID', 'FIELD_ID', 'OBSERVATION_ID'].

  • partition_key – A key identifying an individual partition, for example (('DATA_DESC_ID', 0), ('FIELD_ID', 0)). If None, the first partition is opened.

  • auto_corrs – If True, include auto-correlations; if False, exclude them.

  • ninstances – The number of Measurement Set instances to open for parallel I/O.

  • epoch – A unique string identifying the creation of this Dataset. This should not normally need to be set by the user.

  • structure_factory – A factory for creating MSv2Structure objects. This should not normally need to be set by the user.

Returns:

A Dataset referring to the unique partition specified by partition_columns and partition_key.
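The relationship between partition_columns and partition_key can be pictured with a small pure-Python sketch. It groups toy rows by the values of the partitioning columns and produces keys with the same shape as the partition_key example above. The rows, the partition_rows helper, and the grouping logic are illustrative assumptions and do not reproduce the xarray-ms implementation.

```python
from collections import defaultdict

# Toy rows standing in for Measurement Set rows (values are made up).
rows = [
    {"DATA_DESC_ID": 0, "FIELD_ID": 0, "OBSERVATION_ID": 0},
    {"DATA_DESC_ID": 0, "FIELD_ID": 1, "OBSERVATION_ID": 0},
    {"DATA_DESC_ID": 1, "FIELD_ID": 0, "OBSERVATION_ID": 0},
    {"DATA_DESC_ID": 0, "FIELD_ID": 0, "OBSERVATION_ID": 0},
]


def partition_rows(rows, partition_columns):
    """Hypothetical helper: group row indices by partitioning column values.

    Keys look like (("DATA_DESC_ID", 0), ("FIELD_ID", 0)).
    """
    partitions = defaultdict(list)
    for index, row in enumerate(rows):
        # One (column, value) pair per partitioning column, in the given order.
        key = tuple((column, row[column]) for column in partition_columns)
        partitions[key].append(index)
    return dict(partitions)


partitions = partition_rows(rows, ["DATA_DESC_ID", "FIELD_ID"])
for key, indices in sorted(partitions.items()):
    print(key, indices)
```

Each distinct combination of values yields one partition, so a key pins down exactly one group of rows for a given choice of columns.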

open_datatree(filename_or_obj: str | os.PathLike[Any] | BufferedIOBase | AbstractDataStore, *, drop_variables: str | Iterable[str] | None = None, partition_columns: List[str] | None = None, auto_corrs: bool = True, ninstances: int = 8, epoch: str | None = None, **kwargs) → DataTree

Create a DataTree presenting an MSv4 view over multiple partitions of an MSv2 CASA Measurement Set.

Parameters:
  • filename_or_obj – The path to the MSv2 CASA Measurement Set file.

  • drop_variables – Variables to drop from the dataset.

  • partition_columns – The columns to use for partitioning the Measurement Set. Defaults to ['DATA_DESC_ID', 'FIELD_ID', 'OBSERVATION_ID'].

  • auto_corrs – If True, include auto-correlations; if False, exclude them.

  • ninstances – The number of Measurement Set instances to open for parallel I/O.

  • epoch – A unique string identifying the creation of this Dataset. This should not normally need to be set by the user.

Returns:

An xarray DataTree.

Reading from Zarr

Thin wrappers around xarray.open_zarr() and xarray.backends.api.open_datatree() that decode Dataset attributes stored as JSON.

xarray_ms.xds_from_zarr(*args, **kwargs)

Read a Measurement Set-like Dataset from a Zarr store.

Thin wrapper around xarray.open_zarr().

xarray_ms.xdt_from_zarr(*args, **kwargs)

Read a Measurement Set-like DataTree from a Zarr store.

Thin wrapper around xarray.backends.api.open_datatree().

Writing to Zarr

Thin wrappers around xarray.Dataset.to_zarr() and xarray.DataTree.to_zarr() that encode Dataset attributes as JSON.

xarray_ms.xds_to_zarr(ds: Dataset, *args, **kwargs) → None

Write a Measurement Set-like Dataset to a Zarr store.

Thin wrapper around xarray.Dataset.to_zarr().

xarray_ms.xdt_to_zarr(dt: DataTree, *args, **kwargs) → None

Write a Measurement Set-like DataTree to a Zarr store.

Thin wrapper around xarray.core.datatree.DataTree.to_zarr().
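The idea of encoding attributes as JSON can be sketched with the standard library alone. The example below serialises each attribute value to a JSON string and parses it back, as one plausible reading of what such a wrapper does around a write and a later read. The attribute names and the encode_attrs and decode_attrs helpers are hypothetical. The actual encoding used by xarray-ms is not described here and may differ.

```python
import json

# Made-up attributes; nested containers are the kind of value that needs encoding.
attrs = {
    "schema_version": "4.0.0",
    "partition_info": {"field_name": ["field-0"], "spectral_window_name": "spw-0"},
}


def encode_attrs(attrs):
    """Hypothetical helper: serialise each attribute value to a JSON string."""
    return {name: json.dumps(value) for name, value in attrs.items()}


def decode_attrs(encoded):
    """Hypothetical helper: parse each JSON string back into a Python object."""
    return {name: json.loads(value) for name, value in encoded.items()}


encoded = encode_attrs(attrs)   # what would be stored alongside the data
decoded = decode_attrs(encoded)  # what a reader would recover
print(encoded["partition_info"])
```

Because every encoded value is a plain string, the round trip recovers the nested structure without the store needing to understand it.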