API Reference

User Functions


Intake's dict-like config system

intake.readers.datatypes.recommend([url, ...])

Show which Intake data types can apply to the given details

intake.readers.convert.auto_pipeline(url, ...)

Create pipeline from given URL to desired output type

intake.readers.convert.path(start, end[, ...])

Find possible conversion paths from start to end types

intake.readers.entry.Catalog([entries, ...])

A collection of data and reader descriptions.


Defines some data: class and arguments.


A serialisable description of a reader or pipeline


Show which readers claim to support the given data instance or a superclass


Attempt to construct a reader instance by finding one that matches the function call

class intake.config.Config(filename=None, **kwargs)

Intake’s dict-like config system

Instance intake.conf is globally used throughout the package

get(key, default=None)

Return the value for key if key is in the dictionary, else default.


Update global config from YAML file

If fn is None, looks in the global config directory, which is either defined by the INTAKE_CONF_DIR env-var or is ~/.intake/.


Analyse environment variables and update conf accordingly


Set conf values back to defaults


Save current configuration to file as YAML

Uses self.filename for target location

set(update_dict=None, **kw)

Change config values within a context or for the session

values: dict

This can be deeply nested to set only leaf values

See also: intake.readers.utils.nested_keys_to_dict


Value resets after context ends

>>> with intake.conf.set(myval=5):
...     ...

Set for whole session

>>> intake.conf.set(myval=5)

Set only a single leaf value within a nested dict

>>> intake.conf.set(intake.readers.utils.nested_keys_to_dict({"deep.2.key": True}))
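
The difference between the context form and the session form of set() can be illustrated with a small stand-in. This is a minimal sketch and not Intake's implementation: SketchConfig and _Restore are hypothetical names, and the update here is shallow, so it does not reproduce the nested leaf-setting described above.

```python
import copy


class SketchConfig(dict):
    """Hypothetical stand-in for a dict-like config; not Intake's own class."""

    def set(self, update_dict=None, **kw):
        # Remember the current state so a context manager can restore it.
        previous = copy.deepcopy(dict(self))
        # Shallow update only (assumption of this sketch).
        self.update(update_dict or {}, **kw)
        return _Restore(self, previous)


class _Restore:
    """Restores the saved state when used as a context manager."""

    def __init__(self, conf, previous):
        self.conf = conf
        self.previous = previous

    def __enter__(self):
        return self.conf

    def __exit__(self, *exc):
        self.conf.clear()
        self.conf.update(self.previous)
        return False


conf = SketchConfig(myval=1)

# Context form: the value is reverted when the block ends.
with conf.set(myval=5):
    inside = conf["myval"]
after_context = conf["myval"]

# Session form: the return value is discarded, so the change persists.
conf.set(myval=7)
session_value = conf["myval"]
```

The same call serves both purposes because the change is applied immediately and the returned object only undoes it if it is used in a with statement.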
intake.readers.datatypes.recommend(url: Optional[str] = None, mime: Optional[str] = None, head: bool = True, storage_options=None, ignore: Optional[set[str]] = None) set[intake.readers.datatypes.BaseData]

Show which Intake data types can apply to the given details

url: str

Location of data

mime: str

MIME type, usually “x/y” form

head: bytes | bool | None

A small number of bytes from the file head, for seeking magic bytes. If True, fetch these bytes from the given URL/storage_options and use them. If None, only fetch bytes if there is no match by mime type or path. If False, don’t fetch at all.

storage_options: dict | None

If passing a URL which might be a remote file, storage_options can be used by fsspec.

ignore: set | None

Don’t include these in the output

set of matching datatype classes.
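
The three kinds of evidence described above (path pattern, MIME type, magic bytes) might be combined roughly as follows. This is a sketch under stated assumptions: SketchCSV, SketchPNG and sketch_recommend are hypothetical stand-ins, head must already be bytes (nothing is fetched), and the real function may weigh the evidence differently.

```python
import re


class SketchCSV:
    """Hypothetical data type with a path pattern and MIME pattern only."""
    filepattern = r"\.csv$"
    mimetypes = r"text/csv"
    magic = set()


class SketchPNG:
    """Hypothetical data type that can also be identified by its file head."""
    filepattern = r"\.png$"
    mimetypes = r"image/png"
    magic = {b"\x89PNG"}


def sketch_recommend(url=None, mime=None, head=None,
                     candidates=(SketchCSV, SketchPNG), ignore=None):
    ignore = ignore or set()
    out = set()
    for cls in candidates:
        if cls.__name__ in ignore:
            continue  # explicitly excluded by the caller
        if url and re.search(cls.filepattern, url):
            out.add(cls)  # matched on the path
        elif mime and re.match(cls.mimetypes, mime):
            out.add(cls)  # matched on the MIME type
        elif isinstance(head, bytes) and any(head.startswith(m) for m in cls.magic):
            out.add(cls)  # matched on magic bytes at the file head
    return out


by_path = sketch_recommend(url="data/file.csv")
by_head = sketch_recommend(head=b"\x89PNG\r\n\x1a\n")
ignored = sketch_recommend(url="data/file.csv", ignore={"SketchCSV"})
```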
intake.readers.convert.auto_pipeline(url: str | intake.readers.datatypes.BaseData, outtype: str | tuple[str], storage_options: Optional[dict] = None, avoid: Optional[list[str]] = None) Pipeline

Create pipeline from given URL to desired output type

Will search for the shortest conversion path from the inferred data-type to the output.

url: input data, usually a location/URL, but maybe a data instance
outtype: pattern to match to possible output types
storage_options: if url is a remote str, these are kwargs that fsspec may need to access it

avoid: don’t consider readers whose names match any of these strings
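
A shortest conversion path can be found with a breadth-first search over a graph of known conversions. The sketch below only illustrates that idea: the CONVERSIONS graph and the shortest_path helper are hypothetical, and avoid filters node names here, whereas the parameter above applies to reader names.

```python
from collections import deque

# Hypothetical conversion graph: type name -> types reachable in one step.
CONVERSIONS = {
    "CSV": ["pandas.DataFrame", "dask.DataFrame"],
    "pandas.DataFrame": ["numpy.ndarray", "ray.Dataset"],
    "dask.DataFrame": ["pandas.DataFrame"],
    "numpy.ndarray": [],
    "ray.Dataset": [],
}


def shortest_path(start, end, graph=CONVERSIONS, avoid=()):
    """Breadth-first search; the first path to reach ``end`` is a shortest one."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == end:
            return path
        for nxt in graph.get(path[-1], []):
            # Skip nodes already visited or matching any "avoid" string.
            if nxt in seen or any(a in nxt for a in avoid):
                continue
            seen.add(nxt)
            queue.append(path + [nxt])
    return None  # no conversion path exists


direct = shortest_path("CSV", "numpy.ndarray")
blocked = shortest_path("CSV", "numpy.ndarray", avoid=("pandas",))
```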
class intake.readers.entry.Catalog(entries: Optional[Union[Iterable[ReaderDescription], Mapping]] = None, aliases: Optional[dict[str, int]] = None, data: Optional[Union[Iterable[DataDescription], Mapping]] = None, user_parameters: Optional[dict[str, intake.readers.user_parameters.BaseUserParameter]] = None, parameter_overrides: Optional[dict[str, Any]] = None, metadata: Optional[dict] = None)

A collection of data and reader descriptions.

add_entry(entry, name=None)

Add entry/reader (and its requirements) in-place, with optional alias

extract_parameter(item: str, name: str, path: ~typing.Optional[str] = None, value: ~typing.Optional[~typing.Any] = None, cls=<class 'intake.readers.user_parameters.SimpleUserParameter'>, store_to: ~typing.Optional[str] = None)

Descend into data & reader descriptions to create a user_parameter

There are two ways to find and replace values by a template:

  • if path is given, the kwargs will be walked to this location e.g., “field.0.special_value” -> kwargs[“field”][0][“special_value”]

  • if value is given, all kwargs will be recursively walked, looking for values that equal that given.

Matched values will be replaced by a template string like "{name}", and a user_parameter of class cls will be placed in the location given by store_to (could be “data”, “catalog”).
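
The two lookup strategies can be sketched with plain dicts and lists. The helpers replace_at_path and replace_by_value below are hypothetical and only show the walking logic; they do not create a user_parameter or store it anywhere.

```python
def replace_at_path(kwargs, path, name):
    """Replace the value at dotted ``path`` in place with the template "{name}"."""
    parts = path.split(".")
    target = kwargs
    for part in parts[:-1]:
        # Parts index into lists as integers and into dicts as keys.
        target = target[int(part)] if isinstance(target, list) else target[part]
    last = parts[-1]
    key = int(last) if isinstance(target, list) else last
    original = target[key]
    target[key] = "{" + name + "}"
    return original  # a natural default for the new parameter


def replace_by_value(obj, value, name):
    """Return a copy with every leaf equal to ``value`` replaced by "{name}"."""
    if isinstance(obj, dict):
        return {k: replace_by_value(v, value, name) for k, v in obj.items()}
    if isinstance(obj, list):
        return [replace_by_value(v, value, name) for v in obj]
    return "{" + name + "}" if obj == value else obj


kwargs = {"field": [{"special_value": 10}], "other": 10}
original = replace_at_path(kwargs, "field.0.special_value", "threshold")

replaced = replace_by_value({"a": 10, "b": [10, 3]}, 10, "n")
```

Using a path touches exactly one location, while matching by value replaces every occurrence, which is why the second form suits values that are repeated across several arguments.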

classmethod from_dict(data)

Assemble catalog from dict representation

classmethod from_entries(data: dict, metadata=None)

Assemble catalog from a dict of entries

static from_yaml_file(path: str, **kwargs)

Load YAML representation into a new Catalog instance


kwargs to pass to fsspec for opening the file to read; can pass as storage_options= or will pick up any unused kwargs for simplicity

get_aliases(entity: str)

Return those alias names that point to the given opaque key

get_entity(item: str)

Get the objects by reference

item can be an entry in .aliases, in which case the original will be returned, or a key in .entries or .data. The entity in question is returned without processing.
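
The lookup order described here can be sketched as follows. The function sketch_get_entity and the plain dicts standing in for .aliases, .entries and .data are assumptions made for illustration; the order in which .entries and .data are consulted is also an assumption.

```python
def sketch_get_entity(item, aliases, entries, data):
    """Resolve an alias first, then look the key up without any processing."""
    key = aliases.get(item, item)  # aliases map friendly names to opaque keys
    if key in entries:
        return entries[key]
    if key in data:
        return data[key]
    raise KeyError(item)


aliases = {"mydata": "tok1"}
entries = {"tok1": {"reader": "PandasCSV"}}
data = {"tok0": {"datatype": "CSV"}}

via_alias = sketch_get_entity("mydata", aliases, entries, data)
via_key = sketch_get_entity("tok0", aliases, entries, data)
```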

give_name(tok: str, name: str, clobber=True)

Give an alias to a dataset


a key in the .entries dict

move_parameter(from_entity: str, to_entity: str, parameter_name: str) Catalog

Move user-parameter between entry/data

entity is an alias name or entry/data token

promote_parameter_name(parameter_name: str, level: str = 'cat') Catalog

Find and promote given named parameter, assuming they are all identical


the key string referring to the parameter

level: cat | data

If the parameter is found in a reader, it can be promoted to the data it depends on. Parameters in a data description can only be promoted to a catalog global.

rename(old: str, new: str, clobber=True)

Change the alias of a dataset

search(expr) Catalog

Make new catalog with a subset of this catalog

The new catalog will have those entries which pass the filter expr, which is an instance of intake.readers.search.SearchBase (i.e., has a method like filter(entry) -> bool).

In the special case that expr is just a string, the Text search expression will be used.
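
The filtering contract can be sketched without Intake itself. SketchSearch, TextContains and sketch_search are hypothetical names, entries are plain dicts, and the case-insensitive substring match is only an assumption about what a text search might do.

```python
class SketchSearch:
    """Hypothetical base: subclasses decide whether an entry matches."""

    def filter(self, entry):
        raise NotImplementedError


class TextContains(SketchSearch):
    """Matches when the text appears anywhere in the entry's string form."""

    def __init__(self, text):
        self.text = text.lower()

    def filter(self, entry):
        return self.text in str(entry).lower()


def sketch_search(entries, expr):
    # A bare string is wrapped in the text search, as in the special case above.
    if isinstance(expr, str):
        expr = TextContains(expr)
    return {name: e for name, e in entries.items() if expr.filter(e)}


entries = {"a": {"reader": "PandasCSV"}, "b": {"reader": "DaskParquet"}}
subset = sketch_search(entries, "csv")
```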

to_yaml_file(path: str, **storage_options)

Persist the state of this catalog as a YAML file


kwargs to pass to fsspec for opening the file to write

class intake.readers.entry.DataDescription(datatype: str, kwargs: Optional[dict] = None, metadata: Optional[dict] = None, user_parameters: Optional[dict] = None)

Defines some data: class and arguments. This may be loaded in a number of ways.

A DataDescription normally resides in a Catalog, and can contain templated arguments. When there are user_parameters, these will also be applied to any reader that depends on this data.

get_kwargs(user_parameters: Optional[dict[str | intake.readers.user_parameters.BaseUserParameter]] = None, **kwargs) dict[str, Any]

Get set of kwargs for given reader, based on prescription, new args and user parameters

Here, user_parameters is intended to come from the containing catalog. To provide values for a user parameter, include it by name in kwargs
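
How templated arguments, parameter defaults and call-time values could combine is sketched below. This is an illustration only: resolve_kwargs is a hypothetical helper, user parameters are reduced to a name-to-default mapping, and templates are filled with str.format rather than whatever mechanism the library uses.

```python
def resolve_kwargs(stored, user_parameters, **kwargs):
    """Fill "{name}" templates in ``stored`` and merge in plain overrides."""
    values = dict(user_parameters)  # start from the parameter defaults
    plain = {}
    for key, value in kwargs.items():
        if key in values:
            values[key] = value  # a value supplied for a user parameter
        else:
            plain[key] = value  # an ordinary new argument
    out = {
        k: (v.format(**values) if isinstance(v, str) else v)
        for k, v in stored.items()
    }
    out.update(plain)
    return out


stored = {"url": "s3://bucket/{year}.csv", "sep": ","}
defaults = resolve_kwargs(stored, {"year": 2020})
overridden = resolve_kwargs(stored, {"year": 2020}, year=2021, header=0)
```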

class intake.readers.entry.ReaderDescription(reader: str, kwargs: Optional[dict[str, Any]] = None, user_parameters: Optional[dict[str | intake.readers.user_parameters.BaseUserParameter]] = None, metadata: Optional[dict] = None, output_instance: Optional[str] = None)

A serialisable description of a reader or pipeline

This class is typically stored inside Catalogs, and can contain templated arguments which get evaluated at the time that it is accessed from a Catalog.


Are the packages listed in the “imports” key of the metadata available?

extract_parameter(name: str, path=None, value=None, cls=<class 'intake.readers.user_parameters.SimpleUserParameter'>)

Creates new version of the description

Creates new instance, since the token will in general change

classmethod from_dict(data)

Recreate instance from the results of to_dict()

get_kwargs(user_parameters=None, **kwargs) dict[str, Any]

Get set of kwargs for given reader, based on prescription, new args and user parameters

Here, user_parameters is intended to come from the containing catalog. To provide values for a user parameter, include it by name in kwargs


Create a Catalog containing only this entry


Show which readers claim to support the given data instance or a superclass

The ordering is more specific readers first

intake.readers.readers.reader_from_call(func: str, *args, join_lines=False, **kwargs) BaseReader

Attempt to construct a reader instance by finding one that matches the function call

Fails for readers that don’t define a func, probably because it depends on the file type or needs a dynamic instance to be a method of.

func: callable | str

If a callable, pass args and kwargs as you would have done to execute the function. If a string, it should look like "func(arg1, args2, kwarg1, **kw)", i.e., a normal python call but as a string. In the latter case, args and kwargs are ignored

Base Classes

These may be subclassed by developers


Prototype dataset definition

intake.readers.readers.BaseReader(*args[, ...])


Converts from one object type to another


A set of functions as an accessor on a Reader, producing a Pipeline


Prototype for a single term in a search expression


The base class allows for any default without checking/coercing

class intake.readers.datatypes.BaseData(metadata: Optional[dict[str, Any]] = None)

Prototype dataset definition

auto_pipeline(outtype: str | tuple[str])

Find a pipeline to transform from this to the given output type

contains: set[str] = {}

if using a directory URL, an ls() on that path will contain these things

filepattern: str = ''

regex, file URLs to match

magic: set[bytes | tuple] = {}

binary patterns, usually at the file head; each item identifies this data type

mimetypes: str = ''

regex, MIME pattern to match

property possible_outputs

Map of importable readers to the expected output class of each

property possible_readers

List of reader classes for this type, grouped by importability

structure: set[str] = {}

informational tags for nature of data, e.g., “array”


Create DataDescription version of this, for placing in a Catalog

to_reader(outtype: Optional[str] = None, reader: Optional[str] = None, **kw)

Find an appropriate reader for this data

If neither outtype nor reader is passed, the first importable reader will be picked.

See also .possible_outputs

outtype: string to match against the output classes of potential readers
reader: string to match against the class names of the readers
class intake.readers.readers.BaseReader(*args, metadata: Optional[dict] = None, output_instance: Optional[str] = None, **kwargs)
property data

The BaseData this reader depends on, if it has one


Part of the data

The intent is to return a minimal dataset, but for some readers and conditions this may be up to the whole of the data. Output type is the same as for read().

classmethod doc()

Doc associated with loading function

func: str = 'builtins:NotImplementedError'

function name for loading data

func_doc: str = None

docstring origin if not from func

implements: set[intake.readers.datatypes.BaseData] = {}

datatype(s) this applies to

imports: set[str] = {}

top-level packages required to use this

optional_imports: set[str] = {}

packages that might be required by some options

other_funcs: set[str] = {}

function names to recognise when matching user calls

output_instance: str = None

type the reader produces

read(*args, **kwargs)

Produce data artefact

Any of the arguments encoded in the data instance can be overridden.

Output type is given by the .output_instance attribute


Create a Catalog containing only this reader


Create an entry version of this, ready to be inserted into a Catalog

to_reader(outtype: Optional[Union[tuple[str], str]] = None, reader: Optional[str] = None, **kw)

Make a different reader for the data used by this reader

class intake.readers.convert.BaseConverter(*args, metadata: Optional[dict] = None, output_instance: Optional[str] = None, **kwargs)

Converts from one object type to another

Most often, subclasses call a single function on the data, but arbitrarily complex transforms are possible. This is designed to be one step in a Pipeline.

.run() will be called on the output object from the previous stage; subclasses will either override that, or just provide a func=.

instances: dict[str, str] = {}

mapping from input types to output types

run(x, *args, **kwargs)

Execute a conversion stage on the output object from another stage

Subclasses may override this
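
The two ways of defining a stage, providing a function or overriding run(), can be sketched as follows. SketchConverter, ToUpper, Reverse and run_pipeline are hypothetical names, and func is a plain callable here, which is an assumption of this sketch.

```python
class SketchConverter:
    """Hypothetical stand-in for one pipeline stage."""

    func = None  # a plain callable in this sketch

    def run(self, x, *args, **kwargs):
        # Default behaviour: apply func to the previous stage's output.
        return self.func(x, *args, **kwargs)


class ToUpper(SketchConverter):
    # Option 1: only provide the function to call.
    func = staticmethod(str.upper)


class Reverse(SketchConverter):
    # Option 2: override run() for arbitrary logic.
    def run(self, x, *args, **kwargs):
        return x[::-1]


def run_pipeline(steps, start):
    """Feed the output of each stage into the next one."""
    out = start
    for step in steps:
        out = step.run(out)
    return out


result = run_pipeline([ToUpper(), Reverse()], "abc")
```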

class intake.readers.namespaces.Namespace(reader)

A set of functions as an accessor on a Reader, producing a Pipeline

acts_on: tuple[str] = ()

types that this namespace is associated with

imports: tuple[str] = ()

requires this top-level package

class intake.readers.search.SearchBase

Prototype for a single term in a search expression

The method filter() is meant to be overridden in subclasses.

filter(entry: ReaderDescription) bool

Does the given ReaderDescription entry match the query?

class intake.readers.user_parameters.BaseUserParameter(default, description='')

The base class allows for any default without checking/coercing


Change given type to one that matches this parameter’s intent


the value to use without user input


what is the function of this parameter


Change the default, if it validates


Dictionary representation of the instance's contents

validate(value) bool

Is the given value allowed by this parameter?

Exceptions are treated as False


A new instance with different default, if it validates

(original object is left unchanged)
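
The validation behaviour described above, where exceptions count as False and a new default yields a new object, can be sketched with a small class. SketchIntParameter and its method name with_default are hypothetical and chosen for illustration; the integer check is an arbitrary example rule.

```python
import copy


class SketchIntParameter:
    """Hypothetical parameter that only accepts whole numbers."""

    def __init__(self, default, description=""):
        self.default = default
        self.description = description

    def _check(self, value):
        # Raises for values that int() cannot handle (e.g. "abc" or None).
        return int(value) == value

    def validate(self, value):
        try:
            return bool(self._check(value))
        except Exception:
            return False  # any exception means "not allowed"

    def with_default(self, value):
        if not self.validate(value):
            raise ValueError(f"invalid default: {value!r}")
        new = copy.copy(self)  # the original instance is left unchanged
        new.default = value
        return new


p = SketchIntParameter(1, description="number of rows")
ok = p.validate(3)
bad = p.validate("abc")
q = p.with_default(5)
```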

Data Classes

intake.readers.datatypes.ASDF(url[, ...])

Advanced Scientific Data Format

intake.readers.datatypes.AVRO(url[, ...])

Structured record passing file format

intake.readers.datatypes.CSV(url[, ...])

Human-readable tabular format, Comma Separated Values


Datatypes that are groupings of other data

intake.readers.datatypes.CatalogAPI(url[, ...])

An API endpoint capable of describing Intake catalogs

intake.readers.datatypes.CatalogFile(url[, ...])

Intake catalog expressed as YAML

intake.readers.datatypes.DICOM(url[, ...])

Imaging data usually from medical scans


Indexed set of parquet files with versioning and diffs

intake.readers.datatypes.Excel(url[, ...])

The well-known spreadsheet app's file format

intake.readers.datatypes.FITS(url[, ...])

Tabular or array data in text/binary format common in astronomy

intake.readers.datatypes.Feather1(url[, ...])

Deprecated tabular format from the Arrow project

intake.readers.datatypes.Feather2(url[, ...])

Tabular format based on Arrow IPC

intake.readers.datatypes.FileData(url[, ...])

Datatypes loaded from files, local or remote


One of the filetypes at https://gdal.org/drivers/raster/index.html


One of the filetypes at https://gdal.org/drivers/vector/index.html

intake.readers.datatypes.GRIB2(url[, ...])

"Gridded" file format commonly used in meteo forecasting

intake.readers.datatypes.GeoJSON(url[, ...])

Geo data (position and geometries) within JSON

intake.readers.datatypes.GeoPackage(url[, ...])

Geo data (position and geometries) in a SQLite DB file

intake.readers.datatypes.HDF5(url[, ...])

Hierarchical tree of ND-arrays, widely used scientific file format

intake.readers.datatypes.Handle(url[, ...])

An identifier registered on handle registry




Indexed set of parquet files with versioning and diffs

intake.readers.datatypes.JPEG(url[, ...])

Image format with good compression for the internet

intake.readers.datatypes.JSONFile(url[, ...])

Nested record format as readable text, very common over HTTP

intake.readers.datatypes.KerasModel(url[, ...])

Keras model parameter set

intake.readers.datatypes.Literal(data[, ...])

A value that can be embedded directly to YAML (text, dict, list)

intake.readers.datatypes.MatlabArray(path[, ...])

A single array in a .mat file

intake.readers.datatypes.MatrixMarket(url[, ...])

Text format for sparse array

intake.readers.datatypes.NetCDF3(url[, ...])

Collection of ND-arrays with coordinates, scientific file format

intake.readers.datatypes.Nifti(url[, ...])

Medical imaging or volume data file

intake.readers.datatypes.NumpyFile(url[, ...])

Simple array format

intake.readers.datatypes.ORC(url[, ...])

Columnar-optimized tabular binary file format

intake.readers.datatypes.OpenDAP(url[, ...])

Earth-science oriented searchable HTTP API

intake.readers.datatypes.PNG(url[, ...])

Portable Network Graphics, common image format

intake.readers.datatypes.Parquet(url[, ...])

Column-optimized binary format

intake.readers.datatypes.PickleFile(url[, ...])

Python pickle, arbitrary serialized object

intake.readers.datatypes.Prometheus(url[, ...])

Monitoring metric query service


Source code file

intake.readers.datatypes.RawBuffer(url, dtype)

A C or FORTRAN N-dimensional array buffer without metadata


Serialized model made by sklearn

intake.readers.datatypes.SQLQuery(conn, query)

Query on a database-like service

intake.readers.datatypes.SQLite(url[, ...])

Database data stored in files

intake.readers.datatypes.STACJSON(url[, ...])

Data assets related to geo data, either as static JSON or a searchable API

intake.readers.datatypes.Service(url[, ...])

Datatypes loaded from some service

intake.readers.datatypes.Shapefile(url[, ...])

Geo data (position and geometries) in a set of related binary files

intake.readers.datatypes.TFRecord(url[, ...])

Tensorflow record file, ready for machine learning


Datasets on a THREDDS server

intake.readers.datatypes.TIFF(url[, ...])

Image format commonly used for large data

intake.readers.datatypes.Text(url[, ...])

Any text file

intake.readers.datatypes.TileDB(url[, ...])

Service exposing versioned, chunked and potentially sparse arrays

intake.readers.datatypes.TiledDataset(url[, ...])

Data access service for data-aware portals and data science tools

intake.readers.datatypes.TiledService(url[, ...])

intake.readers.datatypes.WAV(url[, ...])

Waveform/sound file

intake.readers.datatypes.XML(url[, ...])

Extensible Markup Language file

intake.readers.datatypes.YAMLFile(url[, ...])

Human-readable JSON/object-like format

intake.readers.datatypes.Zarr(url[, ...])

Cloud optimised, chunked N-dimensional file format

Reader Classes

Includes readers, transformers, converters and output classes.


Finds the earthdata datasets that contain some data in the given query bounds


Read particular earthdata dataset by ID and parameter bounds


Datasets from HuggingfaceHub


Example datasets from sklearn.datasets


Uses SQLAlchemy to get the list of tables at some SQL URL

intake.readers.catalogs.STACIndex(*args[, ...])

Searches stacindex.org for known public STAC data sources


Create a Catalog from a STAC endpoint or file


Get stac objects matching a search spec from a STAC endpoint

intake.readers.catalogs.StackBands(*args[, ...])

Reimplementation of "StackBandsSource" from intake-stac


Read from THREDDS endpoint


Datasets from the TensorFlow public registry


Creates a catalog of Tiled datasets from a root URL


Standard example PyTorch datasets

intake.readers.convert.ASDFToNumpy(*args[, ...])


Converts from one object type to another



intake.readers.convert.DaskToRay(*args[, ...])



intake.readers.convert.DicomToNumpy(*args[, ...])

intake.readers.convert.DuckToPandas(*args[, ...])

intake.readers.convert.FITSToNumpy(*args[, ...])

intake.readers.convert.GenericFunc(*args[, ...])

Call given arbitrary function







intake.readers.convert.PandasToRay(*args[, ...])

intake.readers.convert.Pipeline(steps, ...)

Holds a list of transforms/conversions to be enacted in sequence

intake.readers.convert.PolarsEager(*args[, ...])

intake.readers.convert.PolarsLazy(*args[, ...])


intake.readers.convert.RayToDask(*args[, ...])

intake.readers.convert.RayToPandas(*args[, ...])

intake.readers.convert.RayToSpark(*args[, ...])

intake.readers.convert.SparkDFToRay(*args[, ...])



Implemented only if an attribute was not already chosen.


intake.readers.convert.TiledSearch(*args[, ...])

See https://blueskyproject.io/tiled/tutorials/search.html

intake.readers.convert.ToHvPlot(*args[, ...])

intake.readers.convert.TorchToRay(*args[, ...])

intake.readers.output.CatalogToJson(*args[, ...])



creates one of several output file types


Take a matplotlib figure and save to PNG file


Save a single array into a single binary file

intake.readers.output.PandasToCSV(*args[, ...])


intake.readers.output.PandasToHDF5(*args[, ...])


intake.readers.output.Repr(*args[, ...])

good for including "peek" at data in entries' metadata

intake.readers.output.ToMatplotlib(*args[, ...])


intake.readers.output.XarrayToZarr(*args[, ...])

intake.readers.readers.ASDFReader(*args[, ...])

intake.readers.readers.Awkward(*args[, ...])

intake.readers.readers.AwkwardAVRO(*args[, ...])

intake.readers.readers.AwkwardJSON(*args[, ...])


intake.readers.readers.Condition(*args[, ...])





intake.readers.readers.DaskCSV(*args[, ...])

intake.readers.readers.DaskDF(*args[, ...])


intake.readers.readers.DaskHDF(*args[, ...])

intake.readers.readers.DaskJSON(*args[, ...])

intake.readers.readers.DaskNPYStack(*args[, ...])

Requires a directory with .npy files and an "info" pickle file

intake.readers.readers.DaskParquet(*args[, ...])

intake.readers.readers.DaskSQL(*args[, ...])

intake.readers.readers.DaskZarr(*args[, ...])

intake.readers.readers.DeltaReader(*args[, ...])

intake.readers.readers.DicomReader(*args[, ...])

intake.readers.readers.DuckCSV(*args[, ...])

intake.readers.readers.DuckDB(*args[, ...])

intake.readers.readers.DuckJSON(*args[, ...])

intake.readers.readers.DuckParquet(*args[, ...])

intake.readers.readers.DuckSQL(*args[, ...])

intake.readers.readers.FITSReader(*args[, ...])


The contents of file(s) as bytes


intake.readers.readers.FileReader(*args[, ...])

Convenience superclass for readers of files




Dereference handle (hdl:) identifiers


intake.readers.readers.KerasAudio(*args[, ...])



intake.readers.readers.KerasText(*args[, ...])


intake.readers.readers.NumpyReader(*args[, ...])

intake.readers.readers.NumpyText(*args[, ...])

intake.readers.readers.NumpyZarr(*args[, ...])

intake.readers.readers.Pandas(*args[, ...])

intake.readers.readers.PandasCSV(*args[, ...])

intake.readers.readers.PandasExcel(*args[, ...])


intake.readers.readers.PandasHDF5(*args[, ...])

intake.readers.readers.PandasORC(*args[, ...])



intake.readers.readers.Polars(*args[, ...])

intake.readers.readers.PolarsAvro(*args[, ...])

intake.readers.readers.PolarsCSV(*args[, ...])


intake.readers.readers.PolarsExcel(*args[, ...])



intake.readers.readers.PolarsJSON(*args[, ...])



intake.readers.readers.PythonModule(*args[, ...])


intake.readers.readers.Ray(*args[, ...])

intake.readers.readers.RayBinary(*args[, ...])

intake.readers.readers.RayCSV(*args[, ...])

intake.readers.readers.RayDeltaLake(*args[, ...])

intake.readers.readers.RayJSON(*args[, ...])

intake.readers.readers.RayParquet(*args[, ...])

intake.readers.readers.RayText(*args[, ...])

intake.readers.readers.Retry(*args[, ...])

Retry (part of) a pipeline until it returns without exception






intake.readers.readers.SparkCSV(*args[, ...])



intake.readers.readers.SparkParquet(*args[, ...])

intake.readers.readers.SparkText(*args[, ...])

intake.readers.readers.TFORC(*args[, ...])



intake.readers.readers.TFSQL(*args[, ...])

intake.readers.readers.TFTextreader(*args[, ...])


intake.readers.readers.TileDBReader(*args[, ...])

intake.readers.readers.TiledClient(*args[, ...])

intake.readers.readers.TiledNode(*args[, ...])

intake.readers.readers.TorchDataset(*args[, ...])




intake.readers.transform.GetItem(*args[, ...])

Equivalent of x[item]

intake.readers.transform.Method(*args[, ...])

Call named method on object



intake.readers.transform.XarraySel(*args[, ...])