API Reference

User Functions

intake.config.Config([filename])

Intake's dict-like config system

intake.readers.datatypes.recommend([url, ...])

Show which Intake data types can apply to the given details

intake.readers.convert.auto_pipeline(url[, ...])

Create pipeline from given URL to desired output type

intake.readers.convert.path(start, end[, ...])

Find possible conversion paths from start to end types

intake.readers.entry.Catalog([entries, ...])

A collection of data and reader descriptions.

intake.readers.entry.DataDescription(datatype)

Defines some data: class and arguments.

intake.readers.entry.ReaderDescription(reader)

A serialisable description of a reader or pipeline

intake.readers.readers.recommend(data)

Show which readers claim to support the given data instance or a superclass

intake.readers.readers.reader_from_call(...)

Attempt to construct a reader instance by finding one that matches the function call

intake.readers.inspect.inspect_dataset(url)

Inspect a dataset at url and return a summary dictionary.

class intake.config.Config(filename=None, **kwargs)

Intake’s dict-like config system

Instance intake.conf is globally used throughout the package

Attributes:
environment_conf_parsestr

Whether to “ignore” (default), “warn” or raise an “error” when parsing local environment variables as strings.

get(key, default=None)

Return the value for key if key is in the dictionary, else default.

load(fn=None)

Update global config from YAML file

If fn is None, looks in the global config directory, which is either defined by the INTAKE_CONF_DIR env-var or is ~/.intake/.
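
The lookup rule above can be pictured with the following minimal sketch. The helper name resolve_conf_dir is hypothetical and is not part of Intake; only the INTAKE_CONF_DIR variable and the ~/.intake/ fallback are taken from the description.

```python
import os


def resolve_conf_dir(environ=None):
    """Return the config directory: INTAKE_CONF_DIR if set, else ~/.intake.

    Hypothetical helper for illustration; Intake's own lookup may differ.
    """
    environ = os.environ if environ is None else environ
    explicit = environ.get("INTAKE_CONF_DIR")
    if explicit:
        return explicit
    return os.path.join(os.path.expanduser("~"), ".intake")


print(resolve_conf_dir({"INTAKE_CONF_DIR": "/tmp/intake-conf"}))
print(resolve_conf_dir({}))
```

Passing the environment as a plain dict keeps the sketch easy to try without touching the real process environment.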

load_env()

Analyse environment variables and update conf accordingly

reset()

Set conf values back to defaults

save(fn=None)

Save current configuration to file as YAML

Uses self.filename for target location

set(update_dict=None, **kw)

Change config values within a context or for the session

values: dict

This can be deeply nested to set only leaf values

See also: intake.readers.utils.nested_keys_to_dict

Examples

Value resets after context ends

>>> with intake.conf.set(myval=5):
...     ...

Set for whole session

>>> intake.conf.set(myval=5)

Set only a single leaf value within a nested dict

>>> intake.conf.set(intake.readers.utils.nested_keys_to_dict({"deep.2.key": True}))
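
To make the idea of setting only a leaf value concrete, here is a stand-alone sketch that expands a dotted key and merges it into a nested dict. Both functions are hypothetical illustrations, not Intake's nested_keys_to_dict or Config.set. In particular, this sketch keeps every path segment as a string key; how Intake treats numeric segments such as "2" is not stated here.

```python
def expand_dotted(flat):
    """Expand {"a.b.c": v} into {"a": {"b": {"c": v}}}; segments stay strings."""
    nested = {}
    for dotted, value in flat.items():
        node = nested
        *parents, leaf = dotted.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def deep_update(target, update):
    """Merge update into target in place, replacing only the leaves it names."""
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            deep_update(target[key], value)
        else:
            target[key] = value
    return target


# Made-up config content, used only to show that siblings are left alone.
conf = {"deep": {"2": {"key": False, "other": 1}}, "top": "unchanged"}
deep_update(conf, expand_dotted({"deep.2.key": True}))
print(conf)
```

Only conf["deep"]["2"]["key"] changes; the sibling "other" and the unrelated "top" entry are untouched.
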
intake.readers.datatypes.recommend(url: str | None = None, mime: str | None = None, head: bool = True, contents: bool = False, storage_options=None, ignore: set[str] | None = None) set[BaseData]

Show which Intake data types can apply to the given details

Parameters:
url: str

Location of data

mime: str

MIME type, usually “x/y” form

head: bytes | bool | None

A small number of bytes from the file head, used to look for magic bytes. If True, fetch these bytes from the given URL/storage_options and use them. If None, only fetch bytes if there is no match by MIME type or path. If False, don’t fetch at all.
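
The four accepted forms of head can be summarised by a small decision function. This is a sketch of the rule as described, not the library's code; resolve_head, matched_by_name and fetch are hypothetical names.

```python
def resolve_head(head, matched_by_name, fetch):
    """Return the bytes to test against magic numbers, or None to skip that test.

    head: bytes (use as given), True (always fetch), None (fetch only if
    nothing matched by MIME type or path), False (never fetch).
    matched_by_name: whether MIME type or path already produced a match.
    fetch: zero-argument callable that reads the first bytes of the file.
    """
    if isinstance(head, bytes):
        return head
    if head is True or (head is None and not matched_by_name):
        return fetch()
    return None


def fake_fetch():
    # Stand-in for a remote read; returns the 4-byte marker that opens Parquet files.
    return b"PAR1"


print(resolve_head(None, matched_by_name=False, fetch=fake_fetch))
print(resolve_head(False, matched_by_name=False, fetch=fake_fetch))
```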

contents: bool | None

Attempt to delve into URL to analyse constituent files. This can significantly slow your recommendation.

storage_options: dict | None

If passing a URL which might be a remote file, storage_options can be used by fsspec.

ignore: set | None

Don’t include these in the output

Returns:
set of matching datatype classes.
intake.readers.convert.auto_pipeline(url: str | BaseData, outtype: str | tuple[str] = '', storage_options: dict | None = None, avoid: list[str] | None = None, prefer: list[str] | None = None, exclude: list[str] | None = None) Pipeline

Create pipeline from given URL to desired output type

Will search for the shortest conversion path from the inferred data-type to the output.
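
A shortest-path search of this kind can be illustrated with a breadth-first search over a graph of conversions. The sketch below is a generic BFS under assumed inputs: the graph contents are invented for the example and the function is not Intake's implementation.

```python
from collections import deque


def shortest_path(graph, start, wanted):
    """Breadth-first search: shortest path from start to a node matching `wanted`.

    A node matches when `wanted` is a case-insensitive substring of its name.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if wanted.lower() in path[-1].lower():
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical conversion graph: each key can be converted to the listed types.
graph = {
    "CSV": ["pandas:DataFrame", "dask.dataframe:DataFrame"],
    "dask.dataframe:DataFrame": ["pandas:DataFrame"],
    "pandas:DataFrame": ["polars:DataFrame"],
}
print(shortest_path(graph, "CSV", "polars"))
```

Because BFS explores paths in order of length, the first match found is one of the shortest.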

Parameters:
url: input data, usually a location/URL, but maybe a data instance
outtype: pattern to match to possible output types (instance or last converter)
storage_options: if url is a remote str, these are kwargs that fsspec may need to access it

avoid: don’t consider readers whose names match any of these strings
prefer:

List of substring patterns (case-insensitive) matched against reader class names. Matching readers are tried before non-matching ones when multiple candidates satisfy the path. Example: prefer=["Polars", "Duck"].

exclude:

List of substring patterns (case-insensitive) matched against reader class names. Any reader whose class name matches is removed from consideration. Example: exclude=["Spark", "Ray"].
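
The effect of prefer and exclude on a list of candidate names can be sketched as follows. This is an illustration of the described matching rules only; order_candidates is a hypothetical helper and the reader names are plain example strings.

```python
def order_candidates(names, prefer=None, exclude=None):
    """Drop names matching any exclude pattern, then move preferred names first."""
    prefer = [p.lower() for p in (prefer or [])]
    exclude = [e.lower() for e in (exclude or [])]
    kept = [n for n in names if not any(e in n.lower() for e in exclude)]
    # sorted() is stable, so the original order is kept within each group.
    return sorted(kept, key=lambda n: not any(p in n.lower() for p in prefer))


# Example names only; they are not looked up in any registry.
readers = ["PandasCSV", "DaskCSV", "PolarsCSV", "SparkDataFrame", "DuckCSV"]
print(order_candidates(readers, prefer=["Polars", "Duck"], exclude=["Spark", "Ray"]))
```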

class intake.readers.entry.Catalog(entries: Iterable[ReaderDescription] | Mapping | None = None, aliases: dict[str, int] | None = None, data: Iterable[DataDescription] | Mapping = None, user_parameters: dict[str, BaseUserParameter] | None = None, parameter_overrides: dict[str, Any] | None = None, metadata: dict | None = None)

A collection of data and reader descriptions.

add_entry(entry, name: str | None = None, clobber: bool = True, simplify: bool = False)

Add entry/reader (and its requirements) in-place, with optional alias

Parameters:
entry: instance of BaseData, BaseReader or their descriptions
name: set the key value the item will be known as
clobber: if False, will not overwrite an entry
simplify: if True, checks if an equivalent entity already exists, and returns its token if found. Such comparisons are relatively slow when you have many more than 100 entries.

alias(tok: str, name: str, clobber=True)

Give an alias to a dataset

tok:

a key in the .entries dict

delete(name, recursive=False)

Remove named entity (data/entry) from catalog

We do not check whether any other entity in the catalog refers to what is being deleted, so you can break other entries this way.

Parameters:
recursive: bool

Also remove data/entries referenced by the given one, and those they refer to in turn.

extract_parameter(item: str, name: str, path: str | None = None, value: ~typing.Any = None, cls=<class 'intake.readers.user_parameters.SimpleUserParameter'>, store_to: str | None = None, **kw)

Descend into data & reader descriptions to create a user_parameter

There are two ways to find and replace values by a template:

  • if path is given, the kwargs will be walked to this location e.g., “field.0.special_value” -> kwargs[“field”][0][“special_value”]

  • if value is given, all kwargs will be recursively walked, looking for values that equal that given.

Matched values will be replaced by a template string like "{name}", and a user_parameter of class cls will be placed in the location given by store_to (could be “data”, “catalog”).
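
The two lookup modes can be sketched on a plain kwargs dict. The functions below are hypothetical stand-ins written for this illustration; they do not create a user parameter object and are not the method's actual implementation.

```python
def replace_at_path(kwargs, path, name):
    """Walk a dotted path and swap the value found there for a "{name}" template.

    Returns the original value, which could serve as the parameter default.
    """
    *parents, leaf = path.split(".")
    node = kwargs
    for part in parents:
        node = node[int(part)] if isinstance(node, list) else node[part]
    key = int(leaf) if isinstance(node, list) else leaf
    original = node[key]
    node[key] = "{" + name + "}"
    return original


def replace_by_value(obj, value, name):
    """Recursively replace every value equal to `value` with a "{name}" template."""
    if isinstance(obj, dict):
        return {k: replace_by_value(v, value, name) for k, v in obj.items()}
    if isinstance(obj, list):
        return [replace_by_value(v, value, name) for v in obj]
    return "{" + name + "}" if obj == value else obj


kwargs = {"field": [{"special_value": 42}], "other": 42}
default = replace_at_path(kwargs, "field.0.special_value", "answer")
print(default, kwargs)
print(replace_by_value({"a": [1, 42], "b": {"c": 42}}, 42, "answer"))
```

The path form touches exactly one location, while the value form replaces every equal value it meets.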

classmethod from_dict(data)

Assemble catalog from dict representation

classmethod from_entries(data: dict, metadata=None)

Assemble catalog from a dict of entries

static from_yaml_file(path: str, **kwargs)

Load YAML representation into a new Catalog instance

storage_options:

kwargs to pass to fsspec for opening the file to read; can pass as storage_options= or will pick up any unused kwargs for simplicity

get_aliases(entity: str)

Return those alias names that point to the given opaque key

get_entity(item: str)

Get the objects by reference

Use this method if you want to change the catalog in-place

item can be an entry in .aliases, in which case the original will be returned, or a key in .entries, .user_parameters or .data. The entity in question is returned without processing.

give_name(tok: str, name: str, clobber=True)

Give an alias to a dataset

tok:

a key in the .entries dict

move_parameter(from_entity: str, to_entity: str, parameter_name: str) Catalog

Move user-parameter between entry/data

entity is an alias name or entry/data token

promote_parameter_name(parameter_name: str, level: str = 'cat') Catalog

Find and promote given named parameter, assuming they are all identical

parameter_name:

the key string referring to the parameter

level: cat | data

If the parameter is found in a reader, it can be promoted to the data it depends on. Parameters in a data description can only be promoted to a catalog global.

rename(old: str, new: str, clobber=True)

Change the alias of a dataset

search(expr) Catalog

Make new catalog with a subset of this catalog

The new catalog will have those entries which pass the filter expr, which is an instance of intake.readers.search.SearchBase (i.e., has a method like filter(entry) -> bool).

In the special case that expr is just a string, the Text search expression will be used.
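
The filtering contract, an object exposing filter(entry) -> bool, can be sketched without the library. In this illustration the entries are plain dicts standing in for reader descriptions, and HasTag and subset are hypothetical names.

```python
class HasTag:
    """Hypothetical search term: matches entries whose metadata lists a tag."""

    def __init__(self, tag):
        self.tag = tag

    def filter(self, entry):
        return self.tag in entry.get("metadata", {}).get("tags", [])


def subset(entries, expr):
    """Keep only the entries for which expr.filter(entry) is true."""
    return {name: e for name, e in entries.items() if expr.filter(e)}


# Made-up entries; real catalogs hold ReaderDescription objects instead.
entries = {
    "a": {"metadata": {"tags": ["weather"]}},
    "b": {"metadata": {"tags": ["finance"]}},
}
print(list(subset(entries, HasTag("weather"))))
```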

to_yaml_file(path: str, **storage_options)

Persist the state of this catalog as a YAML file

storage_options:

kwargs to pass to fsspec for opening the file to write

class intake.readers.entry.DataDescription(datatype: str, kwargs: dict = None, metadata: dict = None, user_parameters: dict = None)

Defines some data: class and arguments. This may be loaded in a number of ways.

A DataDescription normally resides in a Catalog, and can contain templated arguments. When there are user_parameters, these will also be applied to any reader that depends on this data.

get_kwargs(user_parameters: dict[str | BaseUserParameter] | None = None, **kwargs) dict[str, Any]

Get set of kwargs for given reader, based on prescription, new args and user parameters

Here, user_parameters is intended to come from the containing catalog. To provide values for a user parameter, include it by name in kwargs

class intake.readers.entry.ReaderDescription(reader: str, kwargs: dict[str, Any] | None = None, user_parameters: dict[str | BaseUserParameter] | None = None, metadata: dict | None = None, output_instance: str | None = None)

A serialisable description of a reader or pipeline

This class is typically stored inside Catalogs, and can contain templated arguments which get evaluated at the time that it is accessed from a Catalog.

check_imports()

Are the packages listed in the “imports” key of the metadata available?
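
One way such an availability check could be written, shown here only as an assumption about the general approach, is with importlib.util.find_spec from the standard library. missing_packages is a hypothetical helper, not a method of this class.

```python
import importlib.util


def missing_packages(packages):
    """Return the top-level package names that cannot be found for import."""
    return sorted(p for p in packages if importlib.util.find_spec(p) is None)


# "json" ships with Python; the second name is deliberately bogus.
print(missing_packages({"json", "surely_not_installed_pkg_xyz"}))
```

find_spec locates a top-level package without importing it, so the check stays cheap even for heavy dependencies.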

extract_parameter(name: str, path=None, value=None, cls=<class 'intake.readers.user_parameters.SimpleUserParameter'>, **kw)

Creates new version of the description

Creates new instance, since the token will in general change

classmethod from_dict(data)

Recreate instance from the results of to_dict()

get_kwargs(user_parameters=None, **kwargs) dict[str, Any]

Get set of kwargs for given reader, based on prescription, new args and user parameters

Here, user_parameters is intended to come from the containing catalog. To provide values for a user parameter, include it by name in kwargs

to_cat(name=None)

Create a Catalog containing only this entry

intake.readers.readers.recommend(data)

Show which readers claim to support the given data instance or a superclass

The ordering is more specific readers first

intake.readers.readers.reader_from_call(func: str, *args, join_lines=False, **kwargs) BaseReader

Attempt to construct a reader instance by finding one that matches the function call

Fails for readers that don’t define a func, probably because it depends on the file type or needs a dynamic instance to be a method of.

Parameters:
func: callable | str

If a callable, pass args and kwargs as you would have done to execute the function. If a string, it should look like "func(arg1, args2, kwarg1, **kw)", i.e., a normal python call but as a string. In the latter case, args and kwargs are ignored.
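
To show what interpreting a call given as a string can involve, the sketch below splits such a string into a function name, positional arguments and keyword arguments using the standard ast module. It is an independent illustration that handles literal arguments only; parse_call is a hypothetical name and this is not how reader_from_call is known to work internally.

```python
import ast


def parse_call(text):
    """Split a call written as a string into (function name, args, kwargs).

    Only literal arguments (strings, numbers, lists, ...) are supported here.
    """
    node = ast.parse(text, mode="eval").body
    if not isinstance(node, ast.Call):
        raise ValueError("expected a single function call")
    name = ast.unparse(node.func)  # requires Python 3.9+
    args = [ast.literal_eval(a) for a in node.args]
    kwargs = {k.arg: ast.literal_eval(k.value) for k in node.keywords}
    return name, args, kwargs


print(parse_call("pd.read_csv('data.csv', sep=';', nrows=10)"))
```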

intake.readers.inspect.inspect_dataset(url: str, storage_options: dict | None = None, max_bytes: int = 50000000, timeout: float | None = 30.0, metadata: dict | None = None, prefer: list[str] | None = None, exclude: list[str] | None = None, retry: bool = True) dict

Inspect a dataset at url and return a summary dictionary.

Parameters:
url:

Location of the data. Any fsspec-compatible URL is accepted (s3://, gs://, https://, local path, …).

storage_options:

Keyword arguments forwarded to fsspec (credentials, etc.).

max_bytes:

Maximum file size (bytes) for which a Tier-3 (full-read) reader will be attempted. Set to None to disable the guard entirely.

timeout:

Wall-clock seconds to allow for each discover() call. None disables the timeout. Note: the background thread may continue after a timeout is triggered.

metadata:

Extra metadata dict merged into the BaseData instance.

prefer:

List of substring patterns (case-insensitive) matched against reader class names. Matching readers are moved to the front of the candidate list (while still sorted by tier within the preferred group). Example: prefer=["Polars", "Duck"].

exclude:

List of substring patterns (case-insensitive). Any reader whose class name contains one of these patterns is removed from the candidate list entirely before any attempt is made. Example: exclude=["Spark", "Ray"].

retry:

If True (default), when the chosen reader’s discover() raises or times out the next candidate in the ordered list is tried automatically, continuing until one succeeds or the list is exhausted. If False, the first failure is recorded and the function returns immediately without trying further readers.
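
The retry behaviour can be sketched as a loop over an ordered candidate list. This is a simplified illustration: timeouts are left out, try_readers and the reader names are hypothetical, and only the result keys reader_used, reader_tier, readers_attempted and errors are borrowed from the return value described in this entry.

```python
def try_readers(candidates, retry=True):
    """Call each candidate's discover() in order until one succeeds.

    candidates: list of (name, tier, discover) tuples, already ordered.
    """
    out = {"reader_used": None, "reader_tier": None,
           "readers_attempted": [], "errors": []}
    for name, tier, discover in candidates:
        out["readers_attempted"].append(name)
        try:
            discover()
        except Exception as exc:  # non-fatal: record it and maybe move on
            out["errors"].append(f"{name}: {exc}")
            if not retry:
                break
            continue
        out["reader_used"], out["reader_tier"] = name, tier
        break
    return out


def broken():
    raise OSError("cannot open")


candidates = [("ReaderA", 1, broken), ("ReaderB", 2, lambda: "ok")]
result = try_readers(candidates)
print(result["reader_used"], result["readers_attempted"], result["errors"])
print(try_readers(candidates, retry=False)["reader_used"])
```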

Returns:
dict with keys:
url

The input URL.

detected_type

Class name of the first matching BaseData subclass, or None.

detected_type_qname

Fully-qualified name ("module:Class"), or None.

structure

Set of structural tags from the datatype (e.g. {"table"}).

reader_used

Class name of the reader that ultimately succeeded, or None.

reader_tier

Integer 1/2/3 for the reader that succeeded, or None.

readers_attempted

Ordered list of reader class names that were tried (including failures).

description

Value of metadata["description"] from the data instance, if any.

datashape

Dict of schema information (columns + dtypes, or xarray dims, etc.). Does not include shape — that lives exclusively at the top-level shape key.

shape

List of integer dimensions (e.g. [1000, 4]), or None when the shape cannot be determined without a full scan (lazy DataFrames, partial reads, etc.).

npartitions

Number of partitions as reported by the discovered object (Dask, Ray, etc.). For file-based data with no in-memory partition count this falls back to n_files.

n_files

Number of individual files that make up the dataset (after glob expansion), or None if the URL is not file-based / unknowable.

file_size_bytes

Total size in bytes across all files, or None if any file’s size could not be determined or the URL is not file-based.

repr

Plain-text repr() of the discovered object (capped at 1000 chars).

html_repr

HTML string from _repr_html_() / _repr_svg_(), or None.

thumbnail

data:image/png;base64,… URI, or None.

metadata

The metadata dict attached to the BaseData / BaseReader.

readers

Dict mapping every candidate reader class name to a sub-dict with keys "importable" (bool) and "tier" (int 1/2/3). Whether a reader is importable reflects the current environment only; another machine may have different packages installed.

errors

List of error strings for non-fatal problems encountered.
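
A caller might condense such a summary dictionary into a single line, as in the sketch below. The describe function is hypothetical and the sample values are invented for illustration; they are not the output of a real call.

```python
def describe(info):
    """One-line digest of a summary dict that uses the keys listed above."""
    if info.get("shape"):
        shape = "x".join(str(n) for n in info["shape"])
    else:
        shape = "unknown shape"
    reader = info.get("reader_used") or "no reader succeeded"
    problems = len(info.get("errors") or [])
    return (f"{info['url']}: {info.get('detected_type')} ({shape}) "
            f"via {reader}, {problems} error(s)")


# Invented values for illustration only.
sample = {
    "url": "data.csv", "detected_type": "CSV", "shape": [1000, 4],
    "reader_used": "DaskCSV", "errors": [],
}
print(describe(sample))
```

Using .get() for the optional keys keeps the digest working when shape or reader_used is None.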

Base Classes

These may be subclassed by developers

intake.readers.datatypes.BaseData([metadata])

Prototype dataset definition

intake.readers.readers.BaseReader(*args[, ...])

intake.readers.convert.BaseConverter(*args)

Converts from one object type to another

intake.readers.namespaces.Namespace(reader)

A set of functions as an accessor on a Reader, producing a Pipeline

intake.readers.search.SearchBase()

Prototype for a single term in a search expression

intake.readers.user_parameters.BaseUserParameter(default)

The base class allows for any default without checking/coercing

class intake.readers.datatypes.BaseData(metadata: dict[str, Any] | None = None)

Prototype dataset definition

auto_pipeline(outtype: str | tuple[str], avoid: list[str] | None = None, prefer: list[str] | None = None, exclude: list[str] | None = None)

Find a pipeline to transform from this to the given output type

Parameters:
outtype:

Pattern matched against possible output types / converter names.

avoid:

Reader/converter names (substring patterns) to exclude from the graph search entirely.

prefer:

Substring patterns (case-insensitive) matched against reader class names. Matching readers are tried before others when multiple candidates exist.

exclude:

Substring patterns (case-insensitive) matched against reader class names. Matching readers are removed from consideration.

contains: set[str] = {}

if using a directory URL, an ls() on that path will contain these things

filepattern: str = ''

regex, file URLs to match; empty if relying on magic or contains

magic: set[bytes | tuple] = {}

binary patterns, usually at the file head; each item identifies this data type

mimetypes: str = ''

regex, MIME pattern to match

property possible_outputs

Map of importable readers to the expected output class of each

property possible_readers

List of reader classes for this type, grouped by importability

structure: set[str] = {}

informational tags for nature of data, e.g., “array”

to_entry()

Create DataDescription version of this, for placing in a Catalog

to_reader(type_or_reader=None, outtype: str | None = None, reader: str | None = None, prefer: list[str] | None = None, exclude: list[str] | None = None, **kw)

Find an appropriate reader for this data

If all Nones are passed, the first importable reader will be picked. If there is any selection, you will get ValueError on failure.

See also .possible_outputs

Parameters:
type_or_reader: matches either on type or reader name, whichever is found first
outtype: string to match against the output classes of potential readers
reader: string to match against the class names of the readers
prefer:

List of substring patterns (case-insensitive). Matching readers are tried before non-matching ones when multiple candidates satisfy the selection criteria. Example: prefer=["Polars", "Duck"].

exclude:

List of substring patterns (case-insensitive). Any reader whose class name matches is removed from consideration entirely. Example: exclude=["Spark", "Ray"].

to_reader_cls(type_or_reader=None, outtype: tuple[str] | str | None = None, reader: tuple[str] | str | type | None = None, prefer: list[str] | None = None, exclude: list[str] | None = None)

Return the reader class best suited for this data instance.

Parameters:
type_or_reader:

Convenience argument: tried first as outtype, then as reader.

outtype:

Substring pattern(s) matched (case-insensitively) against each candidate reader’s output_instance string.

reader:

Either a fully-qualified import string ("pandas:read_csv"), a reader class directly, or a substring pattern matched case-insensitively against each candidate reader’s qualified name.

prefer:

List of substring patterns (case-insensitive). Matching readers are tried before non-matching ones when multiple candidates satisfy outtype or reader. Has no effect when a bare reader class or exact import string is given.

exclude:

List of substring patterns (case-insensitive). Any reader whose class name matches is removed from consideration entirely.

class intake.readers.readers.BaseReader(*args, metadata: dict | None = None, output_instance: str | None = None, **kwargs)
property data

The BaseData this reader depends on, if it has one

discover(**kwargs)

Part of the data

The intent is to return a minimal dataset, but for some readers and conditions this may be up to the whole of the data. Output type is the same as for read().

classmethod doc()

Doc associated with loading function

func: str = 'builtins:NotImplementedError'

function name for loading data

func_doc: str = None

docstring origin if not from func

implements: set[BaseData] = {}

datatype(s) this applies to

imports: set[str] = {}

top-level packages required to use this

classmethod is_ok(data) bool

Determine whether this reader is suitable for the given data instance.

This is called after the type-based implements check and allows a reader to inspect the properties of a concrete data instance (e.g. the shape of its URL, whether it is a remote resource, etc.) to decide whether it should be recommended.

Override this in subclasses to add instance-level constraints on top of the class-level implements declaration.

Parameters:
data:

The BaseData instance being evaluated.

Returns:
bool

True if this reader can handle the data instance, False to exclude it from the recommend() results.
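
The two-stage check, class-level implements first and instance-level is_ok() second, can be sketched with stand-in classes. FakeData, LocalOnlyReader and this recommend function are hypothetical and exist only for the illustration; they do not subclass anything from Intake.

```python
class FakeData:
    """Stand-in for a BaseData instance; only carries a URL."""

    def __init__(self, url):
        self.url = url


class LocalOnlyReader:
    """Hypothetical reader that accepts FakeData, but only for local paths."""

    implements = {FakeData}

    @classmethod
    def is_ok(cls, data):
        return "://" not in data.url


def recommend(data, reader_classes):
    """Type-based check first, then the instance-level is_ok() check."""
    return [
        r for r in reader_classes
        if any(isinstance(data, t) for t in r.implements) and r.is_ok(data)
    ]


print(len(recommend(FakeData("/tmp/a.csv"), [LocalOnlyReader])))
print(len(recommend(FakeData("s3://bucket/a.csv"), [LocalOnlyReader])))
```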

optional_imports: set[str] = {}

packages that might be required by some options

other_funcs: set[str] = {}

function names to recognise when matching user calls

output_instance: str = None

type the reader produces

prefer_for_inspect: bool = True

Whether this reader should be preferred by inspect_dataset().

Set to False on readers whose output is designed purely for interactive display (e.g. panel.pane.Image) and carries no queryable schema. Such readers are tried last by inspect_dataset, only after every reader with prefer_for_inspect = True has been exhausted.

read(*args, **kwargs)

Produce data artefact

Any of the arguments encoded in the data instance can be overridden.

Output type is given by the .output_instance attribute

to_cat(name=None)

Create a Catalog containing only this reader

to_entry()

Create an entry version of this, ready to be inserted into a Catalog

to_reader(outtype: tuple[str] | str | None = None, reader: str | None = None, **kw)

Make a different reader for the data used by this reader

class intake.readers.convert.BaseConverter(*args, metadata: dict | None = None, output_instance: str | None = None, **kwargs)

Converts from one object type to another

Most often, subclasses call a single function on the data, but arbitrarily complex transforms are possible. This is designed to be one step in a Pipeline.

Subclasses should set:

instances

A {input_type_qname: output_type_qname} mapping. Keys and values are "module:Class" strings matching output_instance on readers.

func

The primary callable as a "module:name" string. Used by doc() and _func (inherited from BaseReader). Subclasses that perform more than one function call should still set func to the main entry-point for documentation purposes, and override run().

is_ok

Override to reject in-memory objects that this converter cannot handle even when the type name matches (e.g. wrong ndim or dtype).
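
Strings in the "module:name" form used for func can be turned into objects with the standard importlib module. The sketch below shows one plausible way to do so; resolve is a hypothetical helper and not necessarily how the library performs the lookup.

```python
import importlib


def resolve(qname):
    """Import and return the object named by a "module:name" string."""
    module_name, _, attr = qname.partition(":")
    obj = importlib.import_module(module_name)
    for part in attr.split("."):
        obj = getattr(obj, part)
    return obj


dumps = resolve("json:dumps")
print(dumps({"a": 1}))
print(resolve("builtins:NotImplementedError"))
```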

classmethod doc()

Documentation for this conversion step.

Mirrors BaseReader.doc so that converters participate in the same help/introspection conventions as readers.

imports: set[str] = {}

derived automatically from instances

instances: dict[str, str] = {}

mapping from input types to output types

classmethod is_ok(x) bool

Return True if this converter can handle the concrete object x.

This is the converter analogue of BaseReader.is_ok(). The default implementation always returns True; subclasses override it to enforce constraints on the in-memory object (e.g. ndim, dtype, shape) that go beyond the type-name match in instances.

Parameters:
x:

The concrete in-memory object that would be passed to run().

Returns:
bool

False to exclude this converter from convert_classes() results for this particular object.

run(x, *args, **kwargs)

Execute a conversion stage on the output object from another stage

Subclasses may override this
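
How conversion steps with an instances mapping, an is_ok() guard and a run() method might chain together is sketched below. The classes are self-contained stand-ins that do not inherit from BaseConverter, and run_steps is a hypothetical driver rather than the Pipeline class.

```python
class ListToTuple:
    """Hypothetical step: declares its input/output types and a guard."""

    instances = {"builtins:list": "builtins:tuple"}

    @classmethod
    def is_ok(cls, x):
        return len(x) > 0  # reject empty input even though the type matches

    def run(self, x):
        return tuple(x)


class TupleLength:
    """Hypothetical step that accepts any tuple."""

    instances = {"builtins:tuple": "builtins:int"}

    @classmethod
    def is_ok(cls, x):
        return True

    def run(self, x):
        return len(x)


def run_steps(x, steps):
    """Apply each step in turn, refusing a step whose is_ok() rejects the input."""
    for step in steps:
        if not step.is_ok(x):
            raise ValueError(f"{type(step).__name__} cannot handle {x!r}")
        x = step.run(x)
    return x


print(run_steps([1, 2, 3], [ListToTuple(), TupleLength()]))
```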

class intake.readers.namespaces.Namespace(reader)

A set of functions as an accessor on a Reader, producing a Pipeline

acts_on: tuple[str] = ()

types that this namespace is associated with

imports: tuple[str] = ()

requires this top-level package

class intake.readers.search.SearchBase

Prototype for a single term in a search expression

The method filter() is meant to be overridden in subclasses.

filter(entry: ReaderDescription) bool

Does the given ReaderDescription entry match the query?

class intake.readers.user_parameters.BaseUserParameter(default, description='')

The base class allows for any default without checking/coercing

coerce(value)

Change given type to one that matches this parameter’s intent

default

the value to use without user input

description

what is the function of this parameter

set_default(value)

Change the default, if it validates

to_dict()

Dictionary representation of the instance's contents

validate(value) bool

Is the given value allowed by this parameter?

Exceptions are treated as False

with_default(value)

A new instance with different default, if it validates

(original object is left unchanged)
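
The interplay of coerce(), validate() and with_default() can be sketched with a small stand-alone class. IntParameter is hypothetical and does not inherit from BaseUserParameter; in particular, coercing the new default inside with_default() is an assumption of this sketch.

```python
import copy


class IntParameter:
    """Hypothetical parameter that only accepts values convertible to int."""

    def __init__(self, default, description=""):
        self.default = default
        self.description = description

    def coerce(self, value):
        return int(value)

    def validate(self, value):
        try:
            self.coerce(value)
            return True
        except Exception:  # any failure counts as "not valid"
            return False

    def with_default(self, value):
        if not self.validate(value):
            raise ValueError(f"invalid default: {value!r}")
        new = copy.copy(self)  # the original object is left unchanged
        new.default = self.coerce(value)
        return new


p = IntParameter(1, description="number of rows")
q = p.with_default("5")
print(p.default, q.default, p.validate("abc"))
```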

Data Classes

intake.readers.datatypes.ASDF(url[, ...])

Advanced Scientific Data Format

intake.readers.datatypes.AVRO(url[, ...])

Structured record passing file format

intake.readers.datatypes.CSV(url[, ...])

Human-readable tabular format, Comma Separated Values

intake.readers.datatypes.Catalog([metadata])

Datatypes that are groupings of other data

intake.readers.datatypes.CatalogAPI(url[, ...])

An API endpoint capable of describing Intake catalogs

intake.readers.datatypes.CatalogFile(url[, ...])

Intake catalog expressed as YAML

intake.readers.datatypes.DICOM(url[, ...])

Imaging data usually from medical scans

intake.readers.datatypes.DeltalakeTable(url)

Indexed set of parquet files with versioning and diffs

intake.readers.datatypes.Excel(url[, ...])

The well-known spreadsheet app's file format

intake.readers.datatypes.FITS(url[, ...])

Tabular or array data in text/binary format common in astronomy

intake.readers.datatypes.Feather1(url[, ...])

Deprecated tabular format from the Arrow project (Feather v1)

intake.readers.datatypes.FileData(url[, ...])

Datatypes loaded from files, local or remote

intake.readers.datatypes.GDALRasterFile(url)

One of the filetypes at https://gdal.org/drivers/raster/index.html

intake.readers.datatypes.GDALVectorFile(url)

One of the filetypes at https://gdal.org/drivers/vector/index.html

intake.readers.datatypes.GRIB2(url[, ...])

"Gridded" file format commonly used in meteo forecasting

intake.readers.datatypes.GeoJSON(url[, ...])

Geo data (position and geometries) within JSON

intake.readers.datatypes.GeoPackage(url[, ...])

Geo data (position and geometries) in a SQLite DB file

intake.readers.datatypes.HDF5(url[, ...])

Hierarchical tree of ND-arrays, widely used scientific file format

intake.readers.datatypes.Handle(url[, ...])

An identifier registered on handle registry

intake.readers.datatypes.HuggingfaceDataset(name)

https://github.com/huggingface/datasets

intake.readers.datatypes.IcebergDataset(url)

Indexed set of parquet files with versioning and diffs

intake.readers.datatypes.JPEG(url[, ...])

Image format with good compression for the internet

intake.readers.datatypes.JSONFile(url[, ...])

Nested record format as readable text, very common over HTTP

intake.readers.datatypes.KerasModel(url[, ...])

Keras model parameter set

intake.readers.datatypes.Literal(data[, ...])

A value that can be embedded directly to YAML (text, dict, list)

intake.readers.datatypes.MatlabArray(path[, ...])

A single array in a .mat file

intake.readers.datatypes.MatrixMarket(url[, ...])

Text format for sparse array

intake.readers.datatypes.NetCDF3(url[, ...])

Collection of ND-arrays with coordinates, scientific file format

intake.readers.datatypes.Nifti(url[, ...])

Medical imaging or volume data file

intake.readers.datatypes.NumpyFile(url[, ...])

Simple array format

intake.readers.datatypes.ORC(url[, ...])

Columnar-optimized tabular binary file format

intake.readers.datatypes.OpenDAP(url[, ...])

Earth-science oriented searchable HTTP API

intake.readers.datatypes.PNG(url[, ...])

Portable Network Graphics, common image format

intake.readers.datatypes.Parquet(url[, ...])

Column-optimized binary format

intake.readers.datatypes.PickleFile(url[, ...])

Python pickle, arbitrary serialized object

intake.readers.datatypes.Prometheus(url[, ...])

Monitoring metric query service

intake.readers.datatypes.PythonSourceCode(url)

Source code file

intake.readers.datatypes.RawBuffer(url, dtype)

A C or FORTRAN N-dimensional array buffer without metadata

intake.readers.datatypes.SKLearnPickleModel(url)

Trained model made by sklearn and saved as pickle

intake.readers.datatypes.SQLQuery(conn, query)

Query on a database-like service

intake.readers.datatypes.SQLite(url[, ...])

Database data stored in files

intake.readers.datatypes.STACJSON(url[, ...])

Data assets related to geo data, either as static JSON or a searchable API

intake.readers.datatypes.Service(url[, ...])

Datatypes loaded from some service

intake.readers.datatypes.Shapefile(url[, ...])

Geo data (position and geometries) in a set of related binary files

intake.readers.datatypes.TFRecord(url[, ...])

Tensorflow record file, ready for machine learning

intake.readers.datatypes.THREDDSCatalog(url)

Datasets on a THREDDS server

intake.readers.datatypes.TIFF(url[, ...])

Image format commonly used for large data

intake.readers.datatypes.TileDB(url[, ...])

Service exposing versioned, chunked and potentially sparse arrays

intake.readers.datatypes.TiledDataset(url[, ...])

Data access service for data-aware portals and data science tools

intake.readers.datatypes.TiledService(url[, ...])

intake.readers.datatypes.WAV(url[, ...])

Waveform/sound file

intake.readers.datatypes.XML(url[, ...])

Extensible Markup Language file

intake.readers.datatypes.YAMLFile(url[, ...])

Human-readable JSON/object-like format

intake.readers.datatypes.Zarr(url[, ...])

Cloud optimised, chunked N-dimensional file format

Reader Classes

Includes readers, transformers, converters and output classes.

intake.readers.catalogs.EarthdataCatalogReader(*args)

Finds the earthdata datasets that contain some data in the given query bounds

intake.readers.catalogs.EarthdataReader(*args)

Read particular earthdata dataset by ID and parameter bounds

intake.readers.catalogs.HuggingfaceHubCatalog(*args)

Datasets from HuggingfaceHub

intake.readers.catalogs.SKLearnExamplesCatalog(*args)

Example datasets from sklearn.datasets

intake.readers.catalogs.SQLAlchemyCatalog(*args)

Uses SQLAlchemy to get the list of tables at some SQL URL

intake.readers.catalogs.STACIndex(*args[, ...])

Searches stacindex.org for known public STAC data sources

intake.readers.catalogs.StacCatalogReader(*args)

Create a Catalog from a STAC endpoint or file

intake.readers.catalogs.StacSearch([metadata])

Get stac objects matching a search spec from a STAC endpoint

intake.readers.catalogs.StackBands(*args[, ...])

Reimplementation of "StackBandsSource" from intake-stac

intake.readers.catalogs.THREDDSCatalogReader(*args)

Read from THREDDS endpoint

intake.readers.catalogs.TensorFlowDatasetsCatalog(*args)

Datasets from the TensorFlow public registry

intake.readers.catalogs.TiledCatalogReader(*args)

Creates a catalog of Tiled datasets from a root URL

intake.readers.catalogs.TorchDatasetsCatalog(*args)

Standard example PyTorch datasets

intake.readers.convert.ASDFToNumpy(*args[, ...])

intake.readers.convert.BaseConverter(*args)

Converts from one object type to another

intake.readers.convert.DaskArrayToTileDB(*args)

intake.readers.convert.DaskDFToPandas(*args)

intake.readers.convert.DaskToRay(*args[, ...])

intake.readers.convert.DeltaQueryToDask(*args)

intake.readers.convert.DeltaQueryToDaskGeopandas(*args)

intake.readers.convert.DicomToNumpy(*args[, ...])

intake.readers.convert.DuckToPandas(*args[, ...])

intake.readers.convert.FITSToNumpy(*args[, ...])

intake.readers.convert.GenericFunc(*args[, ...])

Call given arbitrary function

intake.readers.convert.HuggingfaceToRay(*args)

intake.readers.convert.NibabelToNumpy(*args)

intake.readers.convert.NumpyToTileDB(*args)

intake.readers.convert.PandasToGeopandas(*args)

intake.readers.convert.PandasToMetagraph(*args)

intake.readers.convert.PandasToPolars(*args)

intake.readers.convert.PandasToRay(*args[, ...])

intake.readers.convert.Pipeline(steps, ...)

Holds a list of transforms/conversions to be enacted in sequence

intake.readers.convert.PolarsEager(*args[, ...])

intake.readers.convert.PolarsLazy(*args[, ...])

intake.readers.convert.PolarsToPandas(*args)

intake.readers.convert.RayToDask(*args[, ...])

intake.readers.convert.RayToPandas(*args[, ...])

intake.readers.convert.RayToSpark(*args[, ...])

intake.readers.convert.SparkDFToRay(*args[, ...])

intake.readers.convert.TileDBToNumpy(*args)

intake.readers.convert.TileDBToPandas(*args)

Implemented only if an attribute was not already chosen.

intake.readers.convert.TiledNodeToCatalog(*args)

intake.readers.convert.TiledSearch(*args[, ...])

See https://blueskyproject.io/tiled/tutorials/search.html

intake.readers.convert.ToHvPlot(*args[, ...])

intake.readers.convert.TorchToRay(*args[, ...])

intake.readers.output.CatalogToJson(*args[, ...])

intake.readers.output.DaskArrayToZarr(*args)

intake.readers.output.GeopandasToFile(*args)

Creates one of several output file types

intake.readers.output.MatplotlibToPNG(*args)

Take a matplotlib figure and save to PNG file

intake.readers.output.NumpyToNumpyFile(*args)

Save a single array into a single binary file

intake.readers.output.PandasToCSV(*args[, ...])

intake.readers.output.PandasToFeather(*args)

intake.readers.output.PandasToHDF5(*args[, ...])

intake.readers.output.PandasToParquet(*args)

intake.readers.output.Repr(*args[, ...])

Good for including a "peek" at data in entries' metadata

intake.readers.output.ToMatplotlib(*args[, ...])

intake.readers.output.XarrayToNetCDF(*args)

intake.readers.output.XarrayToZarr(*args[, ...])

intake.readers.readers.ASDFReader(*args[, ...])

intake.readers.readers.Awkward(*args[, ...])

intake.readers.readers.AwkwardAVRO(*args[, ...])

intake.readers.readers.AwkwardJSON(*args[, ...])

intake.readers.readers.AwkwardParquet(*args)

intake.readers.readers.Condition(*args[, ...])

intake.readers.readers.CupyNumpyReader(*args)

intake.readers.readers.CupyTextReader(*args)

intake.readers.readers.DaskAwkwardJSON(*args)

intake.readers.readers.DaskAwkwardParquet(*args)

intake.readers.readers.DaskCSV(*args[, ...])

intake.readers.readers.DaskDF(*args[, ...])

intake.readers.readers.DaskDeltaLake(*args)

intake.readers.readers.DaskHDF(*args[, ...])

intake.readers.readers.DaskJSON(*args[, ...])

intake.readers.readers.DaskNPYStack(*args[, ...])

Requires a directory with .npy files and an "info" pickle file

intake.readers.readers.DaskParquet(*args[, ...])

intake.readers.readers.DaskSQL(*args[, ...])

intake.readers.readers.DaskZarr(*args[, ...])

intake.readers.readers.DeltaReader(*args[, ...])

intake.readers.readers.DicomReader(*args[, ...])

intake.readers.readers.DuckCSV(*args[, ...])

intake.readers.readers.DuckDB(*args[, ...])

intake.readers.readers.DuckJSON(*args[, ...])

intake.readers.readers.DuckParquet(*args[, ...])

intake.readers.readers.DuckSQL(*args[, ...])

intake.readers.readers.FITSReader(*args[, ...])

intake.readers.readers.FileReader(*args[, ...])

Convenience superclass for readers of files

intake.readers.readers.GeoPandasReader(*args)

intake.readers.readers.GeoPandasTabular(*args)

intake.readers.readers.HandleToUrlReader(*args)

Dereference handle (hdl:) identifiers

intake.readers.readers.HuggingfaceReader(*args)

intake.readers.readers.KerasAudio(*args[, ...])

intake.readers.readers.KerasImageReader(*args)

intake.readers.readers.KerasModelReader(*args)

intake.readers.readers.NibabelNiftiReader(*args)

intake.readers.readers.NumpyReader(*args[, ...])

intake.readers.readers.NumpyText(*args[, ...])

intake.readers.readers.NumpyZarr(*args[, ...])

intake.readers.readers.Pandas(*args[, ...])

intake.readers.readers.PandasCSV(*args[, ...])

intake.readers.readers.PandasExcel(*args[, ...])

intake.readers.readers.PandasFeather(*args)

intake.readers.readers.PandasHDF5(*args[, ...])

intake.readers.readers.PandasORC(*args[, ...])

intake.readers.readers.PandasParquet(*args)

intake.readers.readers.PandasSQLAlchemy(*args)

intake.readers.readers.Polars(*args[, ...])

intake.readers.readers.PolarsAvro(*args[, ...])

intake.readers.readers.PolarsCSV(*args[, ...])

intake.readers.readers.PolarsDeltaLake(*args)

intake.readers.readers.PolarsExcel(*args[, ...])

intake.readers.readers.PolarsFeather(*args)

intake.readers.readers.PolarsIceberg(*args)

intake.readers.readers.PolarsJSON(*args[, ...])

intake.readers.readers.PolarsParquet(*args)

intake.readers.readers.PrometheusMetricReader(*args)

intake.readers.readers.PythonModule(*args[, ...])

intake.readers.readers.RasterIOXarrayReader(*args)

intake.readers.readers.Ray(*args[, ...])

intake.readers.readers.RayBinary(*args[, ...])

intake.readers.readers.RayCSV(*args[, ...])

intake.readers.readers.RayDeltaLake(*args[, ...])

intake.readers.readers.RayJSON(*args[, ...])

intake.readers.readers.RayParquet(*args[, ...])

intake.readers.readers.Retry(*args[, ...])

Retry (part of) a pipeline until it returns without exception

intake.readers.readers.SKImageReader(*args)

intake.readers.readers.SKLearnExampleReader(*args)

intake.readers.readers.SKLearnModelReader(*args)

intake.readers.readers.ScipyMatlabReader(*args)

intake.readers.readers.ScipyMatrixMarketReader(*args)

intake.readers.readers.SparkCSV(*args[, ...])

intake.readers.readers.SparkDataFrame(*args)

intake.readers.readers.SparkDeltaLake(*args)

intake.readers.readers.SparkParquet(*args[, ...])

intake.readers.readers.SparkText(*args[, ...])

intake.readers.readers.TFORC(*args[, ...])

intake.readers.readers.TFPublicDataset(*args)

intake.readers.readers.TFRecordReader(*args)

intake.readers.readers.TFSQL(*args[, ...])

intake.readers.readers.TileDBDaskReader(*args)

intake.readers.readers.TileDBReader(*args[, ...])

intake.readers.readers.TiledClient(*args[, ...])

intake.readers.readers.TiledNode(*args[, ...])

intake.readers.readers.TorchDataset(*args[, ...])

intake.readers.readers.XArrayDatasetReader(*args)

intake.readers.readers.YAMLCatalogReader(*args)

intake.readers.transform.DataFrameColumns(*args)

intake.readers.transform.GetItem(*args[, ...])

Equivalent of x[item]

intake.readers.transform.Method(*args[, ...])

Call named method on object

intake.readers.transform.PysparkColumns(*args)

intake.readers.transform.THREDDSCatToMergedDataset(*args)

intake.readers.transform.XarraySel(*args[, ...])
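
The one-line summaries for Pipeline ("a list of transforms/conversions to be enacted in sequence"), GetItem ("Equivalent of x[item]") and Method ("Call named method on object") can be illustrated with a small plain-Python sketch. It does not use Intake's classes or reproduce their implementation: get_item, method and run_pipeline are hypothetical helper names invented here, and only the standard library is assumed.

```python
from functools import reduce
from operator import itemgetter, methodcaller


def get_item(item):
    """Hypothetical stand-in for a GetItem-style step: x -> x[item]."""
    return itemgetter(item)


def method(name, *args, **kwargs):
    """Hypothetical stand-in for a Method-style step: x -> x.<name>(*args, **kwargs)."""
    return methodcaller(name, *args, **kwargs)


def run_pipeline(start, steps):
    """Apply each step to the output of the previous one, in order."""
    return reduce(lambda value, step: step(value), steps, start)


# Example input: a plain dict standing in for some loaded data object.
record = {"name": "  intake  ", "tags": ["a", "b"]}

# Three steps enacted in sequence: index, then two method calls.
steps = [get_item("name"), method("strip"), method("upper")]

result = run_pipeline(record, steps)
print(result)
```

Each step here is an ordinary callable taking one value, which is what lets the steps be stored in a list and applied in order. How Intake itself represents and serialises such steps is described in the class documentation, not by this sketch.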