Other Classes

Cache Types

intake.source.cache.FileCache(driver, spec)

Cache a specific set of files

intake.source.cache.DirCache(driver, spec[, ...])

Cache a complete directory tree

intake.source.cache.CompressedCache(driver, spec)

Cache files extracted from downloaded compressed source

intake.source.cache.DATCache(driver, spec[, ...])

Use the DAT protocol to replicate data

intake.source.cache.CacheMetadata(*args, ...)

Utility class for managing persistent metadata stored in the Intake config directory.

class intake.source.cache.FileCache(driver, spec, catdir=None, cache_dir=None, storage_options={})

Cache a specific set of files

Input is a single file URL, a URL with glob characters, or a list of URLs. Output is a specific set of local files.

class intake.source.cache.DirCache(driver, spec, catdir=None, cache_dir=None, storage_options={})

Cache a complete directory tree

Input is a directory root URL, plus a depth parameter for how many levels of subdirectories to search. All regular files will be copied. Output is the resultant local directory tree.

class intake.source.cache.CompressedCache(driver, spec, catdir=None, cache_dir=None, storage_options={})

Cache files extracted from downloaded compressed source

For one or more remote compressed files, downloads each to a local temporary directory and extracts all contained files to the local cache. Input is one or more URLs (globs allowed) pointing to remote compressed files, plus an optional decomp parameter, which is “infer” by default (guess from the file extension) or one of the key strings in intake.source.decompress.decomp. The optional regex_filter parameter loads only the extracted files that match the pattern. Output is the list of extracted files.
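The two optional parameters can be illustrated with a minimal sketch. It does not call Intake: the extension table, the helper names `infer_decomp` and `filter_extracted`, and the use of `re.search` for matching are all assumptions made for illustration, and the real key strings are whatever intake.source.decompress.decomp defines.

```python
import re

# Hypothetical extension-to-key table (assumption); the real keys are
# defined in intake.source.decompress.decomp and may differ.
EXTENSION_TO_KEY = {
    ".zip": "zip",
    ".tar.gz": "tgz",
    ".tgz": "tgz",
    ".tar": "tar",
    ".gz": "gz",
}


def infer_decomp(url, decomp="infer"):
    """Return a decompressor key, guessing from the extension when decomp is "infer"."""
    if decomp != "infer":
        return decomp
    # Try longer suffixes first so ".tar.gz" is preferred over ".gz".
    for ext in sorted(EXTENSION_TO_KEY, key=len, reverse=True):
        if url.endswith(ext):
            return EXTENSION_TO_KEY[ext]
    raise ValueError(f"cannot infer decompression for {url!r}")


def filter_extracted(paths, regex_filter=None):
    """Keep only the extracted paths that match regex_filter (all paths if None)."""
    if regex_filter is None:
        return list(paths)
    pattern = re.compile(regex_filter)
    return [p for p in paths if pattern.search(p)]
```

With these helpers, `infer_decomp("https://example.com/data.tar.gz")` selects the hypothetical "tgz" key, and `filter_extracted(paths, r"\.csv$")` keeps only the CSV files among the extracted paths.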

class intake.source.cache.DATCache(driver, spec, catdir=None, cache_dir=None, storage_options={})

Use the DAT protocol to replicate data

For details of the protocol, see https://docs.datproject.org/. The dat executable must be available.

Because the remote files cannot be accessed directly in this case, this cache mechanism takes no parameters. The URL passed by the driver is expected to have the form:

dat://<dat hash>/file_pattern

where the file pattern will typically be a glob string like “*.json”.
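As a rough illustration of that URL form, the sketch below splits such a URL into its hash and file pattern and applies the pattern as a glob. The function name `split_dat_url` and the sample hash are hypothetical; this is not how DATCache is necessarily implemented.

```python
from fnmatch import fnmatch


def split_dat_url(url):
    """Split 'dat://<dat hash>/file_pattern' into (dat_hash, file_pattern)."""
    prefix = "dat://"
    if not url.startswith(prefix):
        raise ValueError(f"not a dat URL: {url!r}")
    # Everything up to the first "/" is the hash; the rest is the pattern.
    dat_hash, _, pattern = url[len(prefix):].partition("/")
    return dat_hash, pattern


# Hypothetical hash and file names, for illustration only.
dat_hash, pattern = split_dat_url("dat://abc123/*.json")
matched = [name for name in ["a.json", "b.csv"] if fnmatch(name, pattern)]
```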

class intake.source.cache.CacheMetadata(*args, **kwargs)

Utility class for managing persistent metadata stored in the Intake config directory.

keys() → a set-like object providing a view on D's keys

pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised.

update([E, ]**F) → None. Update D from mapping/iterable E and F.

If E present and has a .keys() method, does: for k in E: D[k] = E[k]
If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v
In either case, this is followed by: for k, v in F.items(): D[k] = v
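The update and pop rules above are the standard mapping semantics. The sketch below shows them on a plain dict used as a stand-in; it is an assumption that CacheMetadata behaves the same way for these three methods, and the keys and values are made up.

```python
# A plain dict stands in for CacheMetadata here (assumption).
metadata = {"a": 1}

# E is a mapping (has .keys()): its items are copied, then keyword items F.
metadata.update({"b": 2}, c=3)

# E is an iterable of (key, value) pairs; keyword items are applied last,
# so a=10 overwrites the earlier value of "a".
metadata.update([("d", 4)], a=10)

# pop returns the default d when the key is missing instead of raising KeyError.
missing = metadata.pop("not-there", None)
removed = metadata.pop("d")
```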

Auth

intake.auth.secret.SecretAuth(*args, **kwargs)

A very simple auth mechanism using a shared secret

intake.auth.secret.SecretClientAuth(secret)

Matching client auth plugin to SecretAuth

class intake.auth.secret.SecretAuth(*args, **kwargs)

A very simple auth mechanism using a shared secret

Parameters
secret: str

The string that must be matched in the requests. If None, a random UUID is generated and logged.

key: str

Header entry in which to seek the secret

allow_access(header, source, catalog)

Whether the given HTTP header is allowed to access the given data source

Parameters
header: dict

The HTTP header from the incoming request

source: CatalogEntry

The data source the user wants to access.

catalog: Catalog

The catalog object containing this data source.

allow_connect(header)

Whether the given request header is allowed to talk to the server

Parameters
header: dict

The HTTP header from the incoming request

class intake.auth.secret.SecretClientAuth(secret, key='intake-secret')

Matching client auth plugin to SecretAuth

Parameters
secret: str

The string that must be included in requests.

key: str

HTTP Header key for the shared secret

get_headers()

Returns a dictionary of HTTP headers for the remote catalog request.
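The way the client and server plugins fit together can be sketched with two stand-in classes. These are not the Intake classes: the `Sketch...` names are hypothetical, the server-side default for key is assumed to equal the client default shown above, and treating allow_access as the same check as allow_connect is also an assumption.

```python
import uuid


class SketchSecretAuth:
    """Server side: accept a request only if the header carries the shared secret."""

    def __init__(self, secret=None, key="intake-secret"):
        # If no secret is given, generate a random one (the real plugin also logs it).
        self.secret = secret if secret is not None else uuid.uuid4().hex
        self.key = key

    def allow_connect(self, header):
        return header.get(self.key, "") == self.secret

    def allow_access(self, header, source, catalog):
        # Assumption: per-source access uses the same check as connecting.
        return self.allow_connect(header)


class SketchSecretClientAuth:
    """Client side: supply the shared secret under the agreed header key."""

    def __init__(self, secret, key="intake-secret"):
        self.secret = secret
        self.key = key

    def get_headers(self):
        return {self.key: self.secret}


server = SketchSecretAuth(secret="s3cret")
client = SketchSecretClientAuth("s3cret")
accepted = server.allow_connect(client.get_headers())
```

The design is deliberately simple: both sides must agree on the header key and the secret, and a request without the header is rejected.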

Containers

intake.container.dataframe.RemoteDataFrame(...)

Dataframe on an Intake server

intake.container.ndarray.RemoteArray(*args, ...)

nd-array on an Intake server

intake.container.semistructured.RemoteSequenceSource(...)

Sequence-of-things source on an Intake server

class intake.container.dataframe.RemoteDataFrame(*args, **kwargs)

Dataframe on an Intake server

read()

Load entire dataset into a container and return it

to_dask()

Return a dask container for this data source

class intake.container.ndarray.RemoteArray(*args, **kwargs)

nd-array on an Intake server

read()

Load entire dataset into a container and return it

read_partition(i)

Return the part of the data corresponding to the i-th partition.

By default, assumes i should be an integer between zero and npartitions; override for more complex indexing schemes.
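The relationship between read_partition and read can be sketched with a small in-memory stand-in. The class name `SketchPartitionedSource` is hypothetical, and treating the upper bound npartitions as exclusive is an assumption.

```python
class SketchPartitionedSource:
    """In-memory stand-in: each partition is a list of values."""

    def __init__(self, chunks):
        self._chunks = [list(chunk) for chunk in chunks]
        self.npartitions = len(self._chunks)

    def read_partition(self, i):
        # Assumption: valid indices are 0 .. npartitions - 1.
        if not 0 <= i < self.npartitions:
            raise IndexError(f"partition {i} out of range")
        return self._chunks[i]

    def read(self):
        # Loading the entire dataset amounts to concatenating every partition.
        result = []
        for i in range(self.npartitions):
            result.extend(self.read_partition(i))
        return result


source = SketchPartitionedSource([[1, 2], [3], [4, 5]])
second = source.read_partition(1)
everything = source.read()
```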

to_dask()

Return a dask container for this data source

class intake.container.semistructured.RemoteSequenceSource(*args, **kwargs)

Sequence-of-things source on an Intake server

read()

Load entire dataset into a container and return it

to_dask()

Return a dask container for this data source

Server

intake.cli.server.server.IntakeServer(catalog)

Main intake-server tornado application

intake.cli.server.server.ServerInfoHandler(...)

Basic info about the server

intake.cli.server.server.SourceCache()

Stores DataSources requested by some user

intake.cli.server.server.ServerSourceHandler(...)

Open or stream data source

class intake.cli.server.server.IntakeServer(catalog)

Main intake-server tornado application

class intake.cli.server.server.ServerInfoHandler(application: Application, request: HTTPServerRequest, **kwargs: Any)

Basic info about the server

initialize(cache, catalog, auth)

class intake.cli.server.server.SourceCache

Stores DataSources requested by some user

peek(uuid)

Get the source but do not change the last access time
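The difference between a normal lookup and peek can be sketched as follows. This is a stand-in, not the Intake class: the name `SketchSourceCache`, the `add`, `get`, and `last_access` methods, and the injectable `clock` argument are all hypothetical.

```python
import time
import uuid


class SketchSourceCache:
    """Keep sources by ID and remember when each one was last accessed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._sources = {}
        self._last_access = {}

    def add(self, source):
        source_id = str(uuid.uuid4())
        self._sources[source_id] = source
        self._last_access[source_id] = self._clock()
        return source_id

    def get(self, source_id):
        # A normal lookup refreshes the last access time.
        self._last_access[source_id] = self._clock()
        return self._sources[source_id]

    def peek(self, source_id):
        # Peeking returns the source but leaves the last access time untouched.
        return self._sources[source_id]

    def last_access(self, source_id):
        return self._last_access[source_id]
```

Keeping peek separate lets housekeeping code inspect a source without making it look recently used, which matters if entries are expired by last access time.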

class intake.cli.server.server.ServerSourceHandler(application: Application, request: HTTPServerRequest, **kwargs: Any)

Open or stream data source

The request's “action” field (open|read) specifies what the request wants to do. Open caches the source and creates an ID for it; read uses that ID to reference the source and read a partition.
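The open/read flow can be sketched as a two-step exchange. This is only an illustration under stated assumptions: the function `handle_request`, the request fields `name`, `source_id`, and `partition`, and the dict-based catalog are hypothetical and do not reflect the server's actual message format.

```python
import uuid


def handle_request(request, catalog, opened):
    """Dispatch on the request's "action" field (open or read)."""
    action = request["action"]
    if action == "open":
        # Cache the source and hand back a fresh ID for later reads.
        source_id = str(uuid.uuid4())
        opened[source_id] = catalog[request["name"]]
        return {"source_id": source_id}
    if action == "read":
        # Use the ID from the earlier open to find the source, then read one partition.
        partitions = opened[request["source_id"]]
        return {"data": partitions[request["partition"]]}
    raise ValueError(f"unknown action: {action!r}")


catalog = {"numbers": [[1, 2], [3, 4]]}  # source name -> list of partitions
opened = {}                              # source ID -> cached source

reply = handle_request({"action": "open", "name": "numbers"}, catalog, opened)
chunk = handle_request(
    {"action": "read", "source_id": reply["source_id"], "partition": 1},
    catalog,
    opened,
)
```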

get()

Access one source’s info.

This is for direct access to an entry by name for random access, which is useful to the client when the whole catalog has not first been listed and pulled locally (e.g., in the case of pagination).

initialize(catalog, cache, auth)

GUI