avocado.Dataset

class avocado.Dataset(name, metadata, observations=None, objects=None, chunk=None, num_chunks=None, object_class=<class 'avocado.astronomical_object.AstronomicalObject'>)

A dataset of many astronomical objects.

Parameters:

name (str) – Name of the dataset. This will be used to determine the filenames of various outputs such as computed features and predictions.
metadata (pandas.DataFrame) – DataFrame where each row is the metadata for an object in the dataset. See AstronomicalObject for details.
observations (pandas.DataFrame) – Observations of all of the objects’ light curves. See AstronomicalObject for details.
objects (list) – A list of AstronomicalObject instances. Either this or observations can be specified, but not both.
chunk (int (optional)) – If the dataset was loaded in chunks, this indicates the chunk number.
num_chunks (int (optional)) – If the dataset was loaded in chunks, this is the total number of chunks used.
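
The `chunk` and `num_chunks` parameters describe a dataset that was loaded in pieces. As a rough illustration of that idea, the sketch below splits a list of object IDs into `num_chunks` contiguous, near-equal pieces. It is not avocado's actual partitioning scheme, which this page does not describe; `split_into_chunks` and the `obj_…` identifiers are hypothetical names used only for this example.

```python
def split_into_chunks(object_ids, num_chunks):
    """Split object_ids into num_chunks contiguous, near-equal pieces.

    Hypothetical helper for illustration; not part of avocado.
    """
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")

    # Every chunk gets `base` items; the first `extra` chunks get one more.
    base, extra = divmod(len(object_ids), num_chunks)

    chunks = []
    start = 0
    for chunk in range(num_chunks):
        size = base + (1 if chunk < extra else 0)
        chunks.append(object_ids[start:start + size])
        start += size
    return chunks


# Made-up object IDs, split into three chunks.
object_ids = [f"obj_{i}" for i in range(10)]
chunks = split_into_chunks(object_ids, 3)
```

In this picture, `chunk` would be the index of one piece and `num_chunks` the total number of pieces, so each chunk can be processed independently.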

__init__(name, metadata, observations=None, objects=None, chunk=None, num_chunks=None, object_class=<class 'avocado.astronomical_object.AstronomicalObject'>): Create a new Dataset from a set of metadata and observations.
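
To make the relationship between a flat table of observations and per-object light curves concrete, here is a minimal sketch that groups observation rows by object. It uses plain dictionaries instead of a pandas.DataFrame so that it runs without dependencies. The column names (`object_id`, `time`, `flux`, `flux_error`, `band`) and the helper `group_by_object` are assumptions made for illustration; see AstronomicalObject for the fields the library actually expects.

```python
from collections import defaultdict

# Hypothetical flat observation table: one row per measurement.
# The column names are assumptions for this example only.
observations = [
    {"object_id": "a", "time": 0.0, "flux": 10.2, "flux_error": 0.5, "band": "g"},
    {"object_id": "b", "time": 0.0, "flux": 3.1, "flux_error": 0.4, "band": "r"},
    {"object_id": "a", "time": 1.0, "flux": 11.0, "flux_error": 0.5, "band": "r"},
]


def group_by_object(rows):
    """Group observation rows into one list (light curve) per object_id."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["object_id"]].append(row)
    return dict(grouped)


light_curves = group_by_object(observations)

# All bands that appear anywhere in the table, in sorted order.
bands = sorted({row["band"] for row in observations})
```

With pandas, the same grouping would typically be written with `DataFrame.groupby("object_id")`.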

Methods

`__init__`(name, metadata[, observations, …])	Create a new Dataset from a set of metadata and observations.
`extract_raw_features`(featurizer[, keep_models])	Extract raw features from the dataset.
`from_objects`(name, objects, **kwargs)	Load a dataset from a list of AstronomicalObject instances.
`get_bands`()	Return a list of all of the bands in the dataset.
`get_models_path`([tag])	Return the path to where the models for this dataset should lie on disk.
`get_object`([index, object_class, object_id])	Parse keywords to pull a specific object out of the dataset.
`get_predictions_path`([classifier])	Return the path to where the predictions for this dataset for a given classifier should lie on disk.
`get_raw_features_path`([tag])	Return the path to where the raw features for this dataset should lie on disk.
`label_folds`([num_folds, random_state])	Separate the dataset into groups for k-folding.
`load`(name[, metadata_only, chunk, …])	Load a dataset that has been saved in HDF5 format in the data directory.
`load_predictions`([classifier])	Load the predictions for a classifier from disk.
`load_raw_features`([tag])	Load the raw features from disk.
`plot_interactive`()	Make an interactive plot of the light curves in the dataset.
`plot_light_curve`(*args, **kwargs)	Plot the light curve for an object in the dataset.
`predict`(classifier)	Generate predictions using a classifier.
`read_object`(object_id[, object_class])	Read an object with a given object_id.
`select_features`(featurizer)	Select features from the dataset for classification.
`write`([overwrite])	Write the dataset out to disk.
`write_models`([tag])	Write the models of the light curves to disk.
`write_predictions`([classifier])	Write predictions for this classifier to disk.
`write_raw_features`([tag])	Write the raw features out to disk.
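
As an illustration of what separating a dataset into groups for k-folding can look like, the sketch below assigns each object a fold index after a seeded shuffle. This is a generic approach under stated assumptions, not the implementation behind `label_folds`, which may differ (for example, by stratifying on object class). `assign_folds` and the `obj_…` identifiers are hypothetical.

```python
import random


def assign_folds(object_ids, num_folds=5, random_state=None):
    """Map each object ID to a fold index in range(num_folds).

    Hypothetical helper for illustration; not part of avocado.
    """
    if num_folds < 1:
        raise ValueError("num_folds must be at least 1")

    # A seeded generator makes the assignment reproducible.
    rng = random.Random(random_state)
    shuffled = list(object_ids)
    rng.shuffle(shuffled)

    # Deal the shuffled IDs out round-robin so fold sizes differ by at most one.
    return {object_id: i % num_folds for i, object_id in enumerate(shuffled)}


# Made-up object IDs, split into five folds with a fixed seed.
object_ids = [f"obj_{i}" for i in range(10)]
folds = assign_folds(object_ids, num_folds=5, random_state=1)
```

Each fold can then serve once as the validation set while the remaining folds are used for training.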

Attributes

path	Return the path to where this dataset should lie on disk.