grannules package

Submodules

grannules.neural_net module

class grannules.neural_net.NNPredictor(model_name: str | None = None, train_data: DataFrame | None = None, test_data: DataFrame | None = None, features: list[str] = ['M', 'R', 'Teff', 'FeH', 'KepMag', 'phase'], e_features: list[str] | None = None, targets: list[str] = ['H', 'P', 'tau', 'alpha'], e_targets: list[str] | None = None, X_transformer=None, y_transformer=None, state: TrainState = None, trial: Trial = None, params: dict = None, random_state: int = None)

Bases: object

Initializes an NNPredictor.

Important

Do not call this constructor directly. Use NNPredictor.get_default_predictor(), NNPredictor.train_new(), NNPredictor.from_study(), or NNPredictor.deserialize() instead.

Parameters:

train_data (pandas.DataFrame, optional) – The data used to train the model in _train_net. Defaults to None.
test_data (pandas.DataFrame, optional) – The data used to test the model in _train_net. Defaults to None.
features (list[str], optional) – The features to train the model on. Defaults to NNPredictor.DEFAULT_FEATURES.
e_features (list[str], optional) – The uncertainties of the features. Defaults to features with an 'e_' prefix on each element.
targets (list[str], optional) – The targets to train the model on. Defaults to NNPredictor.DEFAULT_TARGETS.
e_targets (list[str], optional) – The uncertainties of the targets. Defaults to targets with an 'e_' prefix on each element.
X_transformer (object, optional) – The transformer to use for the features. Its fit_transform method is called on the training data, and its transform method is called in predictor.predict(). Defaults to neural_net.DefaultXTransformer.
y_transformer (object, optional) – The transformer to use for the targets. Its fit_transform method is called on the training data, transform when training, and inverse_transform when predicting. Defaults to neural_net.DefaultyTransformer.
state (train_state.TrainState, optional) – The state of the neural network. Defaults to None.
random_state (int, optional) – The random state to use for training the net. Defaults to None.
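The transformer parameters above only name the methods that get called (fit_transform, transform, inverse_transform). As a minimal sketch of an object with that shape, the following standardizes each column with NumPy. The class name StandardizingTransformer and its attributes are hypothetical, and it is not the package's DefaultXTransformer; whether NNPredictor requires anything beyond these three methods is not stated here.

```python
import numpy as np


class StandardizingTransformer:
    """Hypothetical transformer: scales each column to zero mean, unit variance."""

    def fit_transform(self, values):
        values = np.asarray(values, dtype=float)
        # Learn per-column statistics from the training data.
        self.mean_ = values.mean(axis=0)
        std = values.std(axis=0)
        # Constant columns would divide by zero, so leave them unscaled.
        self.std_ = np.where(std == 0, 1.0, std)
        return self.transform(values)

    def transform(self, values):
        # Apply the statistics learned in fit_transform.
        return (np.asarray(values, dtype=float) - self.mean_) / self.std_

    def inverse_transform(self, values):
        # Undo transform, mapping scaled values back to original units.
        return np.asarray(values, dtype=float) * self.std_ + self.mean_


train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaler = StandardizingTransformer()
scaled = scaler.fit_transform(train)
restored = scaler.inverse_transform(scaled)
```

The round trip through inverse_transform matters for a target transformer, since predictions have to be mapped back to physical units.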

DEFAULT_FEATURES = ['M', 'R', 'Teff', 'FeH', 'KepMag', 'phase']

DEFAULT_TARGETS = ['H', 'P', 'tau', 'alpha']
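The constructor documents e_features and e_targets as defaulting to the feature and target names with an 'e_' prefix on each element. A small sketch of that naming rule, applied to the two default lists, is shown below. The helper with_uncertainty_prefix is hypothetical and only illustrates the rule; it is not taken from the package source.

```python
DEFAULT_FEATURES = ["M", "R", "Teff", "FeH", "KepMag", "phase"]
DEFAULT_TARGETS = ["H", "P", "tau", "alpha"]


def with_uncertainty_prefix(names, prefix="e_"):
    """Return the uncertainty column name for each given column name."""
    return [prefix + name for name in names]


e_features = with_uncertainty_prefix(DEFAULT_FEATURES)
e_targets = with_uncertainty_prefix(DEFAULT_TARGETS)
```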

classmethod deserialize(path: str | Path = None)

Deserialize a neural network from a directory.

This method reads the format written by NNPredictor.serialize().

Parameters:

path (str or Path) –

Path to the directory containing the serialized model files. The directory should include:

params.json: JSON file with model parameters.
state.pkl: Pickle file with the model’s state dictionary.
transform.npy: Numpy file with transformation parameters for input and output scaling.
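Before deserializing, it can be useful to confirm that a directory actually holds the three files listed above. The following is a minimal sketch using only pathlib; the helper missing_model_files is hypothetical and not part of grannules.

```python
import tempfile
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = ("params.json", "state.pkl", "transform.npy")


def missing_model_files(path):
    """Return the expected file names that are absent from the directory."""
    directory = Path(path)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


# Demonstrate on a throwaway directory: empty first, then fully populated.
with tempfile.TemporaryDirectory() as tmp:
    before = missing_model_files(tmp)
    for name in EXPECTED_FILES:
        (Path(tmp) / name).touch()
    after = missing_model_files(tmp)
```

Because state.pkl is a pickle file, a serialized model should only be loaded from a source you trust; unpickling can execute arbitrary code.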

Returns:

An instance of NNPredictor initialized with the deserialized model, state, and transformers.

Return type:

NNPredictor

NNPredictor.from_study() creates an NNPredictor from the best trial in an Optuna study.

Parameters:

study_or_path (optuna.study.Study | str) – An Optuna study or a path to an Optuna database. If a path is provided, the study is loaded from the database, and study_name must be specified.
data (pandas.DataFrame | None) – The complete dataset to train the model. It will be split using neural_net.split_data(). Required if train_data and test_data are not provided.
train_data (pandas.DataFrame | None) – The training dataset. Required if data is not provided.
test_data (pandas.DataFrame | None) – The testing dataset. Required if data is not provided.
study_name (str | None) – The name of the study to load from the database. Required if study_or_path is a path; ignored if study_or_path is an Optuna study.
random_state (int | None) – The random state to use for splitting the data and training the neural network.
kwargs (dict)

Returns:

An instance of NNPredictor initialized with the best trial from the study.

Return type:

NNPredictor
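The study_or_path and study_name rules above can be sketched as a small resolver. This is an illustration of the documented behaviour, not the package's implementation: resolve_study, the load callback, the ValueError, and the storage string "sqlite:///example.db" are all hypothetical. With Optuna itself, the loader would typically be optuna.load_study(study_name=..., storage=...).

```python
def resolve_study(study_or_path, study_name=None, load=None):
    """Return a study object, loading it from storage when given a path."""
    if not isinstance(study_or_path, str):
        # Already a study object, so study_name is ignored.
        return study_or_path
    if study_name is None:
        raise ValueError("study_name is required when study_or_path is a path")
    # `load` stands in for whatever actually opens the database.
    return load(study_name, study_or_path)


def fake_load(name, storage):
    """Hypothetical loader that just echoes its arguments."""
    return (name, storage)


existing = object()
passed_through = resolve_study(existing, study_name="ignored")
loaded = resolve_study("sqlite:///example.db", "demo", load=fake_load)

# A path without a study name is rejected.
message = ""
try:
    resolve_study("sqlite:///example.db")
except ValueError as error:
    message = str(error)
```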

classmethod get_default_predictor(*args, **kwargs)

Loads a pre-trained NNPredictor singleton.

Returns:

A pre-trained NNPredictor.

Return type:

NNPredictor

predict(X: DataFrame, to_df=False) → ndarray

Predicts the parameters \(H,\, P,\, \tau,\) and \(\alpha\) for red giant stars using a pre-trained neural network.

Parameters:

X (pandas.DataFrame) – A pandas DataFrame with columns ‘M’, ‘R’, ‘Teff’, ‘FeH’, ‘KepMag’, and ‘phase’:
- ‘M’: Mass of the star in solar masses.
- ‘R’: Radius of the star in solar radii.
- ‘Teff’: Effective temperature of the star in Kelvin.
- ‘FeH’: Metallicity of the star.
- ‘KepMag’: Apparent magnitude of the star in the Kepler band.
- ‘phase’: Phase of the star.
to_df (bool) – If True, returns the predictions as a pandas DataFrame. Otherwise, returns a NumPy array.

Returns:

Predicted values for \(H,\, P,\, \tau,\) and \(\alpha\). If to_df is True, the result is a pandas DataFrame with columns [‘H’, ‘P’, ‘tau’, ‘alpha’]. Otherwise, it is a NumPy array.

Return type:

pandas.DataFrame or numpy.ndarray
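Since predict expects a specific set of input columns, a quick check before calling it can give a clearer error than a failure deep inside the model. The sketch below works on any iterable of column names, so with a DataFrame one would pass X.columns. The helper missing_columns is hypothetical; whether predict itself validates its input is not documented here.

```python
# Column names taken from the parameter description above.
REQUIRED_COLUMNS = ["M", "R", "Teff", "FeH", "KepMag", "phase"]


def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names that are not present, in order."""
    present = set(columns)
    return [name for name in required if name not in present]


# Extra columns are harmless; absent ones are reported.
complete = missing_columns(["M", "R", "Teff", "FeH", "KepMag", "phase", "extra"])
incomplete = missing_columns(["M", "R", "Teff"])
```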

serialize(path: str | Path = None, overwrite: bool = False)

Serialize the neural network model, its parameters, and data transformations to the specified directory.

This method saves the model’s parameters, state, and data transformation details into a directory for later use. If the directory already exists, it is overwritten only when overwrite is True.

Parameters:

path (str | pathlib.Path, optional) – The directory path where the model will be serialized. Defaults to a directory named “grannules-predictor” in the current working directory.
overwrite (bool) – Whether to overwrite the directory if it already exists. Defaults to False.

Raises:

RuntimeError – If attempting to overwrite the current working directory, root, or home.
FileExistsError – If the directory already exists and overwrite is set to False.
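The two error conditions above can be sketched as a guard that runs before anything is written. This is one plausible reading of the documented rules, not the package's code: check_serialize_target is hypothetical, and the order in which the two checks are applied is an assumption.

```python
import tempfile
from pathlib import Path


def check_serialize_target(path, overwrite=False):
    """Raise if writing to `path` would break the documented rules."""
    target = Path(path).resolve()
    # Directories that must never be overwritten: cwd, home, and root.
    protected = {Path.cwd().resolve(), Path.home().resolve(), Path(target.anchor)}
    if overwrite and target in protected:
        raise RuntimeError(f"refusing to overwrite {target}")
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} exists; pass overwrite=True to replace it")
    return target


existing_rejected = False
cwd_rejected = False

with tempfile.TemporaryDirectory() as tmp:
    # A path that does not exist yet is accepted.
    fresh = check_serialize_target(Path(tmp) / "grannules-predictor")
    # An existing directory is rejected unless overwrite=True.
    try:
        check_serialize_target(tmp)
    except FileExistsError:
        existing_rejected = True

# Overwriting the current working directory is always refused.
try:
    check_serialize_target(Path.cwd(), overwrite=True)
except RuntimeError:
    cwd_rejected = True
```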

classmethod train_new(study_or_path: Study | str, data: DataFrame | None = None, train_data: DataFrame | None = None, test_data: DataFrame | None = None, study_name: str | None = None, random_state: int | None = None, load_study_if_exists: bool = True, pruner: BasePruner = <optuna.pruners._nop.NopPruner object>, study_kwargs: dict = {}, n_trials: int = 100, optuna_kwargs: dict = {}, **kwargs) → tuple[NNPredictor, Study]

Train a new neural network model using Optuna for hyperparameter optimization. The method either creates a new Optuna study or reuses an existing one. It can split a complete dataset into training and testing sets, and it optimizes the model’s hyperparameters through the study.

Parameters:

study_or_path (optuna.study.Study | str) – Either an Optuna study object or a string path to the study’s storage.
data (pandas.DataFrame | None) – The complete dataset to be split into training and testing sets. If provided, train_data and test_data will be ignored.
train_data (pandas.DataFrame | None) – Pre-split training data. Used if data is not provided.
test_data (pandas.DataFrame | None) – Pre-split testing data. Used if data is not provided.
study_name (str | None) – Name of the Optuna study. Required if creating a new study.
random_state (int | None) – Random seed for reproducibility in data splitting and training.
load_study_if_exists (bool) – Whether to load an existing study if it already exists.
pruner (optuna.pruners.BasePruner) – Optuna pruner to use for early stopping during optimization.
study_kwargs (dict) – Additional keyword arguments for creating the Optuna study.
n_trials (int) – Number of trials to run for hyperparameter optimization.
optuna_kwargs (dict) – Additional keyword arguments for the study.optimize method.
kwargs (dict) – Additional keyword arguments for the neural network predictor initialization.

Returns:

A tuple containing the trained neural network predictor and the Optuna study.

Return type:

tuple[NNPredictor, optuna.study.Study]
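The precedence between data, train_data, and test_data described above can be sketched as follows. This only illustrates the documented rule that a complete dataset takes priority over pre-split inputs. The names choose_datasets and halve are hypothetical, halve stands in for the package's own splitting routine, and raising ValueError for missing inputs is an assumption.

```python
def choose_datasets(data=None, train_data=None, test_data=None, split=None):
    """Return (train, test), preferring a split of `data` when it is given."""
    if data is not None:
        # Pre-split inputs are ignored whenever the full dataset is supplied.
        return split(data)
    if train_data is None or test_data is None:
        raise ValueError("provide data, or both train_data and test_data")
    return train_data, test_data


def halve(rows):
    """Hypothetical splitter: first half for training, second for testing."""
    middle = len(rows) // 2
    return rows[:middle], rows[middle:]


from_full = choose_datasets(data=[1, 2, 3, 4], train_data=[9], test_data=[9], split=halve)
pre_split = choose_datasets(train_data=[1, 2], test_data=[3])
```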

grannules.neural_net.predict(X: DataFrame, to_df: bool = False, *args, **kwargs) → ndarray | DataFrame

Predicts the parameters \(H,\, P,\, \tau,\) and \(\alpha\) for red giant stars using a pre-trained neural network.

Parameters:

X (pandas.DataFrame) – A pandas DataFrame with columns ‘M’, ‘R’, ‘Teff’, ‘FeH’, ‘KepMag’, and ‘phase’:
- ‘M’: Mass of the star in solar masses.
- ‘R’: Radius of the star in solar radii.
- ‘Teff’: Effective temperature of the star in Kelvin.
- ‘FeH’: Metallicity of the star.
- ‘KepMag’: Apparent magnitude of the star in the Kepler band.
- ‘phase’: Phase of the star.
to_df (bool) – If True, returns the predictions as a pandas DataFrame. Otherwise, returns a NumPy array.

Returns:

Predicted values for \(H,\, P,\, \tau,\) and \(\alpha\). If to_df is True, the result is a pandas DataFrame with columns [‘H’, ‘P’, ‘tau’, ‘alpha’]. Otherwise, it is a NumPy array.

Return type:

numpy.ndarray | pandas.DataFrame
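A usage sketch for the module-level function might look like the following. It assumes grannules and pandas are installed, so the imports sit inside the function and nothing is loaded until it is called. The stellar values are made up for illustration, and the phase value is a placeholder because the phase encoding is not specified here. The name predict_example is hypothetical.

```python
# Made-up values for a single star; every number here is illustrative only.
example_star = {
    "M": [1.2],        # solar masses
    "R": [10.0],       # solar radii
    "Teff": [4800.0],  # Kelvin
    "FeH": [-0.1],     # metallicity
    "KepMag": [12.0],  # Kepler-band apparent magnitude
    "phase": [1],      # placeholder; the phase encoding is not specified here
}


def predict_example(columns):
    """Build the input frame and request labelled predictions."""
    # Imported lazily so this sketch can be read without the packages installed.
    import pandas as pd
    from grannules.neural_net import predict

    X = pd.DataFrame(columns)
    # to_df=True asks for a DataFrame with columns H, P, tau, and alpha.
    return predict(X, to_df=True)
```

Calling predict_example(example_star) would then return one row of predictions, given the documented behaviour of to_df.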
