grannules package

Subpackages

Submodules

grannules.neural_net module

class grannules.neural_net.NNPredictor(model_name: str | None = None, train_data: DataFrame | None = None, test_data: DataFrame | None = None, features: list[str] = ['M', 'R', 'Teff', 'FeH', 'KepMag', 'phase'], e_features: list[str] | None = None, targets: list[str] = ['H', 'P', 'tau', 'alpha'], e_targets: list[str] | None = None, X_transformer=None, y_transformer=None, state: TrainState = None, trial: Trial = None, params: dict = None, random_state: int = None)

Bases: object

Initializes an NNPredictor.

Parameters:
  • train_data (pandas.DataFrame, optional) – The training data to train the model on in _train_net. Defaults to None.

  • test_data (pandas.DataFrame, optional) – The testing data to evaluate the model on in _train_net. Defaults to None.

  • features (list[str], optional) – The features to train the model on. Defaults to NNPredictor.DEFAULT_FEATURES.

  • e_features (list[str], optional) – The uncertainties of the features. Defaults to features with an 'e_' prefix on each element.

  • targets (list[str], optional) – The targets to train the model on. Defaults to NNPredictor.DEFAULT_TARGETS.

  • e_targets (list[str], optional) – The uncertainties of the targets. Defaults to targets with an 'e_' prefix on each element.

  • X_transformer (object, optional) – The transformer to use for the features. Calls fit_transform on the training data, and transform when using predictor.predict(). Defaults to neural_net.DefaultXTransformer.

  • y_transformer (object, optional) – The transformer to use for the targets. Calls fit_transform on the training data, transform when training, and inverse_transform when predicting. Defaults to neural_net.DefaultyTransformer.

  • state (train_state.TrainState, optional) – The state of the neural network. Defaults to None.

  • random_state (int, optional) – The random state to use for training the net. Defaults to None.

DEFAULT_FEATURES = ['M', 'R', 'Teff', 'FeH', 'KepMag', 'phase']
DEFAULT_TARGETS = ['H', 'P', 'tau', 'alpha']
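
The e_features and e_targets defaults described above follow a simple naming rule. A minimal sketch of that rule, assuming plain string prefixing; the helper name with_error_prefix is hypothetical and not part of grannules:

```python
# Column names copied from NNPredictor.DEFAULT_FEATURES and DEFAULT_TARGETS.
DEFAULT_FEATURES = ["M", "R", "Teff", "FeH", "KepMag", "phase"]
DEFAULT_TARGETS = ["H", "P", "tau", "alpha"]


def with_error_prefix(names):
    """Return the uncertainty column name for each column (hypothetical helper)."""
    return [f"e_{name}" for name in names]


e_features = with_error_prefix(DEFAULT_FEATURES)
e_targets = with_error_prefix(DEFAULT_TARGETS)
print(e_features)
print(e_targets)
```

Under this reading, training data that relies on the defaults would carry columns such as e_M and e_H alongside M and H.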
classmethod deserialize(path: str | Path = None)

Deserialize a neural network from a directory.

This method reads the format written by NNPredictor.serialize().

Parameters:

path (str or Path) –

Path to the directory containing the serialized model files. The directory should include:

  • params.json: JSON file with model parameters.

  • state.pkl: Pickle file with the model’s state dictionary.

  • transform.npy: Numpy file with transformation parameters for input and output scaling.

Returns:

An instance of NNPredictor initialized with the deserialized model, state, and transformers.

Return type:

NNPredictor
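
Because deserialize() expects three specific files, it can help to check a directory before loading it. The following is a minimal sketch of such a check, using only the file names listed above; missing_model_files is a hypothetical helper, not a grannules function:

```python
from pathlib import Path

# File names listed in the deserialize() documentation.
EXPECTED_FILES = ("params.json", "state.pkl", "transform.npy")


def missing_model_files(path):
    """Return the expected files that are absent from `path` (hypothetical helper)."""
    directory = Path(path)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]
```

An empty result only indicates that the directory is structurally complete; it says nothing about whether the file contents are valid.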

classmethod from_study(study_or_path: Study | str, data: DataFrame | None = None, train_data: DataFrame | None = None, test_data: DataFrame | None = None, study_name: str | None = None, random_state: int | None = None, **kwargs) → NNPredictor

Creates an NNPredictor from the best trial in an Optuna study.

Parameters:
  • study_or_path (optuna.study.Study | str) – An Optuna study or a path to an Optuna database. If a path is provided, the study is loaded from the database, and study_name must be specified.

  • data (pandas.DataFrame, optional) – The complete dataset to train the model. It will be split using neural_net.split_data(). Required if train_data and test_data are not provided.

  • train_data (pandas.DataFrame, optional) – The training dataset. Required if data is not provided.

  • test_data (pandas.DataFrame, optional) – The testing dataset. Required if data is not provided.

  • study_name (str, optional) – The name of the study to load from the database. Required if study_or_path is a path. Ignored if study_or_path is an Optuna study.

  • random_state (int, optional) – The random state to use for splitting the data and training the neural network.

  • kwargs (dict)

Returns:

An instance of NNPredictor initialized with the best trial from the study.

Return type:

NNPredictor
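
The argument requirements above can be summarized as a small validation routine. This is only a sketch of the documented rules; check_from_study_args is a hypothetical helper, and the real method may validate its arguments differently or raise other exception types:

```python
def check_from_study_args(study_or_path, data=None, train_data=None,
                          test_data=None, study_name=None):
    """Raise ValueError if the documented argument requirements are not met.

    Hypothetical helper, not part of grannules.
    """
    # A path (str) identifies a database, so the study inside it must be named.
    if isinstance(study_or_path, str) and study_name is None:
        raise ValueError("study_name is required when study_or_path is a path")
    # Either the full dataset or both pre-split halves must be supplied.
    if data is None and (train_data is None or test_data is None):
        raise ValueError("pass either data, or both train_data and test_data")
```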

classmethod get_default_predictor(*args, **kwargs)

Loads a pre-trained NNPredictor singleton.

Returns:

A pre-trained NNPredictor

Return type:

NNPredictor

predict(X: DataFrame, to_df=False) → ndarray

Predicts the parameters \(H,\, P,\, \tau,\) and \(\alpha\) for red giant stars using a pre-trained neural network.

Parameters:
  • X (pandas.DataFrame) –

    A pandas DataFrame with columns ‘M’, ‘R’, ‘Teff’, ‘FeH’, ‘KepMag’, and ‘phase’.

    • ‘M’: Mass of the star in solar masses.

    • ‘R’: Radius of the star in solar radii.

    • ‘Teff’: Effective temperature of the star in Kelvin.

    • ‘FeH’: Metallicity of the star.

    • ‘KepMag’: Apparent magnitude of the star in the Kepler band.

    • ‘phase’: Phase of the star.

  • to_df (bool) – If True, returns the predictions as a pandas DataFrame. Otherwise, returns a NumPy array.

Returns:

Predicted values for \(H,\, P,\, \tau,\,\) and \(\alpha\). If to_df is True, the result is a pandas DataFrame with columns [‘H’, ‘P’, ‘tau’, ‘alpha’]. Otherwise, it is a NumPy array.

Return type:

pandas.DataFrame or numpy.ndarray
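
A hedged usage sketch for predict(): the helper make_feature_rows and the wrapper predict_stars are hypothetical names introduced here for illustration. The wrapper only uses the calls documented on this page (get_default_predictor() and predict()), and imports pandas and grannules lazily so that the column check can be used on its own.

```python
# Column order taken from the predict() documentation.
FEATURE_COLUMNS = ["M", "R", "Teff", "FeH", "KepMag", "phase"]


def make_feature_rows(stars):
    """Order each star's values by FEATURE_COLUMNS, failing loudly on gaps."""
    rows = []
    for index, star in enumerate(stars):
        missing = [col for col in FEATURE_COLUMNS if col not in star]
        if missing:
            raise KeyError(f"star {index} is missing columns: {missing}")
        rows.append([star[col] for col in FEATURE_COLUMNS])
    return rows


def predict_stars(stars):
    """Predict H, P, tau and alpha for a list of star dicts (hypothetical wrapper)."""
    # Imported here so make_feature_rows works without these packages installed.
    import pandas as pd
    from grannules.neural_net import NNPredictor

    X = pd.DataFrame(make_feature_rows(stars), columns=FEATURE_COLUMNS)
    predictor = NNPredictor.get_default_predictor()
    return predictor.predict(X, to_df=True)
```

Each element of stars is assumed to be a dict keyed by the six feature names; how ‘phase’ is encoded is not specified here, so the sketch passes it through unchanged.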

serialize(path: str | Path = None, overwrite: bool = False)

Serialize the neural network model, its parameters, and data transformations to the specified directory.

This method saves the model’s parameters, state, and data transformation details into a directory for later use. If the directory already exists, it is overwritten only when overwrite is True.

Parameters:
  • path (str | pathlib.Path, optional) – The directory path where the model will be serialized. Defaults to the current working directory with the name “grannules-predictor”.

  • overwrite (bool) – Whether to overwrite the directory if it already exists. Defaults to False.

Raises:
  • RuntimeError – If attempting to overwrite the current working directory, root, or home.

  • FileExistsError – If the directory already exists and overwrite is set to False.
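
The two documented error conditions can be illustrated with a small pre-flight check. This is a sketch of the documented behaviour only, assuming that paths are compared after resolution; check_serialize_target is a hypothetical helper, and the actual checks inside serialize() may differ in detail:

```python
from pathlib import Path


def check_serialize_target(path="grannules-predictor", overwrite=False):
    """Return the resolved target, or raise as serialize() is documented to.

    Hypothetical helper, not part of grannules.
    """
    # A relative default resolves against the current working directory.
    target = Path(path).resolve()
    protected = {Path.cwd().resolve(), Path(target.anchor)}
    try:
        protected.add(Path.home().resolve())
    except RuntimeError:
        pass  # no home directory could be determined on this system
    if target in protected:
        raise RuntimeError(f"refusing to serialize into {target}")
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} already exists; pass overwrite=True")
    return target
```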

classmethod train_new(study_or_path: Study | str, data: DataFrame | None = None, train_data: DataFrame | None = None, test_data: DataFrame | None = None, study_name: str | None = None, random_state: int | None = None, load_study_if_exists: bool = True, pruner: BasePruner = <optuna.pruners._nop.NopPruner object>, study_kwargs: dict = {}, n_trials: int = 100, optuna_kwargs: dict = {}, **kwargs) → tuple[NNPredictor, Study]

Train a new neural network model using Optuna for hyperparameter optimization. This method allows training a neural network model by either creating a new Optuna study or using an existing one. It supports splitting data into training and testing sets, and optimizing the model’s hyperparameters through Optuna’s study framework.

Parameters:
  • study_or_path (optuna.study.Study | str) – Either an Optuna study object or a string path to the study’s storage.

  • data (pandas.DataFrame | None) – The complete dataset to be split into training and testing sets. If provided, train_data and test_data will be ignored.

  • train_data (pandas.DataFrame | None) – Pre-split training data. Used if data is not provided.

  • test_data (pandas.DataFrame | None) – Pre-split testing data. Used if data is not provided.

  • study_name (str | None) – Name of the Optuna study. Required if creating a new study.

  • random_state (int | None) – Random seed for reproducibility in data splitting and training.

  • load_study_if_exists (bool) – Whether to load an existing study if it already exists.

  • pruner (optuna.pruners.BasePruner) – Optuna pruner to use for early stopping during optimization.

  • study_kwargs (dict) – Additional keyword arguments for creating the Optuna study.

  • n_trials (int) – Number of trials to run for hyperparameter optimization.

  • optuna_kwargs (dict) – Additional keyword arguments for the study.optimize method.

  • kwargs (dict) – Additional keyword arguments for the neural network predictor initialization.

Returns:

A tuple containing the trained neural network predictor and the Optuna study.

Return type:

tuple[NNPredictor, optuna.study.Study]
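
The precedence between data, train_data, and test_data described above can be sketched as follows. The helper resolve_datasets is hypothetical, and its split argument stands in for whatever splitting routine the library uses, whose signature is not shown on this page:

```python
def resolve_datasets(data=None, train_data=None, test_data=None, split=None):
    """Pick the train/test pair following the documented precedence.

    Hypothetical helper; `split` is a stand-in callable that takes the
    full dataset and returns a (train, test) pair.
    """
    if data is not None:
        # Documented behaviour: data wins and the pre-split arguments are ignored.
        return split(data)
    if train_data is None or test_data is None:
        raise ValueError("provide data, or both train_data and test_data")
    return train_data, test_data
```

For instance, with a toy splitter such as `lambda rows: (rows[:-1], rows[-1:])`, passing data together with train_data returns the split of data and discards train_data.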

grannules.neural_net.predict(X: DataFrame, to_df: bool = False, *args, **kwargs) → ndarray | DataFrame

Predicts the parameters \(H,\, P,\, \tau,\) and \(\alpha\) for red giant stars using a pre-trained neural network.

Parameters:
  • X (pandas.DataFrame) –

    A pandas DataFrame with columns ‘M’, ‘R’, ‘Teff’, ‘FeH’, ‘KepMag’, and ‘phase’.

    • ‘M’: Mass of the star in solar masses.

    • ‘R’: Radius of the star in solar radii.

    • ‘Teff’: Effective temperature of the star in Kelvin.

    • ‘FeH’: Metallicity of the star.

    • ‘KepMag’: Apparent magnitude of the star in the Kepler band.

    • ‘phase’: Phase of the star.

  • to_df (bool) – If True, returns the predictions as a pandas DataFrame. Otherwise, returns a NumPy array.

Returns:

Predicted values for \(H,\, P,\, \tau,\,\) and \(\alpha\). If to_df is True, the result is a pandas DataFrame with columns [‘H’, ‘P’, ‘tau’, ‘alpha’]. Otherwise, it is a NumPy array.

Return type:

numpy.ndarray | pandas.DataFrame
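
When to_df is False, the result carries no column labels. The sketch below shows one way to attach them, assuming the array columns follow the same order as the DataFrame columns listed above (‘H’, ‘P’, ‘tau’, ‘alpha’); that ordering is an assumption, and label_predictions is a hypothetical helper:

```python
# Target order assumed to match the documented DataFrame columns.
TARGETS = ["H", "P", "tau", "alpha"]


def label_predictions(rows, targets=TARGETS):
    """Turn array-style rows into one dict per star (hypothetical helper)."""
    labelled = []
    for row in rows:
        if len(row) != len(targets):
            raise ValueError(f"expected {len(targets)} values, got {len(row)}")
        labelled.append(dict(zip(targets, row)))
    return labelled
```

Passing to_df=True avoids this step entirely, since the returned DataFrame is already labelled.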

Module contents