C-elegans Lifespan Prediction - Final Notebook

This notebook, run.ipynb, is the final implementation for predicting worm lifespan based on early behavioral data. It integrates data preprocessing, feature engineering, machine learning models, and evaluation techniques to produce accurate lifespan predictions.

Project Overview
Prerequisites
Overview of Key Functions
Notebook Workflow
Results and Outputs

Project Overview

The project predicts the lifespan of worms using behavioral time-course data from laboratory experiments. The behavioral features include center-of-mass coordinates, speed, and other derived metrics, which are used to train machine learning models that predict lifespan.

add part for Optogenetics (check what we have done so far)

The notebook runs the whole pipeline from start to finish and calls modular functions defined in the accompanying .py files.

Prerequisites

Python 3.8+
Required Libraries:
- numpy
- pandas
- matplotlib
- scikit-survival
- scikit-learn

How to run

To install dependencies, run: pip install numpy pandas matplotlib scikit-survival scikit-learn
Clone the repository: git clone https://github.com/Tournedos/ML-Project-2.git
Navigate to the project folder: cd ML-Project-2
Open and run the notebook: jupyter notebook run.ipynb

Overview of Key Functions

• run.ipynb: The main notebook that integrates all steps, from data loading and preprocessing to analysis and visualization.
• helpers.py: Contains utility functions used throughout the project.
• models.py: Includes the machine learning models used for predictions.
• nan_imputation.py: Provides functions for handling missing values in the data.
• Preprocessing.py: Handles general preprocessing tasks to clean the data.
• preprocessing_features.py: Focuses on feature-specific preprocessing, such as scaling and extraction. Calculates new features from the basic features that come with the data.
• load_data.py: Includes functions such as load_lifespan and load_earlylifespan for loading datasets.
• try.ipynb: Not used in the final notebook, but contains the (raw) earlier analysis that led to the final results.

Notebook Workflow

The notebook run.ipynb is structured to guide you through the complete process of worm lifespan prediction.

Part 1: Lifespan prediction based on early behavior

Setup:

Imports libraries (numpy, pandas, ...) and custom modules (helpers.py, models.py, ...).
Sets up the root directory and data paths for data loading.

Data loading:

Loads the lifespan data and checks that it loaded properly (only .csv files are read).
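As a rough illustration of the loading step, the sketch below reads every .csv file in a folder and skips everything else. The helper name load_csv_folder and the column names Frame and Speed are assumptions made for this example; the project's own load_lifespan may work differently.

```python
from pathlib import Path
import tempfile

import pandas as pd


def load_csv_folder(folder):
    """Read every .csv file in `folder` into a dict keyed by file stem.

    Hypothetical helper for illustration; not the project's load_lifespan.
    """
    worms = {}
    for path in sorted(Path(folder).iterdir()):
        # Skip anything that is not a CSV file (e.g. .DS_Store, notebooks).
        if not path.is_file() or path.suffix.lower() != ".csv":
            continue
        worms[path.stem] = pd.read_csv(path)
    return worms


# Tiny demonstration on a temporary folder with one CSV and one stray file.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"Frame": [1, 2], "Speed": [0.1, 0.2]}).to_csv(
        Path(tmp) / "worm_1.csv", index=False
    )
    (Path(tmp) / ".DS_Store").write_text("not a data file")
    loaded = load_csv_folder(tmp)

print(list(loaded))
```

Filtering on the file suffix keeps system files such as .DS_Store out of the dataset without needing a list of known bad names.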

Data Preprocessing:

Cleans the data by imputing NaNs.
Removes frames where the worm is detected to be dead.
Standardizes features to prepare them for modeling.
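The three preprocessing steps above could look roughly like the sketch below. Everything in it is an assumption for illustration: linear interpolation as the imputation method, "no movement until the end of the recording" as the death criterion, and the column names X and Speed. The project's nan_imputation.py and isdead.py may use different rules.

```python
import numpy as np
import pandas as pd


def impute_nans(df):
    # Assumed method: linear interpolation, with edge gaps filled from the
    # nearest valid value.
    return df.interpolate(method="linear", limit_direction="both")


def drop_dead_frames(df, speed_col="Speed", eps=1e-6):
    # Assumed criterion: every frame after the last detected movement
    # is treated as "dead" and removed.
    moving = (df[speed_col].abs() > eps).to_numpy()
    if not moving.any():
        return df.iloc[0:0]
    last_alive = np.flatnonzero(moving)[-1]
    return df.iloc[: last_alive + 1]


def standardize(df):
    # Zero mean, unit variance per column; constant columns are left at 0.
    std = df.std(ddof=0).replace(0, 1.0)
    return (df - df.mean()) / std


raw = pd.DataFrame(
    {
        "X": [0.0, np.nan, 2.0, 3.0, 3.0, 3.0],
        "Speed": [1.0, 1.0, np.nan, 1.0, 0.0, 0.0],
    }
)
imputed = impute_nans(raw)
alive = drop_dead_frames(imputed)
scaled = standardize(alive)
print(scaled)
```

Imputing before removing dead frames means a missing speed value is not mistaken for a lack of movement.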

Feature Engineering:

Extracts early behavior metrics from the raw data.
Constructs datasets for regression and classification tasks.
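To make "early behavior metrics" concrete, here is a minimal sketch that derives a few summary features from the center-of-mass coordinates of the first frames of a track. The function name early_features, the column names X and Y, and the chosen metrics are illustrative assumptions, not the features computed in preprocessing_features.py.

```python
import numpy as np
import pandas as pd


def early_features(track, n_early_frames):
    """Summarize the first `n_early_frames` frames of one worm's track."""
    early = track.iloc[:n_early_frames]
    # Per-frame displacement of the center of mass (distance units per frame).
    step = np.sqrt(early["X"].diff() ** 2 + early["Y"].diff() ** 2).dropna()
    return {
        "mean_speed": step.mean(),
        "max_speed": step.max(),
        "path_length": step.sum(),
        "frac_still": (step < 1e-6).mean(),
    }


# Toy track: only the first four frames count as "early".
track = pd.DataFrame(
    {"X": [0.0, 3.0, 3.0, 6.0, 50.0], "Y": [0.0, 4.0, 4.0, 8.0, 50.0]}
)
features = early_features(track, n_early_frames=4)

# One row per worm gives the feature table used for modeling.
dataset = pd.DataFrame([features])
print(dataset)
```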

Model Training and Evaluation:

Trains machine learning models to predict lifespan, using early behavioral features.
Evaluates model performance using metrics and visualizations.
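A minimal sketch of the train-and-evaluate loop, using ordinary least squares from scikit-learn on synthetic data. The feature matrix, the coefficients, and the noise level are made up for the example; the models in models.py and the real data will differ.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for "early behavior features -> lifespan".
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_coef = np.array([2.0, -1.0, 0.5])
y = 20.0 + X @ true_coef + rng.normal(scale=0.5, size=200)

# Hold out a test set so the error is measured on unseen worms.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))

print("coefficients:", model.coef_)
print("test RMSE:", rmse)
```

Inspecting model.coef_ after fitting is one simple way to see which features push the predicted lifespan up or down.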

Results Analysis:

Analyzes predictions against ground truth using metrics such as RMSE and accuracy.
Plots Kaplan-Meier curves for survival analysis.
Plots error histograms for the lifespan prediction models.
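For readers unfamiliar with Kaplan-Meier curves, the sketch below computes the estimator by hand with NumPy on toy data. It is a stand-in written for illustration: the notebook itself may rely on scikit-survival, whose sksurv.nonparametric.kaplan_meier_estimator provides this calculation.

```python
import numpy as np


def kaplan_meier(durations, observed):
    """Return event times and the estimated survival probability after each.

    `observed` is False for censored worms (lost before death was seen).
    """
    durations = np.asarray(durations, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    times = np.unique(durations[observed])
    survival = []
    s = 1.0
    for t in times:
        at_risk = np.sum(durations >= t)  # worms still followed at time t
        deaths = np.sum((durations == t) & observed)
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return times, np.array(survival)


# Toy lifespans in days; the third worm is censored at day 8.
times, surv = kaplan_meier(
    [5, 8, 8, 12, 15], [True, True, False, True, True]
)
for t, s in zip(times, surv):
    print(f"day {t:.0f}: S(t) = {s:.2f}")
```

The censored worm still counts in the at-risk group at day 8 but never contributes a death, which is what distinguishes this estimator from a plain fraction of survivors.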

Part 2: Assessment of worm personality based on early behavior

Setup:

Make any additional imports needed.
Load the data, this time specifically the Optogenetics file.

Preprocessing optogenetics data:

NaN imputation

Feature Engineering:

Derive personality metrics from early movement patterns such as consistency in movement and preferred activity levels

Clustering Analysis :

Perform clustering to group worms based on similar behavioral traits
Visualize clusters to identify distinct personality types
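The clustering step could be sketched as follows, assuming k-means on standardized features. The choice of k-means, the number of clusters, and the two synthetic "calm" and "active" groups are all assumptions for illustration; personality.py may use another method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic per-worm features: [mean activity, movement variability].
rng = np.random.default_rng(0)
calm = rng.normal(loc=[0.2, 0.1], scale=0.05, size=(30, 2))
active = rng.normal(loc=[1.0, 0.6], scale=0.05, size=(30, 2))
features = np.vstack([calm, active])

# Standardize so that no single feature dominates the distance metric.
scaled = StandardScaler().fit_transform(features)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(scaled)

print("cluster sizes:", np.bincount(labels))
```

A scatter plot of the two features colored by labels is the simplest way to check visually whether the groups are well separated.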

Behavioral Traits Evaluation :

Quantify differences between clusters using statistical methods
Highlight key behavioral features that differentiate groups
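One simple way to quantify which features separate two clusters is a standardized mean difference (Cohen's d) per feature. This is only one possible choice, used here as an assumption; the feature names and values are toy data, and the notebook may apply different statistics.

```python
import numpy as np
import pandas as pd


def cohens_d(a, b):
    """Standardized mean difference between two samples (pooled std)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled_var = (
        (len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)
    ) / (len(a) + len(b) - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


# Toy per-worm features with an assigned cluster label.
df = pd.DataFrame(
    {
        "cluster": [0, 0, 0, 1, 1, 1],
        "mean_speed": [1.0, 1.2, 0.8, 3.0, 3.2, 2.8],
        "frac_still": [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],
    }
)

effect = {
    col: cohens_d(
        df.loc[df["cluster"] == 1, col], df.loc[df["cluster"] == 0, col]
    )
    for col in ["mean_speed", "frac_still"]
}

# Features ranked by how strongly they differ between the two clusters.
ranked = sorted(effect, key=lambda col: abs(effect[col]), reverse=True)
print(ranked, effect)
```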

Visualization :

Generate plots to visualize personality traits and cluster distributions

Insights and interpretation:

Draw connections between personality traits and lifespan predictions from Part 1.
Provide actionable insights based on behavioral clustering.
Cluster plots showing distinct worm personality types.
Behavioral feature distributions across clusters.

Results and Outputs

• Predictions: Provides lifespan predictions for worms based on their early behavior.
• Visualizations: Includes Kaplan-Meier survival curves and other plots for understanding model performance.
• Evaluation: Analysis of OLS coefficients.
• Performance Metrics: Reports accuracy, RMSE, and survival analysis metrics.

Add new data

To run the pipeline with new worm data, make sure that the new files are saved in the same format (.csv) and contain the same information. Then put the files in one of the 'Data' subfolders and run the notebook. If the new files contain different data (from different experiments or with different drugs), the notebook will still run, but parts of 'load_data.py' may need to be modified for the results to be meaningful.
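Before dropping new files into a 'Data' subfolder, a quick column comparison against an existing file can catch format mismatches early. The helper compare_columns and the column names below are hypothetical and only illustrate the idea.

```python
import pandas as pd


def compare_columns(reference, candidate):
    """List columns missing from, or unexpected in, a new data file."""
    ref_cols = list(reference.columns)
    new_cols = list(candidate.columns)
    missing = [c for c in ref_cols if c not in new_cols]
    unexpected = [c for c in new_cols if c not in ref_cols]
    return missing, unexpected


# In practice both frames would come from pd.read_csv on real files.
reference = pd.DataFrame(columns=["Frame", "X", "Y", "Speed"])
candidate = pd.DataFrame(columns=["Frame", "X", "Y", "Drug"])

missing, unexpected = compare_columns(reference, candidate)
print("missing:", missing, "unexpected:", unexpected)
```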
