Institut DataIA Paris-Saclay

dataia.eu

3 Jobs

15 Employees

About the Company

Founded in 2017 as part of the French National Strategy for Artificial Intelligence, the Institut DataIA is Université Paris-Saclay's center of excellence in AI. It brings together 14 higher education and research institutions, including CentraleSupélec and ENS Paris-Saclay, along with national research organizations and academic partners. The Institute works to structure the AI ecosystem around research, training, and innovation, with flagship initiatives such as the SaclAI-School project, a winner of the Compétences et Métiers d'Avenir call for projects.

Listed Jobs

Company Name
Institut DataIA Paris-Saclay
Job Title
Internship - Transfer learning models able to handle MISSing data for the survival analysis of rare cancer from multi-OMICS data
Job Description
**Job Title**
Internship – Transfer Learning Models for Survival Analysis of Rare Cancer with Missing Multi‑Omics Data

**Role Summary**
Develop and evaluate machine learning models that leverage transfer learning and joint dimensionality‑reduction to predict survival outcomes for rare cancers, using heterogeneous multi‑omics data with missing values.

**Expectations**
- Participate in end‑to‑end model development from data preprocessing to deployment.
- Produce reproducible code and documentation.
- Present experimental results in team meetings.
- Collaborate with statisticians, bioinformaticians, and clinicians.

**Key Responsibilities**
1. Acquire, clean, and preprocess multi‑omics datasets (genomics, epigenomics, transcriptomics, proteomics).
2. Design and implement missing‑data handling strategies (imputation, model‑based methods).
3. Build and train transfer‑learning architectures that incorporate clinical covariates and omics features.
4. Apply joint dimension‑reduction techniques to reduce high dimensionality before survival analysis.
5. Perform survival analysis using appropriate statistical models (Cox, DeepSurv, random survival forests).
6. Evaluate model performance with metrics such as concordance index, time‑dependent AUC, and calibration plots.
7. Compare new methods to baseline clinical‑only models and existing multi‑omics methods.
8. Document code, data pipelines, and experimental results for reproducibility.

**Required Skills**
- Programming: Python and R (Java or Scala optional).
- Machine learning frameworks: TensorFlow, PyTorch, or Keras.
- Survival analysis libraries: lifelines, scikit-survival, or equivalent.
- Statistical modeling and inference.
- Experience with high‑dimensional data and dimensionality‑reduction (PCA, NMF, MultiFA, CCA).
- Familiarity with missing data techniques (multiple imputation, EM, matrix completion).
- Basic knowledge of genomics and bioinformatics pipelines.
- Good version‑control usage (Git) and documentation practices.

**Required Education & Certifications**
- Current enrollment in or completion of a Bachelor’s or Master’s program in Computer Science, Statistics, Bioinformatics, Applied Mathematics, or related field.
- Coursework or projects demonstrating experience with data science or machine learning.
- No specific certifications required; familiarity with genomics resources (ENCODE, TCGA, GTEx) is a plus.
Gif-sur-Yvette, France
On site
20-01-2026
Company Name
Institut DataIA Paris-Saclay
Job Title
Master 2 Internship - Spatial analysis of rainfall cells using machine learning
Job Description
**Job Title**
Master 2 Internship – Spatial Analysis of Rainfall Cells by Machine Learning

**Role Summary**
Support a research project focused on evaluating the influence of urban heat islands on precipitation. Develop and train supervised contrastive learning models on radar‑derived rainfall maps to extract region‑specific representations, enabling characterization and classification of rainfall patterns.

**Expectations**
- Complete ML model development within the internship duration.
- Produce reproducible code and documentation.
- Deliver a technical report summarizing methodology, results, and potential extensions.

**Key Responsibilities**
- Acquire and preprocess 50 × 50 km rainfall maps from the Météo-France radar network for the Paris and Val de Loire regions.
- Implement a supervised contrastive learning framework (e.g., Khosla et al. 2020) in Python/PyTorch or TensorFlow.
- Train the model to bring together maps from the same region and separate those from different regions.
- Evaluate representations for downstream tasks: characterization, classification, and exploratory analysis.
- Investigate augmentations such as incorporating rainfall cell velocity fields and meta‑data embeddings.
- Explore scalability to additional metropolitan regions.

**Required Skills**
- Strong programming in Python; experience with deep learning libraries (PyTorch, TensorFlow).
- Familiarity with supervised contrastive learning or related representation learning techniques.
- Ability to handle and process large geospatial datasets.
- Knowledge of basic atmospheric science concepts (urban heat islands, precipitation dynamics) is advantageous.
- Proficiency in scientific computing, data analysis, and version control (Git).

**Required Education & Certifications**
- Current Master’s student (MSc or equivalent) in Atmospheric Sciences, Physics, Applied Mathematics, Computer Science, or related field.
- Coursework or experience in machine learning, statistical analysis, and geospatial data processing.
Gif-sur-Yvette, France
On site
10-02-2026
Company Name
Institut DataIA Paris-Saclay
Job Title
Development and industrialization of an artificial intelligence (AI) solution for detecting doping substances by high-resolution mass spectrometry (LC-HRMS)
Job Description
Job Title: Development and Industrialisation of an AI Solution for Detection of Doping Substances via LC-HRMS Mass Spectrometry

Role Summary: Assist in finalising the operational deployment of an AI-driven detection system for doping substances in a high-resolution mass spectrometry laboratory. The role involves data science, chemistry, automation, experimental validation, and quality integration over a 4‑6 month internship.

Expectations:
- Complete development of AI pipelines, APIs, and validation workflows.
- Ensure robustness, accuracy, and usability for laboratory operations.
- Produce documentation and reports for sustained integration.

Key Responsibilities:
- Develop and optimise AI pipelines: adapt and train models to improve prediction robustness and precision.
- Build an API to integrate the solution seamlessly into the laboratory’s ecosystem.
- Conduct experimental validation: compare model predictions with reference results, analyse discrepancies.
- Monitor performance, identify error sources, and propose continuous improvements.
- Draft technical documentation for code and processes to support long‑term workflow integration.

Required Skills:
- Proficiency in Python (pandas, matplotlib, scikit‑learn, etc.).
- Experience with SQL and database management.
- Strong analytical and problem‑solving abilities.
- Ability to work autonomously and collaborate within a multidisciplinary team.

Required Education & Certifications:
- Master’s (M1 or M2) in Computer Science, Data Science, or Artificial Intelligence.
- No mandatory certifications required.
Gif-sur-Yvette, France
On site
10-02-2026