Publications
2025
- torchsom: The Reference PyTorch Library for Self-Organizing Maps
Preprint submitted to Journal of Machine Learning Research, 2025
This paper introduces torchsom, an open-source Python library that provides a reference implementation of the Self-Organizing Map (SOM) in PyTorch. The package offers three main features: (i) dimensionality reduction, (ii) clustering, and (iii) user-friendly data visualization. It relies on a PyTorch backend, enabling (i) fast and efficient training of SOMs through GPU acceleration, and (ii) easy and scalable integration with the PyTorch ecosystem. Moreover, torchsom follows the scikit-learn API for ease of use and extensibility. The library is released under the Apache 2.0 license with 90% test coverage, and its source code and documentation are available at https://github.com/michelin/TorchSOM.
@misc{berthier2025torchsom,
  title = {torchsom: The Reference PyTorch Library for Self-Organizing Maps},
  author = {Berthier, Louis and Shokry, Ahmed and Moreaud, Maxime and Ramelet, Guillaume and Moulines, Eric},
  year = {2025},
  eprint = {2510.11147},
  archiveprefix = {arXiv},
  primaryclass = {stat.ML},
  note = {Preprint submitted to Journal of Machine Learning Research},
  url = {https://arxiv.org/abs/2510.11147},
}
- torchsom: The Reference PyTorch Library for Self-Organizing Maps
Software, 2025
@software{berthier2025torchsom_software,
  author = {Berthier, Louis},
  title = {torchsom: The Reference PyTorch Library for Self-Organizing Maps},
  year = {2025},
  version = {1.1.1},
  url = {https://github.com/michelin/TorchSOM},
}
- Knowledge Discovery in Large-Scale Batch Processes through Explainable Boosted Models and Uncertainty Quantification: Application to Rubber Mixing
Systems and Control Transactions, 2025
Rubber mixing (RM) is a vital batch process producing high-quality composites, which serve as input material for manufacturing different types of final products, such as tires. Due to its complexity, this process faces two main challenges regarding the final quality: (i) lack of online measurement and (ii) limited comprehension of the influence of the different factors involved in the process. While data-driven and machine learning (ML) based soft-sensing methods have been widely applied to address the first challenge, the second challenge, to the best of the authors' knowledge, has not yet been addressed in the rubber industry. This work presents a data-driven method for extracting knowledge and providing explainability in quality prediction for RM processes. The method centers on an XGBoost model while leveraging high-dimensional data collected over extended time periods from one of Michelin's complex mixing processes. First, a recursive feature elimination-based procedure is used for selecting relevant features, which reduces the number of input features used for building the ML model by 82% while improving its predictive performance by 17%. Second, SHapley Additive exPlanations (SHAP) techniques are employed to explain the ML model's predictions through global and local analyses of feature interactions. The selected quality-related variables can be leveraged to improve process control and supervision. Finally, an uncertainty quantification (UQ) module, based on Split Conformal Prediction (SCP), is combined with the ML model, providing confidence intervals with 90% coverage and empirically verified theoretical guarantees. This module ensures prediction reliability and robustness in real applications.
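As a rough illustration of the Split Conformal Prediction step named in the abstract above, the sketch below builds 90% prediction intervals around a point predictor from held-out calibration residuals. It is not the paper's implementation: the toy linear model, the synthetic data, and the names `conformal_quantile`, `predict_interval`, and `toy_model` are assumptions made for this example, whereas the paper applies the idea to an XGBoost model on process data.

```python
import math
import random


def conformal_quantile(residuals, alpha):
    """Split-conformal quantile of absolute calibration residuals."""
    n = len(residuals)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to guarantee coverage at this alpha.
        return math.inf
    return sorted(residuals)[rank - 1]


def predict_interval(point_prediction, q):
    """Symmetric interval of half-width q around the point prediction."""
    return point_prediction - q, point_prediction + q


def toy_model(x):
    # Hypothetical stand-in for an already fitted regressor.
    return 2.0 * x


rng = random.Random(0)


def sample(n):
    # Synthetic data: y = 2x + standard Gaussian noise (an assumption).
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


cal_x, cal_y = sample(500)     # calibration split, unseen during fitting
test_x, test_y = sample(2000)  # fresh data used to check coverage

alpha = 0.10  # target miscoverage, i.e. 90% nominal coverage
residuals = [abs(y - toy_model(x)) for x, y in zip(cal_x, cal_y)]
q = conformal_quantile(residuals, alpha)

covered = 0
for x, y in zip(test_x, test_y):
    lower, upper = predict_interval(toy_model(x), q)
    covered += lower <= y <= upper
coverage = covered / len(test_x)

print(f"interval half-width: {q:.3f}")
print(f"empirical coverage:  {coverage:.3f}")
```

The empirical coverage on fresh data should land close to the nominal 90%, which is the kind of check the abstract refers to as empirically verifying the theoretical guarantee.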
@article{berthier2025knowledge_discovery_rubber_mixing,
  title = {Knowledge Discovery in Large-Scale Batch Processes through Explainable Boosted Models and Uncertainty Quantification: Application to Rubber Mixing},
  author = {Berthier, Louis and Shokry, Ahmed and Moulines, Eric and Ramelet, Guillaume and Desroziers, Sylvain},
  journal = {Systems and Control Transactions},
  volume = {4},
  pages = {1518--1523},
  year = {2025},
  doi = {10.69997/sct.183525},
  url = {https://doi.org/10.69997/sct.183525},
}
- Detecting fast-ripples on both micro- and macro-electrodes in epilepsy: A wavelet-based CNN detector
Journal of Neuroscience Methods, 2025
Background: Fast-ripples (FR) are short (~10 ms) high-frequency oscillations (HFO) between 200 and 600 Hz that are helpful in epilepsy to identify the epileptogenic zone. Our aim was to propose a new FR detection method that is efficient for intracerebral EEG (iEEG) recorded from both usual clinical macro-contacts (millimeter scale) and microwires (micrometer scale). New Method: Step 1 of the detection method is based on a convolutional neural network (CNN) trained using a large database of > 11,000 FR recorded from the iEEG of 38 patients with epilepsy from both macro-contacts and microwires. The FR and non-FR events were fed to the CNN as normalized time-frequency maps. Step 2 is based on feature-based control techniques in order to reject false positives. In step 3, the human is reinstated in the decision-making process for final validation using a graphical user interface. Results: WALFRID achieved high performance on the realistically simulated data, with sensitivity up to 99.95 % and precision up to 96.51 %. The detector was able to adapt to both macro- and micro-EEG recordings. The real data were used without any pre-processing step such as artefact rejection. The precision of the automatic detection was 57.5 %. Step 3 helped eliminate the remaining false positives in a few minutes per subject. Comparison with Existing Methods: WALFRID performed as well as or better than 6 other existing methods. Conclusion: Since WALFRID was created to mimic the work-up of the neurologist, clinicians can easily use, understand, interpret and, if necessary, correct the output.
@article{gardy2025fast_ripples,
  title = {Detecting fast-ripples on both micro- and macro-electrodes in epilepsy: A wavelet-based CNN detector},
  author = {Gardy, Ludovic and Curot, Jonathan and Valton, Luc and Berthier, Louis and Barbeau, Emmanuel J. and Hurter, Christophe},
  journal = {Journal of Neuroscience Methods},
  volume = {415},
  pages = {110350},
  year = {2025},
  publisher = {Elsevier},
  doi = {10.1016/j.jneumeth.2024.110350},
}
2023
- 2DSBG: A 2D Semi Bi-Gaussian Filter Adapted for Adjacent and Multi-Scale Line Feature Detection
In ICASSP 2023–2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
Existing filtering techniques fail to precisely detect adjacent line features in multi-scale applications. In this paper, a new filter composed of a bi-Gaussian and a semi-Gaussian kernel is proposed, capable of highlighting complex linear structures such as ridges and valleys of different widths, with noise robustness. Experiments have been performed on a set of both synthetic and real images containing adjacent line features. The obtained results show the performance of the new technique in comparison to the main existing filtering methods.
@inproceedings{magnier2023_2dsbg,
  title = {2DSBG: A 2D Semi Bi-Gaussian Filter Adapted for Adjacent and Multi-Scale Line Feature Detection},
  author = {Magnier, Baptiste and Shokouh, Ghulam Sakhi and Berthier, Louis and Pie, Marcel and Ruggiero, Adrien},
  booktitle = {ICASSP 2023–2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages = {1--5},
  year = {2023},
  organization = {IEEE},
  doi = {10.1109/ICASSP49357.2023.10095570},
}