pmatorras/README.md

¡Hola! I'm Pablo Matorras-Cuevas

Data Scientist | PhD Physicist | ML & NLP Specialist

I'm a data scientist with a PhD in particle physics, specialising in Machine Learning and Large Language Models. With over six years of experience in statistical modelling and high-dimensional data analysis, I combine CERN's scientific rigour with modern AI techniques to solve real-world problems.

🌍 Dual Spanish-Swiss citizen | Based in Santander, Spain
🔬 Former CERN Researcher at CMS Collaboration
🎓 PhD in Particle Physics (cum laude) | Universidad de Cantabria

🛠️ Tech Stack

Machine Learning & AI
PyTorch, Hugging Face, scikit-learn, TensorFlow

Data Science & Analysis
Python, Pandas, NumPy, Plotly

Physics & Research
C++, ROOT, LaTeX

DevOps & Tools
Docker, Git, Linux

🚀 Featured Projects

Machine learning system predicting S&P 500 outperformers using market data, fundamentals, and sentiment indicators.

  • 20.2% annual returns | 0.93 Sharpe ratio
  • Random Forest with time series cross-validation
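
As a rough illustration of the two ideas in the bullets above, the sketch below computes an annualised Sharpe ratio and generates expanding-window (walk-forward) splits, one common form of time series cross-validation. It is not taken from the project: the function names, the monthly `periods_per_year=12` default, and the zero risk-free rate are assumptions made for the example.

```python
import math
import statistics
from typing import Iterator, List, Sequence, Tuple


def annualised_sharpe(
    returns: Sequence[float],
    risk_free_per_period: float = 0.0,  # assumption: zero risk-free rate
    periods_per_year: int = 12,         # assumption: monthly returns
) -> float:
    """Annualised Sharpe ratio from per-period simple returns."""
    excess = [r - risk_free_per_period for r in returns]
    mean = statistics.fmean(excess)
    stdev = statistics.stdev(excess)  # sample standard deviation
    if stdev == 0:
        raise ValueError("Sharpe ratio is undefined for zero volatility")
    return mean / stdev * math.sqrt(periods_per_year)


def expanding_window_splits(
    n_samples: int, n_splits: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; training data always precedes test data."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        # The training window grows by one test block on every fold.
        train_end = n_samples - (n_splits - k + 1) * test_size
        train = list(range(train_end))
        test = list(range(train_end, train_end + test_size))
        yield train, test
```

With `expanding_window_splits(10, 4)`, the first fold trains on indices 0-1 and tests on 2-3, and each later fold extends the training window, so no fold trains on data from after its test period.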

Advanced NLP pipeline fine-tuning FinBERT for financial sentiment analysis.

  • Multi-task architecture (classification + regression)
  • 85.4% accuracy across news, social media, and forums
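
One common way to train a model with both a classification and a regression head is to minimise a weighted sum of the two losses. The dependency-free sketch below shows that objective for a single example. The three-class setup, the squared-error regression term, and the `regression_weight` parameter are illustrative assumptions, not details from the project.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy for one example, computed stably via log-sum-exp."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def multi_task_loss(
    logits: Sequence[float],
    label: int,
    predicted_score: float,
    target_score: float,
    regression_weight: float = 1.0,  # assumption: hypothetical balancing weight
) -> float:
    """Classification loss plus a weighted squared-error regression loss."""
    classification = cross_entropy(logits, label)
    regression = (predicted_score - target_score) ** 2
    return classification + regression_weight * regression
```

In a real fine-tuning setup the same sum would be computed per batch on tensors, and the weight would be tuned so that neither task dominates the gradient.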

End-to-end system predicting match outcomes across Europe's top five leagues.

  • Combines Elo ratings with Random Forest
  • 50.5% accuracy on three-class prediction
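
For reference, the standard Elo update can be written in a few lines, and the resulting rating difference is the kind of feature a Random Forest could consume. This is a minimal sketch rather than the project's code: the K-factor of 20, the absence of a home-advantage term, and the function names are assumptions.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    home: float,
    away: float,
    home_score: float,  # 1.0 = home win, 0.5 = draw, 0.0 = home loss
    k: float = 20.0,    # assumption: illustrative K-factor
) -> Tuple[float, float]:
    """Return the updated (home, away) ratings after one match."""
    delta = k * (home_score - expected_score(home, away))
    # Elo is zero-sum: whatever one side gains, the other loses.
    return home + delta, away - delta


# Two evenly rated teams: a home win moves 10 points from away to home.
home_rating, away_rating = update_elo(1500.0, 1500.0, home_score=1.0)
rating_gap = home_rating - away_rating  # candidate model feature
```

Treating a draw as a score of 0.5 keeps the update symmetric, which suits a three-class (home win, draw, away win) target.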

Interactive dashboard exploring macroeconomic indicators across countries using Plotly/Dash.

📚 Background

  • Graduate Researcher | Instituto de Física de Cantabria (IFCA) | 2019-2025

    • Large-scale data analysis on CMS detector datasets
    • Monte Carlo simulations and distributed computing
    • International collaboration with 3000+ scientists
  • Teaching Assistant | Universidad de Cantabria | 2021-2023

  • Research Assistant | University of Zurich | 2018-2019

🔬 Research Code & Large-Scale Computing

Contributor to CMS Experiment software frameworks (100k+ lines of code).

🌏 International Experience

Lived and worked in 4 countries (Spain, Switzerland, UK, USA) and visited 39 others. Exchange programmes at University College London, University of Pittsburgh, and University of Zurich.

📫 Let's Connect


Pinned

  1. financial-ml (Public)

    S&P 500 equity selection system (20.2% return, 0.93 Sharpe) using Random Forest. End-to-end backtesting pipeline.

    Python

  2. financial-sentiment-llm (Public)

    Multi-task FinBERT achieving 85% accuracy on financial sentiment (News/Social/Forums). Explores LoRA efficiency and domain adaptation for trading signals.

    Python

  3. footAI (Public)

    End-to-end football analytics system with live dashboards, Elo ratings, and ML predictions (50.5% accuracy) for Europe's top five leagues.

    Python

  4. MacroEconomics (Public)

    Fetch, analyze, and visualize IMF macroeconomic indicators. Includes CLI pipeline, time-series charts with projection styling, and interactive Dash dashboard with European maps.

    Python

  5. PlotsConfigurations (Public)

    Forked from scodella/PlotsConfigurations

    Plots configuration for mkShape

    Python 1

  6. webpage (Public)

    Creating a webpage for my domain from scratch.

    HTML