mjsushanth/README.md

Hello, I'm Joel M

AI enthusiast & Data Engineer | M.S. in Artificial Intelligence @ Northeastern University

I enjoy AI/ML research, enterprise-scale data engineering, and building systems in deep learning, NLP, and computer vision. Please visit: https://mjsushanth.github.io/


Portfolio:

  • Do visit my portfolio here

Study Notes:

  • My study notes are categorized and collected here!
  • Credit to Obsidian, an excellent app for taking notes. Expect deep dives into the math, clear mental models, intuition, and research notes from my projects.

Research & Academic Projects

  • FinRAG/FinSights: Production-Grade Financial Intelligence System. A hybrid dual-path architecture combines structured queries (DuckDB/SQL dimension tables) with semantic retrieval. Reduces 72M sentences to 1M via stratified sampling with temporal weighting across regulatory eras. Check this out!
    • Data Engineering Pipeline: DuckDB stratified sampling (30+ SQL scripts) with weighted multi-objective scoring, fuzzy-matched integrations, conditional temporal stratification.
    • Advanced RAG Engineering: Sentence-level embeddings, multi-query expansion with window-hopping retrieval, citation provenance via document headers for exact traceability. Polars/Parquet logging, serverless-ready architecture. Costs $0.017-0.025 per query, with no managed-database overhead.
  • Text-to-Pose Diffusion: Built a CLIP-conditioned diffusion model with cross-attention + anatomical loss for 3D pose generation.
    • Covers in-depth research on motion/3D data (pose representation, N-joint hierarchical mapping, kinematic chains, pelvis-spine-extremity validation) and on the architecture: Hybrid CNN-Transformer Diffusion, CLIP Semantic Encoding & Projection, Dual-Pass CFG, and Anatomical Constraint Enforcement. See Report here. See Design here.
  • Multi-View 3D Scene Analysis: Created a 10k+ LOC pipeline with multi-view scene analysis, pose-guided filtering, occlusion handling, and RANSAC validation on ETH3D. See Design Flow.
  • Protein Structure Prediction: Implemented HMM, CRF, BiLSTM; CRF reached 67% accuracy on CB513 using evolutionary + context features. See Report here.
  • SocrAItic Circle: A multi-agent LLM debate workflow designed with multi-phase debate cycles, iterative refinement, YAML-driven orchestration, and judge modules.
  • Artist Classification: Compared SVM-SIFT-BoVW, CAEs, VAEs, and CNNs; the SVM achieved 89% accuracy on a 50-class dataset.
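
As an illustration of the stratified sampling with temporal weighting mentioned under FinRAG/FinSights, here is a minimal Python sketch. It is not the project's DuckDB/SQL implementation. The era labels, the weights, and the function names `allocate_budget` and `stratified_sample` are hypothetical, and the rule used (share proportional to stratum size times weight) is one simple assumption among many possible scoring schemes.

```python
import random
from collections import defaultdict


def allocate_budget(stratum_counts, weights, budget):
    """Split a sample budget across strata in proportion to count * weight.

    Each share is floored and capped at the stratum size, so the total
    may come out slightly below the requested budget.
    """
    scores = {s: n * weights.get(s, 1.0) for s, n in stratum_counts.items()}
    total = sum(scores.values())
    return {
        s: min(stratum_counts[s], int(budget * scores[s] / total))
        for s in scores
    }


def stratified_sample(rows, key, weights, budget, seed=0):
    """Draw a weighted stratified sample from a list of dict rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    counts = {s: len(g) for s, g in groups.items()}
    alloc = allocate_budget(counts, weights, budget)
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    sample = []
    for s, g in groups.items():
        sample.extend(rng.sample(g, alloc[s]))
    return sample


# Hypothetical eras and weights: recent filings are up-weighted 3x.
rows = [{"era": "older", "id": i} for i in range(1000)]
rows += [{"era": "recent", "id": i} for i in range(1000)]
picked = stratified_sample(rows, "era", {"older": 1.0, "recent": 3.0}, 400)
```

With equal stratum sizes and a 3x weight, the recent era receives three quarters of the budget. Each stratum's share is capped at its size, so no stratum is asked for more rows than it contains.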

Other works:

  • Multi-vector ViT+CLIP with LoRA and ColBERT-style MaxSim retrieval. Demo Notebook.
  • An example ML-serving workflow using GitHub CI/CD, AWS Lambda, and SAM infrastructure. Src Code. Notes here. Study notes.
  • Optuna and MLflow usage with a synthetic time-series generator. Src Code.
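
For readers unfamiliar with ColBERT-style MaxSim, the sketch below shows the scoring rule in plain Python: each query vector is matched to its best-scoring document vector, and those maxima are summed. It is not the code from the demo notebook. The toy vectors and the function names `maxsim` and `rank` are assumptions made for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query_vecs, doc_vecs):
    """Late-interaction score: sum over query vectors of the best match."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank(query_vecs, docs):
    """Return document names sorted by MaxSim score, best first."""
    return sorted(docs, key=lambda name: maxsim(query_vecs, docs[name]),
                  reverse=True)


# Toy multi-vector embeddings (hypothetical): two vectors per item.
query = [[1, 0], [0, 1]]
docs = {
    "doc_a": [[1, 0], [0, 1]],  # covers both query vectors
    "doc_b": [[1, 0], [1, 0]],  # covers only the first
}
order = rank(query, docs)
```

Here `doc_a` scores 2 and `doc_b` scores 1, so `doc_a` ranks first. In practice the vectors would be normalized patch or token embeddings, and the inner maximum would be computed as a batched matrix operation.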

📫 Connect with Me

LinkedIn
Email
GitHub

Pinned

  1. Finsights-MLOps/FinSights Public

    Group project for coursework (MLOps, IE7374), Northeastern University.

    Jupyter Notebook 3

  2. Multi_Agent_LLM_Debater Public

    A modular framework for orchestrating structured debates between multiple large language models (LLMs) with specialized judge evaluation. This project implements an adversarial training approach to…

    Jupyter Notebook 2

  3. CLIP-Conditioned-Diffusion-T2Pose-Generation Public

    Dataset - HumanML3D. Large Pipeline with research oriented implementation and exploration for T2P from static pose; ConditionedUNetModels, Anatomical Awareness, Text embedding conditioning, Cross-a…

    Jupyter Notebook 1

  4. ML_Protein_Structure_Prediction Public

    Probabilistic approaches for protein secondary structure prediction using Hidden Markov Models and Conditional Random Fields (CS 6140)

    Jupyter Notebook 1

  5. mlops-labs-portfolio Public

    Submission of Labs for MLOps - IE7374. Will likely include other Practice Work!

    Jupyter Notebook

  6. MultiView_Image_Analysis_CS5330 Public

    A research project that attempts to understand elements for complete 3D reconstruction from static images: Feature correspondences, Scene Adaptiveness, Reliability, Camera Intrinsics, Distances, Ov…

    Jupyter Notebook