🔬 Computational Physicist turned Scientific Software Developer and Data Engineer
🧑‍💻 Currently designing databases for large-scale atomistic simulation data at TU Eindhoven × IBM Research
Passionate about bridging scientific computing with modern data engineering and AI technologies
💡 Love building tools that make complex scientific and engineering workflows accessible and scalable
📍 Based in Den Bosch, Netherlands
⚡ Fun fact: I enjoy cooking, drone cinematography, and swimming when not debugging simulation data pipelines!
Building an open-source, cloud-native pipeline for large-scale molecular dynamics data. Using MinIO for scalable storage, Apache Spark and Delta Lake for transforming raw trajectories into structured formats, and Trino for fast SQL querying. Integrating MLflow for reproducible AI workflows and orchestrating everything with Apache Airflow. Focused on scalable, metadata-rich infrastructure for scientific computing.
Developed a modular Python toolkit for LAMMPS simulation analysis, backed by 270+ tests (94% coverage), Dockerized for portability, and powered by robust CI/CD. Achieved 60% memory savings and 40% faster performance compared to typical scientific scripting workflows.
Extended LAMMPS with C++ to integrate two open-source packages for novel electrochemical device simulations, navigating complex licensing and attribution challenges.
🎯 Seeking opportunities in:
- Scientific Software Development & Computational Materials Science
- Data Engineering, Analytics & Data Stewardship
- Modeling & Simulation Engineering
- AI/ML Applications in Scientific Computing
🔧 Core Expertise:
- Data Analysis & Insights: Statistical analysis of large scientific datasets with advanced visualization
- Materials Science Modeling: Molecular dynamics simulations, DFT calculations, and multi-scale modeling
- Performance Optimization: Algorithmic improvements achieving significant memory and speed gains
- Data Pipeline Architecture: Real-time streaming and batch processing for scientific workflows
- Data Governance: Metadata management, data quality assurance, and reproducible research practices
- Full-Stack Scientific Computing: From Python APIs to C++ algorithms to cloud deployment
- Production Software Development: CI/CD, automated testing, containerization, and package distribution
🌱 Databricks Certified Data Engineer (in progress)
🌱 Cloud-native data lake architectures & data governance
🌱 Advanced statistical analysis and predictive modeling
🌱 Graph-based ML for scientific applications
🌱 Natural language interfaces for scientific databases
PhD in Materials Science & Engineering from Shanghai Jiao Tong University with expertise in computational modeling, machine learning, and numerical methods. Transitioned from pure research to building production-ready scientific software that solves real-world problems.
Key Achievement: Led IBM collaboration resulting in 10x device stability improvement through innovative simulation algorithms and data processing pipelines.
- 🇬🇧 English (Professional - C2)
- 🇳🇱 Dutch (Learning - Beginner)
- 🇨🇳 Chinese (Basic - A1)
- 🇮🇳 Hindi, Assamese, Bengali (Native)
💬 Let's connect! I'm always excited to discuss scientific computing, materials science research, data engineering challenges, or opportunities to make complex data more accessible through better analysis and visualization. Whether you're looking to optimize simulation workflows, design scalable data architectures, implement data governance, or bridge the gap between research and production - I'd love to hear from you!
📫 Reach out: [email protected] | LinkedIn | Portfolio