Data Scientist, M6 Group
Nov 2018 - Present
Paris Area, France
Lookalike Audience Extension
- Build a general-purpose Machine Learning platform based on the Apache Spark framework.
- Build a cross-platform (desktop, mobile and TV) lookalike audience extension system serving millions of users.
- Build a content recommender system on top of the general-purpose ML platform.
- Mentor junior and mid-level data scientists.
Data Scientist @alittlemarket.com, Etsy
Apr 2014 - Sep 2017, 3 yrs 6 mos
Paris Area, France
- Created the first Machine Learning-powered application on alittlemarket.com.
- Designed and developed a user behavioral data collection system with Node.js and Elasticsearch.
- Developed a predictive model for unique item purchases in Python (scikit-learn) and an Algolia reranking process.
- Deployed the predictive model in production as an API using Flask.
- Designed and monitored Key Performance Indicators in Google Analytics and internal tools.
- Increased revenue on two occasions, by 10% and by 5%, as measured by A/B tests.
- Implemented a content-based recommender system with Spark (Scala) for millions of unique items.
- Parallelized batch jobs using the MapReduce programming model.
- Reduced time complexity from \( O(n^2) \) to \( O(n^2/k) \).
- Built a pipeline in Python for extraction and spelling correction of e-commerce expressions in French.
- Built a parallelization tool to reduce processing time with Python multiprocessing.
Sorbonne University, Pierre and Marie Curie campus (Jussieu campus)
Master’s degree, Artificial Intelligence and Decision, 2014
Engineer’s degree, Computer Networks and Multimedia Communications, 2013
Baby health (2017): Created a web application that calculates the Body Mass Index of children aged 0 to 5 and ranks their height and weight against children around the world using World Health Organization open data.
Home intelligence (2016): Built an automatic lighting control system using Internet of Things (IoT) technologies.
Additional Experience and Awards
Finalist, Meilleur Dev de France
- Participated in Meilleur Dev de France 2018, an algorithm hackathon, and was one of the 140 finalists out of 2000 developers.
Top 12%, Data Science Bowl
- Participated in the 2018 Data Science Bowl, a computer vision challenge, and ranked in the top 12% (427th of 3634).
Top 6% (bronze medal), Kaggle competition, “Predicting Red Hat Business Value”, 2016
- Created a classification algorithm that accurately identifies which customers have the most potential business value for Red Hat based on their characteristics and activities.
- Programming Languages: Python (scikit-learn, NumPy, SciPy, pandas, TensorFlow, Keras, Matplotlib, Flask); Scala (Spark, Zeppelin); Java; PHP; Node.js
- Data Storage: MySQL; Elasticsearch; Cassandra; Redis; AWS S3; Algolia
Activities in 2018
I spent most of my time supporting my family in 2018. In my spare time, I continued learning and practicing.
One of the 140 finalists (out of 2000) of Meilleur Dev de France 2018 #MDF18
23 Oct 2018
Participated in Meilleur Dev de France 2018, an algorithm hackathon, and was one of the 140 finalists out of 2000 developers.
Data Structures and Algorithms Training
Apr 2018 - Jul 2018, 4 mos
Trained on hundreds of algorithmic coding problems, focusing on time and space optimization.
2018 Data Science Bowl
Mar 2018 - Apr 2018, 1 mo
Participated in the 2018 Data Science Bowl, a challenge on automating nucleus detection, and ranked in the top 12% (427th of 3634).
Deep Learning Specialization, deeplearning.ai
Jan 2018 - Mar 2018, 3 mos
Completed Deep Learning, a 5-course specialization by deeplearning.ai on Coursera.