Senior Data Scientist | AI/ML Consultant | Based in Munich, Germany | Available Remotely | Expertise Highlights: With over six years of experience ...
Updated on 22.11.2024
Profile
Freelancer / Self-employed
Remote work
Available from: 01.12.2024
Availability: 100%
of which on-site: 0%
Data Scientist
Portuguese
Native language
English
Fluent
Spanish
Conversational
German
Basic knowledge
French
Basic knowledge

Work locations

Germany, Switzerland
possible

Projects

4 years 9 months
2020-03 - now

Conducted economic viability analyses

Senior Data Scientist Consultant Spark Snowflake SQL ...
Senior Data Scientist Consultant
  • Conducted economic viability analyses to guide strategic decisions and optimize resource allocation, integrating real-time data pipelines using Spark, Snowflake, and SQL.
  • Ideated and developed AI-driven product solutions tailored to client needs, leveraging LangChain and LlamaIndex for LLM application development and knowledge integration.
  • Designed and deployed Retrieval Augmented Generation (RAG) pipelines to improve information retrieval and contextual responses, using Pinecone and Weaviate for vector search optimization.
  • Built and fine-tuned Large Language Models (LLMs) for chatbot applications, utilizing OpenAI APIs, prompt engineering, and generative AI frameworks.
  • Created scalable RESTful APIs with FastAPI to deploy AI and ML services, ensuring seamless integration with existing systems.
  • Deployed generative AI models in production, focusing on quantization, inference optimization, and deployment using PyTorch, TensorFlow, and Hugging Face Transformers.
  • Applied computer vision techniques for creative tagging and performance analysis, leveraging CLIP, Deep Learning, and Keras, driving campaign ROI improvements.
  • Designed and implemented MLOps workflows, including monitoring, versioning, and scaling AI systems with Kubernetes, Docker, and cloud platforms like AWS EKS, GCP, and Azure.
  • Developed robust data pipelines for real-time analytics and machine learning workflows, integrating BigQuery, DBT, and Databricks to enhance performance and scalability.
  • Integrated vector databases (e.g., Pinecone, Weaviate) and embeddings for advanced semantic search and retrieval in AI-driven applications.
  • Collaborated with stakeholders across engineering, product, and business teams to define AI strategy and ensure alignment with business objectives.
  • Created comprehensive documentation and implemented monitoring solutions for ML systems using Datadog and other observability tools.
  • Optimized database performance for large-scale applications, supporting SQL databases and NoSQL solutions to handle high-volume AI workflows.
Spark Snowflake SQL PyTorch TensorFlow Hugging Face Transformers Docker AWS EKS GCP Azure NoSQL LangChain LlamaIndex Pinecone Weaviate LLMs CLIP Deep Learning Keras Kubernetes BigQuery DBT Databricks Datadog
3 months
2024-08 - 2024-10

Healthcare Chatbot

Lead Data Scientist LLM Generative AI Python
Lead Data Scientist

  • Ideated and developed AI-driven product solutions tailored to client needs, leveraging LangChain and LlamaIndex for LLM application development and knowledge integration.
  • Designed and deployed Retrieval Augmented Generation (RAG) pipelines to improve information retrieval and contextual responses, using Pinecone and Weaviate for vector search optimization.
  • Built and fine-tuned Large Language Models for chatbot applications.

LLM Generative AI Python
Healthcare
8 months
2024-01 - 2024-08

Probabilistic Attribution

Lead Data Scientist Python LightGBM Causal Inference
Lead Data Scientist

  • Addressed third-party cookie deprecation by using probabilistic clustering in full-funnel attribution, using Spark to pipeline data to enhance conversion tracking and data accuracy.
  • Leveraged causal inference (doubly robust learners) to refine conversion tracking, boosting accuracy and stakeholder confidence on top of clustering output.

Python LightGBM Causal Inference
3 years 8 months
2021-01 - 2024-08

Developed and architected very complex real-time ML systems

Senior Data Scientist
Senior Data Scientist
  • Architected and developed complex real-time ML systems and, as part of a team, brought them to production multiple times.
  • Developed and managed an automated bidding model optimized to maximise revenue/margin, generating a 6% uplift in spend and decrease in CPA.
  • Developed and managed automated text generation models (keywords, ad copy) using classic NLP and LLM techniques (self-hosting and external, prompt engineering, persuasion techniques).
  • Led Research Project on Probabilistic Attribution: Addressed third-party cookie deprecation by using probabilistic clustering in full funnel attribution, using Spark to pipeline data to enhance conversion tracking and data accuracy.
  • Advanced Attribution with Causal Inference: Leveraged causal inference (doubly robust learners, econml) to refine conversion tracking, boosting accuracy and stakeholder confidence on top of clustering output.
  • Introduced "campaign pacing" as a KPI: Developed, A/B tested and deployed with Docker/Kubernetes a linear regression-based algorithm, achieving a 3.7% YoY revenue increase and enhancing pacing KPIs by 35%.
  • Successfully led A/B testing for new product features, driving improvement in core company KPIs.
  • Collaborated with cross-functional teams to enable data-driven decision-making and stakeholder buy-in.
  • Recruited new members for the DS team.
RADANCY
Remote DE
4 months
2024-01 - 2024-04

Campaign Pacing Optimization

Lead Data Scientist
Lead Data Scientist

  • Introduced "campaign pacing" as a KPI: Developed, A/B tested and deployed with Docker/Kubernetes a linear regression-based algorithm, achieving a 3.7% YoY revenue increase and enhancing pacing KPIs by 35%.
  • Successfully led A/B testing for new product features, driving improvement in core company KPIs.
  • Collaborated with cross-functional teams to enable data-driven decision-making and stakeholder buy-in.

3 months
2023-04 - 2023-06

LLM Fine Tuning

Lead Data Scientist Python Pytorch LLM ...
Lead Data Scientist

  • Deployed generative AI models in production, focusing on quantization, inference optimization, and deployment using PyTorch, TensorFlow, and Hugging Face Transformers.
  • Applied computer vision techniques for creative tagging and performance analysis, leveraging CLIP, Deep Learning, and Keras, driving campaign ROI improvements.
  • Designed and implemented MLOps workflows, including monitoring, versioning, and scaling AI systems with Kubernetes, Docker, and cloud platforms like AWS EKS, GCP, and Azure.
  • Developed robust data pipelines for real-time analytics and machine learning workflows, integrating BigQuery, DBT, and Databricks to enhance performance and scalability.

Python Pytorch LLM Docker Kubernetes AWS BigQuery DBT Databricks
1 year 1 month
2022-01 - 2023-01

Adwords Optimization

Lead Data Scientist Python NLP LLM
Lead Data Scientist

  • Architected and developed complex real-time ML systems and, as part of a team, brought them to production multiple times.
  • Developed and managed an automated bidding model optimized to maximise revenue/margin, generating a 6% uplift in spend and decrease in CPA.
  • Developed and managed automated text generation models (keywords, ad copy) using classic NLP and LLM techniques (self-hosting and external, prompt engineering, persuasion techniques).

Python NLP LLM
1 year 1 month
2022-01 - 2023-01

Risk Adjusted Portfolio Optimization

Lead Data Scientist
Lead Data Scientist

  • Developed a system to optimize portfolio risk: key risk KPIs (implied volatility, maximum drawdown) decreased on average 7.6%.
  • Delivered a system for option pricing using deep learning, leading to an improvement in average daily returns of 2.7%.
  • Managed end-to-end data processing, enhancing system performance through effective sourcing, preprocessing, and partitioning for model training and inference.
  • Utilized NLP techniques to analyze tweets, creating a feature store (embedding) for machine learning models that enhanced stock market understanding.
  • Developed a back and forward testing framework.

8 months
2020-06 - 2021-01

Developed a system to optimize portfolio risk

Data Scientist
Data Scientist
  • Developed a system to optimize portfolio risk: key risk KPIs (implied volatility, maximum drawdown) decreased on average 7.6%.
  • Delivered a system for option pricing using deep learning, leading to an improvement in average daily returns of 2.7%.
  • Managed end-to-end data processing, enhancing system performance through effective sourcing, preprocessing, and partitioning for model training and inference.
  • Utilized NLP techniques to analyze tweets, creating a feature store (embedding) for machine learning models that enhanced stock market understanding.
  • Developed a back and forward testing framework.
AXOVISION
1 year 8 months
2018-11 - 2020-06

develop a data pipeline

Junior Data Scientist
Junior Data Scientist
  • Collaborated with the team to build, from scratch, a Spark (Scala) data pipeline for IoT data from around 10,000 vehicles, amounting to around 100 GB of vehicle sensor data daily.
  • Developed a linear-regression model for driving ranking based on fuel efficiency.
  • Started the groundwork for a predictive maintenance system, conducting extensive data analysis and connecting with stakeholders from non-technical backgrounds.
DAIMLER
Lisbon, PT
4 months
2016-06 - 2016-09

various data cleaning procedures

Data Scientist (Intern) Python Pandas Airflow ...
Data Scientist (Intern)
  • Collaborated with the team to engineer various data cleaning procedures for financial and transactional data using Python, Pandas and Airflow.
  • Developed an NER system to identify key information in financial documents and PDFs, using OCR and Python.
Python Pandas Airflow PDFs OCR
EY
Dublin, IE

Education and training

2017

CS Machine Learning & Robotics

MSc

Instituto Superior Técnico, University of Lisbon, Lisbon, PT

Skills

Top skills

Data Scientist

Products / Standards / Experience / Methods

Technical:

  • Pandas
  • Scikit-learn
  • Keras
  • TensorFlow
  • CLTV
  • Churn
  • Python
  • BigQuery
  • DBT
  • Snowflake
  • Dash
  • Model Interpretability
  • Tree models
  • AWS
  • GCP
  • Databricks
  • Scala
  • PySpark
  • Spark
  • Kubernetes
  • Docker
  • Excel
  • Datadog
  • Looker
  • Tableau
  • A/B Testing
  • KPI
  • Optimization
  • MILP
  • LP
  • Deep Learning
  • XGBoost
  • LightGBM
