STATUS: ACTIVE // Q2_2026_AVAILABILITY

Rigorous Statistics.
Fast Systems.

I help manufacturing, pharma, defense, and finance companies make better decisions — through rigorous forecasting, statistical analysis, and production-grade ML systems. PhD in Mathematics.

ID_00

About Me

I am a passionate Data Scientist and Systems Engineer with a deep-rooted fascination for High-Performance Computing (HPC) and rigorous statistical methodology.

My journey began with a strong foundation in statistics, leading to a Ph.D. where I explored complex data structures and inference models. However, I quickly realized that theoretical models are only as good as the systems that run them. This drove me to bridge the gap between abstract mathematics and bare-metal performance.

Today, I specialize in dismantling slow, legacy data pipelines (often written in Python or R) and rebuilding them in systems-level languages like Rust and C++, combined with modern analytical engines like DuckDB. My goal is to create data architectures that are not just accurate but also fast and able to scale with growing data volumes.

> Ph.D. in Statistics
> Systems Engineering Advocate
> Open Source Contributor
Dr. Simon Müller
ID: SM_0x1A4

Why Clients Engage Me

Challenge_01

Forecasts are unreliable at scale

I build forecasting systems that learn and improve automatically — already proven across 500k+ products, reducing errors by 25–35% and enabling proactive planning.

Challenge_02

Data quality issues erode trust

I create automated monitoring that catches anomalies early and delivers clear reports — so your team can trust the numbers behind every decision.

Challenge_03

Good analysis never reaches production

I turn prototypes into reliable, fast systems your team can depend on daily, not one-off analyses that sit on someone's laptop.

Challenge_04

Regulated industries demand proof

I deliver validated statistical methods with full audit trails that satisfy regulators — so your quality and compliance teams can sleep at night.

How I Work

Every engagement follows a clear, structured process — so you always know what to expect.

01

Discovery

I listen first. We define the problem, review your data landscape, and agree on what success looks like.

02

Analysis & Design

I explore the data, prototype statistical approaches, and present a clear technical plan — no black boxes.

03

Implementation

I build production-grade systems — tested, documented, and deployed to your infrastructure with CI/CD.

04

Handover & Support

Your team gets full ownership — training, documentation, and ongoing support to keep everything running.

ID_01

Core Expertise

Statistics & ML

Turning complex data into defensible decisions — Bayesian inference, Functional Data Analysis, and ML methods grounded in a PhD in Mathematics.

[R][Python][Julia]

Forecasting

Reducing forecast error and enabling proactive planning — hierarchical demand models, causal inference, and automated retraining deployed at scale.

[R][Python][Rust][SQL]

Software Engineering

Shipping statistical methods as reliable, fast software — production Rust, C++, and Python systems that teams can actually depend on.

[Rust][C++][Python][WASM][Claude Code]
Working Knowledge

Cloud & MLOps

Getting models out of notebooks and into production — automated pipelines, containerization, and monitoring on AWS, Azure, and GCloud.

[AWS][Azure][GCloud][Docker][MLflow][CI/CD][Git]

Data Architecture

Consolidating fragmented data sources into a single source of truth — fast analytical platforms that feed reliable data into models and dashboards.

[DuckDB][SAP][MS Dynamics][Spark]

Visualization

Making results accessible to decision-makers — interactive dashboards and automated reports that translate analytics into action.

[Shiny][Marimo]
ID_02

Enterprise Deployments

Delivering measurable impact — reduced forecast errors, automated pipelines, and data-driven decisions across pharma, defense, energy, manufacturing, and finance.

Forecast Error Reduction: SKU-level, automated monthly retraining
Analysis Acceleration: From prototype to production-ready pipelines
Automated Quality Control: Contract validation across global entities
ACTIVE #Manufacturing #Forecasting #Consulting

Demand Forecasting Strategy & Enablement

Advising a manufacturing company on demand forecasting in Kinaxis Maestro: selecting forecast metrics and levels, evaluating forecast quality, and making full use of the platform's forecasting capabilities.

[Kinaxis Maestro]
ACTIVE #CMC #Pharma #GMP

CMC Statistics

Rigorous statistical analysis of spectral data in GMP-validated CMC environments for pharmaceutical manufacturing.

[R][Python]
ACTIVE #Defense #Forecasting #Finance

Financial Forecasting

Automated ML pipeline for Order Intake, Revenue, and Cash Flow prediction, used by the controlling department.

[Python][Docker]
ACTIVE #Manufacturing #Data Quality

Automated Input Data Quality

Automated data quality pipeline validating input data before it feeds into the forecasting engine, with reporting on AWS S3.

[Python][AWS S3][Automated Reporting]
ACTIVE #Manufacturing #SupplyChain

SKU-Level Demand Forecasting

Production ML pipeline on AWS SageMaker for SKU-level demand forecasting, with a clear data pipeline that lets the local data scientist run reproducible experiments.

[Python][AWS SageMaker][Kinaxis][MLflow]
ACTIVE #Manufacturing #SupplyChain #Demand Planning #Forecasting

Automated Demand Planning Workflow

Scheduled monthly demand planning pipeline running twice per cycle — one run for APO data and one for Kinaxis data — embedded in a strict demand planning workflow.

[Python][AWS SageMaker][Kinaxis][APO][MLflow][Docker]
#Manufacturing #Forecasting #Supply Chain

Spare Part Demand Forecasting

Specialised forecasting engine for spare parts, handling intermittent demand patterns and optimising safety stock levels to minimise stockouts and improve equipment uptime.

[Python][AWS][Statistical Modelling]
#LaserTech #ERP #DataEng #Risk

Purchasing Risk Analytics

ETL consolidation of MS Dynamics 365 data sources into a unified risk scoring engine for Purchasing and Sales exposure assessment.

[Python][MS Dynamics 365][SQL][Power BI]
#Manufacturing #Forecasting

End-of-Month Revenue Forecasting Engine

Rust-to-WebAssembly compiled forecasting engine embedded in Google Sheets, used by controlling for end-of-month revenue forecasting across subsidiaries.

[Rust][WebAssembly][Google Workspace][GitHub Actions]
#CMC #GMP #Pharma

Principal CMC Statistician & Quality Strategy

Statistical strategy for BioPharma CMC — rigorous statistical analysis, process validation, and equivalence testing under GMP.

[R][Process Validation][Root Cause Analysis]
#Manufacturing #Forecasting

Next-Order Prediction Engine

Containerized predictive analytics pipeline on MS Azure that forecasts next-order dates using feature-engineered customer and territory models, run on automated weekly and monthly schedules.

[Python][Docker][MS Azure]
#Manufacturing #Forecasting

Demand Forecasting

Statistical time-series models in R predicting customer order windows, with dockerized Quarto reporting pipelines deployed on Azure Cloud.

[R][Docker][Azure]
#Energy #IoT #Maintenance

Solar Asset Predictive Maintenance

Hybrid predictive maintenance combining photovoltaic generation models with statistical time-series forecasting and R Shiny monitoring dashboards.

[R][Shiny][IoT Sensors]
#Financial Services #Insurance #Risk

Advanced Analytics Consulting

Established Data Science practice — Credit Risk scoring models and Insurance Pricing engines on MS Azure with team mentoring and agile integration.

[Python][R][Azure ML][SQL Server]
#Energy #Forecasting

Smart Meter Big Data Forecasting

Distributed forecasting platform on Azure Databricks processing smart meter readings with Spark MLlib model training.

[Spark][Azure Databricks][Python][R][Deep Learning]
#Financial Services #Data Quality

Contract Data Quality Assurance

High-performance R/C++ package implementing Mahalanobis distance and Isolation Forest methods for multivariate outlier detection in global contract data.

[R][C++]

Have a similar challenge?

Whether it's demand forecasting, data quality, or getting statistical models into production — let's talk about how I can help.

ID_03

Research & Development

Building the next generation of tools — from inventory optimization to deep learning forecasting — to solve harder problems faster.

ACTIVE #AI #RAG

Magpie

High-performance Rust RAG framework for document ingestion, chunking, embedding, and retrieval with async pipeline orchestration.

[Rust]
ACTIVE #Supply Chain #Optimization #Inventory Optimization

Inventory Optimization Engine

Deterministic and probabilistic inventory optimization framework in Rust — safety stock, reorder points, replenishment policies, and Monte Carlo simulation.

[Rust]
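
To illustrate the kind of safety stock and reorder point calculation named above, here is a minimal sketch. It is written in Python for brevity and is not the Rust framework's code or API. The function names `safety_stock` and `reorder_point` are hypothetical, and the sketch assumes independent, normally distributed demand per period, a fixed lead time, and a cycle service level target.

```python
from math import sqrt
from statistics import NormalDist


def safety_stock(service_level: float, demand_std: float, lead_time: float) -> float:
    """Safety stock for a target cycle service level.

    Assumes per-period demand is independent and normally distributed
    and that the lead time (in periods) is fixed.
    """
    if not 0.0 < service_level < 1.0:
        raise ValueError("service_level must be strictly between 0 and 1")
    # z is the standard normal quantile for the target service level.
    z = NormalDist().inv_cdf(service_level)
    # Demand variability over the lead time scales with sqrt(lead_time).
    return z * demand_std * sqrt(lead_time)


def reorder_point(mean_demand: float, lead_time: float, stock_buffer: float) -> float:
    """Expected demand over the lead time plus the safety stock buffer."""
    return mean_demand * lead_time + stock_buffer


if __name__ == "__main__":
    # Hypothetical inputs: 95% service level, demand std of 5 units per
    # period, mean demand of 10 units per period, lead time of 4 periods.
    buffer = safety_stock(0.95, demand_std=5.0, lead_time=4.0)
    print(round(buffer, 2), round(reorder_point(10.0, 4.0, buffer), 2))
```

A probabilistic approach such as Monte Carlo simulation would replace the normal-demand assumption with simulated demand and lead-time scenarios; this closed-form version only covers the simplest deterministic-lead-time case.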
ACTIVE #Deep Learning #Forecasting

Chronos in Rust

Rust implementation of Amazon Chronos-2 time-series forecasting on the Burn deep learning framework with multi-backend GPU/CPU inference and finetuning.

[Rust][Burn][CUDA][WebGPU]
ACTIVE #ML

Machine Learning in Rust

Scikit-learn-inspired machine learning library for Rust built on ndarray — preprocessing, trees, ensembles, clustering, and cross-validation pipelines.

[Rust]

DUCKDB_EXTENSIONS

ACTIVE #DuckDB #LLM #SQL

Anofox Context

LLM-augmented SQL for DuckDB — call any OpenAI-compatible model from SQL to classify, extract, and enrich data with schema-typed results and atomic execution.

[C++][DuckDB][LLM APIs]
ACTIVE #DuckDB #What-If Analysis #S&OP

Anofox Scenario

Git-like branching for DuckDB — isolated what-if scenarios with copy-on-write storage, row-level diffs, immutable snapshots, and embedded audit trails in a single .duckdb file.

[C++][DuckDB]
ID_04

Open Source

Giving back to the community — production-tested libraries and tools available on GitHub.

STATISTICS_&_ML

ACTIVE #Finance #ML

Financial Machine Learning Toolkit

Python bindings for the mlfinance AFML toolkit — a high-performance Rust implementation of methods from Advances in Financial Machine Learning by Marcos López de Prado.

[Rust][Python]
ACTIVE #Functional Data Analysis

Functional Data Analysis in R

R interface for high-performance Functional Data Analysis (FDA) powered by a Rust core for ultra-fast computation.

[R][Rust]
ACTIVE #Finance #Econometrics

Event Study

Comprehensive R package for conducting event studies in finance and economics with high-speed execution.

[R][C++]
ACTIVE #ML #Statistics #Decision Support

Case Based Reasoning for Medicine

Implementation of Case-Based Reasoning (CBR) systems for intelligent decision support and pattern matching.

[R][C++]

DUCKDB_EXTENSIONS

Featured

anofox-forecast

A Rust-native DuckDB extension providing a complete time-series forecasting toolkit via SQL. Integrates 32 models including AutoARIMA, AutoETS, TBATS, MSTL, and intermittent demand methods (Croston, ADIDA, IMAPA). Supports hierarchical time series, expanding/sliding window cross-validation, conformal prediction intervals, changepoint detection, and 76+ tsfresh-compatible feature extraction functions with native DuckDB parallelization.

[C++][Rust][DuckDB]
PERFORMANCE
AutoARIMA executes 912x faster with 1.9x less memory than equivalent Python implementations. Handles millions of series through parallel processing with SQL-native cross-validation and conformal prediction intervals.
query.sql
-- Forecast 10,000 products in one query
SELECT * FROM ts_forecast_by(
'sales', item_id, date, quantity,
'AutoARIMA', 12, '1M',
MAP{'seasonal_period': '12'}
);
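
As background on the intermittent demand methods listed above, here is a minimal sketch of classic Croston's method. It is written in Python for illustration only and is not the extension's implementation. The function name `croston`, the smoothing parameter `alpha`, and the choice to initialise both estimates from the first non-zero demand are assumptions.

```python
from typing import Sequence


def croston(demand: Sequence[float], alpha: float = 0.1) -> float:
    """One-step-ahead demand-rate forecast using classic Croston's method.

    Non-zero demand sizes and the intervals between them are smoothed
    separately with simple exponential smoothing; the forecast is their ratio.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    size = None        # smoothed size of non-zero demands
    interval = None    # smoothed number of periods between non-zero demands
    periods_since = 0  # periods elapsed since the last non-zero demand

    for y in demand:
        periods_since += 1
        if y > 0:
            if size is None:
                # Initialise from the first observed non-zero demand.
                size = float(y)
                interval = float(periods_since)
            else:
                # Update both estimates only when demand occurs.
                size += alpha * (y - size)
                interval += alpha * (periods_since - interval)
            periods_since = 0

    if size is None:
        return 0.0  # no demand observed at all
    return size / interval


if __name__ == "__main__":
    # Demand of 3 units every third period gives a rate of 1 unit per period.
    print(croston([0, 0, 3, 0, 0, 3], alpha=0.5))  # 1.0
```

Because the estimates change only in periods with demand, the forecast stays flat through runs of zeros, which is what makes the method suitable for sparse series such as spare parts.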
ACTIVE #DuckDB #Statistics #SQL

anofox-statistics

DuckDB extension for in-database regression (OLS, Ridge, Elastic Net, Quantile), hypothesis testing, and diagnostics — validated against R's statistical packages.

[C++][Rust][DuckDB]
ACTIVE #DuckDB #Data Quality #Anomaly Detection

anofox-tabular

DuckDB extension providing 81 SQL functions for data validation, anomaly detection (Isolation Forest, DBSCAN, OutlierTree), PII masking, and data diffing.

[C++][DuckDB]

ROBOTICS

#C++ #Robotics #Kalman Filter

Autonomous Robot Localization

Implementation of Kalman and Particle Filters in C++ for real-time robot navigation and environment mapping.

[C++][Embedded Development]
PARTNERS_&_CLIENTS

Trusted by Global Industry Leaders

Boehringer Ingelheim
Kärcher SE & Co. KG
Hensoldt AG
Festo
Daimler Financial Services
E.ON
Schott
SCANLAB

About Me

I'm Simon Müller — a mathematician turned systems engineer with over a decade of experience helping companies make better decisions through data.

After completing my PhD in Mathematics, I spent years working at the intersection of rigorous statistics and production software — first in academia, then in consulting for some of Europe's largest manufacturers, pharma companies, and financial institutions.

What sets me apart: I don't just build models — I ship them. My clients get production systems they can depend on, not prototypes that need another team to productionize. Whether that's a Bayesian forecasting engine running on AWS SageMaker or a Rust library compiled to WebAssembly running in the browser.

Based in Germany. Available for remote and on-site engagements across Europe.

PhD in Mathematics
12+ Years Experience
13+ Enterprise Projects
5 Industry Sectors
ID_05

Initialize Connection

Have a forecasting, data quality, or statistical challenge? Let's discuss how I can help — from short consulting engagements to full system implementation.

Voice_Channel
+49 160 6393263
Geographic_Node
Biberach an der Riß, DE

