Hey, I'm Daniel
Technical Leader · ML Engineer · Bioinformatics Scientist
I'm a machine learning engineer and technical leader with a BS in Computer Science from DePaul University and an MS in Data Science from the University of Chicago. I specialize in designing and scaling deep learning systems for large-scale bioinformatics problems. I'm focused on building high-impact AI teams and translating cutting-edge research into real-world solutions.
Connect
Find me online
Check out my work, models, and open-source projects.
Build
Agents
AI agents I've built — coming soon.
Craft
Prompts
Prompts I use for ML workflows.
# Role & Objective
Act as a Senior Data Scientist. I am evaluating a binary classification model built using
Python and scikit-learn. I need you to generate [CHOOSE ONE: the Python evaluation code / a detailed
analysis of the provided metric outputs] based on the configuration below.
# Context
- Project/Objective: [e.g., Predicting customer churn for a SaaS platform]
- Positive Class (1): [e.g., Churned]
- Negative Class (0): [e.g., Retained]
- Class Imbalance: [e.g., Highly imbalanced, 90% Negative / 10% Positive]
- Cost of Errors: [e.g., False Positives trigger unnecessary retention offers (costly). False Negatives are missed churn events (lost revenue).]
# Required Evaluation Metrics
Please structure your response to address the following metrics. For each, explain what the results indicate regarding the specific objective outlined above.
- Cumulative Gains Curve: To assess how effectively we can prioritize the top percentiles of our data based on model confidence.
- ROC Curve & AUC: To evaluate how well the model separates the two classes overall, relative to a random classifier (AUC = 0.5).
- Confusion Matrix (at threshold = [e.g., 0.5]): To view the exact distribution of False Positives vs. False Negatives.
- Classification Report: To review Precision, Recall, and F1-scores for each class.
- Precision-Recall (PR) Curve & AUC-PR: To evaluate performance specifically on the positive class, given the class imbalance, and ensure the model isn't generating excessive False Positives.
- Lift Curve: To determine exactly how many times better the model is than a random guess at specific data deciles.
- Brier Score (or Log Loss / Cross-Entropy): To measure probability calibration and verify that the output scores are mathematically reliable for automated downstream pipelines.
- Kolmogorov-Smirnov (KS) Statistic: To measure the maximum degree of separation between the positive and negative class probability distributions.
# Input Data / Instructions
Use sklearn.metrics, matplotlib, and scikit-plot or yellowbrick to calculate and plot all the metrics requested above. Assume I have y_true, y_pred (hard predictions), and y_proba (probability predictions) already loaded in my pipeline.
[PASTE METRIC OUTPUTS/SCORES HERE]
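For reference, here is a minimal sketch of the kind of evaluation code the binary classification prompt above asks for. It is not part of the prompt. The helper names `evaluate_binary` and `gains_and_lift` and the synthetic `y_true` / `y_proba` arrays are illustrative assumptions, average precision stands in for AUC-PR, and plotting is left out to keep the example short.

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.metrics import (
    average_precision_score,
    brier_score_loss,
    classification_report,
    confusion_matrix,
    roc_auc_score,
)


def evaluate_binary(y_true, y_proba, threshold=0.5):
    """Hypothetical helper: collect the scalar metrics listed in the prompt."""
    y_true = np.asarray(y_true)
    y_proba = np.asarray(y_proba)
    y_pred = (y_proba >= threshold).astype(int)  # hard predictions at the threshold

    # KS statistic: largest gap between the score CDFs of the two classes.
    ks = ks_2samp(y_proba[y_true == 1], y_proba[y_true == 0]).statistic

    return {
        "roc_auc": roc_auc_score(y_true, y_proba),
        "pr_auc": average_precision_score(y_true, y_proba),  # PR-curve summary
        "brier": brier_score_loss(y_true, y_proba),
        "ks": ks,
        "confusion_matrix": confusion_matrix(y_true, y_pred),
        "report": classification_report(y_true, y_pred, zero_division=0),
    }


def gains_and_lift(y_true, y_proba, n_bins=10):
    """Hypothetical helper: cumulative gains and lift per score-ranked bin."""
    y_true = np.asarray(y_true)
    order = np.argsort(-np.asarray(y_proba), kind="stable")  # highest scores first
    bins = np.array_split(y_true[order], n_bins)
    captured = np.cumsum([b.sum() for b in bins])  # positives found so far
    seen = np.cumsum([len(b) for b in bins])       # rows inspected so far
    gains = captured / y_true.sum()
    lift = gains / (seen / len(y_true))            # gains relative to random targeting
    return gains, lift


# Synthetic stand-in data (assumption): roughly 10% positives with noisy scores.
rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.1, size=2000)
y_proba = np.clip(0.1 + 0.5 * y_true + rng.normal(0, 0.15, size=2000), 0, 1)

metrics = evaluate_binary(y_true, y_proba)
gains, lift = gains_and_lift(y_true, y_proba)
print(metrics["report"])
print("Lift by decile:", np.round(lift, 2))
```

The last decile always has a cumulative gain and lift of 1.0, since by then every row has been inspected; the interesting values are in the first few deciles.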
# Role & Objective
Act as a Senior Data Scientist. I am evaluating a multi-class classification model built
using Python and scikit-learn. I need you to generate [CHOOSE ONE: the Python evaluation code / a detailed
analysis of the provided metric outputs] based on the configuration below.
# Context
- Project/Objective: [e.g., Categorizing support tickets by severity level for automated routing]
- Classes:
- Class 0: [e.g., Critical / Outage]
- Class 1: [e.g., High Priority]
- Class 2: [e.g., Medium Priority]
- Class 3: [e.g., Low / Informational]
- Class Imbalance: [e.g., Moderate imbalance. Classes 2 and 3 make up 70% of the data; Classes 0 and 1 make up 30%.]
- Cost of Errors: [e.g., Misclassifying a 'Critical' ticket as 'Low' causes SLA breaches. Confusing 'High' with 'Medium' is suboptimal but manageable.]
# Required Evaluation Metrics
Please structure your response to address the following metrics. For each, explain what the results indicate regarding the specific objective and cost of errors outlined above.
- Classification Report (Macro & Weighted Averages): To review Precision, Recall, and F1-scores for individual classes, and assess how the model handles minority classes via the Macro average.
- N x N Confusion Matrix: To identify specific misclassifications and class overlap (e.g., which specific categories are being confused with each other).
- One-vs-Rest (OvR) ROC AUC: To evaluate how well the model isolates each individual class from the rest of the dataset.
- Cohen's Kappa: To verify that the model's overall accuracy is driven by genuine learning rather than random chance or majority-class bias.
- Multi-Class Log Loss: To measure probability calibration and ensure the predict_proba distributions across all classes are mathematically reliable.
# Input Data / Instructions
Use sklearn.metrics, matplotlib, and seaborn (for the heatmap) to calculate and plot all the metrics requested above. Ensure parameters like average='macro' or multi_class='ovr' are set correctly. Assume I have y_true, y_pred (hard predictions), and y_proba (probability predictions array of shape (n_samples, n_classes)) already loaded.
[PASTE METRIC OUTPUTS/SCORES HERE]
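For reference, a minimal sketch of the code the multi-class prompt above might produce. It is not part of the prompt. The helper name `evaluate_multiclass` and the synthetic four-class data are assumptions for illustration, and the seaborn heatmap is omitted for brevity.

```python
import numpy as np
from sklearn.metrics import (
    classification_report,
    cohen_kappa_score,
    confusion_matrix,
    log_loss,
    roc_auc_score,
)


def evaluate_multiclass(y_true, y_proba, labels):
    """Hypothetical helper: compute the metrics listed in the prompt."""
    y_true = np.asarray(y_true)
    y_proba = np.asarray(y_proba)
    # Hard predictions: the label whose column has the highest probability.
    y_pred = np.asarray(labels)[y_proba.argmax(axis=1)]
    return {
        "report": classification_report(
            y_true, y_pred, labels=labels, zero_division=0
        ),
        "confusion_matrix": confusion_matrix(y_true, y_pred, labels=labels),
        "ovr_roc_auc_macro": roc_auc_score(
            y_true, y_proba, multi_class="ovr", average="macro", labels=labels
        ),
        "kappa": cohen_kappa_score(y_true, y_pred),
        "log_loss": log_loss(y_true, y_proba, labels=labels),
    }


# Synthetic stand-in data (assumption): four imbalanced classes, softmax scores.
rng = np.random.default_rng(0)
labels = [0, 1, 2, 3]
y_true = rng.choice(labels, size=1000, p=[0.1, 0.2, 0.3, 0.4])
logits = rng.normal(0, 1, size=(1000, 4))
logits[np.arange(1000), y_true] += 2.0  # nudge the true class upward
y_proba = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

results = evaluate_multiclass(y_true, y_proba, labels)
print(results["report"])
print(results["confusion_matrix"])
```

Passing `labels` explicitly keeps the columns of `y_proba` aligned with the class order, which matters when a class is missing from a particular evaluation split.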
# Role & Objective
Act as a Senior Data Scientist specializing in Time-Series Forecasting. I am trying to determine the optimal architectural approach for a forecasting problem. I need to decide whether to use a classical statistical model (e.g., ARIMA, SARIMA, VAR) or a deep learning model (e.g., LSTM, GRU).
Please generate [CHOOSE ONE: the Python diagnostic code / a detailed analysis of the provided diagnostic outputs] based on the configuration below.
# Context
- Project/Objective: [e.g., Forecasting monthly product demand across regional warehouses to optimize inventory allocation.]
- Data Frequency: [e.g., Monthly data]
- Data Volume: [e.g., 20 years of historical data, ~240 observations per series]
- Multivariate/Univariate: [e.g., Multivariate — predicting demand while using promotional spend and weather data as exogenous features]
# Required Evaluation Diagnostics
Please structure your response to address the following diagnostics. For each, explain what the results indicate regarding the choice between a classical statistical model (e.g., ARIMA) and a neural network (LSTM).
- Stationarity Checks (ADF & KPSS Tests): To determine if the data has a unit root or deterministic trend, and if standard differencing can make it stationary.
- ACF & PACF Analysis: To assess the strength and clarity of linear dependencies across lags.
- Seasonal Decomposition (STL): To evaluate the complexity of the seasonal patterns and the size/structure of the residual noise.
- Volatility / Non-Linearity (ARCH/Ljung-Box): To detect conditional heteroskedasticity or complex non-linear patterns that standard linear models cannot capture.
- Dimensionality & Viability Check: Based on the data volume and number of features provided in the context, explicitly state whether an LSTM is prone to overfitting here, or if the dataset is robust enough to support deep learning.
# Input Data / Instructions
Use statsmodels, pandas, and matplotlib. The code should calculate the ADF and KPSS p-values, plot the ACF/PACF, perform an STL decomposition, and run an ARCH test for volatility. Assume I have a pandas DataFrame df with a datetime index and my target variable in a column named target.
[PASTE METRIC OUTPUTS/SCORES/PLOT DESCRIPTIONS HERE]
# Role & Objective
Act as a Senior Data Scientist. My core objective is to answer the question: "What is the best, most natural number of groups (k) found in this dataset?" I am using unsupervised learning (K-Means clustering) to segment my data. I need you to write complete, executable Python code that scales the data, determines the optimal number of clusters by mathematically calculating the inflection point of an Elbow Curve, generates the final model, and labels the dataset.
# Context
- Project/Objective: [e.g., Discovering natural customer segments based on purchasing behavior and engagement metrics.]
- Data Characteristics: [e.g., 5 continuous numeric features: Avg Order Value, Purchase Frequency, Days Since Last Purchase, Session Duration, and Pages Viewed. No missing values.]
- Business Value: [e.g., By accurately segmenting customers, we can create tailored marketing campaigns for each group rather than using a one-size-fits-all strategy.]
# Required Workflow & Outputs
Please generate a single, cohesive Python script that performs the following steps:
- Pre-processing: Apply StandardScaler to the features, as K-Means is a distance-based algorithm and requires normalized data.
- Determine Optimal 'k' (Elbow Method & Silhouette):
- Iterate through a range of clusters (e.g., k=2 to k=10).
- Calculate the Within-Cluster Sum of Squares (WCSS/Inertia) for each 'k'.
- Calculate the Silhouette Score for each 'k' as a secondary validation metric.
- Calculate the Inflection Point: Do not rely on visual inspection. Programmatically calculate the exact inflection point (the "elbow") of the WCSS curve. You may use the kneed library (KneeLocator) or a distance-to-line mathematical implementation.
- Visualization (seaborn):
- Create a professional seaborn (sns) line plot of the Elbow Curve (WCSS vs. Number of Clusters).
- Add a vertical dashed line or a distinct marker on the plot to explicitly highlight the calculated inflection point.
- Final Model & Labeling:
- Instantiate a final K-Means model using the programmatically calculated optimal 'k'.
- Fit the model and predict the cluster labels.
- Append these labels as a new column named Cluster_Label to the original dataframe.
- Cluster Profiling: Group the original dataframe by Cluster_Label and calculate the mean for each feature. Print this summary table so I can interpret the real-world characteristics of each group.
# Input Data / Instructions
Use pandas, scikit-learn, matplotlib.pyplot, and seaborn. Assume my data is already loaded into a pandas DataFrame named df and the features I want to cluster on are in a list named features_to_cluster.
[PASTE CLUSTER PROFILE / SILHOUETTE SCORES HERE]
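For reference, a minimal sketch of the workflow the clustering prompt above describes. It is not part of the prompt. It uses the distance-to-line option rather than the kneed library, the helper name `elbow_by_distance` and the synthetic blob data are assumptions for illustration, and the seaborn plot is omitted for brevity.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def elbow_by_distance(ks, wcss):
    """Hypothetical helper: k whose (k, WCSS) point is farthest from the chord
    joining the first and last points of the curve."""
    ks = np.asarray(ks, dtype=float)
    wcss = np.asarray(wcss, dtype=float)
    # Rescale both axes to [0, 1] so WCSS magnitude does not dominate.
    x = (ks - ks.min()) / (ks.max() - ks.min())
    y = (wcss - wcss.min()) / (wcss.max() - wcss.min())
    # Perpendicular distance from each point to the line through the endpoints.
    dx, dy = x[-1] - x[0], y[-1] - y[0]
    dist = np.abs(dy * (x - x[0]) - dx * (y - y[0])) / np.hypot(dx, dy)
    return int(ks[np.argmax(dist)])


# Synthetic stand-in data (assumption): 5 numeric features, 4 true groups.
X, _ = make_blobs(n_samples=600, centers=4, n_features=5, random_state=0)
features_to_cluster = [f"f{i}" for i in range(5)]
df = pd.DataFrame(X, columns=features_to_cluster)

# Scale, then sweep k and record WCSS (inertia) and silhouette for each value.
X_scaled = StandardScaler().fit_transform(df[features_to_cluster])
ks = list(range(2, 11))
wcss, silhouettes = [], []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    wcss.append(km.inertia_)
    silhouettes.append(silhouette_score(X_scaled, km.labels_))

best_k = elbow_by_distance(ks, wcss)

# Final model, labels appended to the original frame, and per-cluster means.
final_model = KMeans(n_clusters=best_k, n_init=10, random_state=0)
df["Cluster_Label"] = final_model.fit_predict(X_scaled)
profile = df.groupby("Cluster_Label")[features_to_cluster].mean()

print("Elbow k:", best_k)
print("Best silhouette k:", ks[int(np.argmax(silhouettes))])
print(profile)
```

Comparing the elbow k with the k that maximizes the silhouette score is a cheap sanity check: when the two disagree, the data may not have a single natural grouping.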