Hi, I'm DAISY CHEBET

Impact-Driven Data Scientist | Operations Automation Specialist @Daisythedatascientist

WHAT I DO

I'm Daisy Chebet, a data scientist and operations automation enthusiast with a background in Biostatistics. With 5+ years of experience, I specialize in transforming messy, complex datasets into clear insights and streamlined workflows. My work bridges data science and operational impact, from building production-ready algorithms to designing intuitive data tools that drive real-world decision-making.

My experience spans NGOs, startups, and independent consulting projects, where I've led data initiatives that improved efficiency, supported field operations, and enabled evidence-based planning. I'm passionate about using data to solve meaningful problems, especially those tied to sustainability, public health, and social impact.

Outside of work, I’m a quiet thinker who finds joy in running, painting, exploring new tech tools, and helping others learn without fear. This portfolio is a growing reflection of my journey from apprentice to impact-driven analyst and now to a confident, independent, and purpose-driven data scientist.

Have fun browsing through the content, and thank you for visiting!

What My Parents Think I Do

While I wish I was this awesome, the truth is... I spend most of my time cleaning data, writing SQL queries, and waiting for models to finish training 😅.

Services

Data Visualization & Dashboarding

I design intuitive and insightful dashboards using tools like Plotly, Seaborn, Tableau, and Google Data Studio. Whether you need executive summaries or operational tracking, I tailor visuals to match your goals.

Data Cleaning & Automation

Tired of messy spreadsheets? I build Python scripts that clean, organize, and format your data automatically, saving hours of manual work and reducing errors. Ideal for annotation pipelines, audits, or client-facing reports.

Predictive Modeling

From health risk prediction to resource optimization, I develop models that turn historical data into actionable forecasts. I use scikit-learn, XGBoost, and statsmodels to build reliable, interpretable models.

SQL & Data Pipeline Automation

I automate SQL workflows end to end, from cleaning and processing data to pushing it to BigQuery and integrating with lightweight data engineering tools. I specialize in building reproducible pipelines that support decision-making and reduce manual work.

Tools & Technical Skills

  • Languages: Python, SQL, R
  • Libraries: pandas, NumPy, scikit-learn, Matplotlib, Seaborn, XGBoost
  • Visualization: Tableau, Power BI, Plotly, Google Data Studio
  • Version Control: Git, GitHub
  • Other Tools: Jupyter, Google Colab, Excel, BigQuery, Azure, ArcGIS