I'm Daisy Chebet, a Data Scientist and operations automation enthusiast with a background in Biostatistics. With 5+ years of experience, I specialize in transforming messy, complex datasets into clear insights and streamlined workflows. My work bridges data science and operational impact, from building production-ready algorithms to designing intuitive data tools that drive real-world decision-making.
My experience spans NGOs, startups, and independent consulting projects, where I've led data initiatives that improved efficiency, supported field operations, and enabled evidence-based planning. I'm passionate about using data to solve meaningful problems, especially those tied to sustainability, public health, and social impact.
Outside of work, I’m a quiet thinker who finds joy in running, painting, exploring new tech tools, and helping others learn without fear. This portfolio is a growing reflection of my journey from apprentice to impact-driven analyst, and now a confident, independent, and purpose-driven data scientist.
Have fun browsing the content, and thank you for visiting!
In reality, I’m building predictive models that learn from patterns in data to recommend smarter decisions.
While I wish I were this awesome, the truth is... I spend most of my time cleaning data, writing SQL queries, and waiting for models to finish training 😅.
Services
Data Visualization & Dashboarding
I design intuitive and insightful dashboards using tools like Plotly, Seaborn, Tableau, and Google Data Studio. Whether you need executive summaries or operational tracking, I tailor visuals to match your goals.
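As a small illustration of the kind of dashboard-ready chart I build, here is a minimal Matplotlib sketch. The team names and task counts are invented for the example:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

# Hypothetical operational metric: completed tasks per team
teams = ["Field", "Lab", "Admin"]
completed = [42, 31, 17]

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(teams, completed, color="#4C72B0")
ax.set_title("Completed tasks per team")
ax.set_ylabel("Tasks completed")
fig.tight_layout()
fig.savefig("tasks_per_team.png")  # drop into a report or dashboard
```

The same data feeds interactive versions in Plotly or Tableau; the point is matching the visual to the question being asked.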
Data Cleaning & Automation
Tired of messy spreadsheets? I build Python scripts that clean, organize, and format your data automatically, saving hours of manual work and reducing errors. Ideal for annotation pipelines, audits, or client-facing reports.
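A typical cleaning script normalizes text, fixes types, and drops duplicates in one pass. The sketch below uses pandas on a made-up messy table (column names and values are illustrative only):

```python
import pandas as pd

# Hypothetical messy spreadsheet: stray whitespace, inconsistent casing,
# numbers stored as text, and duplicate rows.
raw = pd.DataFrame({
    "Name": [" alice ", "BOB", "alice", " alice "],
    "Visit Date": ["2024-01-05", "2024-01-06", "2024-01-05", "2024-01-05"],
    "Score": ["10", "7", "10", "10"],
})

def clean(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # snake_case column names
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]
    # normalize text, parse dates, coerce numbers
    out["name"] = out["name"].str.strip().str.title()
    out["visit_date"] = pd.to_datetime(out["visit_date"])
    out["score"] = pd.to_numeric(out["score"])
    return out.drop_duplicates().reset_index(drop=True)

tidy = clean(raw)
```

Scripted rules like these run identically every time, which is what removes the manual-error risk.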
Predictive Modeling
From health risk prediction to resource optimization, I develop models that turn historical data into actionable forecasts. I use scikit-learn, XGBoost, and statsmodels to build reliable, interpretable models.
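The core workflow behind those forecasts is train/test splitting plus an interpretable baseline model. A minimal scikit-learn sketch on synthetic data (the features and labels here are simulated, not from a real project):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Simulated binary-outcome data: two numeric predictors, one label
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Hold out a test set so the reported accuracy is honest
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression().fit(X_train, y_train)
acc = accuracy_score(y_test, model.predict(X_test))
```

Logistic regression coefficients map directly to risk factors, which keeps the model explainable; XGBoost slots into the same pattern when more accuracy is needed.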
SQL & Data Pipeline Automation
I automate SQL workflows, from cleaning and processing data to pushing it to BigQuery and integrating with lightweight data engineering tools. I specialize in building reproducible pipelines that support decision-making and reduce manual work.
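The reproducible-pipeline pattern is simple: raw table in, SQL transform, summary table out. This sketch uses Python's built-in sqlite3 so it runs anywhere; the table and column names are invented, but the same CREATE TABLE ... AS SELECT pattern carries over to BigQuery:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw data with a NULL to filter out, then a derived summary table
conn.executescript("""
CREATE TABLE raw_visits (region TEXT, visits INTEGER);
INSERT INTO raw_visits VALUES
    ('north', 10), ('north', NULL), ('south', 5), ('south', 7);

CREATE TABLE visit_summary AS
SELECT region, SUM(visits) AS total_visits
FROM raw_visits
WHERE visits IS NOT NULL
GROUP BY region
ORDER BY region;
""")

rows = conn.execute("SELECT * FROM visit_summary").fetchall()
```

Because every step is a script, the whole pipeline can be re-run from scratch on a schedule instead of being patched by hand.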
Tools & Technical Skills
- Languages: Python, SQL, R
- Libraries: Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn, XGBoost
- Visualization: Tableau, Power BI, Plotly, Google Data Studio
- Version Control: Git, GitHub
- Other Tools: Jupyter, Google Colab, Excel, BigQuery, Azure, ArcGIS