Kai's Data Science Blog

Wishing you good luck in your search!

My MSc Journey at LSHTM

This post shares my two publications at LSHTM, showcasing my two student roles and achievements: Student and Digital Content Ambassador, and Student Liaison Officer at the Vaccine Centre. Songs That Accompany Our Study: In this blog post, I share the diverse music landscape shaped by my MSc Health Data Science classmates from across the globe. This post is also being highlighted on World Music Day, 21st June 2024, by the LSHTM channels on Instagram and Facebook.
2024-06-13

Where COVID Has Shifted Flu and RSV Seasons

This post shares a publication on Think Global Health, showcasing collaborative efforts with my teammates: Wan-Jen Lee, Giovanni Jacob, Dhihram Tenrisau, and Caitlynne McGaff. Read our article by navigating to Think Global Health. Data Visualisations. Epicurve in England: This graph has been recreated by Editor Allison Krugman. Selected Heatmaps: Heatmaps showing the log hospitalization rates for influenza and RSV throughout each season. Darker shades mean higher hospitalization rates. This graph has been recreated by Editor Allison Krugman.
2024-05-13

The Beginning of the End: What Is the Expected Notification Rate of the Polio Surveillance Indicator?

This post presents my research proposal for my MSc thesis: “Predicting the rate of Acute Flaccid Paralysis in different settings to support polio surveillance and elimination”. You will learn why non-polio Acute Flaccid Paralysis (NPAFP) is important in support of polio eradication. The current challenge involves interpreting AFP indicator values that exceed the targeted levels. Potential data sources and modelling methods are proposed to fill this knowledge gap. Aim and Objectives. Research Aim: To investigate and model the mechanisms underlying the notification of the key polio surveillance indicator, NPAFP rates, in endemic and outbreak settings.
2024-05-03

World Immunisation Week 2024

For people working in vaccination, their world awareness day is not a single day; it’s a week, spanning from 24th to 30th April each year. This year, WHO celebrates 50 years of the “Expanded Programme on Immunization (EPI)”. In LSHTM’s Vaccine Centre (VaC), our World Immunisation Week (WIW) topic this year is “responding to outbreaks”. We want to highlight the collective action needed to protect people from outbreaks and vaccine-preventable diseases.
2024-04-22

Equity vs. Equality: Modelling Vaccine Distribution Strategies for Outbreak Response Decision-Making

This group project is a collective effort, with contributions from my teammates Minn Thit Aung, Polly Nightingale, Simon Kent, and Xavier Dunn, listed alphabetically. This post aims to: determine the basic reproduction number (R0) of a hypothetical outbreak; assess the impact of various strategies for vaccination and school closures on cumulative cases and peak cases/timing; provide instructions for constructing a SEIR model using Berkeley Madonna; and offer simple R code for converting a ggplot into a GIF. Setting the Scene: Many countries have recently experienced the first wave of an influenza pandemic caused by the strain HuNz.
2024-03-30
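
The post itself builds its SEIR model in Berkeley Madonna. As a rough illustration of the same idea, here is a minimal Python sketch that steps a SEIR model forward with Euler integration and reports cumulative cases and peak timing. The function name and all parameter values are hypothetical and are not taken from the project.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, rec0, days, dt=0.1):
    """Euler integration of a closed-population SEIR model.

    beta: transmission rate, sigma: 1 / latent period, gamma: 1 / infectious period.
    With no births or deaths, the basic reproduction number is beta / gamma.
    """
    s, e, i, r = float(s0), float(e0), float(i0), float(rec0)
    n = s + e + i + r
    peak_infectious, peak_day = i, 0.0
    steps = round(days / dt)
    for step in range(1, steps + 1):
        new_exposed = beta * s * i / n * dt   # S -> E
        new_infectious = sigma * e * dt       # E -> I
        new_recovered = gamma * i * dt        # I -> R
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        if i > peak_infectious:
            peak_infectious, peak_day = i, step * dt
    return {
        "s": s, "e": e, "i": i, "r": r,
        "cumulative_cases": n - s,            # everyone who ever left S
        "peak_infectious": peak_infectious,
        "peak_day": peak_day,
    }


# Hypothetical values: R0 = beta / gamma = 2, 2-day latent period, 4-day infectious period.
result = simulate_seir(beta=0.5, sigma=0.5, gamma=0.25,
                       s0=9999, e0=0, i0=1, rec0=0, days=200)
```

Vaccination could be approximated in this sketch by moving part of `s0` into `rec0` before the run, and a school closure by lowering `beta`; both are simplifications for illustration only.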

Survival Analysis in Electronic Health Records Data

How do you apply time-to-event analysis to compare the impact of different prescriptions on death? This article examines the survival functions of two prescriptions using Kaplan-Meier and Cox models in an electronic health records (EHR) setting. EHR data are powerful real-world data. They are well suited to time-to-event analysis because they record sequential visits to primary and secondary care services. Take the UK’s OpenSAFELY, for instance: this secure, transparent, and open-source platform provides a Trusted Research Environment (TRE) for National Health Service (NHS) EHR data analysis, which supported urgent research into the COVID-19 emergency.
2024-03-23
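
To make the Kaplan-Meier idea concrete, here is a minimal pure-Python sketch of the product-limit estimator. The function name and the follow-up data are hypothetical; a real analysis would more likely use an established survival library.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each patient.
    events: 1 if the patient died at that time, 0 if censored.
    """
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up times in months for one prescription group
curve = kaplan_meier([2, 4, 4, 6, 9], [1, 1, 0, 1, 0])
```

Running the same function on each prescription group gives two curves to compare. Censored patients stay in the risk set until their last follow-up time but never count as deaths.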

Predicting Remission Status in Healthcare: A Comparative Analysis of Lasso, Random Forest, and AdaBoost Machine Learning Models

In this article, we will explore the following topics: using a regularized method (Lasso) for predictor selection; tuning hyperparameters for tree-based methods; employing the weighted sum of weak learners for a boosted classifier; and comparing predictive performance and predictor importance. Basic Methods: Imagine you possess a dataset comprising 30 biomarker variables and 5000+ observations. How would you use it to predict a patient’s remission status, i.e. remission or active disease? One common approach that may cross your mind is logistic regression, as illustrated below:
2024-02-10

Minimizing data linkage error in an ETL pipeline using R: an intersection of MIMIC III and ODK database

What can you learn from this article? Understand the concepts of data linkage, especially deterministic linkage. Address linkage error when combining MIMIC III (served from a PostgreSQL database) with an ODK database. Employ R to design the Extract, Transform, and Load (ETL) pipeline. Use a Quarto document to generate a report in PDF format. Concepts of Data Linkage: In a data scientist’s typical day, merging or joining tables is an inevitable task.
2024-01-06
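
The post implements its pipeline in R. As a language-neutral illustration of deterministic linkage, here is a minimal Python sketch that links two record sets only when their normalised key fields match exactly and the match is unique. The field names (`id`, `name`, `dob`) and the records are hypothetical and do not come from MIMIC III or ODK.

```python
def normalise(record):
    """Build the linkage key: case- and whitespace-insensitive name plus date of birth."""
    return (record["name"].strip().lower(), record["dob"])


def deterministic_link(left, right):
    """Link records whose keys agree exactly; ambiguous or missing matches stay unlinked."""
    index = {}
    for rec in right:
        index.setdefault(normalise(rec), []).append(rec)
    links, unmatched = [], []
    for rec in left:
        candidates = index.get(normalise(rec), [])
        if len(candidates) == 1:
            links.append((rec["id"], candidates[0]["id"]))
        else:
            # Zero candidates is a missed match; several is an ambiguous one
            unmatched.append(rec["id"])
    return links, unmatched


hospital = [
    {"id": "H1", "name": "Ana Silva ", "dob": "1980-02-01"},
    {"id": "H2", "name": "Ben Okoro", "dob": "1975-11-30"},
]
survey = [
    {"id": "S9", "name": "ana silva", "dob": "1980-02-01"},
    {"id": "S7", "name": "Ben Okoro", "dob": "1975-11-03"},
]
links, unmatched = deterministic_link(hospital, survey)
```

Here the second pair fails to link because of a transposed date of birth, which is the kind of linkage error that exact matching cannot recover on its own.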

Assessing the Non-inferiority of the Single-dose Human Papillomavirus Vaccination Schedule: A Hypothetical Multi-country Cohort Study

What can you learn from this article? An experimental cohort study design for policy evaluation, in which countries have moved from a two-dose to a single-dose schedule for national HPV vaccination programmes since 2023. Introduction: Since the first licensing of the Human Papillomavirus (HPV) vaccine in 2006, evidence has been emerging that single-dose schedules provide comparable efficacy to the conventional regimens, i.e. two or three doses. In 2022, a review by the World Health Organization (WHO) Strategic Advisory Group of Experts on Immunization (SAGE) concluded that a single-dose HPV vaccine delivers solid protection against HPV.
2023-12-15

Optimizing Data Protection: Leveraging Information Governance Principles from GDPR into Research Planning

This article delivers two key insights: Advantages of applying UK GDPR in research planning. Five approaches to safeguard data protection in lung cancer patient interviews. Setting the Scene: The need to assess the Quality of Life (QoL) of patients with lung cancer undergoing chemotherapy has been increasing. After treatment, patients may experience breathlessness or fatigue, along with potential challenges in their daily and occupational functioning. These side effects of chemotherapy consistently rank among patients’ common complaints.
2023-10-30