Applied Data Analytics for Impact

Showing posts from September, 2025

Predictive Modelling Without Sensitive Attributes or Sensitive Text Signals

September 26, 2025

Introduction

Predictive models often perform best when given more data. But more data is not always better data. Sensitive attributes such as exact age, location, income, or raw text signals can boost short-term accuracy while quietly increasing privacy risk, bias, and governance complexity. In many cases, these features are included because they are available, not because they are essential. The real challenge is building predictive models that remain accurate, explainable, and defensible without relying on sensitive attributes or raw text.

Why eliminating sensitive attributes is important

Models influence decisions at scale. When sensitive features are used directly:

- models become harder to audit and explain
- bias and proxy discrimination risks increase
- feature access becomes difficult to justify
- model reuse and sharing are restricted

By contrast, privacy-aware predictive modelling:

- reduces ethical and legal risk
- improves long-term maintainability
- encou...