Posts

Designing Privacy-Aware NLP Pipelines

Introduction

Text data is one of the most privacy-sensitive assets organisations hold. Customer feedback, emails, chat logs, and notes often contain names, locations, contact details, or contextual clues that can identify individuals. Unlike structured data, this information is embedded in free text and is easy to overlook during analysis. As NLP becomes more common in analytics, the risk is not misuse of models but unintentional exposure of personal data through text pipelines. The challenge is building NLP workflows that extract insight without retaining or amplifying sensitive information.

Why privacy-aware NLP design is required

NLP pipelines often sit outside traditional governance controls. Text is copied into notebooks. Raw comments are shared for validation. Model outputs inadvertently surface personal details. This creates several risks:

- analysts gain access to information they don’t need
- derived datasets become unsafe to share
- downstream users in...
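One common mitigation the excerpt hints at is redacting personal details before text enters the analysis pipeline. A minimal sketch of that idea, using illustrative regex patterns (a real pipeline would use a trained NER model and locale-specific rules, not a handful of regexes):

```python
import re

# Illustrative patterns only -- not the post's actual method.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{8,}\d"),
}

def redact(text: str) -> str:
    """Replace matched spans with type placeholders before analysis."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Contact me at jane.doe@example.com or +44 7911 123456."))
# -> Contact me at [EMAIL] or [PHONE].
```

Redacting at ingestion means notebooks, shared samples, and model outputs downstream only ever see the placeholders, which addresses the risks listed above.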

From Idea to Insight: My Data Analytics Project Journey

Collecting the Data

To explore whether consumer behaviour is shifting from traditional retail to quick commerce, I began by identifying data sources that could capture real-world intent and trends. Since direct datasets comparing small local shops and quick commerce platforms are limited, I used search behaviour as a proxy for consumer interest.

Data Source: Search Trends

I used Google Trends to collect time-series data on how frequently different terms are searched over time. Search data is particularly useful because it reflects:

- What people are actively looking for
- Changes in consumer intent
- Emerging behavioural patterns

Quick Commerce Keywords

To represent quick commerce platforms, I selected:

- Instamart
- Blinkit
- Zepto

These platforms are widely used in India and reflect the growing demand for instant delivery services.

Traditional Retail Proxies

Since there is no direct dataset for small local shops, I used search queries as proxies: “kirana store” “...
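Once the keyword groups are exported from Google Trends, comparing them reduces to aggregating the per-keyword series and looking at growth. A sketch of that step with pandas, using entirely synthetic interest values (the real numbers come from the Trends CSV export; only "blinkit" and "kirana store" are keywords from the post):

```python
import pandas as pd

# Synthetic stand-in for a Google Trends export: weekly relative
# search interest (0-100) per keyword. Values are made up for
# illustration, not real Trends data.
weeks = pd.date_range("2024-01-07", periods=4, freq="W")
trends = pd.DataFrame(
    {"blinkit": [40, 50, 60, 70], "kirana store": [30, 28, 27, 25]},
    index=weeks,
)

# Aggregate keyword columns into the two groups being compared.
grouped = pd.DataFrame({
    "quick_commerce": trends[["blinkit"]].mean(axis=1),
    "traditional_retail": trends[["kirana store"]].mean(axis=1),
})

# Period-over-period growth shows the direction of consumer interest.
growth = grouped.iloc[-1] / grouped.iloc[0] - 1
print(growth.round(2))
```

Because Trends values are relative (0–100 within each query), growth rates within a series are safer to compare than raw levels across series.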

From Idea to Insight: My Data Analytics Project Journey

How I Identified a Real-World Problem Worth Analysing

Lately, I’ve been noticing a shift in how people shop for everyday items. More and more people are turning to quick commerce platforms like Instamart for convenience. Groceries, snacks, essentials: everything arrives within minutes. It’s fast, efficient, and becoming a habit.

But at the same time, I started wondering about something else. What’s happening to small local shops? The ones run by a single owner. The ones where everything is within reach, and service is quick because they know their store inside out. These shops used to be part of everyday life. Now, it feels like fewer people are walking in.

This made me curious to explore the problem from a data perspective. I want to understand:

- Are customers actually shifting towards quick commerce?
- What factors are influencing this behaviour?
- What impact could this have on small retailers over time?

Blog Series: From Idea to Insight

This post is part of a series where I’ll docum...

What Senior Data Analysts Actually Do (Beyond Dashboards)

When people hear “data analyst,” they often imagine dashboards, SQL queries, and spreadsheets. Those skills are important. But at a senior level, the role moves far beyond building reports.

A high-level data analyst designs systems. They create structure. They ensure that data flows cleanly from source to insight, and that insight translates into better decisions. The focus shifts from producing outputs to building reliable decision infrastructure. This difference is subtle but critical.

From Reporting to Decision Architecture

At an operational level, analysts answer questions. At a strategic level, analysts define how questions should be answered. Senior analysts think in terms of:

- What decisions does the organisation need to make?
- Which metrics truly represent performance?
- Are definitions consistent across teams?
- Can this analysis scale beyond a one-off request?

Instead of reacting to ad hoc reporting needs, they establish frameworks that prevent fragmentat...

The Future of Food Safety Tech: How AI-Driven Transparency Can Transform Global Consumer Health

Extending the FoodSense concept beyond India through responsible data systems and applied machine learning

The global food safety challenge

Food safety challenges are not confined to geography. They appear in different forms across countries, but the underlying risks are shared. Allergen exposure remains one of the most preventable causes of severe food-related harm, yet communication failures continue to occur. Expiry dates are frequently misunderstood by both consumers and businesses, contributing to avoidable illness on one end and large-scale food waste on the other. These issues place sustained pressure on public health systems worldwide.

What varies between regions is not the existence of the problem, but the maturity of the systems designed to manage it. Effective food safety today requires more than compliance. It requires visibility, consistency, and decision support at the point where food is prepared, stored, and consumed.

Why AI is an appropriate tool in this do...

Inside the Smart Food Safety System: Architecture, Data Pipelines, and ML Models Explained

A deep technical walkthrough of the data pipelines, algorithms, and design decisions behind my food safety prototype

Architecture overview

Once the prototype moved beyond experimentation, I needed a structure that could survive real-world input. Food labels are noisy. OCR is imperfect. Safety decisions cannot rely on a single model prediction. The architecture reflects that reality by separating concerns clearly and defensively. At a high level, the system flows as follows:

Image / Label Input
↓
OCR + Text Parsing
↓
ETL + Validation Layer
↓
Feature Engineering
↓
Freshness ML Model
↓
Rule-Based Safety Engine
↓
Human-Readable Output

Each layer can fail safely without corrupting the next.

Data Engineering layer (ETL, validation, anonymisation)

This layer exists to answer one question: can this data be trusted enough to make a safety decision?

ETL ingestion

Raw inputs enter the system either as: OCR extracted ...
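The "each layer can fail safely" idea can be sketched as stages that validate their input and pass `None` downstream instead of bad data, with the rule engine defaulting to a conservative answer. The function names here are illustrative, not the prototype's actual API:

```python
from typing import Optional

def parse_label(raw_ocr: str) -> Optional[dict]:
    """OCR + text parsing stage: reject empty or unreadable scans."""
    text = raw_ocr.strip()
    return {"text": text} if text else None

def validate(record: Optional[dict]) -> Optional[dict]:
    """ETL + validation stage: only trusted records continue.
    Hypothetical rule: require an expiry mention on the label."""
    if record and "expiry" not in record["text"].lower():
        return None
    return record

def decide(record: Optional[dict]) -> str:
    """Rule-based safety engine with a conservative default."""
    if record is None:
        return "UNKNOWN - manual check required"
    return "Label readable; apply freshness model next"

print(decide(validate(parse_label("Expiry: 2025-03-01"))))
print(decide(validate(parse_label(""))))
```

The key property is that a failure in any upstream stage degrades the output to an explicit "unknown" rather than letting a corrupt record reach the safety decision.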

Designing Analytics Architecture for Small to Mid-Size Data Teams

Introduction

Many small to mid-size data teams struggle not because of a lack of talent, but because their analytics architecture grows accidentally. A script is added here. A dashboard is patched there. Another data source is bolted on to meet an urgent request. Over time, analytics becomes fragile. Changes are risky, performance degrades, and no one is quite sure how everything fits together. The problem is not scale. It’s the absence of intentional architecture.

Why this problem matters

Analytics architecture determines:

- how quickly teams can respond to new questions
- how safely systems can evolve
- how much effort goes into maintenance versus insight
- how easily new analysts can contribute

Without a clear architectural approach, small teams pay a disproportionate cost. They spend time firefighting instead of compounding value. Good architecture allows small teams to operate like larger ones, without the overhead.

Architecture as flow, not tools

At a pr...

Turning a Food Safety Idea Into a Real Prototype: My Data & ML Build Journey

How I taught myself practical machine learning and engineered a working prototype using Python, OCR, and rule-based logic

Why I started building, not just thinking

After mapping the food safety problem, I reached a point where thinking wasn’t enough. Ideas can sound convincing in words. Diagrams can make them look coherent. But without a working prototype, everything stays hypothetical. I didn’t want this project to live as a concept or a case study. I wanted to know whether it could actually work. Building felt like the only honest way to validate the idea. So I decided to commit to a fixed window and treat it like an engineering challenge, not a side thought. Sixty days. One end-to-end prototype. No shortcuts.

Starting point: theory-heavy, practice-light

During my Master’s, I had studied machine learning, NLP, and Python. I understood models conceptually. I knew how algorithms worked on paper. I had written isolated scripts and notebooks. But I had never built a complete system wh...