In machine learning workflows, the standard practice is to split a dataset into training and test subsets before applying most preprocessing transformations, in order to prevent data leakage.
However, certain preliminary data cleaning operations can be performed safely on the entire dataset beforehand, because they do not depend on statistical summaries of the data and therefore cannot introduce information from the test set into the training process.
Below are examples of preprocessing that can safely be done before splitting (a short pandas sketch follows the list).
- Removing duplicate rows.
- Fixing data types, e.g. parsing date strings into datetimes.
- Removing bad data or impossible values, e.g. age > 150.
- Trimming leading and trailing whitespace from strings.
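
A minimal pandas sketch of these pre-split cleaning steps; the file name and the column names ("signup_date", "age", "city") are hypothetical placeholders, not taken from a specific dataset.

```python
import pandas as pd

df = pd.read_csv("data.csv")  # hypothetical input file

# 1. Remove exact duplicate rows.
df = df.drop_duplicates()

# 2. Fix data types, e.g. parse date strings into datetimes.
df["signup_date"] = pd.to_datetime(df["signup_date"], errors="coerce")

# 3. Drop impossible values, e.g. ages above 150 or below 0.
df = df[(df["age"] >= 0) & (df["age"] <= 150)]

# 4. Trim stray whitespace from a string column.
df["city"] = df["city"].str.strip()

# None of these steps uses statistics computed from the data (means,
# variances, etc.), so they leak nothing from a future test split.
```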
- To see why this matters, imagine you're studying for an exam.
- You're supposed to practice using your textbook (training data) and then take the exam (test data) to see how well you've learned.
- Now imagine someone secretly shows you some of the exam questions while you're studying.
- When you take the test, you score really high, but not because you truly understood the material; you just recognized the questions. That's data leakage!
- The training data is what the model learns from.
- The test data is supposed to check how well it learned.
- If information from the test data sneaks into training, the model gets an unfair advantage.
- It looks like it performs very well, but when you give it completely new data in the real world, performance drops.
- So data leakage makes the model look smarter than it actually is, which is dangerous because it won't work as well in real-life situations (see the sketch below).
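
To make this concrete, here is a minimal scikit-learn sketch on synthetic data (the LogisticRegression model and the generated data are purely illustrative): the scaler is wrapped in a Pipeline and fit only on the training split, while the commented-out lines show the leaky alternative of scaling before the split.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                                    # synthetic features
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)  # synthetic labels

# Leaky version: the scaler's mean/std are computed from ALL rows,
# including the rows that will later become the test set.
# X_scaled = StandardScaler().fit_transform(X)
# X_tr, X_te, y_tr, y_te = train_test_split(X_scaled, y, random_state=0)

# Leak-free version: split first, then let the pipeline fit the scaler
# on the training rows only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)        # scaler statistics come from X_train only
print("test accuracy:", model.score(X_test, y_test))
```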
- Suppose you are building a model to predict house prices, and the dataset contains missing values in the feature “Lot Size.”
- You calculate the mean lot size using the entire dataset (including both training and test data) and use that value to fill in all missing entries.
- After performing this imputation, you split the data into training and test sets.
- This creates data leakage because the imputed values were influenced by information from the test set.
- As a result, the model's evaluation may appear more accurate than it truly is, since the training process indirectly incorporated knowledge from unseen data, as the sketch below illustrates.
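
Below is a minimal sketch of the "Lot Size" example; the column name and the synthetic values are illustrative. It shows that the mean computed on the full dataset differs from the mean computed on the training rows alone, and that only the training mean should be used to fill either split.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({"LotSize": [5000, 6200, np.nan, 7500, np.nan, 8800, 4100, 9900]})

# Leaky approach: the fill value is computed over ALL rows,
# including the rows that will end up in the test set.
leaky_mean = df["LotSize"].mean()
df_leaky = df.fillna({"LotSize": leaky_mean})
train_leaky, test_leaky = train_test_split(df_leaky, test_size=0.25, random_state=0)

# Leak-free approach: split first, compute the mean on the training rows only,
# then reuse that same training mean to fill the test rows.
train, test = train_test_split(df, test_size=0.25, random_state=0)
train_mean = train["LotSize"].mean()
train = train.fillna({"LotSize": train_mean})
test = test.fillna({"LotSize": train_mean})

print("mean from full data:    ", leaky_mean)
print("mean from training rows:", train_mean)
```

In practice the same leak-free behaviour falls out of putting scikit-learn's SimpleImputer inside a Pipeline, since the pipeline is fit on the training data only.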

