Saturday, November 22, 2025

Text Classification from Scratch using PyTorch

The AI/ML development framework Keras 3.x has in recent times gained support for the Torch & JAX backends, in addition to TensorFlow. However, given Keras's TensorFlow legacy, large sections of the code remain deeply integrated with TensorFlow. 

One such piece of code is text_classification_from_scratch.py from the keras-io/examples project. Without TensorFlow this piece of code simply won't run!

Here's text_classification_torch.py, a pure Torch/PyTorch port of the same code. The bits that needed modification:

  • Removing all tensorflow related imports
  • Loading the IMDB text files in "grain" format in place of the "tf" format, by passing the appropriate param: 

    keras.utils->text_dataset_from_directory(format="grain") 

Also grain needs to be installed:

    pip3 install grain 

  • For building Vocab, Tokenizer, Vectorizing use torchtext:

    pip3 install torchtext

  • A few other changes, such as ensuring the max_features constraint is honoured, text is standardized, padded, and so on
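For illustration, here's a minimal, dependency-free sketch of those last steps (standardize, build a vocab capped at max_features, vectorize & pad). The helper names and special tokens here are illustrative only, not the actual code of text_classification_torch.py, which uses torchtext for this:

```python
from collections import Counter

def standardize(text):
    # Lowercase & strip the HTML breaks present in the IMDB reviews
    return text.lower().replace("<br />", " ")

def build_vocab(texts, max_features):
    counts = Counter(tok for t in texts for tok in standardize(t).split())
    # Reserve index 0 for padding and 1 for out-of-vocabulary tokens,
    # keeping the vocab size within the max_features constraint
    itos = ["<pad>", "<unk>"] + [w for w, _ in counts.most_common(max_features - 2)]
    return {w: i for i, w in enumerate(itos)}

def vectorize(text, vocab, sequence_length):
    ids = [vocab.get(tok, 1) for tok in standardize(text).split()]
    ids = ids[:sequence_length]                       # truncate long reviews
    return ids + [0] * (sequence_length - len(ids))   # pad short ones

texts = ["A great great movie!", "Not worth <br /> watching."]
vocab = build_vocab(texts, max_features=10)
print(vectorize(texts[0], vocab, sequence_length=6))
```

In the real port, torchtext's tokenizer and vocab utilities play these roles; the sketch only shows the shape of the transformation.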

Saturday, November 15, 2025

Guardrails & Guard-LLMs

With wide-scale adoption of LLMs & agentic models in production, there's also a pressing need to verify both the inputs & outputs for GenAI use cases. This should ideally be done in real time, just before serving the response to the end user, ensuring that no invalid, harmful, hateful, or confidential content gets through in either direction. Guardrails are the answer to that very problem.

The simple idea with Guardrails is to apply intelligent input/output filters that sanitize requests & responses and stop bad ones from getting through. There are many ways of implementing Guardrails, such as pattern matching, rule engines, etc. Though these have worked so far, in an ever-changing agentic world it's now up to the self-learning guard LLMs to judge & flag! 

Guard LLMs are specifically trained to flag harmful content. One such implementation is llama-guard, which flags violations of any of the MLCommons AI Safety taxonomy categories.

An implementation of the guard-LLM can be found in the ApiCaller project, more specifically in ApiCaller->invokeWithGuardrails(), which:

  •  First calls a local Ollama model with the sanitized input to get a response
  •  Then calls the isSafe() method with the received response
  •  isSafe() internally calls a different Ollama model, llama-guard, which flags the content as safe/unsafe

Check the TestApiCaller.py test case for better clarity.
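As a rough sketch of that flow (not the actual ApiCaller code): the ask callable below stands in for a call to an Ollama model, so a stub can be plugged in for demonstration; the model names are examples.

```python
def is_safe(text, ask):
    # The guard step: delegate to the llama-guard model, which replies
    # "safe", or "unsafe" plus the violated safety category
    return ask("llama-guard3", text).strip().lower().startswith("safe")

def invoke_with_guardrails(prompt, ask):
    # 1. Call the local Ollama model with the (sanitized) input
    response = ask("llama3.2:1b", prompt)
    # 2. Check the received response with the guard model
    if not is_safe(response, ask):
        return "Response blocked by guardrail"
    return response

# Stub standing in for real Ollama calls, for demonstration only
def fake_ask(model, prompt):
    if model == "llama-guard3":
        return "unsafe\nS10" if "hate" in prompt else "safe"
    return "Some model response about: " + prompt

print(invoke_with_guardrails("hello", fake_ask))
```

The real implementation would back ask with HTTP calls to the Ollama endpoint; the gating logic stays the same.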

References

  • https://mlcommons.org/2024/04/mlc-aisafety-v0-5-poc/
  • https://www.ibm.com/think/tutorials/llm-guardrails
  • https://ollama.com/library/llama-guard3

Friday, November 14, 2025

LangWatch Scenario with Ollama

LangWatch Scenario is a pytest-based framework for Agent testing. Scenario works with OpenAI-compatible APIs. Here we show how to get LangWatch running using local LLMs with Ollama.

The code test_ollama_client.py is along the same lines as the test_azure_api_gateway.py from the scenario python examples folder. 

Changes specific to Ollama being:

1. Set-up

    pip3 install langwatch-scenario 

Environment variables

    export OPENAI_API_BASE_URL=http://localhost:11434/api/
    export OPENAI_API_KEY=NOTHING

2. Create Ollama client

    ollama_client() -> OpenAI(base_url=<OLLAMA_BASE_URL>)

3. Configuring the Ollama model (gemma, etc.) & custom_llm_provider ("ollama") in the Agents (UserSimulatorAgent & JudgeAgent)

    scenario.UserSimulatorAgent(model=OLLAMA_MODEL, client=custom_client, custom_llm_provider=CUSTOM_LLM_PROVIDER)...

For better clarity see test_ollama_client.py.

4. Offline LangWatch Scenario Reporter

For every run, LangWatch uploads the results to the app.langwatch.ai endpoint. For a truly offline run, set the LANGWATCH_ENDPOINT location: 

    export LANGWATCH_ENDPOINT=<https://YOUR_REPORTING_ENDPOINT>

There's no option to disable scenario reporting for now. The only workaround is to set LANGWATCH_ENDPOINT to an invalid value (eg "http://localhost2333/invalid").

 

Wednesday, November 5, 2025

Agent2Agent (A2A) with a2a-sdk and Http2

Continuing with the A2A evaluation, next up is a2a-sdk (unrelated to the previously evaluated a2a-server). This evaluation is largely based on getting the hello world from the a2a-samples project working as per the instructions of the a2a-protocol, with additional integration with other Http2-based non-Python clients.

(I) Installation

pip install a2a-sdk 

# uvicorn & python-dotenv (packages already installed) 

# For Http2 support 

pip install hypercorn 

pip install h2==4.2.0 (See Issue 1 at the end & the bug details)

git clone https://github.com/a2aproject/a2a-samples.git -b main --depth 1

(II) Replace uvicorn server with hypercorn (support for Http2) 

The a2a-samples make use of the uvicorn python server. However, uvicorn is an Http1.x server and doesn't support Http2. You keep seeing the following message when a client makes Http2 requests: 

"WARNING:  Unsupported upgrade request. "

In order to support a wider & more up-to-date category of clients, uvicorn is replaced with hypercorn, which is Http2 compliant.

In order to switch to hypercorn, the following changes are made to __main__.py of the helloworld python project:

#import uvicorn
 

# Use Hypercorn for Http2
import asyncio
from hypercorn.config import Config
from hypercorn.asyncio import serve

 ....

    config = Config()
    config.bind = "127.0.0.1:8080"  # Bind to localhost on port 8080

    asyncio.run(serve(server.build(), config))
    # uvicorn.run(server.build(), host='127.0.0.1', port=8080, log_level='debug') 

(III) Run helloworld

python a2a-samples/samples/python/agents/helloworld/__main__.py 

(IV) View AgentCard

Open in the browser or via curl:

curl http://127.0.0.1:8080/.well-known/agent-card.json

Response: 

{"capabilities":{"streaming":true},"defaultInputModes":["text"],"defaultOutputModes":["text"],"description":"Just a hello world agent","name":"Hello World Agent","preferredTransport":"JSONRPC","protocolVersion":"0.3.0","skills":[{"description":"just returns hello world","examples":["hi","hello world"],"id":"hello_world","name":"Returns hello world","tags":["hello world"]}],"supportsAuthenticatedExtendedCard":true,"url":"http://127.0.0.1:8080/","version":"1.0.0"} 

For the Authorized Extended Agent Card:

curl -H "Authorization: Bearer dummy-token-for-extended-card" --http2 http://127.0.0.1:8080/agent/authenticatedExtendedCard 

Response: 

{"capabilities":{"streaming":true},"defaultInputModes":["text"],"defaultOutputModes":["text"],"description":"The full-featured hello world agent for authenticated users.","name":"Hello World Agent - Extended Edition","preferredTransport":"JSONRPC","protocolVersion":"0.3.0","skills":[{"description":"just returns hello world","examples":["hi","hello world"],"id":"hello_world","name":"Returns hello world","tags":["hello world"]},{"description":"A more enthusiastic greeting, only for authenticated users.","examples":["super hi","give me a super hello"],"id":"super_hello_world","name":"Returns a SUPER Hello World","tags":["hello world","super","extended"]}],"supportsAuthenticatedExtendedCard":true,"url":"http://127.0.0.1:8080/","version":"1.0.1"} 

(V) Send/ Receive message to Agent

curl -H "Content-Type: application/json"  http://127.0.0.1:8080 -d '{"jsonrpc":"2.0","id":"ee22f765-0253-40a0-a29f-c786b090889d","method":"message/send","params":{"message":{"role":"user","parts":[{"text":"hello there!","kind":"text"}],"messageId":"ccaf4715-712e-40c6-82bc-634a7a7136f2","kind":"message"},"configuration":{"blocking":false}}}' 

Response: 

 {"id":"ee22f765-0253-40a0-a29f-c786b090889d","jsonrpc":"2.0","result":{"kind":"message","messageId":"d813fed8-58cd-4337-8295-6282930d4d4e","parts":[{"kind":"text","text":"Hello World"}],"role":"agent"}}

(VI) Send/ Receive via Http2

curl -iv --http2 http://127.0.0.1:8080/.well-known/agent-card.json

curl -iv --http2  -H "Content-Type: application/json"  http://127.0.0.1:8080 -d '{"jsonrpc":"2.0","id":"ee22f765-0253-40a0-a29f-c786b090889d","method":"message/send","params":{"message":{"role":"user","parts":[{"text":"dragons and wizards","kind":"text"}],"messageId":"ccaf4715-712e-40c6-82bc-634a7a7136f2","kind":"message"},"configuration":{"blocking":false}}}'

(The responses are the same as shown above)
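The same message/send envelope can also be built programmatically. A sketch, where the helper name is illustrative and fresh UUIDs replace the fixed ids used in the curl examples:

```python
import json
import uuid

def build_send_message(text, blocking=False):
    # JSON-RPC 2.0 envelope for the A2A message/send method,
    # mirroring the payload used in the curl calls above
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "kind": "message",
                "messageId": str(uuid.uuid4()),
                "parts": [{"kind": "text", "text": text}],
            },
            "configuration": {"blocking": blocking},
        },
    }

payload = json.dumps(build_send_message("hello there!"))
print(payload)  # POST this body to http://127.0.0.1:8080 as application/json
```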

(VII) Send/ Receive from Java client

TBD

(VIII) Issues 

Issue 1: Compatibility issue with hypercorn (ver=0.17.3) & latest h2 (ver=4.3.0)

Ran into the issue mentioned here:

    |   File "/home/algo/Tools/venv/langvang/lib/python3.13/site-packages/hypercorn/protocol/h2.py", line 138, in initiate
    |     event = h2.events.RequestReceived()
    | TypeError: RequestReceived.__init__() missing 1 required keyword-only argument: 'stream_id' 

Issue was resolved by downgrading to h2 (ver=4.2.0).

 

Tuesday, November 4, 2025

Agent2Agent (A2A) with a2a-server

Agent2Agent (A2A) is a protocol for AI agents to communicate amongst themselves. Agents built by different vendors, by subscribing to the common A2A protocol, gain a standardized way of interoperating.  

Getting going with A2A 

(I) As a starting point got the python a2a-server installed. 

pip install a2a-server

Issue 1: Compatibility issue between latest a2a-server & a2a-json-rpc:

Installing a2a-server also brings in a2a-json-rpc, but there were compatibility issues between the latest a2a-json-rpc (ver. 0.4.0) & a2a-server (ver. 0.6.1):

        ImportError: cannot import name 'TaskSendParams' from 'a2a_json_rpc.spec' (.../python3.13/site-packages/a2a_json_rpc/spec.py) 

Downgrading a2a-json-rpc to the previous 0.3.0 fixed it:

pip install a2a-json-rpc==0.3.0 

(II) To get the a2a-server running, an agent.yaml file needs to be built with configs like host, port, handler, provider, model, etc:

server:
  host: 127.0.0.1
  port: 8080

handlers:
  use_discovery: false
  default_handler: chuk_pirate
  chuk_pirate:
    type: a2a_server.tasks.handlers.chuk.chuk_agent_handler.ChukAgentHandler
    agent: a2a_server.sample_agents.chuk_pirate.create_pirate_agent
    name: chuk_pirate
    enable_sessions: false
    enable_tools: false
    provider: "ollama"
    model: "llama3.2:1b"
    version: "1.0.1"

    agent_card:
      name: Pirate Agent
      description: "Captain Blackbeard's Ghost with conversation memory"
      capabilities:
        streaming: false
        sessions: false
        tools: false 

-- 

Next, start the server using:

a2a-server -c agent.yaml --log-level debug 

(III) Test a2a-server endpoint from browser

Open http://127.0.0.1:8080/ which lists the different Agents. 

Agent Card(s): 

http://127.0.0.1:8080/chuk_pirate/.well-known/agent.json 

(IV) Issues a2a-server 

Issue 2: Agent Card endpoint url 

Firstly, the Agent Card endpoint above is no longer valid. As per the latest Agent Card protocol, the Agent Card needs to be served from the location: http://<base_url>/.well-known/agent-card.json

  • agent-card.json (& not agent.json) 
  • Without the agent's name (i.e. without chuk_pirate) 

The valid one would look like:

http://127.0.0.1:8080/.well-known/agent-card.json 

Issue 3: Error message/send not found

The other issue is that there seems to be a lack of support for the method "message/send", used to send messages and chat with the agent. The curl request fails with an error: 

curl -iv -H "Content-Type: application/json"  http://127.0.0.1:8080/chuk_pirate -d '{"jsonrpc":"2.0","id":"ee22f765-0253-40a0-a29f-c786b090889d","method":"message/send","params":{"message":{"role":"user","parts":[{"text":"hello  there!","kind":"text"}],"messageId":"ccaf4715-712e-40c6-82bc-634a7a7136f2","kind":"message"},"configuration":{"blocking":false}}}' 

{"jsonrpc":"2.0","id":"ee22f765-0253-40a0-a29f-c786b090889d","result":null,"error":{"code":-32601,"message":"message/send not found"}} 
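Per the JSON-RPC 2.0 spec, -32601 is the standard "Method not found" error code. A client-side check for it might look like this sketch, reusing the response body above:

```python
import json

# The error response returned by a2a-server for the message/send call above
body = ('{"jsonrpc":"2.0","id":"ee22f765-0253-40a0-a29f-c786b090889d",'
        '"result":null,"error":{"code":-32601,"message":"message/send not found"}}')

resp = json.loads(body)
error = resp.get("error")
if error and error["code"] == -32601:
    # JSON-RPC 2.0 reserves -32601 for "the method does not exist"
    print("Unsupported method:", error["message"])
```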

Due to all these issues with a2a-server and its lack of documentation, there's no clarity on the library. So it's a no-go, for the moment at least.

Sunday, November 2, 2025

DeepEval

DeepEval helps to test and verify the correctness of LLMs. DeepEval is a framework with a suite of metrics and synthetic data generation, with integrations across all leading AI/ML libraries. 

DeepEval can be used to set up one LLM to judge the output of another LLM. This JudgeLLM set-up can be used at both the training and the live inference stage for MLOps scenarios.

Getting started with DeepEval is simple with Ollama.

(I) Installation

    pip install deepeval

Ollama installation was covered previously with a llama3.2 base model. 

(II) Set the Ollama model in DeepEval

# Unset the openai model - default for DeepEval     

deepeval unset-openai

# Set ollama model for DeepEval 

deepeval set-ollama "llama3.2:1b" --base-url="http://localhost:11434"  

(III) Create the JudgeLLM.py code

    from deepeval import evaluate
    from deepeval.metrics import GEval
    from deepeval.metrics.g_eval import Rubric
    from deepeval.models import OllamaModel
    from deepeval.test_case import LLMTestCase, LLMTestCaseParams

    # Set up the ollama model
    model = OllamaModel(
        model="llama3.2:1b",
        base_url="http://localhost:11434",
        temperature=0,  # Example: setting a custom temperature
    )

    # Set up the evaluation metric
    correctness_metric = GEval(
        name="Correctness",
        # NOTE: you can only provide either criteria or evaluation_steps, and not both
        criteria="Determine whether the actual output is factually correct based on the expected output.",
        evaluation_params=[
            LLMTestCaseParams.INPUT,
            LLMTestCaseParams.ACTUAL_OUTPUT,
            LLMTestCaseParams.EXPECTED_OUTPUT,
        ],
        model=model,  # ollama model
        rubric=[
            Rubric(score_range=(0, 2), expected_outcome="Factually incorrect."),
            Rubric(score_range=(3, 6), expected_outcome="Mostly correct."),
            Rubric(score_range=(7, 9), expected_outcome="Correct but missing minor details."),
            Rubric(score_range=(10, 10), expected_outcome="100% correct."),
        ],
        # threshold=0.1
    )

    # Define the test case
    test_case_maths = LLMTestCase(
        input="what is 80 in words? using only 1 word.",
        actual_output="eighty",
        expected_output="eighty",
    )

    # Run the evaluation
    evaluate(test_cases=[test_case_maths], metrics=[correctness_metric])

(IV) Execute the JudgeLLM.py 

 deepeval test run JudgeLLM.py