Available for new opportunities

Hi, I'm Kazim Hussain
I build full-stack & AI products.

Full Stack Developer and Machine Learning Engineer with 5+ years of experience building scalable web applications, AI-integrated systems, and LLM evaluation workflows. I work across Python, TypeScript, and Linux to ship production-grade software.

5+
Years of experience
4
Companies shipped with
10+
Production projects
LLM
Fine-tuning & evals

Experience

Where I've shipped real work

Python / Machine Learning Engineer

Oct 2025 — Present

Turing Enterprises · Remote

  • Designed adversarial and domain-specific terminal benchmarking tasks for LLMs (Claude Sonnet 4 / 4.5, Qwen Coder, Hunyuan) in Linux environments, targeting model failure modes.
  • Authored golden Bash solutions and SFT training data to improve model performance on terminal operations and developer-tool workflows using Linux and Docker.
  • Supported NVIDIA projects focused on improving model capability in external API calling and technical response generation through curated fine-tuning and evaluation workflows.
  • Automated cross-application OS interaction tasks for the OS World project using Python and PyAutoGUI (Chrome, LibreOffice, Terminal, VS Code).
  • Evaluated model outputs for Python, JavaScript, and SQL tasks in the Meta Evaluation Project — assessing truthfulness, instruction adherence, correctness, verbosity, and quality.
  • Contributed to QA for Meta OpenClaw — training and evaluating API-calling models (Llama, Quite_Sand, Gemma) to complete user tasks through the Maton API (email, messaging, Jira, document drafting).

Senior Software Engineer

Sep 2023 — Present

Ingenious Programmer's Incorporate · Lahore

  • Lead a team of junior developers and work with cross-functional teams to turn requirements into technical solutions.
  • Used Jira dashboards (burndown, velocity) and Agile practices to monitor team performance and identify bottlenecks.
  • Built and maintained CI/CD pipelines, applied microservice and event-driven architecture patterns, and upheld code quality through peer reviews and Agile practices.
  • Spearheaded the development of large-scale, data-intensive web platforms with a focus on modular design, performance, and cloud deployment (AWS, Docker).

Software Developer

Jun 2022 — Aug 2023

Devsinc · Lahore

  • Designed and implemented backend APIs and services for dynamic frontend interactions with a focus on scalability, security, and RESTful standards.
  • Collaborated on AI-integrated features within full-stack solutions, adding intelligent automation and real-time decision support capabilities.
  • Helped train and fine-tune LLMs for technical domains such as web development.

Junior JavaScript Developer

Apr 2021 — May 2022

KICS, UET Lahore

  • Collaborated on frontend development using JavaScript, gaining practical exposure to DOM manipulation, event handling, and asynchronous programming.
  • Contributed to UI/UX enhancements and helped maintain web application interfaces to improve usability and responsiveness.
  • Assisted senior developers in debugging, testing, and optimizing JavaScript code across research-based projects.

Projects

AI/LLM & full-stack work

Turing — AI / LLM Engineering

Terminal Bench

Linux · Docker · Bash · SFT · LLM Eval

Designed adversarial and domain-specific terminal benchmarking tasks for LLMs (Claude Sonnet 4 / 4.5, Qwen Coder, Hunyuan) in Linux environments — targeting model failure modes. Authored golden Bash solutions and SFT training data to improve model performance on terminal operations, command-line reasoning, and developer-tool workflows. Clients: Alibaba & Tencent.

NVIDIA — API Calling & Technical Generation

Fine-Tuning · API Calling · Python · Evaluation

Supported NVIDIA projects focused on improving model capability in external API calling and technical response generation through curated fine-tuning datasets and rigorous evaluation workflows.

OS World

Python · PyAutoGUI · OS Automation

Automated cross-application OS interaction tasks for the OS World project using Python and PyAutoGUI — covering real-world workflows across Chrome, LibreOffice, Terminal, and VS Code to generate high-quality agent training data.

Meta Evaluation Project

Python · JavaScript · SQL · LLM Eval

Evaluated model outputs for Python, JavaScript, and SQL tasks — assessing truthfulness, instruction adherence, correctness, verbosity, and overall response quality to drive model improvement signals.

Meta OpenClaw

Llama · Quite_Sand · Gemma · Maton API · QA

Contributed to the QA team for Meta OpenClaw — training and evaluating API-calling models (Llama, Quite_Sand, Gemma) to complete user tasks through the Maton API, including email and message sending, document drafting, Jira automation, and related workflow execution.

Web & Full-Stack

Art Exhibition Platform

Django · JavaScript · 3D

A virtual art exhibition platform to grow audience reach and community engagement, featuring an interactive virtual gallery with 3D rendering of artworks.

Interactive Medical Animations Platform

Django · Three.js · Medical

Medical platform that lets users view MRI, CT, X-ray, ultrasound, and endoscopy results as interactive 3D objects, supporting diagnosis and patient understanding.

Interior Designing Platform

Flask · Three.js · Sanity CMS

Parametric room-builder in which users trace walls while the tool snaps wall angles and calculates square footage automatically. Real-time light-probe baking reduced render latency, and an integrated furniture CMS with a live pricing API boosted upsell conversion.
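The angle snapping and square-footage steps described above can be sketched roughly as follows. This is a minimal illustration, not the platform's actual code: the function names `snap_angle` and `polygon_area`, the 15-degree snap step, and the assumption that coordinates are in feet are all hypothetical. Area is computed with the standard shoelace formula.

```python
import math


def snap_angle(p0, p1, step_deg=15.0):
    """Move p1 so the wall p0 -> p1 lies on the nearest multiple of step_deg.

    The wall length is preserved; only its direction changes.
    The 15-degree default is an assumed value for illustration.
    """
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    length = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dy, dx))
    snapped = math.radians(round(angle / step_deg) * step_deg)
    return (p0[0] + length * math.cos(snapped),
            p0[1] + length * math.sin(snapped))


def polygon_area(vertices):
    """Area of a simple polygon (room outline) via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the outline
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


if __name__ == "__main__":
    # A slightly crooked wall gets straightened onto the horizontal axis.
    print(snap_angle((0.0, 0.0), (10.0, 0.5)))
    # A 12 ft x 10 ft rectangular room.
    print(polygon_area([(0, 0), (12, 0), (12, 10), (0, 10)]))
```

Snapping each wall as it is traced keeps corners regular, and running the shoelace formula over the closed outline gives the floor area in whatever squared unit the coordinates use.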

AI-based Civil Engineering Toolkit

YOLO · SAM · NLP

Intelligent estimation toolkit that automates quantity take-off and cost estimation from architectural drawings using YOLO for object detection and Meta's Segment Anything Model (SAM) for precise segmentation of walls, doors, and windows from 2D blueprints.

Jenkins Pipeline — Internal Process-Improvement Dashboard

Jenkins · Groovy · React + TS · TimescaleDB · D3.js

Instrumented Jenkins 2.x with custom Groovy shared libraries and the Pipeline Events API to stream real-time build logs, job durations, and test results into a TimescaleDB (PostgreSQL) warehouse. Designed a React + TypeScript dashboard (Vite, Chakra UI) featuring D3 heat maps for stage latency, Pareto charts for failure causes, and statistical process control (SPC) trend lines.
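As a rough illustration of the numbers behind a Pareto chart of failure causes and an SPC trend line, the sketch below uses only the Python standard library. It is not the dashboard's implementation: the function names `pareto` and `control_limits`, the sample cause labels, and the use of the population standard deviation for the control limits (production SPC charts often estimate sigma from moving ranges instead) are assumptions made for the example.

```python
from collections import Counter
from statistics import mean, pstdev


def pareto(causes):
    """Return (cause, count, cumulative_share) rows, most frequent first."""
    counts = Counter(causes).most_common()
    total = sum(count for _, count in counts)
    rows, running = [], 0
    for cause, count in counts:
        running += count
        rows.append((cause, count, running / total))
    return rows


def control_limits(durations, sigmas=3.0):
    """Return (lower, center, upper) control limits for a series of durations.

    Simplification: sigma is the population standard deviation of the
    series, not a moving-range estimate.
    """
    center = mean(durations)
    spread = pstdev(durations)
    return center - sigmas * spread, center, center + sigmas * spread


if __name__ == "__main__":
    failures = ["flaky test", "flaky test", "timeout", "flaky test", "oom"]
    for row in pareto(failures):
        print(row)
    print(control_limits([10, 12, 11, 13, 14]))
```

The cumulative share column is what the Pareto line plots over the bars, and a build whose duration falls outside the control limits is the kind of point an SPC trend line would flag.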

Technical Skills

Tools I reach for daily

Machine Learning / AI

PyTorch · TensorFlow 2 · scikit-learn · Hugging Face · ONNX · MLflow · Kubeflow · Triton · CNN / RNN / Transformer · Quantization & Pruning

Full Stack Development

Node.js 20 · Express · NestJS · FastAPI · Django REST · React · Next.js · Vue · Tailwind · GraphQL · tRPC · WebSockets · Prisma · Docker · Kubernetes · Celery · Redis

NLP / AI Integration

LangChain · OpenAI API · Anthropic API · RAG (Pinecone, Weaviate) · RLHF · SFT · LoRA / QLoRA · Q-learning

Languages

TypeScript · JavaScript · Python 3.12 · Go · SQL · HTML / CSS · Bash

Developer Tools

Git · GitHub / GitLab · VS Code · JupyterLab · Postman · Storybook · Docker Compose · GitHub Actions · SonarQube · ESLint · Prettier

Cloud & Observability

AWS (ECS, Lambda, S3, SageMaker) · GCP (Cloud Run, Vertex AI) · Azure (Functions, OpenAI Service) · Prometheus · Grafana · Sentry

Education & Contact

Let's build something great

Get in touch

Open to full-stack and ML engineering roles, freelance projects, and AI-product collaborations. The fastest way to reach me is email — I usually respond within a day.

BSc, Electrical Engineering
University of Engineering and Technology, Lahore · 2017 — 2021