AI research leadership · Thoughtworks AI Labs

Phillip Howard

I lead a team of AI researchers making generative AI more reliable through pathfinding research on evals, interpretability, and model robustness. My recent work focuses on multimodal models, synthetic data generation, counterfactual evaluation, temporal reasoning, responsible AI, and model interpretability. Before joining Thoughtworks, I spent nearly a decade at Intel working on multimodal cognitive AI, vision-language models, natural language processing, and applied machine learning.

Google Scholar LinkedIn

Latest

News

2026 · Tutorial on Counterfactual Fairness Analysis in Language-Vision Models was accepted to ACM FAccT 2026.
2026 · Three papers were accepted to ICLR 2026: p-less Sampling (oral), Scaling Knowledge Graph Construction through Synthetic Data Generation and Distillation, and Is Your Paper Being Reviewed by an LLM?
2026 · Learning from Reasoning Failures via Synthetic Data Generation was accepted to AAAI 2026.
2025 · Two papers were accepted to EMNLP 2025: Pruning the Paradox and Transformer-Based Temporal Information Extraction and Application: A Review.
2025 · A Semantic Parsing Framework for End-to-End Time Normalization was accepted to NeurIPS 2025.
2025 · SK-VQA was presented at ICML 2025 (oral).
2025 · Two papers were accepted to NAACL 2025: Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals (oral) and LVLM-Compress-Bench.

Focus Areas

Responsible Multimodal AI

Counterfactual datasets and evaluation methods for measuring social, cultural, and intersectional biases in vision-language models.

Synthetic Data

Targeted data generation for reasoning, retrieval-augmented multimodal generation, and robustness under distribution shift.

Temporal & Structured Reasoning

Models and representations for temporal question answering, time normalization, knowledge graphs, and structured reasoning.

Interpretability

Tools and methods for interpreting CLIP-like models, sparse autoencoders for vision models, and nonlinear representation learning.

Selected Papers

Preprint

Cross-Cultural Value Awareness in Large Vision-Language Models

Phillip Howard, Xin Su, Kathleen C. Fraser

arXiv

Preprint

Cultural Counterfactuals: Evaluating Cultural Biases in Large Vision-Language Models with Counterfactual Examples

Phillip Howard, Xin Su, Kathleen C. Fraser

arXiv

ICLR 2026 Oral

p-less Sampling: A Robust Hyperparameter-Free Approach for LLM Decoding

Runyan Tan, Shuang Wu, Phillip Howard

OpenReview arXiv

ICLR 2026

Scaling Knowledge Graph Construction through Synthetic Data Generation and Distillation

Prafulla Kumar Choubey, Xin Su, Man Luo, Xiangyu Peng, Caiming Xiong, Tiep Le, Shachar Rosenman, Vasudev Lal, Phil L. Mui, Ricky Ho, Phillip Howard, Chien-Sheng Wu

OpenReview arXiv

ICLR 2026

Is Your Paper Being Reviewed by an LLM? Benchmarking AI Text Detection in Peer Review

Sungduk Yu, Man Luo, Avinash Madasu, Vasudev Lal, Phillip Howard

OpenReview arXiv

AAAI 2026

Learning from Reasoning Failures via Synthetic Data Generation

Gabriela Ben Melech Stan, Estelle Aflalo, Avinash Madasu, Vasudev Lal, Phillip Howard

AAAI arXiv

NAACL 2025 Oral

Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals

Phillip Howard, Kathleen C. Fraser, Anahita Bhiwandiwalla, Svetlana Kiritchenko

ACL arXiv

NAACL 2025

LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression

Souvik Kundu, Anahita Bhiwandiwalla, Sungduk Yu, Phillip Howard, Tiep Le, Sharath Nittur Sridhar, David Cobbley, Hao Kang, Vasudev Lal

ACL arXiv

EMNLP 2025

Pruning the Paradox: How CLIP's Most Informative Heads Enhance Performance While Amplifying Bias

Avinash Madasu, Vasudev Lal, Phillip Howard

ACL arXiv

EMNLP 2025

Transformer-Based Temporal Information Extraction and Application: A Review

Xin Su, Phillip Howard, Steven Bethard

ACL arXiv

ICML 2025 Oral

SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs

Xin Su, Man Luo, Kris W. Pan, Tien Pei Chou, Vasudev Lal, Phillip Howard

OpenReview arXiv

NeurIPS 2025

A Semantic Parsing Framework for End-to-End Time Normalization

Xin Su, Sungduk Yu, Phillip Howard, Steven Bethard

NeurIPS arXiv

CVPR 2024

SocialCounterfactuals: Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples

Phillip Howard, Avinash Madasu, Tiep Le, Gustavo Lujan Moreno, Anahita Bhiwandiwalla, Vasudev Lal

Project arXiv

NAACL 2024

NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge

Phillip Howard, Junlin Wang, Vasudev Lal, Gadi Singer, Yejin Choi, Swabha Swayamdipta

ACL arXiv

EACL 2024 Demo

NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation

Shachar Rosenman, Vasudev Lal, Phillip Howard

ACL arXiv

NAACL 2024

Semi-Structured Chain-of-Thought: Integrating Multiple Sources of Knowledge for Improved Language Model Reasoning

Xin Su, Tiep Le, Steven Bethard, Phillip Howard

ACL arXiv

NeurIPS 2023

COCO-Counterfactuals: Automatically Constructed Counterfactual Examples for Image-Text Pairs

Tiep Le, Vasudev Lal, Phillip Howard

arXiv Dataset

EMNLP 2023

Fusing Temporal Graphs into Transformers for Time-Sensitive Question Answering

Xin Su, Phillip Howard, Nagib Hakim, Steven Bethard

ACL arXiv

CIKM 2022

Cross-Domain Aspect Extraction using Transformers Augmented with Knowledge Graphs

Phillip Howard, Arden Ma, Vasudev Lal, Ana Paula Simoes, Daniel Korat, Oren Pereg, Moshe Wasserblat, Gadi Singer

arXiv

EMNLP 2022

NeuroCounterfactuals: Beyond Minimal-Edit Counterfactuals for Richer Data Augmentation

Phillip Howard, Gadi Singer, Vasudev Lal, Yejin Choi, Swabha Swayamdipta

ACL arXiv

AAAI 2022

TempoQR: Temporal Question Reasoning over Knowledge Graphs

Costas Mavromatis, Prasanna Lakkur Subramanyam, Vassilis N. Ioannidis, Adesoji Adeshina, Phillip Howard, Tetiana Grinberg, Nagib Hakim, George Karypis

arXiv

EACL 2021 Demo

InterpreT: An Interactive Visualization Tool for Interpreting Transformers

Vasudev Lal, Arden Ma, Estelle Aflalo, Phillip Howard, Ana Simoes, Daniel Korat, Oren Pereg, Gadi Singer, Moshe Wasserblat

ACL

Experience

2025-present

AI Researcher, Team Lead & Manager, Thoughtworks
Leading a team of AI researchers focused on applied and frontier AI research.

2016-2025

Senior Staff AI Research Scientist, Team Lead & Manager, Intel Labs
Led a team of AI researchers focused on multimodal cognitive AI.

Ph.D.

Arizona State University
Industrial Engineering. Dissertation: Distinct Feature Learning and Nonlinear Variation Pattern Discovery Using Regularized Autoencoders.