Evaluated LLM-generated causal graphs against the outputs of causal discovery algorithms across real-world and synthetic causal datasets, using metrics such as Precision, Recall, F1-score, and Structural Hamming Distance (SHD); a sketch of the metrics follows.
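For illustration, a minimal sketch of these edge-level metrics computed over 0/1 directed adjacency matrices. The convention that a reversed edge contributes one unit to SHD is an assumption; conventions vary across papers.

```python
import numpy as np

def causal_graph_metrics(pred: np.ndarray, true: np.ndarray) -> dict:
    """Edge-level Precision/Recall/F1 and SHD between two directed
    adjacency matrices with 0/1 entries (pred[i, j] = 1 means i -> j)."""
    pred, true = pred.astype(bool), true.astype(bool)
    tp = int((pred & true).sum())
    fp = int((pred & ~true).sum())
    fn = int((~pred & true).sum())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    # SHD: extra edges + missing edges, where a reversed edge
    # (i->j predicted while the true graph has j->i) counts once.
    extra = pred & ~true
    missing = ~pred & true
    reversals = extra & missing.T
    shd = int(extra.sum() + missing.sum() - reversals.sum())
    return {"precision": precision, "recall": recall, "f1": f1, "shd": shd}

# Toy check: true graph 0->1->2; prediction reverses 1->2 and adds 0->2.
true = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
pred = np.array([[0, 1, 1], [0, 0, 0], [0, 1, 0]])
print(causal_graph_metrics(pred, true))  # SHD = 2: one reversal, one extra edge
```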
Analyzed CNNs trained on MNIST and FashionMNIST using neuron ablation, causal tracing, and activation interventions to understand the roles of individual neurons and their interactions in model predictions (see the ablation sketch below).
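A minimal sketch of channel-level neuron ablation via a PyTorch forward hook; the small two-conv network is an illustrative stand-in, not the architecture used in the project.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Stand-in MNIST-sized CNN (assumed architecture, for illustration)."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.fc = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        x = torch.relu(torch.max_pool2d(self.conv1(x), 2))
        x = torch.relu(torch.max_pool2d(self.conv2(x), 2))
        return self.fc(x.flatten(1))

def ablate_channel(model, layer, channel, x):
    """Zero out one channel's activations during the forward pass
    and return the resulting logits."""
    def hook(_module, _inputs, output):
        output = output.clone()
        output[:, channel] = 0.0  # ablate the chosen feature map
        return output             # returned value replaces the layer output
    handle = layer.register_forward_hook(hook)
    try:
        with torch.no_grad():
            return model(x)
    finally:
        handle.remove()           # always detach the hook

model = SmallCNN().eval()
x = torch.randn(1, 1, 28, 28)              # stand-in for an MNIST digit
with torch.no_grad():
    base = model(x)
ablated = ablate_channel(model, model.conv1, channel=3, x=x)
print((base - ablated).abs().max())        # size of the ablation's effect on logits
```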
Developed a flip test to identify critical reasoning steps in LLMs' chain-of-thought outputs and studied the internal representations of these steps using residual stream analysis. Currently working on a mechanism to restore the correct reasoning path when critical steps are corrupted.
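One plausible shape for such a flip test, sketched under assumptions: a step is marked critical if corrupting it (here, deleting it; substitution with a distractor would also fit) flips the model's final answer. The `answer_fn` wrapper is hypothetical, standing in for a call to the actual LLM.

```python
from typing import Callable, List

def flip_test(question: str,
              steps: List[str],
              answer_fn: Callable[[str], str]) -> List[bool]:
    """Return, for each chain-of-thought step, whether corrupting it
    flips the model's final answer (i.e. whether the step is critical).

    answer_fn: hypothetical wrapper that prompts the LLM and returns
    its final answer string.
    """
    baseline = answer_fn(question + "\n" + "\n".join(steps))
    critical = []
    for i in range(len(steps)):
        corrupted = steps[:i] + steps[i + 1:]   # drop step i
        answer = answer_fn(question + "\n" + "\n".join(corrupted))
        critical.append(answer != baseline)     # flipped answer => critical step
    return critical
```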
Generated and evaluated word embeddings using pre-trained GloVe and Word2Vec models, and performed cross-lingual alignment of English and Hindi embeddings via Procrustes analysis (sketched below). Also analyzed race and gender bias in pre-trained embeddings using the WEAT test.
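A minimal sketch of orthogonal Procrustes alignment: given embeddings for n translation pairs, the closed-form SVD solution yields the orthogonal map from the source space to the target space. The synthetic check below uses random stand-in vectors, not the actual GloVe/Word2Vec data.

```python
import numpy as np

def procrustes_align(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """Solve min_W ||src @ W - tgt||_F over orthogonal W (closed form).

    src, tgt: (n, d) embeddings for n translation pairs (e.g. English
    rows in src, Hindi rows in tgt), typically length-normalized first.
    Returns the (d, d) orthogonal map W.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

# Sanity check on synthetic data: recover a known rotation exactly.
rng = np.random.default_rng(0)
en = rng.normal(size=(500, 50))                  # stand-in "English" vectors
rot = np.linalg.qr(rng.normal(size=(50, 50)))[0]  # random orthogonal matrix
hi = en @ rot                                    # synthetic "Hindi" side
W = procrustes_align(en, hi)
print(np.allclose(en @ W, hi, atol=1e-6))        # True: rotation recovered
```

After alignment, translation retrieval reduces to nearest-neighbor search (e.g. by cosine similarity) between `en @ W` and the Hindi embedding matrix.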