Authors:

(1) Goran Muric, InferLink Corporation, Los Angeles, California ([email protected]);

(2) Ben Delay, InferLink Corporation, Los Angeles, California ([email protected]);

(3) Steven Minton, InferLink Corporation, Los Angeles, California ([email protected]).

Abstract and 1 Introduction

1.1 Motivation

2 Related Work and 2.1 Prompting techniques

2.2 In-context learning

2.3 Model interpretability

3 Method

3.1 Generating questions

3.2 Prompting LLM

3.3 Verbalizing the answers and 3.4 Training a classifier

4 Data and 4.1 Clinical trials

4.2 Catalonia Independence Corpus and 4.3 Climate Detection Corpus

4.4 Medical health advice data and 4.5 The European Court of Human Rights (ECtHR) Data

4.6 UNFAIR-ToS Dataset

5 Experiments

6 Results

7 Discussion

7.1 Implications for Model Interpretability

7.2 Limitations and Future Work

Reproducibility

Acknowledgment and References

A Questions used in ICE-T method

7 Discussion

Our study introduces the Interpretable Cross-Examination Technique (ICE-T), a novel prompting method that integrates LLM responses with traditional classification algorithms to improve performance on binary classification tasks. This technique addresses key limitations of zero-shot and few-shot learning by employing a structured, multi-prompt approach that transforms qualitative data into quantifiable metrics, allowing a small, traditional classifier to make decisions effectively. Our results confirm that ICE-T consistently surpasses zero-shot baselines across multiple datasets and metrics, particularly in scenarios where model interpretability is crucial. This prompting strategy also demonstrates the potential for fully automated, high-performing AI systems that are accessible even to non-experts.
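As a rough illustration of this pipeline, the sketch below shows how verbalized yes/no answers to a fixed set of prompts can be turned into a feature vector and passed to a small classifier. The example questions, the `ask_llm` helper, and the choice of logistic regression are assumptions made for illustration, not the exact implementation used in the paper.

```python
# Minimal sketch of the multi-prompt pipeline described above.
# ask_llm(), the example questions, and the classifier choice are
# illustrative assumptions, not the authors' exact implementation.
from sklearn.linear_model import LogisticRegression

QUESTIONS = [
    "Does the text specify a creatinine threshold?",   # hypothetical prompts
    "Is the document written in English?",
]

def ask_llm(document: str, question: str) -> str:
    """Hypothetical wrapper around an LLM API; expected to return 'yes' or 'no'."""
    raise NotImplementedError

def featurize(document: str) -> list[int]:
    # Verbalize each answer and map it to a binary feature.
    answers = [ask_llm(document, q) for q in QUESTIONS]
    return [1 if a.strip().lower().startswith("yes") else 0 for a in answers]

def train_classifier(documents: list[str], labels: list[int]) -> LogisticRegression:
    # A small, traditional classifier is trained on the quantified answers.
    features = [featurize(d) for d in documents]
    clf = LogisticRegression()
    clf.fit(features, labels)
    return clf
```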

The ICE-T method has demonstrated its capability not only to enhance performance over the zero-shot approach, but also to do so with smaller models that might not perform as well in a zero-shot configuration. For example, the improvement on the CREATININE and ENGLISH tasks within the clinical trials data underscores the method's ability to handle domain-specific challenges that require nuanced understanding, which zero-shot configurations typically struggle with.

7.1 Implications for Model Interpretability

A major advantage of the ICE-T approach is its interpretability. By generating a feature vector based on direct responses to structured prompts, experts can trace back the decision-making process, understanding which factors contributed most significantly to the model’s classification. This is particularly valuable in fields like medicine and law, where decision rationale is as important as accuracy. The ability to dissect and validate each step of the model’s reasoning aligns with the growing demand for transparency in AI applications, ensuring that decisions made by AI systems can be audited and trusted.
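For instance, if the downstream classifier is a linear model like the logistic regression in the sketch above, each learned weight maps back to a single question, and a simple report of the weights can show an expert which factors drove a classification. The helper below and its reliance on linear coefficients are illustrative assumptions rather than part of the method as described in the paper.

```python
def explain(clf, questions):
    # Pair each question with its learned weight and sort by influence,
    # so the most decisive prompts appear first.
    ranked = sorted(zip(questions, clf.coef_[0]),
                    key=lambda pair: abs(pair[1]), reverse=True)
    for question, weight in ranked:
        print(f"{weight:+.3f}  {question}")
```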

Moreover, ICE-T is particularly valuable in situations where fine-tuning models is not viable. Fine-tuned models often suffer from a significant drawback: they lack transparency and become “black boxes,” making their decision-making processes obscure. This lack of interpretability is particularly problematic in regulated sectors such as healthcare, law, and finance, where it is imperative to comprehend the basis of each decision. ICE-T overcomes these issues by employing a methodology that remains clear and interpretable, avoiding the opaqueness associated with fine-tuned systems.

This paper is available on arXiv under a CC BY 4.0 Deed (Attribution 4.0 International) license.