JINA EMBEDDINGS 2: 8192-Token General-Purpose Text Embeddings for Long Documents: Appendix

Written by @escholar | Published on 2024/2/23

TL;DR

Text embedding models have emerged as powerful tools for transforming sentences into fixed-sized feature vectors that encapsulate semantic information.

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Michael Günther, michael.guenther@jina.ai;

(2) Jackmin Ong, jackmin.ong@jina.ai;

(3) Isabelle Mohr, isabelle.mohr@jina.ai;

(4) Alaeddine Abdessalem, alaeddine.abdessalem@jina.ai;

(5) Tanguy Abel, tanguy.abel@jina.ai;

(6) Mohammad Kalim Akram, kalim.akram@jina.ai;

(7) Susana Guzman, susana.guzman@jina.ai;

(8) Georgios Mastrapas, georgios.mastrapas@jina.ai;

(9) Saba Sturua, saba.sturua@jina.ai;

(10) Bo Wang, bo.wang@jina.ai;

(11) Maximilian Werk, maximilian.werk@jina.ai;

(12) Nan Wang, nan.wang@jina.ai;

(13) Han Xiao, han.xiao@jina.ai.

Table of Links

A Appendix

Table 4: Detailed Performance on the MTEB Classification Tasks

Table 5: Detailed Performance on the MTEB Clustering Tasks

Table 6: Detailed Performance on the MTEB Summarization Tasks

Table 7: Detailed Performance on the MTEB Pair Classification Tasks

Table 8: Detailed Performance on the MTEB ReRanking Tasks

Table 9: Detailed Performance on the MTEB Retrieval Tasks

Table 10: Detailed Performance on the MTEB STS Tasks
