![](https://pitti-backend-assets.ams3.digitaloceanspaces.com/huggingface_text_clustering_3d13490e12.png?w=3840&q=75)
Hugging Face : Text Clustering
Date : 2024-01-12
Description
This summary was drafted with mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf
The Text Clustering repository by Hugging Face is a minimal codebase for clustering texts. Its pipeline builds on standard, existing tools: Sentence Transformers, UMAP, Faiss, Plotly, Matplotlib, and Scikit-learn. The pipeline is made of distinct blocks that can each be customized, and the whole thing runs in a few minutes on a consumer laptop. The repository also includes an example of clustering and topic labeling applied to the AutoMathText dataset, following Cosmopedia's web labeling approach.
GitHub repo here
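To make the "distinct blocks" idea concrete, here is a toy, dependency-free sketch of an embed, cluster, label pipeline. It is not the repository's code. Every function name is hypothetical, a bag-of-words vector stands in for a Sentence Transformers embedding, a greedy similarity threshold stands in for the real clustering step, and the most frequent words stand in for topic labeling. The UMAP projection and plotting blocks are left out entirely.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list, only there to keep the toy vectors meaningful.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def embed(texts):
    """Block 1 (stand-in for a sentence encoder): unit-length bag-of-words vectors."""
    vectors = []
    for text in texts:
        counts = Counter(tokenize(text))
        norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
        vectors.append({w: c / norm for w, c in counts.items()})
    return vectors


def cosine(u, v):
    """Dot product of two sparse unit vectors, which equals their cosine similarity."""
    return sum(x * v.get(w, 0.0) for w, x in u.items())


def cluster(vectors, threshold=0.3):
    """Block 2 (stand-in for the clustering step): greedy leader clustering.

    Each vector joins the first cluster whose leader is similar enough,
    otherwise it becomes the leader of a new cluster.
    """
    leaders, labels = [], []
    for vec in vectors:
        for idx, leader in enumerate(leaders):
            if cosine(vec, leader) >= threshold:
                labels.append(idx)
                break
        else:
            leaders.append(vec)
            labels.append(len(leaders) - 1)
    return labels


def label_clusters(texts, labels, top_k=2):
    """Block 3 (stand-in for topic labeling): most frequent tokens per cluster."""
    counts = defaultdict(Counter)
    for text, lab in zip(texts, labels):
        counts[lab].update(tokenize(text))
    return {lab: [w for w, _ in c.most_common(top_k)] for lab, c in counts.items()}


def run_pipeline(texts, threshold=0.3):
    """Chain the blocks; any one of them can be swapped without touching the others."""
    vectors = embed(texts)
    labels = cluster(vectors, threshold)
    return labels, label_clusters(texts, labels)


texts = [
    "solve the quadratic equation",
    "quadratic equation roots",
    "train a neural network",
    "neural network training loss",
]
labels, topics = run_pipeline(texts)
print(labels)
print(topics)
```

Because each block only depends on the output of the previous one, replacing `embed` with a real sentence encoder or `cluster` with a Scikit-learn estimator would not require changes elsewhere in the sketch.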