0 votes
0 answers
56 views

I'm trying to run Flux2 inference on 2 GPUs as follows: import torch from diffusers import Flux2Pipeline from accelerate import PartialState import argparse from pathlib import Path def main(): parser ...
asked by Franck Dernoncourt
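A minimal single-process sketch for the question above. The repo id is an assumption, and device_map="balanced" asks diffusers/accelerate to shard the pipeline's components across the visible GPUs instead of replicating it with PartialState:

```python
import torch
from diffusers import Flux2Pipeline

# Shard pipeline components across both GPUs
# (the repo id "black-forest-labs/FLUX.2-dev" is an assumption)
pipe = Flux2Pipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev",
    torch_dtype=torch.bfloat16,
    device_map="balanced",
)
image = pipe("a photo of an astronaut riding a horse",
             num_inference_steps=28).images[0]
image.save("out.png")
```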
0 votes
0 answers
26 views

I am now trying to use FSDP with the Hugging Face transformers Trainer. The training script is something like train_dataset = Mydataset(...) args = TrainingArguments(...) model = LlamaForCausalLM....
asked by MR_Xhao
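A hedged sketch of wiring FSDP through TrainingArguments for the question above; the fsdp_config keys follow the Trainer docs, and the layer class to wrap is model-specific (LlamaDecoderLayer for Llama):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    # "full_shard auto_wrap" shards parameters, gradients, and optimizer state
    fsdp="full_shard auto_wrap",
    fsdp_config={"transformer_layer_cls_to_wrap": ["LlamaDecoderLayer"]},
)
# Launch with: torchrun --nproc_per_node=<num_gpus> train.py
```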
0 votes
0 answers
42 views

I'm building an Eclipse plugin for code completion using the Hugging Face API. My plugin sends a prompt to the endpoint: https://router.huggingface.co/hf-inference/v1/chat/completions I replaced the ...
asked by kiruba T
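For reference, the endpoint above speaks the OpenAI-compatible chat-completions schema; a minimal Python sketch, where the model id is a placeholder and the token needs Inference permissions:

```python
import requests

API_URL = "https://router.huggingface.co/hf-inference/v1/chat/completions"
headers = {"Authorization": "Bearer hf_xxx"}  # your Hugging Face token

payload = {
    "model": "some-org/some-code-model",  # placeholder model id
    "messages": [{"role": "user", "content": "Complete: def add(a, b):"}],
    "max_tokens": 128,
}
resp = requests.post(API_URL, headers=headers, json=payload)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```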
2 votes
1 answer
63 views

Question: I'm running into an issue with the transformers library, specifically with the pipeline initialization. When I access the base_model attribute of a LlamaForCausalLM model, it seems to ...
asked by Hank Wang
0 votes
0 answers
78 views

I am currently experimenting with modifying the KV cache of the LLaVA model in order to perform controlled interventions during generation (similar to cache-steering methods in recent research). The ...
asked by Pulkit Mittal
1 vote
0 answers
163 views

I need to run a series of pre-trained, fine-tuned models from Hugging Face in a Jupyter notebook. I have updated to the latest version of both PyTorch and Transformers, but when I run the code from ...
asked by Alex Colville
1 vote
1 answer
79 views

I'm trying to implement Speech-to-Text transcription in my Swift app using Hugging Face's swift-transformers package to run Whisper models locally. I've added the package to my Xcode project, but when ...
asked by Zaid
0 votes
1 answer
74 views

I am trying to run Mistral-7B-Instruct-v0.2. Each run is PROMPT + details[i]. PROMPT has instructions on how to generate JSON based on details. As the prefix part of each input is the same, kind of like a ...
asked by acdhemtos
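A sketch of the prefix-caching pattern the question above is after, following the cache-reuse recipe in the transformers docs: prefill the shared PROMPT once, then hand each generate() call a copy of that cache (`details` is assumed to be the list from the question):

```python
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

PROMPT = "...shared JSON instructions..."  # placeholder
prompt_ids = tok(PROMPT, return_tensors="pt").input_ids.to(model.device)

# Prefill the shared prefix once
prefix_cache = DynamicCache()
with torch.no_grad():
    model(prompt_ids, past_key_values=prefix_cache, use_cache=True)

for detail in details:  # `details` as in the question
    inputs = tok(PROMPT + detail, return_tensors="pt").to(model.device)
    # generate() mutates the cache, so give each run its own copy
    out = model.generate(**inputs,
                         past_key_values=copy.deepcopy(prefix_cache),
                         max_new_tokens=256)
```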
0 votes
0 answers
103 views

I have Python 3.12.3 on an Ubuntu server. I tried to install transformers, tokenizers, datasets and accelerate to use the Seq2SeqTrainer in transformers. I used a virtual environment for the ...
asked by Raptor
0 votes
0 answers
36 views

I'm fine-tuning T5-small using PyTorch Lightning and encountering a strange issue during validation and test steps. The Problem: During validation_step and test_step, model.generate() consistently ...
asked by GeraniumCat
3 votes
0 answers
112 views

I have encountered a particular problem while executing a function from Hugging Face's transformers library on an Intel GPU wheel of torch. Since I am doing something I normally shouldn't be ...
asked by Logarithmnepnep
1 vote
0 answers
68 views

My proxy goal is to change LoRA from h = (W + BA)x to h = (W + BAP)x. Preliminary code is attached for your reference. My actual goal is to train a model with the following loss: Θ̃ = argmin_Δ̂ ‖f_(...
asked by Jason Rich Darmawan
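One way to realize h = (W + BAP)x is a custom adapter layer that applies a fixed projection P before the LoRA down-projection A. A minimal sketch, not the peft API, with P assumed square:

```python
import torch
import torch.nn as nn

class LoRAWithProjection(nn.Module):
    """Computes h = (W + B A P) x with W frozen and P fixed."""

    def __init__(self, base_linear: nn.Linear, r: int, P: torch.Tensor):
        super().__init__()
        self.base = base_linear                       # frozen W
        in_f, out_f = base_linear.in_features, base_linear.out_features
        self.A = nn.Parameter(torch.randn(r, in_f) * 0.02)
        self.B = nn.Parameter(torch.zeros(out_f, r))  # B starts at zero, as in LoRA
        self.register_buffer("P", P)                  # fixed (in_f, in_f) projection

    def forward(self, x):
        # x: (..., in_f) -> P x -> A (P x) -> B (A (P x)), added to W x
        return self.base(x) + (x @ self.P.T) @ self.A.T @ self.B.T
```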
1 vote
2 answers
181 views

I am trying to run the example code for running inference with the Qwen/Qwen3-VL-4B-Instruct model: from transformers import Qwen3VLForConditionalGeneration, AutoProcessor # default: Load the ...
asked by Franck Dernoncourt
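For the question above, a condensed version of the standard Qwen3-VL inference recipe, assuming a recent transformers whose apply_chat_template handles multimodal messages (the image URL is a placeholder):

```python
from transformers import Qwen3VLForConditionalGeneration, AutoProcessor

model_id = "Qwen/Qwen3-VL-4B-Instruct"
model = Qwen3VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

messages = [{"role": "user", "content": [
    {"type": "image", "image": "https://example.com/cat.png"},  # placeholder
    {"type": "text", "text": "Describe this image."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

out = model.generate(**inputs, max_new_tokens=64)
new_tokens = out[:, inputs["input_ids"].shape[1]:]  # strip the prompt
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```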
-1 votes
2 answers
99 views

I’m trying to use LangChain’s Hugging Face integration to chat with the model TinyLlama/TinyLlama-1.1B-Chat-v1.0 for the very first time, but I’m getting a StopIteration error when calling .invoke(). ...
asked by forstudy
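A minimal sketch of the intended wiring, assuming the langchain-huggingface package: run the model locally through a transformers pipeline and wrap it in ChatHuggingFace rather than calling a remote endpoint:

```python
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 128},
)
chat = ChatHuggingFace(llm=llm)  # applies the model's chat template
print(chat.invoke("Hello, who are you?").content)
```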
9 votes
2 answers
2k views

Recently I have started to get some strange errors, for example RemoteEntryNotFoundError: 404 Client Error. (Request ID: Root=1-68e82630-293b962044bc3e6c1453ec73;43987a97-e033-4590-951e-829a3c87d2cb) ...
asked by Алиса Алексеевна
3 votes
2 answers
194 views

I am working with OmniEmbed model (https://huggingface.co/Tevatron/OmniEmbed-v0.1), which is built on Qwen2.5 7B. My goal is to get a multimodal embedding for images and videos. I have the following ...
asked by n_arch
0 votes
0 answers
91 views

After failing to make the QwenImageEditPlus run (https://huggingface.co/spaces/discord-community/README/discussions/9#68d260e32053323e6bfab30c), I tried a different approach (thanks to all the example ...
asked by Siladittya
0 votes
0 answers
103 views

When the program starts to initialize the pipeline object, an unexpected error is thrown: [rank0]: Traceback (most recent call last): [rank0]: File "/root/anaconda3/envs/polar/lib/python3.12/site-...
asked by Aerith
2 votes
1 answer
153 views

I'm building a simple agent using LangChain that leverages a locally-hosted HuggingFace model (gpt-oss-20b). I'm using the transformers pipeline and wrapping it in LangChain's HuggingFacePipeline. The ...
asked by meysam
3 votes
0 answers
59 views

I am trying to deploy a fine-tuned Mistral-7B model on an Azure ML Online Endpoint. The deployment repeatedly fails during the init() phase of the scoring script with a huggingface_hub.errors....
asked by User
0 votes
1 answer
87 views

I am getting the following error when running training, using the TRL library in the following HuggingFace space: vishaljoshi24/trl-4-dnd. My SDK is Docker and as far as I'm aware there are not ...
asked by Vishal Joshi
-1 votes
1 answer
597 views

I'm a bit stumped on an issue that just popped up. My code, which uses the transformers library, was running perfectly fine until I tried to install a CUDA-compatible version of PyTorch. Everything ...
asked by meysam
1 vote
1 answer
92 views

raise KeyError(f"Cache only has {len(self)} layers, attempted to access layer with index {layer_idx}") KeyError: 'Cache only has 0 layers, attempted to access layer with index 0' When I try ...
asked by OctSky
0 votes
0 answers
226 views

Description: I am trying to install the Hugging Face Transformers version that supports the Qwen2.5-Omni model. According to the official docs, the correct tag to install is v4.51.3-Qwen2.5-Omni-...
asked by Promit Dey Sarker Arjan
1 vote
0 answers
65 views

I'm fine-tuning a CrossEncoder model with LoRA using sentence-transformers library on Kaggle (12-hour limit). I need to resume training from a checkpoint, but I'm getting a ValueError when trying to ...
asked by Tuan Anh Pham
0 votes
0 answers
61 views

I trained a Qwen model on my own dataset. Now I need to evaluate my trained model using the loss function, but I don’t know how to do it. I saw examples for other metrics such as accuracy and ...
asked by Kathi Meyer
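For the question above: a causal LM returns its cross-entropy loss when the input ids are also passed as labels, so a held-out loss (and perplexity) can be computed directly. A sketch with a hypothetical checkpoint path:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "path/to/trained-qwen"  # hypothetical checkpoint directory
tok = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path)
model.eval()

enc = tok("a held-out evaluation example", return_tensors="pt")
with torch.no_grad():
    # labels=input_ids makes the model return the average per-token CE loss
    out = model(**enc, labels=enc["input_ids"])

print("loss:", out.loss.item())
print("perplexity:", torch.exp(out.loss).item())
```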
0 votes
0 answers
272 views

I have this code: import os import torch from datasets import Dataset from transformers import ( AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments, ) from peft ...
asked by Santhosh
1 vote
2 answers
399 views

For my particular project, it would be very helpful to know how many tokens the BGE-M3 embedding model would break a string down into before I embed the text. I could embed the string and count the ...
asked by ManBearPigeon
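Counting tokens without embedding only needs the model's tokenizer; a minimal sketch:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("BAAI/bge-m3")
text = "How many tokens will BGE-M3 see for this string?"
# encode() includes the special tokens the model adds around the text
print(len(tok.encode(text)))
```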
2 votes
0 answers
90 views

I have gpt-oss-20b model weights locally. What are the necessary steps to run a 20B model using transformers? The files I downloaded include multiple safetensors files and also a .bin file. Which one of ...
asked by miky
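For the question above: from_pretrained() pointed at the local directory resolves the sharded *.safetensors files through their index file automatically (and prefers them over a .bin). A sketch with a placeholder path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "/path/to/gpt-oss-20b"  # placeholder local directory
tok = AutoTokenizer.from_pretrained(path)
# device_map="auto" spreads the 20B weights over available GPUs/CPU
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype="auto", device_map="auto"
)

inputs = tok("Hello", return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```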
2 votes
1 answer
109 views

I need to stop a Hugging Face pipeline operation. I tried to achieve this using a method from the following question, but it didn't work. I set the breakpoint on the line return flag and expected ...
asked by Intolighter
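One way to abort generation mid-run is a custom StoppingCriteria that is checked after every token; a sketch using model.generate() directly (the same criteria list can be forwarded through a text-generation pipeline's generate kwargs):

```python
import threading
from transformers import StoppingCriteria, StoppingCriteriaList

stop_event = threading.Event()  # set this from another thread to abort

class StopOnEvent(StoppingCriteria):
    def __call__(self, input_ids, scores, **kwargs) -> bool:
        # returning True stops generation after the current token
        return stop_event.is_set()

criteria = StoppingCriteriaList([StopOnEvent()])
# usage: model.generate(**inputs, stopping_criteria=criteria)
```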
0 votes
0 answers
164 views

I'm trying to use optuna to find good hyperparameters for a fine-tuning task I'm doing with some different language models. My actual code is more complex, but here's a MWE: import torch import optuna ...
asked by Jigsaw
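A minimal shape for the search loop above, where make_model(), train_ds and eval_ds are hypothetical placeholders; each trial builds a fresh Trainer and reports eval_loss back to optuna:

```python
import optuna
from transformers import Trainer, TrainingArguments

def objective(trial):
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    args = TrainingArguments(output_dir=f"out/trial-{trial.number}",
                             learning_rate=lr,
                             num_train_epochs=1,
                             report_to="none")
    trainer = Trainer(model=make_model(),      # hypothetical model factory
                      args=args,
                      train_dataset=train_ds,  # hypothetical datasets
                      eval_dataset=eval_ds)
    trainer.train()
    return trainer.evaluate()["eval_loss"]

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=10)
```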
0 votes
0 answers
49 views

I am using the llama-8b-llava model. I have made some modifications to the model, which are non-structural and do not introduce any parameters. During the model loading process, I used the torch....
asked by ILOT
2 votes
1 answer
1k views

I'm on Python 3.11.13 with these versions: huggingface-hub 0.31.4, transformers 4.52.4, sentence-transformers 5.1.0. And this OS (Mac): Darwin G9XFDK7K6J 24....
asked by Clovis
1 vote
2 answers
172 views

I am reading about text embeddings in LLMs from the book Hands-On Large Language Models. It mentions the following: from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator from ...
asked by venkysmarty
1 vote
0 answers
807 views

I’m trying to load gpt-oss-20b locally using Hugging Face transformers with CPU only. Minimal code: from transformers import pipeline model_path = "/mnt/d/Projects/models/gpt-oss-20b" pipe = ...
asked by mindlesscoding
0 votes
0 answers
92 views

I'm trying to perform LoRA fine-tuning using the transformers, trl, and peft libraries in a Google Colab environment with a T4 GPU. My goal is to load the model in 8-bit using bitsandbytes. I ...
asked by ays
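A compact sketch of the usual 8-bit + LoRA setup on a T4; the model id and target module names are assumptions (target modules vary by architecture):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-model",  # placeholder model id
    quantization_config=bnb,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # required before k-bit fine-tuning

lora = LoraConfig(r=16, lora_alpha=32,
                  target_modules=["q_proj", "v_proj"],  # architecture-specific
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```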
2 votes
3 answers
172 views

I have a FastAPI endpoint (/generateStreamer) that generates responses from an LLM model. I want to stream the output so users can see the text as it’s being generated, rather than waiting for the ...
asked by sander
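The usual pattern for the question above is TextIteratorStreamer: generation runs in a background thread while the streamer is handed to StreamingResponse as an iterator. A sketch with a placeholder model id:

```python
from threading import Thread
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

app = FastAPI()
tok = AutoTokenizer.from_pretrained("some-org/some-model")  # placeholder
model = AutoModelForCausalLM.from_pretrained("some-org/some-model")

@app.get("/generateStreamer")
def generate_streamer(prompt: str):
    streamer = TextIteratorStreamer(tok, skip_prompt=True,
                                    skip_special_tokens=True)
    inputs = tok(prompt, return_tensors="pt")
    # generate() blocks, so run it in a thread and stream tokens as they arrive
    Thread(target=model.generate,
           kwargs={**inputs, "streamer": streamer,
                   "max_new_tokens": 256}).start()
    return StreamingResponse(streamer, media_type="text/plain")
```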
1 vote
0 answers
53 views

I'm trying to fine-tune Hugging Face BLIP (Bootstrapped Language-Image Pretraining) to classify pizza boxes as either recyclable (clean) or non-recyclable (contaminated) by generating captions that ...
asked by Wow Wow
3 votes
1 answer
233 views

Why do non-identical inputs to ProtBERT generate identical embeddings when they are not whitespace-separated? I've looked at answers here etc., but they appear to be different cases where the slicing of the out....
asked by Maximilian Press
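A likely explanation worth checking: ProtBERT's vocabulary is per-residue, so an unseparated sequence tokenizes to a single [UNK] and all such inputs collapse to the same embedding. A quick check, assuming the Rostlab/prot_bert checkpoint:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Rostlab/prot_bert")
print(tok.tokenize("M K T A"))  # expected: one token per residue
print(tok.tokenize("MKTA"))     # expected: ['[UNK]'] -- whole word is out-of-vocabulary
```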
0 votes
0 answers
47 views

I'm trying to use the TinyLlama/TinyLlama-1.1B-Chat-v1.0 model from Hugging Face with LangChain using the langchain_huggingface integration. My goal is to get a simple response from the model using ...
asked by Simran Dalvi
0 votes
0 answers
70 views

I'm trying to compute a measure of semantic similarity between titles of scientific publications using SPECTER2, but the model performs poorly. Here is my code: from transformers import AutoTokenizer ...
asked by robertspierre
0 votes
1 answer
318 views

I'm trying to load in a huggingface sentence transformers model like this: from sentence_transformers import SentenceTransformer model = SentenceTransformer("all-MiniLM-L6-v2") ##I've also ...
asked by simulacrum
0 votes
0 answers
40 views

I’m using Hugging Face’s T5ForConditionalGeneration and want to add a per‑token NE‑type embedding alongside the standard token embeddings. tok_embeds = model.encoder.embed_tokens(input_ids) ne_embeds ...
asked by Analu Ramos
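Continuing the question's snippet: the summed embeddings can be fed to the model through inputs_embeds instead of input_ids. A sketch where the NE-type vocabulary size and the per-token ne_ids are hypothetical:

```python
import torch
import torch.nn as nn
from transformers import T5ForConditionalGeneration, AutoTokenizer

tok = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

num_ne_types = 5  # hypothetical number of NE tags
ne_embed = nn.Embedding(num_ne_types, model.config.d_model)

input_ids = tok("Alice lives in Paris", return_tensors="pt").input_ids
ne_ids = torch.zeros_like(input_ids)  # hypothetical per-token NE-type ids

tok_embeds = model.encoder.embed_tokens(input_ids)
inputs_embeds = tok_embeds + ne_embed(ne_ids)  # add an NE-type vector per token

out = model.generate(inputs_embeds=inputs_embeds, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```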
0 votes
1 answer
241 views

I'm trying to load the Qwen2.5-VL-7B-Instruct model from Hugging Face with 4-bit weight-only quantization using TorchAoConfig (similar to how it's mentioned in the documentation here), but I'm getting ...
asked by Sankalp Dhupar
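For comparison with the failing code, the shape of the documented TorchAo recipe (requires the torchao package; the class name Qwen2_5_VLForConditionalGeneration assumes a recent transformers):

```python
import torch
from transformers import (AutoProcessor, Qwen2_5_VLForConditionalGeneration,
                          TorchAoConfig)

model_id = "Qwen/Qwen2.5-VL-7B-Instruct"
quant = TorchAoConfig("int4_weight_only", group_size=128)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=quant,  # 4-bit weight-only via torchao
)
processor = AutoProcessor.from_pretrained(model_id)
```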
1 vote
1 answer
27 views

I'm trying to import modules from bert_opinion.py and post.py after downloading them from the Hugging Face Hub using hf_hub_download, as described for my chosen model on the Hugging Face website. Here'...
asked by Muhammad Abdullah
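hf_hub_download returns the local cache path of the file, so its parent directory must be put on sys.path before the import. A sketch with a placeholder repo id:

```python
import sys
from pathlib import Path
from huggingface_hub import hf_hub_download

repo_id = "some-user/some-model"  # placeholder: the repo from the question
path = hf_hub_download(repo_id=repo_id, filename="bert_opinion.py")
hf_hub_download(repo_id=repo_id, filename="post.py")  # cached next to it

sys.path.insert(0, str(Path(path).parent))  # make the directory importable
import bert_opinion  # noqa: E402
import post          # noqa: E402
```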
0 votes
0 answers
69 views

I'm using this code for fine-tuning a LoRA model: bnb_config = BitsAndBytesConfig( load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16, bnb_4bit_use_double_quant=True, ...
asked by Keithx
0 votes
0 answers
70 views

I want to use text_embeddings and combine them with the output of an intermediate layer of the text_encoder of CLIP. My input to the text_encoder is a learnable prompt embedding which is initialized ...
1 vote
1 answer
67 views

I downloaded an old custom model based on Llava that runs on transformers 4.31.0 and I tried to use it together with a Qwen model which uses transformers 4.53.1. After updating transformers, the Llava ...
asked by Raymond Li
0 votes
0 answers
169 views

I'm working on a browser-based audio transcription app using Transformers.js by Xenova. I'm trying to transcribe a .wav file selected by the user using the following code: import { pipeline } from '@...
asked by piyush
1 vote
1 answer
101 views

I am working on a Huggingface transformers EncoderDecoderModel consisting of a frozen BERT-Encoder (answerdotai-ModernBERT-base) and a trainable GPT2-Decoder. Due to the different architectures for ...
asked by soosmann
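A minimal sketch of the frozen-encoder setup described above (whether ModernBERT is fully supported by EncoderDecoderModel's cross-attention wiring is an assumption to verify):

```python
from transformers import EncoderDecoderModel

model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "answerdotai/ModernBERT-base", "gpt2"
)

# Freeze the encoder; only the GPT-2 decoder (incl. cross-attention) trains
for p in model.encoder.parameters():
    p.requires_grad = False

# GPT-2 has no pad token, so the seq2seq config needs these set explicitly
model.config.decoder_start_token_id = model.config.decoder.bos_token_id
model.config.pad_token_id = model.config.decoder.eos_token_id
```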
