Tags · ggml-org/llama.cpp

b5223

server : Prefilling assistant message in openai compatible API (#13174)

* Prefilling assistant message in openai compatible API

* fixed indentation

* fixed code convention

* simplify method usage

* no more than one assistant message at end of messages

* merge checks into prefill code

* Update examples/server/utils.hpp

---------

Co-authored-by: matteo <matteo@naspc.lan>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

Apr 29, 2025
e2e1ddb
zip
tar.gz
Downloads

b5222

sampling : when top-k <= 0 -> noop (#13173)

ggml-ci

Apr 29, 2025
d9d398f
zip
tar.gz
Downloads

b5221

llama-bench: fixed size of fields to correctly map to values (#13183)

Apr 29, 2025
5a63980
zip
tar.gz
Downloads

b5220

CUDA: fix non-cont. inputs for batched mat mul (#13155)

Apr 29, 2025
cdf7658
zip
tar.gz
Downloads

b5219

llama : llm_type order by size (#13177)

Apr 29, 2025
7d3af70
zip
tar.gz
Downloads

b5218

mtmd : add qwen2vl and qwen2.5vl (#13141)

* llava : add clip_n_output_tokens, deprecate clip_n_patches

* mtmd : add qwen2vl and qwen2.5vl

* decode_embd_batch::set_position_...

* working version

* deprecate llama-qwen2vl-cli

* correct order W, H of clip_embd_nbytes_by_img

* edit existing line in hot topics

Apr 29, 2025
00e3e5a
zip
tar.gz
Downloads

b5217

llama : set qwen3 model type sizes (#13175)

Apr 29, 2025
e98b369
zip
tar.gz
Downloads

b5216

llama-graph : fix text position for mrope (#13159)

* llama-graph : fix text position for mrope

* fix typo

* explicitly set 4th dim in the loop

Apr 29, 2025
b6ce743
zip
tar.gz
Downloads

b5215

model : Nomic Embed Text V2 with Mixture-of-Experts (MoE) architecture (

#12466)

* Nomic Embed Text V2 with Mixture-of-Experts (MoE) architecture

- Adds MoE-based embedding model supporting multilingual embeddings.
- Selects architecture variant based on hyperparameter detection (MoE layers).
- Removes unnecessary subclass initialization checks for clarity.

https://www.nomic.ai/blog/posts/nomic-embed-text-v2

Co-authored-by: Jared Van Bortel <jared@nomic.ai>

* fix tokenizer

* don't rename this tensor

---------

Co-authored-by: Jared Van Bortel <jared@nomic.ai>

Apr 28, 2025
5f5e39e
zip
tar.gz
Downloads

b5214

clip : fix model size display (#13153)

Apr 28, 2025
eaea325
zip
tar.gz
Downloads

PreviousNext

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

b5223

b5222

b5221

b5220

b5219

b5218

b5217

b5216

b5215

b5214

Tags: ggml-org/llama.cpp