Model Gallery

Discover and install AI models from our curated collection

76 models available
1 repository

qwopus3.6-27b-coder-mtp
🪐 Qwopus3.6-27B-v2 (SFT Release)
Reasoning-Enhanced Dense Language Model Fine-Tuned on Qwen3.6-27B
🧬 Trace Inversion & Negentropy · 🧠 27B Parameters · 🔥 3-Stage Curriculum SFT · 🛠️ Vision & Tool-use Support
💡 What is Qwopus3.6-27B-v2?
🪐 Qwopus3.6-27B-v2 is a reasoning-enhanced dense language model built on top of Qwen3.6-27B. By leveraging a multi-stage curriculum learning pipeline and augmented with Trace Inversion datasets (claude-opus-4.6/4.7-traceInversion), it reverse-engineers the compressed "Reasoning Bubbles" of commercial LLMs into structured, step-by-step synthetic reasoning traces, successfully eliminating logical shortcuts and knowledge fractures.
🧩 Structured Reasoning: Injects reconstructed deep CoT chains to eliminate logical shortcuts via Trace Inversion.
🪶 Style Consistency: Enforces strict constraints on the format and convergence of <think> tags.
🔁 Distillation Alignment: Ensures high-quality cross-source SFT data alignment to narrow the capacity gap.
⚡ RL Scalability: Sets up a stable formatting pipeline optimized for downstream Reinforcement Learning (RL).
## 💡 1. Base Model, Training Library & Cooperation ...

Repository: localai
License: apache-2.0

qwopus3.6-27b-v2-mtp
🪐 Qwopus3.6-27B-v2-MTP (MTP Release)
Multi-Token Prediction reasoning model fine-tuned from Qwen3.6-27B
🧬 Trace Inversion & Negentropy · 🧠 27B Parameters · ⚡ Speculative Decoding · 🛠️ Coding / DevOps / Math
💡 What is Qwopus3.6-27B-v2-MTP?
🪐 Qwopus3.6-27B-v2-MTP is a speed-oriented reasoning release built on top of Qwen3.6-27B. It keeps the Qwopus line's focus on reconstructed reasoning traces, coding discipline, DevOps procedures, and mathematical derivations, while adding Multi-Token Prediction for faster generation. The goal is simple: preserve the depth and structure of a 27B reasoning model while making real interactive use noticeably faster.
⚡ MTP Decoding: Auxiliary future-token prediction improves throughput on long reasoning, code, math, and strict-format prompts.
🧩 Structured Reasoning: Inherits the Qwopus training recipe built around reconstructed step-by-step reasoning trajectories.
🧪 GB10 Tested: Validated on a 30-question local benchmark across Logic, Coding, DevOps, Math, and Edge tasks.
🚀 Practical Speed: Designed for workflows where strong answers matter, but waiting several extra minutes per task does not.
...

Repository: localai
License: apache-2.0

qwen3.6-27b-heretic-uncensored-finetune-neo-code-di-imatrix-max
Qwen3.6-27B-Heretic2-Uncensored-Finetune-Thinking
Yes... fully uncensored AND fine tuned lightly. Freedom and brainpower. Trained on different Heretic base, with different KLD/Refusals. Model fine tune was used to finalize and "firm up" Heretic / uncensored changes. The goal here was light, minor fixes rather than full / heavy fine tune. That being said, the tuning still raised critical metrics.
This is Version 2, using "trohrbaugh" Heretic, which has a lower refusal rate, and tuning bumped up the metrics a bit more too. This has also positively impacted "NEO-Coder Di-Matrix" (dual imatrix) GGUF quants as well (vs heretic/non heretic too).
https://huggingface.co/DavidAU/Qwen3.6-27B-Heretic-Uncensored-FINETUNE-NEO-CODE-Di-IMatrix-MAX-GGUF
```
IN HOUSE BENCHMARKS [by Nightmedia]:
arc-c arc/e boolq hswag obkqa piqa wino

Qwen3.6-27B-Heretic2-Uncensored-Finetune-Thinking
mxfp8 0.673,0.846,0.905... [instruct mode]

Qwen3.6-27B-Heretic-Uncensored-Finetune-Thinking
mxfp8 0.669,0.835,0.906,... [instruct mode]

BASE UNTUNED MODEL:
Qwen3.6-27B HERETIC (by llmfan46) [instruct mode]
mxfp8 0.644,0.788,0.902,...
...
```

Repository: localai
License: apache-2.0
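To put a number on the "raised critical metrics" claim in the entry above, the quoted scores can be compared column by column. The following is a minimal Python sketch that uses only the first three scores shown for each model; the remaining columns are elided in the listing and are left out. It assumes the three values map to the first three header columns (arc-c, arc/e, boolq), and the dictionary keys and function name are illustrative.

```python
# First three benchmark columns as quoted in the listing.
# Assumption: the values map to arc-c, arc/e, boolq in header order.
# The remaining columns are elided in the source and deliberately omitted.
COLUMNS = ["arc-c", "arc/e", "boolq"]

SCORES = {
    "Heretic2-Finetune": [0.673, 0.846, 0.905],
    "Heretic-Finetune": [0.669, 0.835, 0.906],
    "Base HERETIC (untuned)": [0.644, 0.788, 0.902],
}


def deltas_vs_base(scores, base_key):
    """Return per-column score differences of each model against the base."""
    base = scores[base_key]
    return {
        name: [round(value - ref, 3) for value, ref in zip(values, base)]
        for name, values in scores.items()
        if name != base_key
    }


if __name__ == "__main__":
    for name, diff in deltas_vs_base(SCORES, "Base HERETIC (untuned)").items():
        print(name, dict(zip(COLUMNS, diff)))
```

On these three columns the Version 2 fine-tune is ahead of the untuned base by 0.029, 0.058, and 0.003 respectively.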

carnice-v2-27b
# Carnice-V2-27B for Hermes Agent

Carnice-V2-27B is a full merged BF16 SFT of `Qwen/Qwen3.6-27B` for Hermes-style agent traces. This repository contains the standalone merged model weights, not only a LoRA adapter.

## BF16 Transformers Loading Fix

The BF16 safetensors were republished with corrected `Qwen3_5ForConditionalGeneration` tensor prefixes. The original merge artifact accidentally serialized an extra Unsloth wrapper prefix, which caused direct HF Transformers loads to report the real weights as unexpected keys and initialize expected layers randomly. GGUF files were not affected because the GGUF conversion path normalized those prefixes.

## Benchmarks

The benchmark artifact bundle is included under `benchmarks/`. It contains the rendered graph, extracted `metrics.json`, benchmark scripts, and raw result files used to make the chart.

Scope note: the IFEval run is a short `limit=20` A/B smoke benchmark, not an official full leaderboard score. Held-out loss/perplexity is the exact assistant-only training-format validation metric from the SFT script. The raw BFCL two-case smoke files are included for auditability, but they are too small to use as a model-quality claim. ...

Repository: localai
License: apache-2.0

qwopus3.6-27b-v1-preview
# Qwen3.6-27B

> [!Note]
> This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format.
>
> These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

Following the February release of the Qwen3.5 series, we're pleased to share the first open-weight variant of Qwen3.6. Built on direct feedback from the community, Qwen3.6 prioritizes stability and real-world utility, offering developers a more intuitive, responsive, and genuinely productive coding experience.

## Qwen3.6 Highlights

This release delivers substantial upgrades, particularly in

- **Agentic Coding:** the model now handles frontend workflows and repository-level reasoning with greater fluency and precision.
- **Thinking Preservation:** we've introduced a new option to retain reasoning context from historical messages, streamlining iterative development and reducing overhead.

For more details, please refer to our blog post Qwen3.6-27B.

## Model Overview ...

Repository: localai
License: apache-2.0

qwen3.6-27b
# Qwen3.6-27B

> [!Note]
> This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format.
>
> These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.

Following the February release of the Qwen3.5 series, we're pleased to share the first open-weight variant of Qwen3.6. Built on direct feedback from the community, Qwen3.6 prioritizes stability and real-world utility, offering developers a more intuitive, responsive, and genuinely productive coding experience.

## Qwen3.6 Highlights

This release delivers substantial upgrades, particularly in

- **Agentic Coding:** the model now handles frontend workflows and repository-level reasoning with greater fluency and precision.
- **Thinking Preservation:** we've introduced a new option to retain reasoning context from historical messages, streamlining iterative development and reducing overhead.

For more details, please refer to our blog post Qwen3.6-27B.

## Model Overview ...

Repository: localai
License: apache-2.0

qwen3.5-27b-claude-4.6-opus-reasoning-distilled-heretic-i1

Repository: localai
License: apache-2.0

qwen3.5-27b-claude-4.6-opus-reasoning-distilled-i1
Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-i1-GGUF - A GGUF quantized model optimized for local inference. Specialized for reasoning and chain-of-thought tasks. Based on Qwen 3.5 architecture with enhanced language understanding. Available in multiple quantization levels for various hardware requirements. Distilled from Claude-style reasoning models for enhanced logical reasoning capabilities.

Repository: localai
License: apache-2.0

q3.5-bluestar-27b

Repository: localai
License: mit

qwen3.5-397b-a17b

Repository: localai
License: apache-2.0

qwen3.5-27b

Repository: localai
License: apache-2.0

vibevoice-cpp-asr
VibeVoice ASR 7B (C++ / GGML, Q4_K) - long-form speech-to-text with speaker diarization. Returns per-speaker JSON segments with start/end timestamps. English-only. ~10 GB download.

Repository: localai
License: mit
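The vibevoice-cpp-asr entry above says the model returns per-speaker JSON segments with start/end timestamps. As a rough illustration of how such output might be consumed, here is a minimal Python sketch. The field names (`speaker`, `start`, `end`, `text`) and the sample payload are assumptions made for this example; the listing does not document the actual response schema.

```python
import json
from collections import defaultdict

# Hypothetical example payload: the field names and values below are
# assumptions, not the documented vibevoice-cpp-asr response schema.
SAMPLE = """
[
  {"speaker": 0, "start": 0.0, "end": 4.5, "text": "Welcome back."},
  {"speaker": 1, "start": 4.5, "end": 9.0, "text": "Thanks for having me."},
  {"speaker": 0, "start": 9.0, "end": 12.0, "text": "Let's get started."}
]
"""


def talk_time_by_speaker(segments):
    """Sum the duration (end - start, in seconds) of segments per speaker."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


def transcript_lines(segments):
    """Render segments as '[start-end] speaker N: text' lines, in time order."""
    ordered = sorted(segments, key=lambda seg: seg["start"])
    return [
        f"[{seg['start']:.1f}-{seg['end']:.1f}] speaker {seg['speaker']}: {seg['text']}"
        for seg in ordered
    ]


if __name__ == "__main__":
    segments = json.loads(SAMPLE)
    print(talk_time_by_speaker(segments))
    print("\n".join(transcript_lines(segments)))
```

If the real output uses different keys, only the dictionary lookups in the two functions need to change.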

qwen3-tts-cpp-1.7b-base
Qwen3-TTS 1.7B Base (C++ / GGML, qwentts.cpp), Q8_0 (~2.0 GB talker). Higher-quality streaming + voice cloning, 24kHz mono, 11 languages.

Repository: localai
License: mit

qwen3-tts-cpp-1.7b-base-q4
Qwen3-TTS 1.7B Base (C++ / GGML, qwentts.cpp), Q4_K_M (~1.2 GB talker). Streaming + voice cloning, 24kHz mono, 11 languages.

Repository: localai
License: mit

qwen3-tts-cpp-1.7b-customvoice
Qwen3-TTS 1.7B CustomVoice (C++ / GGML, qwentts.cpp), Q8_0. Named speakers via the `voice` field (serena, vivian, ryan, aiden, eric, dylan, ...). Streaming, 24kHz mono, 11 languages.

Repository: localai
License: mit

qwen3-tts-cpp-1.7b-customvoice-q4
Qwen3-TTS 1.7B CustomVoice (C++ / GGML, qwentts.cpp), Q4_K_M. Named speakers via the `voice` field. Streaming, 24kHz mono, 11 languages.

Repository: localai
License: mit

qwen3-tts-cpp-1.7b-voicedesign
Qwen3-TTS 1.7B VoiceDesign (C++ / GGML, qwentts.cpp), Q8_0. Synthesises a speaker from a free-text attribute instruction - REQUIRES the OpenAI `instructions` field (e.g. "male, young adult, moderate pitch"); requests without it are rejected. Streaming, 24kHz mono, 11 languages.

Repository: localai
License: mit

qwen3-tts-cpp-1.7b-voicedesign-q4
Qwen3-TTS 1.7B VoiceDesign (C++ / GGML, qwentts.cpp), Q4_K_M. Synthesises a speaker from a free-text attribute instruction - REQUIRES the `instructions` field. Streaming, 24kHz mono, 11 languages.

Repository: localai
License: mit
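The CustomVoice and VoiceDesign entries above differ in which request field selects the speaker: CustomVoice takes a named speaker in `voice`, while VoiceDesign requires a free-text `instructions` value. A minimal Python sketch of building such a request body follows. It assumes the body otherwise follows the OpenAI-style speech request shape (`model` and `input` fields), which the listing does not spell out; the helper name and the client-side check are illustrative, and no endpoint URL is assumed.

```python
import json


def build_speech_request(model, text, voice=None, instructions=None):
    """Build a JSON-serialisable body for an OpenAI-style speech request.

    Assumption: `model` and `input` follow the OpenAI speech request shape.
    `voice` and `instructions` are the fields named in the gallery entries.
    """
    # Mirror the server-side rule client-side: VoiceDesign models are
    # described as rejecting requests that lack `instructions`.
    if "voicedesign" in model and not instructions:
        raise ValueError("VoiceDesign models require the `instructions` field")

    body = {"model": model, "input": text}
    if voice is not None:
        body["voice"] = voice
    if instructions is not None:
        body["instructions"] = instructions
    return body


if __name__ == "__main__":
    # CustomVoice: pick one of the named speakers listed in the entry.
    custom = build_speech_request(
        "qwen3-tts-cpp-1.7b-customvoice", "Hello there.", voice="serena"
    )
    # VoiceDesign: describe the speaker in free text instead.
    designed = build_speech_request(
        "qwen3-tts-cpp-1.7b-voicedesign",
        "Hello there.",
        instructions="male, young adult, moderate pitch",
    )
    print(json.dumps(custom))
    print(json.dumps(designed))
```

Failing early on a missing `instructions` value is a design choice for this sketch; the server remains the authority on which requests are accepted.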

qwen3-tts-1.7b-custom-voice
Qwen3-TTS is a high-quality text-to-speech model supporting custom voice, voice design, and voice cloning.

Repository: localai
License: apache-2.0

qwen3-asr-1.7b
Qwen3-ASR is an automatic speech recognition model supporting multiple languages and batch inference.

Repository: localai
License: apache-2.0

mox-small-1-i1
The model, **vanta-research/mox-small-1**, is a small-scale text-generation model optimized for conversational AI tasks. It supports chat, persona research, and chatbot applications. The quantized versions (e.g., i1-Q4_K_M, i1-Q4_K_S) are available for efficient deployment, with the i1-Q4_K_S variant offering the best balance of size, speed, and quality. The model is designed for lightweight inference and is compatible with frameworks like HuggingFace Transformers.

Repository: localai
License: apache-2.0
