LLM Turkey
Ecosystem · Network

Measure Intelligence. Build Trust.

LLMTurkey Network is Turkey's AI evaluation community. Join us.

The Problem

The trustworthiness of AI models in Turkish is unknown.

Thousands of organizations are putting AI into production — but no independent source measures which model is accurate, safe and consistent in Turkish.

01

Turkish gap

Most global benchmarks are English-only. Hallucination, fairness and reasoning performance in Turkish are not systematically measured.

02

Vendor marketing

Scores published by model providers are not independent. Enterprises want to trust a third party, not the manufacturer.

03

Operational blind spot

Once a model is in production, most organizations have no infrastructure to continuously measure how it behaves in Turkish.

Vision

Turkey's EvalOps backbone.

Over the next three years, we aim to be the independent measurement reference for Turkish AI — establishing a shared evaluation language for academia, industry and government.

V1

Independent benchmark

Continuously updated Turkish-first scoreboards, free from vendor, project or political bias.

V2

EvalOps practice

Continuous evaluation operations that enterprises can integrate into production to measure their own models.

V3

Research core

A research network producing open reports on Turkish LLM safety, fairness and robustness.

Sample Use Cases

What we actually work on inside the Network

  • U01 Bank · Customer support: Risk of a chatbot quoting the wrong interest rate to a customer. TR-Truth scores + hallucination map across 8 models → safest model + remediation list.
  • U02 Public · Document summarization: Model safety and fairness in official correspondence. 5-model comparison on Safety & Bias parameters + KVKK/GDPR compliance note.
  • U03 Health startup · Clinical assist: Medical-term accuracy and explainability requirement. Truthfulness + Explainability scores, source-grounded answer tests, production decision report.

Platform Output

Concrete work the Network ships.

Members appear by name under these outputs — we are remembered for measurement, not manifestos.

O01

Turkish LLM Leaderboard

An open leaderboard refreshed every quarter, measured across 9 parameters and 12 scenarios.

O02

Sector Reports

Specialized evaluation reports for banking, public sector, health and education.

O03

EvalOps Playbooks

Open methodology and template kits enabling enterprises to set up an internal evaluation pipeline.

O04

Open Scenario Sets

Turkish-language evaluation scenarios extended by the community and published on GitHub.

Founding Council

A curated network of academics, researchers and industry leaders shaping the AI evaluation culture in Turkey.

  • 01 Expert input shaping strategy and metric design
  • 02 Attribution in benchmarks and reports
  • 03 Reserved seat at the annual Founding Summit

What We Actually Need Now

The roles we're looking for right now

Network membership isn't symbolic: we're looking for real contributions to open projects. If you fit one of the roles below, your application is prioritized.

R01

Turkish NLP researcher

To extend the Turkish data sets behind Bias & Fairness and Truthfulness scenarios.

R02

EvalOps engineer

To build the continuous benchmark pipeline and own API integrations.

R03

Domain expert (legal · health · finance)

To audit sectoral scenarios for real-world fidelity.

R04

Community coordinator

To run events, open calls and partner outreach.

Who Can Join

A01

Students

Anyone wanting to grow in AI evaluation, benchmark methodology and EvalOps.

A02

Researchers

Researchers contributing to benchmarks, AI safety, ethics and model evaluation.

A03

Professionals

Practitioners using AI in their workflows or specializing in this field.

A04

Industry Leaders

Executives shaping reliable AI transformation in their organizations.

A05

Partners

Universities, companies, technology ventures and communities.

What You Get

What you get when you join

  • 01

    EvalOps expertise

    12-week EvalOps Specialist program + applied work on live projects.

  • 02

    Real benchmark projects

    Contribute to evaluation studies published on Judex; you are credited by name in the results.

  • 03

    Enterprise AI evaluation experience

    Field experience on evaluation projects with banks, public sector and tech companies.

  • 04

    Research network

    Direct collaboration with researchers working on Turkish LLM safety and fairness.

  • 05

    Career opportunities

    Network-only referral channel into partner organizations' job and consulting listings.

  • 06

    Community and partner network

    Monthly closed sessions, access to founding members, intros to partner organizations.

Events

Members-only events

We tackle the most current topics in AI evaluation in small, focused groups.

  • E01 Webinars: Live sessions on specific benchmark and EvalOps topics with expert speakers.
  • E02 Workshops: Hands-on, small-group workshops focused on practice.
  • E03 Founding Summit: An annual private summit with the founding council and selected partners.
  • E04 Roundtables: Closed sessions where industry and academia discuss themed problems.
  • E05 Benchmark Sessions: Technical sessions evaluating new models and metrics live.

Partnership

Let's Build the Trustworthy AI Ecosystem Together

Long-term collaborations with academia, industry and the community form the backbone of LLMTurkey Network.

P01

Academic Partner

Joint benchmark studies, publications and curriculum collaboration with universities, labs and research institutes.

P02

Technology Partner

Integration and evaluation partnerships with model providers, infrastructure companies and AI ventures.

P03

Enterprise Partner

Custom benchmark and EvalOps programs for organizations measuring their AI transformation.

P04

Community Partner

Co-created content, events and visibility with communities, associations and meetups.